ENH Update drug categorization logic (#54)
- Always return immediate child of 'antibiotic molecule' for drug categorization
- Antibiotic mixture ARO mapping is excluded for all cases. Return constituent drug classes instead
- Use 'has_part' relationship to handle antibiotic mixtures and give drug class rather than 'antibiotic mixture'
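The first rule above can be illustrated on a toy hierarchy. This is only a sketch, not argNorm code: the `PARENT` dictionary, its term names, and `immediate_child_of_root` are hypothetical stand-ins for the ARO, and the sketch assumes every term has a single parent.

```python
# Hypothetical single-parent hierarchy standing in for the ARO.
# All term names here are placeholders, not real ARO entries.
ROOT = "antibiotic molecule"
PARENT = {
    "drug class A": ROOT,
    "drug subclass A1": "drug class A",
    "drug x": "drug subclass A1",
    "drug class B": ROOT,
    "drug y": "drug class B",
}

def immediate_child_of_root(term: str) -> str:
    """Climb parent links until the term directly below ROOT is reached."""
    if term not in PARENT:
        raise KeyError(f"unknown term: {term}")
    while PARENT[term] != ROOT:
        term = PARENT[term]
    return term

print(immediate_child_of_root("drug x"))        # drug class A
print(immediate_child_of_root("drug class B"))  # drug class B
```

However deep a drug sits in the hierarchy, the reported category is the node one step below the root, which is the behaviour the commit message describes.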
Vedanth-Ramji authored Jun 24, 2024
1 parent 9b03335 commit 3394a82
Showing 21 changed files with 912 additions and 823 deletions.
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,9 @@

 ### Update drug categorization
 - confers_resistance_to() now gets drugs for the whole AMR gene family. For example, OXA-19 previously only returned cephalosporin and penam, but now will also return oxacillin (from AMR gene family).
+- Implementation of drugs_to_drug_classes() has also been fixed. Previously, the drug class was obtained from the superclasses of the drugs list passed without a thorough check if the drug class was the immediate child of 'antibiotic molecule'. These checks have now been put in place.
+- drugs_to_drug_classes() also uses the 'has_part' ARO relationship now to get drug classes for antibiotic mixtures. In case of antibiotic mixtures, the drug classes of the drugs associated with 'has_part' are returned rather than 'antibiotic mixture' (ARO:3000707).
+- 'antibiotic mixture' will not be reported as a drug class, rather the individual antibiotic classes making up the antibiotic mixture will be reported.

 ### Manual curation
 - argannot_curation: (Tet)tetH:EF460464:6286-7839:1554 was incorrectly annotated as ARO:3004797 which is a beta-lactamase due to a loose RGI hit. This was manually curated to ARO:3000175.
40 changes: 35 additions & 5 deletions argnorm/drug_categorization.py
@@ -4,6 +4,26 @@
 ARO = lib.get_aro_ontology()
 confers_resistance_to_drug_class_rel = ARO.get_relationship('confers_resistance_to_drug_class')
 confers_resistance_to_antibiotic_rel = ARO.get_relationship('confers_resistance_to_antibiotic')
+has_part_rel = ARO.get_relationship('has_part')
+
+def _get_drug_classes(super_classes_list: List[str]) -> List[str]:
+    """
+    - Helper function to traverse up and record immediate child of 'antibiotic molecule' in ARO
+    - Traverses up ARO until immediate child of 'antibiotic molecule' class reached and 'antibiotic mixture' class not reached
+    - antibiotic molecule -> ARO:1000003
+    - antibiotic mixture -> ARO:3000707
+    """
+    output = []
+
+    for super_class in super_classes_list:
+        super_class_classes = list(super_class.superclasses(1))
+        antibiotic_molecule_node = [ARO['ARO:1000003']]
+
+        # checking if immediate child of 'antibiotic molecule' is reached & it is not 'antibiotic mixture'
+        if super_class_classes[1:] == antibiotic_molecule_node and super_class.id != 'ARO:3000707':
+            output.append(super_class.id)
+
+    return output

 def confers_resistance_to(aro_num: str) -> List[str]:
     '''
@@ -55,15 +75,25 @@ def drugs_to_drug_classes(drugs_list: List[str]) -> List[str]:
     to the function in the drugs_list.
     '''
     drug_classes = []
+    temp_drug_classes = []

     for drug in drugs_list:
         drug_instance = ARO[drug]
         drug_instance_superclasses = list(drug_instance.superclasses())
-        superclasses_len = len(drug_instance_superclasses)
+        temp_drug_classes += _get_drug_classes(drug_instance_superclasses)

+        has_part_nodes = drug_instance.relationships.get(has_part_rel, [])
+        for has_part_node in has_part_nodes:
+            has_part_node_superclasses = list(has_part_node.superclasses())[1:]
+
+            for super_class in has_part_node_superclasses:
+                super_class_categories = list(super_class.superclasses())
+                temp_drug_classes += _get_drug_classes(super_class_categories)
+
+        if temp_drug_classes == []:
+            temp_drug_classes.append(drug_instance.id)
+
-        if superclasses_len >= 3:
-            drug_classes.append(drug_instance_superclasses[superclasses_len - 3].id)
-        else:
-            drug_classes.append(drug_instance_superclasses[0].id)
+        drug_classes += list(set(temp_drug_classes))
+        temp_drug_classes = []

     return sorted(drug_classes)
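The control flow of the updated `drugs_to_drug_classes()` can be followed without the ARO by running a simplified version on a toy ontology. The sketch below is an assumption-laden illustration, not the argNorm implementation: `PARENTS`, `HAS_PART`, the term names, and the helper functions are all hypothetical, and plain dictionaries replace the ontology objects used in the diff.

```python
from typing import Dict, List, Set

ROOT = "antibiotic molecule"
MIXTURE = "antibiotic mixture"

# Hypothetical toy ontology; term names are placeholders, not ARO entries.
PARENTS: Dict[str, List[str]] = {
    "class A": [ROOT],
    "class B": [ROOT],
    MIXTURE: [ROOT],
    "drug a": ["class A"],
    "drug b": ["class B"],
    "mix ab": [MIXTURE],
    "orphan": [],
}
# Hypothetical 'has_part' links from a mixture to its constituent drugs.
HAS_PART: Dict[str, List[str]] = {"mix ab": ["drug a", "drug b"]}

def ancestors(term: str) -> Set[str]:
    """Return the term itself plus everything reachable through parent links."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen

def classes_of(term: str) -> Set[str]:
    """Ancestors that sit directly under ROOT, excluding the mixture node."""
    return {t for t in ancestors(term) if PARENTS.get(t) == [ROOT] and t != MIXTURE}

def drugs_to_classes(drugs: List[str]) -> List[str]:
    result: List[str] = []
    for drug in drugs:
        found = classes_of(drug)
        for part in HAS_PART.get(drug, []):  # resolve mixtures through their parts
            found |= classes_of(part)
        # De-duplicate per drug; fall back to the drug itself if nothing was found.
        result += sorted(found) if found else [drug]
    return sorted(result)

print(drugs_to_classes(["drug a", "mix ab", "orphan"]))
# ['class A', 'class A', 'class B', 'orphan']
```

As in the diff, classes are de-duplicated per drug rather than across the whole input, so a class shared by several drugs appears once for each of them in the sorted result.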
220 changes: 110 additions & 110 deletions outputs/hamronized/abricate.argannot.tsv

Large diffs are not rendered by default.

220 changes: 110 additions & 110 deletions outputs/hamronized/abricate.megares.tsv

Large diffs are not rendered by default.

184 changes: 92 additions & 92 deletions outputs/hamronized/abricate.ncbi.tsv

Large diffs are not rendered by default.

228 changes: 114 additions & 114 deletions outputs/hamronized/abricate.resfinder.tsv

Large diffs are not rendered by default.

66 changes: 33 additions & 33 deletions outputs/hamronized/abricate.resfinderfg.tsv

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion outputs/hamronized/amrfinderplus.ncbi.orfs.tsv
@@ -5,7 +5,7 @@ amrfinderplus.ncbi.orfs.tsv tet(Q) tetracycline resistance ribosomal protection
amrfinderplus.ncbi.orfs.tsv bexA multidrug efflux MATE transporter BexA NCBI Reference Gene Database 2023-Nov-01 BAB64566.1 amrfinderplus 3.10.30 gene_presence_detected EFFLUX 80.36 EFFLUX 18 1085 356 k119_41685 443 - 92.98 ARO:3003953 ARO:0000045,ARO:3000662 ARO:0000001,ARO:3005386
amrfinderplus.ncbi.orfs.tsv lnu(C) lincosamide nucleotidyltransferase Lnu(C) NCBI Reference Gene Database 2023-Nov-01 WP_063851341.1 amrfinderplus 3.10.30 gene_presence_detected LINCOSAMIDE 100.0 LINCOSAMIDE 234 725 164 k119_46979 164 - 97.56 ARO:3002837 ARO:0000046 ARO:0000017
amrfinderplus.ncbi.orfs.tsv sat4 streptothricin N-acetyltransferase Sat4 NCBI Reference Gene Database 2023-Nov-01 WP_000627290.1 amrfinderplus 3.10.30 gene_presence_detected STREPTOTHRICIN 86.11 STREPTOTHRICIN 8 472 155 k119_47732 180 - 100.0 ARO:3002897 ARO:0000012 ARO:3000034
-amrfinderplus.ncbi.orfs.tsv aph(3')-IIIa aminoglycoside O-phosphotransferase APH(3')-IIIa NCBI Reference Gene Database 2023-Nov-01 WP_001096887.1 amrfinderplus 3.10.30 gene_presence_detected AMIKACIN/KANAMYCIN 100.0 AMINOGLYCOSIDE 207 998 264 k119_48139 264 - 100.0 ARO:3002647 ARO:0000005,ARO:0000013,ARO:0000021,ARO:0000024,ARO:0000049,ARO:3000652,ARO:3000655,ARO:3000657,ARO:3000658 ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:3000707
+amrfinderplus.ncbi.orfs.tsv aph(3')-IIIa aminoglycoside O-phosphotransferase APH(3')-IIIa NCBI Reference Gene Database 2023-Nov-01 WP_001096887.1 amrfinderplus 3.10.30 gene_presence_detected AMIKACIN/KANAMYCIN 100.0 AMINOGLYCOSIDE 207 998 264 k119_48139 264 - 100.0 ARO:3002647 ARO:0000005,ARO:0000013,ARO:0000021,ARO:0000024,ARO:0000049,ARO:3000652,ARO:3000655,ARO:3000657,ARO:3000658 ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016,ARO:0000016
amrfinderplus.ncbi.orfs.tsv aadS aminoglycoside 6-adenylyltransferase AadS NCBI Reference Gene Database 2023-Nov-01 WP_003013318.1 amrfinderplus 3.10.30 gene_presence_detected STREPTOMYCIN 100.0 AMINOGLYCOSIDE 34628 35488 287 k119_48233 287 + 100.0 ARO:3004683 ARO:0000040 ARO:0000016
amrfinderplus.ncbi.orfs.tsv tet(X2) tetracycline-inactivating monooxygenase Tet(X2) NCBI Reference Gene Database 2023-Nov-01 WP_008651082.1 amrfinderplus 3.10.30 gene_presence_detected TETRACYCLINE 100.0 TETRACYCLINE 12370 13533 388 k119_48273 388 + 99.74 ARO:3000205 ARO:0000030,ARO:0000051,ARO:0000069,ARO:3000152,ARO:3000528,ARO:3000667,ARO:3000668 ARO:3000050,ARO:3000050,ARO:3000050,ARO:3000050,ARO:3000050,ARO:3000050,ARO:3000050
amrfinderplus.ncbi.orfs.tsv tet(O) tetracycline resistance ribosomal protection protein Tet(O) NCBI Reference Gene Database 2023-Nov-01 WP_014636291.1 amrfinderplus 3.10.30 gene_presence_detected TETRACYCLINE 100.0 TETRACYCLINE 978 2894 639 k119_60190 639 + 99.22 ARO:3000190 ARO:0000051,ARO:0000069,ARO:3000152,ARO:3000528,ARO:3000667,ARO:3000668 ARO:3000050,ARO:3000050,ARO:3000050,ARO:3000050,ARO:3000050,ARO:3000050
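The only change in this file is the last drug-class entry of the aph(3')-IIIa row, where 'antibiotic mixture' (ARO:3000707) becomes ARO:0000016. A small check along the following lines could confirm that a regenerated output no longer reports the mixture class. It is a sketch only: `has_mixture` is a hypothetical helper, and the two strings reproduce the drug-class columns of the old and new rows.

```python
MIXTURE = "ARO:3000707"  # 'antibiotic mixture', as named in the changelog

# Drug-class columns of the aph(3')-IIIa row before and after the commit.
old_column = "ARO:0000016," * 8 + MIXTURE
new_column = ",".join(["ARO:0000016"] * 9)

def has_mixture(column: str) -> bool:
    """Return True if the comma-separated column lists the mixture class."""
    return MIXTURE in column.split(",")

print(has_mixture(old_column), has_mixture(new_column))  # True False
```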