diff --git a/articles/examples.html b/articles/examples.html
index 1cf2b9c..070b3ea 100644
--- a/articles/examples.html
+++ b/articles/examples.html
@@ -89,7 +89,7 @@

Introduction

Kamil Slowikowski

-

2023-11-27

+

2023-11-29

hlabud is an R package that provides functions to facilitate download and analysis of human leukocyte antigen (HLA) genotype sequence alignments from IMGTHLA in R.

diff --git a/articles/visualize-hla-structure.html b/articles/visualize-hla-structure.html
index 55ceeb6..5c12383 100644
--- a/articles/visualize-hla-structure.html
+++ b/articles/visualize-hla-structure.html
@@ -88,7 +88,7 @@

Introduction

Kamil Slowikowski

-

2023-11-27

+

2023-11-29

In this vignette, we explore a few different methods for visualizing the molecular structure of HLA proteins. First, we’ll look at an example of how to use the NGLVieweR R package to
@@ -164,8 +164,8 @@

Using NGLVieweR z_offSet = -20 ) %>% setSpin() -
-

In the view above, we see the blue peptide and the red HLA-B protein. +

+

In the view above, we see the blue peptide and the red HLA-B protein. The tyrosine at position 9 is highlighted with a ball+stick representation, and it is also labeled with a text label. The structure is rotating so we can get a better view.

diff --git a/pkgdown.yml b/pkgdown.yml index 50f9856..27dd21e 100644 --- a/pkgdown.yml +++ b/pkgdown.yml @@ -4,7 +4,7 @@ pkgdown_sha: ~ articles: examples: examples.html visualize-hla-structure: visualize-hla-structure.html -last_built: 2023-11-27T23:51Z +last_built: 2023-11-29T16:25Z urls: reference: https://slowkow.github.io/hlabud/reference article: https://slowkow.github.io/hlabud/articles diff --git a/search.json b/search.json index 6081911..150d748 100644 --- a/search.json +++ b/search.json @@ -1 +1 @@ -[{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"GNU General Public License","title":"GNU General Public License","text":"Version 3, 29 June 2007Copyright © 2007 Free Software Foundation, Inc.  Everyone permitted copy distribute verbatim copies license document, changing allowed.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"preamble","dir":"","previous_headings":"","what":"Preamble","title":"GNU General Public License","text":"GNU General Public License free, copyleft license software kinds works. licenses software practical works designed take away freedom share change works. contrast, GNU General Public License intended guarantee freedom share change versions program–make sure remains free software users. , Free Software Foundation, use GNU General Public License software; applies also work released way authors. can apply programs, . speak free software, referring freedom, price. General Public Licenses designed make sure freedom distribute copies free software (charge wish), receive source code can get want , can change software use pieces new free programs, know can things. protect rights, need prevent others denying rights asking surrender rights. Therefore, certain responsibilities distribute copies software, modify : responsibilities respect freedom others. example, distribute copies program, whether gratis fee, must pass recipients freedoms received. 
must make sure , , receive can get source code. must show terms know rights. Developers use GNU GPL protect rights two steps: (1) assert copyright software, (2) offer License giving legal permission copy, distribute /modify . developers’ authors’ protection, GPL clearly explains warranty free software. users’ authors’ sake, GPL requires modified versions marked changed, problems attributed erroneously authors previous versions. devices designed deny users access install run modified versions software inside , although manufacturer can . fundamentally incompatible aim protecting users’ freedom change software. systematic pattern abuse occurs area products individuals use, precisely unacceptable. Therefore, designed version GPL prohibit practice products. problems arise substantially domains, stand ready extend provision domains future versions GPL, needed protect freedom users. Finally, every program threatened constantly software patents. States allow patents restrict development use software general-purpose computers, , wish avoid special danger patents applied free program make effectively proprietary. prevent , GPL assures patents used render program non-free. precise terms conditions copying, distribution modification follow.","code":""},{"path":[]},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_0-definitions","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"0. Definitions","title":"GNU General Public License","text":"“License” refers version 3 GNU General Public License. “Copyright” also means copyright-like laws apply kinds works, semiconductor masks. “Program” refers copyrightable work licensed License. licensee addressed “”. “Licensees” “recipients” may individuals organizations. “modify” work means copy adapt part work fashion requiring copyright permission, making exact copy. resulting work called “modified version” earlier work work “based ” earlier work. “covered work” means either unmodified Program work based Program. 
“propagate” work means anything , without permission, make directly secondarily liable infringement applicable copyright law, except executing computer modifying private copy. Propagation includes copying, distribution (without modification), making available public, countries activities well. “convey” work means kind propagation enables parties make receive copies. Mere interaction user computer network, transfer copy, conveying. interactive user interface displays “Appropriate Legal Notices” extent includes convenient prominently visible feature (1) displays appropriate copyright notice, (2) tells user warranty work (except extent warranties provided), licensees may convey work License, view copy License. interface presents list user commands options, menu, prominent item list meets criterion.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_1-source-code","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"1. Source Code","title":"GNU General Public License","text":"“source code” work means preferred form work making modifications . “Object code” means non-source form work. “Standard Interface” means interface either official standard defined recognized standards body, , case interfaces specified particular programming language, one widely used among developers working language. “System Libraries” executable work include anything, work whole, () included normal form packaging Major Component, part Major Component, (b) serves enable use work Major Component, implement Standard Interface implementation available public source code form. “Major Component”, context, means major essential component (kernel, window system, ) specific operating system () executable work runs, compiler used produce work, object code interpreter used run . “Corresponding Source” work object code form means source code needed generate, install, (executable work) run object code modify work, including scripts control activities. 
However, include work’s System Libraries, general-purpose tools generally available free programs used unmodified performing activities part work. example, Corresponding Source includes interface definition files associated source files work, source code shared libraries dynamically linked subprograms work specifically designed require, intimate data communication control flow subprograms parts work. Corresponding Source need include anything users can regenerate automatically parts Corresponding Source. Corresponding Source work source code form work.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_2-basic-permissions","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"2. Basic Permissions","title":"GNU General Public License","text":"rights granted License granted term copyright Program, irrevocable provided stated conditions met. License explicitly affirms unlimited permission run unmodified Program. output running covered work covered License output, given content, constitutes covered work. License acknowledges rights fair use equivalent, provided copyright law. may make, run propagate covered works convey, without conditions long license otherwise remains force. may convey covered works others sole purpose make modifications exclusively , provide facilities running works, provided comply terms License conveying material control copyright. thus making running covered works must exclusively behalf, direction control, terms prohibit making copies copyrighted material outside relationship . Conveying circumstances permitted solely conditions stated . Sublicensing allowed; section 10 makes unnecessary.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_3-protecting-users-legal-rights-from-anti-circumvention-law","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"3. 
Protecting Users’ Legal Rights From Anti-Circumvention Law","title":"GNU General Public License","text":"covered work shall deemed part effective technological measure applicable law fulfilling obligations article 11 WIPO copyright treaty adopted 20 December 1996, similar laws prohibiting restricting circumvention measures. convey covered work, waive legal power forbid circumvention technological measures extent circumvention effected exercising rights License respect covered work, disclaim intention limit operation modification work means enforcing, work’s users, third parties’ legal rights forbid circumvention technological measures.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_4-conveying-verbatim-copies","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"4. Conveying Verbatim Copies","title":"GNU General Public License","text":"may convey verbatim copies Program’s source code receive , medium, provided conspicuously appropriately publish copy appropriate copyright notice; keep intact notices stating License non-permissive terms added accord section 7 apply code; keep intact notices absence warranty; give recipients copy License along Program. may charge price price copy convey, may offer support warranty protection fee.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_5-conveying-modified-source-versions","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"5. Conveying Modified Source Versions","title":"GNU General Public License","text":"may convey work based Program, modifications produce Program, form source code terms section 4, provided also meet conditions: ) work must carry prominent notices stating modified , giving relevant date. b) work must carry prominent notices stating released License conditions added section 7. requirement modifies requirement section 4 “keep intact notices”. c) must license entire work, whole, License anyone comes possession copy. 
License therefore apply, along applicable section 7 additional terms, whole work, parts, regardless packaged. License gives permission license work way, invalidate permission separately received . d) work interactive user interfaces, must display Appropriate Legal Notices; however, Program interactive interfaces display Appropriate Legal Notices, work need make . compilation covered work separate independent works, nature extensions covered work, combined form larger program, volume storage distribution medium, called “aggregate” compilation resulting copyright used limit access legal rights compilation’s users beyond individual works permit. Inclusion covered work aggregate cause License apply parts aggregate.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_6-conveying-non-source-forms","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"6. Conveying Non-Source Forms","title":"GNU General Public License","text":"may convey covered work object code form terms sections 4 5, provided also convey machine-readable Corresponding Source terms License, one ways: ) Convey object code , embodied , physical product (including physical distribution medium), accompanied Corresponding Source fixed durable physical medium customarily used software interchange. b) Convey object code , embodied , physical product (including physical distribution medium), accompanied written offer, valid least three years valid long offer spare parts customer support product model, give anyone possesses object code either (1) copy Corresponding Source software product covered License, durable physical medium customarily used software interchange, price reasonable cost physically performing conveying source, (2) access copy Corresponding Source network server charge. c) Convey individual copies object code copy written offer provide Corresponding Source. alternative allowed occasionally noncommercially, received object code offer, accord subsection 6b. 
d) Convey object code offering access designated place (gratis charge), offer equivalent access Corresponding Source way place charge. need require recipients copy Corresponding Source along object code. place copy object code network server, Corresponding Source may different server (operated third party) supports equivalent copying facilities, provided maintain clear directions next object code saying find Corresponding Source. Regardless server hosts Corresponding Source, remain obligated ensure available long needed satisfy requirements. e) Convey object code using peer--peer transmission, provided inform peers object code Corresponding Source work offered general public charge subsection 6d. separable portion object code, whose source code excluded Corresponding Source System Library, need included conveying object code work. “User Product” either (1) “consumer product”, means tangible personal property normally used personal, family, household purposes, (2) anything designed sold incorporation dwelling. determining whether product consumer product, doubtful cases shall resolved favor coverage. particular product received particular user, “normally used” refers typical common use class product, regardless status particular user way particular user actually uses, expects expected use, product. product consumer product regardless whether product substantial commercial, industrial non-consumer uses, unless uses represent significant mode use product. “Installation Information” User Product means methods, procedures, authorization keys, information required install execute modified versions covered work User Product modified version Corresponding Source. information must suffice ensure continued functioning modified object code case prevented interfered solely modification made. 
convey object code work section , , specifically use , User Product, conveying occurs part transaction right possession use User Product transferred recipient perpetuity fixed term (regardless transaction characterized), Corresponding Source conveyed section must accompanied Installation Information. requirement apply neither third party retains ability install modified object code User Product (example, work installed ROM). requirement provide Installation Information include requirement continue provide support service, warranty, updates work modified installed recipient, User Product modified installed. Access network may denied modification materially adversely affects operation network violates rules protocols communication across network. Corresponding Source conveyed, Installation Information provided, accord section must format publicly documented (implementation available public source code form), must require special password key unpacking, reading copying.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_7-additional-terms","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"7. Additional Terms","title":"GNU General Public License","text":"“Additional permissions” terms supplement terms License making exceptions one conditions. Additional permissions applicable entire Program shall treated though included License, extent valid applicable law. additional permissions apply part Program, part may used separately permissions, entire Program remains governed License without regard additional permissions. convey copy covered work, may option remove additional permissions copy, part . (Additional permissions may written require removal certain cases modify work.) may place additional permissions material, added covered work, can give appropriate copyright permission. 
Notwithstanding provision License, material add covered work, may (authorized copyright holders material) supplement terms License terms: ) Disclaiming warranty limiting liability differently terms sections 15 16 License; b) Requiring preservation specified reasonable legal notices author attributions material Appropriate Legal Notices displayed works containing ; c) Prohibiting misrepresentation origin material, requiring modified versions material marked reasonable ways different original version; d) Limiting use publicity purposes names licensors authors material; e) Declining grant rights trademark law use trade names, trademarks, service marks; f) Requiring indemnification licensors authors material anyone conveys material (modified versions ) contractual assumptions liability recipient, liability contractual assumptions directly impose licensors authors. non-permissive additional terms considered “restrictions” within meaning section 10. Program received , part , contains notice stating governed License along term restriction, may remove term. license document contains restriction permits relicensing conveying License, may add covered work material governed terms license document, provided restriction survive relicensing conveying. add terms covered work accord section, must place, relevant source files, statement additional terms apply files, notice indicating find applicable terms. Additional terms, permissive non-permissive, may stated form separately written license, stated exceptions; requirements apply either way.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_8-termination","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"8. Termination","title":"GNU General Public License","text":"may propagate modify covered work except expressly provided License. attempt otherwise propagate modify void, automatically terminate rights License (including patent licenses granted third paragraph section 11). 
However, cease violation License, license particular copyright holder reinstated () provisionally, unless copyright holder explicitly finally terminates license, (b) permanently, copyright holder fails notify violation reasonable means prior 60 days cessation. Moreover, license particular copyright holder reinstated permanently copyright holder notifies violation reasonable means, first time received notice violation License (work) copyright holder, cure violation prior 30 days receipt notice. Termination rights section terminate licenses parties received copies rights License. rights terminated permanently reinstated, qualify receive new licenses material section 10.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_9-acceptance-not-required-for-having-copies","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"9. Acceptance Not Required for Having Copies","title":"GNU General Public License","text":"required accept License order receive run copy Program. Ancillary propagation covered work occurring solely consequence using peer--peer transmission receive copy likewise require acceptance. However, nothing License grants permission propagate modify covered work. actions infringe copyright accept License. Therefore, modifying propagating covered work, indicate acceptance License .","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_10-automatic-licensing-of-downstream-recipients","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"10. Automatic Licensing of Downstream Recipients","title":"GNU General Public License","text":"time convey covered work, recipient automatically receives license original licensors, run, modify propagate work, subject License. responsible enforcing compliance third parties License. “entity transaction” transaction transferring control organization, substantially assets one, subdividing organization, merging organizations. 
propagation covered work results entity transaction, party transaction receives copy work also receives whatever licenses work party’s predecessor interest give previous paragraph, plus right possession Corresponding Source work predecessor interest, predecessor can get reasonable efforts. may impose restrictions exercise rights granted affirmed License. example, may impose license fee, royalty, charge exercise rights granted License, may initiate litigation (including cross-claim counterclaim lawsuit) alleging patent claim infringed making, using, selling, offering sale, importing Program portion .","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_11-patents","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"11. Patents","title":"GNU General Public License","text":"“contributor” copyright holder authorizes use License Program work Program based. work thus licensed called contributor’s “contributor version”. contributor’s “essential patent claims” patent claims owned controlled contributor, whether already acquired hereafter acquired, infringed manner, permitted License, making, using, selling contributor version, include claims infringed consequence modification contributor version. purposes definition, “control” includes right grant patent sublicenses manner consistent requirements License. contributor grants non-exclusive, worldwide, royalty-free patent license contributor’s essential patent claims, make, use, sell, offer sale, import otherwise run, modify propagate contents contributor version. following three paragraphs, “patent license” express agreement commitment, however denominated, enforce patent (express permission practice patent covenant sue patent infringement). “grant” patent license party means make agreement commitment enforce patent party. 
convey covered work, knowingly relying patent license, Corresponding Source work available anyone copy, free charge terms License, publicly available network server readily accessible means, must either (1) cause Corresponding Source available, (2) arrange deprive benefit patent license particular work, (3) arrange, manner consistent requirements License, extend patent license downstream recipients. “Knowingly relying” means actual knowledge , patent license, conveying covered work country, recipient’s use covered work country, infringe one identifiable patents country reason believe valid. , pursuant connection single transaction arrangement, convey, propagate procuring conveyance , covered work, grant patent license parties receiving covered work authorizing use, propagate, modify convey specific copy covered work, patent license grant automatically extended recipients covered work works based . patent license “discriminatory” include within scope coverage, prohibits exercise , conditioned non-exercise one rights specifically granted License. may convey covered work party arrangement third party business distributing software, make payment third party based extent activity conveying work, third party grants, parties receive covered work , discriminatory patent license () connection copies covered work conveyed (copies made copies), (b) primarily connection specific products compilations contain covered work, unless entered arrangement, patent license granted, prior 28 March 2007. Nothing License shall construed excluding limiting implied license defenses infringement may otherwise available applicable patent law.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_12-no-surrender-of-others-freedom","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"12. 
No Surrender of Others’ Freedom","title":"GNU General Public License","text":"conditions imposed (whether court order, agreement otherwise) contradict conditions License, excuse conditions License. convey covered work satisfy simultaneously obligations License pertinent obligations, consequence may convey . example, agree terms obligate collect royalty conveying convey Program, way satisfy terms License refrain entirely conveying Program.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_13-use-with-the-gnu-affero-general-public-license","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"13. Use with the GNU Affero General Public License","title":"GNU General Public License","text":"Notwithstanding provision License, permission link combine covered work work licensed version 3 GNU Affero General Public License single combined work, convey resulting work. terms License continue apply part covered work, special requirements GNU Affero General Public License, section 13, concerning interaction network apply combination .","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_14-revised-versions-of-this-license","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"14. Revised Versions of this License","title":"GNU General Public License","text":"Free Software Foundation may publish revised /new versions GNU General Public License time time. new versions similar spirit present version, may differ detail address new problems concerns. version given distinguishing version number. Program specifies certain numbered version GNU General Public License “later version” applies , option following terms conditions either numbered version later version published Free Software Foundation. Program specify version number GNU General Public License, may choose version ever published Free Software Foundation. 
Program specifies proxy can decide future versions GNU General Public License can used, proxy’s public statement acceptance version permanently authorizes choose version Program. Later license versions may give additional different permissions. However, additional obligations imposed author copyright holder result choosing follow later version.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_15-disclaimer-of-warranty","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"15. Disclaimer of Warranty","title":"GNU General Public License","text":"WARRANTY PROGRAM, EXTENT PERMITTED APPLICABLE LAW. EXCEPT OTHERWISE STATED WRITING COPYRIGHT HOLDERS /PARTIES PROVIDE PROGRAM “” WITHOUT WARRANTY KIND, EITHER EXPRESSED IMPLIED, INCLUDING, LIMITED , IMPLIED WARRANTIES MERCHANTABILITY FITNESS PARTICULAR PURPOSE. ENTIRE RISK QUALITY PERFORMANCE PROGRAM . PROGRAM PROVE DEFECTIVE, ASSUME COST NECESSARY SERVICING, REPAIR CORRECTION.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_16-limitation-of-liability","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"16. Limitation of Liability","title":"GNU General Public License","text":"EVENT UNLESS REQUIRED APPLICABLE LAW AGREED WRITING COPYRIGHT HOLDER, PARTY MODIFIES /CONVEYS PROGRAM PERMITTED , LIABLE DAMAGES, INCLUDING GENERAL, SPECIAL, INCIDENTAL CONSEQUENTIAL DAMAGES ARISING USE INABILITY USE PROGRAM (INCLUDING LIMITED LOSS DATA DATA RENDERED INACCURATE LOSSES SUSTAINED THIRD PARTIES FAILURE PROGRAM OPERATE PROGRAMS), EVEN HOLDER PARTY ADVISED POSSIBILITY DAMAGES.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_17-interpretation-of-sections-15-and-16","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"17. 
Interpretation of Sections 15 and 16","title":"GNU General Public License","text":"disclaimer warranty limitation liability provided given local legal effect according terms, reviewing courts shall apply local law closely approximates absolute waiver civil liability connection Program, unless warranty assumption liability accompanies copy Program return fee. END TERMS CONDITIONS","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"how-to-apply-these-terms-to-your-new-programs","dir":"","previous_headings":"","what":"How to Apply These Terms to Your New Programs","title":"GNU General Public License","text":"develop new program, want greatest possible use public, best way achieve make free software everyone can redistribute change terms. , attach following notices program. safest attach start source file effectively state exclusion warranty; file least “copyright” line pointer full notice found. Also add information contact electronic paper mail. program terminal interaction, make output short notice like starts interactive mode: hypothetical commands show w show c show appropriate parts General Public License. course, program’s commands might different; GUI interface, use “box”. also get employer (work programmer) school, , sign “copyright disclaimer” program, necessary. information , apply follow GNU GPL, see . GNU General Public License permit incorporating program proprietary programs. program subroutine library, may consider useful permit linking proprietary applications library. want , use GNU Lesser General Public License instead License. first, please read .","code":" Copyright (C) This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. 
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see . Copyright (C) This program comes with ABSOLUTELY NO WARRANTY; for details type 'show w'. This is free software, and you are welcome to redistribute it under certain conditions; type 'show c' for details."},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"hlabud usage examples","text":"Kamil Slowikowski 2023-11-27 hlabud R package provides functions facilitate download analysis human leukocyte antigen (HLA) genotype sequence alignments IMGTHLA R. Let’s consider question might want answer HLA genotypes. amino acid positions different DRB1*04:174 DRB1*15:152 genotypes? hlabud, can find answer lines code: two genotypes nearly identical, amino acid position 9 different: position 9 E (Glu) DRB1*04:174 position 9 W (Trp) DRB1*15:152 just easy find nucleotides distinguish two alleles:","code":"library(hlabud) a <- hla_alignments(\"DRB1\") dosage(c(\"DRB1*04:174\", \"DRB1*15:152\"), a$onehot) #> pos9_E pos9_W #> DRB1*04:174 1 0 #> DRB1*15:152 0 1 n <- hla_alignments(\"DRB1\", type = \"nuc\") dosage(c(\"DRB1*04:174\", \"DRB1*15:152\"), n$onehot) #> pos109_C pos109_T #> DRB1*04:174 0 1 #> DRB1*15:152 1 0"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"installation","dir":"Articles","previous_headings":"","what":"Installation","title":"hlabud usage examples","text":"quickest way get hlabud install GitHub: , included usage examples. hope inspire share HLA analyses. source code page available . 
Thank reporting issues hlabud.","code":"# install.packages(\"devtools\") devtools::install_github(\"slowkow/hlabud\")"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"get-a-one-hot-encoded-matrix-for-all-hla-drb1-alleles","dir":"Articles","previous_headings":"","what":"Get a one-hot encoded matrix for all HLA-DRB1 alleles","title":"hlabud usage examples","text":"can use hla_alignments(\"DRB1\") load DRB1_prot.txt file latest IMGTHLA release: object list three items: $sequences amino acid sequence alignments data frame: conventions used alignments (copied EBI): entry allele displayed respect reference sequences. identity reference sequence present base displayed hyphen (-). Non-identity reference sequence shown displaying appropriate base position. insertion deletion occurred represented period (.). sequence unknown point alignment, represented asterisk (*). protein alignments null alleles, ‘Stop’ codons represented hash (X). protein alignments, sequence following termination codon, marked appear blank. conventions used nucleotide protein alignments. Learn alignments : https://www.ebi.ac.uk/ipd/imgt/hla/alignment/help/ $alleles matrix amino acids one column position: $onehot one-hot encoded matrix one column amino acid position:","code":"library(hlabud) a <- hla_alignments(gene = \"DRB1\", verbose = TRUE) #> Reading /home/runner/.local/share/hlabud/.354.0/alignments/DRB1_prot.txt str(a) #> List of 3 #> $ sequences: tibble [3,588 × 2] (S3: tbl_df/tbl/data.frame) #> ..$ allele: chr [1:3588] \"DRB1*01:01:01:01\" \"DRB1*01:01:01:02\" \"DRB1*01:01:01:03\" \"DRB1*01:01:01:04\" ... 
#> ..$ seq : chr [1:3588] \"MVCLKLPGGSCMTALTVTLMVLSSPLALAGDTRPRFLWQLKFECHFFNGTERVR.LLERCIYNQEE.SVRFDSDVGEYRAVTELGRPDAEYWNSQKDLLEQRRAAVDTYCR\"| __truncated__ \"------------------------------------------------------.-----------.--------------------------------------------\"| __truncated__ \"------------------------------------------------------.-----------.--------------------------------------------\"| __truncated__ \"------------------------------------------------------.-----------.--------------------------------------------\"| __truncated__ ... #> $ alleles : chr [1:3588, 1:288] \"M\" \"M\" \"M\" \"M\" ... #> ..- attr(*, \"dimnames\")=List of 2 #> .. ..$ : chr [1:3588] \"DRB1*01:01:01:01\" \"DRB1*01:01:01:02\" \"DRB1*01:01:01:03\" \"DRB1*01:01:01:04\" ... #> .. ..$ : chr [1:288] \"posn29\" \"posn28\" \"posn27\" \"posn26\" ... #> $ onehot : num [1:3588, 1:1786] 0 0 0 0 0 0 0 0 0 0 ... #> ..- attr(*, \"dimnames\")=List of 2 #> .. ..$ : chr [1:3588] \"DRB1*01:01:01:01\" \"DRB1*01:01:01:02\" \"DRB1*01:01:01:03\" \"DRB1*01:01:01:04\" ... #> .. ..$ : chr [1:1786] \"posn29_unk\" \"posn29_M\" \"posn28_unk\" \"posn28_L\" ... 
a$sequences #> # A tibble: 3,588 × 2 #> allele seq #> #> 1 DRB1*01:01:01:01 MVCLKLPGGSCMTALTVTLMVLSSPLALAGDTRPRFLWQLKFECHFFNGTERVR.LLER… #> 2 DRB1*01:01:01:02 ------------------------------------------------------.----… #> 3 DRB1*01:01:01:03 ------------------------------------------------------.----… #> 4 DRB1*01:01:01:04 ------------------------------------------------------.----… #> 5 DRB1*01:01:01:05 ------------------------------------------------------.----… #> 6 DRB1*01:01:01:06 ------------------------------------------------------.----… #> 7 DRB1*01:01:01:07 ------------------------------------------------------.----… #> 8 DRB1*01:01:01:08 ------------------------------------------------------.----… #> 9 DRB1*01:01:01:09 ------------------------------------------------------.----… #> 10 DRB1*01:01:01:10 ------------------------------------------------------.----… #> # ℹ 3,578 more rows a$alleles[1:5,1:7] #> posn29 posn28 posn27 posn26 posn25 posn24 posn23 #> DRB1*01:01:01:01 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" \"P\" #> DRB1*01:01:01:02 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" \"P\" #> DRB1*01:01:01:03 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" \"P\" #> DRB1*01:01:01:04 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" \"P\" #> DRB1*01:01:01:05 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" \"P\" a$onehot[1:5,1:7] #> posn29_unk posn29_M posn28_unk posn28_L posn28_V posn27_unk #> DRB1*01:01:01:01 0 1 0 0 1 0 #> DRB1*01:01:01:02 0 1 0 0 1 0 #> DRB1*01:01:01:03 0 1 0 0 1 0 #> DRB1*01:01:01:04 0 1 0 0 1 0 #> DRB1*01:01:01:05 0 1 0 0 1 0 #> posn27_C #> DRB1*01:01:01:01 1 #> DRB1*01:01:01:02 1 #> DRB1*01:01:01:03 1 #> DRB1*01:01:01:04 1 #> DRB1*01:01:01:05 1"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"convert-genotypes-to-a-dosage-matrix","dir":"Articles","previous_headings":"","what":"Convert genotypes to a dosage matrix","title":"hlabud usage examples","text":"Suppose individuals following genotypes: want run association test amino acid positions, need convert 
genotype names matrix allele dosages (e.g., 0, 1, 2). can use dosage() convert individual’s genotypes amino acid dosages: Note: dosage matrix one row individual one column amino acid position. default, dosage() discard columns individuals identical. first individual dosage=3 pos6_R (position 6 Arg). ’s assigned individual 3 alleles input. Please careful check data looks way expect!","code":"genotypes <- c( \"DRB1*12:02:02:03,DRB1*12:02:02:03,DRB1*14:54:02\", \"DRB1*04:174,DRB1*15:152\", \"DRB1*04:56:02,DRB1*15:01:48\", \"DRB1*14:172,DRB1*04:160\", \"DRB1*04:359,DRB1*04:284:02\" ) dosage <- dosage(genotypes, a$onehot) dosage[,1:4] #> posn29_unk posn29_M pos6_R #> DRB1*12:02:02:03,DRB1*12:02:02:03,DRB1*14:54:02 1 2 3 #> DRB1*04:174,DRB1*15:152 2 0 2 #> DRB1*04:56:02,DRB1*15:01:48 2 0 2 #> DRB1*14:172,DRB1*04:160 2 0 2 #> DRB1*04:359,DRB1*04:284:02 2 0 2 #> pos9_E #> DRB1*12:02:02:03,DRB1*12:02:02:03,DRB1*14:54:02 3 #> DRB1*04:174,DRB1*15:152 1 #> DRB1*04:56:02,DRB1*15:01:48 1 #> DRB1*14:172,DRB1*04:160 2 #> DRB1*04:359,DRB1*04:284:02 2 dim(dosage) #> [1] 5 38"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"logistic-regression-association-for-amino-acid-positions","dir":"Articles","previous_headings":"","what":"Logistic regression association for amino acid positions","title":"hlabud usage examples","text":"Let’s simulate dataset cases controls demonstrate one approach testing amino acid positions might associated cases. simulated dataset 100 individuals, 52 cases 48 controls. also one column amino acid position might want test association case variable. One possible approach association testing use glm() fit logistic regression model amino acid position. reveal amino acid position might associated case variable simulated dataset. volcano shows Odds Ratio P-value amino acid position. top hits P < 0.05 labeled. 
simulation, case variable associated pos123_S (P = 0.026, = 0.52, 95% CI 0.28 0.91).","code":"set.seed(2) n <- 100 d <- data.frame( geno = paste( sample(rownames(a$onehot), n, replace = TRUE), sample(rownames(a$onehot), n, replace = TRUE), sep = \",\" ), age = sample(21:100, n, replace = TRUE), case = sample(0:1, n, replace = TRUE) ) d <- cbind(d, dosage(d$geno, a$onehot)) d[1:5,1:6] #> geno age case #> DRB1*04:256,DRB1*04:125 DRB1*04:256,DRB1*04:125 55 0 #> DRB1*04:11:01:02,DRB1*01:02:12 DRB1*04:11:01:02,DRB1*01:02:12 73 1 #> DRB1*14:08,DRB1*15:01:02 DRB1*14:08,DRB1*15:01:02 72 0 #> DRB1*03:90,DRB1*04:278 DRB1*03:90,DRB1*04:278 22 1 #> DRB1*03:67N,DRB1*03:100:02 DRB1*03:67N,DRB1*03:100:02 34 0 #> posn29_unk posn29_M posn25_K #> DRB1*04:256,DRB1*04:125 2 0 0 #> DRB1*04:11:01:02,DRB1*01:02:12 1 1 1 #> DRB1*14:08,DRB1*15:01:02 0 2 1 #> DRB1*03:90,DRB1*04:278 2 0 0 #> DRB1*03:67N,DRB1*03:100:02 2 0 0 # select the amino acid positions that have at least 3 people with dosage > 0 my_as <- names(which(colSums(d[,4:ncol(d)] > 0) >= 3)) # run the association tests my_glm <- rbindlist(pblapply(my_as, function(my_a) { f <- sprintf(\"case ~ %s\", my_a) glm(as.formula(f), data = d, family = \"binomial\") %>% parameters(exponentiate = TRUE) })) # look at the top hits my_glm %>% arrange(p) %>% filter(!Parameter %in% c(\"(Intercept)\")) %>% head #> Parameter Coefficient SE CI CI_low CI_high z #> 1: pos123_S 0.5161604 0.1528830 0.95 0.282671869 0.9100318 -2.232794 #> 2: pos183_V 0.5570074 0.1572414 0.95 0.314570198 0.9582010 -2.072913 #> 3: pos34_H 2.1982280 0.8658531 0.95 1.053839944 5.0273874 1.999690 #> 4: pos51_A 0.1170213 0.1265210 0.95 0.006179376 0.6749415 -1.984314 #> 5: pos107_S 0.5815232 0.1638237 0.95 0.329174623 1.0003310 -1.924302 #> 6: pos69_L 1.6740699 0.4630002 0.95 0.982434872 2.9247856 1.863017 #> df_error p #> 1: Inf 0.02556252 #> 2: Inf 0.03818041 #> 3: Inf 0.04553374 #> 4: Inf 0.04722090 #> 5: Inf 0.05431680 #> 6: Inf 
0.06245981"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"umap-embedding-of-3588-hla-drb1-alleles","dir":"Articles","previous_headings":"","what":"UMAP embedding of 3,588 HLA-DRB1 alleles","title":"hlabud usage examples","text":"many possibilities analysis one-hot encoding HLA-DRB1 alleles. example, UMAP embedding 3,588 HLA-DRB1 alleles encoded one-hot amino acid matrix 1786 columns, one amino acid position. can highlight alleles amino acid H position 13: can represent amino acid position 13 different color:","code":"umap(a$onehot, n_epochs = 200, min_dist = 1, spread = 2)"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"get-hla-allele-frequencies-from-allele-frequency-net-database-afnd","dir":"Articles","previous_headings":"","what":"Get HLA allele frequencies from Allele Frequency Net Database (AFND)","title":"hlabud usage examples","text":"Download read table HLA allele frequencies Allele Frequency Net Database (AFND). use data, please cite latest manuscript Allele Frequency Net Database: Gonzalez-Galarza FF, McCabe , Santos EJMD, Jones J, Takeshita L, Ortega-Rivera ND, et al. Allele frequency net database (AFND) 2020 update: gold-standard data classification, open access genotype data new query tools. Nucleic Acids Res. 2020;48: D783–D788.
doi:10.1093/nar/gkz1029 Plot frequency specific allele (DQB1*02:01) populations 1000 sampled individuals: See github.com/slowkow/allelefrequencies examples might use data.","code":"af <- hla_frequencies() af #> # A tibble: 123,502 × 7 #> group gene allele population indivs_over_n alleles_over_2n n #> #> 1 hla A A*01:01 Argentina Rosario To… 15.1 0.076 86 #> 2 hla A A*01:01 Armenia combined Reg… NA 0.125 100 #> 3 hla A A*01:01 Australia Cape York … NA 0.053 103 #> 4 hla A A*01:01 Australia Groote Eyl… NA 0.027 75 #> 5 hla A A*01:01 Australia New South … NA 0.187 134 #> 6 hla A A*01:01 Australia Yuendumu A… NA 0.008 191 #> 7 hla A A*01:01 Austria 27 0.146 200 #> 8 hla A A*01:01 Azores Central Islan… NA 0.08 59 #> 9 hla A A*01:01 Azores Oriental Isla… NA 0.115 43 #> 10 hla A A*01:01 Azores Terceira Isla… NA 0.109 130 #> # ℹ 123,492 more rows my_allele <- \"DQB1*02:01\" my_af <- af %>% filter(allele == my_allele) %>% filter(n > 1000) %>% arrange(-alleles_over_2n) ggplot(my_af) + aes(x = alleles_over_2n, y = reorder(population, alleles_over_2n)) + scale_y_discrete(position = \"right\") + geom_colh() + labs( x = \"Allele Frequency (Alleles / 2N)\", y = NULL, title = glue(\"Frequency of {my_allele} across {length(unique(my_af$population))} populations\"), caption = \"Data from AFND http://allelefrequencies.net\" )"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"compute-hla-divergence-with-the-grantham-distance-matrix","dir":"Articles","previous_headings":"","what":"Compute HLA divergence with the Grantham distance matrix","title":"hlabud usage examples","text":"HLA allele binds specific set peptides. , individual two highly dissimilar alleles can bind greater number different peptides homozygous individual (https://doi.org/10.1007/BF02918202): MHC class II allele capacity bind present specific set peptides processed antigens. 
inability specific class II allele bind present fragment derived processed antigen results loss immune responsiveness antigen individuals homozygous class II allele. can compute HLA divergence metric set individuals like : divergence homozygote equal zero, definition: default, use amino acid distance matrix Grantham 1974 (https://doi.org/10.1126/science.185.4154.862). Alternatively, can choose use uniform matrix instead (diagonal values 0, non-diagonal values equal 1): amino acid distance matrix easily accessible, provide two built-options \"grantham\" \"uniform\":","code":"my_genos <- c(\"A*23:01:12,A*24:550\", \"A*25:12N,A*11:27\", \"A*24:381,A*33:85\") hla_divergence(my_genos, method = \"grantham\") #> A*23:01:12,A*24:550 A*25:12N,A*11:27 A*24:381,A*33:85 #> 0.4924242 3.3333333 4.9015152 hla_divergence(\"A*01:01,A*01:01\") #> A*01:01,A*01:01 #> 0 hla_divergence(my_genos, method = \"uniform\") #> A*23:01:12,A*24:550 A*25:12N,A*11:27 A*24:381,A*33:85 #> 0.007575758 0.040404040 0.060606061 amino_distance_matrix(method = \"uniform\") #> A R N D C Q E G H I L K M F P S T W Y V #> A 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> R 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> N 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> D 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> C 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> Q 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> E 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 #> G 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 #> H 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 #> I 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 #> L 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 #> K 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 #> M 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 #> F 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 #> P 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 #> S 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 #> T 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 #> W 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 #> Y 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 #> V 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"download-and-unpack-all-data-from-the-latest-imgthla-release","dir":"Articles","previous_headings":"","what":"Download and unpack all data from the latest IMGTHLA release","title":"hlabud usage examples","text":"want use hla_alignments(), don’t need install_hla() data files downloaded automatically needed cached future use. users might need access additional files present full data release. Run install_hla() download unpack latest IMGTHLA release. destination folder downloaded data files getOption(\"hlabud_dir\") (automatically tailored operating system thanks rappdirs package). examples download releases get list release names. Download latest release (default) specific release: Optionally, get set directory hlabud uses store data: List releases: installing releases, hlabud folder might look like :","code":"# Download all of the data (120MB) for the latest IMGTHLA release install_hla(release = \"latest\") # Download a specific release install_hla(release = \"3.51.0\") getOption(\"hlabud_dir\") #> [1] \"/home/username/.local/share/hlabud\" # Manually override the directory for hlabud to use options(hlabud_dir = \"/path/to/my/dir\") hla_releases() #> [1] \"3.51.0\" \"3.50.0\" \"3.49.0\" \"3.48.0\" \"3.47.0\" \"3.46.0\" \"3.45.1\" \"3.45.01\" #> [9] \"3.45.0.1\" \"3.45.0\" \"3.44.1\" \"3.44.0\" \"3.43.0\" \"3.42.0\" \"3.41.2\" \"3.41.0\" #> [17] \"3.40.0\" \"3.39.0\" \"3.38.0\" \"3.37.0\" \"3.36.0\" \"3.35.0\" \"3.34.0\" \"3.33.0\" #> [25] \"3.32.0\" \"3.31.0\" \"3.30.0\" \"3.29.0\" \"3.28.0\" \"3.27.0\" ❯ ls -lah \"/home/user/.local/share/hlabud\" total 207M drwxrwxr-x 3 user user 32 Apr 5 01:19 3.30.0 drwxrwxr-x 11 user user 4.0K Apr 7 19:31 3.40.0 drwxrwxr-x 12 user user 4.0K Apr 5 00:27 3.51.0 -rw-rw-r-- 1 user user 15K Apr 7 19:23 tags.json -rw-rw-r-- 1 user user 79M Apr 7 19:28 v3.40.0-alpha.tar.gz -rw-rw-r-- 1 user user 129M Apr 4 20:07 
v3.51.0-alpha.tar.gz"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"count-the-number-of-alleles-in-each-imgthla-release","dir":"Articles","previous_headings":"","what":"Count the number of alleles in each IMGTHLA release","title":"hlabud usage examples","text":"can get list release names: can get allele names release: Next, count many alleles release: plot number alleles line plot:","code":"releases <- hla_releases() releases #> [1] \".354.0\" \"3.53.0\" \"3.52.0\" \"3.51.0\" \"3.50.0\" \"3.49.0\" #> [7] \"3.48.0\" \"3.47.0\" \"3.46.0\" \"3.45.1\" \"3.45.01\" \"3.45.0.1\" #> [13] \"3.45.0\" \"3.44.1\" \"3.44.0\" \"3.43.0\" \"3.42.0\" \"3.41.2\" #> [19] \"3.41.0\" \"3.40.0\" \"3.39.0\" \"3.38.0\" \"3.37.0\" \"3.36.0\" #> [25] \"3.35.0\" \"3.34.0\" \"3.33.0\" \"3.32.0\" \"3.31.0\" \"3.30.0\" my_alleles <- rbindlist(lapply(releases, function(release) { retval <- hla_alleles(release = release) retval$release <- release return(retval) }), fill = TRUE) #> Warning in hla_alleles(release = release): unrecognized release name #> 'Allelelist.3451.txt' #> Warning in hla_alleles(release = release): unrecognized release name #> 'Allelelist.34501.txt' #> Warning in hla_alleles(release = release): unrecognized release name #> 'Allelelist.34501.txt' #> Warning in hla_alleles(release = release): unrecognized release name #> 'Allelelist.3441.txt' #> Warning in hla_alleles(release = release): unrecognized release name #> 'Allelelist.3412.txt' d <- my_alleles %>% count(release) %>% filter(n > 1) d #> release n #> 1: .354.0 38416 #> 2: 3.30.0 17509 #> 3: 3.31.0 17874 #> 4: 3.32.0 18363 #> 5: 3.33.0 18955 #> 6: 3.34.0 20272 #> 7: 3.35.0 21683 #> 8: 3.36.0 22548 #> 9: 3.37.0 24093 #> 10: 3.38.0 25958 #> 11: 3.39.0 26512 #> 12: 3.40.0 27273 #> 13: 3.41.0 27980 #> 14: 3.42.0 28786 #> 15: 3.43.0 29417 #> 16: 3.44.0 30523 #> 17: 3.45.0 31552 #> 18: 3.46.0 32330 #> 19: 3.47.0 33552 #> 20: 3.48.0 34145 #> 21: 3.49.0 35077 #> 22: 3.50.0 36016 #> 23: 3.51.0 36625 #> 
24: 3.52.0 37068 #> 25: 3.53.0 37619 #> release n ggplot(d) + aes(x = release, y = n, group = 1) + geom_line() + geom_text(aes(label = release), hjust = 1) + labs(x = NULL, y = \"Number of alleles\", title = \"Each release has more HLA alleles\") + theme( axis.text.x = element_blank(), axis.ticks.x = element_blank(), ) d2 <- my_alleles %>% mutate(gene = str_split_fixed(Allele, \"\\\\*\", 2)[,1]) %>% count(release, gene) ggplot() + aes(x = release, y = n) + geom_line( data = d2, aes(group = gene, color = gene) ) + scale_color_discrete(guide = \"none\") + geom_text( data = d2 %>% filter(release == \"3.52.0\"), mapping = aes(label = gene), hjust = 0 ) + labs(x = NULL, y = \"Number of alleles\", title = \"Number of alleles per release and gene\") + scale_x_discrete(expand = expansion(mult = c(0.01, 0.1))) + scale_y_log10() + theme( panel.grid.major.y = element_line(), axis.text.x = element_blank(), axis.ticks.x = element_blank(), )"},{"path":"https://slowkow.github.io/hlabud/articles/visualize-hla-structure.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Visualize HLA protein structures","text":"Kamil Slowikowski 2023-11-27 vignette, explore different methods visualizing molecular structure HLA proteins. First, ’ll look example use NGLVieweR R package show HLA protein structures. 
Next, ’ll use PyMOL thing.","code":""},{"path":"https://slowkow.github.io/hlabud/articles/visualize-hla-structure.html","id":"what-are-the-pdb-identifiers-for-each-hla-gene","dir":"Articles","previous_headings":"","what":"What are the PDB identifiers for each HLA gene?","title":"Visualize HLA protein structures","text":"list PDB identifiers might consider using represent HLA protein: Also try searching PDB website , e.g., \"HLA-DR\" see appropriate structure analysis.","code":"HLA-A 2xpg HLA-B 2bvp HLA-C 4nt6 HLA-DP 3lqz HLA-DQ 4z7w HLA-DR 3pdo"},{"path":"https://slowkow.github.io/hlabud/articles/visualize-hla-structure.html","id":"using-nglviewer","dir":"Articles","previous_headings":"","what":"Using NGLVieweR","title":"Visualize HLA protein structures","text":"Let’s try visualize position 9 HLA-B protein structure. visualize structure 2bvp Protein Data Bank (PDB). example NGLVieweR R package Niels van der Velden: view , see blue peptide red HLA-B protein. tyrosine position 9 highlighted ball+stick representation, also labeled text label. structure rotating can get better view. can use hlabud answer questions HLA-B Tyr9 (tyrosine position 9). example, HLA-B alleles amino acid position? fraction reported HLA-B alleles Tyr9?","code":"# devtools::install_github(\"nvelden/NGLVieweR\") # we need the latest version library(NGLVieweR) library(magrittr) my_sele <- \"9:A\" NGLVieweR(\"2bvp\") %>% stageParameters( backgroundColor = \"white\", zoomSpeed = 1, cameraFov = 80 ) %>% addRepresentation( type = \"cartoon\" ) %>% addRepresentation( type = \"ball+stick\", param = list( sele = my_sele ) ) %>% addRepresentation( type = \"label\", param = list( sele = my_sele, labelType = \"format\", labelFormat = \"[%(resname)s]%(resno)s\", # or enter custom text labelGrouping = \"residue\", # or \"atom\" (eg.
sele = \"20:A.CB\") color = \"black\", fontFamily = \"sans-serif\", xOffset = 1, yOffset = 0, zOffset = 0, fixedSize = TRUE, radiusType = 1, radiusSize = 5.5, # Label size showBackground = TRUE # backgroundColor=\"black\", # backgroundOpacity=0.5 ) ) %>% zoomMove( center = my_sele, zoom = my_sele, duration = 0, # animation time in ms z_offSet = -20 ) %>% setSpin() library(hlabud) a <- hla_alignments(\"B\") head(names(which(a$onehot[,\"pos9_Y\"] == 1))) #> [1] \"B*07:02:01:01\" \"B*07:02:01:02\" \"B*07:02:01:03\" \"B*07:02:01:04\" #> [5] \"B*07:02:01:05\" \"B*07:02:01:06\" sum(a$onehot[,\"pos9_Y\"] == 1) / nrow(a$onehot) #> [1] 0.7101798"},{"path":"https://slowkow.github.io/hlabud/articles/visualize-hla-structure.html","id":"using-pymol","dir":"Articles","previous_headings":"","what":"Using PyMOL","title":"Visualize HLA protein structures","text":"PyMOL one favorite methods visualizing protein structures, allows us change residue existing protein visualize new mutated protein. takes lines PyMOL create nice figure. example, want quickly highlight positions 13 45 HLA-DQB1, snippet PyMOL code produce figure . Bash script : Write PyMOL script Run PyMOL script pymol command PyMOL script : Load structure Protein Data Bank (PDB). 7kei identifier published protein structure. Color HLA-DQA1 protein teal. Color HLA-DQB1 protein orange. Color peptide purple. color residues 13 45 HLA-DQB1 red. Label residues positions names. Write PNG file view structure. image , manually rotated structure mouse added text labels like \"PDB: 7kei\" saving file.","code":"#!/usr/bin/env bash # Write a pymol script cat << EOF > script.pml fetch 7kei show cartoon remove solvent remove chain D remove chain H color teal, chain A color orange, chain B color purple, chain C color red, chain B & resi 13 color red, chain B & resi 45 label n. CA and chain B & resi 13, \"%s %s\" % (resi, resn) label n.
CA and chain B & resi 45, \"%s %s\" % (resi, resn) png 7kei.png, width=1200, height=800, dpi=300 EOF # On Linux, we can just use `pymol` without making an alias # On macOS, we need to make an alias alias pymol=/Applications/PyMOL.app/Contents/MacOS/PyMOL pymol -c script.pml"},{"path":"https://slowkow.github.io/hlabud/articles/visualize-hla-structure.html","id":"other-pdb-viewers","dir":"Articles","previous_headings":"","what":"Other PDB viewers","title":"Visualize HLA protein structures","text":"Python: https://github.com/nglviewer/nglview Javascript: https://www.rcsb.org/3d-view https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html?mmdbid=7kei&bu=1 https://github.com/nglviewer/ngl https://github.com/biasmv/pv R: https://www.raymolecule.com","code":""},{"path":"https://slowkow.github.io/hlabud/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Kamil Slowikowski. Author, maintainer.","code":""},{"path":"https://slowkow.github.io/hlabud/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"J R, DJ B, X G, MA C, P F, SGE. M (2019). “IPD-IMGT/HLA Database.” Nucleic Acids Research, 48(D1), D948–D955. doi:10.1093/nar/gkz950. Slowikowski K (2023). hlabud: IMGTHLA Data R. 
doi:10.5281/zenodo.8183949, R package version 1.0.0.9999, https://github.com/slowkow/hlabud.","code":"@Article{, author = {Robinson J and Barker DJ and Georgiou X and Cooper MA and Flicek P and Marsh SGE.}, title = {IPD-IMGT/HLA Database}, doi = {10.1093/nar/gkz950}, year = {2019}, month = {oct}, publisher = {Oxford University Press}, volume = {48}, number = {D1}, pages = {D948–D955}, journal = {Nucleic Acids Research}, } @Manual{, title = {{hlabud}: IMGTHLA Data from R}, author = {Kamil Slowikowski}, year = {2023}, note = {R package version 1.0.0.9999}, doi = {10.5281/zenodo.8183949}, url = {https://github.com/slowkow/hlabud}, }"},{"path":"https://slowkow.github.io/hlabud/index.html","id":"hlabud-hla-analysis-in-r-","dir":"","previous_headings":"","what":"Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA","title":"Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA","text":"hlabud provides methods retrieve sequence alignment data IMGTHLA convert data convenient R matrices ready downstream analysis. See usage examples learn use data logistic regression dimensionality reduction. also share tips visualize 3D molecular structure HLA proteins highlight specific amino acid residues. example, let’s consider simple question two HLA genotypes DRB1*04:174 DRB1*15:152. amino acid positions different two genotypes? 
output, can conclude two genotypes nearly identical, different amino acids E W position 9.","code":"library(hlabud) a <- hla_alignments(\"DRB1\") dosage(c(\"DRB1*04:174\", \"DRB1*15:152\"), a$onehot) ## pos9_E pos9_W ## DRB1*04:174 1 0 ## DRB1*15:152 0 1"},{"path":"https://slowkow.github.io/hlabud/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA","text":"quickest way get hlabud install GitHub:","code":"# install.packages(\"devtools\") devtools::install_github(\"slowkow/hlabud\")"},{"path":"https://slowkow.github.io/hlabud/index.html","id":"examples","dir":"","previous_headings":"","what":"Examples","title":"Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA","text":"See usage examples get ideas use hlabud analyses. Get one-hot encoded matrix HLA-DRB1 alleles Convert genotypes dosage matrix Logistic regression association amino acid positions UMAP embedding 3,516 HLA-DRB1 alleles Get HLA allele frequencies Allele Frequency Net Database (AFND) Compute HLA divergence Grantham distance matrix Download unpack data latest IMGTHLA release","code":""},{"path":"https://slowkow.github.io/hlabud/index.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA","text":"hlabud provides access data IMGT/HLA database. Therefore, use hlabud please cite IMGT/HLA paper: Robinson J, Barker DJ, Georgiou X, Cooper MA, Flicek P, Marsh SGE. IPD-IMGT/HLA Database. Nucleic Acids Res. 2020;48: D948–D955. doi:10.1093/nar/gkz950 hlabud also provides access data Allele Frequency Net Database (AFND). Therefore, use hlabud::hla_frequencies() please cite AFND paper: Gonzalez-Galarza FF, McCabe , Santos EJMD, Jones J, Takeshita L, Ortega-Rivera ND, et al. 
Allele frequency net database (AFND) 2020 update: gold-standard data classification, open access genotype data new query tools. Nucleic Acids Res. 2020;48: D783–D788. doi:10.1093/nar/gkz1029 Additionally, can also cite hlabud package like : Slowikowski K. hlabud: methods access analysis human leukocyte antigen (HLA) gene sequence alignments IMGT/HLA. R package version 1.0.0.","code":""},{"path":"https://slowkow.github.io/hlabud/index.html","id":"related-work","dir":"","previous_headings":"","what":"Related work","title":"Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA","text":"recommend article anyone new HLA, beautiful figures help build intuition: La Gruta NL, Gras S, Daley SR, Thomas PG, Rossjohn J. Understanding drivers MHC restriction T cell receptors. Nat Rev Immunol. 2018;18: 467–478. Learn conventions HLA nomenclature: Marsh SGE, Albert ED, Bodmer WF, Bontrop RE, Dupont B, Erlich HA, et al. Nomenclature factors HLA system, 2010. Tissue Antigens. 2010;75: 291–455. case-control analysis HLA genotype data, consider BIGDAWG R package available CRAN. related article: Pappas DJ, Marin W, Hollenbach JA, Mack SJ. Bridging ImmunoGenomic Data Analysis Workflow Gaps (BIGDAWG): integrated case-control analysis pipeline. Hum Immunol. 
2016;77: 283–287.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/amino_distance_matrix.html","id":null,"dir":"Reference","previous_headings":"","what":"Get a pairwise 20x20 distance matrix for all pairs of amino acids — amino_distance_matrix","title":"Get a pairwise 20x20 distance matrix for all pairs of amino acids — amino_distance_matrix","text":"default, return amino acid distance matrix Grantham 1974 (doi:10.1126/science.185.4154.862).","code":""},{"path":"https://slowkow.github.io/hlabud/reference/amino_distance_matrix.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get a pairwise 20x20 distance matrix for all pairs of amino acids — amino_distance_matrix","text":"","code":"amino_distance_matrix(method = \"grantham\")"},{"path":"https://slowkow.github.io/hlabud/reference/amino_distance_matrix.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get a pairwise 20x20 distance matrix for all pairs of amino acids — amino_distance_matrix","text":"method \"grantham\" Grantham 1974 matrix \"uniform\" matrix ones non-diagonal.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/amino_distance_matrix.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get a pairwise 20x20 distance matrix for all pairs of amino acids — amino_distance_matrix","text":"20x20 symmetric matrix positive numbers zeros diagonal.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/amino_distance_matrix.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get a pairwise 20x20 distance matrix for all pairs of amino acids — amino_distance_matrix","text":"","code":"# By default, the Grantham 1974 matrix amino_distance_matrix(\"grantham\") #> A R N D C Q E G H I L K M F P S T W Y #> A 0 112 111 126 195 91 107 60 86 94 96 106 84 113 27 99 58 148 112 #> R 112 0 86 96 180 43 54 125 29 97 102 26 91 97 103 110 71 101 
77 #> N 111 86 0 23 139 46 42 80 68 149 153 94 142 158 91 46 65 174 143 #> D 126 96 23 0 154 61 45 94 81 168 172 101 160 177 108 65 85 181 160 #> C 195 180 139 154 0 154 170 159 174 198 198 202 196 205 169 112 149 215 194 #> Q 91 43 46 61 154 0 29 87 24 109 113 53 101 116 76 68 42 130 99 #> E 107 54 42 45 170 29 0 98 40 134 138 56 126 140 93 80 65 152 122 #> G 60 125 80 94 159 87 98 0 98 135 138 127 127 153 42 56 59 184 147 #> H 86 29 68 81 174 24 40 98 0 94 99 32 87 100 77 89 47 115 83 #> I 94 97 149 168 198 109 134 135 94 0 5 102 10 21 95 142 89 61 33 #> L 96 102 153 172 198 113 138 138 99 5 0 107 15 22 98 145 92 61 36 #> K 106 26 94 101 202 53 56 127 32 102 107 0 95 102 103 121 78 110 85 #> M 84 91 142 160 196 101 126 127 87 10 15 95 0 28 87 135 81 67 36 #> F 113 97 158 177 205 116 140 153 100 21 22 102 28 0 114 155 103 40 22 #> P 27 103 91 108 169 76 93 42 77 95 98 103 87 114 0 74 38 147 110 #> S 99 110 46 65 112 68 80 56 89 142 145 121 135 155 74 0 58 177 144 #> T 58 71 65 85 149 42 65 59 47 89 92 78 81 103 38 58 0 128 92 #> W 148 101 174 181 215 130 152 184 115 61 61 110 67 40 147 177 128 0 37 #> Y 112 77 143 160 194 99 122 147 83 33 36 85 36 22 110 144 92 37 0 #> V 64 96 133 152 192 96 121 109 84 29 32 97 21 50 68 124 69 88 55 #> V #> A 64 #> R 96 #> N 133 #> D 152 #> C 192 #> Q 96 #> E 121 #> G 109 #> H 84 #> I 29 #> L 32 #> K 97 #> M 21 #> F 50 #> P 68 #> S 124 #> T 69 #> W 88 #> Y 55 #> V 0 # All ones, and zeros on the diagonal amino_distance_matrix(\"uniform\") #> A R N D C Q E G H I L K M F P S T W Y V #> A 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> R 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> N 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> D 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> C 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> Q 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> E 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 #> G 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 #> H 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 #> I 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 #> L 1 1 
1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 #> K 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 #> M 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 #> F 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 #> P 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 #> S 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 #> T 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 #> W 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 #> Y 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 #> V 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0"},{"path":"https://slowkow.github.io/hlabud/reference/dosage.html","id":null,"dir":"Reference","previous_headings":"","what":"Convert a set of genotype names into a dosage matrix of each residue at each position — dosage","title":"Convert a set of genotype names into a dosage matrix of each residue at each position — dosage","text":"genotype name, return dosage matrix residue (amino acid nucleotide) position.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/dosage.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Convert a set of genotype names into a dosage matrix of each residue at each position — dosage","text":"","code":"dosage(names, mat, drop_constants = TRUE, drop_duplicates = TRUE)"},{"path":"https://slowkow.github.io/hlabud/reference/dosage.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Convert a set of genotype names into a dosage matrix of each residue at each position — dosage","text":"names Input character vector one genotype individual. entries must present rownames(mat). mat one-hot encoded matrix one row per allele one column residue (amino acid nucleotide) position. drop_constants Filter constant amino acid positions default. 
drop_duplicates Filter duplicate amino acid positions default.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/dosage.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Convert a set of genotype names into a dosage matrix of each residue at each position — dosage","text":"matrix one row input genotype, one column residue position.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/dosage.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Convert a set of genotype names into a dosage matrix of each residue at each position — dosage","text":"genotype represented like \"HLA-*01:01,HLA-*01:01\" default, returned matrix filtered exclude: positions input genotypes allele positions identical previous positions","code":""},{"path":"https://slowkow.github.io/hlabud/reference/dosage.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Convert a set of genotype names into a dosage matrix of each residue at each position — dosage","text":"","code":"DRB1_file <- file.path( \"https://github.com/ANHIG/IMGTHLA/raw\", \"5f2c562056f8ffa89aeea0631f2a52300ee0de17\", \"alignments/DRB1_prot.txt\" ) a <- read_alignments(DRB1_file) genotypes <- c( \"DRB1*12:02:02:03,DRB1*12:02:02:03,DRB1*14:54:02\", \"DRB1*04:174,DRB1*15:152\", \"DRB1*04:56:02,DRB1*15:01:48\", \"DRB1*14:172,DRB1*04:160\", \"DRB1*04:359,DRB1*04:284:02\" ) dosage <- dosage(genotypes, a$onehot) dosage[,1:5] #> posn29_unk posn29_M pos6_R #> DRB1*12:02:02:03,DRB1*12:02:02:03,DRB1*14:54:02 1 2 3 #> DRB1*04:174,DRB1*15:152 2 0 2 #> DRB1*04:56:02,DRB1*15:01:48 2 0 2 #> DRB1*14:172,DRB1*04:160 2 0 2 #> DRB1*04:359,DRB1*04:284:02 2 0 2 #> pos9_E pos9_W #> DRB1*12:02:02:03,DRB1*12:02:02:03,DRB1*14:54:02 3 0 #> DRB1*04:174,DRB1*15:152 1 1 #> DRB1*04:56:02,DRB1*15:01:48 1 1 #> DRB1*14:172,DRB1*04:160 2 0 #> DRB1*04:359,DRB1*04:284:02 2 
0"},{"path":"https://slowkow.github.io/hlabud/reference/get_hlabud_dir.html","id":null,"dir":"Reference","previous_headings":"","what":"Get the name of the folder for caching downloaded IMGTHLA files — get_hlabud_dir","title":"Get the name of the folder for caching downloaded IMGTHLA files — get_hlabud_dir","text":"function : Get folder name getOption(\"hlabud_dir\") else automatically choose appropriate folder operating system thanks rappdirs. Create folder automatically already exist. Set hlabud_dir option new folder.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/get_hlabud_dir.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get the name of the folder for caching downloaded IMGTHLA files — get_hlabud_dir","text":"","code":"get_hlabud_dir()"},{"path":"https://slowkow.github.io/hlabud/reference/get_hlabud_dir.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get the name of the folder for caching downloaded IMGTHLA files — get_hlabud_dir","text":"name folder.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/get_hlabud_dir.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Get the name of the folder for caching downloaded IMGTHLA files — get_hlabud_dir","text":"locations hlabud_dir folder operating system. 
Linux: Mac: Windows: set hlabud_dir option, please use:","code":"~/.local/share/hlabud ~/Library/Application Support/hlabud C:\\Documents and Settings\\{User}\\Application Data\\slowkow\\hlabud options(hlabud_dir = \"/my/favorite/path\")"},{"path":"https://slowkow.github.io/hlabud/reference/get_hlabud_dir.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get the name of the folder for caching downloaded IMGTHLA files — get_hlabud_dir","text":"","code":"if (FALSE) { hlabud_dir <- get_hlabud_dir() }"},{"path":"https://slowkow.github.io/hlabud/reference/get_onehot.html","id":null,"dir":"Reference","previous_headings":"","what":"Make a one-hot encoded matrix from a dataframe of amino acid\nsequences. — get_onehot","title":"Make a one-hot encoded matrix from a dataframe of amino acid\nsequences. — get_onehot","text":"Make one-hot encoded matrix dataframe amino acid sequences.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/get_onehot.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Make a one-hot encoded matrix from a dataframe of amino acid\nsequences. — get_onehot","text":"","code":"get_onehot(al, n_pre)"},{"path":"https://slowkow.github.io/hlabud/reference/get_onehot.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Make a one-hot encoded matrix from a dataframe of amino acid\nsequences. — get_onehot","text":"al dataframe columns allele, seq n_pre number amino acid sequences position 1.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_alignments.html","id":null,"dir":"Reference","previous_headings":"","what":"Get sequence alignments from IMGTHLA — hla_alignments","title":"Get sequence alignments from IMGTHLA — hla_alignments","text":"conventions used alignments (EBI IMGT-HLA help page): entry allele displayed respect reference sequences. identity reference sequence present base displayed hyphen (-). 
Non-identity reference sequence shown displaying appropriate base position. insertion deletion occurred represented period (.). sequence unknown point alignment, represented asterisk (*). protein alignments null alleles, 'Stop' codons represented hash (X). protein alignments, sequence following termination codon, marked appear blank. conventions used nucleotide protein alignments.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_alignments.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get sequence alignments from IMGTHLA — hla_alignments","text":"","code":"hla_alignments( gene = \"DRB1\", type = \"prot\", release = \"latest\", verbose = FALSE )"},{"path":"https://slowkow.github.io/hlabud/reference/hla_alignments.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get sequence alignments from IMGTHLA — hla_alignments","text":"gene name gene like \"DRB1\" type type sequence, one \"prot\", \"nuc\", \"gen\" release Default \"latest\". release name like \"3.51.0\". verbose TRUE, print messages along way.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_alignments.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get sequence alignments from IMGTHLA — hla_alignments","text":"list dataframe called sequences two matrices alleles onehot. dataframe two columns: allele: name allele, e.g., DQB*01:01 seq: amino acid sequence matrix alleles one row allele, one column position, values representing residues position allele. 
matrix onehot one-hot encoding variants distinguish alleles, one row allele one column amino acid position.","code":""},{"path":[]},{"path":"https://slowkow.github.io/hlabud/reference/hla_alignments.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get sequence alignments from IMGTHLA — hla_alignments","text":"","code":"# \\donttest{ a <- hla_alignments(\"DRB1\") head(a$sequences) #> # A tibble: 6 × 2 #> allele seq #> #> 1 DRB1*01:01:01:01 MVCLKLPGGSCMTALTVTLMVLSSPLALAGDTRPRFLWQLKFECHFFNGTERVR.LLERC… #> 2 DRB1*01:01:01:02 ------------------------------------------------------.-----… #> 3 DRB1*01:01:01:03 ------------------------------------------------------.-----… #> 4 DRB1*01:01:01:04 ------------------------------------------------------.-----… #> 5 DRB1*01:01:01:05 ------------------------------------------------------.-----… #> 6 DRB1*01:01:01:06 ------------------------------------------------------.-----… a$alleles[1:6,1:6] #> posn29 posn28 posn27 posn26 posn25 posn24 #> DRB1*01:01:01:01 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" #> DRB1*01:01:01:02 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" #> DRB1*01:01:01:03 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" #> DRB1*01:01:01:04 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" #> DRB1*01:01:01:05 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" #> DRB1*01:01:01:06 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" a$onehot[1:6,1:6] #> posn29_unk posn29_M posn28_unk posn28_L posn28_V posn27_unk #> DRB1*01:01:01:01 0 1 0 0 1 0 #> DRB1*01:01:01:02 0 1 0 0 1 0 #> DRB1*01:01:01:03 0 1 0 0 1 0 #> DRB1*01:01:01:04 0 1 0 0 1 0 #> DRB1*01:01:01:05 0 1 0 0 1 0 #> DRB1*01:01:01:06 0 1 0 0 1 0 # }"},{"path":"https://slowkow.github.io/hlabud/reference/hla_alleles.html","id":null,"dir":"Reference","previous_headings":"","what":"Get a table of allele names for a particular IMGTHLA release — hla_alleles","title":"Get a table of allele names for a particular IMGTHLA release — hla_alleles","text":"Download list allele names HLA genes particular 
IMGTHLA release.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_alleles.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get a table of allele names for a particular IMGTHLA release — hla_alleles","text":"","code":"hla_alleles(release = \"latest\", overwrite = FALSE, verbose = FALSE)"},{"path":"https://slowkow.github.io/hlabud/reference/hla_alleles.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get a table of allele names for a particular IMGTHLA release — hla_alleles","text":"release Default \"latest\". release name like \"3.51.0\". overwrite Overwrite existing alleles.json file Allelelist.{version}.txt file verbose TRUE, print messages along way.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_alleles.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get a table of allele names for a particular IMGTHLA release — hla_alleles","text":"data frame HLA allele ids names","code":""},{"path":[]},{"path":"https://slowkow.github.io/hlabud/reference/hla_alleles.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get a table of allele names for a particular IMGTHLA release — hla_alleles","text":"","code":"# \\donttest{ head(hla_alleles()) #> AlleleID Allele #> 1 HLA00001 A*01:01:01:01 #> 2 HLA02169 A*01:01:01:02N #> 3 HLA14798 A*01:01:01:03 #> 4 HLA15760 A*01:01:01:04 #> 5 HLA16415 A*01:01:01:05 #> 6 HLA16417 A*01:01:01:06 # }"},{"path":"https://slowkow.github.io/hlabud/reference/hla_divergence.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculate HLA divergence for each individual — hla_divergence","title":"Calculate HLA divergence for each individual — hla_divergence","text":"First, convert allele name (e.g. *01:01) amino acid sequence. divergence sum distances pair amino acids position, divided total sequence length. 
amino acid distance matrix use one published Grantham 1974 (doi:10.1126/science.185.4154.862), based three physical properties amino acids (composition, polarity, molecular volume) correlated estimate relative substitution frequency.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_divergence.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculate HLA divergence for each individual — hla_divergence","text":"","code":"hla_divergence( alleles = c(\"A*01:01,A*02:01\"), method = \"grantham\", release = \"latest\" )"},{"path":"https://slowkow.github.io/hlabud/reference/hla_divergence.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculate HLA divergence for each individual — hla_divergence","text":"alleles character vector comma-delimited alleles individual. usually expect two alleles per individual, possible (fewer) copies due copy number alterations. function still works individual different number alleles. method pairwise amino acid matrix, method name: \"grantham\" \"uniform\" indicate pairwise amino acid distance matrix use. choose pass matrix, 20x20 symmetric matrix zeros diagonal, rownames colnames one-letter amino acid codes R N D C Q E G H L K M F P S T W Y V. release Default \"latest\". release name like \"3.51.0\".","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_divergence.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculate HLA divergence for each individual — hla_divergence","text":"dataframe divergence individual.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_divergence.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Calculate HLA divergence for each individual — hla_divergence","text":"code function translation original Perl code Tobias Lenz, published Pierini & Lenz 2018 MolBiolEvol (https://doi.org/10.1093/molbev/msy116). 
comparing two amino acid sequences, characters one 20 amino acids considered divergence calculation, gaps (characters) count.","code":""},{"path":[]},{"path":"https://slowkow.github.io/hlabud/reference/hla_divergence.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculate HLA divergence for each individual — hla_divergence","text":"","code":"my_genos <- c(\"A*23:01:12,A*24:550\", \"A*25:12N,A*11:27\", \"A*24:381,A*33:85\", \"A*01:01:,A*01:01,A*02:01\") hla_divergence(my_genos, method = \"grantham\") #> A*23:01:12,A*24:550 A*25:12N,A*11:27 A*24:381,A*33:85 #> 0.4924242 3.3333333 4.9015152 #> A*01:01:,A*01:01,A*02:01 #> 3.8367003 # This is equivalent hla_divergence(my_genos, method = amino_distance_matrix(\"grantham\")) #> A*23:01:12,A*24:550 A*25:12N,A*11:27 A*24:381,A*33:85 #> 0.4924242 3.3333333 4.9015152 #> A*01:01:,A*01:01,A*02:01 #> 3.8367003"},{"path":"https://slowkow.github.io/hlabud/reference/hla_frequencies.html","id":null,"dir":"Reference","previous_headings":"","what":"Get HLA frequences from Allele Frequency Net Database (AFND) — hla_frequencies","title":"Get HLA frequences from Allele Frequency Net Database (AFND) — hla_frequencies","text":"Download read table HLA allele frequencies Allele Frequency Net Database (AFND).","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_frequencies.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get HLA frequences from Allele Frequency Net Database (AFND) — hla_frequencies","text":"","code":"hla_frequencies(verbose = FALSE)"},{"path":"https://slowkow.github.io/hlabud/reference/hla_frequencies.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get HLA frequences from Allele Frequency Net Database (AFND) — hla_frequencies","text":"verbose TRUE, print messages along 
way.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_frequencies.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get HLA frequences from Allele Frequency Net Database (AFND) — hla_frequencies","text":"dataframe HLA allele frequencies genes.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_frequencies.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Get HLA frequences from Allele Frequency Net Database (AFND) — hla_frequencies","text":"use data, please cite latest manuscript Allele Frequency Net Database: Gonzalez-Galarza FF, McCabe , Santos EJMD, Jones J, Takeshita L, Ortega-Rivera ND, et al. Allele frequency net database (AFND) 2020 update: gold-standard data classification, open access genotype data new query tools. Nucleic Acids Res. 2020;48: D783–D788. doi:10.1093/nar/gkz1029","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_frequencies.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get HLA frequences from Allele Frequency Net Database (AFND) — hla_frequencies","text":"","code":"# \\donttest{ hla_frequencies() #> # A tibble: 123,502 × 7 #> group gene allele population indivs_over_n alleles_over_2n n #> #> 1 hla A A*01:01 Argentina Rosario To… 15.1 0.076 86 #> 2 hla A A*01:01 Armenia combined Reg… NA 0.125 100 #> 3 hla A A*01:01 Australia Cape York … NA 0.053 103 #> 4 hla A A*01:01 Australia Groote Eyl… NA 0.027 75 #> 5 hla A A*01:01 Australia New South … NA 0.187 134 #> 6 hla A A*01:01 Australia Yuendumu A… NA 0.008 191 #> 7 hla A A*01:01 Austria 27 0.146 200 #> 8 hla A A*01:01 Azores Central Islan… NA 0.08 59 #> 9 hla A A*01:01 Azores Oriental Isla… NA 0.115 43 #> 10 hla A A*01:01 Azores Terceira Isla… NA 0.109 130 #> # ℹ 123,492 more rows # }"},{"path":"https://slowkow.github.io/hlabud/reference/hla_genes.html","id":null,"dir":"Reference","previous_headings":"","what":"Get HLA gene 
names from IMGTHLA — hla_genes","title":"Get HLA gene names from IMGTHLA — hla_genes","text":"Retrieve list txt files github.com/ANHIG/IMGTHLA/alignments return list gene names derived file names.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_genes.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get HLA gene names from IMGTHLA — hla_genes","text":"","code":"hla_genes(release = \"latest\", overwrite = FALSE, verbose = FALSE)"},{"path":"https://slowkow.github.io/hlabud/reference/hla_genes.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get HLA gene names from IMGTHLA — hla_genes","text":"release Default \"latest\". release name like \"3.51.0\". overwrite Overwrite existing genes.json file new one GitHub verbose TRUE, print messages along way.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_genes.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get HLA gene names from IMGTHLA — hla_genes","text":"tibble two columns: HLA gene names (\"\", \"DRB1\") types (\"nuc\", \"gen\", \"prot\").","code":""},{"path":[]},{"path":"https://slowkow.github.io/hlabud/reference/hla_genes.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get HLA gene names from IMGTHLA — hla_genes","text":"","code":"# \\donttest{ hla_genes() #> # A tibble: 107 × 2 #> gene type #> #> 1 A gen #> 2 A nuc #> 3 A prot #> 4 B gen #> 5 B nuc #> 6 B prot #> 7 C gen #> 8 C nuc #> 9 C prot #> 10 DMA gen #> # ℹ 97 more rows # }"},{"path":"https://slowkow.github.io/hlabud/reference/hla_releases.html","id":null,"dir":"Reference","previous_headings":"","what":"Get the names of releases from IMGTHLA — hla_releases","title":"Get the names of releases from IMGTHLA — hla_releases","text":"Get tags github.com/ANHIG/IMGTHLA, save file called tags.json getOption(\"hlabud_dir\"), return release names 
file.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_releases.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get the names of releases from IMGTHLA — hla_releases","text":"","code":"hla_releases(overwrite = FALSE)"},{"path":"https://slowkow.github.io/hlabud/reference/hla_releases.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get the names of releases from IMGTHLA — hla_releases","text":"overwrite Overwrite existing tags.json file getOption(\"hlabud_dir\")","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_releases.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get the names of releases from IMGTHLA — hla_releases","text":"character vector release names like \"3.51.0\"","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_releases.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get the names of releases from IMGTHLA — hla_releases","text":"","code":"# \\donttest{ hla_releases() #> [1] \".354.0\" \"3.53.0\" \"3.52.0\" \"3.51.0\" \"3.50.0\" \"3.49.0\" #> [7] \"3.48.0\" \"3.47.0\" \"3.46.0\" \"3.45.1\" \"3.45.01\" \"3.45.0.1\" #> [13] \"3.45.0\" \"3.44.1\" \"3.44.0\" \"3.43.0\" \"3.42.0\" \"3.41.2\" #> [19] \"3.41.0\" \"3.40.0\" \"3.39.0\" \"3.38.0\" \"3.37.0\" \"3.36.0\" #> [25] \"3.35.0\" \"3.34.0\" \"3.33.0\" \"3.32.0\" \"3.31.0\" \"3.30.0\" # }"},{"path":"https://slowkow.github.io/hlabud/reference/hlabud-package.html","id":null,"dir":"Reference","previous_headings":"","what":"hlabud: Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA — hlabud-package","title":"hlabud: Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA — hlabud-package","text":"Fetch sequence alignment data IMGTHLA database Robinson et al (2020) doi:10.1093/nar/gkz950 , 
automatically convert sequence alignments convenient R matrices ready downstream analysis. vignette shows examples using one-hot encoding data logistic regression dimensionality reduction. Data downloaded lazily, -needed, cached user-configurable folder.","code":""},{"path":[]},{"path":"https://slowkow.github.io/hlabud/reference/hlabud-package.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"hlabud: Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA — hlabud-package","text":"Maintainer: Kamil Slowikowski kslowikowski@gmail.com (ORCID)","code":""},{"path":"https://slowkow.github.io/hlabud/reference/install_hla.html","id":null,"dir":"Reference","previous_headings":"","what":"Download and unpack a tarball release from IMGTHLA — install_hla","title":"Download and unpack a tarball release from IMGTHLA — install_hla","text":"release tarball Github unpacked getOption(\"hlabud_dir\") folder.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/install_hla.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Download and unpack a tarball release from IMGTHLA — install_hla","text":"","code":"install_hla(release = \"latest\", overwrite = FALSE, verbose = FALSE)"},{"path":"https://slowkow.github.io/hlabud/reference/install_hla.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Download and unpack a tarball release from IMGTHLA — install_hla","text":"release Default \"latest\". release name like \"3.51.0\". overwrite TRUE, overwrite existing files release folder. 
verbose TRUE, print messages along way.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/install_hla.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Download and unpack a tarball release from IMGTHLA — install_hla","text":"Note latest releases 100 MB size, download might take slow connections.","code":""},{"path":[]},{"path":"https://slowkow.github.io/hlabud/reference/install_hla.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Download and unpack a tarball release from IMGTHLA — install_hla","text":"","code":"if (FALSE) { install_hla() install_hla(\"3.51.0\") install_hla(\"3.51.0\", verbose = TRUE) # Change the install directory options(hlabud_dir = \"path/to/my/dir\") install_hla() }"},{"path":"https://slowkow.github.io/hlabud/reference/one_to_three.html","id":null,"dir":"Reference","previous_headings":"","what":"Convert one letter amino acid codes to three letter amino acid codes — one_to_three","title":"Convert one letter amino acid codes to three letter amino acid codes — one_to_three","text":"Convert one letter amino acid codes three letter amino acid codes","code":""},{"path":"https://slowkow.github.io/hlabud/reference/one_to_three.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Convert one letter amino acid codes to three letter amino acid codes — one_to_three","text":"","code":"one_to_three(aminos)"},{"path":"https://slowkow.github.io/hlabud/reference/pipe.html","id":null,"dir":"Reference","previous_headings":"","what":"Pipe operator — %>%","title":"Pipe operator — %>%","text":"See magrittr::%>% details.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/pipe.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Pipe operator — %>%","text":"lhs value magrittr placeholder. 
rhs function call using magrittr semantics.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/pipe.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Pipe operator — %>%","text":"result calling rhs(lhs).","code":""},{"path":"https://slowkow.github.io/hlabud/reference/read_alignments.html","id":null,"dir":"Reference","previous_headings":"","what":"Read an alignment file *_(nuc|gen|prot).txt from IMGTHLA — read_alignments","title":"Read an alignment file *_(nuc|gen|prot).txt from IMGTHLA — read_alignments","text":"function reads txt files provided IMGTHLA.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/read_alignments.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Read an alignment file *_(nuc|gen|prot).txt from IMGTHLA — read_alignments","text":"","code":"read_alignments(file)"},{"path":"https://slowkow.github.io/hlabud/reference/read_alignments.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Read an alignment file *_(nuc|gen|prot).txt from IMGTHLA — read_alignments","text":"file File name txt file IMGTHLA like \"DQB1_prot.txt\"","code":""},{"path":"https://slowkow.github.io/hlabud/reference/read_alignments.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Read an alignment file *_(nuc|gen|prot).txt from IMGTHLA — read_alignments","text":"list dataframe called sequences two matrices alleles onehot. dataframe two columns: allele: name allele, e.g., DQB*01:01 seq: amino acid sequence matrix alleles one row allele, one column position, values representing residues position allele. 
matrix onehot one-hot encoding variants distinguish alleles, one row allele one column amino acid position.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/read_alignments.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Read an alignment file *_(nuc|gen|prot).txt from IMGTHLA — read_alignments","text":"Consider using hla_alignments() instead function. already txt file want read, can read read_alignments(\"myfile.txt\"). sequences contained file: {gene}_prot.txt amino acid sequence HLA allele. {gene}_nuc.txt nucleotide sequence exons. {gene}_gen.txt genomic sequence exons introns.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/read_alignments.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Read an alignment file *_(nuc|gen|prot).txt from IMGTHLA — read_alignments","text":"","code":"my_file <- file.path( \"https://github.com/ANHIG/IMGTHLA/raw\", \"5f2c562056f8ffa89aeea0631f2a52300ee0de17\", \"alignments/DRB1_prot.txt\" ) a <- read_alignments(my_file) head(a$sequences) #> # A tibble: 6 × 2 #> allele seq #> #> 1 DRB1*01:01:01:01 MVCLKLPGGSCMTALTVTLMVLSSPLALAGDTRPRFLWQLKFECHFFNGTERVR.LLERC… #> 2 DRB1*01:01:01:02 ------------------------------------------------------.-----… #> 3 DRB1*01:01:01:03 ------------------------------------------------------.-----… #> 4 DRB1*01:01:01:04 ------------------------------------------------------.-----… #> 5 DRB1*01:01:01:05 ------------------------------------------------------.-----… #> 6 DRB1*01:01:01:06 ------------------------------------------------------.-----… a$alleles[1:5,1:5] #> posn29 posn28 posn27 posn26 posn25 #> DRB1*01:01:01:01 \"M\" \"V\" \"C\" \"L\" \"K\" #> DRB1*01:01:01:02 \"M\" \"V\" \"C\" \"L\" \"K\" #> DRB1*01:01:01:03 \"M\" \"V\" \"C\" \"L\" \"K\" #> DRB1*01:01:01:04 \"M\" \"V\" \"C\" \"L\" \"K\" #> DRB1*01:01:01:05 \"M\" \"V\" \"C\" \"L\" \"K\" a$onehot[1:5,1:5] #> posn29_unk posn29_M 
posn28_unk posn28_V posn27_unk #> DRB1*01:01:01:01 0 1 0 1 0 #> DRB1*01:01:01:02 0 1 0 1 0 #> DRB1*01:01:01:03 0 1 0 1 0 #> DRB1*01:01:01:04 0 1 0 1 0 #> DRB1*01:01:01:05 0 1 0 1 0"},{"path":"https://slowkow.github.io/hlabud/news/index.html","id":"hlabud-1009999","dir":"Changelog","previous_headings":"","what":"hlabud 1.0.0.9999","title":"hlabud 1.0.0.9999","text":"Instead discarding positions *, include label unk, example pos241_unk indicates unknown amino acid position 241. Thanks Sreekar Mantena reporting issue! Fix --one error. example, HLA-pos361_- colnames($onehot) reference allele instead -. now fixed. Thanks Sreekar Mantena reporting issue!","code":""},{"path":"https://slowkow.github.io/hlabud/news/index.html","id":"hlabud-100","dir":"Changelog","previous_headings":"","what":"hlabud 1.0.0","title":"hlabud 1.0.0","text":"Added NEWS.md file track changes package.","code":""}] +[{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":null,"dir":"","previous_headings":"","what":"GNU General Public License","title":"GNU General Public License","text":"Version 3, 29 June 2007Copyright © 2007 Free Software Foundation, Inc.  Everyone permitted copy distribute verbatim copies license document, changing allowed.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"preamble","dir":"","previous_headings":"","what":"Preamble","title":"GNU General Public License","text":"GNU General Public License free, copyleft license software kinds works. licenses software practical works designed take away freedom share change works. contrast, GNU General Public License intended guarantee freedom share change versions program–make sure remains free software users. , Free Software Foundation, use GNU General Public License software; applies also work released way authors. can apply programs, . speak free software, referring freedom, price. 
General Public Licenses designed make sure freedom distribute copies free software (charge wish), receive source code can get want , can change software use pieces new free programs, know can things. protect rights, need prevent others denying rights asking surrender rights. Therefore, certain responsibilities distribute copies software, modify : responsibilities respect freedom others. example, distribute copies program, whether gratis fee, must pass recipients freedoms received. must make sure , , receive can get source code. must show terms know rights. Developers use GNU GPL protect rights two steps: (1) assert copyright software, (2) offer License giving legal permission copy, distribute /modify . developers’ authors’ protection, GPL clearly explains warranty free software. users’ authors’ sake, GPL requires modified versions marked changed, problems attributed erroneously authors previous versions. devices designed deny users access install run modified versions software inside , although manufacturer can . fundamentally incompatible aim protecting users’ freedom change software. systematic pattern abuse occurs area products individuals use, precisely unacceptable. Therefore, designed version GPL prohibit practice products. problems arise substantially domains, stand ready extend provision domains future versions GPL, needed protect freedom users. Finally, every program threatened constantly software patents. States allow patents restrict development use software general-purpose computers, , wish avoid special danger patents applied free program make effectively proprietary. prevent , GPL assures patents used render program non-free. precise terms conditions copying, distribution modification follow.","code":""},{"path":[]},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_0-definitions","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"0. 
Definitions","title":"GNU General Public License","text":"“License” refers version 3 GNU General Public License. “Copyright” also means copyright-like laws apply kinds works, semiconductor masks. “Program” refers copyrightable work licensed License. licensee addressed “”. “Licensees” “recipients” may individuals organizations. “modify” work means copy adapt part work fashion requiring copyright permission, making exact copy. resulting work called “modified version” earlier work work “based ” earlier work. “covered work” means either unmodified Program work based Program. “propagate” work means anything , without permission, make directly secondarily liable infringement applicable copyright law, except executing computer modifying private copy. Propagation includes copying, distribution (without modification), making available public, countries activities well. “convey” work means kind propagation enables parties make receive copies. Mere interaction user computer network, transfer copy, conveying. interactive user interface displays “Appropriate Legal Notices” extent includes convenient prominently visible feature (1) displays appropriate copyright notice, (2) tells user warranty work (except extent warranties provided), licensees may convey work License, view copy License. interface presents list user commands options, menu, prominent item list meets criterion.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_1-source-code","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"1. Source Code","title":"GNU General Public License","text":"“source code” work means preferred form work making modifications . “Object code” means non-source form work. “Standard Interface” means interface either official standard defined recognized standards body, , case interfaces specified particular programming language, one widely used among developers working language. 
“System Libraries” executable work include anything, work whole, () included normal form packaging Major Component, part Major Component, (b) serves enable use work Major Component, implement Standard Interface implementation available public source code form. “Major Component”, context, means major essential component (kernel, window system, ) specific operating system () executable work runs, compiler used produce work, object code interpreter used run . “Corresponding Source” work object code form means source code needed generate, install, (executable work) run object code modify work, including scripts control activities. However, include work’s System Libraries, general-purpose tools generally available free programs used unmodified performing activities part work. example, Corresponding Source includes interface definition files associated source files work, source code shared libraries dynamically linked subprograms work specifically designed require, intimate data communication control flow subprograms parts work. Corresponding Source need include anything users can regenerate automatically parts Corresponding Source. Corresponding Source work source code form work.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_2-basic-permissions","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"2. Basic Permissions","title":"GNU General Public License","text":"rights granted License granted term copyright Program, irrevocable provided stated conditions met. License explicitly affirms unlimited permission run unmodified Program. output running covered work covered License output, given content, constitutes covered work. License acknowledges rights fair use equivalent, provided copyright law. may make, run propagate covered works convey, without conditions long license otherwise remains force. 
may convey covered works others sole purpose make modifications exclusively , provide facilities running works, provided comply terms License conveying material control copyright. thus making running covered works must exclusively behalf, direction control, terms prohibit making copies copyrighted material outside relationship . Conveying circumstances permitted solely conditions stated . Sublicensing allowed; section 10 makes unnecessary.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_3-protecting-users-legal-rights-from-anti-circumvention-law","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"3. Protecting Users’ Legal Rights From Anti-Circumvention Law","title":"GNU General Public License","text":"covered work shall deemed part effective technological measure applicable law fulfilling obligations article 11 WIPO copyright treaty adopted 20 December 1996, similar laws prohibiting restricting circumvention measures. convey covered work, waive legal power forbid circumvention technological measures extent circumvention effected exercising rights License respect covered work, disclaim intention limit operation modification work means enforcing, work’s users, third parties’ legal rights forbid circumvention technological measures.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_4-conveying-verbatim-copies","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"4. Conveying Verbatim Copies","title":"GNU General Public License","text":"may convey verbatim copies Program’s source code receive , medium, provided conspicuously appropriately publish copy appropriate copyright notice; keep intact notices stating License non-permissive terms added accord section 7 apply code; keep intact notices absence warranty; give recipients copy License along Program. 
may charge price price copy convey, may offer support warranty protection fee.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_5-conveying-modified-source-versions","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"5. Conveying Modified Source Versions","title":"GNU General Public License","text":"may convey work based Program, modifications produce Program, form source code terms section 4, provided also meet conditions: ) work must carry prominent notices stating modified , giving relevant date. b) work must carry prominent notices stating released License conditions added section 7. requirement modifies requirement section 4 “keep intact notices”. c) must license entire work, whole, License anyone comes possession copy. License therefore apply, along applicable section 7 additional terms, whole work, parts, regardless packaged. License gives permission license work way, invalidate permission separately received . d) work interactive user interfaces, must display Appropriate Legal Notices; however, Program interactive interfaces display Appropriate Legal Notices, work need make . compilation covered work separate independent works, nature extensions covered work, combined form larger program, volume storage distribution medium, called “aggregate” compilation resulting copyright used limit access legal rights compilation’s users beyond individual works permit. Inclusion covered work aggregate cause License apply parts aggregate.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_6-conveying-non-source-forms","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"6. 
Conveying Non-Source Forms","title":"GNU General Public License","text":"may convey covered work object code form terms sections 4 5, provided also convey machine-readable Corresponding Source terms License, one ways: ) Convey object code , embodied , physical product (including physical distribution medium), accompanied Corresponding Source fixed durable physical medium customarily used software interchange. b) Convey object code , embodied , physical product (including physical distribution medium), accompanied written offer, valid least three years valid long offer spare parts customer support product model, give anyone possesses object code either (1) copy Corresponding Source software product covered License, durable physical medium customarily used software interchange, price reasonable cost physically performing conveying source, (2) access copy Corresponding Source network server charge. c) Convey individual copies object code copy written offer provide Corresponding Source. alternative allowed occasionally noncommercially, received object code offer, accord subsection 6b. d) Convey object code offering access designated place (gratis charge), offer equivalent access Corresponding Source way place charge. need require recipients copy Corresponding Source along object code. place copy object code network server, Corresponding Source may different server (operated third party) supports equivalent copying facilities, provided maintain clear directions next object code saying find Corresponding Source. Regardless server hosts Corresponding Source, remain obligated ensure available long needed satisfy requirements. e) Convey object code using peer--peer transmission, provided inform peers object code Corresponding Source work offered general public charge subsection 6d. separable portion object code, whose source code excluded Corresponding Source System Library, need included conveying object code work. 
“User Product” either (1) “consumer product”, means tangible personal property normally used personal, family, household purposes, (2) anything designed sold incorporation dwelling. determining whether product consumer product, doubtful cases shall resolved favor coverage. particular product received particular user, “normally used” refers typical common use class product, regardless status particular user way particular user actually uses, expects expected use, product. product consumer product regardless whether product substantial commercial, industrial non-consumer uses, unless uses represent significant mode use product. “Installation Information” User Product means methods, procedures, authorization keys, information required install execute modified versions covered work User Product modified version Corresponding Source. information must suffice ensure continued functioning modified object code case prevented interfered solely modification made. convey object code work section , , specifically use , User Product, conveying occurs part transaction right possession use User Product transferred recipient perpetuity fixed term (regardless transaction characterized), Corresponding Source conveyed section must accompanied Installation Information. requirement apply neither third party retains ability install modified object code User Product (example, work installed ROM). requirement provide Installation Information include requirement continue provide support service, warranty, updates work modified installed recipient, User Product modified installed. Access network may denied modification materially adversely affects operation network violates rules protocols communication across network. 
Corresponding Source conveyed, Installation Information provided, accord section must format publicly documented (implementation available public source code form), must require special password key unpacking, reading copying.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_7-additional-terms","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"7. Additional Terms","title":"GNU General Public License","text":"“Additional permissions” terms supplement terms License making exceptions one conditions. Additional permissions applicable entire Program shall treated though included License, extent valid applicable law. additional permissions apply part Program, part may used separately permissions, entire Program remains governed License without regard additional permissions. convey copy covered work, may option remove additional permissions copy, part . (Additional permissions may written require removal certain cases modify work.) may place additional permissions material, added covered work, can give appropriate copyright permission. Notwithstanding provision License, material add covered work, may (authorized copyright holders material) supplement terms License terms: ) Disclaiming warranty limiting liability differently terms sections 15 16 License; b) Requiring preservation specified reasonable legal notices author attributions material Appropriate Legal Notices displayed works containing ; c) Prohibiting misrepresentation origin material, requiring modified versions material marked reasonable ways different original version; d) Limiting use publicity purposes names licensors authors material; e) Declining grant rights trademark law use trade names, trademarks, service marks; f) Requiring indemnification licensors authors material anyone conveys material (modified versions ) contractual assumptions liability recipient, liability contractual assumptions directly impose licensors authors. 
non-permissive additional terms considered “restrictions” within meaning section 10. Program received , part , contains notice stating governed License along term restriction, may remove term. license document contains restriction permits relicensing conveying License, may add covered work material governed terms license document, provided restriction survive relicensing conveying. add terms covered work accord section, must place, relevant source files, statement additional terms apply files, notice indicating find applicable terms. Additional terms, permissive non-permissive, may stated form separately written license, stated exceptions; requirements apply either way.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_8-termination","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"8. Termination","title":"GNU General Public License","text":"may propagate modify covered work except expressly provided License. attempt otherwise propagate modify void, automatically terminate rights License (including patent licenses granted third paragraph section 11). However, cease violation License, license particular copyright holder reinstated () provisionally, unless copyright holder explicitly finally terminates license, (b) permanently, copyright holder fails notify violation reasonable means prior 60 days cessation. Moreover, license particular copyright holder reinstated permanently copyright holder notifies violation reasonable means, first time received notice violation License (work) copyright holder, cure violation prior 30 days receipt notice. Termination rights section terminate licenses parties received copies rights License. rights terminated permanently reinstated, qualify receive new licenses material section 10.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_9-acceptance-not-required-for-having-copies","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"9. 
Acceptance Not Required for Having Copies","title":"GNU General Public License","text":"required accept License order receive run copy Program. Ancillary propagation covered work occurring solely consequence using peer--peer transmission receive copy likewise require acceptance. However, nothing License grants permission propagate modify covered work. actions infringe copyright accept License. Therefore, modifying propagating covered work, indicate acceptance License .","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_10-automatic-licensing-of-downstream-recipients","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"10. Automatic Licensing of Downstream Recipients","title":"GNU General Public License","text":"time convey covered work, recipient automatically receives license original licensors, run, modify propagate work, subject License. responsible enforcing compliance third parties License. “entity transaction” transaction transferring control organization, substantially assets one, subdividing organization, merging organizations. propagation covered work results entity transaction, party transaction receives copy work also receives whatever licenses work party’s predecessor interest give previous paragraph, plus right possession Corresponding Source work predecessor interest, predecessor can get reasonable efforts. may impose restrictions exercise rights granted affirmed License. example, may impose license fee, royalty, charge exercise rights granted License, may initiate litigation (including cross-claim counterclaim lawsuit) alleging patent claim infringed making, using, selling, offering sale, importing Program portion .","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_11-patents","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"11. Patents","title":"GNU General Public License","text":"“contributor” copyright holder authorizes use License Program work Program based. 
work thus licensed called contributor’s “contributor version”. contributor’s “essential patent claims” patent claims owned controlled contributor, whether already acquired hereafter acquired, infringed manner, permitted License, making, using, selling contributor version, include claims infringed consequence modification contributor version. purposes definition, “control” includes right grant patent sublicenses manner consistent requirements License. contributor grants non-exclusive, worldwide, royalty-free patent license contributor’s essential patent claims, make, use, sell, offer sale, import otherwise run, modify propagate contents contributor version. following three paragraphs, “patent license” express agreement commitment, however denominated, enforce patent (express permission practice patent covenant sue patent infringement). “grant” patent license party means make agreement commitment enforce patent party. convey covered work, knowingly relying patent license, Corresponding Source work available anyone copy, free charge terms License, publicly available network server readily accessible means, must either (1) cause Corresponding Source available, (2) arrange deprive benefit patent license particular work, (3) arrange, manner consistent requirements License, extend patent license downstream recipients. “Knowingly relying” means actual knowledge , patent license, conveying covered work country, recipient’s use covered work country, infringe one identifiable patents country reason believe valid. , pursuant connection single transaction arrangement, convey, propagate procuring conveyance , covered work, grant patent license parties receiving covered work authorizing use, propagate, modify convey specific copy covered work, patent license grant automatically extended recipients covered work works based . patent license “discriminatory” include within scope coverage, prohibits exercise , conditioned non-exercise one rights specifically granted License. 
may convey covered work party arrangement third party business distributing software, make payment third party based extent activity conveying work, third party grants, parties receive covered work , discriminatory patent license () connection copies covered work conveyed (copies made copies), (b) primarily connection specific products compilations contain covered work, unless entered arrangement, patent license granted, prior 28 March 2007. Nothing License shall construed excluding limiting implied license defenses infringement may otherwise available applicable patent law.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_12-no-surrender-of-others-freedom","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"12. No Surrender of Others’ Freedom","title":"GNU General Public License","text":"conditions imposed (whether court order, agreement otherwise) contradict conditions License, excuse conditions License. convey covered work satisfy simultaneously obligations License pertinent obligations, consequence may convey . example, agree terms obligate collect royalty conveying convey Program, way satisfy terms License refrain entirely conveying Program.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_13-use-with-the-gnu-affero-general-public-license","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"13. Use with the GNU Affero General Public License","title":"GNU General Public License","text":"Notwithstanding provision License, permission link combine covered work work licensed version 3 GNU Affero General Public License single combined work, convey resulting work. 
terms License continue apply part covered work, special requirements GNU Affero General Public License, section 13, concerning interaction network apply combination .","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_14-revised-versions-of-this-license","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"14. Revised Versions of this License","title":"GNU General Public License","text":"Free Software Foundation may publish revised /new versions GNU General Public License time time. new versions similar spirit present version, may differ detail address new problems concerns. version given distinguishing version number. Program specifies certain numbered version GNU General Public License “later version” applies , option following terms conditions either numbered version later version published Free Software Foundation. Program specify version number GNU General Public License, may choose version ever published Free Software Foundation. Program specifies proxy can decide future versions GNU General Public License can used, proxy’s public statement acceptance version permanently authorizes choose version Program. Later license versions may give additional different permissions. However, additional obligations imposed author copyright holder result choosing follow later version.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_15-disclaimer-of-warranty","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"15. Disclaimer of Warranty","title":"GNU General Public License","text":"WARRANTY PROGRAM, EXTENT PERMITTED APPLICABLE LAW. EXCEPT OTHERWISE STATED WRITING COPYRIGHT HOLDERS /PARTIES PROVIDE PROGRAM “” WITHOUT WARRANTY KIND, EITHER EXPRESSED IMPLIED, INCLUDING, LIMITED , IMPLIED WARRANTIES MERCHANTABILITY FITNESS PARTICULAR PURPOSE. ENTIRE RISK QUALITY PERFORMANCE PROGRAM . 
PROGRAM PROVE DEFECTIVE, ASSUME COST NECESSARY SERVICING, REPAIR CORRECTION.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_16-limitation-of-liability","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"16. Limitation of Liability","title":"GNU General Public License","text":"EVENT UNLESS REQUIRED APPLICABLE LAW AGREED WRITING COPYRIGHT HOLDER, PARTY MODIFIES /CONVEYS PROGRAM PERMITTED , LIABLE DAMAGES, INCLUDING GENERAL, SPECIAL, INCIDENTAL CONSEQUENTIAL DAMAGES ARISING USE INABILITY USE PROGRAM (INCLUDING LIMITED LOSS DATA DATA RENDERED INACCURATE LOSSES SUSTAINED THIRD PARTIES FAILURE PROGRAM OPERATE PROGRAMS), EVEN HOLDER PARTY ADVISED POSSIBILITY DAMAGES.","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"id_17-interpretation-of-sections-15-and-16","dir":"","previous_headings":"TERMS AND CONDITIONS","what":"17. Interpretation of Sections 15 and 16","title":"GNU General Public License","text":"disclaimer warranty limitation liability provided given local legal effect according terms, reviewing courts shall apply local law closely approximates absolute waiver civil liability connection Program, unless warranty assumption liability accompanies copy Program return fee. END TERMS CONDITIONS","code":""},{"path":"https://slowkow.github.io/hlabud/LICENSE.html","id":"how-to-apply-these-terms-to-your-new-programs","dir":"","previous_headings":"","what":"How to Apply These Terms to Your New Programs","title":"GNU General Public License","text":"develop new program, want greatest possible use public, best way achieve make free software everyone can redistribute change terms. , attach following notices program. safest attach start source file effectively state exclusion warranty; file least “copyright” line pointer full notice found. Also add information contact electronic paper mail. 
program terminal interaction, make output short notice like starts interactive mode: hypothetical commands show w show c show appropriate parts General Public License. course, program’s commands might different; GUI interface, use “box”. also get employer (work programmer) school, , sign “copyright disclaimer” program, necessary. information , apply follow GNU GPL, see . GNU General Public License permit incorporating program proprietary programs. program subroutine library, may consider useful permit linking proprietary applications library. want , use GNU Lesser General Public License instead License. first, please read .","code":" Copyright (C) This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see . Copyright (C) This program comes with ABSOLUTELY NO WARRANTY; for details type 'show w'. This is free software, and you are welcome to redistribute it under certain conditions; type 'show c' for details."},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"hlabud usage examples","text":"Kamil Slowikowski 2023-11-29 hlabud R package provides functions facilitate download analysis human leukocyte antigen (HLA) genotype sequence alignments IMGTHLA R. Let’s consider question might want answer HLA genotypes. amino acid positions different DRB1*04:174 DRB1*15:152 genotypes? 
hlabud, can find answer lines code: two genotypes nearly identical, amino acid position 9 different: position 9 E (Glu) DRB1*04:174 position 9 W (Trp) DRB1*15:152 just easy find nucleotides distinguish two alleles:","code":"library(hlabud) a <- hla_alignments(\"DRB1\") dosage(c(\"DRB1*04:174\", \"DRB1*15:152\"), a$onehot) #> pos9_E pos9_W #> DRB1*04:174 1 0 #> DRB1*15:152 0 1 n <- hla_alignments(\"DRB1\", type = \"nuc\") dosage(c(\"DRB1*04:174\", \"DRB1*15:152\"), n$onehot) #> pos109_C pos109_T #> DRB1*04:174 0 1 #> DRB1*15:152 1 0"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"installation","dir":"Articles","previous_headings":"","what":"Installation","title":"hlabud usage examples","text":"quickest way get hlabud install GitHub: , included usage examples. hope inspire share HLA analyses. source code page available . Thank reporting issues hlabud.","code":"# install.packages(\"devtools\") devtools::install_github(\"slowkow/hlabud\")"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"get-a-one-hot-encoded-matrix-for-all-hla-drb1-alleles","dir":"Articles","previous_headings":"","what":"Get a one-hot encoded matrix for all HLA-DRB1 alleles","title":"hlabud usage examples","text":"can use hla_alignments(\"DRB1\") load DRB1_prot.txt file latest IMGTHLA release: object list three items: $sequences amino acid sequence alignments data frame: conventions used alignments (copied EBI): entry allele displayed respect reference sequences. identity reference sequence present base displayed hyphen (-). Non-identity reference sequence shown displaying appropriate base position. insertion deletion occurred represented period (.). sequence unknown point alignment, represented asterisk (*). protein alignments null alleles, ‘Stop’ codons represented hash (X). protein alignments, sequence following termination codon, marked appear blank. conventions used nucleotide protein alignments. 
Learn alignments : https://www.ebi.ac.uk/ipd/imgt/hla/alignment/help/ $alleles matrix amino acids one column position: $onehot one-hot encoded matrix one column amino acid position:","code":"library(hlabud) a <- hla_alignments(gene = \"DRB1\", verbose = TRUE) #> Reading /home/runner/.local/share/hlabud/.354.0/alignments/DRB1_prot.txt str(a) #> List of 3 #> $ sequences: tibble [3,588 × 2] (S3: tbl_df/tbl/data.frame) #> ..$ allele: chr [1:3588] \"DRB1*01:01:01:01\" \"DRB1*01:01:01:02\" \"DRB1*01:01:01:03\" \"DRB1*01:01:01:04\" ... #> ..$ seq : chr [1:3588] \"MVCLKLPGGSCMTALTVTLMVLSSPLALAGDTRPRFLWQLKFECHFFNGTERVR.LLERCIYNQEE.SVRFDSDVGEYRAVTELGRPDAEYWNSQKDLLEQRRAAVDTYCR\"| __truncated__ \"------------------------------------------------------.-----------.--------------------------------------------\"| __truncated__ \"------------------------------------------------------.-----------.--------------------------------------------\"| __truncated__ \"------------------------------------------------------.-----------.--------------------------------------------\"| __truncated__ ... #> $ alleles : chr [1:3588, 1:288] \"M\" \"M\" \"M\" \"M\" ... #> ..- attr(*, \"dimnames\")=List of 2 #> .. ..$ : chr [1:3588] \"DRB1*01:01:01:01\" \"DRB1*01:01:01:02\" \"DRB1*01:01:01:03\" \"DRB1*01:01:01:04\" ... #> .. ..$ : chr [1:288] \"posn29\" \"posn28\" \"posn27\" \"posn26\" ... #> $ onehot : num [1:3588, 1:1786] 0 0 0 0 0 0 0 0 0 0 ... #> ..- attr(*, \"dimnames\")=List of 2 #> .. ..$ : chr [1:3588] \"DRB1*01:01:01:01\" \"DRB1*01:01:01:02\" \"DRB1*01:01:01:03\" \"DRB1*01:01:01:04\" ... #> .. ..$ : chr [1:1786] \"posn29_unk\" \"posn29_M\" \"posn28_unk\" \"posn28_L\" ... 
a$sequences #> # A tibble: 3,588 × 2 #> allele seq #> #> 1 DRB1*01:01:01:01 MVCLKLPGGSCMTALTVTLMVLSSPLALAGDTRPRFLWQLKFECHFFNGTERVR.LLER… #> 2 DRB1*01:01:01:02 ------------------------------------------------------.----… #> 3 DRB1*01:01:01:03 ------------------------------------------------------.----… #> 4 DRB1*01:01:01:04 ------------------------------------------------------.----… #> 5 DRB1*01:01:01:05 ------------------------------------------------------.----… #> 6 DRB1*01:01:01:06 ------------------------------------------------------.----… #> 7 DRB1*01:01:01:07 ------------------------------------------------------.----… #> 8 DRB1*01:01:01:08 ------------------------------------------------------.----… #> 9 DRB1*01:01:01:09 ------------------------------------------------------.----… #> 10 DRB1*01:01:01:10 ------------------------------------------------------.----… #> # ℹ 3,578 more rows a$alleles[1:5,1:7] #> posn29 posn28 posn27 posn26 posn25 posn24 posn23 #> DRB1*01:01:01:01 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" \"P\" #> DRB1*01:01:01:02 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" \"P\" #> DRB1*01:01:01:03 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" \"P\" #> DRB1*01:01:01:04 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" \"P\" #> DRB1*01:01:01:05 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" \"P\" a$onehot[1:5,1:7] #> posn29_unk posn29_M posn28_unk posn28_L posn28_V posn27_unk #> DRB1*01:01:01:01 0 1 0 0 1 0 #> DRB1*01:01:01:02 0 1 0 0 1 0 #> DRB1*01:01:01:03 0 1 0 0 1 0 #> DRB1*01:01:01:04 0 1 0 0 1 0 #> DRB1*01:01:01:05 0 1 0 0 1 0 #> posn27_C #> DRB1*01:01:01:01 1 #> DRB1*01:01:01:02 1 #> DRB1*01:01:01:03 1 #> DRB1*01:01:01:04 1 #> DRB1*01:01:01:05 1"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"convert-genotypes-to-a-dosage-matrix","dir":"Articles","previous_headings":"","what":"Convert genotypes to a dosage matrix","title":"hlabud usage examples","text":"Suppose individuals following genotypes: want run association test amino acid positions, need convert 
genotype names matrix allele dosages (e.g., 0, 1, 2). can use dosage() convert individual’s genotypes amino acid dosages: Note: dosage matrix one row individual one column amino acid position. default, dosage() discard columns individuals identical. first individual dosage=3 pos6_R (position 6 Arg). ’s assigned individual 3 alleles input. Please careful check data looks way expect!","code":"genotypes <- c( \"DRB1*12:02:02:03,DRB1*12:02:02:03,DRB1*14:54:02\", \"DRB1*04:174,DRB1*15:152\", \"DRB1*04:56:02,DRB1*15:01:48\", \"DRB1*14:172,DRB1*04:160\", \"DRB1*04:359,DRB1*04:284:02\" ) dosage <- dosage(genotypes, a$onehot) dosage[,1:4] #> posn29_unk posn29_M pos6_R #> DRB1*12:02:02:03,DRB1*12:02:02:03,DRB1*14:54:02 1 2 3 #> DRB1*04:174,DRB1*15:152 2 0 2 #> DRB1*04:56:02,DRB1*15:01:48 2 0 2 #> DRB1*14:172,DRB1*04:160 2 0 2 #> DRB1*04:359,DRB1*04:284:02 2 0 2 #> pos9_E #> DRB1*12:02:02:03,DRB1*12:02:02:03,DRB1*14:54:02 3 #> DRB1*04:174,DRB1*15:152 1 #> DRB1*04:56:02,DRB1*15:01:48 1 #> DRB1*14:172,DRB1*04:160 2 #> DRB1*04:359,DRB1*04:284:02 2 dim(dosage) #> [1] 5 38"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"logistic-regression-association-for-amino-acid-positions","dir":"Articles","previous_headings":"","what":"Logistic regression association for amino acid positions","title":"hlabud usage examples","text":"Let’s simulate dataset cases controls demonstrate one approach testing amino acid positions might associated cases. simulated dataset 100 individuals, 52 cases 48 controls. also one column amino acid position might want test association case variable. One possible approach association testing use glm() fit logistic regression model amino acid position. reveal amino acid position might associated case variable simulated dataset. volcano shows Odds Ratio P-value amino acid position. top hits P < 0.05 labeled. 
simulation, case variable associated pos123_S (P = 0.026, = 0.52, 95% CI 0.28 0.91).","code":"set.seed(2) n <- 100 d <- data.frame( geno = paste( sample(rownames(a$onehot), n, replace = TRUE), sample(rownames(a$onehot), n, replace = TRUE), sep = \",\" ), age = sample(21:100, n, replace = TRUE), case = sample(0:1, n, replace = TRUE) ) d <- cbind(d, dosage(d$geno, a$onehot)) d[1:5,1:6] #> geno age case #> DRB1*04:256,DRB1*04:125 DRB1*04:256,DRB1*04:125 55 0 #> DRB1*04:11:01:02,DRB1*01:02:12 DRB1*04:11:01:02,DRB1*01:02:12 73 1 #> DRB1*14:08,DRB1*15:01:02 DRB1*14:08,DRB1*15:01:02 72 0 #> DRB1*03:90,DRB1*04:278 DRB1*03:90,DRB1*04:278 22 1 #> DRB1*03:67N,DRB1*03:100:02 DRB1*03:67N,DRB1*03:100:02 34 0 #> posn29_unk posn29_M posn25_K #> DRB1*04:256,DRB1*04:125 2 0 0 #> DRB1*04:11:01:02,DRB1*01:02:12 1 1 1 #> DRB1*14:08,DRB1*15:01:02 0 2 1 #> DRB1*03:90,DRB1*04:278 2 0 0 #> DRB1*03:67N,DRB1*03:100:02 2 0 0 # select the amino acid positions that have at least 3 people with dosage > 0 my_as <- names(which(colSums(d[,4:ncol(d)] > 0) >= 3)) # run the association tests my_glm <- rbindlist(pblapply(my_as, function(my_a) { f <- sprintf(\"case ~ %s\", my_a) glm(as.formula(f), data = d, family = \"binomial\") %>% parameters(exponentiate = TRUE) })) # look at the top hits my_glm %>% arrange(p) %>% filter(!Parameter %in% c(\"(Intercept)\")) %>% head #> Parameter Coefficient SE CI CI_low CI_high z #> 1: pos123_S 0.5161604 0.1528830 0.95 0.282671869 0.9100318 -2.232794 #> 2: pos183_V 0.5570074 0.1572414 0.95 0.314570198 0.9582010 -2.072913 #> 3: pos34_H 2.1982280 0.8658531 0.95 1.053839944 5.0273874 1.999690 #> 4: pos51_A 0.1170213 0.1265210 0.95 0.006179376 0.6749415 -1.984314 #> 5: pos107_S 0.5815232 0.1638237 0.95 0.329174623 1.0003310 -1.924302 #> 6: pos69_L 1.6740699 0.4630002 0.95 0.982434872 2.9247856 1.863017 #> df_error p #> 1: Inf 0.02556252 #> 2: Inf 0.03818041 #> 3: Inf 0.04553374 #> 4: Inf 0.04722090 #> 5: Inf 0.05431680 #> 6: Inf 
0.06245981"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"umap-embedding-of-3588-hla-drb1-alleles","dir":"Articles","previous_headings":"","what":"UMAP embedding of 3,588 HLA-DRB1 alleles","title":"hlabud usage examples","text":"many possibilities analysis one-hot encoding HLA-DRB1 alleles. example, UMAP embedding 3,588 HLA-DRB1 alleles encoded one-hot amino acid matrix 1786 columns, one amino acid position. can highlight alleles amino acid H position 13: can represent amino acid position 13 different color:","code":"umap(a$onehot, n_epochs = 200, min_dist = 1, spread = 2)"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"get-hla-allele-frequencies-from-allele-frequency-net-database-afnd","dir":"Articles","previous_headings":"","what":"Get HLA allele frequencies from Allele Frequency Net Database (AFND)","title":"hlabud usage examples","text":"Download read table HLA allele frequencies Allele Frequency Net Database (AFND). use data, please cite latest manuscript Allele Frequency Net Database: Gonzalez-Galarza FF, McCabe , Santos EJMD, Jones J, Takeshita L, Ortega-Rivera ND, et al. Allele frequency net database (AFND) 2020 update: gold-standard data classification, open access genotype data new query tools. Nucleic Acids Res. 2020;48: D783–D788. 
doi:10.1093/nar/gkz1029 Plot frequency specific allele (DQB1*02:01) populations 1000 sampled individuals: See github.com/slowkow/allelefrequencies examples might use data.","code":"af <- hla_frequencies() af #> # A tibble: 123,502 × 7 #> group gene allele population indivs_over_n alleles_over_2n n #> #> 1 hla A A*01:01 Argentina Rosario To… 15.1 0.076 86 #> 2 hla A A*01:01 Armenia combined Reg… NA 0.125 100 #> 3 hla A A*01:01 Australia Cape York … NA 0.053 103 #> 4 hla A A*01:01 Australia Groote Eyl… NA 0.027 75 #> 5 hla A A*01:01 Australia New South … NA 0.187 134 #> 6 hla A A*01:01 Australia Yuendumu A… NA 0.008 191 #> 7 hla A A*01:01 Austria 27 0.146 200 #> 8 hla A A*01:01 Azores Central Islan… NA 0.08 59 #> 9 hla A A*01:01 Azores Oriental Isla… NA 0.115 43 #> 10 hla A A*01:01 Azores Terceira Isla… NA 0.109 130 #> # ℹ 123,492 more rows my_allele <- \"DQB1*02:01\" my_af <- af %>% filter(allele == my_allele) %>% filter(n > 1000) %>% arrange(-alleles_over_2n) ggplot(my_af) + aes(x = alleles_over_2n, y = reorder(population, alleles_over_2n)) + scale_y_discrete(position = \"right\") + geom_colh() + labs( x = \"Allele Frequency (Alleles / 2N)\", y = NULL, title = glue(\"Frequency of {my_allele} across {length(unique(my_af$population))} populations\"), caption = \"Data from AFND http://allelefrequencies.net\" )"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"compute-hla-divergence-with-the-grantham-distance-matrix","dir":"Articles","previous_headings":"","what":"Compute HLA divergence with the Grantham distance matrix","title":"hlabud usage examples","text":"HLA allele binds specific set peptides. , individual two highly dissimilar alleles can bind greater number different peptides homozygous individual (https://doi.org/10.1007/BF02918202): MHC class II allele capacity bind present specific set peptides processed antigens. 
inability specific class II allele bind present fragment derived processed antigen results loss immune responsiveness antigen individuals homozygous class II allele. can compute HLA divergence metric set individuals like : divergence homozygote equal zero, definition: default, use amino acid distance matrix Grantham 1974 (https://doi.org/10.1126/science.185.4154.862). Alternatively, can choose use uniform matrix instead (diagonal values 0, non-diagonal values equal 1): amino acid distance matrix easily accessible, provide two built-options \"grantham\" \"uniform\":","code":"my_genos <- c(\"A*23:01:12,A*24:550\", \"A*25:12N,A*11:27\", \"A*24:381,A*33:85\") hla_divergence(my_genos, method = \"grantham\") #> A*23:01:12,A*24:550 A*25:12N,A*11:27 A*24:381,A*33:85 #> 0.4924242 3.3333333 4.9015152 hla_divergence(\"A*01:01,A*01:01\") #> A*01:01,A*01:01 #> 0 hla_divergence(my_genos, method = \"uniform\") #> A*23:01:12,A*24:550 A*25:12N,A*11:27 A*24:381,A*33:85 #> 0.007575758 0.040404040 0.060606061 amino_distance_matrix(method = \"uniform\") #> A R N D C Q E G H I L K M F P S T W Y V #> A 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> R 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> N 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> D 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> C 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> Q 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> E 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 #> G 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 #> H 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 #> I 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 #> L 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 #> K 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 #> M 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 #> F 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 #> P 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 #> S 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 #> T 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 #> W 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 #> Y 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 #> V 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
0"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"download-and-unpack-all-data-from-the-latest-imgthla-release","dir":"Articles","previous_headings":"","what":"Download and unpack all data from the latest IMGTHLA release","title":"hlabud usage examples","text":"want use hla_alignments(), don’t need install_hla() data files downloaded automatically needed cached future use. users might need access additional files present full data release. Run install_hla() download unpack latest IMGTHLA release. destination folder downloaded data files getOption(\"hlabud_dir\") (automatically tailored operating system thanks rappdirs package). examples download releases get list release names. Download latest release (default) specific release: Optionally, get set directory hlabud uses store data: List releases: installing releases, hlabud folder might look like :","code":"# Download all of the data (120MB) for the latest IMGTHLA release install_hla(release = \"latest\") # Download a specific release install_hla(release = \"3.51.0\") getOption(\"hlabud_dir\") #> [1] \"/home/username/.local/share/hlabud\" # Manually override the directory for hlabud to use options(hlabud_dir = \"/path/to/my/dir\") hla_releases() #> [1] \"3.51.0\" \"3.50.0\" \"3.49.0\" \"3.48.0\" \"3.47.0\" \"3.46.0\" \"3.45.1\" \"3.45.01\" #> [9] \"3.45.0.1\" \"3.45.0\" \"3.44.1\" \"3.44.0\" \"3.43.0\" \"3.42.0\" \"3.41.2\" \"3.41.0\" #> [17] \"3.40.0\" \"3.39.0\" \"3.38.0\" \"3.37.0\" \"3.36.0\" \"3.35.0\" \"3.34.0\" \"3.33.0\" #> [25] \"3.32.0\" \"3.31.0\" \"3.30.0\" \"3.29.0\" \"3.28.0\" \"3.27.0\" ❯ ls -lah \"/home/user/.local/share/hlabud\" total 207M drwxrwxr-x 3 user user 32 Apr 5 01:19 3.30.0 drwxrwxr-x 11 user user 4.0K Apr 7 19:31 3.40.0 drwxrwxr-x 12 user user 4.0K Apr 5 00:27 3.51.0 -rw-rw-r-- 1 user user 15K Apr 7 19:23 tags.json -rw-rw-r-- 1 user user 79M Apr 7 19:28 v3.40.0-alpha.tar.gz -rw-rw-r-- 1 user user 129M Apr 4 20:07 
v3.51.0-alpha.tar.gz"},{"path":"https://slowkow.github.io/hlabud/articles/examples.html","id":"count-the-number-of-alleles-in-each-imgthla-release","dir":"Articles","previous_headings":"","what":"Count the number of alleles in each IMGTHLA release","title":"hlabud usage examples","text":"can get list release names: can get allele names release: Next, count many alleles release: plot number alleles line plot:","code":"releases <- hla_releases() releases #> [1] \".354.0\" \"3.53.0\" \"3.52.0\" \"3.51.0\" \"3.50.0\" \"3.49.0\" #> [7] \"3.48.0\" \"3.47.0\" \"3.46.0\" \"3.45.1\" \"3.45.01\" \"3.45.0.1\" #> [13] \"3.45.0\" \"3.44.1\" \"3.44.0\" \"3.43.0\" \"3.42.0\" \"3.41.2\" #> [19] \"3.41.0\" \"3.40.0\" \"3.39.0\" \"3.38.0\" \"3.37.0\" \"3.36.0\" #> [25] \"3.35.0\" \"3.34.0\" \"3.33.0\" \"3.32.0\" \"3.31.0\" \"3.30.0\" my_alleles <- rbindlist(lapply(releases, function(release) { retval <- hla_alleles(release = release) retval$release <- release return(retval) }), fill = TRUE) #> Warning in hla_alleles(release = release): unrecognized release name #> 'Allelelist.3451.txt' #> Warning in hla_alleles(release = release): unrecognized release name #> 'Allelelist.34501.txt' #> Warning in hla_alleles(release = release): unrecognized release name #> 'Allelelist.34501.txt' #> Warning in hla_alleles(release = release): unrecognized release name #> 'Allelelist.3441.txt' #> Warning in hla_alleles(release = release): unrecognized release name #> 'Allelelist.3412.txt' d <- my_alleles %>% count(release) %>% filter(n > 1) d #> release n #> 1: .354.0 38416 #> 2: 3.30.0 17509 #> 3: 3.31.0 17874 #> 4: 3.32.0 18363 #> 5: 3.33.0 18955 #> 6: 3.34.0 20272 #> 7: 3.35.0 21683 #> 8: 3.36.0 22548 #> 9: 3.37.0 24093 #> 10: 3.38.0 25958 #> 11: 3.39.0 26512 #> 12: 3.40.0 27273 #> 13: 3.41.0 27980 #> 14: 3.42.0 28786 #> 15: 3.43.0 29417 #> 16: 3.44.0 30523 #> 17: 3.45.0 31552 #> 18: 3.46.0 32330 #> 19: 3.47.0 33552 #> 20: 3.48.0 34145 #> 21: 3.49.0 35077 #> 22: 3.50.0 36016 #> 23: 3.51.0 36625 #> 
24: 3.52.0 37068 #> 25: 3.53.0 37619 #> release n ggplot(d) + aes(x = release, y = n, group = 1) + geom_line() + geom_text(aes(label = release), hjust = 1) + labs(x = NULL, y = \"Number of alleles\", title = \"Each release has more HLA alleles\") + theme( axis.text.x = element_blank(), axis.ticks.x = element_blank(), ) d2 <- my_alleles %>% mutate(gene = str_split_fixed(Allele, \"\\\\*\", 2)[,1]) %>% count(release, gene) ggplot() + aes(x = release, y = n) + geom_line( data = d2, aes(group = gene, color = gene) ) + scale_color_discrete(guide = \"none\") + geom_text( data = d2 %>% filter(release == \"3.52.0\"), mapping = aes(label = gene), hjust = 0 ) + labs(x = NULL, y = \"Number of alleles\", title = \"Number of alleles per release and gene\") + scale_x_discrete(expand = expansion(mult = c(0.01, 0.1))) + scale_y_log10() + theme( panel.grid.major.y = element_line(), axis.text.x = element_blank(), axis.ticks.x = element_blank(), )"},{"path":"https://slowkow.github.io/hlabud/articles/visualize-hla-structure.html","id":"introduction","dir":"Articles","previous_headings":"","what":"Introduction","title":"Visualize HLA protein structures","text":"Kamil Slowikowski 2023-11-29 vignette, explore different methods visualizing molecular structure HLA proteins. First, ’ll look example use NGLVieweR R package show HLA protein structures. 
Next, ’ll use PyMOL thing.","code":""},{"path":"https://slowkow.github.io/hlabud/articles/visualize-hla-structure.html","id":"what-are-the-pdb-identifiers-for-each-hla-gene","dir":"Articles","previous_headings":"","what":"What are the PDB identifiers for each HLA gene?","title":"Visualize HLA protein structures","text":"list PDB identifiers might consider using represent HLA protein: Also try searching PDB website , e.g., \"HLA-DR\" see appropriate structure analysis.","code":"HLA-A 2xpg HLA-B 2bvp HLA-C 4nt6 HLA-DP 3lqz HLA-DQ 4z7w HLA-DR 3pdo"},{"path":"https://slowkow.github.io/hlabud/articles/visualize-hla-structure.html","id":"using-nglviewer","dir":"Articles","previous_headings":"","what":"Using NGLVieweR","title":"Visualize HLA protein structures","text":"Let’s try visualize position 9 HLA-B protein structure. visualize structure 2bvp Protein Data Bank (PDB). example NGLVieweR R package Niels van der Velden: view , see blue peptide red HLA-B protein. tyrosine position 9 highlighted ball+stick representation, also labeled text label. structure rotating can get better view. can use hlabud answer questions HLA-B Tyr9 (tyrosine position 9). example, HLA-B alleles amino acid position? fraction reported HLA-B alleles Tyr9?","code":"# devtools::install_github(\"nvelden/NGLVieweR\") # we need the latest version library(NGLVieweR) library(magrittr) my_sele <- \"9:A\" NGLVieweR(\"2bvp\") %>% stageParameters( backgroundColor = \"white\", zoomSpeed = 1, cameraFov = 80 ) %>% addRepresentation( type = \"cartoon\" ) %>% addRepresentation( type = \"ball+stick\", param = list( sele = my_sele ) ) %>% addRepresentation( type = \"label\", param = list( sele = my_sele, labelType = \"format\", labelFormat = \"[%(resname)s]%(resno)s\", # or enter custom text labelGrouping = \"residue\", # or \"atom\" (eg. 
sele = \"20:A.CB\") color = \"black\", fontFamiliy = \"sans-serif\", xOffset = 1, yOffset = 0, zOffset = 0, fixedSize = TRUE, radiusType = 1, radiusSize = 5.5, # Label size showBackground = TRUE # backgroundColor=\"black\", # backgroundOpacity=0.5 ) ) %>% zoomMove( center = my_sele, zoom = my_sele, duration = 0, # animation time in ms z_offSet = -20 ) %>% setSpin() library(hlabud) a <- hla_alignments(\"B\") head(names(which(a$onehot[,\"pos9_Y\"] == 1))) #> [1] \"B*07:02:01:01\" \"B*07:02:01:02\" \"B*07:02:01:03\" \"B*07:02:01:04\" #> [5] \"B*07:02:01:05\" \"B*07:02:01:06\" sum(a$onehot[,\"pos9_Y\"] == 1) / nrow(a$onehot) #> [1] 0.7101798"},{"path":"https://slowkow.github.io/hlabud/articles/visualize-hla-structure.html","id":"using-pymol","dir":"Articles","previous_headings":"","what":"Using PyMOL","title":"Visualize HLA protein structures","text":"PyMOL one favorite methods visualizing protein structures, allows us change residue existing protein visualize new mutated protein. takes lines PyMOL create nice figure. example, want quickly highlight positions 13 45 HLA-DQB1, snippet PyMOL code produce figure . Bash script : Write PyMOL script Run PyMOL script pymol command PyMOL script : Load structure Protein Data Bank (PDB). 7kei identifier published protein structure. Color HLA-DQA1 protein teal. Color HLA-DQB1 protein orange. Color peptide purple. color residues 13 45 HLA-DQB1 red. Label residues positions names. Write PNG file view structure. image , manually rotated structure mouse added text labels like \"PDB: 7kei\" saving file.","code":"#!/usr/bin/env bash # Write a pymol script cat << EOF > script.pml fetch 7kei show cartoon remove solvent remove chain D remove chain H color teal, chain A color orange, chain B color purple, chain C color red, chain B & resi 13 color red, chain B & resi 45 label n. CA and chain B & resi 13, \"%s %s\" % (resi, resn) label n. 
CA and chain B & resi 45, \"%s %s\" % (resi, resn) png 7kei.png, width=1200, height=800, dpi=300 EOF # On Linux, we can just use `pymol` without making an alias # On macOS, we need to make an alias alias pymol=/Applications/PyMOL.app/Contents/MacOS/PyMOL pymol -c script.pml"},{"path":"https://slowkow.github.io/hlabud/articles/visualize-hla-structure.html","id":"other-pdb-viewers","dir":"Articles","previous_headings":"","what":"Other PDB viewers","title":"Visualize HLA protein structures","text":"Python: https://github.com/nglviewer/nglview Javascript: https://www.rcsb.org/3d-view https://www.ncbi.nlm.nih.gov/Structure/icn3d/full.html?mmdbid=7kei&bu=1 https://github.com/nglviewer/ngl https://github.com/biasmv/pv R: https://www.raymolecule.com","code":""},{"path":"https://slowkow.github.io/hlabud/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Kamil Slowikowski. Author, maintainer.","code":""},{"path":"https://slowkow.github.io/hlabud/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"J R, DJ B, X G, MA C, P F, SGE. M (2019). “IPD-IMGT/HLA Database.” Nucleic Acids Research, 48(D1), D948–D955. doi:10.1093/nar/gkz950. Slowikowski K (2023). hlabud: IMGTHLA Data R. 
doi:10.5281/zenodo.8183949, R package version 1.0.0.9999, https://github.com/slowkow/hlabud.","code":"@Article{, author = {Robinson J and Barker DJ and Georgiou X and Cooper MA and Flicek P and Marsh SGE.}, title = {IPD-IMGT/HLA Database}, doi = {10.1093/nar/gkz950}, year = {2019}, month = {oct}, publisher = {Oxford University Press}, volume = {48}, number = {D1}, pages = {D948–D955}, journal = {Nucleic Acids Research}, } @Manual{, title = {{hlabud}: IMGTHLA Data from R}, author = {Kamil Slowikowski}, year = {2023}, note = {R package version 1.0.0.9999}, doi = {10.5281/zenodo.8183949}, url = {https://github.com/slowkow/hlabud}, }"},{"path":"https://slowkow.github.io/hlabud/index.html","id":"hlabud-hla-analysis-in-r-","dir":"","previous_headings":"","what":"Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA","title":"Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA","text":"hlabud provides methods retrieve sequence alignment data IMGTHLA convert data convenient R matrices ready downstream analysis. See usage examples learn use data logistic regression dimensionality reduction. also share tips visualize 3D molecular structure HLA proteins highlight specific amino acid residues. example, let’s consider simple question two HLA genotypes DRB1*04:174 DRB1*15:152. amino acid positions different two genotypes? 
output, can conclude two genotypes nearly identical, different amino acids E W position 9.","code":"library(hlabud) a <- hla_alignments(\"DRB1\") dosage(c(\"DRB1*04:174\", \"DRB1*15:152\"), a$onehot) ## pos9_E pos9_W ## DRB1*04:174 1 0 ## DRB1*15:152 0 1"},{"path":"https://slowkow.github.io/hlabud/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA","text":"quickest way get hlabud install GitHub:","code":"# install.packages(\"devtools\") devtools::install_github(\"slowkow/hlabud\")"},{"path":"https://slowkow.github.io/hlabud/index.html","id":"examples","dir":"","previous_headings":"","what":"Examples","title":"Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA","text":"See usage examples get ideas use hlabud analyses. Get one-hot encoded matrix HLA-DRB1 alleles Convert genotypes dosage matrix Logistic regression association amino acid positions UMAP embedding 3,516 HLA-DRB1 alleles Get HLA allele frequencies Allele Frequency Net Database (AFND) Compute HLA divergence Grantham distance matrix Download unpack data latest IMGTHLA release","code":""},{"path":"https://slowkow.github.io/hlabud/index.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA","text":"hlabud provides access data IMGT/HLA database. Therefore, use hlabud please cite IMGT/HLA paper: Robinson J, Barker DJ, Georgiou X, Cooper MA, Flicek P, Marsh SGE. IPD-IMGT/HLA Database. Nucleic Acids Res. 2020;48: D948–D955. doi:10.1093/nar/gkz950 hlabud also provides access data Allele Frequency Net Database (AFND). Therefore, use hlabud::hla_frequencies() please cite AFND paper: Gonzalez-Galarza FF, McCabe , Santos EJMD, Jones J, Takeshita L, Ortega-Rivera ND, et al. 
Allele frequency net database (AFND) 2020 update: gold-standard data classification, open access genotype data new query tools. Nucleic Acids Res. 2020;48: D783–D788. doi:10.1093/nar/gkz1029 Additionally, can also cite hlabud package like : Slowikowski K. hlabud: methods access analysis human leukocyte antigen (HLA) gene sequence alignments IMGT/HLA. R package version 1.0.0.","code":""},{"path":"https://slowkow.github.io/hlabud/index.html","id":"related-work","dir":"","previous_headings":"","what":"Related work","title":"Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA","text":"recommend article anyone new HLA, beautiful figures help build intuition: La Gruta NL, Gras S, Daley SR, Thomas PG, Rossjohn J. Understanding drivers MHC restriction T cell receptors. Nat Rev Immunol. 2018;18: 467–478. Learn conventions HLA nomenclature: Marsh SGE, Albert ED, Bodmer WF, Bontrop RE, Dupont B, Erlich HA, et al. Nomenclature factors HLA system, 2010. Tissue Antigens. 2010;75: 291–455. case-control analysis HLA genotype data, consider BIGDAWG R package available CRAN. related article: Pappas DJ, Marin W, Hollenbach JA, Mack SJ. Bridging ImmunoGenomic Data Analysis Workflow Gaps (BIGDAWG): integrated case-control analysis pipeline. Hum Immunol. 
2016;77: 283–287.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/amino_distance_matrix.html","id":null,"dir":"Reference","previous_headings":"","what":"Get a pairwise 20x20 distance matrix for all pairs of amino acids — amino_distance_matrix","title":"Get a pairwise 20x20 distance matrix for all pairs of amino acids — amino_distance_matrix","text":"default, return amino acid distance matrix Grantham 1974 (doi:10.1126/science.185.4154.862).","code":""},{"path":"https://slowkow.github.io/hlabud/reference/amino_distance_matrix.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get a pairwise 20x20 distance matrix for all pairs of amino acids — amino_distance_matrix","text":"","code":"amino_distance_matrix(method = \"grantham\")"},{"path":"https://slowkow.github.io/hlabud/reference/amino_distance_matrix.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get a pairwise 20x20 distance matrix for all pairs of amino acids — amino_distance_matrix","text":"method \"grantham\" Grantham 1974 matrix \"uniform\" matrix ones non-diagonal.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/amino_distance_matrix.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get a pairwise 20x20 distance matrix for all pairs of amino acids — amino_distance_matrix","text":"20x20 symmetric matrix positive numbers zeros diagonal.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/amino_distance_matrix.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get a pairwise 20x20 distance matrix for all pairs of amino acids — amino_distance_matrix","text":"","code":"# By default, the Grantham 1974 matrix amino_distance_matrix(\"grantham\") #> A R N D C Q E G H I L K M F P S T W Y #> A 0 112 111 126 195 91 107 60 86 94 96 106 84 113 27 99 58 148 112 #> R 112 0 86 96 180 43 54 125 29 97 102 26 91 97 103 110 71 101 
77 #> N 111 86 0 23 139 46 42 80 68 149 153 94 142 158 91 46 65 174 143 #> D 126 96 23 0 154 61 45 94 81 168 172 101 160 177 108 65 85 181 160 #> C 195 180 139 154 0 154 170 159 174 198 198 202 196 205 169 112 149 215 194 #> Q 91 43 46 61 154 0 29 87 24 109 113 53 101 116 76 68 42 130 99 #> E 107 54 42 45 170 29 0 98 40 134 138 56 126 140 93 80 65 152 122 #> G 60 125 80 94 159 87 98 0 98 135 138 127 127 153 42 56 59 184 147 #> H 86 29 68 81 174 24 40 98 0 94 99 32 87 100 77 89 47 115 83 #> I 94 97 149 168 198 109 134 135 94 0 5 102 10 21 95 142 89 61 33 #> L 96 102 153 172 198 113 138 138 99 5 0 107 15 22 98 145 92 61 36 #> K 106 26 94 101 202 53 56 127 32 102 107 0 95 102 103 121 78 110 85 #> M 84 91 142 160 196 101 126 127 87 10 15 95 0 28 87 135 81 67 36 #> F 113 97 158 177 205 116 140 153 100 21 22 102 28 0 114 155 103 40 22 #> P 27 103 91 108 169 76 93 42 77 95 98 103 87 114 0 74 38 147 110 #> S 99 110 46 65 112 68 80 56 89 142 145 121 135 155 74 0 58 177 144 #> T 58 71 65 85 149 42 65 59 47 89 92 78 81 103 38 58 0 128 92 #> W 148 101 174 181 215 130 152 184 115 61 61 110 67 40 147 177 128 0 37 #> Y 112 77 143 160 194 99 122 147 83 33 36 85 36 22 110 144 92 37 0 #> V 64 96 133 152 192 96 121 109 84 29 32 97 21 50 68 124 69 88 55 #> V #> A 64 #> R 96 #> N 133 #> D 152 #> C 192 #> Q 96 #> E 121 #> G 109 #> H 84 #> I 29 #> L 32 #> K 97 #> M 21 #> F 50 #> P 68 #> S 124 #> T 69 #> W 88 #> Y 55 #> V 0 # All ones, and zeros on the diagonal amino_distance_matrix(\"uniform\") #> A R N D C Q E G H I L K M F P S T W Y V #> A 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> R 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> N 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> D 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> C 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> Q 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 #> E 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 #> G 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 #> H 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 #> I 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 #> L 1 1 
1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 #> K 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 #> M 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 #> F 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 #> P 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 #> S 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 #> T 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 #> W 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 #> Y 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 #> V 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0"},{"path":"https://slowkow.github.io/hlabud/reference/dosage.html","id":null,"dir":"Reference","previous_headings":"","what":"Convert a set of genotype names into a dosage matrix of each residue at each position — dosage","title":"Convert a set of genotype names into a dosage matrix of each residue at each position — dosage","text":"genotype name, return dosage matrix residue (amino acid nucleotide) position.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/dosage.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Convert a set of genotype names into a dosage matrix of each residue at each position — dosage","text":"","code":"dosage(names, mat, drop_constants = TRUE, drop_duplicates = TRUE)"},{"path":"https://slowkow.github.io/hlabud/reference/dosage.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Convert a set of genotype names into a dosage matrix of each residue at each position — dosage","text":"names Input character vector one genotype individual. entries must present rownames(mat). mat one-hot encoded matrix one row per allele one column residue (amino acid nucleotide) position. drop_constants Filter constant amino acid positions default. 
drop_duplicates Filter duplicate amino acid positions default.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/dosage.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Convert a set of genotype names into a dosage matrix of each residue at each position — dosage","text":"matrix one row input genotype, one column residue position.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/dosage.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Convert a set of genotype names into a dosage matrix of each residue at each position — dosage","text":"genotype represented like \"HLA-*01:01,HLA-*01:01\" default, returned matrix filtered exclude: positions input genotypes allele positions identical previous positions","code":""},{"path":"https://slowkow.github.io/hlabud/reference/dosage.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Convert a set of genotype names into a dosage matrix of each residue at each position — dosage","text":"","code":"DRB1_file <- file.path( \"https://github.com/ANHIG/IMGTHLA/raw\", \"5f2c562056f8ffa89aeea0631f2a52300ee0de17\", \"alignments/DRB1_prot.txt\" ) a <- read_alignments(DRB1_file) genotypes <- c( \"DRB1*12:02:02:03,DRB1*12:02:02:03,DRB1*14:54:02\", \"DRB1*04:174,DRB1*15:152\", \"DRB1*04:56:02,DRB1*15:01:48\", \"DRB1*14:172,DRB1*04:160\", \"DRB1*04:359,DRB1*04:284:02\" ) dosage <- dosage(genotypes, a$onehot) dosage[,1:5] #> posn29_unk posn29_M pos6_R #> DRB1*12:02:02:03,DRB1*12:02:02:03,DRB1*14:54:02 1 2 3 #> DRB1*04:174,DRB1*15:152 2 0 2 #> DRB1*04:56:02,DRB1*15:01:48 2 0 2 #> DRB1*14:172,DRB1*04:160 2 0 2 #> DRB1*04:359,DRB1*04:284:02 2 0 2 #> pos9_E pos9_W #> DRB1*12:02:02:03,DRB1*12:02:02:03,DRB1*14:54:02 3 0 #> DRB1*04:174,DRB1*15:152 1 1 #> DRB1*04:56:02,DRB1*15:01:48 1 1 #> DRB1*14:172,DRB1*04:160 2 0 #> DRB1*04:359,DRB1*04:284:02 2 
0"},{"path":"https://slowkow.github.io/hlabud/reference/get_hlabud_dir.html","id":null,"dir":"Reference","previous_headings":"","what":"Get the name of the folder for caching downloaded IMGTHLA files — get_hlabud_dir","title":"Get the name of the folder for caching downloaded IMGTHLA files — get_hlabud_dir","text":"function : Get folder name getOption(\"hlabud_dir\") else automatically choose appropriate folder operating system thanks rappdirs. Create folder automatically already exist. Set hlabud_dir option new folder.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/get_hlabud_dir.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get the name of the folder for caching downloaded IMGTHLA files — get_hlabud_dir","text":"","code":"get_hlabud_dir()"},{"path":"https://slowkow.github.io/hlabud/reference/get_hlabud_dir.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get the name of the folder for caching downloaded IMGTHLA files — get_hlabud_dir","text":"name folder.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/get_hlabud_dir.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Get the name of the folder for caching downloaded IMGTHLA files — get_hlabud_dir","text":"locations hlabud_dir folder operating system. 
Linux: Mac: Windows: set hlabud_dir option, please use:","code":"~/.local/share/hlabud ~/Library/Application Support/hlabud C:\\Documents and Settings\\{User}\\Application Data\\slowkow\\hlabud options(hlabud_dir = \"/my/favorite/path\")"},{"path":"https://slowkow.github.io/hlabud/reference/get_hlabud_dir.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get the name of the folder for caching downloaded IMGTHLA files — get_hlabud_dir","text":"","code":"if (FALSE) { hlabud_dir <- get_hlabud_dir() }"},{"path":"https://slowkow.github.io/hlabud/reference/get_onehot.html","id":null,"dir":"Reference","previous_headings":"","what":"Make a one-hot encoded matrix from a dataframe of amino acid\nsequences. — get_onehot","title":"Make a one-hot encoded matrix from a dataframe of amino acid\nsequences. — get_onehot","text":"Make one-hot encoded matrix dataframe amino acid sequences.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/get_onehot.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Make a one-hot encoded matrix from a dataframe of amino acid\nsequences. — get_onehot","text":"","code":"get_onehot(al, n_pre)"},{"path":"https://slowkow.github.io/hlabud/reference/get_onehot.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Make a one-hot encoded matrix from a dataframe of amino acid\nsequences. — get_onehot","text":"al dataframe columns allele, seq n_pre number amino acid sequences position 1.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_alignments.html","id":null,"dir":"Reference","previous_headings":"","what":"Get sequence alignments from IMGTHLA — hla_alignments","title":"Get sequence alignments from IMGTHLA — hla_alignments","text":"conventions used alignments (EBI IMGT-HLA help page): entry allele displayed respect reference sequences. identity reference sequence present base displayed hyphen (-). 
Non-identity reference sequence shown displaying appropriate base position. insertion deletion occurred represented period (.). sequence unknown point alignment, represented asterisk (*). protein alignments null alleles, 'Stop' codons represented hash (X). protein alignments, sequence following termination codon, marked appear blank. conventions used nucleotide protein alignments.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_alignments.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get sequence alignments from IMGTHLA — hla_alignments","text":"","code":"hla_alignments( gene = \"DRB1\", type = \"prot\", release = \"latest\", verbose = FALSE )"},{"path":"https://slowkow.github.io/hlabud/reference/hla_alignments.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get sequence alignments from IMGTHLA — hla_alignments","text":"gene name gene like \"DRB1\" type type sequence, one \"prot\", \"nuc\", \"gen\" release Default \"latest\". release name like \"3.51.0\". verbose TRUE, print messages along way.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_alignments.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get sequence alignments from IMGTHLA — hla_alignments","text":"list dataframe called sequences two matrices alleles onehot. dataframe two columns: allele: name allele, e.g., DQB*01:01 seq: amino acid sequence matrix alleles one row allele, one column position, values representing residues position allele. 
matrix onehot one-hot encoding variants distinguish alleles, one row allele one column amino acid position.","code":""},{"path":[]},{"path":"https://slowkow.github.io/hlabud/reference/hla_alignments.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get sequence alignments from IMGTHLA — hla_alignments","text":"","code":"# \\donttest{ a <- hla_alignments(\"DRB1\") head(a$sequences) #> # A tibble: 6 × 2 #> allele seq #> #> 1 DRB1*01:01:01:01 MVCLKLPGGSCMTALTVTLMVLSSPLALAGDTRPRFLWQLKFECHFFNGTERVR.LLERC… #> 2 DRB1*01:01:01:02 ------------------------------------------------------.-----… #> 3 DRB1*01:01:01:03 ------------------------------------------------------.-----… #> 4 DRB1*01:01:01:04 ------------------------------------------------------.-----… #> 5 DRB1*01:01:01:05 ------------------------------------------------------.-----… #> 6 DRB1*01:01:01:06 ------------------------------------------------------.-----… a$alleles[1:6,1:6] #> posn29 posn28 posn27 posn26 posn25 posn24 #> DRB1*01:01:01:01 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" #> DRB1*01:01:01:02 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" #> DRB1*01:01:01:03 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" #> DRB1*01:01:01:04 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" #> DRB1*01:01:01:05 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" #> DRB1*01:01:01:06 \"M\" \"V\" \"C\" \"L\" \"K\" \"L\" a$onehot[1:6,1:6] #> posn29_unk posn29_M posn28_unk posn28_L posn28_V posn27_unk #> DRB1*01:01:01:01 0 1 0 0 1 0 #> DRB1*01:01:01:02 0 1 0 0 1 0 #> DRB1*01:01:01:03 0 1 0 0 1 0 #> DRB1*01:01:01:04 0 1 0 0 1 0 #> DRB1*01:01:01:05 0 1 0 0 1 0 #> DRB1*01:01:01:06 0 1 0 0 1 0 # }"},{"path":"https://slowkow.github.io/hlabud/reference/hla_alleles.html","id":null,"dir":"Reference","previous_headings":"","what":"Get a table of allele names for a particular IMGTHLA release — hla_alleles","title":"Get a table of allele names for a particular IMGTHLA release — hla_alleles","text":"Download list allele names HLA genes particular 
IMGTHLA release.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_alleles.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get a table of allele names for a particular IMGTHLA release — hla_alleles","text":"","code":"hla_alleles(release = \"latest\", overwrite = FALSE, verbose = FALSE)"},{"path":"https://slowkow.github.io/hlabud/reference/hla_alleles.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get a table of allele names for a particular IMGTHLA release — hla_alleles","text":"release Default \"latest\". release name like \"3.51.0\". overwrite Overwrite existing alleles.json file Allelelist.{version}.txt file verbose TRUE, print messages along way.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_alleles.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get a table of allele names for a particular IMGTHLA release — hla_alleles","text":"data frame HLA allele ids names","code":""},{"path":[]},{"path":"https://slowkow.github.io/hlabud/reference/hla_alleles.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get a table of allele names for a particular IMGTHLA release — hla_alleles","text":"","code":"# \\donttest{ head(hla_alleles()) #> AlleleID Allele #> 1 HLA00001 A*01:01:01:01 #> 2 HLA02169 A*01:01:01:02N #> 3 HLA14798 A*01:01:01:03 #> 4 HLA15760 A*01:01:01:04 #> 5 HLA16415 A*01:01:01:05 #> 6 HLA16417 A*01:01:01:06 # }"},{"path":"https://slowkow.github.io/hlabud/reference/hla_divergence.html","id":null,"dir":"Reference","previous_headings":"","what":"Calculate HLA divergence for each individual — hla_divergence","title":"Calculate HLA divergence for each individual — hla_divergence","text":"First, convert allele name (e.g. *01:01) amino acid sequence. divergence sum distances pair amino acids position, divided total sequence length. 
amino acid distance matrix use one published Grantham 1974 (doi:10.1126/science.185.4154.862), based three physical properties amino acids (composition, polarity, molecular volume) correlated estimate relative substitution frequency.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_divergence.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Calculate HLA divergence for each individual — hla_divergence","text":"","code":"hla_divergence( alleles = c(\"A*01:01,A*02:01\"), method = \"grantham\", release = \"latest\" )"},{"path":"https://slowkow.github.io/hlabud/reference/hla_divergence.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Calculate HLA divergence for each individual — hla_divergence","text":"alleles character vector comma-delimited alleles individual. usually expect two alleles per individual, possible (fewer) copies due copy number alterations. function still works individual different number alleles. method pairwise amino acid matrix, method name: \"grantham\" \"uniform\" indicate pairwise amino acid distance matrix use. choose pass matrix, 20x20 symmetric matrix zeros diagonal, rownames colnames one-letter amino acid codes R N D C Q E G H L K M F P S T W Y V. release Default \"latest\". release name like \"3.51.0\".","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_divergence.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Calculate HLA divergence for each individual — hla_divergence","text":"dataframe divergence individual.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_divergence.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Calculate HLA divergence for each individual — hla_divergence","text":"code function translation original Perl code Tobias Lenz, published Pierini & Lenz 2018 MolBiolEvol (https://doi.org/10.1093/molbev/msy116). 
comparing two amino acid sequences, characters one 20 amino acids considered divergence calculation, gaps (characters) count.","code":""},{"path":[]},{"path":"https://slowkow.github.io/hlabud/reference/hla_divergence.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Calculate HLA divergence for each individual — hla_divergence","text":"","code":"my_genos <- c(\"A*23:01:12,A*24:550\", \"A*25:12N,A*11:27\", \"A*24:381,A*33:85\", \"A*01:01:,A*01:01,A*02:01\") hla_divergence(my_genos, method = \"grantham\") #> A*23:01:12,A*24:550 A*25:12N,A*11:27 A*24:381,A*33:85 #> 0.4924242 3.3333333 4.9015152 #> A*01:01:,A*01:01,A*02:01 #> 3.8367003 # This is equivalent hla_divergence(my_genos, method = amino_distance_matrix(\"grantham\")) #> A*23:01:12,A*24:550 A*25:12N,A*11:27 A*24:381,A*33:85 #> 0.4924242 3.3333333 4.9015152 #> A*01:01:,A*01:01,A*02:01 #> 3.8367003"},{"path":"https://slowkow.github.io/hlabud/reference/hla_frequencies.html","id":null,"dir":"Reference","previous_headings":"","what":"Get HLA frequences from Allele Frequency Net Database (AFND) — hla_frequencies","title":"Get HLA frequences from Allele Frequency Net Database (AFND) — hla_frequencies","text":"Download read table HLA allele frequencies Allele Frequency Net Database (AFND).","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_frequencies.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get HLA frequences from Allele Frequency Net Database (AFND) — hla_frequencies","text":"","code":"hla_frequencies(verbose = FALSE)"},{"path":"https://slowkow.github.io/hlabud/reference/hla_frequencies.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get HLA frequences from Allele Frequency Net Database (AFND) — hla_frequencies","text":"verbose TRUE, print messages along 
way.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_frequencies.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get HLA frequences from Allele Frequency Net Database (AFND) — hla_frequencies","text":"dataframe HLA allele frequencies genes.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_frequencies.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Get HLA frequences from Allele Frequency Net Database (AFND) — hla_frequencies","text":"use data, please cite latest manuscript Allele Frequency Net Database: Gonzalez-Galarza FF, McCabe , Santos EJMD, Jones J, Takeshita L, Ortega-Rivera ND, et al. Allele frequency net database (AFND) 2020 update: gold-standard data classification, open access genotype data new query tools. Nucleic Acids Res. 2020;48: D783–D788. doi:10.1093/nar/gkz1029","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_frequencies.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get HLA frequences from Allele Frequency Net Database (AFND) — hla_frequencies","text":"","code":"# \\donttest{ hla_frequencies() #> # A tibble: 123,502 × 7 #> group gene allele population indivs_over_n alleles_over_2n n #> #> 1 hla A A*01:01 Argentina Rosario To… 15.1 0.076 86 #> 2 hla A A*01:01 Armenia combined Reg… NA 0.125 100 #> 3 hla A A*01:01 Australia Cape York … NA 0.053 103 #> 4 hla A A*01:01 Australia Groote Eyl… NA 0.027 75 #> 5 hla A A*01:01 Australia New South … NA 0.187 134 #> 6 hla A A*01:01 Australia Yuendumu A… NA 0.008 191 #> 7 hla A A*01:01 Austria 27 0.146 200 #> 8 hla A A*01:01 Azores Central Islan… NA 0.08 59 #> 9 hla A A*01:01 Azores Oriental Isla… NA 0.115 43 #> 10 hla A A*01:01 Azores Terceira Isla… NA 0.109 130 #> # ℹ 123,492 more rows # }"},{"path":"https://slowkow.github.io/hlabud/reference/hla_genes.html","id":null,"dir":"Reference","previous_headings":"","what":"Get HLA gene 
names from IMGTHLA — hla_genes","title":"Get HLA gene names from IMGTHLA — hla_genes","text":"Retrieve list txt files github.com/ANHIG/IMGTHLA/alignments return list gene names derived file names.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_genes.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get HLA gene names from IMGTHLA — hla_genes","text":"","code":"hla_genes(release = \"latest\", overwrite = FALSE, verbose = FALSE)"},{"path":"https://slowkow.github.io/hlabud/reference/hla_genes.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get HLA gene names from IMGTHLA — hla_genes","text":"release Default \"latest\". release name like \"3.51.0\". overwrite Overwrite existing genes.json file new one GitHub verbose TRUE, print messages along way.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_genes.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get HLA gene names from IMGTHLA — hla_genes","text":"tibble two columns: HLA gene names (\"\", \"DRB1\") types (\"nuc\", \"gen\", \"prot\").","code":""},{"path":[]},{"path":"https://slowkow.github.io/hlabud/reference/hla_genes.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get HLA gene names from IMGTHLA — hla_genes","text":"","code":"# \\donttest{ hla_genes() #> # A tibble: 107 × 2 #> gene type #> #> 1 A gen #> 2 A nuc #> 3 A prot #> 4 B gen #> 5 B nuc #> 6 B prot #> 7 C gen #> 8 C nuc #> 9 C prot #> 10 DMA gen #> # ℹ 97 more rows # }"},{"path":"https://slowkow.github.io/hlabud/reference/hla_releases.html","id":null,"dir":"Reference","previous_headings":"","what":"Get the names of releases from IMGTHLA — hla_releases","title":"Get the names of releases from IMGTHLA — hla_releases","text":"Get tags github.com/ANHIG/IMGTHLA, save file called tags.json getOption(\"hlabud_dir\"), return release names 
file.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_releases.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Get the names of releases from IMGTHLA — hla_releases","text":"","code":"hla_releases(overwrite = FALSE)"},{"path":"https://slowkow.github.io/hlabud/reference/hla_releases.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Get the names of releases from IMGTHLA — hla_releases","text":"overwrite Overwrite existing tags.json file getOption(\"hlabud_dir\")","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_releases.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Get the names of releases from IMGTHLA — hla_releases","text":"character vector release names like \"3.51.0\"","code":""},{"path":"https://slowkow.github.io/hlabud/reference/hla_releases.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Get the names of releases from IMGTHLA — hla_releases","text":"","code":"# \\donttest{ hla_releases() #> [1] \".354.0\" \"3.53.0\" \"3.52.0\" \"3.51.0\" \"3.50.0\" \"3.49.0\" #> [7] \"3.48.0\" \"3.47.0\" \"3.46.0\" \"3.45.1\" \"3.45.01\" \"3.45.0.1\" #> [13] \"3.45.0\" \"3.44.1\" \"3.44.0\" \"3.43.0\" \"3.42.0\" \"3.41.2\" #> [19] \"3.41.0\" \"3.40.0\" \"3.39.0\" \"3.38.0\" \"3.37.0\" \"3.36.0\" #> [25] \"3.35.0\" \"3.34.0\" \"3.33.0\" \"3.32.0\" \"3.31.0\" \"3.30.0\" # }"},{"path":"https://slowkow.github.io/hlabud/reference/hlabud-package.html","id":null,"dir":"Reference","previous_headings":"","what":"hlabud: Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA — hlabud-package","title":"hlabud: Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA — hlabud-package","text":"Fetch sequence alignment data IMGTHLA database Robinson et al (2020) doi:10.1093/nar/gkz950 , 
automatically convert sequence alignments convenient R matrices ready downstream analysis. vignette shows examples using one-hot encoding data logistic regression dimensionality reduction. Data downloaded lazily, -needed, cached user-configurable folder.","code":""},{"path":[]},{"path":"https://slowkow.github.io/hlabud/reference/hlabud-package.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"hlabud: Methods for Access and Analysis of the Human Leukocyte Antigen (HLA) Gene Sequence Alignments from IMGTHLA — hlabud-package","text":"Maintainer: Kamil Slowikowski kslowikowski@gmail.com (ORCID)","code":""},{"path":"https://slowkow.github.io/hlabud/reference/install_hla.html","id":null,"dir":"Reference","previous_headings":"","what":"Download and unpack a tarball release from IMGTHLA — install_hla","title":"Download and unpack a tarball release from IMGTHLA — install_hla","text":"release tarball Github unpacked getOption(\"hlabud_dir\") folder.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/install_hla.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Download and unpack a tarball release from IMGTHLA — install_hla","text":"","code":"install_hla(release = \"latest\", overwrite = FALSE, verbose = FALSE)"},{"path":"https://slowkow.github.io/hlabud/reference/install_hla.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Download and unpack a tarball release from IMGTHLA — install_hla","text":"release Default \"latest\". release name like \"3.51.0\". overwrite TRUE, overwrite existing files release folder. 
verbose TRUE, print messages along way.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/install_hla.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Download and unpack a tarball release from IMGTHLA — install_hla","text":"Note latest releases 100 MB size, download might take slow connections.","code":""},{"path":[]},{"path":"https://slowkow.github.io/hlabud/reference/install_hla.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Download and unpack a tarball release from IMGTHLA — install_hla","text":"","code":"if (FALSE) { install_hla() install_hla(\"3.51.0\") install_hla(\"3.51.0\", verbose = TRUE) # Change the install directory options(hlabud_dir = \"path/to/my/dir\") install_hla() }"},{"path":"https://slowkow.github.io/hlabud/reference/one_to_three.html","id":null,"dir":"Reference","previous_headings":"","what":"Convert one letter amino acid codes to three letter amino acid codes — one_to_three","title":"Convert one letter amino acid codes to three letter amino acid codes — one_to_three","text":"Convert one letter amino acid codes three letter amino acid codes","code":""},{"path":"https://slowkow.github.io/hlabud/reference/one_to_three.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Convert one letter amino acid codes to three letter amino acid codes — one_to_three","text":"","code":"one_to_three(aminos)"},{"path":"https://slowkow.github.io/hlabud/reference/pipe.html","id":null,"dir":"Reference","previous_headings":"","what":"Pipe operator — %>%","title":"Pipe operator — %>%","text":"See magrittr::%>% details.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/pipe.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Pipe operator — %>%","text":"lhs value magrittr placeholder. 
rhs function call using magrittr semantics.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/pipe.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Pipe operator — %>%","text":"result calling rhs(lhs).","code":""},{"path":"https://slowkow.github.io/hlabud/reference/read_alignments.html","id":null,"dir":"Reference","previous_headings":"","what":"Read an alignment file *_(nuc|gen|prot).txt from IMGTHLA — read_alignments","title":"Read an alignment file *_(nuc|gen|prot).txt from IMGTHLA — read_alignments","text":"function reads txt files provided IMGTHLA.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/read_alignments.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Read an alignment file *_(nuc|gen|prot).txt from IMGTHLA — read_alignments","text":"","code":"read_alignments(file)"},{"path":"https://slowkow.github.io/hlabud/reference/read_alignments.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Read an alignment file *_(nuc|gen|prot).txt from IMGTHLA — read_alignments","text":"file File name txt file IMGTHLA like \"DQB1_prot.txt\"","code":""},{"path":"https://slowkow.github.io/hlabud/reference/read_alignments.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Read an alignment file *_(nuc|gen|prot).txt from IMGTHLA — read_alignments","text":"list dataframe called sequences two matrices alleles onehot. dataframe two columns: allele: name allele, e.g., DQB*01:01 seq: amino acid sequence matrix alleles one row allele, one column position, values representing residues position allele. 
matrix onehot one-hot encoding variants distinguish alleles, one row allele one column amino acid position.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/read_alignments.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Read an alignment file *_(nuc|gen|prot).txt from IMGTHLA — read_alignments","text":"Consider using hla_alignments() instead function. already txt file want read, can read read_alignments(\"myfile.txt\"). sequences contained file: {gene}_prot.txt amino acid sequence HLA allele. {gene}_nuc.txt nucleotide sequence exons. {gene}_gen.txt genomic sequence exons introns.","code":""},{"path":"https://slowkow.github.io/hlabud/reference/read_alignments.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Read an alignment file *_(nuc|gen|prot).txt from IMGTHLA — read_alignments","text":"","code":"my_file <- file.path( \"https://github.com/ANHIG/IMGTHLA/raw\", \"5f2c562056f8ffa89aeea0631f2a52300ee0de17\", \"alignments/DRB1_prot.txt\" ) a <- read_alignments(my_file) head(a$sequences) #> # A tibble: 6 × 2 #> allele seq #> #> 1 DRB1*01:01:01:01 MVCLKLPGGSCMTALTVTLMVLSSPLALAGDTRPRFLWQLKFECHFFNGTERVR.LLERC… #> 2 DRB1*01:01:01:02 ------------------------------------------------------.-----… #> 3 DRB1*01:01:01:03 ------------------------------------------------------.-----… #> 4 DRB1*01:01:01:04 ------------------------------------------------------.-----… #> 5 DRB1*01:01:01:05 ------------------------------------------------------.-----… #> 6 DRB1*01:01:01:06 ------------------------------------------------------.-----… a$alleles[1:5,1:5] #> posn29 posn28 posn27 posn26 posn25 #> DRB1*01:01:01:01 \"M\" \"V\" \"C\" \"L\" \"K\" #> DRB1*01:01:01:02 \"M\" \"V\" \"C\" \"L\" \"K\" #> DRB1*01:01:01:03 \"M\" \"V\" \"C\" \"L\" \"K\" #> DRB1*01:01:01:04 \"M\" \"V\" \"C\" \"L\" \"K\" #> DRB1*01:01:01:05 \"M\" \"V\" \"C\" \"L\" \"K\" a$onehot[1:5,1:5] #> posn29_unk posn29_M 
posn28_unk posn28_V posn27_unk #> DRB1*01:01:01:01 0 1 0 1 0 #> DRB1*01:01:01:02 0 1 0 1 0 #> DRB1*01:01:01:03 0 1 0 1 0 #> DRB1*01:01:01:04 0 1 0 1 0 #> DRB1*01:01:01:05 0 1 0 1 0"},{"path":"https://slowkow.github.io/hlabud/news/index.html","id":"hlabud-1009999","dir":"Changelog","previous_headings":"","what":"hlabud 1.0.0.9999","title":"hlabud 1.0.0.9999","text":"Instead discarding positions *, include label unk, example pos241_unk indicates unknown amino acid position 241. Thanks Sreekar Mantena reporting issue! Fix --one error. example, HLA-pos361_- colnames($onehot) reference allele instead -. now fixed. Thanks Sreekar Mantena reporting issue!","code":""},{"path":"https://slowkow.github.io/hlabud/news/index.html","id":"hlabud-100","dir":"Changelog","previous_headings":"","what":"hlabud 1.0.0","title":"hlabud 1.0.0","text":"Added NEWS.md file track changes package.","code":""}]