HISAT 2.2.0 release 2/6/2020
This major version update includes a new feature to handle “repeat” reads. Based on sets of 100-bp simulated and 101-bp real reads that we tested, we found that 2.6-3.4% and 1.4-1.8% of the reads were mapped to >5 locations and >100 locations, respectively. Attempting to report all alignments would likely consume a prohibitive amount of disk space. In order to address this issue, our repeat indexing and alignment approach directly aligns reads to repeat sequences, resulting in one repeat alignment per read. HISAT2 provides application programming interfaces (API) for C++, Python, and JAVA that rapidly retrieve genomic locations from repeat alignments for use in downstream analyses.
Other minor bug fixes are also included as follows:
- Fixed occasional sign (+ or -) issues of template lengths in SAM file
- Fixed duplicate read alignments in SAM file
- Skip a splice site if exon’s last base or first base is ambiguous (N)