support for reference sequences with lowercase characters (a, c, g, t) #19

Merged 1 commit on Nov 8, 2024

tkonopka
Contributor

This addresses issue #18. It fixes problems rendering reference sequences with lowercase characters.

The fix comes in two parts. The first is an expansion of the COLOR dictionary to associate hex codes with each of the lowercase bases. All the colors are defined separately. This makes the code long, but leaves the option to tune the colors later.
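A minimal sketch of what such an expanded dictionary could look like. The hex codes below are placeholders chosen for illustration, not the values used in the commit, and the real COLOR dictionary may contain additional keys.

```python
# Hypothetical color table: the hex values are illustrative assumptions,
# not the ones defined in the actual commit.
COLOR = {
    # uppercase bases
    "A": "#00AA00",
    "C": "#0000FF",
    "G": "#D2691E",
    "T": "#FF0000",
    # lowercase bases, listed one by one so each can be tuned later
    "a": "#00AA00",
    "c": "#0000FF",
    "g": "#D2691E",
    "t": "#FF0000",
}


def base_color(base: str, default: str = "#888888") -> str:
    """Return the hex color for a base, or a fallback for unknown symbols."""
    return COLOR.get(base, default)
```

Writing out each lowercase entry, rather than deriving it with `str.lower()`, keeps the door open for giving soft-masked bases a lighter shade than their uppercase counterparts.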

The second part is an adjustment to the drawing of the coverage track. Without this edit, the coverage track flags discrepancies between bases in reads (uppercase) and the reference (lowercase), even where the bases are the same. The images below show a random sequence with made-up reads, before and after implementing the edit.

before

after
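The idea behind the coverage-track adjustment can be sketched as a case-insensitive comparison. The function name `count_mismatches` is hypothetical and this is not the code from the commit; the reference window is taken from positions 600-630 of the reproduction data below, together with the first read.

```python
def count_mismatches(reference: str, read: str, ignore_case: bool = True) -> int:
    """Count positions where an ungapped read differs from the reference window."""
    if len(reference) != len(read):
        raise ValueError("reference window and read must have the same length")
    if ignore_case:
        # Normalize both sides so 'a' in the reference matches 'A' in the read.
        reference, read = reference.upper(), read.upper()
    return sum(1 for ref_base, read_base in zip(reference, read) if ref_base != read_base)


# Reference positions 600-630 (mixed case) and the read aligned there with CIGAR 31M.
ref_window = "AaagGgTcAcaTTtTcCgcAtcaggcgCCGC"
read_seq = "AAAGGGTCACATTTTCCGCATCAGGCGCCGC"

strict = count_mismatches(ref_window, read_seq, ignore_case=False)
relaxed = count_mismatches(ref_window, read_seq, ignore_case=True)
print(strict, relaxed)
```

With a strict comparison, every lowercase reference base counts as a discrepancy. After normalizing case, the read agrees with the reference, which is consistent with the `NM:i:0` tag in the SAM record.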

To reproduce, here is a random reference sequence (FASTA) and two reads in SAM format.

>random_with_lowercase
caAagTGtgCACCAtTTCtCactaacaGGCGgtaTaAAgtgatAGTtagaaTTTgaagaT
gATTAAggcTgctGGtGtGTGaTAtcGgtcaGaTatgaTAaCagagAaCCcCTcgatgtt
aAgAGgTTtCaaCatGGgTtgaaAcTaccgtgCTtTgTtTGgtGaTTTcatttctcaGAC
aTTaCAgTcGGggtGtGaTCGgtGgtGCcgGTcATAgATAACGCtaAgTgGcGacGGtGa
gAaAAcGCtgCAattgaaaGtgGcacgacGACgCTGacCtccCaGtGTGgACgGgTaCtg
ttgcaaacGCagtTtACcgGGATaTAAaTTgGGGCcccaGAggAaAGaTgcTCCagATCg
aGTGcgGAGCtTGCGttaaGCcATTgCcaTaGGTAggcttcGAGTcaCTTCcAAcCGATT
taAcGgGtaTAGAtGtTgcTGAGgGaCaTtTCATtcCAcACTCatGcttTAGtaacCcTt
cGGCaacatCcagtAtagtTtcTgAgcTgctaCttCatAgaTTgAAcTGACCGgACaAcA
ctACTGCGTattCatgcCGCctCCCcccCaAGtCCTGacAccgggtaggtataTGTacGA
aagGgTcAcaTTtTcCgcAtcaggcgCCGCgTTATGaCGtCTTcGGAcaTCTtGcGTAgG
agtcCacaaTAagccCCtaAgCTttAcTtAGccGgagtTcAaAGAGaATGCcGCGGcGCA
AgCTgggtccTTgtttAGAaATgTGCAAATtGGAGGtTtAtCACTTTGgTtaGTcGaTaT
gTTcgGgAtTAccttTaCGctgataGGaGcaTcccgAcTccAtAtctaGGgGCcCtGgcA
CTATGaAGTCTAatAgTaGTggTTCtCaaaCGCTccgtGTccagTTGCacATCcTACTCT
CTGAcTGctGagcCcgaTGCcctGtCgtAAatGACgcaGCcAgGGGTGtcTActggGCtg
CcgaCCgacccttAcGTTtacgaacggGATctacGtTTgAcGCATcAgCAGCAAaGaTAA
CTGaATGacgCtgTCAgtcTaGTActCcAACAaAAGTtcGtGCTTtGGCCGAGAggGgcC
tAcgtgGcGAcAaCaAaCtaGAATAacATAaaaAtCaagGtCgGcgtgggtgcAtCcgtT
gTtGCGAtCttttTttAAgtGGActaCgTtgcTttCgAcaATccCGtccgAgTCCActgC
aCaTgGGAGtTgTtaGTCTtccacAagTcCaCtgctGtcTAaAcTacTAGatgAaAGTCC
ACaggacTaaatctcaAAcAAGcTACtGCCAaaCAtTggTtATGTtAgaAGttGtGGATC
CCACtGCggtaTaataggCgccCgtAGGGGCggAAcctGCtcgCtgtgGGTcTTgtaGGa
GcAcgGAcACCgaGcGAcGGcgGttTctGtgccAgCACaCgttCTtTgcgaGaTagGttC
@SQ	SN:random_with_lowercase	LN:1440
@PG	ID:bwa	PN:bwa	VN:0.7.17-r1188	CL:bwa mem -o example.sam large.fa example.fastq
random_with_lowercase:600-630	0	random_with_lowercase	600	60	31M	*	0	0	AAAGGGTCACATTTTCCGCATCAGGCGCCGC	sssssssssssssssssssssssssssssss	NM:i:0	MD:Z:31	AS:i:31	XS:i:0
random_with_lowercase:620-650	0	random_with_lowercase	620	60	31M	*	0	0	TCAGGCGCCGCGTTATGACGTCTTCGGACAT	sssssssssssssssssssssssssssssss	NM:i:0	MD:Z:31	AS:i:31	XS:i:0

Contributor

@nr23730 nr23730 left a comment

This is very helpful! I would like to see this merged.
By the way: This fixes #22 and #18.

@blaiseli
blaiseli commented Nov 7, 2024

@danielmsk Could you accept this pull request? This might make your software much easier to use for some people.

Thanks in advance.

Collaborator

@danielmsk danielmsk left a comment

Thanks for the fix! @tkonopka

@danielmsk danielmsk merged commit b95f02d into parklab:master Nov 8, 2024
4 participants