forked from galantelab/sideRETRO
ChangeLog
2020-08-11 Thiago L. A. Miller <[email protected]>
* Version 1.0.0
* UPDATE: AUTHORS, docs/authors.rst: Author names
* ADD: README.md (Citation): Citation in BibTeX
* ADD: *.[ch]: Attach the copying notices header
* ADD: docs/intro.rst (Citation): Citation in BibTeX
* UPDATE: docs/usage.rst (General Syntax)
* UPDATE: docs/result.rst (Results)
* UPDATE: check_sider_main.c (main): Control log
Enable and disable log verbosity according to the environment variable
in the macro LOG_DEBUG_KEY
* CHANGE: merge_call.c (DEFAULT_SUPPORT): 1
Disable reclustering (support filtering) by default
* FIX: vcf.c (genotype_likelihood): 0,0,0
Set genotype to ./. when the probabilities are 0,0,0
2020-08-05 Thiago L. A. Miller <[email protected]>
* UPDATE: fasta.c: Use 'gz' interface
* CHANGE: gz.c: Make (GzFile *) public
* UPDATE: gff.c: Use 'gz' interface
* UPDATE: bed.c: Use 'gz' interface
2020-08-03 Thiago L. A. Miller <[email protected]>
* ADD: utils.c (buf_expand, entry_set): Control buf
Remove duplicated code by adding these functions to utils.c
2020-08-01 Thiago L. A. Miller <[email protected]>
* ADD: gz.c: Add wrapper for libz
2020-07-29 Thiago L. A. Miller <[email protected]>
* ADD: main.c (print_citation): Show citation
Add citation option. Print it in BibTeX format
* FIX: vcf.c (genotype_likelihood): Be conservative
Restructure the genotyping. Compare doubles using 'fequal' and, in
case of equality, follow the table:
HOR  HET  HOA  GENOTYPE
2    2    2    0/1
2    2    1    0/0
2    1    2    0/0
1    2    2    0/1
* ADD: utils.c (fequal): Compare floating-point values
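The tie-breaking table above can be sketched as follows. This is a minimal illustration, not the code in vcf.c: the tolerance used by `fequal`, the `Genotype` enum and the name `genotype_call` are assumptions made for this example, and the three inputs are taken to be the likelihoods for homozygous reference (HOR), heterozygous (HET) and homozygous alternate (HOA).

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* Hypothetical genotype codes, used only in this sketch. */
typedef enum { GT_HOM_REF, GT_HET, GT_HOM_ALT } Genotype;

/* Compare two doubles with a relative tolerance. The tolerance is an
 * assumption: DBL_EPSILON scaled by the larger magnitude (at least 1.0). */
static bool fequal(double a, double b)
{
    double diff = a > b ? a - b : b - a;
    double abs_a = a < 0 ? -a : a;
    double abs_b = b < 0 ? -b : b;
    double scale = abs_a > abs_b ? abs_a : abs_b;

    if (scale < 1.0)
        scale = 1.0;
    return diff <= DBL_EPSILON * scale;
}

/* Pick the most likely genotype; ties are resolved as in the table:
 * all equal -> 0/1, any tie that includes HOR -> 0/0, HET == HOA -> 0/1. */
static Genotype genotype_call(double hor, double het, double hoa)
{
    double best = hor;
    if (het > best) best = het;
    if (hoa > best) best = hoa;

    bool top_hor = fequal(hor, best);
    bool top_het = fequal(het, best);
    bool top_hoa = fequal(hoa, best);

    /* At least one value must be the maximum (fails only for NaN input). */
    assert(top_hor || top_het || top_hoa);

    if (top_hor && top_het && top_hoa)
        return GT_HET;
    if (top_hor)
        return GT_HOM_REF;
    if (top_het)
        return GT_HET;
    return GT_HOM_ALT;
}
```

Calling `genotype_call (0.4, 0.4, 0.2)` falls into the second row of the table and yields the conservative 0/0 call.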
* UPDATE: process_sample.c, docs/usage.rst: Help
Update the argument 'annotation-file' help. Explain the required fields
and the lines that are filtered
2020-07-28 Thiago L. A. Miller <[email protected]>
* FIX: gff.c (gff_read): Spaces into GTF attr
Solve a bug when reading GTF, and possibly GFF3 as well, where there is a
space in the attr value: gene_name "My gene name". Now, the attr is
correctly split as: key=gene_name and value='My gene name'
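A minimal sketch of the splitting described in this entry, assuming a single attribute of the form `key "value";`. The function names `copy_span` and `gtf_attr_split` are hypothetical and are not taken from gff.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy at most dst_size-1 bytes of [src, src+len) and NUL-terminate. */
static void copy_span(char *dst, size_t dst_size, const char *src, size_t len)
{
    assert(dst != NULL && dst_size > 0);
    if (len >= dst_size)
        len = dst_size - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Split one GTF attribute such as: gene_name "My gene name";
 * The key ends at the first whitespace; a quoted value runs up to the
 * closing quote, so spaces inside it are preserved.
 * Returns 1 on success and 0 when no key/value pair is found. */
static int gtf_attr_split(const char *attr, char *key, size_t key_size,
                          char *value, size_t value_size)
{
    const char *p = attr;
    const char *start;

    assert(attr != NULL);

    while (isspace((unsigned char) *p))
        p++;
    start = p;
    while (*p != '\0' && !isspace((unsigned char) *p))
        p++;
    if (p == start)
        return 0;
    copy_span(key, key_size, start, (size_t) (p - start));

    while (isspace((unsigned char) *p))
        p++;

    if (*p == '"') {
        const char *end = strchr(++p, '"');
        if (end == NULL)
            return 0;               /* unterminated quote */
        copy_span(value, value_size, p, (size_t) (end - p));
    } else {
        /* Unquoted value: take everything up to ';' and trim the tail. */
        size_t len = strcspn(p, ";");
        while (len > 0 && isspace((unsigned char) p[len - 1]))
            len--;
        if (len == 0)
            return 0;
        copy_span(value, value_size, p, len);
    }
    return 1;
}
```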
2020-07-26 Thiago L. A. Miller <[email protected]>
* UPDATE: docs/result.rst: Start of text/imgs/codes
2020-07-23 Thiago L. A. Miller <[email protected]>
* FIX: .travis.yml: meson version
meson 0.55.0 fails when the option '-Db_coverage=true' is set
* ADD: docs/intro.rst: Add CRAM
* ADD: docs/method.rst: A new paragraph at beginning
2020-07-22 Thiago L. A. Miller <[email protected]>
* FIX: docs/usage.rst (A Practical Workflow)
Now sideRETRO works with CRAM format, so fix and update this example of
usage
* ADD: docs/usage.rst (Dealing with CRAM format)
* UPDATE: *.c: Include CRAM to the input files
* FIX: main.c: Typo in the help message
* FIX: meson.build: This closes #1
* UPDATE: docs/conf.py: Update year
2020-05-24 Thiago L. A. Miller <[email protected]>
* FIX: abnormal.c (dump_alignment): (long) core.pos
core.pos, in more recent versions of htslib, is declared as long int,
and not as int. So, cast core.pos to long int, in order to avoid gcc
warnings
2020-04-02 Thiago L. A. Miller <[email protected]>
* FIX: abnormal.c (sam_rewind): Working with CRAM
2020-03-26 Thiago L. A. Miller <[email protected]>
* Version 0.14.1
* FIX: meson.build: Make config.h before compiling
'config.h' is dynamically generated from the template 'config.h.in', so it
needs to be made before compiling. In order to ensure the required order,
declare 'config.h' as a dependency of the library and executable
2020-03-24 Thiago L. A. Miller <[email protected]>
* FIX: intro.rst (Features): Typo in 'Strandness'
* FIX: merge_call.c (merge_call_init): phred_quality
Set .phred_quality attribute to the default value at macro
DEFAULT_PHRED_QUALITY. Previously it was set to 0
2020-01-31 Thiago L. A. Miller <[email protected]>
* FIX: Dockerfile: Install git
2020-01-04 Thiago L. A. Miller <[email protected]>
* ADD: docs/meson.build (doc_images): barbara.jpg
2019-12-31 Thiago L. A. Miller <[email protected]>
* Version 0.14.0
2019-12-30 Thiago L. A. Miller <[email protected]>
* FIX: make_vcf.c: Aspell fix spelling errors
* FIX: merge_call.c: Aspell fix spelling errors
* FIX: process_sample.c: Aspell fix spelling errors
* FIX: docs/*.rst: Aspell fix spelling errors
* FIX: usage.rst (A Practical Workflow): Analysis results
* ADD: analyser.pl
2019-12-30 Thiago L. A. Miller <[email protected]>
* CHANGE: intro.rst: Functionalities -> Features
* CHANGE: README.md: Getting Started
2019-12-29 Thiago L. A. Miller <[email protected]>
* FIX: intro.rst: Typo
* ADD: README.md: Documentation
* ADD: retrocopy.rst: Complete references
* ADD: usage.rst: Usage and examples
* ADD: result.rst: Finish analysis topic
* ADD: result.rst: Download simulation data
* ADD: result.rst: Analysis confusion matrix
2019-12-28 Thiago L. A. Miller <[email protected]>
* ADD: result.rst: Analysis heatmap image
* ADD: result.rst: Running sideRETRO
2019-12-27 Thiago L. A. Miller <[email protected]>
* ADD: result.rst: Dataset and simulation topics
2019-12-26 Thiago L. A. Miller <[email protected]>
* ADD: retrocopy.rst: Retrocopy in a nutshell
2019-12-25 Thiago L. A. Miller <[email protected]>
* ADD: README.md: Dockerhub badge
* ADD: Dockerfile: docker image
* FIX: method.rst: Change Wikipedia to paper links
* CHANGE: footer.html: Style of button = normal
2019-12-24 Thiago L. A. Miller <[email protected]>
* ADD: footer.html: Extend RTD footer with COPYING message
* ADD: legalcode.txt: COPYING for documentation
* CHANGE: method.rst: References + Further Reading
2019-12-23 Thiago L. A. Miller <[email protected]>
* ADD: method.rst: Orientation
2019-12-22 Thiago L. A. Miller <[email protected]>
* ADD: method.rst (Genotype): More details and image
Explain the genotype likelihoods in more detail, including the formula
derivations and an illustrative image
2019-12-20 Thiago L. A. Miller <[email protected]>
* ADD: method.rst: Genotype
2019-12-11 Thiago L. A. Miller <[email protected]>
* ADD: method.rst: Clustering
2019-12-09 Thiago L. A. Miller <[email protected]>
* ADD: method.rst: Abnormal alignments
2019-12-06 Thiago L. A. Miller <[email protected]>
* ADD: intro.rst: Host gene and make-vcf subcommand
2019-11-23 Thiago L. A. Miller <[email protected]>
* FIX: install.rst: Add 'Installing Meson'
2019-11-19 Thiago L. A. Miller <[email protected]>
* CHANGE: README.md: Add sideRETRO logo
* ADD: logo_sideRETRO.png
* ADD: install.rst: Installation page
* FIX: intro.rst: How it works
* ADD: README.md: readthedocs badge
* ADD: intro.rst: Introduction page
2019-11-16 Thiago L. A. Miller <[email protected]>
* ADD: docs: Documentation using sphinx
2019-12-25 Thiago L. A. Miller <[email protected]>
* ADD: make_vcf.c: Improve help message
* ADD: merge_call.c: Improve help and defaults
* ADD: process_sample.c: Improve help and defaults
* ADD: main.c: Improve help message
2019-12-24 Thiago L. A. Miller <[email protected]>
* FIX: utils.c (path_file): Remove extension at last dot
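A sketch of cutting a file name at its last dot. The entry does not say what else path_file does; stripping the directory part, keeping a leading dot (hidden files) and the name `path_stem` are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write the file name of `path`, without directory and without the
 * extension that starts at the LAST dot, into `out`. */
static void path_stem(const char *path, char *out, size_t out_size)
{
    const char *base;
    const char *dot;
    size_t len;

    assert(path != NULL && out != NULL && out_size > 0);

    base = strrchr(path, '/');
    base = (base != NULL) ? base + 1 : path;

    /* Search from the right, so "a.sorted.bam" loses only ".bam". */
    dot = strrchr(base, '.');
    len = (dot != NULL && dot != base) ? (size_t) (dot - base) : strlen(base);

    if (len >= out_size)
        len = out_size - 1;
    memcpy(out, base, len);
    out[len] = '\0';
}
```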
2019-12-19 Thiago L. A. Miller <[email protected]>
* FIX: retrocopy.c: Same parental in overlapped rtc
When two overlapped clusters share the same parental gene - maybe the
edge points are reachable, not core points (DBSCAN) - it would be
annotated as overlapped parentals. Now it will be annotated as PASS and
the parental gene name won't be duplicated (e.g. PTEN/PTEN)
* FIX: genotype.c (cross_insertion_point): ref evid
Evidence for reads covering the reference allele is now calculated from
the overlap between the read range and the insertion point +/- read half
decile. This is necessary in order to avoid overestimation of reads
covering the reference allele due to mapping errors
2019-12-06 Thiago L. A. Miller <[email protected]>
* REMOVE: genotype.c, merge_call.c: --crossing-reads
* ADD: process_sample.c, merge_call.c: Final messages
2019-12-05 Thiago L. A. Miller <[email protected]>
* ADD: db.c, genotype.c vcf.c: INFO DP2
Add reference and alternate depth counts to VCF's info
2019-12-03 Thiago L. A. Miller <[email protected]>
* CHANGE: db.c, genotype.c, vcf.c: Genotype likelihood
Implement the genotype estimation by using the likelihood approach as it
is defined in Heng Li paper: "A statistical framework for SNP calling,
mutation discovery, association mapping and population genetical
parameter estimation from sequencing data - 2.2 (eq2)"
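For reference, the cited equation can be written as below. It is transcribed from the paper rather than from sideRETRO's sources, so the notation is an assumption here: m is the ploidy (2 for a diploid sample), k the number of observations at the site (bases in the paper, presumably reads in this context), of which the first l support the reference allele, epsilon_j the error probability of the j-th observation, and g the number of reference alleles in the genotype.

```latex
\mathcal{L}(g) = \frac{1}{m^{k}}
  \prod_{j=1}^{l} \bigl[(m-g)\,\epsilon_j + g\,(1-\epsilon_j)\bigr]
  \prod_{j=l+1}^{k} \bigl[(m-g)\,(1-\epsilon_j) + g\,\epsilon_j\bigr]
```

With m = 2, the values g = 2, 1 and 0 would correspond to the homozygous reference, heterozygous and homozygous alternate genotypes.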
2019-11-28 Thiago L. A. Miller <[email protected]>
* Version 0.13.0
2019-11-27 Thiago L. A. Miller <[email protected]>
* ADD: make_vcf.c: make-vcf command
* FIX: vcf.c: Some typos in variable names
2019-11-26 Thiago L. A. Miller <[email protected]>
* ADD: genotype.c (genotype): phred_qualit
Filter reads crossing insertion point also by phred quality score
2019-11-25 Thiago L. A. Miller <[email protected]>
* FIX: vcf.c: Change fprintf to xfprintf
* FIX: vcf.c (vcf_print_header): Add array_free()
* ADD: check_sider_vcf.c: Tests for vcf.c interface
* ADD: vcf.c (vcf_index_fasta): VCF's REF field
Index the FASTA file if passed by the user and use it to annotate the REF
field. Also include the contig metadata in the header for each
chromosome
* ADD: fasta.c interface: Read FASTA file
* ADD: str.h (string_reset): Macro for reset string
* FIX: str.c (string_concat): strncat -> memcpy
strncat is too slow, so implement concatenation with memcpy
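The reasoning behind this change can be sketched as follows: strncat has to scan the destination for its terminating NUL on every call, whereas a string that tracks its own length can append with a single memcpy. The `StrBuf` type and its functions are hypothetical and only illustrate the idea; they are not the interface in str.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string that remembers its length. */
typedef struct {
    char  *str;    /* NUL-terminated contents */
    size_t len;    /* strlen(str), kept up to date */
    size_t alloc;  /* bytes allocated for str */
} StrBuf;

static StrBuf *strbuf_new(void)
{
    StrBuf *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->alloc = 16;
    s->len = 0;
    s->str = malloc(s->alloc);
    if (s->str == NULL) {
        free(s);
        return NULL;
    }
    s->str[0] = '\0';
    return s;
}

/* Append src. Returns 0 on success, -1 if memory could not be grown. */
static int strbuf_concat(StrBuf *s, const char *src)
{
    size_t n, need;

    assert(s != NULL && src != NULL);
    n = strlen(src);
    need = s->len + n + 1;

    if (need > s->alloc) {
        size_t new_alloc = s->alloc;
        char *p;
        while (new_alloc < need)
            new_alloc *= 2;
        p = realloc(s->str, new_alloc);
        if (p == NULL)
            return -1;
        s->str = p;
        s->alloc = new_alloc;
    }

    /* The end of the string is known, so no scan is needed:
     * copy src together with its terminating NUL. */
    memcpy(s->str + s->len, src, n + 1);
    s->len += n;
    return 0;
}

static void strbuf_free(StrBuf *s)
{
    if (s != NULL) {
        free(s->str);
        free(s);
    }
}
```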
2019-11-23 Thiago L. A. Miller <[email protected]>
* ADD: vcf.c: Print host information
Add INFO: EXONIC, INTRONIC and NEAR. EXONIC/INTRONIC annotate the host gene
detail. We chose to use separate flags, because a retrocopy can be
exonic and intronic at the same time - overlapped parental genes,
multiple isoforms. NEAR is used for an intergenic retrocopy near some
gene
2019-11-21 Thiago L. A. Miller <[email protected]>
* ADD: vcf.c interface. Write VCF file as output
2019-11-19 Thiago L. A. Miller <[email protected]>
* ADD: db.c, genotype.c: (reference_depth)
Save the reference depth for the absence of a retrocopy event. It will
be useful later when making the annotated outputs
2019-11-16 Thiago L. A. Miller <[email protected]>
* ADD: merge_call.c (merge_call_print)
* ADD: process_sample.c (process_sample_print)
2019-11-15 Thiago L. A. Miller <[email protected]>
* ADD: merge_call.c (MergeCall): struct
* ADD: process_sample.c (ProcessSample): struct
Add ProcessSample struct in order to hold all option values. Create
methods to init, destroy and validate the data. This way, things get
more organized and make it easier to dump the log values of the run
* FIX: merge_call.c (merge_call): Stop if no cluster
* FIX: process_sample.c, merge_call.c: Improve usage
2019-11-14 Thiago L. A. Miller <[email protected]>
* Version 0.12.0
* REMOVE: align.c, bwa.c: BWA subproject
Remove bwa subproject, because it won't be used any longer
2019-11-13 Thiago L. A. Miller <[email protected]>
* FIX: .travis.yml: Change (-t) timeout 2 -> 3
* ADD: genotype.c (dump_genotype): Save genotype
* FIX: genotype.c (prepare_genotype_query_stmt)
Add DISTINCT clause to SQL query in order to remove duplicated entries
* ADD: merge_call.c (genotype): Genotyping step
Add genotype step and its options: --threads and --crossing-reads
* CHANGE: genotype.c: max_cross -> crossing_reads
* ADD: retrocopy.c (retrocopy): Option for near gene
Add option --near-gene-distance in order to make the MAX_DIST value
configurable
* ADD: genotype.c (clean_genotype_table)
2019-11-12 Thiago L. A. Miller <[email protected]>
* FIX: genotype.c (zygosity): hts_idx_destroy
Clean up memory leak in hts_idx_t
* FIX: check_sider_{genotype,gff,wrapper}.c
Memory leaks in tests
* FIX: .travis.yml: Turn off libcheck timeout
* ADD: check_sider_genotype.c test
* ADD: genotype.c (max_cross, db_insert_genotype)
Add option max_cross (override MAX_CROSS). And finally, dump genotype to
database
* ADD: db.c (db_{prepare,insert}_genotype): genotype
2019-11-11 Thiago L. A. Miller <[email protected]>
* FIX: README.md (coveralls badge): master branch
* ADD: genotype.c (zygosity_{indexed,linear}_search)
If the BAM index is found, perform a fast search for each retrocopy
region; otherwise index all regions inside an interval tree and make a slow
linear search over the whole file
2019-11-09 Thiago L. A. Miller <[email protected]>
* ADD: genotype.c interface
* FIX: retrocopy.h: include db.h
2019-11-07 Thiago L. A. Miller <[email protected]>
* Version 0.11.0
* ADD: process_sample.c: --deduplicate option
* ADD: merge_call.c (merge_call): Run retrocopy step
* ADD: cluster.c (prepare_query_stmt): test type
The dedup step sets the alignment type to the ABNORMAL_NONE flag if the read
is a duplicate. So, in the clustering step, it is necessary to avoid
using those duplicated reads
* ADD: check_sider_dedup.c test
2019-11-04 Thiago L. A. Miller <[email protected]>
* ADD: dedup.c interface: Remove duplicated reads
Add deduplication capability. Remove all duplicated reads but one, which is
called the primary read. Other tools, specialized in removing duplicates,
use some metric to choose the primary reads. For us, only the coordinates
matter, so the primary reads are chosen randomly -
mostly the first one to appear
* ADD: abnormal.c (abnormal_classifier): Dup filter
Add filter for PCR or optical duplicated reads
2019-10-31 Thiago L. A. Miller <[email protected]>
* FIX: cluster.c, retrocopy.c (qname, source_id idx)
There is the possibility that different BAM files share reads with the
same query name. To avoid confusion when looking for the right mate, use
the source_id along with qname when required to match reads from the
same fragment
2019-10-27 Thiago L. A. Miller <[email protected]>
* ADD: retrocopy.c (clean_retrocopy_tables)
* ADD: retrocopy.c (calculate_orientation)
* FIX: meson.build: sqlite3 version 3.28.0
Change sqlite3 minimum version to 3.28.0 in order to support RANK()
functions
* ADD: check_sider_retrocopy.c test
Add test for 'retrocopy.c'. Also '.travis.yml' now prints all relevant
error logs
* ADD: retrocopy.c (merge_cluster)
Add merge_cluster function in order to populate 'cluster_merging' table
and classify retrocopy level.
Begin annotate_retrocopy function. For now it is capable of calculating
the insertion point
* ADD: db.c (db_{prepare, insert}_retrocopy)
2019-10-24 Thiago L. A. Miller <[email protected]>
* Version 0.10.0
* FIX: cluster.c (cluster): Clean tables before run
Clean clustering tables if a db exists in mc step
* CHANGE: process_sample.c: --max-base-frac -> freq
* FIX: cluster.c (prepare_filter_support_stmt): sid
Fix query statement by adding cluster.sid to grouping by
* CHANGE: merge_call.c (--gff-feature): -F -> -T
* ADD: gff.h (gff_filter_[hard|soft]_attribute_size)
2019-10-23 Thiago L. A. Miller <[email protected]>
* ADD: cluster.c, merge_call.c: --blacklist-padding
Option to increase the blacklisted regions (left and right) by N bases
2019-10-22 Thiago L. A. Miller <[email protected]>
* ADD: check_sider_blacklist.c
* ADD: exon.c (exon_tree_index_dump): GffFilter
* FIX: cluster.c (prepare_query_stmt): rlen == 0
* ADD: blacklist.c, merge_call.c: GffFilter
* ADD: gff.c (gff_filter_insert_feature)
2019-10-21 Thiago L. A. Miller <[email protected]>
* FIX: abnormal.c (dump_alignment): rlen == 0
* ADD: gff.c (gff_looks_like_gff_file): GFF filename
Test if the filename seems to belong to a GFF/GFF3/GTF file
* FIX: db.c (db_create_tables): Composite PK
Add cluster_sid references cluster(sid) to foreign keys and primary keys
into the table 'cluster_merging'
2019-10-20 Thiago L. A. Miller <[email protected]>
* ADD: cluster.c, db.c, blacklist.c: Filter column
Add cluster filter: CHR, DIST, REGION, SUPPORT, NONE. The philosophy now
is to keep all clusters and subclusters and add a new column to handle
the filtering steps
2019-10-15 Thiago L. A. Miller <[email protected]>
* ADD: gff.c (gff_filter_*)
Add gff filtering capabilities. The user must initialize a GffFilter
with the feature_type to be filtered (e.g. gene, transcript, exon) and
may add attributes as well (e.g. gene_type=protein_coding). The
attributes are hard or soft: hard attributes must all
match the pattern (AND), whereas at least one soft attribute must match
its pattern (OR).
Increase the 'gff.c' testing coverage and change the function name
'gff_open' to 'gff_open_for_reading'
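The hard/soft semantics can be sketched as below. The `Attr` type, the function names and the use of exact string comparison instead of pattern matching are assumptions made for this example; so is the choice to accept an entry when no soft attribute was given. It does not reproduce the GffFilter interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value attribute, e.g. gene_type=protein_coding. */
typedef struct {
    const char *key;
    const char *value;
} Attr;

/* Return 1 if `wanted` is present among the entry's attributes. */
static int has_attr(const Attr *entry, size_t n_entry, const Attr *wanted)
{
    size_t i;
    for (i = 0; i < n_entry; i++)
        if (strcmp(entry[i].key, wanted->key) == 0
                && strcmp(entry[i].value, wanted->value) == 0)
            return 1;
    return 0;
}

/* Hard attributes must ALL match (AND); among the soft attributes at
 * least ONE must match (OR). An empty soft list accepts the entry. */
static int filter_match(const Attr *entry, size_t n_entry,
                        const Attr *hard, size_t n_hard,
                        const Attr *soft, size_t n_soft)
{
    size_t i;

    assert(entry != NULL || n_entry == 0);

    for (i = 0; i < n_hard; i++)
        if (!has_attr(entry, n_entry, &hard[i]))
            return 0;

    if (n_soft == 0)
        return 1;

    for (i = 0; i < n_soft; i++)
        if (has_attr(entry, n_entry, &soft[i]))
            return 1;

    return 0;
}
```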
2019-10-14 Thiago L. A. Miller <[email protected]>
* ADD: blacklist.c
Add blacklist interface and tables blacklist, overlapping_blacklist. The
interface was inspired by 'exon.c'. Also add more command-line
options for indexing blacklisted regions from GTF/GFF3/BED files
2019-10-12 Thiago L. A. Miller <[email protected]>
* ADD: bed.c: Handle BED file format
2019-10-10 Thiago L. A. Miller <[email protected]>
* ADD: cluster.c (reclustering): genotype support
Add reclustering step in order to filter out low numbers of reads coming
from a given source (BAM). When those reads are removed, the cluster may
become rarefied and, therefore, invalid according to DBSCAN
constraints.
The option --genotype-support has been added to control the low number
of reads coming from a source
2019-10-07 Thiago L. A. Miller <[email protected]>
* ADD: log.h (__FILENAME__): Macro for filename
* CHANGE: db.c, cluster.c (cluster table)
Move gene_name from clustering to cluster and add the window start and
end information
2019-10-06 Thiago L. A. Miller <[email protected]>
* CHANGE: db.c (schema table)
Replace DB_SCHEMA_VERSION (float) with two integer values (%d.%d):
DB_SCHEMA_MAJOR_VERSION and DB_SCHEMA_MINOR_VERSION
2019-10-05 Thiago L. A. Miller <[email protected]>
* Version 0.9.0
* FIX: cluster.c, db.c, merge_call.c, recluster.c
Remove 'recluster.c' and merge its functionality to 'cluster.c'.
Catch all alignments whose mate overlaps a given exon and filter them
by: blacklisted chromosomes (e.g. chrM); read cannot be exonic from its
own parental; distance from its own parental gene.
Reshape database schema removing reclustering table and add retrocopy
and related tables to make relation among retrocopies and clusters
2019-09-30 Thiago L. A. Miller <[email protected]>
* ADD: correlation.c interface
Add Pearson and Spearman correlation statistics brought from GSL
2019-09-29 Thiago L. A. Miller <[email protected]>
* FIX: db.c (db_create_tables): foreign key
2019-09-25 Thiago L. A. Miller <[email protected]>
* FIX: process_sample.c: int -> float max_base_frac
* ADD: abnormal.c, process_sample.c (max-base-frac)
Filter reads according to the base frequency fraction in order to avoid
alignments against repetitive genome motifs
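One way to picture this filter: compute the fraction of the read taken up by its most frequent base and reject the read when that fraction exceeds a cutoff. The function names and the case-insensitive counting are assumptions for this sketch; the actual criterion in abnormal.c may differ.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Fraction of the sequence occupied by its most frequent base.
 * Returns 0.0 for an empty sequence. */
static double max_base_frac(const char *seq)
{
    size_t count[256] = {0};
    size_t len = 0, max = 0;

    assert(seq != NULL);

    for (; *seq != '\0'; seq++, len++) {
        /* Count upper and lower case as the same base. */
        size_t c = ++count[(unsigned char) toupper((unsigned char) *seq)];
        if (c > max)
            max = c;
    }
    return len > 0 ? (double) max / (double) len : 0.0;
}

/* Return 1 when a single base dominates the read beyond the cutoff. */
static int read_is_low_complexity(const char *seq, double max_frac)
{
    return max_base_frac(seq) > max_frac;
}
```

With a cutoff of 0.75, a poly-A read such as `AAAAAAAAAT` would be filtered while `ACGTACGT` would pass.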
2019-09-22 Thiago L. A. Miller <[email protected]>
* FIX: cluster.c (prepare_query_stmt): blacklist_chr
Avoid clusters from and to the chromosomes in the blacklist_chr set
2019-09-16 Thiago L. A. Miller <[email protected]>
* FIX: dbscan.c (dbscan_cluster): Same point shared
* DBSCAN is not entirely deterministic: border points that are
reachable from more than one cluster can be part of either
cluster, depending on the order the data are processed.
(Wikipedia)
* Fix the REACHABLE points which are shared among multiple clusters
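The order dependence quoted above can be seen in a plain DBSCAN over one-dimensional positions. In this minimal sketch a border point is claimed by whichever cluster reaches it first. It is a textbook O(n^2) version with hypothetical names, and it does not show how dbscan.c resolves the shared points.

```c
#include <assert.h>
#include <stdlib.h>

enum { UNVISITED = -2, NOISE = -1 };

/* Number of points within eps of point i (the point itself included). */
static size_t count_neighbors(const long *pos, size_t n, size_t i, long eps)
{
    size_t j, count = 0;
    for (j = 0; j < n; j++)
        if (labs(pos[i] - pos[j]) <= eps)
            count++;
    return count;
}

/* Label every position with a cluster id (0, 1, ...) or NOISE.
 * Returns the number of clusters, or -1 if memory runs out. */
static int dbscan_1d(const long *pos, size_t n, long eps, size_t min_pts,
                     int *label)
{
    size_t i, j, top;
    int n_clusters = 0;
    size_t *stack = malloc((n > 0 ? n : 1) * sizeof *stack);

    assert(pos != NULL && label != NULL);
    if (stack == NULL)
        return -1;

    for (i = 0; i < n; i++)
        label[i] = UNVISITED;

    for (i = 0; i < n; i++) {
        if (label[i] != UNVISITED)
            continue;
        if (count_neighbors(pos, n, i, eps) < min_pts) {
            label[i] = NOISE;   /* may later become a border point */
            continue;
        }

        /* i is a core point: grow a new cluster from it. Each point is
         * pushed at most once, so the stack never exceeds n entries. */
        top = 0;
        label[i] = n_clusters;
        stack[top++] = i;

        while (top > 0) {
            size_t p = stack[--top];
            if (count_neighbors(pos, n, p, eps) < min_pts)
                continue;       /* border point: belongs, but no expansion */
            for (j = 0; j < n; j++) {
                if (labs(pos[p] - pos[j]) > eps)
                    continue;
                /* First cluster to reach j keeps it. */
                if (label[j] == UNVISITED || label[j] == NOISE) {
                    label[j] = n_clusters;
                    stack[top++] = j;
                }
            }
        }
        n_clusters++;
    }

    free(stack);
    return n_clusters;
}
```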
2019-09-15 Thiago L. A. Miller <[email protected]>
* FIX: db.c (clustering table)
Fix foreign key issue by adding the clustering id as part of the weak
key. Add gene_name as parental_gene_name in order to facilitate future
analysis
* FIX: recluster.c (prepare_query_stmt): cluster_id
Change cluster_id to id
* FIX: merge_call.c (merge_call): reclustering_stmt
Call db_finalise for reclustering_stmt
* ADD: recluster.c interface
Add reclustering step in order to validate the clusters found. The
clusters will be analysed according to distance from, number of, and
number of reads for reclustering each putative parental gene
2019-09-13 Thiago L. A. Miller <[email protected]>
* ADD: db.c: reclustering table
Add reclustering table to hold validated clustering entries
* CHANGE: db.c (db_insert_clustering): rm cluster_id
Remove cluster_id from clustering table schema. Now it became a weak
entity
* ADD: merge_call.c, cluster.c (blacklist_chr)
Add option to avoid clustering on certain chromosomes. By default it
avoids 'chrM'
* ADD: set.c (set_is_member)
* CHANGE: chr.c (CHR_BUFSIZ): 32
2019-09-10 Thiago L. A. Miller <[email protected]>
* Version 0.8.0
* CHANGE: INSTALL.md -> INSTALL
* ADD: meson.build (vcs_tag)
Dynamically generate the VERSION macro according to git tag and commit
SHA-1
2019-09-09 Thiago L. A. Miller <[email protected]>
* FIX: exon.c (exon_tree_lookup_dump): Thread-safe
The alignment_id was included inside the ExonTree object, which in turn was
shared among all threads. With no mutex, the alignment_id values could
clash among threads. In order to fix it, a new private struct keeps
ExonTree and alignment_id separately
* ADD: abnormal.c, exon.c, process_sample.c: loginfo
Improve log messages for debugging and change attribute name num_threads
in AbnormalArg to inc_step, which is more descriptive of its
function
2019-09-08 Thiago L. A. Miller <[email protected]>
* FIX: db.c (db_create_tables): source foreign key
* CHANGE: Move from Autotools to Meson build system
* REMOVE: Autotools
2019-09-06 Thiago L. A. Miller <[email protected]>
* ADD: cluster.c (index_alignment_qname)
Add index for alignment qname to speed up queries. Update
prepare_query_stmt() in order to avoid the abnormal reads that
fall into their own parental gene
2019-09-05 Thiago L. A. Miller <[email protected]>
* ADD: --phred-quality option
Add phred quality score filter option in order to avoid reads with low
mapping quality
2019-08-31 Thiago L. A. Miller <[email protected]>
* CHANGE: dbscan.c (dbscan_cluster): List -> Set
* Change List *seed to Set *seed in order to speed up union statement
* REMOVE: set.c (set_insert_all, set_remove_all)
* ADD: set.c: Set interface
* ADD: hash.c (direct_hash, direct_equal)
2019-08-26 Thiago L. A. Miller <[email protected]>
* ADD: tests: checked fixtures
2019-08-23 Thiago L. A. Miller <[email protected]>
* CHANGE: hash.c: New extensible hash algorithm
Move from chaining hash type to extensible hash. Now there is no need to
declare the hash size in the 'new' function, because it is dynamically
allocated.
2019-08-15 Thiago L. A. Miller <[email protected]>
* ADD: configure.ac: -Wextra -Wpedantic
* FIX: Remove -Wunused-variable
* FIX: db_merge.c (db_merge): Remove *db2 redundancy
2019-08-14 Thiago L. A. Miller <[email protected]>
* Version 0.7.0
2019-08-13 Thiago L. A. Miller <[email protected]>
* ADD: configure.ac (AM_INIT_AUTOMAKE)
AM_INIT_AUTOMAKE(-Wall -Werror) and AM_PROG_AR macros
* FIX: check_sider_*.c: Move log_set_quiet to main
* ADD: db.c (db_{insert,check}_schema_version)
Add a new table 'schema' with a single column called 'version'. Its
function is to keep track of the database schema state in a versioned way
2019-08-06 Thiago L. A. Miller <[email protected]>
* ADD: process_sample.c: max_distance, sorted
Add CLI options --max-distance, --sorted. The former sets a
cutoff for the maximum distance allowed between paired-end reads, and the
latter treats all files as queryname sorted
* FIX: check_sider_abnormal.c (sam_{un}sorted)
* FIX: cluster.c, merge_call.c: Remove typos
* CHANGE: abnormal.c (parse_{sorted,unsorted}_sam)
Change abnormal interface to handle sorted and unsorted SAM/BAM files
2019-08-01 Thiago L. A. Miller <[email protected]>
* Version 0.6.0
* ADD: check_sider_db_merge.c: Test db_merge.c
2019-07-31 Thiago L. A. Miller <[email protected]>
* ADD: check_sider_db.c: Test low-level and schema
* ADD: check_sider_{utils,wrapper}.c: signal test
* ADD: utils.c (setup_signal)
Add setup_signal function to facilitate signal handling
* ADD: wrapper.c (xsigaction)
* ADD: check_sider_hash.c (test_hash_int)
2019-07-30 Thiago L. A. Miller <[email protected]>
* ADD: check_sider_wrapper.c
Add tests for wrapper interface
* ADD: check_sider_utils.c (equalstring, cmpstringp)
* ADD: COVERALLS support via TravisCI
* FIX: check_sider_cluster.c coverage
2019-07-29 Thiago L. A. Miller <[email protected]>
* FIX: bitree.c (bitree_traverse)
Fix bitree inorder, preorder, postorder traverse
* ADD: gff.c tests (gff_entry_dup)
* ADD: merge_call.c: option --in-place
* FIX: process_sample.c (cmpstringp)
* ADD: array.c (array_data)
* ADD: utils.c (cmpstringp, equalstring)
* ADD: array.c (find, remove)
Add array_find, array_find_with_equal_fun, array_remove and
array_remove_index
2019-07-27 Thiago L. A. Miller <[email protected]>
* REMOVE: cluster.c (prepare_query_stmt)
Remove unnecessary log_debug showing the sql statement compiled
* ADD: db_merge.c interface
Add db_merge function to merge all batches into one SQLite3 database. The
function aims to append new values without breaking foreign key
constraints. The table 'exon' cannot hold duplicate values, so all
values of the first database are inserted, whereas the values of the
following databases are inserted only if they are not present in the merged
table
* ADD: hash.c (int_hash, int_equal)
* CHANGE: dbc: More wrappers on SQLite3
Add low-level wrappers on SQLite3 functions in order to avoid so much
testing after each statement. Now the sqlite3_stmt method wrappers
can be called just with their 'class' - no need to pass the sqlite3 object as
well
2019-07-23 Thiago L. A. Miller <[email protected]>
* FIX: cluster.c (prepare_query_stmt)
Change SQL select statement, in order to catch the reads which
fall into an exonic region and whose mate, as well, falls into an
exonic region
* ADD: check_sider_cluster.c testing for cluster.c
2019-07-21 Thiago L. A. Miller <[email protected]>
* FIX: merge_call.c (merge_call): db_connect
* ADD: cluster.c (cluster): Print number of clusters
* ADD: merge-call command
* ADD: cluster.c interface
Add 'cluster' to handle DBSCAN calling
2019-07-20 Thiago L. A. Miller <[email protected]>
* ADD: db.c (db_{prepare,insert}_clustering)
Add table 'clustering' to database in order to keep the candidate
retrocopies found organized
* ADD: dbscan.c interface
* Add DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
2019-07-18 Thiago L. A. Miller <[email protected]>
* FIX: process_sample.c (parse_process_sample ..)
Test for arguments and options just at the beginning of the function
* ADD: main.c (main): Short alias to subcommands
* ADD: db.c (db_create_tables): New table 'batch'
Add a new table called 'batch', in order to keep a grouping relation
among all SAM/BAM files processed at process-sample step
2019-07-17 Thiago L. A. Miller <[email protected]>
* ADD: CLI interface based on subcommands
In db.c, remove the column 'name' from table 'source', so it requires
only one id and the alignment file path.
Finally, begin to implement a CLI interface based on subcommands. For now
there is only one subcommand working: 'process-sample'
* ADD: logger.c interface
The logger interface handles log.c attributes: it allocates the mutex and
opens the log file, dealing with the file pointer as well.
* FIX: check_sider_abnormal.c: Remove gtf tempfile
* ADD: utils.c (mkdir_p): Works similar to mkdir -p
* ADD: wrapper.c (xmkdir)
* ADD: io.c interface to handle input file
Also fix tests for check_sider_process_sample.c - It was renamed from
check_sider_sider.c