This repository has been archived by the owner on Jun 23, 2024. It is now read-only.
dataset.json
{
"3d-SPADE": {
"Name": "3d-SPADE",
"Description": "3d-SPADE is a method to find reoccurring spike patterns in parallel spike train data, and to determine their statistical significance. It is a part of Elephant.",
"parallel": true,
"distributed": false,
"GPU": false
},
"AGS_ACN": {
"Name": "AGS ACN",
"Description": "> CORRECT NAME OF TOOL COULD ALSO BE 'ags.sh acn.sh tools', 'AGS', 'ACN' | Fast and accurate average genome size and 16S rRNA gene average copy number computation in metagenomic data | Here, we present the ags.sh and acn.sh tools dedicated to the computation of the Average Genome Size (AGS) and 16S rRNA gene Average Copy Number (ACN), respectively. The ags.sh and acn.sh tools compute these metagenomic traits based on the (ultra-fast) annotation of 35 universally distributed single-copy genes in unassembled metagenomic data | Pereira-Flores, E., Gl\u00f6ckner F. O., and Fernandez-Guerra A. Fast and accurate average genome size and 16S rRNA gene average copy number computation in metagenomic data. BMC Bioinformatics. 2019;20(1):453. doi:10.1186/s12859-019-3031-y",
"parallel": false,
"distributed": true,
"GPU": false
},
"AIKYATAN": {
"Name": "AIKYATAN",
"Description": "mapping distal regulatory elements using convolutional learning on GPU.\n\nBACKGROUND: The data deluge can leverage sophisticated ML techniques for functionally annotating the regulatory non-coding genome. The challenge lies in selecting the appropriate classifier for the specific functional annotation problem, within the bounds of the hardware constraints and the model's complexity. In our system AIKYATAN, we annotate distal epigenomic regulatory sites, e.g., enhancers. Specifically, we develop a binary classifier that classifies genome sequences as distal regulatory regions or not, given their histone modifications' combinatorial signatures. This problem is challenging because the regulatory regions are distal to the genes, with diverse signatures across classes (e.g., enhancers and insulators) and even within each class (e.g., different enhancer sub-classes)",
"parallel": false,
"distributed": false,
"GPU": true
},
"APPAGATO": {
"Name": "APPAGATO",
"Description": "APproximate PArallel and stochastic GrAph querying TOol for biological networks.",
"parallel": true,
"distributed": false,
"GPU": false
},
"Arteria": {
"Name": "Arteria",
"Description": "An automation system for a sequencing core facility.\n\nStackStorm packs to automate sequencing center operations.\n\nEvent-driven sequencing center automation.\n\nIt forms the core of the Arteria automation system, which you can read about on our website or preprint. This pack integrates with a series of bioinformatic micro-services, which can be found at https://github.com/arteria-project.\n\nThis pack provides re-usable units for automating tasks at a sequencing core facility using the StackStorm event-driven automation platform.\n\nHandling sequencing data from massive parallel sequencing can be a daunting task! And while the process of handling sequencing data will share many of its characteristics across centers, the current norm is one center one solution. This creates a situation where reuse is difficult to achieve and the wheel is invented over and over again. This is a situation that we hope can be remedied in the form of the Arteria project",
"parallel": true,
"distributed": false,
"GPU": false
},
"BART-Seq": {
"Name": "BART-Seq",
"Description": "cost-effective massively parallelized targeted sequencing for genomics, transcriptomics, and single-cell analysis | Software required for Bart-Seq technology \u2013 a cheap technology to analyze (single) cells using forward and reverse barcoding for target genes | Demultiplexing pipeline for BARTSeq | Software required for Bart-Seq \u2013 a cost-effective target enrichment technology using forward and reverse barcoding to analyze selected set of genes in single cells and/or bulk RNA/DNA samples | The pipeline can be run via snakemake [-j 4] [-s \u2026/bartseq/Snakefile] [-d \u2026/mydata], where -j specifies the number of threads, and the other parameters default to ./Snakefile and ., respectively",
"parallel": true,
"distributed": false,
"GPU": false
},
"BIRD": {
"Name": "BIRD",
"Description": "Bayesian Estimation of Genetic Regulatory Effects in High-throughput Reporter Assays | Bayesian Inference of Regulatory Differences | Photo by Bill Majoros. Used with permission | [6/16/2018] First version of BIRD released - The first version of BIRD has been released on GitHub at https://github.com/bmajoros/BIRD | [8/23/2018] Experiment design web tool released - A web tool is now available for power and sample size estimation: http://67.159.92.22:8080/ | BIRD (Bayesian Inference of Regulatory Differences) is a software suite for identifying regulatory variants in data from STARR-seq and other massively parallel reporter assays (MPRAs)",
"parallel": true,
"distributed": false,
"GPU": false
},
"CLIJ": {
"Name": "CLIJ",
"Description": "GPU-accelerated image processing for everyone.\n\nCLIJ is an OpenCL - ImageJ bridge and a Fiji plugin allowing users with entry-level skills in programming to build GPU-accelerated workflows to speed up their image processing. Increased efforts were put on documentation, code examples, interoperability, and extensibility. CLIJ is based on ClearCL, JOCL, Imglib2, ImageJ and SciJava",
"parallel": false,
"distributed": false,
"GPU": true
},
"CrystaLattE": {
"Name": "CrystaLattE",
"Description": "Automated computation of lattice energies of organic crystals exploiting the many-body expansion to achieve dual-level parallelism.\n\nSet of scripts to automate the calculation of crystal lattice energies.\n\nAutomated calculation of crystal lattice energies with the many-body expansion.\n\nCrystaLattE has an interface with the quantum chemistry package PSI4. To run, the code requires a crystallographic information file containing structural information of the crystal and an input file specifying execution details. Work continues in the creation of a CrystaLattE pip package. So, for the moment, the instructions to download and install CrystaLattE and to create a conda environment that includes PSI4 are presented below.\n\nCrystaLattE is software that automates the computation of crystal lattice energies using the many-body cluster expansion",
"parallel": true,
"distributed": false,
"GPU": false
},
"Cytokit": {
"Name": "Cytokit",
"Description": "A single-cell analysis toolkit for high dimensional fluorescent microscopy imaging | Microscopy Image Cytometry Toolkit | Cytokit is a collection of tools for quantifying and analyzing properties of individual cells in large fluorescent microscopy datasets with a focus on those generated from multiplexed staining protocols. This includes a GPU-accelerated image processing pipeline (via TensorFlow), CLI tools for batch processing of experimental replicates (often requiring conditional configuration, as things tend to go wrong when capturing hundreds of thousands of microscope images over a period of hours or days), and visualization UIs (either Cytokit Explorer or CellProfiler Analyst)",
"parallel": false,
"distributed": false,
"GPU": true
},
"DDVFA": {
"Name": "DDVFA",
"Description": "Distributed dual vigilance fuzzy adaptive resonance theory learns online, retrieves arbitrarily-shaped clusters, and mitigates order dependence. Dual Vigilance Fuzzy ART - Companion MATLAB Code. Distributed Dual Vigilance Fuzzy ART - Companion MATLAB Code.",
"parallel": false,
"distributed": true,
"GPU": false
},
"DiversityExsiccatae": {
"Name": "DiversityExsiccatae",
"Description": "DiversityExsiccatae (DWB-DE) is an application of Diversity Workbench (DWB). It might be installed as part of a DWB environment, but also as a stand-alone application, e.g., in a local installation. Each DWB module is devoted to a specific data domain. DE is currently used in a core installation at the SNSB, Botanische Staatssammlung M\u00fcnchen to manage bibliographic information and standardized abbreviations of botanical and mycological exsiccatae in a global index. Exsiccatae are series of specimen collections which are distributed as a publication together with assigned printed matters. They are usually available in multiple copies worldwide. This standard list for Exsiccatal series with unique and stable series identifiers is increasing the efficiency of the digitization of botanical and mycological collections.",
"parallel": false,
"distributed": true,
"GPU": false
},
"Dr.seq2": {
"Name": "Dr.seq2",
"Description": "A quality control and analysis pipeline for parallel single cell transcriptome and epigenome data. It provides quality control and analysis functionalities for three data types: single cell transcriptome data, Drop-ChIP data and scATAC-seq data.",
"parallel": true,
"distributed": false,
"GPU": false
},
"DstarM": {
"Name": "DstarM",
"Description": "> MEDIUM CONFIDENCE! | > HOMEPAGE MISSING! | > CORRECT NAME OF TOOL COULD ALSO BE 'nondecision', 'DM', 'two-choice' | an R package for analyzing two-choice reaction time data with the D\u2217M method | The decision process in choice reaction time data is traditionally described in detail with diffusion models. However, the total reaction time is assumed to consist of the sum of a decision time (as modeled by the diffusion process) and the time devoted to nondecision processes (e.g., perceptual and motor processes). It has become standard practice to assume that the nondecision time is uniformly distributed. However, a misspecification of the nondecision time distribution introduces bias in the parameter estimates for the decision model. Recently, a new method has been proposed (called the D\u2217M method) that allows the estimation of the decision model parameters, while leaving the nondecision time distribution unspecified",
"parallel": false,
"distributed": true,
"GPU": false
},
"EPA-ng": {
"Name": "EPA-ng",
"Description": "Complete reimplementation of the evolutionary placement algorithm (EPA) that is substantially faster, offers a distributed memory parallelization, and integrates concepts from both, RAxML-EPA and PPLACER.",
"parallel": true,
"distributed": true,
"GPU": false
},
"EVR": {
"Name": "EVR",
"Description": "EVR is a chromosome 3D structure reconstruction tool using an Error-Vector Resultant algorithm based on DNA interaction data. With a normalized or unnormalized IF matrix as input, the program generates a chromosome 3D structure output in a .pdb file. Using Cython and OpenCL, the program can run on CPUs/GPUs/APUs and is thus usually faster than similar tools.",
"parallel": false,
"distributed": false,
"GPU": true
},
"Eoulsan": {
"Name": "Eoulsan",
"Description": "A versatile framework based on the Hadoop implementation of the MapReduce algorithm, dedicated to high throughput sequencing data analysis on distributed computers. With Eoulsan, users can easily set up a cloud computing cluster and automate the analysis of several samples at once using various software solutions available. Working either on standalone workstations or cloud computing clusters, Eoulsan provides an integrated and flexible solution for RNA-Seq data analysis of differential expression.",
"parallel": false,
"distributed": true,
"GPU": false
},
"FB5P-seq": {
"Name": "FB5P-seq",
"Description": "FACS-based 5-prime end single-cell RNAseq for integrative analysis of transcriptome and antigen receptor repertoire in B and T cells.\n\nFB5P-seq: FACS-based 5'-end single-cell RNA-seq.\n\nCopyright 2019: PMlab, Centre d'Immunologie de Marseille-Luminy This work is distributed under the terms of the GNU General Public License. It is free to use for all purposes.\n\nFB5P-seq is a computational pipeline to process single-cell RNA sequencing (scRNAseq) data produced with the FB5P-seq protocol designed by the Milpied lab at Centre d'Immunologie de Marseille-Luminy. The pipeline relies on 5 main softwares:",
"parallel": false,
"distributed": true,
"GPU": false
},
"FQStat": {
"Name": "FQStat",
"Description": "A parallel architecture for very high-speed assessment of sequencing quality metrics | BACKGROUND: High-throughput DNA/RNA sequencing has revolutionized biological and clinical research. Sequencing is widely used, and generates very large amounts of data, mainly due to reduced cost and advanced technologies. Quickly assessing the quality of giga-to-tera base levels of sequencing data has become a routine but important task. Identification and elimination of low-quality sequence data is crucial for reliability of downstream analysis results. There is a need for a high-speed tool that uses optimized parallel programming for batch processing and simply gauges the quality of sequencing data from multiple datasets independent of any other processing steps. RESULTS: FQStat is a stand-alone, platform-independent software tool that assesses the quality of FASTQ files using parallel programming",
"parallel": true,
"distributed": false,
"GPU": false
},
"FastFeatGen": {
"Name": "FastFeatGen",
"Description": "Faster parallel feature extraction from genome sequences and efficient prediction of DNA N6-methyladenine sites.\n\nFaster parallel feature extraction from genome sequence.\n\nThis is a tool for faster feature extraction from genome sequences and making efficient prediction. To build efficient prediction model, user can go through the following instructions step by step. If user only wants to predict query sequences from our built model, then just go to step Make prediction for query sequences. Currently, FastFeatGen supports text file as shown in dataset.txt file in datasets directory; however, fasta file can also be used with a simple preprocessing",
"parallel": true,
"distributed": false,
"GPU": false
},
"Flint": {
"Name": "Flint",
"Description": "Large scale microbiome profiling in the cloud | Main repository of the Flint project for Spark and Amazon EMR | This is the main repository of the Flint project for Amazon Web Services. Flint is a metagenomics profiling pipeline that is built on top of the Apache Spark framework, and is designed for fast real-time profiling of metagenomic samples against a large collection of reference genomes. Flint takes advantage of Spark's built-in parallelism and streaming engine architecture to quickly map reads against a large reference collection of bacterial genomes",
"parallel": true,
"distributed": false,
"GPU": false
},
"FunGeCo": {
"Name": "FunGeCo",
"Description": "A web based tool for estimation of Functional potential of bacterial genomes and microbiomes using Gene Context information.\n\nFunctional potential of bacterial genomes and microbiomes from gene context information.\n\nThis feature allows the user to input a newly sequenced genome and annotate it using gene context based modules generated using extensive literature mining and manual curation. Users can also carry out comparative analysis (synteny view using parallel coordinates) of the uploaded genome with genomes already sequenced in NCBI using interactive visualizations.\n\nThis feature allows comparison of functional modules in sequenced genomes obtained from NCBI. Users can interactively select up to five genomes which are compared using a synteny based visualization (parallel coordinates) and circular genomic representations. Information about individual modules in all these genomes can also be viewed as tabular outputs",
"parallel": true,
"distributed": false,
"GPU": false
},
"GASAL2": {
"Name": "GASAL2",
"Description": "a GPU accelerated sequence alignment library for high-throughput NGS data.\n\nGASAL2 - GPU-accelerated DNA alignment library.\n\nGASAL2 is an easy-to-use CUDA library for DNA/RNA sequence alignment algorithms. Currently it supports different kind of alignments:.\n\nA Linux platform with CUDA toolkit 8 or higher is required, along with usual build environment for C and C++ code. GASAL2 has been tested over NVIDIA GPUs with compute capabilities of 2.0, 3.5 and 5.0. Although lower versions of the CUDA framework might work, they have not been tested",
"parallel": false,
"distributed": false,
"GPU": true
},
"GET-IT": {
"Name": "GET-IT",
"Description": "Exploiting observations and measurement data standard for distributed LTER-Italy freshwater sites. Water quality issues.\n\nGeoinformation Enabling ToolkIT starterkit.\n\nGeoinformation Enabling ToolkIT starterkit ®.\n\nGET-IT allows you to easily share geospatial data on the web with simple actions.\n\nShare your maps, measurements, and sensors.",
"parallel": false,
"distributed": true,
"GPU": false
},
"GLM-PCA": {
"Name": "GLM-PCA",
"Description": "Feature selection and dimension reduction for single-cell RNA-Seq based on a multinomial model.\n\nDimension Reduction of Non-Normally Distributed Data.\n\nImplements a generalized version of principal components analysis (GLM-PCA) for dimension reduction of non-normally distributed data such as counts or binary matrices.",
"parallel": false,
"distributed": true,
"GPU": false
},
"GWAS-Flow": {
"Name": "GWAS-Flow",
"Description": "A GPU accelerated framework for efficient permutation based genome-wide association studies | GPU accelerated GWAS framework based on TensorFlow | GWAS-Flow was written and published in the hope that you might find it useful. If you do and use it for your research please cite the paper published alongside the software, which is currently publicly accessible on the BiorXiv preprint server. https://www.biorxiv.org/content/10.1101/783100v1 doi: 10.1101/783100",
"parallel": false,
"distributed": false,
"GPU": true
},
"Gene2vec": {
"Name": "Gene2vec",
"Description": "Machine learning method that utilizes transcriptome-wide gene co-expression to generate a distributed representation of genes.",
"parallel": false,
"distributed": true,
"GPU": false
},
"GeneRax": {
"Name": "GeneRax",
"Description": "A tool for species tree-aware maximum likelihood based gene tree inference under gene duplication, transfer, and loss | GeneRax is a parallel tool for species tree-aware maximum likelihood based gene tree inference under gene duplication, transfer, and loss",
"parallel": true,
"distributed": false,
"GPU": false
},
"GenomeScope_2.0": {
"Name": "GenomeScope 2.0",
"Description": "Reference-free profiling of polyploid genomes | We have developed GenomeScope 2.0, which applies classical insights from combinatorial theory to establish a detailed mathematical model of how k-mer frequencies will be distributed in heterozygous and polyploid genomes | Average k-mer coverage for polyploid genome | Upload results from running Jellyfish or KMC",
"parallel": false,
"distributed": true,
"GPU": false
},
"GenomicsDB": {
"Name": "GenomicsDB",
"Description": "Advancing clinical cohort selection with genomics analysis on a distributed platform.\n\nHighly performant data storage in C++ for importing, querying and transforming variant data with Java/Spark. Used in gatk4.\n\nSparse Array Storage for Genomics.\n\nGenomicsDB, originally from Intel Health and Lifesciences, is built on top of a fork of htslib and a tile-based array storage system for importing, querying and transforming variant data.\n\nOpen source project providing a collaboration to optimize sparse array storage for genomics.\n\nUsing high-level APIs provided in C++, Java, and Spark, users can both write and read variant records to and from GenomicsDB shared-nothing instances in parallel using multiple processes in a Single Process Multiple Data (SPMD) manner.",
"parallel": true,
"distributed": true,
"GPU": false
},
"GoldenMutagenesis": {
"Name": "GoldenMutagenesis",
"Description": "The Golden Gate cloning technique has been proven to be a highly efficient toolbox for a variety of cloning setups. Based on its modular concept it is particularly suitable for the use in multiple-site mutagenesis approaches. In this technical note we developed a protocol termed Golden Mutagenesis for the rapid, easy, reliable and cheap formation of mutagenesis libraries. One to five positions could be altered in parallel or simultaneously within two days. To facilitate the implementation of this technique, this R-library has been developed for the automated primer design and the graphical evaluation of sequencing results to determine the quality of the library.",
"parallel": true,
"distributed": false,
"GPU": false
},
"GsVec": {
"Name": "GsVec",
"Description": "Comprehensive biological interpretation of gene signatures using semantic distributed representation.\n\nGsVec (Gene signature Vector) is an analysis method that supports the biological interpretation of Gene signature obtained by gene expression analysis of Bioinformatics. The association between the gene signature to be interpreted and the gene signature of the Pathway / Gene Ontology data base is performed by natural language processing",
"parallel": false,
"distributed": true,
"GPU": false
},
"HIrisPlex-S": {
"Name": "HIrisPlex-S",
"Description": "Massively parallel sequencing solutions for two common forensically used platforms | HIrisPlex-S Eye, Hair and Skin Colour DNA Phenotyping Webtool | With the advancement of DNA phenotyping as a tool in Forensic and Anthropological usage, we now provide an easy to use interactive website to predict eye, hair and skin colour from DNA using the IrisPlex, HIrisPlex and HIrisPlex-S systems",
"parallel": true,
"distributed": false,
"GPU": false
},
"HPCCS": {
"Name": "HPCCS",
"Description": "Collision Cross Section Calculations Using HPCCS.\n\nThe High Performance Collision Cross Section (HPCCS) is new software for fast and accurate calculation of CCS for molecular ions. Based on the Trajectory Method (TM), HPCCS was parallelized and optimized to be a user-friendly program.\n\nHigh Performance Collision Cross Section Calculation \u2013 HPCCS",
"parallel": true,
"distributed": false,
"GPU": false
},
"HiPFSTA": {
"Name": "HiPFSTA",
"Description": "Gradient-based, GPU-accelerated, high-precision contour-segmentation algorithm with application to cell membrane fluctuation spectroscopy.",
"parallel": false,
"distributed": false,
"GPU": true
},
"ICGRM": {
"Name": "ICGRM",
"Description": "Integrative construction of genomic relationship matrix combining multiple genomic regions for big dataset.\n\nBACKGROUND: Genomic prediction is an advanced method for estimating genetic values, which has been widely accepted for genetic evaluation in animals and disease-risk prediction in humans. It estimates genetic values with genome-wide distributed SNPs instead of pedigree. Its key step is to construct a genomic relationship matrix (GRM) from genome-wide SNPs; however, calculating the GRM usually needs huge computer memory, especially when the SNP number and sample size are big, so that it sometimes becomes computationally prohibitive even for supercomputer clusters. We herein developed an integrative algorithm to compute GRM.",
"parallel": false,
"distributed": true,
"GPU": false
},
"MAP": {
"Name": "MAP",
"Description": "Model-based analysis of proteomic data to detect proteins with significant abundance changes.\n\nMAP (Model-based Analysis of Proteomic data), is designed to statistically compare the proteomic profiles generated from different biological samples using the isotope labeling based mass spectrometry (MS) technique and directly identify proteins with significant abundance changes. Unlike many existing tools for this purpose, it does not require parallel/additional technical replicates to fathom technical variations; instead, MAP uses a novel step-by-step regression analysis to directly model technical variations from the profiles under comparison. Therefore, experimental designs and their expenses can be simplified and reduced for more practices",
"parallel": true,
"distributed": false,
"GPU": false
},
"MARGO": {
"Name": "MARGO",
"Description": "MARGO (Massively Automated Real-time GUI for Object-tracking).\n\nFast object tracking in real time allows convenient tracking of very large numbers of animals and closed-loop experiments that control stimuli for many animals in parallel. We developed MARGO, a MATLAB-based, real-time animal tracking suite for custom behavioral experiments. We demonstrated that MARGO can rapidly and accurately track large numbers of animals in parallel over very long timescales, typically when spatially separated such as in multiwell plates. We incorporated control of peripheral hardware, and implemented a flexible software architecture for defining new experimental routines. These features enable closed-loop delivery of stimuli to many individuals simultaneously. We highlight MARGO's ability to coordinate tracking and hardware control with two custom behavioral assays (measuring phototaxis and optomotor response) and one optogenetic operant conditioning assay.",
"parallel": true,
"distributed": false,
"GPU": false
},
"MCSeEd": {
"Name": "MCSeEd",
"Description": "A reference-free, whole genome profiling system to address cytosine/adenine methylation changes.\n\nMethods for investigating DNA methylation nowadays either require a reference genome and high coverage, or investigate only CG methylation. Moreover, no large-scale analysis can be performed for N6-methyladenosine (6 mA) at an affordable price. Here we describe the methylation content sensitive enzyme double-digest restriction-site-associated DNA (ddRAD) technique (MCSeEd), a reduced-representation, reference-free, cost-effective approach for characterizing whole genome methylation patterns across different methylation contexts (e.g., CG, CHG, CHH, 6 mA). MCSeEd can also detect genetic variations among hundreds of samples. MCSeEd is based on parallel restrictions carried out by combinations of methylation insensitive and sensitive endonucleases, followed by next-generation sequencing.",
"parallel": true,
"distributed": false,
"GPU": false
},
"MMCL": {
"Name": "MMCL",
"Description": "GPU-accelerated mesh-based Monte Carlo photon transport simulations.\n\nMesh-based Monte Carlo (MMC), supporting both OpenCL and multi-threading with SSE4.\n\nA GPU-accelerated photon transport simulator.\n\nDownload v2019.4 or Nightly-build.\n\n'mmc -L' or 'mmclab('gpuinfo')' to list).\n\nDesigning MATLAB-streamlined MC-based analysis using MCXLAB and MMCLAB.",
"parallel": false,
"distributed": false,
"GPU": true
},
"MPRAnalyze": {
"Name": "MPRAnalyze",
"Description": "Statistical framework for massively parallel reporter assays | Statistical Analysis of MPRA data | MPRAnalyze provides a statistical framework for the analysis of data generated by Massively Parallel Reporter Assays (MPRAs), used to directly measure enhancer activity. MPRAnalyze can be used for quantification of enhancer activity, classification of active enhancers and comparative analyses of enhancer activity between conditions. MPRAnalyze constructs a nested pair of generalized linear models (GLMs) to relate the DNA and RNA observations, easily adjustable to various experimental designs and conditions, and provides a set of rigorous statistical testing schemes",
"parallel": true,
"distributed": false,
"GPU": false
},
"MPRAnator": {
"Name": "MPRAnator",
"Description": "Web-based tool for the design of massively parallel reporter assay experiments.",
"parallel": true,
"distributed": false,
"GPU": false
},
"MRUniNovo": {
"Name": "MRUniNovo",
"Description": "Tool for de novo peptide sequencing utilizing the Hadoop distributed computing framework.",
"parallel": false,
"distributed": true,
"GPU": false
},
"MasterOfPores": {
"Name": "MasterOfPores",
"Description": "Parallel and scalable workflow for the analysis of Oxford Nanopore direct RNA sequencing datasets.\n\nNextflow pipeline for analysis of Nanopore reads (from RNA/cDNA/DNA).\n\nPlease read the documentation here: https://biocorecrg.github.io/master_of_pores/.\n\nNextflow pipeline for analysis of Nanopore data from direct RNA sequencing. This is a joint project between CRG bioinformatics core and Epitranscriptomics and RNA Dynamics research group.",
"parallel": true,
"distributed": false,
"GPU": false
},
"MaveDB": {
"Name": "MaveDB",
"Description": "An open-source platform to distribute and interpret data from multiplexed assays of variant effect.\n\nTable of Multiplexed Assay of Variant Effect (MAVE) studies.\n\nMaveDB - A repository for MAVE assay datasets.\n\nTo cite this document, please use the citation details for MaveDB.\n\nMaveDB is a public repository for datasets from Multiplexed Assays of Variant Effect (MAVEs), such as those generated by deep mutational scanning (DMS) or massively parallel reporter assay (MPRA) experiments.\n\nWelcome to our table of Multiplexed Assay of Variant Effect (MAVE) studies. To contribute a study or amend/expand an existing entry, please use the GitHub issue tracker or create a pull request",
"parallel": true,
"distributed": false,
"GPU": false
},
"MetaSanity": {
"Name": "MetaSanity",
"Description": "An integrated, customizable microbial genome evaluation and annotation pipeline.\n\nPipeline for major biological analyses.\n\nMetaSanity v1.1.1 - 2020 version.\n\nMetaSanity v1.1.1 provides a unified workflow for genome assessment and functional annotation that combines all outputs into a single queryable database \u2013 all within an easily distributed Docker image",
"parallel": false,
"distributed": true,
"GPU": false
},
"MetaSpark": {
"Name": "MetaSpark:",
"Description": "Spark-based distributed processing tool to recruit metagenomic reads to reference genomes.",
"parallel": false,
"distributed": true,
"GPU": false
},
"Metage2Metabo": {
"Name": "Metage2Metabo",
"Description": "metabolic complementarity applied to genomes of large-scale microbiotas for the identification of keystone species.\n\nFrom annotated genomes to metabolic screening in large scale microbiotas.\n\nMetage2metabo is a Python3 (Python >= 3.6) tool to perform graph-based metabolic analysis starting from annotated genomes (reference genomes or metagenome-assembled genomes). It uses Pathway Tools in a automatic and parallel way to reconstruct metabolic networks for a large number of genomes. The obtained metabolic networks are then analyzed individually and collectively in order to get the added value of metabolic cooperation in microbiota over individual metabolism and to identify and screen interesting organisms among all.\n\nm2m \u2014 metage2metabo documentation.\n\nFree document hosting provided by Read the Docs",
"parallel": true,
"distributed": false,
"GPU": false
},
"MobCal-MPI": {
"Name": "MobCal-MPI",
"Description": "Programfor calculation of ion or molecule collision cross sections on parallelized computing architectures.",
"parallel": true,
"distributed": false,
"GPU": false
},
"MySeq": {
"Name": "MySeq",
"Description": "Privacy-protecting browser-based personal Genome analysis for genomics education and exploration.\n\nMySeq is a web-application for privacy-protecting interactive analysis of personal genomes (distributed as compressed-and-indexed VCF files) inspired by GENOtation (previously the Interpretome) and DNA.LAND Compass. MySeq is intended for use as a genomics educational platform.\n\nAnalyzing the PTC Tasting Phenotype with MySeq.\n\nThis page is an example analysis of the 'bitter tasting' trait using the MySeq application in an embedded context. Here MySeq is used to both query a whole genome VCF for NA12878 (from Genome in a Bottle) by genomic coordinates and predict the bitter tasting phenotype directly. All of the queries demonstrated here are performed 'live' in the browser, that is these are not pre-generated results. Try MySeq as a 'standalone' application.\n\nMySeq is a single-page web application for privacy-protecting personal genome analysis",
"parallel": false,
"distributed": true,
"GPU": false
},
"NCPHLDA": {
"Name": "NCPHLDA",
"Description": "A novel method for human lncRNA-disease association prediction based on network consistency projection.\n\nCopyright (C) 2019 Zecheng Huang([email protected]),Guobo Xie([email protected]).\n\nThis program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.\n\nType: Package Title: NCPHLDA:A Novel Method for Human LncRNA-Disease Association Prediction Based on Network Consistency Projection.\n\nTo analyze these data on NCPHLDA to further infer potential associations between lncRNAs and diseases, you should input the appropriate code in the matlab Command Window:",
"parallel": false,
"distributed": true,
"GPU": false
},
"NMF-RI": {
"Name": "NMF-RI",
"Description": "Blind spectral unmixing of highly mixed multispectral flow and image cytometry data.\n\nMatlab code to spectrally unmix highly mixed multispectral flow and image cytometry data.\n\nTo use NMF-RI on new data (tissue or flow cytometry) we recommend to use the \u00b4main.m\u00b4 code distributed and load new cytometry data to the existing folder hierarchy.",
"parallel": false,
"distributed": true,
"GPU": false
},
"NUFEB": {
"Name": "NUFEB",
"Description": "Massively Parallel Simulator for Individual-based Modelling of Microbial Communities.",
"parallel": true,
"distributed": false,
"GPU": false
},
"OptiJ": {
"Name": "OptiJ",
"Description": "Open-source optical projection tomography of large organ samples.\n\nThe three-dimensional imaging of mesoscopic samples with Optical Projection Tomography (OPT) has become a powerful tool for biomedical phenotyping studies. OPT uses visible light to visualize the 3D morphology of large transparent samples. To enable a wider application of OPT, we present OptiJ, a low-cost, fully open-source OPT system capable of imaging large transparent specimens up to 13 mm tall and 8 mm deep with 50 \u00b5m resolution. OptiJ is based on off-the-shelf, easy-to-assemble optical components and an ImageJ plugin library for OPT data reconstruction. The software includes novel correction routines for uneven illumination and sample jitter in addition to CPU GPU accelerated reconstruction for large datasets. We demonstrate the use of OptiJ to image and reconstruct cleared lung lobes from adult mice.",
"parallel": false,
"distributed": false,
"GPU": true
},
"P3BSseq": {
"Name": "P3BSseq",
"Description": "Parallel processing pipeline software for automatic analysis of bisulfite sequencing data.",
"parallel": true,
"distributed": false,
"GPU": false
},
"PRINCESS": {
"Name": "PRINCESS",
"Description": "Privacy-preserving international collaboration framework for analyzing rare disease genetic data that are distributed across different continents.",
"parallel": false,
"distributed": true,
"GPU": false
},
"PVCTools": {
"Name": "PVCTools",
"Description": "parallel variation calling tools.\n\nPVCTools is Parallel Variation Calling Tools, which tries to call variation using multiple threads. Basically, the reference genome will be splited into small pieces, and corresponding alignment BAM files will be extracted. Under this way, it will speed up the process of variation calling a lot",
"parallel": true,
"distributed": false,
"GPU": false
},
"PaSGAL": {
"Name": "PaSGAL",
"Description": "PaSGAL (Parallel Sequence to Graph Aligner) is designed to accelerate local sequence alignment of sequences to directed acyclic sequence graphs (DAGs), e.g., variation graphs, splicing graphs.",
"parallel": true,
"distributed": false,
"GPU": false
},
"ParGenes": {
"Name": "ParGenes",
"Description": "Tool for massively parallel model selection and phylogenetic tree inference on thousands of genes.",
"parallel": true,
"distributed": false,
"GPU": false
},
"ParLECH": {
"Name": "ParLECH",
"Description": "A hybrid and scalable error correction algorithm for indel and substitution errors of long reads.\n\nBACKGROUND:Long-read sequencing has shown the promises to overcome the short length limitations of second-generation sequencing by providing more complete assembly. However, the computation of the long sequencing reads is challenged by their higher error rates (e.g., 13% vs. 1%) and higher cost ($0.3 vs. $0.03 per Mbp) compared to the short reads. METHODS:In this paper, we present a new hybrid error correction tool, called ParLECH (Parallel Long-read Error Correction using Hybrid methodology). The error correction algorithm of ParLECH is distributed in nature and efficiently utilizes the k-mer coverage information of high throughput Illumina short-read sequences to rectify the PacBio long-read sequences.",
"parallel": true,
"distributed": true,
"GPU": false
},
"PathFlowAI": {
"Name": "PathFlowAI",
"Description": "A High-Throughput Workflow for Preprocessing, Deep Learning and Interpretation in Digital Pathology | A Convenient High-Throughput Workflow for Preprocessing, Deep Learning Analytics and Interpretation in Digital Pathology | MedRxiv Manuscript: https://www.medrxiv.org/content/10.1101/19003897v1 | Fig. 1. PathFlowAI Framework: a) Annotations and whole slide images are preprocessed in parallel using Dask; b) Deep learning prediction model is trained on the model; c) Results are visualized; d) UMAP embeddings provide diagnostics; e) SHAP framework is used to find important regions for the prediction",
"parallel": true,
"distributed": false,
"GPU": false
},
"PepVis": {
"Name": "PepVis",
"Description": "PepVis tool is a python based GUI pipeline which can be used to model and prepare large-scale peptide structure from the sequence and also to perform large-scale peptide virtual screening. PepVis integrates ModPep and Gromacs for modelling and structure optimization of the peptides, while it integrates AutoDock Vina,ZDOCK, AutoDock CrankPep(ADCP) for performing peptide virtual screening. \nThe protein-peptide complexes can be rescored using ZRANK2 and the flexible refinement of the large protein-peptide complexes can also be performed using FlexPepDock. The parallel job execution has been implemented using GNU parallel and the user can provide inputs using GUI which will produce the bash script based on the customized input provided by the user and can be run in terminal.",
"parallel": true,
"distributed": false,
"GPU": false
},
"Phen2Gene": {
"Name": "Phen2Gene",
"Description": "Phen2Gene is a phenotype-driven gene prioritization tool, that takes HPO (Human Phenotype Ontology) IDs as inputs, searches and prioritizes candidate causal disease genes. It is distributed under the MIT License by Wang Genomics Lab. Additionally, we have provided a web server and an associated RESTful API service for running Phen2Gene. Finally, a mobile app for Phen2Gene and several other genetic diagnostic tools from our lab is being tested and will be available soon.",
"parallel": false,
"distributed": true,
"GPU": false
},
"PyBDA": {
"Name": "PyBDA",
"Description": "A command-line tool for analysis of big biological data sets for distributed HPC clusters.",
"parallel": false,
"distributed": true,
"GPU": false
},
"RelocaTE2": {
"Name": "RelocaTE2",
"Description": "a high resolution transposable element insertion site mapping tool for population resequencing.\n\nThis tool is for mapping TEs from resequencing data: Stajich lab.\n\nRelocaTE2: a high resolution transposable element insertion sites mapping tool for population resequencing.\n\nRelocaTE2 is an improved version of RelocaTE (Robb et al., 2013). RelocaTE2 is highly sensitive and accurate in mapping transposable elements (TE) polymorphisms at single base pair resolution. RelocaTE2 uses the reads associated with TEs as seeds to cluster the read pairs on chromosomes. It automatically detects the target site duplication (TSD) of a TE insertion from alignments in each cluster, which enable high resolution mapping of TE polymorphisms. Unlike parallel searching of multi-TE elements in RelocaTE, RelocaTE2 searches all TEs in one cycle, which enables us find polymorphisms of thousands of TEs in an individual genome or large population in a reasonable timeframe without losing sensitivity and specificity",
"parallel": true,
"distributed": false,
"GPU": false
},
"SDR": {
"Name": "SDR",
"Description": "A GRASS tool for the Sediment Delivery Ratio mapping.\n\nThis work involves a tool implementation for evaluating Sediment Delivery Ratio (SDR) in a river basin, through GRASS GIS software. The definition of a spatially distributed value of SDR is a very important task as the sediment routing can affects solid transport in stream channels, water quality degradation, and frequency increase of natural disasters such as debris flows and mudflows. For such reason the SDR evaluation, coupled with GIS approach, has been extensively used in scientific literature. Geographic information systems provide a fundamental support for a better prediction of SDR, since it can consider the space variability of factors influencing the sediment routing processes. Actually a specific GIS module to estimate the spatial variability of SDR does not exist.",
"parallel": false,
"distributed": true,
"GPU": false
},
"SIIM": {
"Name": "SIIM",
"Description": "SIIM is a professional organization at the nexus of medical imaging informatics and healthcare technologies. \nSIIM provides an unparalleled opportunity to not only hear from the best, brightest and most forward thinkers in imaging space but talk to them face to face in a collegial setting.",
"parallel": true,
"distributed": false,
"GPU": false
},
"SW-Tandem": {
"Name": "SW-Tandem",
"Description": "Tool for large-scale peptide identification with parallel spectrum dot product on Sunway TaihuLight.",
"parallel": true,
"distributed": false,
"GPU": false
},
"SWAPCounter": {
"Name": "SWAPCounter",
"Description": "Counting Kmers for Biological Sequences at Large Scale.\n\nThis is a distributed kmer counting tools for TB-PB sequencing dataset",
"parallel": false,
"distributed": true,
"GPU": false
},
"SigUNet": {
"Name": "SigUNet",
"Description": "signal peptide recognition based on semantic segmentation.\n\nA signal peptide predictor based on deep learning.\n\nFor CPU: pip3 install -r requirement.cpu.txt.\n\nFor GPU (suggest): pip3 install -r requirement.gpu.txt",
"parallel": false,
"distributed": false,
"GPU": true
},
"SpliceVec-g": {
"Name": "SpliceVec-g",
"Description": "SpliceVec: Distributed feature representations for splice junction prediction",
"parallel": false,
"distributed": true,
"GPU": false
},
"Treerecs": {
"Name": "Treerecs",
"Description": "Treerecs is an open-source (species- and gene-) tree reconciliation software distributed under the GNU AGPL licence.",
"parallel": false,
"distributed": true,
"GPU": false
},
"VGE": {
"Name": "VGE",
"Description": "Virtual Grid Engine (VGE) is a kind of middleware for running bioinformatics software pipelines on large-scale supercomputers which do not support any grid engine survices. VGE employs master-worker model. It first reserves processors and/or cores by running the job which is parallelized by MPI, then asign divided small tasks onto its worker processes. VGE is written in python.",
"parallel": true,
"distributed": false,
"GPU": false
},
"Wang-xiaoheng": {
"Name": "Wang-xiaoheng",
"Description": "Prediction of protein structural classes by different feature expressions based on 2-D wavelet denoising and fusion.\n\n2D-wavelet-for-protein-structural-classes-prediction We constructed a prediction model based on wavelet denoising using different feature expression methods. A new fusion idea, first fuse and then denoise, is proposed in this article. Two types of pseudo amino acid compositions are utilized to distill feature vectors. Then, a two-dimensional (2-D) wavelet denoising algorithm is used to remove the redundant information from two extracted feature vectors. The two feature vectors based on parallel 2-D wavelet denoising are fused, which is known as PWD-FU-PseAAC. The project includes three original datasets, source code for two-dimensional wavelet denoising and source code for feature vector prediction",
"parallel": true,
"distributed": false,
"GPU": false
},
"adacgh2": {
"Name": "ADaCGH2",
"Description": "Analysis and plotting of array CGH data. Allows usage of Circular Binary Segementation, wavelet-based smoothing (both as in Liu et al., and HaarSeg as in Ben-Yaacov and Eldar), HMM, BioHMM, GLAD, CGHseg. Most computations are parallelized (either via forking or with clusters, including MPI and sockets clusters) and use ff for storing data.",
"parallel": true,
"distributed": false,
"GPU": false
},
"affypara": {
"Name": "affyPara",
"Description": "The package contains parallelized functions for exploratory oligonucleotide array analysis. The package is designed for large numbers of microarray data.",
"parallel": true,
"distributed": false,
"GPU": false
},
"affyparaebi": {
"Name": "affyParaEBI",
"Description": "Aan R based pipeline for parallel pre-processing of Affymetrix TM chips. The pipeline starts from a directory containing raw CEL files files and produces Bioconductor R objects containing gene expression measurements suitable for further analysis.",
"parallel": true,
"distributed": false,
"GPU": false
},
"ahoj": {
"Name": "AHoJ: Apo Holo Protein Search",
"Description": "Webserver & command-line tool for search and alignment of APO (unbound) protein structures from HOLO (bound) forms and vice versa. Features: customizable search of Apo-Holo pairs in the PDB, alignment to the query structure, batch mode for fast parallel dataset processing, visualization via Molstar and PyMol, public documented REST-API.",
"parallel": true,
"distributed": false,
"GPU": false
},
"alexa-seq": {
"Name": "ALEXA-Seq",
"Description": "Alternative Expression Analysis by massively parallel RNA sequencing",
"parallel": true,
"distributed": false,
"GPU": false
},
"ambergpumdsimulation": {
"Name": "AmberGPUMDSimulation",
"Description": "An automated workflow tool for Kepler to perform AMBER GPU molecular dynamics simulations.",
"parallel": false,
"distributed": false,
"GPU": true
},
"anexvis": {
"Name": "anexVis",
"Description": "anexVis is a transcriptome tool to visualize organ/tissue-specific glycosaminoglycan biosynthetic and catabolic pathways in human health and diseases. anexVis allows one to analyze a large number of genes that are related to biosynthetic and catabolic pathways of all glycosaminoglycans, such as heparan sulfate, chondroitin sulfate, keratan sulfate, and hyaluronic acid, in parallel across various human tissues organs. Such visual analyses have not been accessible to the broad research community despite the accumulation of a large amount of RNA-seq data.",
"parallel": true,
"distributed": false,
"GPU": false
},
"apache_trino": {
"Name": "Apache Trino",
"Description": "Trino is a distributed SQL query engine designed to query large data sets distributed over one or more heterogeneous data sources.\n\nTrino is a tool designed to efficiently query vast amounts of data using distributed queries. If you work with terabytes or petabytes of data, you are likely using tools that interact with Hadoop and HDFS. Trino was designed as an alternative to tools that query HDFS using pipelines of MapReduce jobs, such as Hive or Pig, but Trino is not limited to accessing HDFS. Trino can be and has been extended to operate over different kinds of data sources, including traditional relational databases and other data sources such as Cassandra.\n\nTrino was designed to handle data warehousing and analytics: data analysis, aggregating large amounts of data and producing reports. These workloads are often classified as Online Analytical Processing (OLAP).",
"parallel": false,
"distributed": true,
"GPU": false
},
"aqmm": {
"Name": "aqmm",
"Description": "The AQMM was developed to perform absolute quantification of multiple metagenomes and it's parallel metatranscriptome. In order to use this algorithm, the experiment should be designed with both metagenome and metatranscriptome data. From the initial stage, the molecular experimental data should be recorded to help on estimating the overall DNA or RNA of a unit (ml/gram) of the sample. The AQMM was demonstrated to obtain better results of differential expression genes identification in comparative metatranscriptomic studies.",
"parallel": true,
"distributed": false,
"GPU": false
},
"aradeepopsis": {
"Name": "ARADEEPOPSIS",
"Description": "ARADEEPOPSIS is a software tool that enables plant researchers to non-invasively score plant growth, biomass accumulation and senescence from image data in a highly parallelized, high throughput, yet easy to use manner.",
"parallel": true,
"distributed": false,
"GPU": false
},
"arion_4_omics": {
"Name": "Arion 4 Omics",
"Description": "High performance, \u2018end-to-end\u2019 analysis pipeline for the classification of omics profiles. Incorporating highly parallel architecture and sophisticated database technologies to overcome inherent technology based bottlenecks, currently faced in the Life Science research path. Arion is a scalable platform providing rapid exploratory analysis via machine learning, visualization and statistical modules with topology planned for a future release.",
"parallel": true,
"distributed": false,
"GPU": false
},
"asltbx": {
"Name": "ASLtbx",
"Description": "Imaging Clinically Relevant Pain States Using Arterial Spin Labeling | The Center for Functional Neuroimaging at the University of Pennsylvania provides unification for currently distributed medical center efforts in physiological and clinical brain imaging and advance the general interests of the brain imaging community through targeted methods development, symposia and colloquia, handling of regulatory issues, and fund-raising efforts | Updated manual, ASL MoCo code, and asl_perf_subtract.m. Major new features include: | asl_perf_subtract.m has been updated with new parameters (T1b, T2wm, Rwm-blood) from the latest literature. CBF quantification for PASL using voxelwise M0 value has been updated. Please see the document therein for the details | Made a new version for SPM12. Major new features include: | 1) wraper functors for most of the processing steps,",
"parallel": false,
"distributed": true,
"GPU": false
},
"asterias": {
"Name": "Asterias",
"Description": "Set of tools for the analyses of high throughput genomic data that includes applications for microarray data normalization, filtering, detection of differential gene expression, class and survival prediction model building, and analysis of array CGH data. Most applications use parallel computing resulting in significant increases in speed.",
"parallel": true,
"distributed": false,
"GPU": false
},
"autodeconj": {
"Name": "AutoDeconJ",
"Description": "A GPU-accelerated ImageJ plugin for 3D light-field deconvolution with optimal iteration numbers predicting.",
"parallel": false,
"distributed": false,
"GPU": true
},
"autorelacs": {
"Name": "AutoRELACS",
"Description": "AutoRELACS is a Python script for automated generation and analysis of ultra-parallel ChIP-seq.",
"parallel": true,
"distributed": false,
"GPU": false
},
"basenumber": {
"Name": "BaseNumber",
"Description": "High performance of a GPU-accelerated variant calling tool in genome data analysis",
"parallel": false,
"distributed": false,
"GPU": true
},
"batchmap": {
"Name": "BatchMap",
"Description": "Parallel implementation of the OneMap R package for fast computation of F1 linkage maps in outcrossing species.",
"parallel": true,
"distributed": false,
"GPU": false
},
"bbrowser": {
"Name": "BBrowser",
"Description": "BBrowser allows biologists to handle scRNA-seq data without programming knowledge. Intuitive operation. Functions are updated. It has a new CITE-seq dashboard, a complete package for interactively exploring single-cell gene expression data in parallel with surface protein information.",
"parallel": true,
"distributed": false,
"GPU": false
},
"bets": {
"Name": "BETS",
"Description": "Bootstrap Elastic net regression from Time Series (BETS) is a statistical framework based on Granger causality for the recovery of a directed gene network from transcriptional time-series data. BETS uses elastic net regression and stability selection from bootstrapped samples to infer causal relationships among genes. BETS is highly parallelized, enabling efficient analysis of large transcriptional data sets.",
"parallel": true,
"distributed": false,
"GPU": false
},
"bigPint": {
"Name": "bigPint",
"Description": "Visualization methods for differential expression analysis | Methods for visualizing large multivariate datasets using static and interactive scatterplot matrices, parallel coordinate plots, volcano plots, and litre plots. Includes examples for visualizing RNA-sequencing datasets and differentially expressed genes | Big multivariate data plotted interactively | bigPint: Make BIG data pint-sized | Welcome to the bigPint package website! If you are a new user, please begin by reading from the Get Started tab at the top of this website. There are ten short vignette articles in that tab, and we recommend reading them in order. These short vignette articles consist of reproducible code that provide: | Alternatives to data metrics object | Clustering, DataImport, DifferentialExpression, GeneExpression, MultipleComparison, Normalization, Preprocessing, QualityControl, RNASeq, Sequencing, Software, Transcription, Visualization",
"parallel": true,
"distributed": false,
"GPU": false
},
"bigseqkit": {
"Name": "BigSeqKit",
"Description": "The Next Generation Sequencing (NGS) raw data are stored in FASTA and FASTQ text-based file formats. Common operations on FASTA/Q files include searching, filtering, sampling, deduplication and sorting, among others. We can find several tools in the literature for FASTA/Q file manipulation but none of them are well fitted for large files of tens of GB (likely TBs in the near future) since mostly they are based on sequential processing. The exception is seqkit that allows some routines to use a few threads but, in any case, the scalability is very limited. To deal with this issue, we introduce BigSeqKit, a parallel toolkit to manipulate FASTA/Q files at scale with speed and scalability at its core. BigSeqKit takes advantage of an HPC-Big Data framework (IgnisHPC) to parallelize and optimize the commands included in seqkit. In this way, in most cases it is from tens to hundreds of times faster than other state-of-the-art tools such as seqkit, samtools and pyfastx.",
"parallel": true,
"distributed": false,
"GPU": false
},
"biochemddi": {
"Name": "BioChemDDI",
"Description": "BioChemDDI web server is constructed to predict drug-drug interactions, which is compatible with most major browsers, and the parallel speed-up is implemented.",
"parallel": true,
"distributed": false,
"GPU": false
},
"biocparallel": {
"Name": "BiocParallel",
"Description": "This package provides modified versions and novel implementation of functions for parallel evaluation, tailored to use with BioConductor objects.",
"parallel": true,
"distributed": false,
"GPU": false
},
"biogridrunner": {
"Name": "BioGridRunner",
"Description": "BioGridRunner is a distributed computing application for bioinformatics, incorporating directory services (data and software), grid computing methods (security, authentication, data transport and remote jobs), and gene sequence and genomic data processing methods.",
"parallel": false,
"distributed": true,
"GPU": false
},
"biosigner": {
"Name": "biosigner",
"Description": "This package implements a new method to assess the relevance of the variables for the prediction performances of the classifier. The approach can be run in parallel with the PLS-DA, Random Forest, and SVM binary classifiers. The signatures and the corresponding 'restricted' models are returned, enabling future predictions on new datasets.",
"parallel": true,
"distributed": false,
"GPU": false
},
"bis-snp": {
"Name": "Bis-SNP",
"Description": "A package based on the Genome Analysis Toolkit map-reduce framework for genotyping in bisulfite treated massively parallel sequencing on Illumina platform. It uses bayesian inference with either manually specified or automatically estimated methylation probabilities of different cytosine context(not only CpG, CHH, CHG in Bisulfite-seq, but also GCH et.al. in other bisulfite treated sequencing) to determine genotypes and methylation levels simultaneously.",
"parallel": true,
"distributed": false,
"GPU": false
},
"bitpai": {
"Name": "BitPAI",
"Description": "A bit-parallel algorithm for general, integer-scoring global alignment. Integer-scoring schemes assign integer weights for match, mismatch and insertion/deletion. This method uses structural properties in the relationship between adjacent scores in the scoring matrix to construct classes of efficient algorithms, each designed for a particular set of weights.",
"parallel": true,
"distributed": false,
"GPU": false
},
"brian2genn": {
"Name": "Brian2GeNN",
"Description": "Brian2GeNN is a software package that enables the users to make use of GeNN GPU acceleration when developing their models in Brian, without requiring any technical knowledge about GPUs, C++ or GeNN. The new Brian2GeNN software uses a pipeline of code generation to translate Brian scripts into C++ code that can be used as input to GeNN, and subsequently can be run on suitable NVIDIA GPU accelerators.",
"parallel": false,
"distributed": false,
"GPU": true
},
"bsf-skeleton": {
"Name": "BSF-skeleton",
"Description": "A template for parallelization of iterative numerical algorithms on cluster computing systems.",
"parallel": true,
"distributed": false,
"GPU": false
},
"cate": {
"Name": "CATE",
"Description": "A fast and scalable CUDA implementation to conduct highly parallelized evolutionary tests on large scale genomic data.",
"parallel": true,
"distributed": false,
"GPU": false
},
"catsnap": {
"Name": "Catsnap",
"Description": "A user-friendly algorithm for determining the conservation of protein variants reveals extensive parallelisms in the evolution of alternative splicing.",
"parallel": true,
"distributed": false,
"GPU": false
},
"cbmos": {
"Name": "CBMOS",
"Description": "CBMOS is a GPU-enabled Python framework for the numerical study of center-based models. Cell-based models are becoming increasingly popular for applications in developmental biology.",
"parallel": false,
"distributed": false,
"GPU": true
},
"cbptools": {
"Name": "CBPtools",
"Description": "CBPtools is a Python package that allows users to run an extensively evaluated rCBP analysis workflow on a given ROI. It currently supports two modalities: resting-state functional connectivity and structural connectivity based on diffusion-weighted imaging, along with support for custom connectivity matrices. Analysis parameters are customizable and the workflow can be scaled to a large number of subjects using a parallel processing environment. Parcellation results with corresponding validity metrics are provided as textual and graphical output. Regional connectivity-based parcellation (rCBP) is a widely used procedure for investigating the structural and functional differentiation within a region of interest (ROI) based on its long-range connectivity.",
"parallel": true,
"distributed": false,
"GPU": false
},
"cbs_das_protein_viewer": {
"Name": "CBS DAS protein viewer",
"Description": "Protein viewer which uses the distributed annotation system (DAS) to integrate and present annotation data from multiple sources for a protein sequence.",
"parallel": false,
"distributed": true,
"GPU": false
},
"cellosaurus": {
"Name": "Cellosaurus",
"Description": "A knowledge resource on cell lines. It attempts to describe all cell lines used in biomedical research, including immortalized cell lines, naturally immortal cell lines (i.e. stem cell lines), finite life cell lines when those are distributed and widely used, vertebrate cell lines with an emphasis on human, mouse and rat cell lines, and invertebrate (insect and tick) cell lines.",
"parallel": false,
"distributed": true,
"GPU": false
},
"chemos": {
"Name": "ChemOS",
"Description": "ChemOS is a portable, modular and versatile software package, which supplies the structured layers indispensable for operating autonomous laboratories. Additionally, it enables remote control of laboratories, provides access to distributed computing resources, and comprises state-of-the-art machine learning methods.",
"parallel": false,
"distributed": true,
"GPU": false
},
"chinook": {
"Name": "Chinook",
"Description": "Chinook is a peer-to-peer (P2P) service for the discovery, use and assessment of bioinformatics programs. Chinook Online allows researchers to connect and run distributed bioinformatics programs using a web application.",
"parallel": false,
"distributed": true,
"GPU": false
},
"chromscape": {
"Name": "ChromSCape",
"Description": "ChromSCape is a user-friendly interactive Shiny/R application distributed as a Bioconductor package that processes single-cell epigenomic data to assist the biological interpretation of chromatin landscapes within cell populations. ChromSCape analyses the distribution of repressive and active histone modifications as well as chromatin accessibility landscapes from single-cell datasets.",
"parallel": false,
"distributed": true,
"GPU": false
},
"circvar-db": {
"Name": "circVAR database",
"Description": "circVAR database is a genome-wide archive of genetic variants for human circular RNAs.\nCircular RNAs (circRNAs), the 3' and 5' ends of which are covalently linked, are a class of widely distributed and abundant RNAs discovered in eukaryotic organisms in recent years. They can act as sponges that regulate microRNAs and RNA binding proteins. Our circVAR database aims to provide resources for circRNA-related genetic variants in healthy and diseased populations.",
"parallel": false,
"distributed": true,
"GPU": false
},
"clij-assistant": {
"Name": "CLIJ-assistant",
"Description": "CLIJx-Assistant is an intuitive user interface for building custom GPU-accelerated image processing workflows using CLIJ2 in Fiji. It visualizes workflows as image data flow graphs while building them. It suggests what to do next and generates scripts and human-readable protocols to facilitate reproducible bio-image analysis.",
"parallel": false,
"distributed": false,
"GPU": true
},
"cloudburst": {
"Name": "CloudBurst",
"Description": "CloudBurst is a parallel read-mapping algorithm optimized for mapping next-generation sequence data to the human genome and other reference genomes.",
"parallel": true,
"distributed": false,
"GPU": false
},
"clustersheep": {
"Name": "ClusterSheep",
"Description": "ClusterSheep is a GPU/CUDA-accelerated software tool for large-scale clustering of tandem mass spectra from shotgun proteomics.",
"parallel": false,
"distributed": false,
"GPU": true
},
"cmfsm": {
"Name": "cmFSM",
"Description": "A parallel acceleration tool for the classical frequent subgraph mining algorithm. The core idea is to employ parallel techniques to parallelize extension tasks, so as to reduce computation time. In addition, a multi-node strategy is employed to solve the problem of memory constraints.",
"parallel": true,
"distributed": false,
"GPU": false
},
"cmsa": {
"Name": "CMSA",
"Description": "A heterogeneous CPU/GPU computing system for multiple similar RNA/DNA sequence alignment.",
"parallel": false,
"distributed": false,
"GPU": true
},
"cogstack": {
"Name": "CogStack",
"Description": "Experiences of deploying integrated information retrieval and extraction services in a large National Health Service Foundation Trust hospital.\n\nCogStack is a lightweight distributed, fault tolerant database processing architecture and ecosystem, intended to make NLP processing and preprocessing easier in resource constrained environments.",
"parallel": false,
"distributed": true,
"GPU": false
},
"coils": {
"Name": "COILS",
"Description": "Program that compares a sequence to a database of known parallel two-stranded coiled-coils and derives a similarity score. By comparing this score to the distribution of scores in globular and coiled-coil proteins, the program then calculates the probability that the sequence will adopt a coiled-coil conformation.",
"parallel": true,
"distributed": false,
"GPU": false
},
"comer2": {
"Name": "COMER2",
"Description": "GPU-accelerated sensitive and specific homology searches.\n\nCOMER2, a cross-platform software package for protein remote homology search and alignment.\n\nThe COMER method based on sequence profile-profile comparison is one of the most sensitive and accurate computational tools developed for protein alignment and homology search. COMER version 2.1 (COMER2) represents one of the fastest implementations of calculations for sensitive protein homology search. High COMER2 performance is achieved by harnessing the power of the graphics processing unit (GPU). Hence, a GPU is expected to be installed on the system.",
"parallel": false,
"distributed": false,
"GPU": true
},
"compact": {
"Name": "CompaCt",
"Description": "CompaCt performs automated integrative comparative analysis of large-scale (protein) interaction datasets, identifying groups of interactors (e.g., protein complexes) in parallel in multiple species, allowing systematic identification and comparison of conserved as well as taxon-specific components of protein complexes and other interactions.",
"parallel": true,
"distributed": false,
"GPU": false
},
"compss": {
"Name": "COMPSs",
"Description": "Sequential programming model\nGeneral purpose programming languages + annotations/hints\nExploitation of implicit parallelism\nAutomatic on-the-fly creation of a task dependency graph\nCOMPSs workflows portable to HPC and Cloud without change",
"parallel": true,
"distributed": false,
"GPU": false
},
"consort-nlp": {
"Name": "CONSORT-NLP",
"Description": "CONSORT-NLP is an Artificial Intelligence Tool with Automated Reporting Checklist Generation for Parallel Group Randomized Trials.",
"parallel": true,
"distributed": false,
"GPU": false
},
"coolpuppy": {
"Name": "coolpup.py",
"Description": "coolpup.py is a versatile Python tool to perform pile-up analysis on Hi-C data in .cool format. coolpup.py is particularly well suited performance-wise for analysing huge numbers of potential interactions, since it loads whole chromosomes into memory one by one (or in parallel to speed it up) to extract small submatrices quickly. Having to read everything into memory makes it relatively slow for small numbers of loops, but performance doesn't decrease until you reach a huge number of interactions.",
"parallel": true,
"distributed": false,
"GPU": false
},
"copywriter": {
"Name": "CopywriteR",
"Description": "This tool extracts DNA copy number information from targeted sequencing by utilizing off-target reads. It allows extracting uniformly distributed copy number information, and it can be applied to sequencing data obtained from various techniques including chromatin immunoprecipitation and target enrichment on small gene panels. Thereby, this tool constitutes a widely applicable alternative to available copy number detection tools.",
"parallel": false,
"distributed": true,
"GPU": false
},
"corenup": {
"Name": "CORENup",
"Description": "CORENup is a deep learning model for nucleosome identification. CORENup processes a DNA sequence as input using one-hot representation and combines in a parallel fashion a fully convolutional neural network and a recurrent layer. These two parallel levels are devoted to catching both non-periodic and periodic DNA string features.",
"parallel": true,
"distributed": false,
"GPU": false
},
"cosinor": {
"Name": "cosinor",
"Description": "Time distributed data analysis by Cosinor. Online application.\n\nTools for estimating and predicting the cosinor model version 1.1 from CRAN.\n\ncosinor is a set of simple functions that transform longitudinal data to estimate the cosinor linear model as described in Tong (1976). Methods are given to summarize the mean, amplitude and acrophase, to predict the mean annual outcome value, and to test the coefficients.\n\nMolcan, L. (2019). Time distributed data analysis by Cosinor. Online application. bioRxiv, 805960.\n\nPreprint available at bioRxiv, doi: https://doi.org/10.1101/805960.\n\nCosinor Online, cosinor on the web, is a simple web-based application evaluating the presence of 24-h periodicity. Loading data is easy: just copy and paste columns from an Excel file to the web form.",
"parallel": false,
"distributed": true,
"GPU": false
},
"coverageview": {
"Name": "CoverageView",
"Description": "This package provides a framework for the visualization of genome coverage profiles. It can be used for ChIP-seq experiments, but it can also be used for genome-wide nucleosome positioning experiments or other experiment types where it is important to have a framework in order to inspect how the coverage is distributed across the genome.",
"parallel": false,
"distributed": true,
"GPU": false
},
"covidmulti-net": {
"Name": "CovidMulti-Net",
"Description": "CovidMulti-Net is a parallel-dilated multi-scale feature fusion architecture for the identification of COVID-19 cases from chest X-ray Images.",
"parallel": true,
"distributed": false,
"GPU": false
},
"cpmd": {
"Name": "CPMD",
"Description": "Parallelized plane wave/pseudopotential implementation of Density Functional Theory, particularly designed for ab-initio molecular dynamics.",
"parallel": true,
"distributed": false,
"GPU": false
},
"crimson": {
"Name": "CRIMSON",
"Description": "Crimson (cardiovascular integrated modelling & simulation) is an advanced simulation environment for subject-specific hemodynamic analysis. Crimson integrates best-in-class open-source solutions for parallel flow solvers, fluid-structure interactions, GUI-based boundary condition specification, data assimilation, medical image processing, mesh generation, transitional hemodynamics, and much more.",
"parallel": true,
"distributed": false,
"GPU": false
},
"croco-xlink": {
"Name": "CroCo",
"Description": "A user-centred tool to convert results from crosslinking mass spectrometry experiments. CroCo converts multiple data formats from chemical cross-linking mass spectrometry (XL-MS) software tools to xTable format (in csv format). It is distributed as a graphical programme to be run from an executable and as a Python module to be integrated into workflows.",
"parallel": false,
"distributed": true,
"GPU": false
},
"crtpowerdist": {
"Name": "CRTpowerdist",
"Description": "An R package to calculate attained power and construct the power distribution for cross-sectional stepped-wedge and parallel cluster randomized trials.",
"parallel": true,
"distributed": false,
"GPU": false
},
"ct-projection-simulator": {
"Name": "CT Projection Simulator",
"Description": "CT Projection Simulator computes the parallel beam 2D projections of an object specified as a set of geometric shapes.",
"parallel": true,
"distributed": false,
"GPU": false
},
"cuda-ec": {
"Name": "CUDA-EC",
"Description": "A scalable parallel algorithm for correcting sequencing errors in high-throughput short-read data so that error-free reads can be available before DNA fragment assembly.",
"parallel": true,
"distributed": false,
"GPU": false
},
"cuda-meme": {
"Name": "CUDA-MEME",
"Description": "Motif discovery software based on the MEME algorithm for a single GPU device using the CUDA programming model. At present, it only supports the OOPS and ZOOPS models.",
"parallel": false,
"distributed": false,
"GPU": true
},
"cudammc": {
"Name": "cudaMMC",
"Description": "GPU-enhanced multiscale Monte Carlo chromatin 3D modelling.",
"parallel": false,
"distributed": false,
"GPU": true
},
"cudasw": {
"Name": "CUDASW++",
"Description": "CUDASW++ (compute unified device architecture) is bioinformatics software for Smith-Waterman protein database searches that takes advantage of the massively parallel CUDA architecture of NVIDIA Tesla GPUs to perform fast sequence searches.",
"parallel": true,
"distributed": false,
"GPU": true
},
"cufluxsampler.jl": {
"Name": "CuFluxSampler.jl",
"Description": "GPU-accelerated algorithms for flux sampling in CUDA.jl, working with COBREXA.jl",
"parallel": false,
"distributed": false,
"GPU": true
},
"cuneiform": {
"Name": "Cuneiform",
"Description": "A functional scientific workflow language for large scale scientific data analysis. It enables integration of external tools written in any language and automatic parallelization of execution.",
"parallel": true,
"distributed": false,
"GPU": false
},
"curc": {
"Name": "CURC",
"Description": "A GPU-accelerated reference-free compressor for high-throughput sequencing reads of FASTQ files.",
"parallel": false,
"distributed": false,
"GPU": true
},
"cuttlefish": {
"Name": "Cuttlefish",
"Description": "Cuttlefish is a fast, parallel, and very memory-lightweight tool to construct the compacted de Bruijn graph from genome reference(s). Cuttlefish is a tool for constructing the (colored) compacted de Bruijn graph from a collection of one or more genome references. Cuttlefish introduces a novel modeling scheme of the de Bruijn graph vertices as finite-state automata, and constrains the state-space for the automata to enable tracking of their transitioning states with very low memory usage. Cuttlefish is also fast and highly parallelizable. Experimental results demonstrate that the algorithm scales much better than existing approaches, especially as the number and scale of the input references grow.",
"parallel": true,
"distributed": false,