Hello! Thanks for the nice paper.
The test data ran fine on our cluster system using the bash scripts, as did 10K subsets of my 18 samples using modified versions of the same scripts. However, when I ran LSA on the full files (18 samples, each about 500 MB), I got errors during HashCounting.sh, even when running with up to 40 threads. I am not using the LSFS system.
Here is the error message:
parallel: This job failed:
echo $(date) writing k-mer corpus for file 2;
python LSA/kmer_corpus.py -r 2 -i Vhashed_reads/ -o Vcluster_vectors/ >> VLogs/KmerCorpus.log 2>&1
printing end of last log file...
hashobject.kmer_corpus_to_disk(Kmer_Hash_Count_Files[fr],mask=M)
IndexError: list index out of range
Traceback (most recent call last):
File "LSA/kmer_corpus.py", line 33, in <module>
hashobject.kmer_corpus_to_disk(Kmer_Hash_Count_Files[fr],mask=M)
IndexError: list index out of range
Traceback (most recent call last):
File "LSA/kmer_corpus.py", line 33, in <module>
hashobject.kmer_corpus_to_disk(Kmer_Hash_Count_Files[fr],mask=M)
IndexError: list index out of range
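For context, the failing line indexes the list Kmer_Hash_Count_Files with fr (presumably the value passed via -r), so this IndexError would mean the list holds fewer entries than the index requested. Below is a minimal sketch of a guarded lookup that reports how many files were actually found. It assumes the list is built by globbing the input directory; the function name `pick_count_file` and the file-name pattern are hypothetical and not taken from the LSA code.

```python
import glob
import os


def pick_count_file(input_dir, index, pattern="*.count.hash*"):
    """Return the index-th matching file, or raise a descriptive error.

    `pattern` is a hypothetical file-name pattern, not taken from LSA.
    """
    # Sort so that the index-to-file mapping is stable between runs.
    files = sorted(glob.glob(os.path.join(input_dir, pattern)))
    if not 0 <= index < len(files):
        raise IndexError(
            "index %d requested but only %d file(s) match %r in %s"
            % (index, len(files), pattern, input_dir)
        )
    return files[index]
```

Calling something like `pick_count_file("Vhashed_reads/", 2)` would then say how many matching files exist instead of only "list index out of range", which helps tell a missing upstream output apart from a wrong index.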
Oddly, when I looked at the log files for the test data and my subset data, I found errors similar to those from the full data (attached below). After looking them up, I thought they might be related to this: http://stackoverflow.com/questions/4964101/pep-3118-warning-when-using-ctypes-array-as-numpy-array.
Here are the outputs that you requested in other issues, from the run with the full dataset:
HashReads.log
KmerCorpus.log
CombineFractions.log
MergeHash.log
Note: GlobalWeights.log and CreateHash.log were empty.
ls -l Vcluster_vectors.txt
ls -l Vhashed_reads.txt
ls -l Voriginal_reads.txt
Here are the log files and outputs from the run with your test data:
CombineFractions.log
CreateHash.log
HashReads.log
KmerClusterIndex.log
KmerCorpus.log
KmerLSI.log
MergeIntermediatePartitions.log
ReadPartitions.log
Note: these were empty:
GlobalWeights.log
KmerClusterCols.log
KmerClusterMerge.log
KmerClusterParts.log
MergeHash.log
ls -l cluster_vectors
ls -l hashed_reads
ls -l original_reads
Thanks in advance,
Nori