Fubar max mem reported #5

Open · maasha opened this issue Jun 15, 2015 · 4 comments

maasha commented Jun 15, 2015

./klust ~/scratch/GG_BP.fna --sort_incr -u ~/scratch/clusters
Running with parameters:
  k = 5
  id = 0.85
  max_rejects = 8
  depth = 0

Reading sequences...
Time: 71.9542 sec.
Seqs/sec: 17552.6

Sorting by increasing sequence length...
Clustering 1262986 sequences...
100%
Time: 84.8908 sec.
Throughput: 14877.8 seqs/sec.

Clusters:   5754
Max size:   157402
Avg size:   219.497
Min size:   1
Singletons: 2136
Max mem:    3115372 MB

Particulars:

gcc --version
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.0 (clang-600.0.56) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin14.3.0
Thread model: posix
ahovgaard (Collaborator) commented

I am not able to reproduce this on Linux, but it seems that the ru_maxrss field in the rusage struct is in bytes on OS X, as opposed to kilobytes on Linux, which would make the reported number 1024 times too large. Seems plausible.

#include <sys/resource.h>
#include <iostream>
// Assumes ru_maxrss is in kilobytes, as it is on Linux.
struct rusage usage;
if (getrusage(RUSAGE_SELF, &usage) == 0)
    std::cout << "Max mem:    " << usage.ru_maxrss / 1024 << " MB" << std::endl;
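For reference, a minimal sketch of a platform-aware version, assuming the only difference is the ru_maxrss unit (bytes on OS X versus kilobytes on Linux; e.g. the reported 3115372 "MB" / 1024 ≈ 3042 MB, in line with the later runs). The helper name max_mem_mb is illustrative and not from the klust source:

#include <sys/resource.h>
#include <iostream>

// Hypothetical helper: normalize ru_maxrss to megabytes.
// ru_maxrss is reported in bytes on OS X and in kilobytes on Linux.
static long max_mem_mb(const rusage& usage) {
#ifdef __APPLE__
    return usage.ru_maxrss / (1024L * 1024L);  // bytes -> MB
#else
    return usage.ru_maxrss / 1024L;            // KB -> MB
#endif
}

int main() {
    rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) == 0)
        std::cout << "Max mem:    " << max_mem_mb(usage) << " MB" << std::endl;
    return 0;
}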

ahovgaard self-assigned this on Jun 15, 2015
ahovgaard (Collaborator) commented

Another thing: is reading sequences always that slow on your machine? Here I get around 40k sequences per second (SILVA), and that is using an encrypted hard drive (not an SSD).

ahovgaard added the bug label on Jun 15, 2015
ahovgaard (Collaborator) commented

I just pushed something which should hopefully fix this. Will you pull or clone again and test it?

maasha commented Jun 26, 2015

So the memory reporting is better but unstable: here I ran the same command twice, and the first time I get Max mem: 2952 MB, the second time Max mem: 3358 MB?

With respect to reading speed: I have an unencrypted flash drive in a brand new MacBook Pro :o(

maasha@edna:~/scratch$ klust GG_BP.fna --sort_incr -u clusters
Running with parameters:
  k = 5
  id = 0.85
  max_rejects = 8
  depth = 0

Reading sequences...
Time: 72.3317 sec.
Seqs/sec: 17461

Sorting by increasing sequence length...
Clustering 1262986 sequences...
100%
Time: 86.4049 sec.
Throughput: 14617.1 seqs/sec.

Clusters:   5754
Max size:   157402
Avg size:   219.497
Min size:   1
Singletons: 2136
Max mem:    2952 MB
maasha@edna:~/scratch$ klust GG_BP.fna --sort_incr -u clusters
Running with parameters:
  k = 5
  id = 0.85
  max_rejects = 8
  depth = 0

Reading sequences...
Time: 65.8981 sec.
Seqs/sec: 19165.7

Sorting by increasing sequence length...
Clustering 1262986 sequences...
100%
Time: 85.8016 sec.
Throughput: 14719.8 seqs/sec.

Clusters:   5754
Max size:   157402
Avg size:   219.497
Min size:   1
Singletons: 2136
Max mem:    3358 MB
