How to run the block-level experiment #28
Comments
Sorry, I cannot reproduce your error. When I run the command "python tokenizer.py zipblocks", data is written under file_blocks_stats. Did you unzip the folder test-env.tgz? Maybe that's the issue?
@dyangUCI However, the file "file-tokens" is still empty. I'd like to know whether this is also the case in your environment.
Hi, I found the issue in the tokenizer: it still contained code that collected extra information for Java functions, which we once used for some specific experiments and later abandoned. That leftover code in tokenizer.py caused index-out-of-range failures, so the result files were incomplete. Please pull the git project now and rerun tokenizer.py; it should be correct now.
There will be 56 blocks in the tokens file. The stats file contains both file stats and block stats, 61 lines in total. You can check the results on your end accordingly.
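For anyone who wants to script that check, here is a minimal sketch that counts the non-empty lines in an output directory and compares the total with an expected number. The helper names `count_lines` and `check_output` are hypothetical (they are not part of SourcererCC), and it assumes one record per line; pass in whatever output directories your tokenizer run actually produced.

```python
from pathlib import Path


def count_lines(directory):
    """Return the number of non-empty lines across all regular files in `directory`."""
    total = 0
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            with path.open(encoding="utf-8", errors="replace") as handle:
                total += sum(1 for line in handle if line.strip())
    return total


def check_output(directory, expected):
    """Return (matches, actual); a missing directory counts as 0 lines."""
    actual = count_lines(directory) if Path(directory).is_dir() else 0
    return actual == expected, actual
```

With the numbers quoted above, you would call `check_output(<tokens directory>, 56)` and `check_output(<stats directory>, 61)`; a result of `(False, 0)` means the directory is missing or its files are empty.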
Yeah, thanks very much!
But I still have some questions below.
The number of Node folders represents the number of processes that were run in parallel to carry out the clone detection. The numeric argument N in the command ‘python controller.py N’ tells the controller script to carry out clone detection using N processes. For systems where memory is low, N should be 1. Each process reserves the amount of memory specified in the xmx and xms arguments to the JVM.
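To get a rough feel for how N and the heap setting interact, here is a small sketch of the arithmetic. It is not part of SourcererCC: the helper names are hypothetical, it assumes every process uses the same xmx value, and it only counts the Java heap, so the real footprint of each JVM will be somewhat higher.

```python
import re

# Multipliers for the size suffixes accepted by -Xmx / -Xms (k, m, g; none = bytes).
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_jvm_size(text):
    """Convert a JVM size string such as '512m' or '4g' to bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognised JVM size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNITS[unit.lower()]


def total_heap_bytes(processes, xmx):
    """Heap reserved in total when `processes` JVMs each run with the given xmx."""
    return processes * parse_jvm_size(xmx)


def max_processes(available_bytes, xmx):
    """Largest N whose combined heap fits in `available_bytes`, never below 1."""
    return max(1, available_bytes // parse_jvm_size(xmx))
```

For example, `max_processes(16 * 1024 ** 3, "4g")` suggests at most 4 processes on a machine with 16 GiB free, and the function falls back to 1 when even a single heap does not fit, matching the advice above for low-memory systems.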
Did you find the cause? (If you still remember it...)
The description can be found here.
https://github.com/Mondego/SourcererCC/issues/26
@dyangUCI @pedromartins4
Yeah, I used the samples in this repository (test-env). The three projects are zipped, so I executed the command "python tokenizer.py zipblocks". But as I said, the file under /file_block_stats ("file-stats") is empty. I don't know what is wrong.