How to run the block-level experiment #28

Kaka727 · 2018-12-20T07:16:52Z

The description can be found here.
https://github.com/Mondego/SourcererCC/issues/26
@dyangUCI @pedromartins4
Yeah, I used the samples in this repository (test-env). The three projects are zipped, so I ran the command "python tokenizer.py zipblocks". But as I said, the file under /file_block_stats ("file-stats") is empty. I don't know what is wrong.

dyangUCI · 2018-12-21T06:57:27Z

Sorry, I cannot reproduce your error. When I run the command "python tokenizer.py zipblocks", data is written under file_blocks_stats. Did you extract the archive test-env.tgz? Maybe that's the issue.

Kaka727 · 2018-12-21T08:45:19Z

@dyangUCI
Thanks for your response. I retried the command, and this time the file "file_blocks_stats" does contain some content, as below.

However, the file "file-tokens" is still empty. Is that also the case in your environment?
Thanks~

dyangUCI · 2018-12-21T19:26:10Z

Hi, I found the issue in the tokenizer. We once collected some extra information about Java functions for specific experiments and later abandoned it, but the code remained in tokenizer.py and caused index-out-of-range failures, so the result files were incomplete. Please pull the latest version of the project and rerun tokenizer.py; the output should be correct now.

dyangUCI · 2018-12-21T19:30:11Z

There will be 56 blocks in the tokens file. The stats file contains both file stats and block stats, 61 lines in total. You can check the results on your end accordingly.
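If it helps to compare against those numbers, here is a minimal sketch that counts the non-empty lines in each output folder. It assumes one record per line (one block per line in the tokens output, one entry per line in the stats output), and the script name and folder arguments are placeholders for whatever your tokenizer configuration actually writes.

```python
import sys
from pathlib import Path


def count_lines(folder):
    """Count non-empty lines across all regular files directly under folder."""
    total = 0
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            with path.open(encoding="utf-8", errors="replace") as handle:
                total += sum(1 for line in handle if line.strip())
    return total


if __name__ == "__main__":
    # Assumption: each folder given on the command line is a tokenizer
    # output folder holding one record per line.
    for folder in sys.argv[1:]:
        if Path(folder).is_dir():
            print(folder, count_lines(folder))
        else:
            print(folder, "not found")
```

Run it as, for example, "python check_counts.py <tokens folder> <stats folder>" (check_counts.py is a hypothetical name) and compare the printed totals with 56 and 61.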

Kaka727 · 2018-12-22T12:03:53Z

Yeah, thanks very much!
this time it really works~

Kaka727 · 2018-12-22T12:10:51Z

But I still have some questions below.
First, on my computer, the block-level clone results for the sampled projects are empty. Is that expected?
Second, I'd like to know what Node_1, Node_2, and so on represent.
Thanks~

saini · 2018-12-22T20:05:57Z

The number of Node folders represents the number of processes that were run in parallel to carry out the clone detection. The numeric argument N in the command "python controller.py N" tells the controller script to carry out clone detection using N processes. For systems where memory is low, N should be 1. Each process will reserve the amount of memory specified in the -Xmx and -Xms arguments to the JVM.
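As a rough way to pick N, the memory arithmetic can be sketched as below. This is only an estimate under stated assumptions: it counts the -Xmx heap alone (a JVM process also uses some memory outside the heap), and the heap sizes in the usage note are made-up examples, not values taken from the project's scripts.

```python
import re

# Multipliers for the size suffixes accepted by -Xmx / -Xms.
UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def heap_bytes(size):
    """Convert a JVM heap size such as '512m' or '10g' to bytes."""
    match = re.fullmatch(r"(\d+)([kmg]?)", size.strip().lower())
    if match is None:
        raise ValueError(f"unrecognised heap size: {size!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def total_reserved_bytes(num_processes, xmx):
    """Heap reserved in total when N processes each get the same -Xmx."""
    return num_processes * heap_bytes(xmx)


def max_processes(available, xmx):
    """Largest N whose combined heap fits in the available memory (at least 1)."""
    return max(1, heap_bytes(available) // heap_bytes(xmx))
```

For example, with a hypothetical -Xmx of 6g on a machine with 16g free, max_processes("16g", "6g") gives 2, and total_reserved_bytes(2, "6g") is 12 GiB expressed in bytes.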

Kaka727 · 2018-12-23T01:46:32Z

@dyangUCI @saini
Thanks for your quick response.
I'd like to know the results for the three sampled projects. On my computer, there is no "query" file under /NODE_1/output8.0 after running "python controller.py 1". What could be the cause?

zhuwq585 · 2021-06-28T14:18:48Z

@dyangUCI @saini
Thanks for your quick response.
I'd like to know the results for the three sampled projects. On my computer, there is no "query" file under /NODE_1/output8.0 after running "python controller.py 1". What could be the cause?

Did you find the cause? (If you still remember it...)

dyangUCI closed this as completed Dec 21, 2018

dyangUCI reopened this Dec 21, 2018
