Syntactic Feature Extraction #11
Remove this change, since it is not part of this issue.
Please include the updated feature_data.csv as well.
@@ -34,6 +34,16 @@
// feature 10: #literals
private int literals = 0;

// new features
private float avgCommas = 0.0f;
No need to use 0.0f, because Java implicitly widens an int literal to float.
float avgCommas = 0 will work.
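To illustrate the point about the suffix, here is a minimal sketch (the class name FloatInit and its field names are hypothetical, not taken from the PR). Java applies a widening primitive conversion from int to float on assignment, so both initializers compile and hold the same value. A double literal such as 0.0 is a different case and would not compile without a cast or the f suffix.

```java
class FloatInit {
    // Explicit float literal.
    static float withSuffix = 0.0f;
    // int literal 0, implicitly widened to float on assignment.
    static float withoutSuffix = 0;

    public static void main(String[] args) {
        System.out.println(withSuffix == withoutSuffix); // prints "true"
        // float bad = 0.0; // would NOT compile: 0.0 is a double literal
    }
}
```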
features.setParenthesis(count(snippet, "\\(") * 2);

//get the number of lines of code
String[] lines = snippet.split("\n");
This will include empty lines as well. Instead, use the SCC tool to get the lines of code (non-empty, non-comment lines).
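A small sketch of why the raw split overcounts, assuming a snippet with one blank line and one comment line. The class LineCountSketch and its helpers are hypothetical and are not a substitute for SCC: the filter below only skips blank lines and lines starting with //, and does not handle block comments or trailing comments.

```java
import java.util.Arrays;

class LineCountSketch {
    // Counts every element produced by split, interior blank lines included.
    // Note: String.split drops trailing empty strings, but keeps interior ones.
    static int rawLines(String snippet) {
        return snippet.split("\n").length;
    }

    // Counts lines that are neither blank nor a single-line (//) comment.
    // Assumption: block comments and trailing comments are out of scope here.
    static long codeLines(String snippet) {
        return Arrays.stream(snippet.split("\n"))
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .filter(line -> !line.startsWith("//"))
                .count();
    }

    public static void main(String[] args) {
        String snippet = "int a = 1;\n\n// a comment\nint b = 2;\n";
        System.out.println(rawLines(snippet));  // prints 4
        System.out.println(codeLines(snippet)); // prints 2
    }
}
```

The gap between the two counts is what the comment above is pointing at: any per-line average computed from the raw split would be diluted by blank and comment lines.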
@@ -83,4 +84,7 @@ public void testIfStatements2() {
    public void testMethodParameters2() {
        assertEquals(NUM_OF_PARAMETERS_2, featureVisitor2.getFeatures().getNumOfParameters());
    }
}

Remove these empty lines.
@zeesoolee Overall good job! A couple of comments.
- Let's use a separate file, say something like aggregated_feature_extractor, to collect features that include LOC (lines of code). LOC can be obtained from loc_data.csv.
- Update the unit tests to test your feature.
- Always mention the GH issue ID in your PR description. It is good practice to link issues to their corresponding fixes.
@zeesoolee / @zslee001 Closed this PR since the fork of this implementation is now outdated. The comments mentioned here should still be addressed in your new PR; please also link this PR's ID there.
Syntactic Feature Extraction, some progress on new features