Extraction of Gang of Four Motifs #318

Open
5 tasks done
RavenMarQ opened this issue Oct 14, 2024 · 35 comments · May be fixed by #326

@RavenMarQ
Collaborator

RavenMarQ commented Oct 14, 2024

Purpose

Originally, the task of this issue was to use pattern4 to analyze the class files of a project by generating an XML file, and then to generate a table from that XML data. However, it turned out that this work had already been completed; what was still needed was testing, since the tool had presented issues during development. My task is therefore simply to test whether the pattern4 tool generates the Gang of Four motifs as we wanted.

Process

The process is to retrieve the jar file, run it against multiple folders of class files compiled from Java projects, and use the resulting data in the specified notebook. This should tell us whether the worries about pattern4 being unstable are warranted.

Tasks

  • Get familiar with GoF motifs and what the pattern4 jar is doing
  • Acquire copy of pattern4
  • Using the class files from a generated project, use pattern4 in command line to analyze it
  • Then, using the notebook, attempt to run through its entirety from scratch
  • Repeat one or two more times to ensure that it works
@carlosparadis changed the title from "Extraction of Source Code Text for Analysis" to "Extraction of Gang of Four Motifs" Oct 14, 2024
@carlosparadis
Member

Edited the title. The Source code extraction is currently being done by @daomcgill as part of her M3 on #313

Context

While @daomcgill will look at source files as text, and use srcML to get different sets of words out of a file representation, your work will continue building on the capability of extracting dependencies from source code, as you did in M1 with Scitools.

Dependency Graphs and Social Smell Motifs

What is different, however, is that we are now looking less at extracting dependencies as a whole, such as function calls, and instead asking ourselves what we do with said graphs. One such use is motifs. I showed you what the triangle and square motifs were last call. They used data on who changes what file (commits), and also on who communicates with whom (mail data like @daomcgill obtained from mbox, or issue tracker comments like ours now on GitHub, serve as data for these connections).

The square motif also had a connection between file and file, that is the data you obtained in Scitools.

The goal of those motifs is to understand "who collaborates without communicating". As we are now communicating before your code is merged in Kaiaulu!

Architectural Flaws as Motifs

There are other types of motifs. Remember: all a motif is, is a configuration of a tiny graph that we will search for in the entire graph of the code. What the nodes are, and what the edges are in said motif, is up to us to decide. For example, if you define the nodes as files, and you look for certain configurations and types of dependencies, you will extract architectural flaws. @rnkazman's work with the DV8 tool, which Kaiaulu interfaces with in dv8.R, does exactly that.
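To make the idea of a motif concrete, here is a minimal Python sketch (not Kaiaulu's implementation) of searching a small graph for the configuration mentioned above: two developers who changed the same file, with a flag for whether a communication edge exists between them. The function name `find_shared_file_pairs` and the toy data are hypothetical and only illustrate the idea of matching a tiny graph inside a larger one.

```python
from itertools import combinations


def find_shared_file_pairs(changed, talked):
    """Return (dev_a, dev_b, file, communicated) for each pair of
    developers who changed the same file."""
    # Invert developer -> files into file -> developers.
    devs_by_file = {}
    for dev, files in changed.items():
        for f in files:
            devs_by_file.setdefault(f, set()).add(dev)

    matches = []
    for f in sorted(devs_by_file):
        # Every pair of developers on the same file is a candidate motif.
        for a, b in combinations(sorted(devs_by_file[f]), 2):
            communicated = frozenset((a, b)) in talked
            matches.append((a, b, f, communicated))
    return matches


# Toy data (hypothetical): who changed which file, and who talked to whom.
changed = {
    "ana": {"Parser.java", "Graph.java"},
    "ben": {"Parser.java"},
    "cy": {"Graph.java"},
}
talked = {frozenset(("ana", "ben"))}

matches = find_shared_file_pairs(changed, talked)
for a, b, f, communicated in matches:
    status = "communicated" if communicated else "no communication"
    print(a, b, f, status)
```

Pairs flagged with no communication are the "collaborate without communicating" cases. Changing what counts as a node or an edge gives a different motif.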

Gang of Four Design Patterns as Motifs

If you follow me up to this point, then the task in this issue should be more clear:

We are looking for certain configurations of files and their dependencies. However, instead of architectural flaws, we are looking for Gang of Four motifs. They are commonly known as Gang of Four design patterns.

This is important so you don't go down the wrong rabbit hole: Rather than us building the graph and the motif (Kaiaulu allows you to do that, as shown in the triangle and square motif Notebook), we will try to use an external tool to do that for us. If we are unable to, we will instead try to code our own.

Tool 1 - pattern4

The tool is: https://users.encs.concordia.ca/~nikolaos/pattern_detection.html

It is written in Java. Your first step, before you even touch any R code, is to download the tool and run it. The tool can be launched with a graphical interface or from the command line. It is not immediately obvious how to run it via terminal, so I will try to find how we did that in the past.

If you run the tool successfully, it will generate an XML file. We already have code to parse said XML file. I will create a branch and a PR containing said code, but in the meantime you can try using the tool. Similar to Scitools, the tool will want to see a folder containing source code. This tool only works for Java projects. The tool can take a while to run, so I recommend trying it on a small project first. I was also told it can require some fiddling with where you point the tool to look in order to generate the files.

You of course should get some idea what a gang of four design pattern is. Look at some youtube videos if needed. The ones you care about should be the ones the tool can detect: https://users.encs.concordia.ca/~nikolaos/pattern_detection.html

Tool 2 - Eclipse Plugin

You should avoid this for now, as I don't believe this tool allows us to run it from the terminal, which means the integration with Kaiaulu will not be viable. It does offer some visualization capabilities.

https://github.com/tsantalis/DPD4Eclipse

Functions in Kaiaulu - the API

We will want to have one function that takes the source code as input and outputs the XML file, and another that parses the XML and gives us a table. As I said before, the parser function is already done; I need to find it (I don't remember if it was ever versioned).

There are two publications that can serve as documentation of what is going on. Since the goal is not for you to read papers, but rather code, I will point out exactly which passages are relevant, just to help you understand what is going on and give you some additional examples. So long as you remember what you did with Scitools Understand, where an external tool, invoked by a Kaiaulu function, gives you a file that is not a table, and another function, the parser, parses it into a table in Kaiaulu, then you have the right mental model of this task and just need to supplement your understanding of motifs and design patterns.


As a final note: I hope you see that the functions in Kaiaulu basically encapsulate the overhead of having to learn the intricacies of all these tools out there. They give a common interface to get tables out of them, while the notebooks give you sufficient context to get started.

Please try to use this comment to refine your issue specification.

@carlosparadis
Member

@RavenMarQ it seems I forgot aspects of what Kaiaulu already had:

The task I was going to give you was already done (!) and is documented here: http://itm0.shidler.hawaii.edu/kaiaulu/articles/graph_gof_showcase.html

See also: http://itm0.shidler.hawaii.edu/kaiaulu/reference/index.html#-gang-of-four-patterns-

This will make things easy. I recall this functionality was not working properly due to some limitations of pattern4 at the time, so this task is therefore much simpler:

Try out the notebook on two Java projects on GitHub (remember to go for one that is not too big or you may need to leave it overnight running), and see if you can get the xml file out of it and parse.

Also, please create a new issue placeholder for the actual M2 task.

Thanks!

@carlosparadis
Member

@beydlern, what I just asked @RavenMarQ is a fine example of where your work, to look for projects in OpenHub, would be useful for a large number of projects. Here, a project that is "not too big" is one that is not too large with respect to LOC.

@RavenMarQ
Collaborator Author

Here's a quick update on the testing: as of now, the tool is not able to create an XML file. For both projects I tested, I was unable to get it to generate one.

@carlosparadis
Member

@RavenMarQ I need more details on what did not work, thanks!

@RavenMarQ
Collaborator Author

After running the notebook, changing the config files to point to my project and the pattern4.jar, when I reach the point of

gof_patterns <- write_gof_patterns(pattern4_path = pattern4_path,
                                   class_folder_path = class_folder_path)

gof_patterns holds an empty string. All file paths are valid, but the function simply doesn't extract anything from the compiled Calculator App you shared with me in Issue 308. Once more, the documentation and the explanation don't do much to explain where it could have gone wrong.

@carlosparadis
Member

Did you use .class files in the folder? You need compiled java files for this.

@carlosparadis
Member

Also, I mentioned on call that you should use the Kaiaulu function to learn how to run pattern4.jar manually and directly, instead of calling the function itself. You want to isolate the issue to the tool, rather than adding more things around it when just trying to use it. Look at how the function calls the tool via the command line, then do it manually to see if it works on .class files.

@RavenMarQ
Collaborator Author

RavenMarQ commented Oct 23, 2024

Yes, I used a smaller project with five .class files that I compiled myself: one error class, one main, one that extends another, and two other class files. I have also spoken to @beydlern about the configuration being a potential problem and have confirmed that it is correct.

I will look to running it on the command line, via the provided html's instructions.

@carlosparadis
Member

Can you put the files on the shared drive so I can take a look? And when you get to it, paste here the pattern4 command that you are running so I can validate it. I am trying to see if I can find the older example code we had of compiled classes.

@RavenMarQ
Collaborator Author

Alright, tested it in command line, and the xml only contains a system node with all the GoF Patterns:

<?xml version="1.0" encoding="UTF-8"?>
<system>
	<pattern name="Factory Method" />
	<pattern name="Prototype" />
	<pattern name="Singleton" />
	<pattern name="(Object)Adapter" />
	<pattern name="Command" />
	<pattern name="Composite" />
	<pattern name="Decorator" />
	<pattern name="Observer" />
	<pattern name="State" />
	<pattern name="Strategy" />
	<pattern name="Bridge" />
	<pattern name="Template Method" />
	<pattern name="Visitor" />
	<pattern name="Proxy" />
	<pattern name="Proxy2" />
	<pattern name="Chain of Responsibility" />
</system>

I will be uploading a copy of my class files in the meetings folder in a new folder.

@carlosparadis
Member

OK this is a good start. You need a more comprehensive system to try pattern4.jar. It is trying to detect design patterns. A simple source code will likely not have them.

There used to be a precompiled JHotDraw source code that was a good starting point. Try to find a reasonably sized Java codebase, one that is not just 3 or 5 files, and run the tool against it. The XML is empty because it didn't detect anything.
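A quick way to tell an "empty" result from a real one is to count the child elements under each `<pattern>` node. The sketch below uses only Python's standard library and assumes nothing about the output beyond what is pasted above (a `<system>` root with `<pattern name="...">` children); the function name `count_pattern_children` is hypothetical, and the structure of any detected instances is deliberately not assumed.

```python
import xml.etree.ElementTree as ET


def count_pattern_children(xml_text):
    """Map each <pattern> name to the number of child elements it has."""
    # Parse from bytes so the encoding declaration is honored.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return {p.get("name"): len(list(p)) for p in root.iter("pattern")}


# Shortened copy of the output pasted in this thread.
sample = """<?xml version="1.0" encoding="UTF-8"?>
<system>
	<pattern name="Factory Method" />
	<pattern name="Singleton" />
	<pattern name="Observer" />
</system>"""

counts = count_pattern_children(sample)
detected = [name for name, n in counts.items() if n > 0]
print("patterns listed:", len(counts))
print("patterns with detections:", detected)
```

If `detected` comes back empty, the tool ran but found no pattern instances, which is the situation described here.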

@RavenMarQ
Collaborator Author

Alright, are there any good GitHub projects of a reasonable size that I can try to compile? I only used tiny projects so I wouldn't have to sit there waiting for it to process, but it seems I might have to.

@carlosparadis
Member

@RavenMarQ see on the shared drive the folder tsantalis/tsantalis_ground_truth/compiled_projects. The 3 folders in there are already compiled for you, so do not try to compile anything. I recommend you try JHotDraw 5.1 folder first. Download it to your computer, and try to run pattern4.jar.

There is actually a Kaiaulu-compatible configuration file (it may need some minor editing of filepaths) to run the R function with, at tsantalis_groundtruth/jhotdraw5.1.yml.

Try this one first, since I know it should generate a XML file. Note the XML output is already there if you want to inspect what the output should look like.

Please don't edit anything in the tsantalis folders I just uploaded.

Some other folders will show up within the next hour as some files are still updating, but everything for JHotDraw should already be there.

We want to wrap this up this week, or you will fall behind on the downloader task.

@RavenMarQ
Collaborator Author

Alright, after testing and messing about with the notebook and its values, it seems that the notebook is at fault here. I had to change a few values in the notebook and in the config files. The default folder that the notebook uses for output appears to be invalid. However, as it stands, the GoF pattern writer and parser work.

@carlosparadis
Member

@RavenMarQ this is too vague. I need to know what exactly you had to change in the Notebook so we can fix this. Could you elaborate?

@RavenMarQ
Collaborator Author

RavenMarQ commented Oct 24, 2024

When calling write_gof_patterns and parse_gof_patterns, the functions have a default output path: '/tmp/gof.xml'. However, this is not a filepath recognized by every computer/drive where Kaiaulu is present (as shown by the fact that the system call made by the function returned error code 1).

For the fix, I had to actually put in at

pattern4_output_filepath <- conf[["tool"]][["pattern4"]][["output_filepath"]]

a valid file path, since I have my files split between two drives. I fixed it by setting it either to the relative path '../tmp/gof.xml' or to the absolute path 'F:/Kaiaulu/tmp/gof.xml'. I am unsure why this behavior occurs, so in the notebook I added the parameter output_filepath = pattern4_output_filepath to both function calls. After that, the functions worked fine.
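The underlying problem is a hard-coded `/tmp` default. As an illustration only (this is a Python sketch, not Kaiaulu's R code), a portable default can be derived from the platform's temporary directory, and a configured path can be validated before the external tool is called. The helper names `default_gof_output_path` and `resolve_output_path` are hypothetical.

```python
import os
import tempfile


def default_gof_output_path(filename="gof.xml"):
    """Build a default output path inside the platform's temp folder."""
    return os.path.join(tempfile.gettempdir(), filename)


def resolve_output_path(configured_path=None):
    """Use the configured path if given, else the portable default.

    Fails early with a clear message when the parent folder is missing,
    rather than letting the external tool exit with a bare error code.
    """
    path = configured_path or default_gof_output_path()
    parent = os.path.dirname(os.path.abspath(path))
    if not os.path.isdir(parent):
        raise FileNotFoundError(f"Output folder does not exist: {parent}")
    return path


print(resolve_output_path())
```

Failing early with the missing folder's name makes this kind of misconfiguration easier to diagnose than an exit status of 1.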

Also, the notebook makes a lot of assumptions in the configuration files that the Kaiaulu directory would be in C: drive and under Desktop (which is frankly an oversight). As I have noticed, your point of configuration files not being universal makes it a hassle for users to run notebooks- hence why you have given us the task of refactoring the directories and configuration files as a group.

@carlosparadis
Member

Kaiaulu is not intended to run on Windows (see https://github.com/sailuh/kaiaulu?tab=readme-ov-file#installation). Are you saying on OS X or Ubuntu the /tmp/ folder does not exist? Otherwise the premise doesn't apply.

> Also, the notebook makes a lot of assumptions in the configuration files that the Kaiaulu directory would be in C: drive and under Desktop (which is frankly an oversight). As I have noticed, your point of configuration files not being universal makes it a hassle for users to run notebooks- hence why you have given us the task of refactoring the directories and configuration files as a group.

I'm somewhat confused by this. Which file points to the C: drive?


With all this being said, the concern in this notebook is less about the filepaths, especially if you are trying to run on an operating system Kaiaulu is not intended for. The point is to test pattern4.jar. The limitation found before was with pattern4.jar itself being able to parse the XML.

I remain unclear on whether the tool works on the 3 projects put on Google Drive or on any other. How Kaiaulu uses pattern4.jar is the less interesting part. It is just a system call, filepaths, and file parsing.

Have you tested pattern4.jar yet on the projects placed on Drive?

@RavenMarQ
Collaborator Author

Yes, the pattern4.jar works as intended- if I forgot to communicate that.

@carlosparadis
Member

Great, could you then pick any Java project, compile it, and generate the XML as a final test? Thanks!

@carlosparadis
Member

carlosparadis commented Oct 25, 2024

Also, please update the first comment. It still says placeholder!

@RavenMarQ
Collaborator Author

I have tested the notebook with all three of the projects you provided. Do you want me to find one more Java project, compile it, and test it just to make sure?

@carlosparadis
Member

@RavenMarQ That is correct. One of the main issues with the prior attempt to use pattern4 was a) the pain to compile the java project to be able to use it, and b) pointing to said compiled project required some maneuvering on where the files should be placed.

I believe on G. Drive, the folders were already placed in a manner for that to work. Take a look on the respective G. Drive conf files of the projects and there should be a comment that explains what had to be done for pattern4 to work.

E.g. JHotDraw 5.1 config has this note:

compile_note: >
  1. You will need to compile the project into class files by: find . -name "*.java" -exec javac {} \;
  2. Again, the path that you put in has to be exactly one level above all subdirectories that contain class files. e.g. /path/to/jhotdraw/JHotDraw5.1/sources/CH/ifa/draw/
  3. Run command: sudo java -Xms32m -Xmx512m -jar pattern4.jar -target "/path/to/jhotdraw/JHotDraw5.1/sources/CH/ifa/draw/" -output "jhot_output.xml"

Meanwhile Junit 3.7 has this:

compile_note: >
  1. You will need to extract the class files from the junit.jar file.
     Specify the root path for pattern4.jar (Tsantalis tool) exactly one level above the subdirectories that contain the class files,
     e.g. instead of /path/to/junit/junit3.7/, put down /path/to/junit/junit3.7/junit
  2. Make sure to have the right permissions configured, because without sudo in front of the command, it will throw an illegal argument error due to lack of file access permission.

That's what I mean by maneuvering for it to work. So picking a random project, trying to figure out how to compile, then trying to get pattern4 to run is the "test run" I want you to try to see if it is too hard (then we can try the notebook).
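For reference, the run command from the JHotDraw note can be assembled programmatically. This is a minimal Python sketch rather than the Kaiaulu function: only the `java` flags and the `-target`/`-output` options are taken from the note above, the helper name `build_pattern4_command` is hypothetical, and `sudo` is left out on the assumption that the user already has read access to the class files.

```python
import shlex


def build_pattern4_command(jar_path, target_dir, output_xml,
                           min_heap="32m", max_heap="512m"):
    """Return the argument list for running pattern4 on compiled classes."""
    return [
        "java", f"-Xms{min_heap}", f"-Xmx{max_heap}",
        "-jar", jar_path,
        "-target", target_dir,   # folder one level above the class packages
        "-output", output_xml,   # XML file the tool should write
    ]


cmd = build_pattern4_command(
    "pattern4.jar",
    "/path/to/jhotdraw/JHotDraw5.1/sources/CH/ifa/draw/",
    "jhot_output.xml",
)
# Print a copy-pasteable shell command for manual testing.
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing an argument list instead of a single string avoids quoting problems when the target path contains spaces.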

Lastly, we can fix any documentation you found lacking in the whole process, or any changes to the new project configuration file schema, and that would conclude the task.

@RavenMarQ
Collaborator Author

As discussed in the meeting, we will take the class files from the Java project I found and test running them against the pattern4 jar.
I will be testing which file structure configurations the jar can read:

  1. The default directory structure produced by compiling the project
  2. A two-level directory: a head directory with one sub-directory containing all the class files
  3. A one-level directory: a single folder containing all the class files

If all three fail to generate a file, it may be the fault of the project I chose. In that case I will search for a more advanced project that makes use of GoF motifs and retry the testing.

Afterwards, I will get this running in the notebook and put effort into improving the documentation in the corresponding notebook.

@RavenMarQ
Collaborator Author

RavenMarQ commented Oct 29, 2024

Since the previous Java project was insufficient to get a reasonable .xml size, I opted to use this project instead. After compiling, I tested the file structure configurations: I compared the two XML files from configuration styles 1 and 2 and found that the raw pattern4 jar can read all class files regardless of their depth in the directory structure. I know this because the original build folder had classes nested 8 folders deep in separate grand^8-parent folders. Given that, I did not test variant 3, which was a last resort in case the jar couldn't read the files.

Thus, it is safe to say that a user can simply input the raw compiled folder they have of all the classes and the jar will be able to read it.
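Before pointing the jar at a build folder, it can help to confirm that the folder actually contains compiled classes at some depth. The following Python sketch is a hypothetical pre-check, not part of pattern4 or Kaiaulu; the helper name `list_class_files` and the demo folder layout are made up for illustration.

```python
import tempfile
from pathlib import Path


def list_class_files(root):
    """Return all .class files under root, at any depth, as relative paths."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.class"))


# Demo on a throwaway folder that mimics a deeply nested build output.
with tempfile.TemporaryDirectory() as tmp:
    deep = Path(tmp) / "build" / "classes" / "java" / "main" / "org" / "example"
    deep.mkdir(parents=True)
    (deep / "Shape.class").touch()
    (Path(tmp) / "Main.class").touch()
    (Path(tmp) / "README.txt").touch()  # ignored: not a class file
    found = list_class_files(tmp)

print(len(found), "class files found")
for rel in found:
    print(" ", rel)
```

An empty list means the folder holds no compiled output, which would explain an XML with no detections before the tool is even involved.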

Afterwards, I ran it in the notebook and it worked fine, once again after applying the fixes to point it at where I needed it to. Going through the notebook, to help future users get it to run, I would suggest stating where to find the configuration file it's pulling from. That way users don't have to scroll through and read all the yml files to change where the jar is located, what folder they want to save to, what XML file they want to output to, etc. Other than that, the pattern4 jar is actually functional and runs properly.

@carlosparadis
Member

carlosparadis commented Oct 29, 2024

@RavenMarQ could you do a PR with the fixes you are proposing in the notebook so I can review? @rnkazman it seems this can be a path forward after all!

@RavenMarQ could you share the XML? You can place it on the drive. Also, could you paste in this issue where you found the instructions to compile and what you had to use in the end? We may want to add that as an example in the notebook narrative too, since not a lot of people are used to compiling. I am assuming you used Maven?

@RavenMarQ
Collaborator Author

The PR is created, and the XML is in our shared folder: the Files folder inside it should have the output.xml. As for compilation, for the first proposed project I used Maven; however, the project I found had a Gradle builder that compiled the classes for me.

@carlosparadis
Member

@RavenMarQ sounds good, I will take a look. Does the notebook reference the URL for the Gradle builder? Could you link it here too?

@RavenMarQ
Collaborator Author

I don't understand what you're asking me, but what I did essentially:
Using my IDE, IntelliJ, which allows for the usage of building using either Maven or Gradle, the project that I linked prior had a Gradle builder in it. Simply linking the Gradle builder and building the project as usual in IntelliJ, it compiled and built the classes for me.

@carlosparadis
Member

carlosparadis commented Oct 30, 2024

You are assuming the reader is familiar with what a Gradle build, IntelliJ, and Maven are. Imagine if we did the same with Scitools.

If this is not clear, take a few screenshots of how you did and paste here and we can go from there. You have an issue already for the next task?

@RavenMarQ
Collaborator Author

RavenMarQ commented Oct 30, 2024

I just had assumed that if the reader wanted to analyze their own java project, they would have an understanding on compiling a Java project. Even now, I still don't understand what I'm doing and had spent hours configuring IntelliJ years ago in order for my IDE to do this. To be honest, I don't think I would be able to give an in-depth tutorial for helping users be able to compile a project from scratch. If this is within the issue scope, then I would be willing to take more time to use a VM or new computer to simulate not having any tools and make a step-by-step of getting to where I did in order to extract the class files from the project.

We also have the issues for my next two tasks up, and I was going to start drafting a proposed specification in the executables task.

@carlosparadis
Member

I suggest you move on to the next task and we can align on call to sync on this issue.

@daomcgill daomcgill linked a pull request Oct 30, 2024 that will close this issue
@RavenMarQ
Collaborator Author

RavenMarQ commented Nov 7, 2024

To assist the user in getting the notebook to run and possibly gaining experience compiling projects, would it be alright if I used the files that you provided (or the link to the original sources)? Namely the tsantalis folder in our shared drive folder.

I have only two projects which I used to test the notebook, but I simply just want to make sure the notebook user doesn't give up on compiling with Gradle or Maven (aka Apache Maven).

@carlosparadis
Member

I guess it depends on how you want to "use" the files. Are you suggesting you will version the code here? Or will you explain to the user how to compile said project?

Did you check the other Apache READMEs to see how they compile, and did you do the checklist? I was expecting that to be the next comment for us to discuss.

@RavenMarQ
Collaborator Author

Pushed out the changes for the notebook. Please do let me know if I overlooked something.

@carlosparadis carlosparadis assigned beydlern and unassigned RavenMarQ Nov 11, 2024
@carlosparadis carlosparadis added this to the ics496-fall24-m2 milestone Nov 11, 2024
@RavenMarQ RavenMarQ assigned RavenMarQ and unassigned beydlern Nov 11, 2024