Aland #2

Open
wants to merge 50 commits into
base: main
f76fc6a
Update README.md with Aland's answers and codes
alandastudillo Oct 3, 2023
626d95f
add jupyter notebooks with Python code
alandastudillo Oct 3, 2023
a7b5150
Update README.md
alandastudillo Oct 3, 2023
e06ba2a
Update README.md
alandastudillo Oct 4, 2023
81d0105
Update README.md
alandastudillo Oct 4, 2023
3e62bb6
add new updated files
alandastudillo Oct 4, 2023
24b9307
Delete ResearchGraph4Neo4j2.ipynb
alandastudillo Oct 4, 2023
a6e3b34
Delete test1_json2.ipynb
alandastudillo Oct 4, 2023
a48ea67
Update README.md
alandastudillo Oct 4, 2023
4e95ea4
Update README.md
alandastudillo Oct 4, 2023
5aeb837
Update README.md
alandastudillo Oct 4, 2023
e74c8fe
Update README.md
alandastudillo Oct 4, 2023
2f5b27c
Update README.md
alandastudillo Oct 4, 2023
44d703e
Update README.md
alandastudillo Oct 4, 2023
daf3c0a
file with additional details, information, resources
alandastudillo Oct 4, 2023
0010062
Update README.md
alandastudillo Oct 4, 2023
2430e4a
new version of the transform_big_json notebook
alandastudillo Oct 4, 2023
e96343c
Update README.md
alandastudillo Oct 4, 2023
858147c
Update README.md
alandastudillo Oct 4, 2023
de0c2f1
add an additional image
alandastudillo Oct 4, 2023
de57337
Update README.md
alandastudillo Oct 4, 2023
7fe4a8d
Create readme.md
alandastudillo Oct 6, 2023
97ff0d6
add images to imag folder
alandastudillo Oct 6, 2023
9f10168
Create readme.md
alandastudillo Oct 6, 2023
a1ee73f
add examples file
alandastudillo Oct 6, 2023
effe98b
Update README.md
alandastudillo Oct 6, 2023
cc38efa
Delete example_papers_graph.png
alandastudillo Oct 6, 2023
7661f22
Delete example_papers_graph0.png
alandastudillo Oct 6, 2023
6bf1725
Delete neo4j_examples.txt
alandastudillo Oct 6, 2023
77579e9
Create readme.md
alandastudillo Oct 6, 2023
93aed03
add notebooks
alandastudillo Oct 6, 2023
74db69b
Delete notes directory
alandastudillo Oct 6, 2023
339088d
Create readme
alandastudillo Oct 6, 2023
e31a5c4
add j notebooks
alandastudillo Oct 6, 2023
b23f56c
add images
alandastudillo Oct 6, 2023
3f2d141
add new j notes
alandastudillo Oct 6, 2023
2335019
Update README.md
alandastudillo Oct 6, 2023
23c62b2
Update README.md
alandastudillo Oct 6, 2023
d2d4680
Update README.md
alandastudillo Oct 6, 2023
d367313
add modularity report
alandastudillo Oct 6, 2023
173bb2d
Delete ResearchGraph4Neo4j3.ipynb
alandastudillo Oct 6, 2023
f17ef4a
Delete Transform_BIG_JSON.ipynb
alandastudillo Oct 6, 2023
3f4905d
Delete explore_JSON.ipynb
alandastudillo Oct 6, 2023
926b970
Update README.md
alandastudillo Oct 6, 2023
3637821
Update README.md
alandastudillo Oct 6, 2023
1f2c61a
add cito image
alandastudillo Oct 6, 2023
d5fb1a2
Update README.md
alandastudillo Oct 6, 2023
5b233b9
Update README.md
alandastudillo Oct 10, 2023
f21a9a7
Update README.md code for creating nodes for papers
alandastudillo Oct 10, 2023
0cf98a6
Add files via upload
alandastudillo Oct 12, 2023
Update README.md
alandastudillo authored Oct 4, 2023

Verified

This commit was signed with the committer’s verified signature.
snyk-bot Snyk bot
commit 2f5b27c3ea8a7764d3f28485854032150f5d8c22
4 changes: 2 additions & 2 deletions README.md
@@ -43,7 +43,7 @@ Activities

The JSON file (after unzip) has a size of around 4.8 GB. This big JSON file cannot be directly load into memory or by JSON package even in Python. After carefully consider the options to load this big JSON, considering the use of Neo4j + APOC, and the limited resources of my local machine, I decided to go for a very non optimal way. I split the big JSON into smaller JSON files, and then to load the data by chunks in the Neo4j DB (Desktop version 5.3.0 + APOC), and using Python 🐍 to iterate through the files and queries, in order of getting the chunks of JSON to populate the graph DB.

- The file **Transform_BIG_JSON.ipynb** shows the Python 🐍 code to get chunks of JSON from a big JSON file. The size of the chunks can be configurable.
+ The file **Transform_BIG_JSON.ipynb** [here](/Transform_BIG_JSON.ipynb) shows the Python 🐍 code to get chunks of JSON from a big JSON file. The size of the chunks can be configurable.
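
For readers without the notebook at hand, the chunking idea described above can be sketched roughly as follows. This is not the code from **Transform_BIG_JSON.ipynb**. It is a minimal standard-library sketch that assumes the big file is a single top-level JSON array whose items are objects. The function names, the `chunk_NNNNN.json` file naming, and the default sizes are all hypothetical.

```python
import json
from pathlib import Path
from typing import Iterator, TextIO


def iter_array_items(fp: TextIO, buf_size: int = 1 << 16) -> Iterator[dict]:
    """Yield the items of a top-level JSON array one by one.

    Assumption: every item is a JSON object, so a partially read item
    always fails to decode and triggers another read.
    """
    decoder = json.JSONDecoder()
    buf = ""
    started = False
    while True:
        buf = buf.lstrip()
        if not started:
            if not buf:
                data = fp.read(buf_size)
                if not data:
                    return  # empty file
                buf += data
                continue
            if buf[0] != "[":
                raise ValueError("expected a top-level JSON array")
            buf = buf[1:]
            started = True
            continue
        # Drop whitespace and the commas that separate array items.
        buf = buf.lstrip(" \t\r\n,")
        if buf.startswith("]"):
            return  # end of the array
        try:
            item, end = decoder.raw_decode(buf)
        except json.JSONDecodeError:
            # The buffer holds an incomplete item: read more and retry.
            data = fp.read(buf_size)
            if not data:
                if buf.strip():
                    raise  # truncated or malformed input
                return
            buf += data
            continue
        yield item
        buf = buf[end:]


def write_chunks(src: Path, out_dir: Path, chunk_size: int = 1000) -> list[Path]:
    """Split the array in `src` into files holding at most `chunk_size` items."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []
    batch: list[dict] = []

    def flush() -> None:
        # Hypothetical naming scheme: chunk_00001.json, chunk_00002.json, ...
        path = out_dir / f"chunk_{len(written) + 1:05d}.json"
        path.write_text(json.dumps(batch), encoding="utf-8")
        written.append(path)

    with src.open(encoding="utf-8") as fp:
        for item in iter_array_items(fp):
            batch.append(item)
            if len(batch) == chunk_size:
                flush()
                batch = []
    if batch:
        flush()
    return written
```

Calling `write_chunks(Path("big.json"), Path("chunks"), chunk_size=1)` would give one article per file, which corresponds to the "1 item per JSON" case mentioned in the README. Only one read buffer and one batch are held in memory at a time, which is the point of the approach on a machine with limited resources.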

It was considered to get 3 types of nodes:
- AUTHOR
@@ -53,7 +53,7 @@ and 2 types of relationships:
- WRITTEN_BY: PAPER -[WRITTEN_BY]-> AUTHOR
- IS_PART_OF: AUTHOR -[IS_PART_OF]-> ORGANISATION

- In the case of 1 item (article) per JSON the reading and creation of nodes is done. For each JSON file ( eg. paper_1.json), i.e., for each article, the following procedures were performed using Cypher:
+ In the case of 1 item (article) per JSON the reading and creation of nodes is done. For each JSON file (eg. paper_1.json), i.e., for each article, the following procedures were performed using Cypher:

To create the nodes for PAPERS: