add import directly from graphml and json without transforming to csv first. #27

Open
edatarch opened this issue Apr 10, 2019 · 6 comments

Comments

@edatarch

No description provided.

@beebs-systap
Member

We plan to support this when we move to Apache TinkerPop 3.4, which deprecates the io() method on the Graph and moves it to the GraphTraversalSource.

@edatarch
Author

One limitation I see with the CSV format is that I cannot have varying properties for the vertices in the same file. The CSV load expects each property to be a column, with the values in the rows. This means I need to group vertices that share the same properties and create a separate file for each group. The same is true for edges.

The JSON or graph structure allows for varying properties. Would the GraphTraversalSource support this specific use case?

@beebs-systap
Member

@edatarch With the CSV format the header must include all possible properties, but you can have varying properties for the vertices in the same file. For a given vertex, any property that is not present can simply be left without a value. There's a sample of the TinkerPop graph below with different properties per node.

The GraphTraversalSource will support JSON or GraphML loaded via the io() methods.

~id,~label,name:string,lang:string,age:int
1,person,,,29
2,person,vadas,,
3,software,,,
4,person,josh,,32
5,software,ripple,java,
6,person,peter,,35
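
As a rough illustration of the point above, here is a minimal Python sketch that flattens vertices with varying properties into a single CSV whose header lists every possible property. The function name `vertices_to_csv`, the input dictionary shape, and the `property_types` mapping are assumptions made for this example only; they are not part of any Neptune or TinkerPop tooling.

```python
import csv
import io


def vertices_to_csv(vertices, property_types):
    """Render vertices with varying properties as one CSV string.

    `vertices` is a list of dicts with "id", "label" and an optional
    "properties" dict (a hypothetical shape chosen for this sketch).
    `property_types` maps every possible property name to its type suffix.
    """
    # The header must include all possible properties, as described above.
    header = ["~id", "~label"] + [
        f"{name}:{type_name}" for name, type_name in property_types.items()
    ]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for vertex in vertices:
        props = vertex.get("properties", {})
        # A property the vertex does not have is written as an empty cell.
        cells = [props.get(name, "") for name in property_types]
        writer.writerow([vertex["id"], vertex["label"]] + cells)
    return out.getvalue()


vertices = [
    {"id": 2, "label": "person", "properties": {"name": "vadas"}},
    {"id": 5, "label": "software", "properties": {"name": "ripple", "lang": "java"}},
    {"id": 6, "label": "person", "properties": {"name": "peter", "age": 35}},
]
property_types = {"name": "string", "lang": "string", "age": "int"}

csv_text = vertices_to_csv(vertices, property_types)
print(csv_text)
```

With this input, the rows for vertices 2, 5 and 6 come out the same as in the sample above, so a single file can hold both `person` and `software` vertices.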

@RyanBeatty

Direct export to something like GraphSON would be lovely :)

Context: I am exploring a graph database storage solution that looks like the following. AWS Neptune instances serve OLTP queries. A daily (?) automated export converts the AWS Neptune graph to GraphSON. The graph in GraphSON form is then used as input to a Hadoop-Gremlin graph running on AWS EMR for OLAP queries.

At the moment it looks like I could use export-pg to dump things into csv or json and then do some data transformation on my end to get it into GraphSON. If I could avoid doing that work, that would be awesome :)

@beebs-systap
Member

@RyanBeatty We can look at the GraphSON format in the future. We also have some other plans that may be relevant to your use case. Happy to discuss in more detail.

@iansrobinson

@RyanBeatty

@beebs-systap Thanks for the quick response! Would definitely appreciate any feedback or ideas for my use case here. I've sent you an email to take the discussion off this thread.

3 participants