Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add method to directly load triples into BasicHDT #25

Closed

Conversation

rubensworks
Copy link
Member

This allows triples to be directly loaded into a BasicHDT without having to read them from a file.

@RubenVerborgh
Copy link
Member

@mielvds @joachimvh @laurensdv This is just what we needed for the management server. Now we can set up a pipeline unzip / clean / etc. without having to materialize.

@RubenVerborgh
Copy link
Member

@LaurensRietveld @wouterbeek You will also like this; support for piping will likely simplify the LOD Laundromat internal pipeline.

@LaurensRietveld
Copy link
Member

Cool stuff Ruben, thanks. This will indeed simplify some parts in our pipeline.
To verify: this PR means piped inputs, but not streamed HDT generation right? I.e., this does not affect the hardware footprint of the HDT generation process?

@rubensworks
Copy link
Member Author

@LaurensRietveld Correct, it works the same way as generating HDT's from an input file, with the difference that this new method accepts a triple iterator instead of a file.

@mielvds
Copy link
Member

mielvds commented May 11, 2016

Before I merge, what is the overlap between the loadTriples and loadDictionary methods? Could this be written shorter and cleaner?

@rubensworks
Copy link
Member Author

@mielvds You're right, I could abstract some things. The only real difference is the ProgressListener parameter. I'll look into it.

@rubensworks
Copy link
Member Author

@mielvds All overlapping code between the loadTriples and loadDictionary methods has been abstracted.

@mielvds mielvds self-assigned this Aug 17, 2016
@mielvds mielvds added this to the Early 2017 milestone Dec 9, 2016
@RubenVerborgh RubenVerborgh force-pushed the master branch 6 times, most recently from 9a8e48d to 532823c Compare January 10, 2017 18:12
@mielvds
Copy link
Member

mielvds commented Jan 11, 2017

@rubensworks I did an attempt at merging this with the current develop branch, but I can't make it compile. There seems to be an issue with the inline functions you declared:

 [HDT] Compiling src/hdt/BasicHDT.cpp
src/hdt/BasicHDT.cpp:273:9: error: expected expression
    return [fileName, baseUri, notation](RDFCallback& loader) {
           ^
src/hdt/BasicHDT.cpp:282:9: error: expected expression
    return [triples](RDFCallback& loader) {
           ^
2 errors generated.

Could you have a look? Result is in https://github.com/rdfhdt/hdt-cpp/tree/rubensworks-feature-loadtriples-single

@rubensworks
Copy link
Member Author

@mielvds Apparently the Makefile isn't using c++11, in which lambda support was introduced.

So either we add -stdlib=libc++ -std=c++11 to the default Makefile flags, which should be supported on most machines nowadays. Or I rewrite this PR so it doesn't use lambda's.

@RubenVerborgh
Copy link
Member

I'd say; try C++11, and if Travis takes it, it's good.

@mielvds
Copy link
Member

mielvds commented Jan 12, 2017

If it doesn't break backwards compatibility, fine by me.
If you make changes to this PR, do it directly in the https://github.com/rdfhdt/hdt-cpp/tree/rubensworks-feature-loadtriples-single branch please.

@RubenVerborgh
Copy link
Member

We can probably remove the gzip dependency after this PR?

@mielvds mielvds modified the milestones: Late 2017, Early 2017 Feb 6, 2017
@RubenVerborgh
Copy link
Member

Could you please rebase this on the latest develop?

@RubenVerborgh
Copy link
Member

How does this compare to #47? (CC: @drobilla)

@RubenVerborgh
Copy link
Member

@rubensworks Based on your comment #77 (comment), how do we proceed with this one?

@rubensworks
Copy link
Member Author

@RubenVerborgh As this PR only adds a new method for streaming triples into BasicHDT, STDIN streaming would have to be implemented on top of that.
Piping can however already be done on Mac and Linux.

So I suggest closing this PR, unless someone would require this functionality for something else.

@RubenVerborgh
Copy link
Member

Agree on closing, thanks!

@mielvds
Copy link
Member

mielvds commented Jan 3, 2018

@rubensworks can I delete the branch?

@LaurensRietveld
Copy link
Member

Or, should we rebase this branch on dev and reopen this pr? I've yet to see anybody to convert a gzipped ntriples file to hdt, despite what the Readme says. And, got the impression @rubensworks that this pr can be used to convert gzipped files to hdt, is that correct?

@LaurensRietveld
Copy link
Member

Sorry, ignore my last message (context: #47 (comment))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants