Dynamic federations (endpoints different for each query) #6

Open
tfrancart opened this issue Feb 14, 2019 · 4 comments

@tfrancart

Hello

I am in a situation where the endpoints participating in a federation are selected at runtime by the user before issuing a query.
This means that I need to create a different federation for each query; however, FederationManager is a singleton and throws an exception when I try to create a second federation while one already exists.
Even if I shutdown() the FederationManager after each query, only one FederationManager can exist at a given time, which would block concurrent queries in my application.

I think I need to use FederationManager removeAll / addAll to update the list of endpoints for each query while keeping the same FederationManager instance. Is that correct?
Do I need to create a new Repository object from the updated FederationManager, and if so, how? Or should I keep the same Repository object and only update the FederationManager?
Sample pseudo-code showing how to deal with this situation would be very welcome!

Thanks a lot

@aschwarte10 aschwarte10 self-assigned this Feb 18, 2019
@aschwarte10 aschwarte10 added this to the 5.2.0 milestone Feb 18, 2019
aschwarte10 pushed a commit that referenced this issue Feb 18, 2019
This change adds support for dynamic federations: by defining the
relevant endpoints in the FedXDataset and passing this to a given Query,
the evaluation is done against those endpoints only.
@aschwarte10
Contributor

@tfrancart in its initial design, FedX is meant to be a static federation of a given set of endpoints. All queries are executed against the federation (in its static configuration), and the engine decides transparently which endpoints are relevant.

But I think your use case makes sense: I have added a feature to FedX that lets you control the endpoints for a given query. Note that the configuration of FedX still contains all possible endpoints as federation members; only in the query definition can you now optionally tell the engine which ones to use for that query.

This can be done using the Dataset construct available in the RDF4J API.

Example:

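// Obtain the query from the FedX repository as usual
TupleQuery tq = ...;
// Wrap the query's current dataset and restrict evaluation to these federation members
FedXDataset ds = new FedXDataset(tq.getDataset());
ds.addEndpoint("myEndpoint");
ds.addEndpoint("otherEndpoint");
tq.setDataset(ds);
TupleQueryResult res = tq.evaluate();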

Let me know if this solves your use case. Also have a look at the tests attached to the linked commit.
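
To make the idea concrete, here is a minimal, self-contained sketch of the selection logic only. It is not FedX code: `StaticFederation` and `membersFor` are hypothetical names, and two behaviours are assumptions of the sketch rather than documented FedX semantics, namely that an empty selection means "use all members" and that an unknown endpoint id is rejected.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not part of the FedX API: a federation whose member
// list is fixed at configuration time and never modified afterwards.
final class StaticFederation {
    private final Set<String> members;

    StaticFederation(List<String> memberIds) {
        this.members = new LinkedHashSet<>(memberIds);
    }

    // Returns the members a single query should be evaluated against.
    // Assumption: an empty selection means "no restriction".
    Set<String> membersFor(Set<String> selectedIds) {
        if (selectedIds.isEmpty()) {
            return new LinkedHashSet<>(members);
        }
        Set<String> result = new LinkedHashSet<>();
        for (String id : selectedIds) {
            // Assumption: selecting an id that is not configured is an error.
            if (!members.contains(id)) {
                throw new IllegalArgumentException("Not a federation member: " + id);
            }
            result.add(id);
        }
        return result;
    }
}
```

Because the configured member list is never mutated and each call returns its own set, concurrent queries with different selections do not interfere with each other in this model, which is the property that per-query removeAll / addAll on a shared instance lacks.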

@tfrancart
Author

Thanks a lot. I will give it a try.
In the meantime, I tried calling removeAll / addAll before executing each query while keeping a single Repository instance, and it seemed to work.

Would that also work if the query itself specified its dataset using FROM clauses corresponding to the SPARQL endpoints?

SELECT ...
FROM <myEndpoint>
FROM <otherEndpoint>
WHERE { ... }
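
For illustration, the endpoint identifiers in such a query could be collected on the application side and handed to whatever per-query selection mechanism is in use. The sketch below is a naive regex-based extraction, assuming the FROM IRIs are exactly the endpoint identifiers. It is not a SPARQL parser (it would also match a FROM inside a string literal or comment), and `FromClauseEndpoints` and `extract` are hypothetical names that belong to neither FedX nor RDF4J.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of FedX or RDF4J.
final class FromClauseEndpoints {
    // Matches "FROM <iri>" but not "FROM NAMED <iri>" (case-insensitive).
    private static final Pattern FROM_IRI =
            Pattern.compile("\\bFROM\\s+<([^>\\s]*)>", Pattern.CASE_INSENSITIVE);

    // Returns the IRIs of all plain FROM clauses, in order of appearance.
    static List<String> extract(String sparql) {
        List<String> ids = new ArrayList<>();
        Matcher m = FROM_IRI.matcher(sparql);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }
}
```

Each returned identifier could then be passed to a per-query mechanism such as the `ds.addEndpoint(...)` calls shown earlier in this thread.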

@aschwarte10
Contributor

@tfrancart the idea of using FROM absolutely makes sense, but I need to give it some more thought. Looking at the SPARQL standard, it would be a rather free interpretation of the keyword: FROM is meant to refer to graphs, whereas here I would have to use it to identify federation members. If you think of these as a graph, the standard definitions are met...

@tfrancart
Author

"If you think of these as a graph, the standard definitions are met..."

Yep, that's exactly my situation: I query a "virtual graph" composed of the union of graphs that are actually the members of my federation. The federation itself is a SPARQL endpoint.
