basic collection records endpoint #195

lukavdplas · 2024-07-23T12:57:27Z

This implements a basic endpoint to view the records inside a collection.

The current response will contain the URIs of records in the collection, but does not yet contain the contents of records. It also doesn't include pagination. Unlike with catalogue results, that doesn't seem immediately necessary.

The view will return a graph with an ActivityStreams Collection. Unlike the catalogue search endpoint, this isn't an OrderedCollection, because collections are not ordered, but it should be doable to reuse some of the same views.
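As an illustration of the response shape described above, here is a minimal sketch of an ActivityStreams Collection that lists only record URIs. It assumes a JSON-LD rendering; the PR does not say which serialization the view uses. The function name and the example.org URIs are hypothetical.

```python
import json

AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def collection_response(collection_uri: str, record_uris) -> dict:
    """Describe a collection as an unordered ActivityStreams Collection.

    Hypothetical helper: only record URIs are included, not record contents,
    and there is no pagination.
    """
    # Sorting is only for reproducible output; a Collection carries no order.
    uris = sorted(set(record_uris))
    return {
        "@context": AS_CONTEXT,
        "id": collection_uri,
        "type": "Collection",  # not OrderedCollection
        "totalItems": len(uris),
        "items": uris,
    }


response = collection_response(
    "https://example.org/collections/1",
    ["https://example.org/records/b", "https://example.org/records/a"],
)
print(json.dumps(response, indent=2))
```

An ordered result, as in the catalogue search endpoint, would use the type OrderedCollection with orderedItems instead.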

jgonggrijp · 2024-11-06T15:14:49Z

@lukavdplas is this finished? Shall I review it?

lukavdplas · 2024-11-06T15:35:25Z

Yes, I think it's ready for review.

jgonggrijp

Shouldn't the collection-records endpoint also produce the contents of the records in the given collection? Otherwise the frontend would have to issue additional requests, potentially one per record, in order to fetch enough data to display the overview table.
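A rough sketch of the single-request alternative raised here: the response carries the triples of every record in the collection alongside the collection's own triples. Graphs are modelled as plain sets of (subject, predicate, object) tuples rather than rdflib graphs, and all names and URIs are hypothetical.

```python
def include_record_contents(collection_part: set, store: set, record_uris) -> set:
    """Return the collection triples plus all triples whose subject is a record.

    Hypothetical helper; `store` stands in for wherever record data lives.
    """
    records = set(record_uris)
    record_part = {triple for triple in store if triple[0] in records}
    return set(collection_part) | record_part


store = {
    ("rec:a", "dc:title", "First record"),
    ("rec:b", "dc:title", "Second record"),
    ("rec:c", "dc:title", "Not in this collection"),
}
collection_part = {("coll:1", "as:items", "rec:a"), ("coll:1", "as:items", "rec:b")}

full_response = include_record_contents(collection_part, store, ["rec:a", "rec:b"])
```

With this shape the frontend can build the overview table from one response instead of issuing one request per record.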

jgonggrijp · 2024-11-06T18:07:23Z

backend/collect/graphs.py

+
+    items = list(graph.objects(list_node, RDF.first))
+    rest_nodes = graph.objects(list_node, RDF.rest)
+    for rest in rest_nodes:


Is rest a single-element iterator? Otherwise, I find the for loop a little bit alarming in combination with the recursion.

From what I can tell, there should be only one rest to a list, but nothing will enforce that on a technical level. (The specification of rest does not mention such a restriction, though.)

Anyway, if that happens, this solution is to concatenate the different lists, but another option would be to pick one and ignore the rest.

Now that you mention it, the recursive definition is a bit problematic since you can easily define circular lists in RDF - which would lead to infinite recursion. This can also happen if you drop all but one rest element.

I don't think this is a huge issue here, though, since edpop collections are created by the application itself. It would be an issue with a generic collection reader, though.

Ideally this algorithm should use a set membership check to prevent an infinite loop. Up to you whether you want to implement that on this branch or defer it to a new issue.
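The set membership check could look roughly like the sketch below. It is not the code on this branch: the function name is hypothetical, and TinyGraph is a stand-in that only mimics the objects(subject, predicate) lookup used in the diff, so that the example runs without rdflib. Multiple rest nodes are concatenated, as in the current solution.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF_NS + "first", RDF_NS + "rest", RDF_NS + "nil"


class TinyGraph:
    """Hypothetical stand-in exposing only the objects() lookup used below."""

    def __init__(self, triples):
        self.triples = set(triples)

    def objects(self, subject, predicate):
        # Sorted for a reproducible example; a real store guarantees no order.
        matches = (o for s, p, o in self.triples if s == subject and p == predicate)
        return iter(sorted(matches, key=str))


def list_items_safe(graph, list_node) -> list:
    """Collect rdf:first values reachable via rdf:rest, visiting each node once."""
    items = []
    visited = set()
    pending = [list_node]  # explicit stack instead of recursion
    while pending:
        node = pending.pop()
        if node == RDF_NIL or node in visited:
            continue  # end of the list, or a node already handled (cycle)
        visited.add(node)
        items.extend(graph.objects(node, RDF_FIRST))
        # Reverse so that the first rest node is popped, and so processed, first.
        pending.extend(reversed(list(graph.objects(node, RDF_REST))))
    return items


# A circular list: n1 -> n2 -> n1. Plain recursion would never terminate here.
circular = TinyGraph([
    ("n1", RDF_FIRST, "rec:a"), ("n1", RDF_REST, "n2"),
    ("n2", RDF_FIRST, "rec:b"), ("n2", RDF_REST, "n1"),
])
print(list_items_safe(circular, "n1"))  # ['rec:a', 'rec:b']
```

Because the function only calls graph.objects(subject, predicate), the same loop should carry over to an rdflib Graph if the string constants are replaced by RDF.first, RDF.rest and RDF.nil.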

jgonggrijp · 2024-11-06T18:13:29Z

backend/collect/graphs.py

+    return g
+
+
+def collection_triples(graph: Graph, list_node: IdentifiedNode) -> Triples:


Doesn't this function do exactly the same thing as list_from_graph_collection?

Oh, I see the difference. This function returns full triples, the other only bare objects. Maybe just use the result from list_from_graph_collection? You know that the subject is always list_node, that the first predicate is RDF.first and that the remaining predicates are all RDF.rest. Saves some duplicate logic.

The rationale here was that the graph may be implemented in a non-standard way, such as having multiple first or rest elements for a list node. If we're deleting or changing a collection, it's relevant to select the triples that are currently stored, rather than reconstruct how they should be stored.

You could compare these results to see the difference:

    found_triples = collection_triples(g, node)
    items = list_from_graph_collection(g, node)
    reconstructed_collection = list_to_graph_collection(items, node)
    normalized_triples = all_triples(reconstructed_collection)

The second is essentially what you're proposing, which will yield the same result if the list is stored the way you would expect. (If you use model.save(), it will replace the found_triples with the normalized_triples, by the way.)

So in theory yes, these would be the same, but they might not be and these functions have different uses in that scenario. I'll add a comment to explain this.
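To make the difference concrete, here is a small self-contained sketch with a list node that has two rdf:first values. It uses plain tuples instead of an rdflib graph; the helper name and the node naming scheme ("list/1") are hypothetical and only serve the illustration.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST, RDF_REST, RDF_NIL = RDF_NS + "first", RDF_NS + "rest", RDF_NS + "nil"


def normalized_list_triples(items, head) -> set:
    """Rebuild a well-formed list: one rdf:first and one rdf:rest per node."""
    triples = set()
    node = head
    for index, item in enumerate(items):
        is_last = index == len(items) - 1
        next_node = RDF_NIL if is_last else f"{head}/{index + 1}"
        triples.add((node, RDF_FIRST, item))
        triples.add((node, RDF_REST, next_node))
        node = next_node
    return triples


# Non-standard storage: a single node carrying two rdf:first values.
stored = {
    ("list", RDF_FIRST, "rec:a"),
    ("list", RDF_FIRST, "rec:b"),
    ("list", RDF_REST, RDF_NIL),
}

# Simplified item extraction, enough for this one-node example.
items = sorted(o for s, p, o in stored if p == RDF_FIRST)
normalized = normalized_list_triples(items, "list")

# Triples that exist in the store but would not be produced by reconstruction.
leftover = stored - normalized
```

Here leftover holds ("list", rdf:first, "rec:b") and ("list", rdf:rest, rdf:nil). Deleting only the reconstructed triples would leave those two behind, which is why selecting the triples as currently stored matters when deleting or changing a collection.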

jgonggrijp · 2024-11-06T20:43:29Z

backend/collect/api.py

+        g += collection_obj._class_triples()
+        g += EDPOPCollection.records._stored_triples(collection_obj)


The convention for methods starting with an underscore is that they are not supposed to be called, or even known, outside their own class. Should these methods be private in the first place, and if so, isn't there some kind of public interface that you can use instead?

Yeah, I think I initially envisioned these as "private" methods but they probably should just be public. You don't really need the method here, though.

If you don't really need it (which of the two, by the way?), then why is it invoked?

jgonggrijp · 2024-12-12T16:03:53Z

Heads-up @lukavdplas: I'm going to rebase and force-push this branch before merging it. Contents will not change but your local reference will go stale. You can realign with git reset --hard if desired.

jgonggrijp reviewed Nov 6, 2024

jgonggrijp mentioned this pull request Dec 11, 2024

Import catalogus data into collection #141

Open

jgonggrijp changed the base branch from feature/collection-api to develop December 12, 2024 16:11

lukavdplas added 6 commits December 12, 2024 17:18

scaffold collection records endpoint

ef53b5e

save collection records as activityStreams collection

0a3d961

collection records endopint

9618451

docstrings + formatting

860349c

clarity

55a2079

don't use private method in view

fbd22f7

jgonggrijp force-pushed the feature/collection-records-api branch from 30fe3dd to fbd22f7 Compare December 12, 2024 16:18

jgonggrijp merged commit 4427cee into develop Dec 12, 2024
3 checks passed
