Include project id in zipped export #5156

Open
jwijffels opened this issue Nov 11, 2024 · 9 comments

Comments

@jwijffels

jwijffels commented Nov 11, 2024

Is your feature request related to a problem? Please describe.
I'd like to have the project id next to the project name when exporting the project as UIMA CAS XMI (XML 1.0)

Describe the solution you'd like
When exporting the project as a zipped file, currently the project id is not available anywhere.
It appears to be available only in the CAS metadata, namely in the document base URI (which looks like repository/project/6/document/828/source), and only once a document has been annotated. When no documents have been annotated, it is not available at all.

baseuri = cas.select('de.tudarmstadt.ukp.dkpro.core.api.metadata.type.DocumentMetaData')[0].documentBaseUri
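For reference, the id can be pulled out of that base URI with a small regular expression. This is only a sketch: the pattern is inferred from the single example URI above, and the helper name `project_id_from_base_uri` is made up for illustration.

```python
import re
from typing import Optional

# Pattern inferred from the example URI in this issue:
#   repository/project/6/document/828/source
_PROJECT_ID = re.compile(r"(?:^|/)project/(\d+)(?:/|$)")


def project_id_from_base_uri(base_uri: Optional[str]) -> Optional[int]:
    """Return the numeric project id embedded in a documentBaseUri, or None."""
    if not base_uri:
        # No annotated document means no base URI, which is the problem described above.
        return None
    match = _PROJECT_ID.search(base_uri)
    return int(match.group(1)) if match else None


print(project_id_from_base_uri("repository/project/6/document/828/source"))  # 6
```

The function returns `None` instead of raising, so a project without annotated documents can be detected and handled separately.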

Describe alternatives you've considered
I could get it through the Python client (pycaprio): first list the projects, then download a zipped export for a specific project. However, I would prefer that the export from the frontend also included the project id in the zip file.
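A rough sketch of that client-side alternative, assuming pycaprio's `client.api.projects()` and `client.api.export_project(...)` calls and a `project_id` attribute on the returned project objects. The function name and the `project_<id>.zip` naming scheme are hypothetical.

```python
from pathlib import Path


def export_projects_by_id(client, out_dir, project_format=None):
    """Export every project to <out_dir>/project_<id>.zip and return {id: path}."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = {}
    for project in client.api.projects():
        # Assumption: export_project returns the zip archive as bytes.
        if project_format is None:
            data = client.api.export_project(project)
        else:
            data = client.api.export_project(project, project_format=project_format)
        path = out / f"project_{project.project_id}.zip"
        path.write_bytes(data)
        written[project.project_id] = path
    return written


# Usage against a real server (not executed here; host and credentials are placeholders):
# from pycaprio import Pycaprio
# client = Pycaprio("http://localhost:8080", authentication=("remote-user", "password"))
# export_projects_by_id(client, "exports")
```

Putting the id in the file name keeps it attached to the archive even though the export itself does not contain it.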

Additional context
Example screenshot of the json of the project export showing that only the project name is put in the json, not the project id

@reckart
Member

reckart commented Nov 12, 2024

The project ID has no meaning for the exported project because when you import the project again, it gets assigned a new ID. Why do you need it?

@jwijffels
Author

jwijffels commented Nov 12, 2024

That is indeed what I assumed was the reason it is not included.

The context is that I'm using INCEpTION to collect training data for building a NER model.
I have flows using pycaprio which collect the data (by project id), and I would like to get the same structure when a user instead exports the data as a zipped project, before I plug it into the NER modelling process.

@reckart
Member

reckart commented Nov 12, 2024

Have you considered using the project slug instead of the ID? It is at least a bit more stable. The slug would only change on import if there is already another project with the same slug.

@jwijffels
Author

That's indeed an option. Currently I parse the project id out of the documentBaseUri (repository/project/6/document/828/source), but that only gives me the project id when there are annotations or curations.
I could use the project slug when exporting as zip as an alternative to the name, but in pycaprio, the client.api.projects only returns the project name and project id, not the slug.

@reckart
Member

reckart commented Nov 12, 2024

> I could use the project slug when exporting as zip as an alternative to the name, but in pycaprio, the client.api.projects only returns the project name and project id, not the slug.

Right... I believe that is a pycaprio thing, though, and that INCEpTION already returns the slug in the API response (in the field called name). Fortunately, pycaprio is now maintained here and we can do new releases of it. Would you like to look into exposing the name field as the slug in pycaprio?

https://github.com/inception-project/pycaprio
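To illustrate the point about the `name` field, here is a minimal sketch of how a client could treat it as the slug. It assumes each project entry in the API response is a JSON object with `id` and `name` keys. The `ProjectRef` class, the `project_refs` helper, and the sample entry are all made up for illustration and are not part of pycaprio.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class ProjectRef:
    """Hypothetical holder for the two identifiers discussed in this thread."""
    project_id: int
    slug: str


def project_refs(body: List[Dict[str, Any]]) -> List[ProjectRef]:
    """Map project entries (dicts with "id" and "name") to ProjectRef objects."""
    # Per the comment above, the "name" field is assumed to carry the slug.
    return [ProjectRef(project_id=int(p["id"]), slug=str(p["name"])) for p in body]


# Made-up sample entry, shaped after the description in this thread.
sample = [{"id": 6, "name": "ner-training"}]
print(project_refs(sample))
```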

@jwijffels
Author

Ah, so the name element in client.api.projects in pycaprio is the slug?
Good, I'll dig into pycaprio to see what the API response really returns.

@reckart
Member

reckart commented Nov 12, 2024

This is the data transfer object on the Java side where you can see how the project information is mapped to the JSON response:

https://github.com/inception-project/inception/blob/main/inception/inception-remote/src/main/java/de/tudarmstadt/ukp/clarin/webanno/webapp/remoteapi/aero/model/RProject.java

@jwijffels
Author

OK, clear, thanks. I'll look at the pycaprio side to incorporate the slug there and restructure my code to work with the slug.

@reckart
Member

reckart commented Nov 12, 2024

I just noticed that the slug in the exported project is always null. This will be fixed in the next bug-fix release.
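Until that fix is released, code that reads an exported zip may want to fall back to the project name whenever the slug is null. The sketch below rests on two assumptions: that the project descriptor is a top-level file whose name starts with `exportedproject` and ends in `.json`, and that it has `name` and `slug` keys. Both helper names are hypothetical.

```python
import json
import zipfile
from typing import Optional, Tuple


def read_project_identity(zip_path) -> Tuple[Optional[str], Optional[str]]:
    """Return (name, slug) from the project JSON inside an exported zip."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the descriptor is a top-level exportedproject*.json file.
        candidates = [
            n for n in archive.namelist()
            if n.startswith("exportedproject") and n.endswith(".json")
        ]
        if not candidates:
            raise FileNotFoundError("no exportedproject*.json in archive")
        with archive.open(candidates[0]) as handle:
            meta = json.load(handle)
    return meta.get("name"), meta.get("slug")


def stable_key(zip_path) -> Optional[str]:
    """Prefer the slug; fall back to the name when the slug is null or empty."""
    name, slug = read_project_identity(zip_path)
    return slug or name
```

`zip_path` can be a file path or a file-like object, since both are accepted by `zipfile.ZipFile`.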
