CUDS to Pandas DataFrames #290

Open
pablo-de-andres opened this issue Jun 22, 2020 · 3 comments
Labels
📈 performance; 🌱 new feature (Solving the issue involves the incorporation of a new feature.); 💬 discussion (The idea is not mature enough to result in an implementation, and needs further discussion.)

Comments

@pablo-de-andres
Member

In GitLab by @yoavnash on Apr 8, 2020, 20:23

Downstream applications, ML tasks in particular, need to export tabular data represented as CUDS objects to Pandas DataFrames and to import it back. How should this be done?

Related to #235 and #258.

Comments:

  1. Why not use SPARQL, since a query already returns its result as a table?
    • Could this approach handle multiple large tables of floats?
    • Not everyone knows or wants to use SPARQL.
  2. It would be very convenient to have an OSP-core function like so: df = to_dataframe(dataset) where dataset is a CUDS object.
  3. One way to support this conversion would be to map column headers to ontology concepts linked in a certain pattern; the rows would then be the individuals that follow that pattern.
  4. For efficiency reasons (time and space), it makes sense to store the tabular data as a dataframe, and not as regular CUDS objects. However, the user should be oblivious to this.
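
Points 2 and 3 could be sketched roughly as follows. This is only an illustration under stated assumptions, not OSP-core code: `Node` is a hypothetical stand-in for a CUDS object, the concept names (`DATASET`, `ROW`, `TEMPERATURE`, `PRESSURE`) are made up, and the "pattern" is assumed to be dataset -> row individuals -> one valued child per column concept. Only the pandas calls are real API.

```python
from dataclasses import dataclass, field

import pandas as pd


@dataclass
class Node:
    """Hypothetical stand-in for a CUDS object (NOT the OSP-core class)."""
    concept: str                      # name of the ontology concept
    value: object = None              # payload for leaf individuals
    children: list = field(default_factory=list)


def to_dataframe(dataset, columns):
    """Build one row per child of `dataset`, one column per concept name."""
    rows = []
    for individual in dataset.children:
        # Index the row's children by concept so columns can be looked up.
        by_concept = {c.concept: c.value for c in individual.children}
        # A concept missing from a row becomes None (NaN in numeric columns).
        rows.append({col: by_concept.get(col) for col in columns})
    return pd.DataFrame(rows, columns=columns)


def from_dataframe(df, row_concept="ROW"):
    """Inverse direction: column headers become concepts, rows become individuals."""
    dataset = Node("DATASET")
    for record in df.to_dict(orient="records"):
        cells = [Node(col, val) for col, val in record.items()]
        dataset.children.append(Node(row_concept, children=cells))
    return dataset


# Two row individuals that follow the assumed pattern.
dataset = Node("DATASET", children=[
    Node("ROW", children=[Node("TEMPERATURE", 300.0), Node("PRESSURE", 1.0)]),
    Node("ROW", children=[Node("TEMPERATURE", 310.0), Node("PRESSURE", 1.2)]),
])

df = to_dataframe(dataset, ["TEMPERATURE", "PRESSURE"])
round_trip = to_dataframe(from_dataframe(df), ["TEMPERATURE", "PRESSURE"])
```

Passing the column concepts explicitly keeps the column order deterministic and makes the expected pattern visible at the call site. A real implementation would presumably derive the columns from the ontology, and per point 4 might keep the DataFrame itself as the backing store instead of materialising one object per cell.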
@pablo-de-andres added the 💬 discussion, 🌱 new feature, and 📈 performance labels on Jun 22, 2020
@pablo-de-andres
Member Author

In GitLab by @yoavnash on Apr 8, 2020, 20:26

changed the description

@pablo-de-andres
Member Author

In GitLab by @yoavnash on Apr 8, 2020, 20:43

changed the description


3 participants