Open table in Voyager option producing incorrect data #51

Open
ssharif6 opened this issue Jul 5, 2018 · 6 comments

Comments

@ssharif6
Collaborator

ssharif6 commented Jul 5, 2018

This issue concerns the Open Table in Voyager option.

Here is the code to reproduce this issue.

image

Currently, with the code above, I get this result when I right-click the pandas DataFrame and choose the Open Table in Voyager option.

image

The data for the related views is not correct; it should look more like this (just open cars.json with Voyager):

image

As you can see, the charts differ: the data is inconsistent between the two, when it should be the same. For instance, the Origin vs. Number of Records bar chart produced via the pandas DataFrame route is different from the one produced by opening the JSON file in Voyager.

@ellisonbg
Collaborator

ellisonbg commented Jul 5, 2018 via email

@zzhangjii
Collaborator

Thanks, @ssharif6, I'll look into it. Maybe the underlying DataFrame we extract has a different structure?

@zzhangjii
Collaborator

Hi @ssharif6, it turns out to be an issue with the pandas DataFrame's default display settings.
Unless told otherwise, pandas displays only the first 30 rows and the last 30 rows, so JupyterLab_Voyager only gets these 60 rows of data instead of the full dataset (that's why you see a difference).
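
To see why a head-plus-tail slice skews the aggregated charts, here is a minimal sketch. It uses a made-up DataFrame sorted by `Origin` (not the real cars.json data) and assumes the 30 + 30 row truncation described above; the values and group sizes are purely illustrative.

```python
import pandas as pd

# Synthetic stand-in for the dataset (NOT the real cars.json contents).
df = pd.DataFrame({
    "Origin": ["USA"] * 250 + ["Europe"] * 70 + ["Japan"] * 80,
    "Horsepower": range(400),
})

# What a scraper would see if only the first and last 30 rows were rendered.
visible = pd.concat([df.head(30), df.tail(30)])

# Record counts per Origin, as a bar chart of "Number of Records" would show them.
full_counts = df["Origin"].value_counts().to_dict()
visible_counts = visible["Origin"].value_counts().to_dict()

print(full_counts)
print(visible_counts)
```

In this toy case the visible slice holds 30 USA rows and 30 Japan rows and no Europe rows at all, so any chart built from it cannot match one built from the full data.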

Without major changes, extracting the whole dataset from this 'partial' table is almost impossible. So an easy solution would be to just change the pandas settings:

import pandas as pd

pd.set_option('display.max_columns', None)  # no limit on displayed columns
pd.set_option('display.max_rows', None)     # no limit on displayed rows

In this way, the whole frame is displayed, and 'Open Table in Voyager' will get the correct data.
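
A minimal sketch of what these options change in the HTML that an extension would scrape. It compares the number of `<tr` elements in `df._repr_html_()` under the default settings and with the limits lifted. The DataFrame is synthetic, and `pd.option_context` is used here instead of `pd.set_option` only so that the change stays scoped to one block; that choice is an assumption of this sketch, not something the extension requires.

```python
import pandas as pd

df = pd.DataFrame({"a": range(200), "b": range(200)})

# Default settings: a frame longer than display.max_rows is rendered truncated.
truncated_html = df._repr_html_()

# Lift the limits only inside this block; the previous values are restored on exit.
with pd.option_context("display.max_rows", None, "display.max_columns", None):
    full_html = df._repr_html_()

# Count table rows (header row included) in each rendering.
truncated_rows = truncated_html.count("<tr")
full_rows = full_html.count("<tr")
print(truncated_rows, full_rows)
```

The exact number of rows in the truncated rendering depends on the pandas version and its display options, but it is smaller than the full rendering whenever the frame exceeds `display.max_rows`.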

@dhirschfeld

Is Open Table in Voyager just scraping the HTML output table? If so, I don't think that's an appropriate method to get the underlying data.

You can't simply change the display of the output because printing large DataFrames will then effectively hang your kernel session.

@zzhangjii
Collaborator

True, that's definitely not a good solution. But currently the Jupyter notebook doesn't expose the source dataset in the cell output, so for this extension we don't have an easy way to directly find and access the data unless we modify the notebook itself to add some APIs.
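
For illustration only, here is a rough sketch of one conceivable direction on the kernel side: a wrapper object that publishes the full records next to the usual (truncated) HTML through IPython's `_repr_mimebundle_` hook. The `FullDataFrame` class and the MIME type are hypothetical names made up for this sketch; neither pandas nor this extension provides them, and a frontend would still need code that looks for that MIME type.

```python
import json

import pandas as pd

# Hypothetical MIME type, made up for this sketch; not a registered or existing one.
FULL_DATA_MIME = "application/vnd.example.fulldata+json"


class FullDataFrame:
    """Hypothetical wrapper whose rich output carries every row as JSON."""

    def __init__(self, df):
        self.df = df

    def _repr_mimebundle_(self, include=None, exclude=None):
        # IPython calls this hook, when defined, to collect the MIME bundle of an output.
        return {
            "text/html": self.df._repr_html_(),  # still truncated for display
            FULL_DATA_MIME: json.loads(self.df.to_json(orient="records")),
        }


df = pd.DataFrame({"a": range(200), "b": range(200)})
bundle = FullDataFrame(df)._repr_mimebundle_()
print(len(bundle[FULL_DATA_MIME]))
```

The obvious cost is that the whole dataset gets stored in the cell output, so this would only be reasonable for small frames.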

@dhirschfeld

^^^ I think this is a general design question that Jupyter needs to answer properly.

I guess a big aspect of this is how you communicate the data from Python/R/whatever objects to the frontend in JavaScript. I had thought Arrow might fit the bill there.

IMHO it would be great if the Open in Voyager context menu could be integrated with the variable inspector. They're already doing something similar by providing a phosphor datagrid view for numpy arrays. Although it's not yet an officially supported extension, I think it would make sense for it to be one in the future; it's one of the most requested items from my users.
