Jupyter notebook for Data presentation and plots #41

Closed
ojha-aditya opened this issue Nov 6, 2024 · 10 comments
Labels
Organization question Further information is requested

Comments

@ojha-aditya
Contributor

@laserlab do we need to include the plots separately from the Jupyter notebook in our PR for Data presentation?

@ojha-aditya added the "question" (Further information is requested) label Nov 6, 2024
@ojha-aditya added this to the "Plot Time Domain Methane" milestone Nov 6, 2024
@laserlab
Contributor

laserlab commented Nov 6, 2024

No, this time the plots go inside the notebook. However, all plotting task members are required to work on a single notebook.

@ojha-aditya
Contributor Author

And we need to merge all plots for a task into a single image in the notebook?

@ojha-aditya
Contributor Author

Actually, as no data presentation member has submitted a PR yet, would it be better to push a blank Jupyter notebook now, so that everyone can make changes to it from the beginning instead of each making a file of their own and then having to resolve conflicts, @laserlab? Or maybe we can push the incomplete notebook discussed in #48 right away, just to have a file for everyone to work on?

@ojha-aditya
Contributor Author

ojha-aditya commented Nov 7, 2024

Also, I just found that Python 3.11 did not work well with some libraries, especially pandas, so I switched to 3.12.7. Noting it here so that anyone who runs into a similar problem can see it.
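
For anyone comparing environments, a minimal sketch of a first notebook cell that records the interpreter and library versions might look like the following. The helper name `environment_report` is hypothetical, and `matplotlib` in the default package list is only an assumption about what the plotting notebook uses; the thread itself names only pandas.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def environment_report(packages=("pandas", "matplotlib")):
    """Map 'python' and each package name to its installed version string.

    The default package list is an assumption; pass the libraries the
    notebook actually imports.
    """
    report = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            report[name] = "not installed"
    return report


print(environment_report())
```

Pasting this output next to an error report makes it easier to tell a version mismatch from a genuine bug.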

@ojha-aditya
Contributor Author

I am receiving this error when trying to read in the .json file containing data for all continents.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[4], line 1
----> 1 methane_series = get_timeseries("methane-data-collection/data/methane_data.json")

File /workspaces/CP1-24-HW5/preparation.py:83, in get_timeseries(path)
     61 '''
     62 This function reads json files from the data collection task
     63 and returns a pandas time series with datetime as index and
   (...)
     79 - Pandas Time Series. Index = Datetime, Data = CO2/Methane Concentration
     80 '''
     82 #Extracts data from the json file at the input path
---> 83 data = pd.read_json(path)
     84 #Uses the month and year information from the json file,
     85 # assumes data was taken on the first of each month,
     86 # creates new column with datetime
     87 data['date'] = pd.to_datetime(data[['Year', 'Month']].assign(Day=1))

File ~/.local/lib/python3.12/site-packages/pandas/io/json/_json.py:815, in read_json(path_or_buf, orient, typ, dtype, convert_axes, convert_dates, keep_default_dates, precise_float, date_unit, encoding, encoding_errors, lines, chunksize, compression, nrows, storage_options, dtype_backend, engine)
    813     return json_reader
    814 else:
--> 815     return json_reader.read()

File ~/.local/lib/python3.12/site-packages/pandas/io/json/_json.py:1025, in JsonReader.read(self)
...
   1407         str(k): v
   1408         for k, v in ujson_loads(json, precise_float=self.precise_float).items()
   1409     }

ValueError: Trailing data
Output is truncated.
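
The thread does not establish what caused this error. One common cause of `ValueError: Trailing data` in `pd.read_json` is a file that holds more than one top-level JSON value, for example one object per line or several objects concatenated together. The sketch below uses only the standard library to count the top-level values in a file, which would help decide whether the data file or the reading function needs to change. Both helper names are hypothetical, and nothing here is confirmed about the actual contents of `methane_data.json`.

```python
import json
from pathlib import Path


def split_json_documents(text):
    """Return every top-level JSON value found in `text`, in order."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it by hand.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values


def describe_json_file(path):
    """Hypothetical helper: report how many top-level JSON values a file holds."""
    text = Path(path).read_text(encoding="utf-8")
    count = len(split_json_documents(text))
    if count == 1:
        return "one JSON document"
    return f"{count} JSON documents; anything after the first is trailing data"
```

If the file turns out to hold exactly one object per line, `pd.read_json(path, lines=True)` reads that layout directly. If the objects are concatenated in some other way, the list returned by `split_json_documents` could be passed to `pd.DataFrame` or `pd.concat`, depending on how each object is shaped.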

@laserlab
Contributor

laserlab commented Nov 7, 2024

Check with the author of that file.

@ojha-aditya
Contributor Author

@avgagliardo could you comment on this please? I can't tell whether this needs to be resolved in the data file or in the function definition, or whether I am doing something wrong.

@ojha-aditya
Contributor Author

I basically called the get_timeseries function from preparation.py on the methane_data.json file.

@laserlab
Contributor

laserlab commented Nov 8, 2024

> Actually, as no data presentation member has submitted a PR yet, would it be better to push a blank Jupyter notebook now, so that everyone can make changes to it from the beginning instead of each making a file of their own and then having to resolve conflicts, @laserlab? Or maybe we can push the incomplete notebook discussed in #48 right away, just to have a file for everyone to work on?

There will be conflicts, but yes, a blank file at least fixes the file name.
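
As a rough sketch of the blank-file idea, an empty notebook can be generated with the standard library alone, since an `.ipynb` file is a JSON document. The function name and the example file name below are hypothetical; the thread does not say what the shared notebook should be called.

```python
import json
from pathlib import Path


def write_blank_notebook(path):
    """Write an empty nbformat-4 notebook (no cells, no metadata) to `path`."""
    notebook = {
        "cells": [],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }
    path = Path(path)
    path.write_text(json.dumps(notebook, indent=1) + "\n", encoding="utf-8")
    return path


# Hypothetical file name; use whatever name the team agrees on.
# write_blank_notebook("data_presentation.ipynb")
```

Committing a file like this first fixes the name and location, so later PRs edit one notebook instead of adding competing files. Merge conflicts inside the notebook JSON can still occur, as noted above.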

@ojha-aditya
Contributor Author

Closing this issue now.

2 participants