Add kaggle update to release script #3182
Labels
kaggle
Sharing our data and analysis with the Kaggle community
nightly-builds
Anything having to do with nightly builds or continuous deployment.
Right now we are manually updating our Kaggle dataset. Ideally, the release script would use the Kaggle API to publish a new version of the Kaggle dataset automatically whenever there is a new data release. I took a stab at using the Kaggle API but ran into an issue.
Kaggle uses the datapackage schema to track metadata about datasets. I pulled the existing metadata for the PUDL dataset with this command:
where the current directory contained all of the .parquet, .sqlite.gz, and .json files of the nightly outputs. Then I tried to create a new version with this command:
Which uploaded all of the data but then failed with this error:
Dataset version creation error: Incompatible Dataset Type
There might be a bug that prevents folks from updating manually created datasets using the Kaggle API. I was able to initialize and update a private Kaggle dataset with the same pudl output files using the CLI.
I propose we point our notebooks at a new Kaggle dataset that is created through the CLI, so that it can also be updated through the CLI.
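As a rough sketch of what the release-script step could look like, the snippet below wraps the Kaggle CLI's `kaggle datasets version` command in Python. The exact commands used above are not reproduced in this issue, so this is an assumption about the shape of the call, not the code that was run. The function names and the `nightly` directory are hypothetical. It assumes the `kaggle` CLI is installed and authenticated, and that the output directory already contains a `dataset-metadata.json` file pointing at the CLI-created dataset.

```python
import subprocess
from pathlib import Path


def build_version_command(data_dir: Path, notes: str) -> list[str]:
    """Build the `kaggle datasets version` call for a folder of outputs.

    -p is the folder to upload and -m is the version notes.
    """
    return ["kaggle", "datasets", "version", "-p", str(data_dir), "-m", notes]


def update_kaggle_dataset(data_dir: Path, notes: str) -> None:
    """Upload the contents of data_dir as a new dataset version.

    Hypothetical helper: the name and error handling are illustrative only.
    """
    # The CLI reads the dataset id from this metadata file in the folder.
    metadata = data_dir / "dataset-metadata.json"
    if not metadata.exists():
        raise FileNotFoundError(f"Missing {metadata}")
    # check=True raises CalledProcessError if the CLI exits with a
    # nonzero status, so the release script can stop on failure.
    subprocess.run(build_version_command(data_dir, notes), check=True)
```

A release script could then call `update_kaggle_dataset(Path("nightly"), "Nightly build outputs")` after the outputs are written. Keeping the command construction in its own function makes it possible to check the arguments without contacting Kaggle.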
Tasks