docs: added guide for result storages (Dataset, KeyValueStore) #587
It should be a guide for Crawlee Python, but your code samples are in JS.
If you want to continue working on that, update the code samples to Python. Also, please use the same structure as we have in other guides (docs/guides) - code samples are in separate files, we use links to API docs, ...
Hi @vdusek pls check this PR and provide your valuable inputs on this, I am keen to work on this and contribute more to the crawlee.
Hey @Manish-k723, CI checks are not passing.
Hi @vdusek resolved the CI errors, pls let me know your inputs if something is still not correct.
Every Crawlee project run is linked to a default dataset, which is generally used to store the results specific to that crawler execution. Utilizing this dataset is optional.
In Crawlee, datasets are represented by the Dataset class. To facilitate writing to the default dataset, Crawlee provides the `Dataset.pushData()` function.
Identifiers are wrong. Probably just blindly copy-pasted from JS...
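For reference, a minimal sketch of what the Python version of this snippet could look like. It assumes that Crawlee for Python exposes `Dataset.open()` and an async `push_data()` method in `crawlee.storages`; please check both against the API docs before using them in the guide. The `save_results` helper and its optional `dataset` parameter are hypothetical and exist only so the sketch can be tried without the library installed.

```python
import asyncio
from typing import Any


async def save_results(items: list[dict[str, Any]], dataset: Any = None) -> Any:
    """Push each item to a dataset, opening the default one if none is given."""
    if dataset is None:
        # Assumption: the Python API uses snake_case (push_data), not pushData().
        from crawlee.storages import Dataset

        dataset = await Dataset.open()
    for item in items:
        await dataset.push_data(item)
    return dataset


async def main() -> None:
    # Hypothetical sample item, for illustration only.
    await save_results([{'url': 'https://crawlee.dev', 'title': 'Crawlee'}])

# Run with: asyncio.run(main())
```

Passing a `dataset` explicitly keeps the helper testable; leaving it out falls back to the default dataset of the run.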
Every Crawlee project run is tied to a default key-value store. By convention, the project’s input and output are saved in this default key-value store under the keys `INPUT` and `OUTPUT`, respectively. Typically, both input and output are in JSON format, though other formats are also acceptable.
In Crawlee, the key-value store is represented by the KeyValueStore class. To facilitate easy access to the default key-value store, Crawlee provides the functions `KeyValueStore.getValue()` and `KeyValueStore.setValue()`.
Identifiers are wrong. Probably just blindly copy-pasted from JS...
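As with the dataset snippet, here is a rough sketch of the Python naming, assuming `crawlee.storages` provides `KeyValueStore.open()` with async `get_value()` and `set_value()` methods (worth confirming in the API docs). The `echo_input_to_output` helper, its optional `store` parameter, and the shape of the stored record are hypothetical; only the `INPUT` and `OUTPUT` keys come from the guide text.

```python
import asyncio
from typing import Any


async def echo_input_to_output(store: Any = None) -> Any:
    """Read the INPUT record and save a derived record under OUTPUT."""
    if store is None:
        # Assumption: snake_case names replace getValue() / setValue().
        from crawlee.storages import KeyValueStore

        store = await KeyValueStore.open()
    run_input = await store.get_value('INPUT')
    # Hypothetical output shape, for illustration only.
    await store.set_value('OUTPUT', {'received': run_input})
    return run_input

# Run with: asyncio.run(echo_input_to_output())
```

If no `INPUT` record exists, the value read back may be empty, so a real guide sample should handle that case.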
Co-authored-by: Vlada Dusek <[email protected]>
Description
This PR adds documentation on Crawlee's result storage types, specifically the Key-Value Store and Dataset, providing usage examples and file structures for efficient data management.
Closes: Create a new guide for result storages (Dataset, KeyValueStore) #479
CI passed