0.10.0: Pyspark.pandas Support, PydanticModel datatype, Performance Improvements #819
cosmicBboy started this conversation in General
Highlights
pandera now supports pyspark dataframe validation via pyspark.pandas
The pandera koalas integration has now been deprecated
You can now pip install pandera[pyspark] and validate pyspark.pandas dataframes:
PydanticModel DataType Enables Row-wise Validation with a pydantic model
Pandera now supports row-wise validation by applying a pydantic model as a dataframe-level dtype:
The equivalent DataFrameSchema would be:
Improved conda installation experience
Before this release there were only two conda packages: one to install pandera-core and another to install pandera (which would install all extras functionality).
The conda packaging now supports finer-grained control:
Enhancements
schema.to_yaml #790
infer_schema #789
Bugfixes
Deprecations
Docs Improvements
Testing Improvements
Misc Changes
Contributors
This discussion was created from the release 0.10.0: Pyspark.pandas Support, PydanticModel datatype, Performance Improvements.