Merge missing data encodings #86

rolyp · 2020-09-23T15:37:28Z

See notebook. Summary:

read dataset using utils.read_data
examine some values in LRE Ages 3-5 - Full Incl # column
plot the frequencies of the unique values in the subsample
instantiate Ptype and fit schema to subsample
plot posterior distribution for column type and row type
list the missing values for the column
replace those values in the column by a new missing data encoding
run Ptype again to verify that the new encoding is correctly identified as missing
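
The replacement step in the summary above can be sketched with plain pandas. This is only an illustration, not the notebook's code: the sample values, the `missing_values` list (standing in for whatever Ptype reports as missing for the column) and the `new_encoding` string are all hypothetical.

```python
import pandas as pd

col = "LRE Ages 3-5 - Full Incl #"

# Hypothetical subsample; the real values come from the dataset in the notebook.
df = pd.DataFrame({col: ["12", "-", "NULL", "7", "n/a", "30"]})

# Assumption: the encodings that type inference flagged as missing for this column.
missing_values = ["-", "NULL", "n/a"]

# Hypothetical single encoding that all of the above are merged into.
new_encoding = "NA"

# Replace every flagged value with the new encoding; leave other values untouched.
merged = df[col].mask(df[col].isin(missing_values), new_encoding)

print(merged.value_counts())
```

After this step the column carries one missing data encoding instead of three, which is what the final verification step would then check.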

To do:

read dataset directly rather than via utils.read_data
nothing gained by plots – remove
use Ptype to browse unique values?
subsume missing probabilities plot with col.get_missing_values()?


rolyp · 2020-10-21T19:53:01Z

@tahaceritli Do we actually need this use case? Isn’t merging of missing data encodings exactly what is achieved when you do Schema.transform, when all data values interpreted as “missing” are mapped to pd.NA?
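
For comparison, a minimal pandas sketch of the effect described here, where every value interpreted as missing ends up as pd.NA. It does not call Schema.transform itself; the sample values and the `missing_values` list are hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical column values, stored with the nullable string dtype.
values = pd.Series(["12", "-", "NULL", "7"], dtype="string")

# Assumption: the values interpreted as "missing" for this column.
missing_values = ["-", "NULL"]

# Map every missing encoding to pd.NA, so they are merged into one representation.
cleaned = values.mask(values.isin(missing_values), pd.NA)

print(cleaned.isna().sum())
```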

rolyp added the task:use-cases label Sep 23, 2020
