[ENH] Table: speed-up computation of basic stats of given columns. #3166

Merged
merged 2 commits into from
Aug 17, 2018


thocevar
Contributor

Issue

Imputation of tables with a large number of features is slow or infeasible (e.g. #3129).

Imputation is done column by column, and the basic stats are computed for each column in turn. However, each of these calls computes the stats for the entire table (all columns), so the total time is quadratic in the number of columns.
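
To make the quadratic growth concrete, here is a rough cost model. It only counts how many per-column stat computations a column-by-column pass triggers; the function name and the model itself are illustrative assumptions, not code from Orange.

```python
def columns_scanned(n_columns, restrict_to_requested):
    """Count per-column stat computations in a column-by-column pass.

    Hypothetical cost model: each of the n_columns requests either scans
    every column of the table or only the single requested column.
    """
    scanned = 0
    for _ in range(n_columns):  # one stats request per imputed column
        scanned += 1 if restrict_to_requested else n_columns
    return scanned


if __name__ == "__main__":
    print(columns_scanned(1000, restrict_to_requested=False))  # 1000000
    print(columns_scanned(1000, restrict_to_requested=True))   # 1000
```

With 1000 features, scanning the whole table on every request costs a million column computations, while restricting each request to its own column costs a thousand.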

Description of changes

Table's _compute_basic_stats has a columns parameter. Previously, it passed the entire table to statistics.util.stats and then filtered the results. Now it filters the table down to the given columns before passing it on for stats computation.
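
The difference between the two orderings can be sketched with NumPy. This is a minimal illustration under assumptions: `basic_stats` is a hypothetical stand-in that returns per-column min, max, mean and NaN count, not Orange's actual stats routine, and the data is a plain dense array rather than a Table.

```python
import numpy as np


def basic_stats(X):
    """Hypothetical stand-in: per-column min, max, mean and NaN count."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([
        np.nanmin(X, axis=0),
        np.nanmax(X, axis=0),
        np.nanmean(X, axis=0),
        np.isnan(X).sum(axis=0),
    ])


def stats_then_filter(X, columns):
    # Old ordering: compute stats for every column, keep the requested rows.
    return basic_stats(X)[columns]


def filter_then_stats(X, columns):
    # New ordering: select the requested columns first, then compute stats.
    return basic_stats(X[:, columns])


if __name__ == "__main__":
    X = np.array([[1.0, np.nan, 3.0],
                  [4.0, 5.0, 6.0]])
    cols = [0, 2]
    # Both orderings give the same stats; only the amount of work differs.
    print(np.array_equal(stats_then_filter(X, cols), filter_then_stats(X, cols)))
```

Both functions return the same array for the requested columns, but the second one does work proportional to the number of requested columns rather than to the width of the whole table.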

Includes
  • Code changes
  • Tests
  • Documentation

@thocevar thocevar requested a review from astaric July 27, 2018 13:55
@thocevar thocevar changed the title Table: speed-up computation of basic stats of given columns. [ENH] Table: speed-up computation of basic stats of given columns. Jul 30, 2018
@codecov-io

Codecov Report

Merging #3166 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3166      +/-   ##
==========================================
+ Coverage   82.37%   82.37%   +<.01%     
==========================================
  Files         336      336              
  Lines       58321    58319       -2     
==========================================
  Hits        48042    48042              
+ Misses      10279    10277       -2

@lanzagar lanzagar added this to the 3.16 milestone Aug 3, 2018
@lanzagar lanzagar merged commit 8d9988d into biolab:master Aug 17, 2018
3 participants