Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Post postmodeling: A new hope #673

Open
wants to merge 17 commits into
base: master
Choose a base branch
from
Open

Post postmodeling: A new hope #673

wants to merge 17 commits into from

Conversation

nanounanue
Copy link
Contributor

This PR is about putting out there the new proposed API. It is still work in progress.

Missing parts:

  • Jaccard plot methods
  • Feature importance plot methods
  • Crosstab plot methods
  • Error analysis
  • A notebook

I want to push this, so other people could start coding using this interface.

@nanounanue nanounanue requested review from shaycrk and ecsalomon April 17, 2019 19:19
return evaluations.drop('_sa_instance_state', axis=1).set_index(['model_group_id', 'model_id', 'metric', 'parameter'])


class ModelGroup(Base):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the original definitions in https://github.com/dssg/triage/blob/master/src/triage/component/results_schema/schema.py be used? It would be nice to minimize the code that needs to be changed when the schema changes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will use them :)

Copy link
Member

@jesteria jesteria left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool beans

comments might've gotten a bit low-level; and, I'm not really sure what's new code and what's not.

but, for some of these, it's unclear to me that it'll work as is.

).run()


session = create_session(shared_db_engine)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this line do anything? (i notice that you're not using session, at least.)

src/triage/component/postmodeling/README.org Outdated Show resolved Hide resolved
src/triage/component/postmodeling/__init__.py Outdated Show resolved Hide resolved
src/triage/component/postmodeling/__init__.py Outdated Show resolved Hide resolved
src/triage/component/postmodeling/__init__.py Outdated Show resolved Hide resolved
src/triage/component/postmodeling/jaccard.py Outdated Show resolved Hide resolved
src/triage/component/postmodeling/jaccard.py Outdated Show resolved Hide resolved
src/triage/component/postmodeling/jaccard.py Outdated Show resolved Hide resolved
evaluations = evaluations.sort_values('evaluation_start_time', ascending=True)
evaluations = evaluations[['evaluation_start_time', 'value']]
evaluations['type'] = model_group.type.split('.')[-1]
return evaluations
Copy link
Member

@jesteria jesteria Apr 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function needn't be declared dynamically, need it? I would think it can be declared at the module level, (rather than re-declared every time the wrapping function is called).

perhaps it'll need metric added to its signature, but that's all...?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean? I am afraid that I am not following

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function plot_metric_over_time will re-declare the inner function get_model_group_evaluations every time that the former is invoked. This is sometimes useful, but problems with it include:

  • performance: the interpreter must recreate the function dynamically
  • debugging & legibility: the inner function inherits a dynamic scope from every invocation of plot_metric_over_time
  • testing & reusability: (if necessary), the inner function couldn't be either tested or reused with this factoring

Though the inner function here isn't huge, I would say it's big enough that it may as well be moved to the module scope, to avoid the above pitfalls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense! I will do it.

My idea was not to pollute the module namespace with this function that is only used inside this one, but, what are you saying makes a lot of sense. What will be a good case for using inner functions?

Copy link
Member

@jesteria jesteria Apr 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! I think the same way. (But I've also been yelled at for doing so 😹)

The inner function – once moved to the module level – can at least be given a leading underscore to indicate that it's not intended as part of the module API.

I'd say the important thing is that inner functions be trivial – very short, (including lambdas of course).

But ultimately it's up to your judgement of the pros & cons relative to the situation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you

src/triage/component/postmodeling/plots.py Outdated Show resolved Hide resolved
@nanounanue
Copy link
Contributor Author

@jesteria Thank you for all your comments! I will work on them

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants