Ben/what is probability #141

Open
wants to merge 4 commits into
base: main

ben-herbst
Collaborator

No description provided.

Member

@matthew-brett matthew-brett left a comment

Small comments and suggestions

In practice you will be asked to provide answers based on data. For example, you may be given data about customer
behaviour in a large bank and asked to develop a model that will provide the probability of default of the customers in
the bank. This is an important problem for all financial institutions - if it does not have a good credit risk model,
it will either loose money by being too conservative in the way it lends money, or loose money bay taking on too much

Suggested change
it will either loose money by being too conservative in the way it lends money, or loose money bay taking on too much
it will either lose money by being too conservative in the way it lends money, or lose money by taking on too much

behaviour in a large bank and asked to develop a model that will provide the probability of default of the customers in
the bank. This is an important problem for all financial institutions - if it does not have a good credit risk model,
it will either loose money by being too conservative in the way it lends money, or loose money bay taking on too much
of a risk. You can opt for applying a sophisticated model such as a deep neural network but you are almost guaranteed

Won't they be confused by "deep neural network" here? Is there a more general way of saying this - such as:

Suggested change
of a risk. You can opt for applying a sophisticated model such as a deep neural network but you are almost guaranteed
of a risk. You can opt for applying an extremely complex "machine-learning" model with many parameters but you are almost guaranteed

the bank. This is an important problem for all financial institutions - if it does not have a good credit risk model,
it will either loose money by being too conservative in the way it lends money, or loose money bay taking on too much
of a risk. You can opt for applying a sophisticated model such as a deep neural network but you are almost guaranteed
to come to grief. First study the problem and come to terms of all the many issues at stake. We speak of experience!

Better as?

Suggested change
to come to grief. First study the problem and come to terms of all the many issues at stake. We speak of experience!
to come to grief. First study the problem and come to terms with all the many issues at stake. We speak from experience!

Let's illustrate the idea with a simple example. The teacher asks little Annie to solve the following problem: Ten
sheep are on this side of the road and one sheep crosses to the other side, how many sheep remain on this side? Annie
knows the answer of course, and replies, correctly, none. This quite agitates the teacher, who asks, there were ten sheep
on this side of the road and one crosses over to the other side, how is it that you tell non remain? Annie replies,

Suggested change
on this side of the road and one crosses over to the other side, how is it that you tell non remain? Annie replies,
on this side of the road and one crosses over to the other side, why are you saying that none remain? Annie replies,

It is easy to get the arithmetic right, but as easy to get the problem wrong if you don't understand it.

Please make sure you know what problem you have to solve. You may even run into situations where a company provides you
with lots of data and then ask you to extract meaningful information from it. Our advice it, work with the company to

Suggested change
with lots of data and then ask you to extract meaningful information from it. Our advice it, work with the company to
with lots of data and then asks you to extract meaningful information from it. Our advice is, work with the company to

These raise serious ethical questions that the practitioner should be aware of.

Returning to the criminal detection problem mentioned above, it failed. Let's think of what the model does. Since it
it given samples of photographs of criminals and non-criminals, i.e. each photograph comes with the label, `criminal`

Suggested change
it given samples of photographs of criminals and non-criminals, i.e. each photograph comes with the label, `criminal`
is given samples of photographs of criminals and non-criminals, i.e. each photograph comes with the label, `criminal`

## How is your model going to be used?

The responsibility of the technical developer does not end with providing the model, or the analytics needed for the
purpose. It is important to know how your model is going to be used. If you are to develop a system that need to

Suggested change
purpose. It is important to know how your model is going to be used. If you are to develop a system that need to
purpose. It is important to know how your model is going to be used. If you are to develop a system that needs to

is all about coding. Anyway, you will become so much more marketable if you learn the basics of solid software practices.
We don't have the space to do it here but we do want to stress its importance. Always keep in mind the following:

1. Use versioning control, we recommend using git. If you regularly push to the git repo this will protect you from

Suggested change
1. Use versioning control, we recommend using git. If you regularly push to the git repo this will protect you from
1. Use version control; we recommend using git. If you regularly push to the git repo this will protect you from

1. Use versioning control, we recommend using git. If you regularly push to the git repo this will protect you from
accidental software loss. It also makes it easy to share your code. You want other people to use your code; it makes you
so much more useful!
2. Ask someone elso to critically review your code. Even better if you work in an environment where there is a formal

Suggested change
2. Ask someone elso to critically review your code. Even better if you work in an environment where there is a formal
2. Ask someone else to critically review your code. Even better if you work in an environment where there is a formal

2. Ask someone elso to critically review your code. Even better if you work in an environment where there is a formal
system of code review.
3. Read other people's code. You will learn a lot.
4. Always tests for your code. This means that you run your code on small examples for which you know the answer. Every

Suggested change
4. Always tests for your code. This means that you run your code on small examples for which you know the answer. Every
4. Always add tests for your code. This means that you run your code on small examples for which you know the answer. Every
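
To make item 4 concrete, here is a minimal sketch of what "run your code on small examples for which you know the answer" could look like. The function `estimate_probability` and both test names are hypothetical, invented for this illustration; they are not taken from the chapter under review.

```python
def estimate_probability(outcomes, is_event):
    """Return the fraction of outcomes for which is_event(outcome) is true.

    Hypothetical helper for illustration only.
    """
    if len(outcomes) == 0:
        raise ValueError("need at least one outcome")
    hits = sum(1 for outcome in outcomes if is_event(outcome))
    return hits / len(outcomes)


def test_known_answers():
    # Small cases that are easy to work out by hand.
    flips = ["H", "T", "H", "H"]
    assert estimate_probability(flips, lambda f: f == "H") == 0.75
    assert estimate_probability(flips, lambda f: f == "T") == 0.25
    # An event that never happens has estimated probability zero.
    assert estimate_probability(flips, lambda f: f == "X") == 0.0


def test_empty_input_is_rejected():
    # With no outcomes there is nothing to estimate, so expect an error.
    try:
        estimate_probability([], lambda f: True)
    except ValueError:
        return
    raise AssertionError("expected ValueError for empty input")
```

Functions named `test_*` like these can be collected by a test runner such as pytest, or simply called directly from a script.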

* Split out section on software practices to new chapter
2 participants