Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: Improve the docstring from the code source so that the documentation API will be clear #514

Closed
sylvaincom opened this issue Oct 17, 2024 · 5 comments
Assignees
Labels
documentation Improvements or additions to documentation

Comments

@sylvaincom
Copy link
Contributor

sylvaincom commented Oct 17, 2024

Which part of the documentation needs improvement?

Which part of the documentation needs improvement? Documentation API (so enhance the code source docstring). Indeed:

The skore documentation will soon be launched (#412). Same as in scikit-learn, there will be an API tab. Actually, what is done in Sphinx documentation is that all the content of the documentation on our classes is automatically built from the docstring of the code source.

Example with scikit-learn's logistic regression

The documentation API looks like this:
Capture d’écran 2024-10-17 à 08 57 32
...
Capture d’écran 2024-10-17 à 08 56 21

while the corresponding docstring in the code base looks like this:

class LogisticRegression(LinearClassifierMixin, SparseCoefMixin, BaseEstimator):
    """
    Logistic Regression (aka logit, MaxEnt) classifier.
    
    In the multiclass case, the training algorithm uses the one-vs-rest (OvR)
    scheme if the 'multi_class' option is set to 'ovr', and uses the
    cross-entropy loss if the 'multi_class' option is set to 'multinomial'.
    (Currently the 'multinomial' option is supported only by the 'lbfgs',
    'sag', 'saga' and 'newton-cg' solvers.)

...

   Read more in the :ref:`User Guide <logistic_regression>`.

   Parameters
   ----------
   penalty : {'l1', 'l2', 'elasticnet', None}, default='l2'
       Specify the norm of the penalty:

       - `None`: no penalty is added;
       - `'l2'`: add a L2 penalty term and it is the default choice;
       - `'l1'`: add a L1 penalty term;
       - `'elasticnet'`: both L1 and L2 penalty terms are added.

Describe the problem found in the documentation

Currently, our docstring is quite poor:
Capture d’écran 2024-10-17 à 09 03 12

Especially, for machine learning stuff such as skore.cross_validate, it will have to be properly documented cc @augustebaum, @MarieS-WiMLDS.

Suggested improvement

Please write docstring everywhere! For each class, describe its attributes, etc. For each method of each class, document it.

Additional context

No response

@sylvaincom sylvaincom added documentation Improvements or additions to documentation needs-triage This has been recently submitted and needs attention labels Oct 17, 2024
@augustebaum
Copy link
Contributor

For the documentation of the Project class, it needs to be mentioned that the class isn't supposed to be instantiated directly; rather, the user should first run python3 -m skore create project.skore and then in their script run project = skore.load("project.skore")

@tuscland tuscland removed the needs-triage This has been recently submitted and needs attention label Oct 18, 2024
@MarieS-WiMLDS
Copy link
Contributor

TODO: once the cross-validate PR is validated, check that it's compliant with the rest (from the PR, it looks ok to me).
To me, it will be all good.
@sylvaincom do you see any other part of the repo that needs better documentation?

@sylvaincom
Copy link
Contributor Author

sylvaincom commented Oct 18, 2024

Hmm I think we're aiming at DS users that will probably ignore the front end and back end API to focus on ML API such as cross validation and train test split

But having a minimal clean docstring everywhere is nice to have I think

And very thorough docstring for pure ML stuff

@MarieS-WiMLDS
Copy link
Contributor

Should we close this issue now that the documentation has been improved by a lot + there are epics more precise to keep improving?

@sylvaincom
Copy link
Contributor Author

Yes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
None yet
Development

No branches or pull requests

4 participants