-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updating contributing doc to add more detail on how to start #1185
Changes from 12 commits
ec37e13
b3dae47
99e5450
4a39f36
ee2f739
e01637c
0628fc3
53d143c
5b106df
fafd78d
57f6a36
1bd1482
d5466bf
de1b7d9
09d0090
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -124,15 +124,51 @@ See the relevant sections above on how to do this. | |
Setting up the environment | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
Follow the steps in the :ref:`installation_instructions` > "From Source" section | ||
to set up your environment, install the required development dependencies, and | ||
run the tests. | ||
To contribute, you will first have to run through some steps: | ||
|
||
- Set up your environment by forking the repository (`Github doc on | ||
forking and | ||
cloning <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo>`__). | ||
- Create and activate a new virtual environment: | ||
|
||
- With `venv <https://docs.python.org/3/library/venv.html>`__, create | ||
the env with ``python -m venv env_skrub`` and then activate it with | ||
``source env_skrub/bin/activate``. | ||
- With | ||
`conda <https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html>`__, | ||
create the env with ``conda new -n env_skrub`` and activate it with | ||
``conda activate env_skrub``. | ||
- While at the root of your local copy of skrub and within the new | ||
env, install the required development dependencies by running | ||
``pip install --editable ".[dev, lint, test, doc]"``. | ||
|
||
- Run ``pre-commit install`` to activate some checks that will run every | ||
time you do a ``git commit`` (mostly, formatting checks). | ||
|
||
If you want to make sure that everything runs properly, you can run all | ||
the tests with the command ``pytest -s skrub/tests``; note that this may | ||
take a long time. Some tests may raise warnings such as: | ||
|
||
When starting to work on a new issue, it's recommended to create a new branch: | ||
.. code:: sh | ||
Vincent-Maladiere marked this conversation as resolved.
Show resolved
Hide resolved
|
||
UserWarning: Only pandas and polars DataFrames are supported, but input is a Numpy array. Please convert Numpy arrays to DataFrames before passing them to skrub transformers. Converting to pandas DataFrame with columns ['0', '1', …]. | ||
warnings.warn( | ||
|
||
This is expected, and you may proceed with the next steps without worrying about them. However, no tests should fail at this point: if they do fail, then let us know. | ||
|
||
Now that the development environment is ready, you may start working on | ||
the new issue by creating a new branch: | ||
|
||
.. code:: sh | ||
|
||
.. code:: console | ||
git checkout -b my-branch-name-eg-fix-issue-123 | ||
# make some changes | ||
git add ./the/file-i-changed | ||
git commit -m "my message" | ||
git push --set-upstream origin my-branch-name-eg-fix-issue-123 | ||
|
||
git switch -c branch_name | ||
At this point, if you visit again the `pull requests | ||
page <https://github.com/skrub-data/skrub/pulls>`__ github should show a | ||
banner asking if you want to open a pull request from your new branch. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. also I don't think the "Implementation guidelines" below is very helpful There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. should it be removed? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @jeromedockes Do you think it's too generic? Otherwise, I believe it summarizes our first principles quite well for a new contributor.
This one for example is not obvious and it's good to have it highlighted there. |
||
|
||
|
||
.. _implementation guidelines: | ||
|
@@ -183,7 +219,8 @@ Additionally, you might have updated the internal dataframe API in | |
``skrub/_dataframe/tests/test_common.py`` to add a test for the | ||
``amazing_function``. | ||
|
||
Run each updated test file using ``pytest``: | ||
Run each updated test file using ``pytest`` | ||
([pytest docs](https://docs.pytest.org/en/stable/)): | ||
|
||
.. code:: sh | ||
|
||
|
@@ -193,10 +230,20 @@ Run each updated test file using ``pytest``: | |
|
||
The ``-vsl`` flag provides more information when running the tests. | ||
|
||
It is also possible to run a specific test, or set of tests using the | ||
commands ``pytest the_file.py::the_test``, or | ||
``pytest the_file.py -k 'test_name_pattern'``. This is helpful to avoid | ||
having to run all the tests. | ||
|
||
If you work on Windows, you might have some issues with the working | ||
directory if you use ``pytest``, while ``python -m pytest ...`` should | ||
be more robust. | ||
|
||
Once you are satisfied with your changes, you can run all the tests to make sure | ||
that your change did not break code elsewhere: | ||
|
||
.. code:: sh | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. maybe another important pytest feature we could mention is running a specific test either with Also, I've noticed that in some cases in windows There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. also we could add a link to the pytest documentation There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. added notes |
||
pytest -s skrub/tests | ||
|
||
Finally, sync your changes with the remote repository and wait for CI to run. | ||
|
@@ -229,6 +276,21 @@ the docstrings. Check for possible problems by running | |
|
||
pytest skrub/path/to/file | ||
|
||
|
||
Formatting and pre-commit checks | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
Formatting the code well helps with code development and maintenance, | ||
Vincent-Maladiere marked this conversation as resolved.
Show resolved
Hide resolved
|
||
which is skrub requires that all commits follow a specific set of | ||
Vincent-Maladiere marked this conversation as resolved.
Show resolved
Hide resolved
|
||
formatting rules to ensure code quality. | ||
|
||
Luckily, these checks are performed automatically by the ``pre-commit`` | ||
tool (`pre-commit docs <https://pre-commit.com>`__) before any commit | ||
can be pushed. Something worth noting is that if the ``pre-commit`` | ||
hooks format some files, the commit will be canceled: you will have to | ||
stage the changes made by ``pre-commit`` and commit again. | ||
|
||
|
||
Submitting your code | ||
^^^^^^^^^^^^^^^^^^^^ | ||
|
||
|
@@ -237,17 +299,10 @@ a PR by clicking the "Compare & pull request" button on GitHub, | |
targeting the skrub repository. | ||
|
||
|
||
Integration | ||
^^^^^^^^^^^ | ||
|
||
Community consensus is key in the integration process. Expect a minimum | ||
of 1 to 3 reviews depending on the size of the change before we consider | ||
merging the PR. | ||
|
||
Please be mindful that maintainers are volunteers, so review times may vary. | ||
|
||
Continuous Integration (CI) | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
After creating your PR, CI tools will run proceed to run all the tests on all | ||
configurations supported by skrub. | ||
|
||
- **Github Actions**: | ||
Used for testing skrub across various platforms (Linux, macOS, Windows) | ||
|
@@ -273,18 +328,32 @@ actions are taken. | |
Note that by default the documentation is built, but only the examples that are | ||
directly modified by the pull request are executed. | ||
|
||
- If the remote repository was changed, you might need to run | ||
``pre-commit run --all-files`` to make sure that the formatting is | ||
correct. | ||
- If a specific test environment fails, it is possible to run the tests | ||
in the environment that is failing by using pixi. For example if the | ||
env is ``ci-py309-min-optional-deps``, it is possible to replicate it | ||
using the following command: | ||
CI is testing all possible configurations supported by skrub, so tests may fail | ||
with configurations different from what you are developing with. If this is the | ||
case, it is possible to run the tests in the environment that is failing by | ||
using pixi. For example if the env is ``ci-py309-min-optional-deps``, it is | ||
possible to replicate it using the following command: | ||
|
||
.. code:: sh | ||
|
||
pixi run -e ci-py309-min-optional-deps pytest skrub/tests/path/to/test | ||
|
||
This command downloads the specific environment on the machine, so you can test | ||
it locally and apply fixes, or have a clearer idea of where the code is failing | ||
to discuss with the maintainers. | ||
|
||
Finally, if the remote repository was changed, you might need to run | ||
``pre-commit run --all-files`` to make sure that the formatting is | ||
correct. | ||
|
||
Integration | ||
^^^^^^^^^^^ | ||
|
||
Community consensus is key in the integration process. Expect a minimum | ||
of 1 to 3 reviews depending on the size of the change before we consider | ||
merging the PR. | ||
|
||
Please be mindful that maintainers are volunteers, so review times may vary. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't think this is technically correct nowadays 😁 |
||
|
||
|
||
Building the documentation | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -32,7 +32,14 @@ The goal of skrub is to bridge the gap between tabular data sources and machine- | |
|
||
skrub provides high-level tools for joining dataframes (``Joiner``, ``AggJoiner``, ...), | ||
encoding columns (``MinHashEncoder``, ``ToCategorical``, ...), building a pipeline | ||
(``TableVectorizer``, ``tabular_learner``, ...), and more. | ||
(``TableVectorizer``, ``tabular_learner``, ...), and explore interactively your data (``TableReport``). | ||
|
||
.. figure:: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is not necessarily the final position, I can move it around if needed |
||
https://github.com/rcap107/skrub-datasets/blob/master/data/output.gif?raw=true | ||
:alt: An animation showing how TableReport works | ||
|
||
An animation showing how TableReport works | ||
|
||
|
||
>>> from skrub.datasets import fetch_employee_salaries | ||
>>> dataset = fetch_employee_salaries() | ||
|
@@ -69,5 +76,8 @@ The best way to support the development of skrub is to spread the word! | |
Also, if you already are a skrub user, we would love to hear about your use cases and challenges in the `Discussions <https://github.com/skrub-data/skrub/discussions>`_ section. | ||
|
||
To report a bug or suggest enhancements, please | ||
`open an issue <https://docs.github.com/en/issues/tracking-your-work-with-issues/creating-an-issue>`_ and/or | ||
`submit a pull request <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request>`_. | ||
`open an issue <https://docs.github.com/en/issues/tracking-your-work-with-issues/creating-an-issue>`_. | ||
|
||
If you want to contribute directly to the library, then check the | ||
`how to contribute <https://skrub-data.org/stable/CONTRIBUTING.html>`_ page on | ||
the website for more information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think one thing that can be confusing with pre-commit is it can modify some files and cancel the commit, and then you need to stage its modification and run
commit
again. I'm not sure what would be the best place to add a little explanation about that without making this section too long 🤔There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a section on that a bit further down