Contributing to arXiv projects

Thanks for your interest in contributing to arXiv software development! Here is a quick run-through of things you should know before diving in.

How to help

We have a list of projects at https://github.com/orgs/arXiv/projects. We try to put things here that are relatively self-contained (i.e. don't require a ton of coordination with other projects). If you see a project that looks interesting to you, take a look at the associated issues as well as the README of the associated repository.

Recommendations for making contributions:

Make sure that you are working from a ticket. If you see something that needs doing, and there isn't a ticket, please make one. This helps us to keep tabs on what we're doing.
Keep contributions bite-sized. We try to split up work into small, manageable pieces. This makes it easier for multiple people to work together, and especially facilitates code review. If you find yourself adding more than a few dozen lines of code, the task might be too big. Consider splitting the work into multiple tickets that you can deliver separately.
Don't hesitate to ask questions. With a project this size, it is impossible to anticipate everything that contributors will need to know. Please ask lots of questions (just comment on the open issues, or create a new issue), and we'll do our best to answer them.

Reading

For a high-level overview of the arXiv-NG project, see the arXiv arXitecture. Of particular interest:

Development practices
- Testing & QA
Services in arXiv-NG
Design system

Making contributions

Take a look at branch management.

In general, we deliver work by raising a pull request from a feature branch (e.g. bug/ARXIVNG-3092, feature/issue-32-better-widgets) to the develop branch. You can also raise a PR from a forked repo to the develop branch of this (arXiv) repo.

For the PR to be merged:

There should be an open issue documenting the feature, bug, or task to be completed. This helps to prevent going down rabbit holes or otherwise spending time on the wrong thing.
All tests must be passing (and the contribution must have tests).
Mypy type checking should pass.
Pydocstyle should have no errors.
Pylint should score at or above 8/10.

Testing

We use the built-in Python unittest framework to write all of our tests. Try to stick to the built-in tooling here.

Tests should live either in a tests/ module at the root of the repo, or in a tests module in a particular component of the application (e.g. someapp/services/tests/).

We generally run tests with nose2, but you can also use pytest if you prefer.

We aim for tests at the following levels of granularity:

Unit tests. We focus on unit tests for public functions/classes of modules. These should be consciously designed to avoid testing behavior that is implementation-specific.
Module tests. These test whole functional components within the application, making liberal use of Python's mock library. The kinds of modules that you might test include:
- Domain modules. These capture the core concepts, rules, and interactions within the application. You shouldn't need to mock anything, because the domain must not depend on any other components of the application.
- Controller modules. Mock service modules (e.g. for persistence), and test return values against whatever parameters might get passed in by the routes.
- Service integration modules. If possible, mock/spoof the external service that you're integrating with. For AWS services, moto is pretty nice.
Application tests. These tests run the whole application, possibly mocking external dependencies as needed. Use the Flask/Werkzeug test tooling for these tests.
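As an illustration of the controller-level module test described above, here is a minimal sketch using unittest and mock. The names ThingStore and get_thing_controller are hypothetical and do not come from any arXiv repository; the service is passed in as an argument only to keep the example self-contained.

```python
"""Sketch of a controller-level module test using unittest and mock."""
import unittest
from typing import Any, Dict, Tuple
from unittest import mock


class ThingStore:
    """Hypothetical persistence service; stands in for a real service module."""

    def get_thing(self, thing_id: int) -> Dict[str, Any]:
        """Load a thing from the (hypothetical) backing store."""
        raise NotImplementedError("no real database in this sketch")


def get_thing_controller(store: ThingStore,
                         thing_id: int) -> Tuple[Dict[str, Any], int]:
    """Hypothetical controller: return response data and an HTTP status."""
    try:
        return store.get_thing(thing_id), 200
    except KeyError:
        return {"reason": "no such thing"}, 404


class TestGetThingController(unittest.TestCase):
    """The controller is exercised with the service mocked out."""

    def test_thing_exists(self) -> None:
        """The stored data is returned with status 200."""
        store = mock.MagicMock(spec=ThingStore)
        store.get_thing.return_value = {"id": 1, "name": "widget"}

        data, status = get_thing_controller(store, 1)

        self.assertEqual(status, 200)
        self.assertEqual(data["name"], "widget")
        store.get_thing.assert_called_once_with(1)

    def test_thing_missing(self) -> None:
        """A KeyError from the service is translated to status 404."""
        store = mock.MagicMock(spec=ThingStore)
        store.get_thing.side_effect = KeyError(2)

        data, status = get_thing_controller(store, 2)

        self.assertEqual(status, 404)
        self.assertIn("reason", data)
```

The tests assert only on return values and on the call made to the service, so they should keep passing if the controller's internals change. Saved in a tests module, a file like this can be picked up by the standard unittest discovery (python -m unittest).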

For some useful background on test design, check out The Practical Test Pyramid.

Type annotations + static checks

We use type annotations throughout the running codebase (i.e. everything except tests). Check out Python's built-in typing library.
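For readers new to annotations, the sketch below shows the general shape of annotated code. The functions and the category strings are made up for illustration; only the typing imports are real.

```python
"""Sketch of annotated functions using the built-in typing library."""
from typing import Dict, List, Optional


def count_by_category(categories: List[str]) -> Dict[str, int]:
    """Count how many times each category name occurs."""
    counts: Dict[str, int] = {}
    for category in categories:
        counts[category] = counts.get(category, 0) + 1
    return counts


def most_common(counts: Dict[str, int]) -> Optional[str]:
    """Return the category with the highest count, or None if empty."""
    if not counts:
        return None
    return max(counts, key=lambda name: counts[name])
```

Optional[str] makes the "nothing to return" case explicit, so a type checker can flag callers that forget to handle None.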

We use mypy to type-check code. If everything checks out, the following should return no lines and exit 0, using the mypy.ini config in the root of this repo:

pipenv run mypy -p [APPLICATION] | grep -v "test.*" | grep -v "defined here"

You can also just run:

./tests/type-check.sh [APPLICATION]

Linting + Documentation

Code should adhere as closely as possible to PEP 8. Docstring style is checked with pydocstyle; the following should exit 0:

pipenv run pydocstyle --convention=numpy --add-ignore=D401 [APPLICATION]

Or just run:

./tests/style.sh [APPLICATION]

We use NumPy style for docstrings, and otherwise follow PEP 257. All modules, classes, and public functions (i.e. those whose names do not start with _) should have docstrings. It's also nice if you add docstrings for constants and class attributes; see PEP 224.
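A rough sketch of what this looks like in practice follows. The module, the constant, and the function are hypothetical. The layout follows the NumPy docstring sections, with a PEP 224-style string placed directly after the constant; it has not been run against this repository's pydocstyle or pylint configuration.

```python
"""Helpers for building versioned identifiers (hypothetical module)."""

DEFAULT_VERSION = 1
"""Version assumed when the caller does not specify one."""


def versioned_id(paper_id: str, version: int = DEFAULT_VERSION) -> str:
    """
    Build a versioned identifier string.

    Parameters
    ----------
    paper_id : str
        Identifier without a version suffix, e.g. ``"1234.5678"``.
    version : int
        Version number; must be 1 or greater.

    Returns
    -------
    str
        The identifier followed by ``v`` and the version number.

    Raises
    ------
    ValueError
        If ``version`` is less than 1.

    """
    if version < 1:
        raise ValueError("version must be 1 or greater")
    return f"{paper_id}v{version}"
```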

A .pylintrc config file can be found in the root of this repository. The following should score >= 8/10.

pipenv run pylint [APPLICATION]

Or run:

./tests/lint.sh [APPLICATION]
