A fast implementation of the Neighborhood Component Analysis (NCA) algorithm in Python. The ideas behind some of the design choices are expanded upon in my Master's thesis, Fast low-rank metric learning.
Features:
- Scikit-learn-like API
- Gradient computation at the same asymptotic cost as evaluating the objective function
- Avoids overflows when the scale of the metric is large
- Mini-batch version (work in progress)
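The overflow-avoidance point can be illustrated with a small sketch of the NCA objective. This is not the code from this package, just a minimal NumPy version of the standard softmax objective; the stability trick is to subtract each row's smallest distance before exponentiating, so exp() never sees a large positive argument even when the learned metric blows distances up.

```python
import numpy as np

def nca_objective(A, X, y):
    """Numerically stable NCA objective (to be maximized).

    A: (d, D) linear transform, X: (N, D) data, y: (N,) labels.
    """
    Z = X @ A.T                                           # project the data
    D2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)   # squared distances
    np.fill_diagonal(D2, np.inf)                          # exclude self-matches
    # Subtract the per-row minimum distance before exponentiating; the shift
    # cancels in the softmax but keeps exp() from overflowing at large scales.
    D2 = D2 - D2.min(axis=1, keepdims=True)
    K = np.exp(-D2)
    P = K / K.sum(axis=1, keepdims=True)                  # softmax over neighbours
    same = y[:, None] == y[None, :]
    return (P * same).sum()   # expected number of correctly classified points
```

Without the shift, a metric scaled by a large constant makes every `exp(-D2)` underflow (or the normalizer overflow); with it, the objective stays finite at any scale.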
Sample usage from Python:
from nca import NCA
n = NCA()
n.fit(X, y)
X = n.transform(X)
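To see the fit/transform pattern in a complete pipeline, here is a hedged end-to-end sketch on the wine dataset. It uses scikit-learn's own NeighborhoodComponentsAnalysis so the snippet runs without this package installed; to use this implementation instead, swap in `from nca import NCA` with the same fit/transform calls.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier, NeighborhoodComponentsAnalysis

# NeighborhoodComponentsAnalysis stands in for this package's NCA class here,
# purely so the example is self-contained.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

nca = NeighborhoodComponentsAnalysis(n_components=2, random_state=0)
nca.fit(X_train, y_train)

# Classify in the learned low-dimensional space with nearest neighbours.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(nca.transform(X_train), y_train)
score = knn.score(nca.transform(X_test), y_test)
```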
For an example, run the example.py script. Among other options, the script accepts the type of model and the dataset:
python example.py --model nca --data wine
For a complete description of the available options, invoke the help message:
python example.py -h
The code depends on the usual Python scientific environment: NumPy, SciPy, Scikit-learn.
The required packages are listed in the requirements.txt file and can be installed in a virtual environment as follows:
virtualenv -p python3 venv
source venv/bin/activate
pip install -r requirements.txt
- Other NCA implementations (Python + MATLAB)
- Other metric learning algorithms
Special thanks to Iain Murray for writing a first version of this code, for teaching me about automatic differentiation, and for supervising my Master's thesis project.
- Add requirements
- Add examples
- Add example using NCA with Nearest Neighbour
- Test numerical stability
- Add argument parsing for example script
- Add some visualizations
- Add PCA to the list of models
- Add concentric circles and noise to the list of datasets
- Create separate modules, for example: data, models
- Package the code
- Do not compute gradients when only the cost function is required
- Add example on MNIST
- Add tests
- Add gradient check tests
- Compute timings
- Big O notation for memory and computation
- Write documentation
- Implement version with mini-batches
- Provide links to other implementations and outline differences
- Motivate metric learning
- Implement nearest mean metric learning
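The gradient-check item above can be approached with central finite differences: perturb each parameter by a small epsilon and compare the numerical slope against the analytic gradient. The `check_grad` helper below is a hypothetical sketch, not part of this package (SciPy ships a similar `scipy.optimize.check_grad`).

```python
import numpy as np

def check_grad(f, grad, x, eps=1e-6):
    """Largest absolute difference between grad(x) and a central
    finite-difference estimate of the gradient of f at x."""
    g = grad(x)
    num = np.empty_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e.flat[i] = eps
        # central difference: O(eps^2) truncation error
        num.flat[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return np.max(np.abs(g - num))
```

Run against the NCA objective and its analytic gradient, a small returned error (well below, say, 1e-4 for double precision) gives confidence that the gradient code matches the cost function.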