Skip to content

Latest commit

 

History

History
17 lines (10 loc) · 787 Bytes

README.md

File metadata and controls

17 lines (10 loc) · 787 Bytes

Spanish Test

Python package with a pre-trained classifier for detecting if a given sample is from Spanish origin.

The training features are population probabilities.

The training labels must contain a column labeled as "nationality". Those samples of Spanish origin must be labelled as Spanish.

The pre-trained model has been trained with samples from the 1k Genome project.

The current master points to the internal development version and includes a pre-trained model using a pipeline obtained with TPOT.

In the following months we are going to include the TPOT library as a dependency. For now on, we only include the necessary files along their respective copyright notifications.

The published paper used the best model found by means of the TPE algorithm.