Python API Specification
The APIs and their input and output formats are described below. Example Python code that uses these APIs is under GenomicsSampleAPIs/python_api/example.
API to get the total number of samples in Tile DB.
Input: None
Output: Number of samples
Example:
num_samples = api.getNumSamples()
API to get the ranges where valid data is available in Tile DB. The API returns a JSON-formatted string, which you can convert to a dictionary with Python's json library.
Example:
import json
data = json.loads(api.getValidPositions("5", 1, 1000))
Input:
Parameter | Type | Description |
---|---|---|
chromosome | string | name of the contig (first column in VCF) |
start | long | start of the range to search for valid positions |
end | long | end of the range to search for valid positions |
Output:
Parameter | Type | Description |
---|---|---|
data | json-dict<string, list<long>> | dictionary with keys = ['indices', 'POSITION', 'END']. indices is a list of sample IDs that have valid data in the queried range. POSITION and END are lists of start and end positions; the entry at each index gives a range, from start to end, in which values exist. |
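A minimal sketch of how the returned dictionary might be consumed. It assumes the three lists are parallel, so that entry i of POSITION and END describes a range for the sample ID at entry i of indices. The `response` string is made-up data standing in for the return value of `api.getValidPositions`, and `ranges_by_sample` is a hypothetical helper, not part of the API.

```python
import json

# Made-up response standing in for api.getValidPositions("5", 1, 1000)
response = '{"indices": [0, 2, 2], "POSITION": [10, 40, 700], "END": [25, 40, 950]}'


def ranges_by_sample(data):
    """Group (start, end) ranges by sample ID (hypothetical helper)."""
    grouped = {}
    for sample_id, start, end in zip(data["indices"], data["POSITION"], data["END"]):
        grouped.setdefault(sample_id, []).append((start, end))
    return grouped


data = json.loads(response)
ranges = ranges_by_sample(data)

# Print the valid ranges found for each sample ID
for sample_id, spans in sorted(ranges.items()):
    print(sample_id, spans)
```

Grouping by sample ID is only one possible view; the flat lists can also be used directly, as in the sparse matrix note further down.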
API to get the values of the requested attributes at a given position in Tile DB. The API returns a JSON-formatted string, which you can convert to a dictionary with Python's json library.
Example:
import json
data = json.loads(api.getPosition("5", 500, ["REF", "QUAL"]))
Input:
Parameter | Type | Description |
---|---|---|
chromosome | string | name of the contig (first column in VCF) |
position | long | position to fetch the data from |
Attributes | list<string> | list of attributes that need to be fetched from Tile DB |
Output:
Parameter | Type | Description |
---|---|---|
data | json-dict<string, list<attribute_type>> | dictionary with keys = ['indices', 'POSITION', 'END', 'attribute0', ...]. indices is a list of sample IDs that have valid data at the queried position. POSITION and END are lists of start and end positions; the entry at each index gives a range, from start to end, in which values exist. Each attribute is a list whose elements have the type of that attribute. |
NOTE:
- & in ALT refers to <NON-REF>
- indices are sample IDs that can be used to construct scipy sparse matrices. For example, to construct a 1D scipy sparse matrix,
import json
from scipy.sparse import csc_matrix
data = json.loads(api.getPosition("5", 500, ["REF", "QUAL"]))
row = data['indices'] # indices==sample_ids start from 0 and correspond to row #
col = [0] * len(row) # constructing a matrix with single column
csc_matrix((data['QUAL'], (row, col)))
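When a dense, per-sample view is more convenient than a sparse matrix, the column-style dictionary can be turned into one record per sample. This is a sketch only: the `response` string is made-up data standing in for the return value of `api.getPosition`, `to_records` is a hypothetical helper, and it assumes every list in the dictionary has the same length.

```python
import json

# Made-up response standing in for api.getPosition("5", 500, ["REF", "QUAL"])
response = (
    '{"indices": [0, 3], "POSITION": [500, 480], "END": [500, 520], '
    '"REF": ["A", "G"], "QUAL": [50.0, 31.5]}'
)


def to_records(data):
    """Turn the column-style dictionary into one dict per sample (hypothetical helper)."""
    keys = list(data.keys())
    columns = [data[key] for key in keys]
    return [dict(zip(keys, row)) for row in zip(*columns)]


records = to_records(json.loads(response))

for record in records:
    print(record)
```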
getPosition is also overloaded to take a list of contigs and a list of positions as input, and it returns the values for all the queried positions.
Example:
The example below queries position 500 in both contig 5 and contig 6.
import json
data = json.loads(api.getPosition(["5", "6"], [500, 500], ["REF", "QUAL"]))
Input:
Parameter | Type | Description |
---|---|---|
chromosome | list<string> | names of the contigs (first column in VCF) |
position | list<long> | positions to fetch the data from, one per contig |
Attributes | list<string> | list of attributes that need to be fetched from Tile DB |
Output:
Parameter | Type | Description |
---|---|---|
data | json-dict<string, list<attribute_type>> | dictionary with the following format { contig : { POSITION : {'indices' : [results], 'POSITION': [results], 'END':[results], 'attribute0': [...], 'attribute1': [...], ...}, next_POSITION : {...}}, next_contig : {...} } |
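The nested result can be walked contig by contig and position by position. The sketch below flattens it into rows; the `response` string is made-up data standing in for the return value of the list form of `api.getPosition`, and `flatten` is a hypothetical helper. It assumes the position keys arrive as strings, since JSON object keys are always strings.

```python
import json

# Made-up response standing in for
# api.getPosition(["5", "6"], [500, 500], ["REF", "QUAL"])
response = json.dumps({
    "5": {"500": {"indices": [0], "POSITION": [500], "END": [500],
                  "REF": ["A"], "QUAL": [50.0]}},
    "6": {"500": {"indices": [1, 2], "POSITION": [490, 500], "END": [510, 500],
                  "REF": ["C", "T"], "QUAL": [20.0, 42.0]}},
})


def flatten(nested):
    """Return (contig, position, sample_id, QUAL) rows (hypothetical helper)."""
    rows = []
    for contig, by_position in nested.items():
        for position, result in by_position.items():
            for sample_id, qual in zip(result["indices"], result["QUAL"]):
                rows.append((contig, int(position), sample_id, qual))
    return rows


rows = flatten(json.loads(response))

for row in rows:
    print(row)
```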
API to get the sample names corresponding to the sample IDs returned by the APIs above.
Example:
sample_names = api.getSampleNames(sample_ids)
Input:
Parameter | Type | Description |
---|---|---|
Sample IDs | list | list of sample IDs |
Output:
Parameter | Type | Description |
---|---|---|
Sample Names | list | list of sample names |
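A typical use is to relabel query results by sample name. The sketch below assumes the names come back in the same order as the IDs passed in, which the description above does not state explicitly. The `sample_names` list is made-up data standing in for the return value of `api.getSampleNames(sample_ids)`, and `values_by_name` is a hypothetical helper.

```python
# Sample IDs and QUAL values as they might appear in a getPosition result (made up)
sample_ids = [0, 3]
qual = [50.0, 31.5]

# Made-up stand-in for api.getSampleNames(sample_ids)
sample_names = ["sample_A", "sample_D"]


def values_by_name(names, values):
    """Pair each sample name with its value (hypothetical helper)."""
    if len(names) != len(values):
        raise ValueError("names and values must have the same length")
    return dict(zip(names, values))


qual_by_name = values_by_name(sample_names, qual)
print(qual_by_name)
```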