These scripts let you query the "ask" module of the http://beta.semanticfna.org/ API using R or Python. They write the results to a CSV file.
The Flora of North America Semantic MediaWiki can be queried using the Semantic MediaWiki semantic search syntax.
In brief, you must have a condition:
[[Authority::Linnaeus]]
You can optionally return properties of the taxa matching your condition:
?Distribution
Putting this all together using pipes, we have a query like this:
[[Authority::Linnaeus]]|?Distribution
Or with additional properties requested, like this:
[[Authority::Linnaeus]]|?Distribution|?Taxon family
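The way a query is assembled can be sketched in a few lines of Python. This is only an illustration of the syntax described above: the helper name `build_ask_query` is hypothetical and is not part of the fna-query scripts.

```python
def build_ask_query(conditions, properties=()):
    """Join conditions and optional properties into one ask query string.

    Hypothetical helper for illustration; not part of fna-query.
    """
    if not conditions:
        raise ValueError("an ask query needs at least one condition")
    # Conditions are written back to back, e.g. [[A::x]][[B::y]].
    condition_part = "".join(conditions)
    # Each requested property is prefixed with "?" and separated by a pipe.
    printout_part = "".join(f"|?{name}" for name in properties)
    return condition_part + printout_part


query = build_ask_query(["[[Authority::Linnaeus]]"], ["Distribution", "Taxon family"])
print(query)  # [[Authority::Linnaeus]]|?Distribution|?Taxon family
```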
Sample queries can be found here:
Read more about Semantic MediaWiki query syntax:
- https://www.semantic-mediawiki.org/wiki/Help:Semantic_search
- https://www.semantic-mediawiki.org/wiki/Help:Search_operators
Semantic MediaWiki limits API queries to 5,000 results. If you expect your query to return more than 5,000 results, you should run your query in batches. (N.B.: There are ~20,000 treatments in the FNA Online.)
We recommend running your queries by 'published volume' by adding a volume condition to your query (e.g., "[[Volume::Volume 17]]"). Please see this page for a list of volumes that can be queried.
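As a rough sketch of batching by volume, the snippet below prepends a volume condition to a base query, once per volume. The function name `volume_batched_queries` and the volume numbers passed to it are assumptions for illustration; check the list of queryable volumes before choosing real values.

```python
def volume_batched_queries(query, volume_numbers):
    """Return a dict mapping each volume number to a volume-restricted query.

    Hypothetical helper for illustration; not part of fna-query.
    """
    batched = {}
    for number in volume_numbers:
        # The volume condition goes in front so it stays with the other
        # conditions, before any "|?Property" requests.
        batched[number] = f"[[Volume::Volume {number}]]{query}"
    return batched


# Example volume numbers only; substitute the volumes you actually need.
batches = volume_batched_queries("[[Distribution::Nunavut]]|?Taxon family", [17, 19])
for number, batch_query in batches.items():
    print(f"volume_{number}.csv", batch_query)
```

Each resulting query can then be run on its own, writing to a separate output file per volume.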
This section assumes you are familiar with the R programming language.
Open a terminal.
Type git clone https://github.com/jocelynpender/fna-query.git
Open an R console. Type
install.packages("WikipediR")
install.packages("tidyverse")
- Open an R console
- Open the run_query.R script
- Run your query:
E.g., [[Distribution::Nunavut]]
Use ask_query_titles.
It returns only a list of Taxon names that match your query.
In the fna-query directory, run
source("R/src/query.R")
page_titles_vector <- ask_query_titles("[[Distribution::Nunavut]]", "output_file_name.csv")
E.g., [[Distribution::Nunavut]]|?Taxon family
Use ask_query_titles_properties.
It returns a list of Taxon names and the associated properties requested by your query.
In the fna-query directory, run
source("R/src/query.R")
properties_texts_data_frame <- ask_query_titles_properties("[[Distribution::Nunavut]]|?Taxon family", "output_file_name.csv")
E.g., [[Distribution::Nunavut]]
> page_titles_vector
[1] "Abietinella abietina"
[2] "Achillea millefolium"
[3] "Agrostis"
[4] "Agrostis anadyrensis"
...
See https://github.com/jocelynpender/fna-query/blob/master/R/demo_queries/distribution/nunavut_taxa.csv for a sample output file.
E.g., [[Distribution::Nunavut]]|?Taxon family
> properties_texts_data_frame
Taxon family
Abietinella abietina Thuidiaceae
Achillea millefolium Asteraceae
Agrostis Poaceae
Agrostis anadyrensis Poaceae
...
See https://github.com/jocelynpender/fna-query/blob/master/R/demo_queries/distribution/nunavut_taxa_family_name.csv for a sample output file.
Don't know what to query? See the demo queries here: https://github.com/jocelynpender/fna-query/tree/master/R/demo_queries
This section assumes you are familiar with Python programming.
You'll need to create an account to use the API with Python.
- Create your account: http://beta.floranorthamerica.org/Special:CreateAccount
- Find the file called local.py.example in the python/src folder. Rename it to local.py and add your credentials.
requirements.txt has been generated with pip freeze > requirements.txt
Open a terminal.
cd fna-query
pip install -r requirements.txt
The project was built within a conda environment. A conda YAML file has been generated with conda env export > fna-query.yml.
Open a terminal.
cd fna-query
conda env create -f fna-query.yml
- Open a terminal.
- Prepare your query. E.g.,
[[Special status::Introduced]]
- Run your query using:
(If using conda, start with: conda activate environment-name)
cd fna-query
cd python
python -m src.run_query --output_file_name "output_file_name.csv" --query_string "[[Query::here]]"
The -m flag tells Python to run run_query.py as a module, so that it can import from the src package.
If your query results are extensive, the query will take some time to process. Please be patient.
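If you need to run several queries, for example one per volume, the command above can be assembled programmatically. The sketch below only builds the argument list, using the two flags shown above. The function name `build_run_query_command` is hypothetical, and nothing is executed here.

```python
import sys


def build_run_query_command(query_string, output_file_name):
    """Build the argument list for `python -m src.run_query`.

    Hypothetical helper for illustration; not part of fna-query.
    """
    return [
        sys.executable,  # the Python interpreter currently in use
        "-m",
        "src.run_query",
        "--output_file_name",
        output_file_name,
        "--query_string",
        query_string,
    ]


command = build_run_query_command("[[Special status::Introduced]]", "introduced_taxa.csv")
print(command[1:])
```

One way to run such a command is to pass the list to `subprocess.run(command, cwd="python", check=True)` from the fna-query directory. Passing a list means the brackets and pipes in the query need no shell quoting.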
E.g., [[Illustrator::+]][[Illustration::Present]]
python -m src.run_query --output_file_name "illustrated_taxa.csv" --query_string "[[Illustrator::+]][[Illustration::Present]][[Taxon family::Asteraceae]]"
See https://github.com/jocelynpender/fna-query/blob/master/python/demo_queries/distribution/nunavut_taxa.csv for a sample output file.
E.g., [[Illustrator::+]][[Illustration::Present]]|?Taxon rank
python -m src.run_query --output_file_name "illustrated_taxa_taxon_family.csv" --query_string "[[Illustrator::+]][[Illustration::Present]][[Taxon family::Asteraceae]]|?Taxon rank"
See https://github.com/jocelynpender/fna-query/blob/master/python/demo_queries/distribution/nunavut_taxa_family_name.csv for a sample output file.
Don't know what to query? See the demo queries here: https://github.com/jocelynpender/fna-query/tree/master/python/demo_queries
Contact [email protected] or [email protected] for support.
Please leave your bug reports here: https://github.com/jocelynpender/fna-query/issues
- Read more about the WikipediR package for R.
- Read more about the mwclient for Python.
Sometimes you'll need to batch the API return results. Here is an R script for merging multiple CSV files.
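For Python users, a comparable merge can be sketched with the standard library alone. This is not the R script mentioned above. It assumes every batch file starts with the same header row, which you should verify against your own output files. The function name `merge_csv_files` is hypothetical.

```python
import csv


def merge_csv_files(input_paths, output_path):
    """Concatenate CSV files sharing one header row; return data rows written.

    Hypothetical helper; assumes every input file begins with the same header.
    """
    header = None
    rows_written = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for path in input_paths:
            with open(path, newline="", encoding="utf-8") as in_file:
                reader = csv.reader(in_file)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip completely empty files
                if header is None:
                    # Write the header once, taken from the first file.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header row")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Calling `merge_csv_files(["volume_17.csv", "volume_19.csv"], "merged.csv")` (example file names) would write one combined file and return the number of data rows copied.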