This is the API for NRCAN's Energuide data.
This project is composed of two parts: the API itself and the ETL process that produces the data the API will serve. There are further details in the READMEs for each of the respective portions of the project.
If you're really keen (and on a Mac), this should do you. Or continue reading for more details.
# install python 3 and mongo
./bootstrap.sh
# import data
make setup
# export Apollo Engine API Key
export NRCAN_ENGINE_API_KEY=your_api_key
# boot up API
make run
The bootstrap.sh file is a quick way to get up-and-running on a macOS environment. It relies on Homebrew to install both Python 3 (using pyenv) and MongoDB.
To get started, run
./bootstrap.sh
Note that to get pyenv running by default in your preferred shell, you'll need to add eval "$(pyenv init -)" to your ~/.bash_profile or ~/.zshrc after installing pyenv.
Once the script runs through, you'll need to import the data into your database, and then boot up the API that connects to it.
The Python code in /nrcan_etl
transforms the data from the default formatting set by NRCAN, and then inserts it into our Mongo database. More details can be found in the README.
For local development, run
make setup
This will
- install the needed Python dependencies
- drop your current test data (if you have any)
- import fresh test data
Now that we have data, it's time to boot up the API.
Apollo Engine is a monitoring/logging layer that gives us out-of-the-box diagnostic information about our graphql instance. You'll need your own API key to get the API running, so sign up for one here.
Once you have one, you'll need to
export NRCAN_ENGINE_API_KEY=your_api_key
The JavaScript code in /api
builds us a GraphQL API which allows us to query NRCAN data from the Mongo database. More info in the README.
To build the app and connect it to mongo, run
make watch
This will
- install the needed JavaScript dependencies
- build the app
- serve it up locally
- rebuild when files are changed
- Download and install Python from here. Version 3.6.4 or higher is required.
- Download the current version of MongoDB Community Server from here.
- Download Node.js from here. Version 8.x or higher of Node is required.
Check that the libraries are accessible by running the following commands in a terminal window:
python --version
mongo --version
node --version
If a version is not shown, the path to the library has to be added to the 'PATH' environment variable in System Properties.
NOTE: If you're having problems, make sure that you are not adding the library to a different user's 'PATH' (such as the user path of the local admin). The library can be added to the System 'PATH' variable so that it is visible to all users on the machine.
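The same check can be scripted. The sketch below is a hypothetical helper (it is not part of this repository) that uses Python's standard `shutil.which` to report which of the three commands cannot be found on the current user's PATH.

```python
import shutil

# Commands this README expects to find on PATH.
REQUIRED_TOOLS = ("python", "mongo", "node")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that `which` cannot locate on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All required tools are on PATH.")
```

Because the lookup uses the PATH of whoever runs the script, running it as the same user you develop with is a quick way to spot the "wrong user's PATH" problem described above.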
To work with the system, you'll need to clone a copy to your local machine using git. There are many different git clients that can be used, but for simplicity's sake we recommend Git for Windows.
Copy the repository link, and use it in the following command:
git clone <<repository link>>
Once the repository has been downloaded, change directories into the folder created (cd nrcan_api).
MongoDB must be started before going to the next step. Open a new terminal window and run the following commands from your root directory:
# this directory is required when running mongod for the first time
md data\db
mongod
This window needs to remain open (it is running the database), but can be minimized as it will no longer be used.
In a new terminal window, run the following command to drop existing energuide MongoDB test data, if any exists.
mongo energuide --eval "db.dwellings.drop()"
Installing Python applications in a virtualenv
is considered best practice. To do so, navigate to the cloned git repository and run the following:
cd nrcan_etl
python -m venv env
env\Scripts\activate.bat
pip install -r requirements.txt
pip install -e .
Note: "." is part of the command
Run the following commands:
# extract from csv to zip file
energuide extract --infile tests/scrubbed_random_sample_xml.csv --outfile allthedata.zip
# load data into mongodb
energuide load --filename allthedata.zip
# delete zip file
del allthedata.zip
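If you run these three steps often, they can be wrapped in a small script. The following is only a sketch: `etl_commands` and `run_etl` are hypothetical helper names, not part of the `energuide` package, and the script assumes the `energuide` command is available because the virtualenv from the previous step is active. It removes the zip file even when a step fails.

```python
import os
import subprocess


def etl_commands(infile, archive):
    """Return the extract and load commands from the steps above as argv lists."""
    return [
        ["energuide", "extract", "--infile", infile, "--outfile", archive],
        ["energuide", "load", "--filename", archive],
    ]


def run_etl(infile, archive, run=subprocess.check_call):
    """Run extract then load, deleting the archive afterwards even on failure."""
    try:
        for command in etl_commands(infile, archive):
            run(command)
    finally:
        # Mirrors the manual `del allthedata.zip` step.
        if os.path.exists(archive):
            os.remove(archive)
```

From the nrcan_etl folder, `run_etl("tests/scrubbed_random_sample_xml.csv", "allthedata.zip")` would then perform the same sequence as the commands above.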
Run the unit tests and type checks for the Python code:
pytest tests
mypy src tests
We can verify the data has actually been imported by using the mongodb command-line client. More detailed docs exist, but these should get you going.
Run the following commands to connect to the energuide db:
# open db client
mongo
# show available databases ('energuide' should exist)
show dbs
# set energuide as the default 'db'
use energuide
# get the count of entries in 'dwellings' (should be 7)
db.dwellings.count()
# select 'forwardSortationArea' value of each dwelling
db.dwellings.find({}, {'forwardSortationArea': 1})
# result should look like:
{ "_id" : ObjectId("5a848002e349de06d4bc8205"), "forwardSortationArea" : "T0J" }
{ "_id" : ObjectId("5a848002e349de06d4bc8206"), "forwardSortationArea" : "A2H" }
{ "_id" : ObjectId("5a848002e349de06d4bc8207"), "forwardSortationArea" : "Y1A" }
{ "_id" : ObjectId("5a848002e349de06d4bc8208"), "forwardSortationArea" : "G1A" }
{ "_id" : ObjectId("5a848002e349de06d4bc8209"), "forwardSortationArea" : "B2R" }
{ "_id" : ObjectId("5a848002e349de06d4bc820a"), "forwardSortationArea" : "X0A" }
{ "_id" : ObjectId("5a848002e349de06d4bc820b"), "forwardSortationArea" : "C1A" }
# disconnect from mongodb
quit()
Move to the nrcan_api\api folder.
The API uses Apollo Engine to monitor activity on the host running GraphQL.
Apollo Engine is a monitoring/logging layer that gives us out-of-the-box diagnostic information about our graphql instance. You'll need your own API key to get the API running, so sign up for one here.
The key looks similar to: service:yname-8241:lQ3g_8Yojs4stdIWqwwj-bQ
Set the following variables:
set NRCAN_DB_CONNECTION_STRING=mongodb://localhost:27017
set NRCAN_DB_NAME=energuide
set NRCAN_COLLECTION_NAME=dwellings
set NRCAN_ENGINE_API_KEY=service:yname-8241:lQ3g_8Yojs4stdIWqwwj-bQ
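Variables defined with `set` only exist in the terminal window where they were set, so a missing variable is easy to overlook. As a rough sanity check, the sketch below (a hypothetical helper, not part of this repository) lists which of the four variables above are unset or empty. Run it in the same window you will start the API from.

```python
import os

# Variable names taken from the `set` commands above.
REQUIRED_VARS = (
    "NRCAN_DB_CONNECTION_STRING",
    "NRCAN_DB_NAME",
    "NRCAN_COLLECTION_NAME",
    "NRCAN_ENGINE_API_KEY",
)


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variables that are unset or empty in `env`."""
    if env is None:
        env = os.environ
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```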
Make sure yarn is installed:
# should return a version number
yarn --version
If yarn
is not installed, run npm install --global yarn
to install.
The next command builds and starts GraphQL:
yarn build && yarn start
The API should be running now! Yes!! 🎉🎉🎉
Check it out at http://localhost:3000/graphiql
Try out this query to get you going.
{
dwellings(
filters: [
{field: dwellingForwardSortationArea comparator: eq value: "C1A"}
]
) {
results {
yearBuilt
city
}
}
}
Or just click here
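The query can also be sent from a script instead of the GraphiQL page. The sketch below uses only the Python standard library. Note the assumption: it posts to http://localhost:3000/graphql, which is a guess at where the GraphQL endpoint lives alongside /graphiql; adjust `GRAPHQL_URL` if the API serves it elsewhere. `build_request` and `run_query` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: the GraphQL endpoint is served at /graphql next to /graphiql.
GRAPHQL_URL = "http://localhost:3000/graphql"

# The sample query from above.
QUERY = """
{
  dwellings(
    filters: [
      {field: dwellingForwardSortationArea comparator: eq value: "C1A"}
    ]
  ) {
    results {
      yearBuilt
      city
    }
  }
}
"""


def build_request(query, url=GRAPHQL_URL):
    """Build a POST request carrying the query as a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, url=GRAPHQL_URL):
    """Send the query and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(query, url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API running, `print(json.dumps(run_query(QUERY), indent=2))` should print the matching dwellings, assuming the endpoint guess holds.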