- Create a virtual environment if not already created:
  ```
  python3 -m venv venv/[ENV-NAME]
  ```
- Activate the virtual environment:
  - On macOS/Linux:
    ```
    source venv/[ENV-NAME]/bin/activate
    ```
  - On Windows:
    ```
    venv\[ENV-NAME]\Scripts\activate
    ```
- Install the dependencies:
  ```
  pip install -e ./cell_feature_data
  ```
  This step makes the `create-dataset` command available globally within the virtual environment.
- Run `create-dataset` to start the dataset creation process (a full example session is sketched after this list). This will:
  - Request the path of the file you want to process. Formats supported: `.csv`, with more formats to be added as development progresses
  - Ask for an output path to save your dataset. If not specified, a new dataset folder is created in `data`, named after the input file
  - Process the input file and generate the necessary JSON files for the dataset
  - Prompt for additional information about the dataset and update the JSON files accordingly
- Deactivate the virtual environment when finished:
  ```
  deactivate
  ```
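Putting the steps together, a minimal end-to-end session might look like the following. The environment name `cfe` and the input file `my_cells.csv` are illustrative placeholders, not exact CLI output:

```
# create and activate a virtual environment (macOS/Linux); "cfe" is a placeholder name
python3 -m venv venv/cfe
source venv/cfe/bin/activate

# install the package in editable mode to get the create-dataset command
pip install -e ./cell_feature_data

# run the interactive dataset builder; it will prompt for the input file
# (e.g. my_cells.csv, a hypothetical file) and an optional output path
create-dataset

# leave the environment when done
deactivate
```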
- Make a branch or fork of this repo
- Create a new dataset folder under `data` containing the following files (a skeleton is sketched after this list):
  - `dataset.json`: a JSON file with metadata about the dataset and the names of the other files. This is the only filename that matters; every other file is referenced by a relative path.
  - A JSON file describing the measured features in this dataset. Key in `dataset.json`: `featureDefsPath`
  - A JSON file listing the per-cell data. Key in `dataset.json`: `featuresDataPath`
  - A JSON file with settings for volume data channels in the 3D viewer. Key in `dataset.json`: `viewerSettingsPath`
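A minimal skeleton for such a folder, assuming a hypothetical dataset named `my-dataset` (the three path values are placeholder filenames; the full set of required metadata fields is defined in `process-dataset/data-validation/schema.js`):

```
mkdir -p data/my-dataset
cat > data/my-dataset/dataset.json <<'EOF'
{
    "featureDefsPath": "feature_defs.json",
    "featuresDataPath": "cell_feature_analysis.json",
    "viewerSettingsPath": "image_settings.json"
}
EOF
```

Each of the three referenced files then lives next to `dataset.json` in the same folder.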
- Before pushing a PR back to this repo, run the preliminary data consistency checks locally and make sure the validation passes. If it doesn't, check the logs to see what went wrong and fix any errors (see the example below).
  - To validate a single dataset:
    ```
    npm run validate-single-dataset [PATH/TO/DATASET]
    ```
  - To validate all datasets within the `data` folder:
    ```
    npm run validate-datasets
    ```
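For example, to check just the hypothetical dataset folder from above:

```
# validate one dataset; the path is the folder containing dataset.json
npm run validate-single-dataset data/my-dataset
```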
- If everything looks good, run the process dataset workflow from Actions by clicking the "Run workflow" dropdown and entering the following settings:
  - Set `branch` to your branch
  - Enter the folder name that contains your dataset
  - Leave the checkbox unchecked if this is your first time uploading
  - Leave db set to `staging`
For more on what these files should look like, see `process-dataset/data-validation/schema.js` and the Full spec documentation.
- Go to the Manual deploy Action
- Click the workflow dropdown, leave branch at `main`
- Set `Deploy with staging db` to `true`
- Go to staging.cfe.allencell.org
```
git clone https://github.com/allen-cell-animated/cell-feature-explorer.git
npm i
npm run start:dev-db
```
- Clone or fork this repo
- Run `npm i`
- Choose Test mode in your security rules
- Add your required secret tokens to the `.env` file
- Set `NODE_ENV="dev"` in the `.env` file

Please refer to the "Needed in .env file" section for obtaining tokens.
```
NODE_ENV= "production" || "staging" || "dev"
AWS_SECRET=
AWS_ID=

# used if NODE_ENV === "production"
FIREBASE_TOKEN=
FIREBASE_EMAIL=

# used if NODE_ENV === "staging"
STAGING_FIREBASE_TOKEN=
STAGING_FIREBASE_EMAIL=

# used if NODE_ENV === "dev"
DEV_FIREBASE_TOKEN= project settings/service accounts/generate new private key
DEV_FIREBASE_ID= project settings/general/project ID
DEV_FIREBASE_EMAIL= project settings/service accounts/firebase service account
```
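As a sketch, a filled-in development `.env` might look like this; every value below is a placeholder, not a real credential, and the Firebase values come from the console paths noted above:

```
NODE_ENV="dev"
AWS_SECRET=xxxxxxxxxxxx                # placeholder
AWS_ID=xxxxxxxxxxxx                    # placeholder
DEV_FIREBASE_TOKEN=xxxxxxxxxxxx        # generated private key (placeholder)
DEV_FIREBASE_ID=my-project-id          # placeholder project ID
DEV_FIREBASE_EMAIL=firebase-adminsdk@my-project-id.iam.gserviceaccount.com  # placeholder
```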
To access `AWS`, `production`, or `staging`, please contact the development team for the necessary credentials.

- `dev`: developer's personal testing database, the default option for development. Create your own credentials to access it.
- `staging`: group testing / for scientists to review. `NODE_ENV=="staging"`
- `production`: production database for cfe.allencell.org. `NODE_ENV=="production"`
```
node process-dataset [PATH/TO/DATASET]
```
or
```
npm run process-dataset [PATH/TO/DATASET]
```

To skip the fileInfo upload but run all the other steps (the fileInfo upload takes a long time because Firebase limits each request to 500 uploads):
```
node process-dataset [PATH/TO/DATASET] true
```
or
```
npm run process-dataset [PATH/TO/DATASET] true
```
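For example, using the hypothetical dataset folder from earlier:

```
# full processing run
npm run process-dataset data/my-dataset

# re-run everything except the slow fileInfo upload
npm run process-dataset data/my-dataset true
```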
```
npm run upload-image [PATH/TO/DATASET]
```
```
npm run release-dataset [MEGASET_NAME]
// releases every dataset in a megaset. Note: this isn't the folder name, it's the megaset name

npm run release-dataset [DATASET_ID]
// releases a dataset that isn't part of a megaset; the id should be in the format [NAME]_v[VERSION]

npm run release-dataset [MEGASET_NAME] [DATASET_ID]
// releases a dataset contained within a megaset; the id should be in the format [NAME]_v[VERSION]
```
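Concretely, with hypothetical names following the `[NAME]_v[VERSION]` convention:

```
npm run release-dataset my_megaset                  # every dataset in the megaset
npm run release-dataset my_dataset_v1               # a standalone dataset
npm run release-dataset my_megaset my_dataset_v1    # one dataset within a megaset
```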
MIT license