For dataset creators:

To create a new dataset:

Option 1: Use the `cell-feature-data` package

Create a virtual environment if not already created: python3 -m venv venv/[ENV-NAME]
Activate the virtual environment:
- On macOS/Linux: source venv/[ENV-NAME]/bin/activate
- On Windows: venv\[ENV-NAME]\Scripts\activate
Install the dependencies: pip install -e ./cell_feature_data. This step makes the create-dataset command available from any directory while the virtual environment is active.
Run create-dataset to start the dataset creation process. This will:
- Request the path of the file you want to process. Formats supported: .csv, with more formats to be added as development progresses
- Ask for an output path to save your dataset. If not specified, a new dataset folder is created in data, named after the input file
- Process the input file and generate the necessary json files for the dataset
- Prompt for additional information about the dataset and update the json files accordingly
Deactivate the virtual environment when finished: deactivate
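The default output location described above (a folder under data named after the input file) can be sketched in a few lines. This is only an illustration of the naming rule, not the package's actual implementation; the function name `default_output_dir` and the example file `runs/my_cells.csv` are hypothetical.

```python
from pathlib import Path


def default_output_dir(input_file: str, data_root: str = "data") -> Path:
    """Return the folder a dataset would land in when no output path is given.

    Assumption: "named after the input file" means the file name without
    its directory and without its final extension.
    """
    # Path.stem drops the parent directories and the last suffix (".csv").
    return Path(data_root) / Path(input_file).stem


# Hypothetical input file; the result is a relative path under "data".
print(default_output_dir("runs/my_cells.csv"))
```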

Option 2: Manually create json files within a dataset folder

Make a branch or fork of this repo.
Create a new dataset folder under data.
Expected files in a dataset directory:
- dataset.json: a json file with metadata about the dataset and the names of the other files. This is the only file whose name is fixed; every other file is referenced from it by a relative path.
- a json file describing the measured features in this dataset. Key in dataset.json: featureDefsPath
- a json file listing the per cell data. Key in dataset.json: featuresDataPath
- a json file with settings for volume data channels in the 3d viewer. Key in dataset.json: viewerSettingsPath
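As a quick local sanity check before running the real validation, the relationship between dataset.json and the three files it points to can be sketched as follows. This is a minimal sketch under assumptions: it only looks at the three keys listed above, treats their values as paths relative to the dataset folder, and uses hypothetical file names. dataset.json may require other fields that are not covered here; `process-dataset/data-validation/schema.js` remains the authority.

```python
import json
import tempfile
from pathlib import Path

# The three keys named in the list above; other dataset.json fields are not checked.
PATH_KEYS = ("featureDefsPath", "featuresDataPath", "viewerSettingsPath")


def missing_files(dataset_dir: str) -> list[str]:
    """List the PATH_KEYS that are absent or point at a file that does not exist."""
    root = Path(dataset_dir)
    manifest = json.loads((root / "dataset.json").read_text())
    problems = []
    for key in PATH_KEYS:
        rel = manifest.get(key)
        if rel is None:
            problems.append(f"{key}: key missing from dataset.json")
        elif not (root / rel).is_file():
            # Paths are resolved relative to the dataset folder (assumption).
            problems.append(f"{key}: {rel} not found")
    return problems


# Demo in a throwaway folder; all file names here are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "feature_defs.json").write_text("[]")
    (root / "dataset.json").write_text(json.dumps({
        "featureDefsPath": "feature_defs.json",
        "featuresDataPath": "cell_data.json",
    }))
    problems = missing_files(tmp)

print(problems)
```

In the demo, one referenced file is missing and one key is absent, so both are reported.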

After creating the dataset:

Before opening a PR against this repo, run the preliminary data consistency checks locally and make sure the validation passes. If it doesn't, check the logs to see what went wrong and fix any errors.
- To validate a single dataset: npm run validate-single-dataset [PATH/TO/DATASET]
- To validate all datasets within the data folder: npm run validate-datasets
If everything looks good, run the process dataset workflow from Actions by clicking the "Run workflow" dropdown and entering the following settings:
- set branch to your branch
- enter the folder name that contains your dataset
- leave checkbox unchecked if this is your first time uploading
- leave db set to staging

For more on what the dataset files should look like, see `process-dataset/data-validation/schema.js` and the full spec documentation.

To view your dataset:

Option 1: Point cell feature explorer staging site to staging database

Go to Manual deploy Action
- Click the "Run workflow" dropdown and leave the branch at main
- Set Deploy with staging db to true
Go to staging.cfe.allencell.org

Option 2: Run Cell Feature explorer locally

git clone https://github.com/allen-cell-animated/cell-feature-explorer.git
cd cell-feature-explorer
npm i
npm run start:dev-db

For Developers:

Clone or fork this repo, then run npm i

Setup

A glance at Cloud Firestore

Set up a dev database

Choose Test mode in your security rules.
Add your required secret tokens to the .env file.
Set NODE_ENV="dev" in the .env file.

See the Needed in .env file section for how to obtain the tokens.

Needed in .env file:

NODE_ENV= "production" || "staging" || "dev"
AWS_SECRET=
AWS_ID=

# used if NODE_ENV === "production"
FIREBASE_TOKEN=
FIREBASE_EMAIL=

# used if NODE_ENV === "staging"
STAGING_FIREBASE_TOKEN= 
STAGING_FIREBASE_EMAIL=

# used if NODE_ENV === "dev"
DEV_FIREBASE_TOKEN= project settings/service accounts/generate new private key
DEV_FIREBASE_ID= project settings/general/project ID
DEV_FIREBASE_EMAIL= project settings/service accounts/firebase service account

To access AWS, production, or staging, please contact the development team for the necessary credentials

Three database endpoints:

dev: a developer's personal testing database and the default option for development. Create your own credentials to access it. NODE_ENV=="dev"
staging: for group testing and for scientists to review. NODE_ENV=="staging"
production: production database for cfe.allencell.org. NODE_ENV=="production"
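The mapping from NODE_ENV to the Firebase variables it needs can be written down as a small lookup, which is handy for checking a .env file before running any script. This is a sketch, not code from this repo: the variable names come from the .env listing above, while the function `missing_env_keys` and the decision to leave AWS_SECRET and AWS_ID unchecked are assumptions.

```python
# Firebase variables per NODE_ENV value, as listed in the .env section.
KEYS_BY_ENV = {
    "production": ["FIREBASE_TOKEN", "FIREBASE_EMAIL"],
    "staging": ["STAGING_FIREBASE_TOKEN", "STAGING_FIREBASE_EMAIL"],
    "dev": ["DEV_FIREBASE_TOKEN", "DEV_FIREBASE_ID", "DEV_FIREBASE_EMAIL"],
}


def missing_env_keys(env: dict[str, str]) -> list[str]:
    """Return the Firebase variables that are unset or empty for env["NODE_ENV"].

    AWS_SECRET and AWS_ID are deliberately not checked here (assumption:
    whether they are needed depends on the step being run).
    """
    node_env = env.get("NODE_ENV", "")
    if node_env not in KEYS_BY_ENV:
        raise ValueError(f"NODE_ENV must be one of {sorted(KEYS_BY_ENV)}, got {node_env!r}")
    return [key for key in KEYS_BY_ENV[node_env] if not env.get(key)]


# A dev setup where only the token has been filled in so far.
print(missing_env_keys({"NODE_ENV": "dev", "DEV_FIREBASE_TOKEN": "placeholder"}))
# ['DEV_FIREBASE_ID', 'DEV_FIREBASE_EMAIL']
```

To check a real environment, pass `dict(os.environ)` after loading the .env file with whatever loader you already use.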

To process a new dataset:

node process-dataset [PATH/TO/DATASET] or npm run process-dataset [PATH/TO/DATASET]

To skip the fileInfo upload but run all the other steps, pass true as a second argument (the fileInfo upload takes a long time because Firebase limits uploads to 500 per request):

node process-dataset [PATH/TO/DATASET] true or npm run process-dataset [PATH/TO/DATASET] true
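To see why the fileInfo step is slow, it helps to picture the per-request limit as batching: a large list of records has to be split into groups of at most 500, and each group is a separate request. The sketch below only illustrates that splitting; it is not the upload code in this repo, and the names `MAX_PER_REQUEST` and `batches` are hypothetical.

```python
from typing import Iterator, Sequence

MAX_PER_REQUEST = 500  # the per-request limit quoted above


def batches(items: Sequence, size: int = MAX_PER_REQUEST) -> Iterator[Sequence]:
    """Yield consecutive slices of items, each with at most size elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 1201 records need three requests: two full batches and one partial.
sizes = [len(batch) for batch in batches(list(range(1201)))]
print(sizes)  # [500, 500, 201]
```

A dataset with hundreds of thousands of cells therefore needs hundreds of sequential requests, which is why skipping the step is useful when the fileInfo data has not changed.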

Upload a dataset card image after the data has been uploaded

npm run upload-image [PATH/TO/DATASET]

Release dataset to production

npm run release-dataset [MEGASET_NAME] // will release every dataset in a megaset. Note: this isn't the folder name, it's the megaset name
npm run release-dataset [DATASET_ID] // will release a dataset that isn't part of a megaset; id should be in the format [NAME]_v[VERSION]
npm run release-dataset [MEGASET_NAME] [DATASET_ID] // will release a dataset contained within a megaset; id should be in the format [NAME]_v[VERSION]
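Because the release command expects ids in the form [NAME]_v[VERSION], it can be worth checking an id before running it. The following is a minimal sketch of such a check, assuming the version is whatever follows the last "_v" in the id; the function `parse_dataset_id` and the example id are hypothetical and are not part of this repo's scripts.

```python
def parse_dataset_id(dataset_id: str) -> tuple[str, str]:
    """Split an id of the form [NAME]_v[VERSION] into (name, version)."""
    # rpartition splits on the LAST "_v", so names containing "_v" still work.
    name, sep, version = dataset_id.rpartition("_v")
    if not sep or not name or not version:
        raise ValueError(f"expected [NAME]_v[VERSION], got {dataset_id!r}")
    return name, version


# Hypothetical dataset id.
print(parse_dataset_id("my_dataset_v1.0"))  # ('my_dataset', '1.0')
```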

MIT license
