ReFlow Development History

Scott White edited this page Jan 25, 2016 · 21 revisions

Nov 2012

  • Initial discussion of creating some kind of software, enabling automated analysis of flow cytometry data
  • Would like to build off of prior clustering work
  • Clustering code requires GPU/CUDA so a desktop stand-alone app is ruled out, as most users wouldn't have the hardware required, and if they did, wouldn't have the knowledge for setting up a Linux based CUDA workstation
  • Also, FCS data is labelled inconsistently, so having all the raw FCS files sent to us is a problem...we don't know what is acquired in each channel
  • What if we create a centralized web-based tool for transferring FCS data to us, while allowing the uploader to correctly identify the channels acquired in the FCS files? Then, we could have multiple remote "worker" machines grab the data and analyze it, uploading the results back to the central server. This way, we could decouple the analysis from the server allowing for scalability, and at the same time using web standards to avoid software installs on the machines of end users.

Dec 2012

  • 1st commit!
  • ReFlow started as a typical Django 1.4 project
  • ReFlow logo designed and created...gotta have a logo

Jan 2013

  • Early on we realized a REST API would be helpful for any worker code we may create, so we started with Tastypie, a REST API library for Django
  • 1st realization of the complexity of the "panel"; tried switching from the single-parameter modal to AJAX table-based creation
  • Implemented basic scatterplot matrix in D3, began to see the limitations of web applications as the first plot had to be limited to only 1000 events for decent performance
  • 1st Ubuntu setup guide documentation

Feb 2013

  • Tastypie was becoming difficult to use for complex filtering, especially in many-to-many relationships. Started looking into alternative REST API apps for Django, settled on Django REST Framework...still using it now (2016)
  • Started ReFlow REST Client repository, began using httplib and urllib. The 1st version was really just intended to make it easier to POST multiple FCS files to ReFlow, as this was becoming an obvious bottleneck
  • Implemented Token Authentication for the REST API, more secure than Basic Auth
  • Refactoring of Site/Subject/Project relationships as subject may not be required, Sample now belongs to a Site rather than a Subject...caused a lot of modifications to existing code
  • Began seeing performance issues with Django's FileField, perhaps there weren't a lot of people using Django for large file management and retrieval

Mar 2013

  • Continued trouble-shooting of file upload "temp" file issue for files over 2.5MB...most of this stemmed from trying to save a NumPy array of the FCS events for faster access.

  • Eventually abandoned saving NumPy array, as FCS file I/O speed was improved so we can just read the event data on the fly...plus the data had to be compensated anyway...decided to let the server store things and do it well.

  • Started using REST Framework docs app to auto-document the REST API

  • Re-worked panel builder to build a panel based on a sample file, which simplified the panel building process   

    • don't have to specify a site, the sample is already associated with a site   
    • panel name will automatically be the sample's original file name, can modify if you want   
    • we know how many panel parameter maps we need b/c it's the number of sample parameter maps   
    • pre-fill the fcs text for all the panel parameters, so don't have to type it
  • But, the panel builder was beginning to push the limits of Django's inlineformset_factory, lots of JavaScript & DOM manipulation

  • Started auditing for DB optimization for retrieving numerous nested relations

Apr 2013

  • Added object-level user permissions for project/site using django-guardian
  • Modify REST API to use new permissions system

May 2013

  • Added capability for admins to create & modify Markers (was called Antibody back then) and Fluorochromes
  • Added dedicated page for viewing a specific list of samples for a single site (later removed)
  • Added new REST API to retrieve "uncategorized" samples. I believe at this time we allowed uploading of samples without annotating them first. The annotation step was done after the upload.
  • Added new models for Specimen, SubjectGroup & SampleGroup. SampleGroup was removed some time later.

Jun 2013

  • Implemented REST API for Specimen model
  • Upgraded to Django 1.5...lots of related changes
  • Started experimenting with Travis Continuous Integration testing
  • Started the process management data models

Jul 2013

  • Started implementing process management as a separate Django app 'process_manager'
  • Created the models for the concept of a SampleSet
  • Created Worker model, kind of a special sub-classed Django User
  • The whole ProcessRequest form system started becoming quite complex, so this started down the road of using AJAX calls from the client-side JavaScript to the back-end REST API and avoiding the Django templating system. This was done using jQuery and involved lots of DOM manipulation, which eventually led to the "web app" approach ReFlow uses today.

Aug 2013

  • Began work on REST APIs for the ReFlowWorker to poll for and request assignment of "viable" PRs.
  • Created "breadcrumb" navigation links at the top of all pages
  • Lots more work on the ProcessRequest models and request form
  • Had a large test phase for multiple labs in CIP

Sep 2013

  • After the testing by Cecile/Karo & other CIP labs, there was a lot of discussion regarding the concept of a "panel" in ReFlow and in flow cytometry in general. Cecile remarked that a panel for CIP proficiency should have flexibility for some parameters...the concept of the 2-tiered panel design arose from these discussions.
  • Massive overhaul to the panel design to incorporate a project panel that serves as a master template for specific site panel implementations. The site panel would be required to match all the parameters specified in the project panel, but there would be some flexibility for site panels to deviate if the project panel allowed.
  • The 2-tier approach will also address the 2 step upload / annotation, but requires a SitePanel be present prior to uploading samples.
  • Creating the new SitePanel became quite cumbersome, so added FCS file upload to server to read the FCS parameter names on the fly
  • Started incorporating lots of validation logic to handle panel template requirements as well as different requirements for Full Stain, FMO & Isotype control panel sub-types...things were getting quite complicated at this point.
  • Started adding jQuery table sorting to some HTML tables
  • Moved Stimulation model from global to project-specific

Oct 2013

  • New model for SampleMetadata
  • Added 'exclude' boolean field to Sample model for flagging some samples to not be processed
  • Added capability to edit SitePanels
  • Added "auto-population" of Site panel parameters based on FCS parameter text fields
  • Merged 'process_manager' app into main repository app
  • Added API to retrieve samples with primary key as the file name (used mostly for Worker caching)
  • Created dedicated 'subsample' field to store the sub-sampled events chosen during initial upload...to avoid duplicating the sub-sampling step for multiple PRs on the same sample
  • Work on compensation validation; moved comps under site panel to avoid duplicates for each Sample (although at this point the Sample could still "override" the site panel's comp)
  • Complete re-write of ReFlow upload client GUI for the 2-step categorization and upload process
  • Essentially, what we referred to as ReFlow 2.0 is ready for testing

Nov 2013

  • Improved auto-population of site panel parameters
  • After the panel re-modeling, it was requested that ReFlow handle sample 'batches', so a new Cytometer model was created and an acquisition date field was added to both Sample and Compensation
  • Added pre-treatment and storage fields to Sample model
  • re-factored Antibody model to Marker
  • Added 'Null' parameter type choice to "ignore" some FCS parameters (this ends up being a trickier issue later as some "null" parameters may contribute to spillover and so we need to know what they are for applying compensation)
  • Initial fork of fcm library IO functions into FlowIO

Dec 2013

  • Enabled editing of compensation matrices
  • Improvements to REST API filters & adding some new endpoints
  • Reworked PR models
  • Work begins on ReFlowWorker code, where the real processing will happen

Jan 2014

  • Video demo of ReFlow: https://vimeo.com/84548862
  • Added ProcessRequestOutput model for storing processing results
  • Began looking into asinh for transform
  • Initial fork of fcm library stats functions into FlowStats
  • Modify ReFlowWorker to use new FlowStats package, 1st working pipeline achieved

Feb 2014

  • Began designing multi-phase testing with CIP labs, with the 1st phase testing the sample upload
  • Re-worked samples view with dynamic filtering
  • Lots of bug-fixes in prep for user testing
  • Lots of documentation written in prep for user testing
  • At the end of Feb, began looking into AngularJS to address shortcomings with Django templates for a more feature-rich, complex and dynamic web application. Even before the 1st test phase for uploading FCS files, began looking at Angular possibly replacing the Python/Tk upload GUI.

Mar 2014

  • First real testing of the automated analysis pipeline begins (by Cliburn and myself)
  • Bugfixes and updates to make ReFlowWorker more robust
  • 1st test phase with CIP labs finished; drafted and sent our summary, which concluded that, given the recent advancements in web standards (HTML5), we would try a new web-app approach to uploading files and get away from the clunky upload GUI that required logging in from 2 applications to upload files.
  • Created multi-file upload AngularJS app...lots of playing around to figure out the best way to use Angular
  • Wrote FCS file parser in JavaScript to use in the FCS upload web-app
  • Update to Bootstrap 3
  • March 2014 is another major turning point in ReFlow development, from here a major overhaul will occur in migrating the entire site to a single page web application and abandoning any use of Django views or templates. The REST API is critical in allowing this to move forward.
  • Started converting process request to angular
  • Started test phase 2 with CIP labs, this time using the new web-based upload app

Apr 2014

  • Finished converting process request form to angular
  • Lots of updates to the REST API to make it more "complete" in terms of GET/POST/PUT in prep for coming angularization
  • Needed client-side validation for site panel creation for a better user experience even if it does duplicate server side Python functionality as client-side JavaScript
  • Finally replaced fcm in ReFlow with FlowIO
  • Created new models and separate angular upload app for Bead Samples
  • Update to angular 1.2.16
  • Separated worker code from ReFlowRESTClient repo to create a new ReFlowWorker git repository
  • Test phase 3 started for CIP labs, testing the creation of site panels

May 2014

  • Converted panel template create to an Angular app
  • Lots of changes/additions to panel template validation rules in prep for next test phase
  • Started investigating mode detection in compensation bead files to potentially create an automated compensation routine
  • Began evaluating the NVIDIA Jetson TK1 kit for use as a ReFlow Worker
  • Test phase 4 started for CIP labs, testing the creation of panel templates

Jun 2014

  • Held a ReFlow teleconference with all CIP test groups
  • More work on comp beads for auto-comp...this ended up not working out as the comp bead data wasn't as "clean" as I thought it would be...the "signal" channel wasn't always the one with highest values
  • Started work on angularizing the entire ReFlow site under one giant single page web application...really large undertaking that involves porting all Python-based Django view code to JavaScript in AngularJS, as well as the Django template code to Angular markup.
  • Updated to angular 1.2.17
  • Added capability to add existing users to projects and better manage user permissions

Jul 2014

  • Finished angularization
  • Replaced glyphicons with font-awesome for site-wide icons
  • Converted breadcrumb navigation from URLs to angular-breadcrumb, which uses angular routes
  • Finally added delete functionality to nearly all models...a bit tricky because the model hierarchy means the user has to be notified of everything that would be deleted

Aug 2014

  • Officially renamed any use of the term ProjectPanel to PanelTemplate
  • I believe we began drafting the ReFlow manuscript around this time

Sep 2014

  • Continued work on the ReFlow manuscript
  • Update ReFlow documentation
  • Started changes in prep for Django 1.7

Oct 2014

  • Started PR visualization for clustering output
    • Features: heat map, color coded clusters, cluster toggling, etc.
  • Updated dependency: d3 (3.4.13)
  • Update to Django 1.7 (no more South migrations!)
  • FlowIO updates to create FCS files from scratch, allows creation of test FCS data files
  • Add asinh transform to FlowUtils
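
For readers unfamiliar with the asinh transform mentioned above, here is a minimal sketch of the idea in plain Python. It is not the FlowUtils implementation; the function names and the `cofactor` default are assumptions chosen for illustration. The transform is roughly linear near zero and logarithmic for large magnitudes, and unlike a plain log it is defined for the negative values that compensation can produce.

```python
import math


def asinh_transform(values, cofactor=150.0):
    """Return asinh(x / cofactor) for each value.

    `cofactor` is a hypothetical scale parameter: values much smaller
    than it are scaled almost linearly, larger ones are compressed.
    """
    return [math.asinh(v / cofactor) for v in values]


def asinh_inverse(values, cofactor=150.0):
    """Undo asinh_transform: x = sinh(y) * cofactor."""
    return [math.sinh(v) * cofactor for v in values]
```

Calling `asinh_inverse(asinh_transform(data))` should recover the original values up to floating-point error.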

Nov 2014

  • Allow specifying comp for PR and store used comp with PR
  • Added compensation APIs
  • More usability tweaks to PR viz (cluster toggling, disable animation, parallel plot)
  • consolidation of ng controller dependencies to a ModelService
  • added interface for project admins to manage project users

Dec 2014

  • continuation of ModelService conversion
  • interface for managing Workers
  • FCS upload optimizations: don't attempt upload of already uploaded files
  • update ReFlow REST client to retrieve sample cluster data & example for getting PR results as CSV

Jan 2015

  • option to view FCS metadata for uploaded samples
  • PR wizard: show summary of inputs in last step & don't select time or null parameters by default
  • PR detail: show list of samples chosen
  • added table sorting to most tables
  • create/manage new ReFlow users
  • allow users to change their password
  • re-design overall look & feel of entire site (cleaner & more efficient use of space)
  • new model for cell subset labels and cluster labels for applying them to clusters

Feb 2015

  • remove deprecated angular dependency ui-select2, use ui-select instead
  • continue implementing labeling of clusters
  • major overhaul to panel design: no more parent templates, moving to "tag" based panel variants creating a panel "family" and greatly simplifying panel creation and validation
  • convert Marker and Fluoro models from global to project-level
  • Update ReFlowWorker to allow analysis of single samples

Apr 2015

  • Read compensation from drag & drop FCS files or from TSV text (like before)
  • better "auto" matching of sample annotation when uploading new FCS files
  • start overhaul of Process Request output & begin development of 2-stage processing
  • update REST client to remove ProcessRequestOutput calls & update SampleCluster POST
  • started updating ReFlowWorker to allow 2-stage clustering

May 2015

  • Continued development of 2-stage processing, now saving pis, mus, & sigmas of parent stage for any 2nd stage clustering requests
  • Update Process Request visualization for new ProcessRequest models and 2-stage processing input form
  • Tweaking default process request input parameters & user interface (don't allow selection of time and null parameters)
  • Allow creation of compensation model instances from already uploaded FCS samples from their $SPILL or $SPILLOVER metadata values
  • update REST client to support SampleClusterComponent API
  • finished major re-organization of ReFlowWorker code to allow 2-stage clustering
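
As an illustration of creating a compensation matrix from $SPILL or $SPILLOVER metadata, the sketch below parses such a value in plain Python. It assumes the layout used by the FCS 3.1 $SPILLOVER keyword: the parameter count n, then n parameter names, then n × n values in row-major order, all comma-separated. The function name `parse_spill` is hypothetical and this is not ReFlow's actual code; parameter names that themselves contain commas are not handled.

```python
def parse_spill(spill_text):
    """Parse a comma-separated spillover string into (names, matrix).

    Assumed layout: n, name_1..name_n, then n*n values in row-major order.
    """
    fields = [f.strip() for f in spill_text.split(",")]
    n = int(fields[0])
    names = fields[1:n + 1]
    values = [float(v) for v in fields[n + 1:]]
    if len(names) != n or len(values) != n * n:
        raise ValueError(
            "expected %d names and %d values, got %d and %d"
            % (n, n * n, len(names), len(values))
        )
    # Slice the flat value list into n rows of n values each.
    matrix = [values[row * n:(row + 1) * n] for row in range(n)]
    return names, matrix
```

For example, `parse_spill("2,FITC-A,PE-A,1,0.1,0.05,1")` yields the two channel names and a 2 × 2 matrix whose first row is `[1.0, 0.1]`.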

Jun 2015

  • Improved panel template & sample annotation validation, as well as some UI improvements
  • Removed old, unused compensation bead functionality
  • Color-coded labels for Process Request status & better display of PR status messages
  • Started work on downloading "clean" FCS files
  • updates to FlowIO to allow saving modified "clean" FCS files
  • Allow user to specify random seed and subsample count for 2nd stage requests
  • Improve exception handling in REST client & support reporting of PR errors
  • Update ReFlowWorker to allow using multiple GPUs on 1 machine to handle multiple ProcessRequest jobs in parallel
  • major overhaul for logging in ReFlowWorker

Aug 2015

  • ReFlowWorker: filter out negative scatter values prior to sub-sampling
  • ReFlowWorker: fix non-reproducible sub-sampling if PRs have different numbers of samples
  • ReFlowWorker: don't error out if enriched subsample count is less than subsample specified by PR

Sep 2015

  • changed default PR input values (again)
  • better display of Process Request cluster table (now scroll-able)
  • Implemented detail view of Process Requests results for all samples in tabular format, capable of being easily filtered and exported as CSV file
  • major dependency upgrade to Django REST Framework 3
  • dependency upgrade of AngularJS to 1.2.28
  • began investigating excessive memory use for FCS file uploads
  • removed Cytometer model
  • update FlowStats to be less aggressive at merging components by default

Oct 2015

  • After some significant profiling, finally got FCS samples to download faster
  • Improvements to the usability of the FCS upload file queue, auto-populating some Sample model fields from FCS file name & changing "magic" checkbox that initializes sample parameter modal to a button
  • Began updating documentation
  • REST client: remove cytometer API functions, support API for downloading sample cluster events

Nov 2015

  • Update to FlowStats to handle very large FCS data sets by chunking the classification step
  • Continued rewriting documentation, including auditing requirements and verifying the ReFlow Worker setup
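
The idea of chunking the classification step can be sketched as follows. This is not the FlowStats code: to stay self-contained it swaps in a simple nearest-mean classifier in place of the mixture-model classification, and all names and the default `chunk_size` are assumptions. The point is only the control flow: events are consumed a fixed number at a time, so the per-chunk working set stays bounded regardless of how large the data set is.

```python
from itertools import islice


def classify_chunk(chunk, means):
    """Label each event with the index of its nearest mean
    (squared Euclidean distance)."""
    labels = []
    for event in chunk:
        dists = [
            sum((x - m) ** 2 for x, m in zip(event, mean))
            for mean in means
        ]
        labels.append(dists.index(min(dists)))
    return labels


def classify_in_chunks(events, means, chunk_size=10000):
    """Classify any iterable of events, chunk_size events at a time."""
    iterator = iter(events)
    labels = []
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            break
        labels.extend(classify_chunk(chunk, means))
    return labels
```

Because each event is classified independently, the result does not depend on the chunk size; only peak memory use does.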