Gee Fu is an application that holds Gene Feature data. It has been designed with the needs of researchers wanting to keep, share and annotate sequence and feature data. Gee Fu is a Ruby on Rails based RESTful web-service application that stores and serves sequence assembly and genome feature data on request.
GeeFu can be used as a base that can be easily extended into fuller customised web-applications using the powerful Rails framework.
Gee Fu is ideally suited to serving large amounts of data such as those from high-throughput sequencing experiments via bio-samtools and BAM files.
Gee Fu is capable of receiving and handling requests from AnnoJ , a web service based viewing engine for genomic data. It can return JSON data which AnnoJ is able to render. We anticipate being able to serve up data in formats suitable for different applications as development progresses and we become aware of other rendering engines and web services that request data.
- Install rvm using:
\curl -L https://get.rvm.io | bash -s stable --rails --autolibs=enabled --ruby=1.9.3-p286
- Install redis (on OSX with homebrew use)
brew install redis
- Follow instructions given by
brew info redis
to have launchd start redis at login - Install git (on OSX with homebrew use)
brew install git
- Install postgres (on OSX use PostgresApp)
- Clone the GeeFU repo
git clone git://github.com/danmaclean/gee_fu.git .
cd gee_fu
- Accept the .rvmrc as trusted
- Install bundler
gem install bundler
- Setup the repo
bundle install
- Signup for a Mandrill account and grab an API key
- Create a .env file in the root folder with:
EMAIL_SENDER=My Name <[email protected]>
[email protected]
MANDRILL_APIKEY=[insert API key]
- Create a Postgres user called gee_fu with no password
- Change the attributes of Postgres user gee_fu to allow database creation
ALTER USER gee_fu CREATEDB;
- Run
rake db:setup
- Follow the instructions at README_FOR_DATA.mdown
- Start the gee-fu server with
foreman start
- Visit http://localhost:5000
- Once you've signed up as the first user use
rake admin:set [email protected]
to set your account as an admin to add Organisms
The AnnoJ view requires a JSON format configuration file, to tell it about the genome and the service etc. Gee Fu allows you to maintain a YAML file and convert it to the RAILS_ROOT/public/javascripts/config.js
AnnoJ uses. Edit the file RAILS_ROOT/config/config.yml
to set up the AnnoJ view.
The most important are the tracks stanzas, this is where you tell AnnoJ the tracks that should be available in your browser.
There are 3 track types:
- ModelsTrack, for aggregated transcript style gene models
- RepeatsTrack, for strandless block objects like repeats and
- ReadsTrack, for high throughput se- quencing reads that will be be shown at various zoom-levels. The reference sequence Gee Fu provides is sent to AnnoJ as if it is a big read. Here are the different configuration options for the rest
id : a numeric unique id for the track
name: free text name
type : one of ModelsTrack , RepeatsTrack , ReadsTrack
data : /features/[annoj|chromosome]/experiment_id
showControls : 'true' to see the individual lane control button or leave absent
height : 80 start up height of the track in pixels
minHeight : 20 height of track when minimised
maxHeight : 60 # height of track when maximised
single : # 'true' if the track is strandless otherwise absent
AnnoJ requires certain metadata to understand what it is that it is supposed to be rendering. This is required at a genome and individual track level. When you create an experiment or genome you have the option to add a yml metadata file. As a minimum you should include the following information for your genome:
−−−
institution :
name: Your Institute
logo : images/ institute_logo .png
url : http ://www.your_place.etc
service :
format : Unspecified
title : Sample Gee Fu served browser (TAIR 9 Gene Models)
server : Unspecified
version : Vers 1
access: public
description: Free text description of the service
engineer:
name: Mick Jagger
email : [email protected]
genome:
version : TAIR9
description : Chr1 from TAIR9
species : Arabidopsis thaliana
The schema is very straightforward and easily extended. It consists of a central Features table and a many to many join table Parents that indicates which features are parents (according to their gff records) of which other features.
You can extend the database exactly as if it were any other Rails application. See the Rails documentation for conventions for creating and naming new tables, Rails prefers convention over configuration so you should pay attention to these. Adding new functionality to the web app is covered by the same documentation.
If you get really frustrated with the software, feel free to complain to me [email protected]
. A lot of your initial problems will be answered in the Rails community pages, please look there if your problem looks like it might be related to Rails more directly than this particular instance of a Rails app.