diff --git a/docs/categories/index.xml b/docs/categories/index.xml
index d754d1a..02b45be 100644
--- a/docs/categories/index.xml
+++ b/docs/categories/index.xml
@@ -1,10 +1,11 @@
- Categories on Sifter
- https://bmeg.github.io/sifter/categories/
- Recent content in Categories on Sifter
+ Categories on
+ /categories/
+ Recent content in Categories on Hugo -- gohugo.io
- en-us
+ en
+
diff --git a/docs/index.xml b/docs/index.xml
index 2b517ba..86fd22d 100644
--- a/docs/index.xml
+++ b/docs/index.xml
@@ -1,338 +1,11 @@
- Sifter
- https://bmeg.github.io/sifter/
- Recent content on Sifter
+
+ /
+ Recent content on Hugo -- gohugo.io
- en-us
-
- accumulate
- https://bmeg.github.io/sifter/docs/transforms/accumulate/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/accumulate/
- accumulate Gather sequential rows into a single record, based on matching a field
-Parameters name Type Description field string (field path) Field used to match rows dest string field to store accumulated records Example - accumulate: field: model_id dest: rows
-
-
-
- avroLoad
- https://bmeg.github.io/sifter/docs/inputs/avroload/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/inputs/avroload/
- avroLoad Load an AvroFile
-Parameters name Description input Path to input file
-
-
-
- clean
- https://bmeg.github.io/sifter/docs/transforms/clean/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/clean/
- clean Remove fields that don’t appear in the desingated list.
-Parameters name Type Description fields [] string Fields to keep removeEmpty bool Fields with empty values will also be removed storeExtra string Field name to store removed fields Example - clean: fields: - id - synonyms
-
-
-
- debug
- https://bmeg.github.io/sifter/docs/transforms/debug/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/debug/
- debug Print out copy of stream to logging
-Parameters name Type Description label string Label for log output format bool Use multiline spaced output Example - debug: {}
-
-
-
- distinct
- https://bmeg.github.io/sifter/docs/transforms/distinct/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/distinct/
- distinct Using templated value, allow only the first record for each distinct key
-Parameters name Type Description value string Key used for distinct value Example - distinct: value: "{{row.key}}"
-
-
-
- embedded
- https://bmeg.github.io/sifter/docs/inputs/embedded/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/inputs/embedded/
- embedded Load data from embedded structure
-Example inputs: data: embedded: - { "name" : "Alice", "age": 28 } - { "name" : "Bob", "age": 27 }
-
-
-
- emit
- https://bmeg.github.io/sifter/docs/transforms/emit/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/emit/
- emit Send data to output file. The naming of the file is outdir/script name.pipeline name.emit name.json.gz
-Parameters name Type Description name string Name of emit value example - emit: name: protein_compound_association
-
-
-
- Example
- https://bmeg.github.io/sifter/docs/example/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/example/
- Example Pipeline Our first task will be to convert a ZIP code TSV into a set of county level entries.
-The input file looks like:
-ZIP,COUNTYNAME,STATE,STCOUNTYFP,CLASSFP 36003,Autauga County,AL,01001,H1 36006,Autauga County,AL,01001,H1 36067,Autauga County,AL,01001,H1 36066,Autauga County,AL,01001,H1 36703,Autauga County,AL,01001,H1 36701,Autauga County,AL,01001,H1 36091,Autauga County,AL,01001,H1 First is the header of the pipeline. This declares the unique name of the pipeline and it’s output directory.
-name: zipcode_map outdir: ./ docs: Converts zipcode TSV into graph elements Next the configuration is declared.
-
-
-
- fieldParse
- https://bmeg.github.io/sifter/docs/transforms/fieldparse/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/fieldparse/
-
-
-
-
- fieldProcess
- https://bmeg.github.io/sifter/docs/transforms/fieldprocess/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/fieldprocess/
- fieldProcess Create stream of objects based on the contents of a field. If the selected field is an array each of the items in the array will become an independent row.
-Parameters name Type Description field string Name of field to be processed mapping map[string]string Project templated values into child element itemField string If processing an array of non-dict elements, create a dict as {itemField:element} example - fieldProcess: field: portions mapping: sample: "{{row.
-
-
-
- fieldType
- https://bmeg.github.io/sifter/docs/transforms/fieldtype/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/fieldtype/
- fieldType Set field to specific type, ie cast as float or integer
-example - fieldType: t_depth: int t_ref_count: int t_alt_count: int n_depth: int n_ref_count: int n_alt_count: int start: int
-
-
-
- filter
- https://bmeg.github.io/sifter/docs/transforms/filter/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/filter/
- filter Filter rows in stream using a number of different methods
-Parameters name Type Description field string (field path) Field used to match rows value string (template string) Template string to match against match string String to match against check string How to check value, ’exists’ or ‘hasValue’ method string Method name python string Python code string gpython string Python code string run using (https://github.com/go-python/gpython) Example Field based match
-- filter: field: table match: source_statistics Check based match
-
-
-
- from
- https://bmeg.github.io/sifter/docs/transforms/from/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/from/
- from Parmeters Name of data source
-Example inputs: profileReader: tableLoad: input: "{{config.profiles}}" pipelines: profileProcess: - from: profileReader
-
-
-
- glob
- https://bmeg.github.io/sifter/docs/inputs/glob/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/inputs/glob/
- glob Scan files using * based glob statement and open all files as input.
-Parameters Name Description storeFilename Store value of filename in parameter each row input Path of avro object file to transform xmlLoad xmlLoad configutation tableLoad Run transform pipeline on a TSV or CSV jsonLoad Run a transform pipeline on a multi line json file avroLoad Load data from avro file Example inputs: pubmedRead: glob: input: "{{config.baseline}}/*.xml.gz" xmlLoad: {}
-
-
-
- graphBuild
- https://bmeg.github.io/sifter/docs/transforms/graphbuild/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/graphbuild/
- graphBuild Build graph elements from JSON objects using the JSON Schema graph extensions.
-example - graphBuild: schema: "{{config.allelesSchema}}" title: Allele
-
-
-
- gripperLoad
- https://bmeg.github.io/sifter/docs/inputs/gripperload/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/inputs/gripperload/
-
-
-
-
- hash
- https://bmeg.github.io/sifter/docs/transforms/hash/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/hash/
- hash Parameters name Type Description field string Field to store hash value value string Templated string of value to be hashed method string Hashing method: sha1/sha256/md5 example - hash: value: "{{row.contents}}" field: contents-sha1 method: sha1
-
-
-
- Inputs
- https://bmeg.github.io/sifter/docs/inputs/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/inputs/
- Every playbook consists of a series of inputs.
-
-
-
- jsonLoad
- https://bmeg.github.io/sifter/docs/inputs/jsonload/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/inputs/jsonload/
- jsonLoad Load data from a JSON file. Default behavior expects a single dictionary per line. Each line is a seperate entry. The multiline parameter reads all of the lines of the files and returns a single object.
-Parameters name Description input Path of JSON file to transform multiline Load file as a single multiline JSON object Example inputs: caseData: jsonLoad: input: "{{config.casesJSON}}"
-
-
-
- lookup
- https://bmeg.github.io/sifter/docs/transforms/lookup/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/lookup/
- lookup Using key from current row, get values from a reference source
-Parameters name Type Description replace string (field path) Field to replace lookup string (template string) Key to use for looking up data copy map[string]string Copy values from record that was found by lookup. The Key/Value record uses the Key as the destination field and copies the field from the retrieved records using the field named in Value tsv TSVTable TSV translation table file json JSONTable JSON data file table LookupTable Inline lookup table pipeline PipelineLookup Use output of a pipeline as a lookup table Example JSON file based lookup The JSON file defined by config.
-
-
-
- map
- https://bmeg.github.io/sifter/docs/transforms/map/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/map/
- map Run function on every row
-Parameters name Description method Name of function to call python Python code to be run gpython Python code to be run using GPython Example - map: method: response gpython: | def response(x): s = sorted(x["curve"].items(), key=lambda x:float(x[0])) x['dose_um'] = [] x['response'] = [] for d, r in s: try: dn = float(d) rn = float(r) x['dose_um'].append(dn) x['response'].append(rn) except ValueError: pass return x
-
-
-
- objectValidate
- https://bmeg.github.io/sifter/docs/transforms/objectvalidate/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/objectvalidate/
- objectValidate Use JSON schema to validate row contents
-parameters name Type Description title string Title of object to use for validation schema string Path to JSON schema definition example - objectValidate: title: Aliquot schema: "{{config.schema}}"
-
-
-
- Overview
- https://bmeg.github.io/sifter/docs/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/
- Sifter pipelines Sifter pipelines process steams of nested JSON messages. Sifter comes with a number of file extractors that operate as inputs to these pipelines. The pipeline engine connects togeather arrays of transform steps into direct acylic graph that is processed in parallel.
-Example Message:
-{ "firstName" : "bob", "age" : "25" "friends" : [ "Max", "Alex"] } Once a stream of messages are produced, that can be run through a transform pipeline.
-
-
-
- Pipeline Steps
- https://bmeg.github.io/sifter/docs/transforms/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/
- Transforms alter the data
-
-
-
- project
- https://bmeg.github.io/sifter/docs/transforms/project/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/project/
- project Populate row with templated values
-parameters name Type Description mapping map[string]any New fields to be generated from template rename map[string]string Rename field (no template engine) Example - project: mapping: type: sample id: "{{row.sample_id}}"
-
-
-
- reduce
- https://bmeg.github.io/sifter/docs/transforms/reduce/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/reduce/
- reduce Using key from rows, reduce matched records into a single entry
-Parameters name Type Description field string (field path) Field used to match rows method string Method name python string Python code string gpython string Python code string run using (https://github.com/go-python/gpython) init map[string]any Data to use for first reduce Example - reduce: field: dataset_name method: merge init: { "compounds" : [] } gpython: | def merge(x,y): x["compounds"] = list(set(y["compounds"]+x["compounds"])) return x
-
-
-
- regexReplace
- https://bmeg.github.io/sifter/docs/transforms/regexreplace/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/regexreplace/
-
-
-
-
- Sifter Pipeline File
- https://bmeg.github.io/sifter/docs/playbook/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/playbook/
- Pipeline File An sifter pipeline file is in YAML format and describes an entire processing pipelines. If is composed of the following sections: config, inputs, pipelines, outputs. In addition, for tracking, the file will also include name and class entries.
-class: sifter name: <script name> outdir: <where output files should go, relative to this file> config: <config key>: <config value> <config key>: <config value> # values that are referenced in pipeline parameters for # files will be treated like file paths and be # translated to full paths inputs: <input name>: <input driver>: <driver config> pipelines: <pipeline name>: # all pipelines must start with a from step - from: <name of input or pipeline> - <transform name>: <transform parameters> outputs: <output name>: <output driver>: <driver config>
-
-
-
- split
- https://bmeg.github.io/sifter/docs/transforms/split/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/transforms/split/
- split Split a field using string sep
-Parameters name Type Description field string Field to the split sep string String to use for splitting Example - split: field: methods sep: ";"
-
-
-
- sqldump
- https://bmeg.github.io/sifter/docs/inputs/sqldump/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/inputs/sqldump/
- sqlDump Scan file produced produced from sqldump.
-Parameters Name Type Description input string Path to the SQL dump file tables []string Names of tables to read out Example inputs: database: sqldumpLoad: input: "{{config.sql}}" tables: - cells - cell_tissues - dose_responses - drugs - drug_annots - experiments - profiles
-
-
-
- sqliteLoad
- https://bmeg.github.io/sifter/docs/inputs/sqliteload/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/inputs/sqliteload/
- sqliteLoad Extract data from an sqlite file
-Parameters Name Type Description input string Path to the SQLite file query string SQL select statement based input Example inputs: sqlQuery: sqliteLoad: input: "{{config.sqlite}}" query: "select * from drug_mechanism as a LEFT JOIN MECHANISM_REFS as b on a.MEC_ID=b.MEC_ID LEFT JOIN TARGET_COMPONENTS as c on a.TID=c.TID LEFT JOIN COMPONENT_SEQUENCES as d on c.COMPONENT_ID=d.COMPONENT_ID LEFT JOIN MOLECULE_DICTIONARY as e on a.MOLREGNO=e.MOLREGNO"
-
-
-
- tableLoad
- https://bmeg.github.io/sifter/docs/inputs/tableload/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/inputs/tableload/
- tableLoad Extract data from tabular file, includiong TSV and CSV files.
-Parameters Name Type Description input string File to be transformed rowSkip int Number of header rows to skip columns []string Manually set names of columns extraColumns string Columns beyond originally declared columns will be placed in this array sep string Separator \t for TSVs or , for CSVs Example config: gafFile: ../../source/go/goa_human.gaf.gz inputs: gafLoad: tableLoad: input: "{{config.gafFile}}" columns: - db - id - symbol - qualifier - goID - reference - evidenceCode - from - aspect - name - synonym - objectType - taxon - date - assignedBy - extension - geneProduct
-
-
-
- xmlLoad
- https://bmeg.github.io/sifter/docs/inputs/xmlload/
- Mon, 01 Jan 0001 00:00:00 +0000
-
- https://bmeg.github.io/sifter/docs/inputs/xmlload/
- xmlLoad Load an XML file
-Parameters name Description input Path to input file Example inputs: loader: xmlLoad: input: "{{config.xmlPath}}"
-
-
+ en
+
diff --git a/docs/sitemap.xml b/docs/sitemap.xml
index 63f0281..cd5ab70 100644
--- a/docs/sitemap.xml
+++ b/docs/sitemap.xml
@@ -2,79 +2,10 @@
- https://bmeg.github.io/sifter/
- 0
+ /
- https://bmeg.github.io/sifter/docs/transforms/accumulate/
+ /categories/
- https://bmeg.github.io/sifter/docs/inputs/avroload/
-
- https://bmeg.github.io/sifter/categories/
-
- https://bmeg.github.io/sifter/docs/transforms/clean/
-
- https://bmeg.github.io/sifter/docs/transforms/debug/
-
- https://bmeg.github.io/sifter/docs/transforms/distinct/
-
- https://bmeg.github.io/sifter/docs/
-
- https://bmeg.github.io/sifter/docs/inputs/embedded/
-
- https://bmeg.github.io/sifter/docs/transforms/emit/
-
- https://bmeg.github.io/sifter/docs/example/
-
- https://bmeg.github.io/sifter/docs/transforms/fieldparse/
-
- https://bmeg.github.io/sifter/docs/transforms/fieldprocess/
-
- https://bmeg.github.io/sifter/docs/transforms/fieldtype/
-
- https://bmeg.github.io/sifter/docs/transforms/filter/
-
- https://bmeg.github.io/sifter/docs/transforms/from/
-
- https://bmeg.github.io/sifter/docs/inputs/glob/
-
- https://bmeg.github.io/sifter/docs/transforms/graphbuild/
-
- https://bmeg.github.io/sifter/docs/inputs/gripperload/
-
- https://bmeg.github.io/sifter/docs/transforms/hash/
-
- https://bmeg.github.io/sifter/docs/inputs/
-
- https://bmeg.github.io/sifter/docs/inputs/jsonload/
-
- https://bmeg.github.io/sifter/docs/transforms/lookup/
-
- https://bmeg.github.io/sifter/docs/transforms/map/
-
- https://bmeg.github.io/sifter/docs/transforms/objectvalidate/
-
- https://bmeg.github.io/sifter/docs/
-
- https://bmeg.github.io/sifter/docs/transforms/
-
- https://bmeg.github.io/sifter/docs/transforms/project/
-
- https://bmeg.github.io/sifter/docs/transforms/reduce/
-
- https://bmeg.github.io/sifter/docs/transforms/regexreplace/
-
- https://bmeg.github.io/sifter/docs/playbook/
-
- https://bmeg.github.io/sifter/docs/transforms/split/
-
- https://bmeg.github.io/sifter/docs/inputs/sqldump/
-
- https://bmeg.github.io/sifter/docs/inputs/sqliteload/
-
- https://bmeg.github.io/sifter/docs/inputs/tableload/
-
- https://bmeg.github.io/sifter/tags/
-
- https://bmeg.github.io/sifter/docs/inputs/xmlload/
+ /tags/
diff --git a/docs/tags/index.xml b/docs/tags/index.xml
index 54ec252..99ec5b4 100644
--- a/docs/tags/index.xml
+++ b/docs/tags/index.xml
@@ -1,10 +1,11 @@
- Tags on Sifter
- https://bmeg.github.io/sifter/tags/
- Recent content in Tags on Sifter
+ Tags on
+ /tags/
+ Recent content in Tags on Hugo -- gohugo.io
- en-us
+ en
+
diff --git a/website/content/docs.md b/website/content/docs.md
index 0a5c479..d047861 100644
--- a/website/content/docs.md
+++ b/website/content/docs.md
@@ -11,12 +11,12 @@ menu:
 Sifter pipelines process streams of nested JSON messages. Sifter comes with a number of
file extractors that operate as inputs to these pipelines. The pipeline engine
-connects togeather arrays of transform steps into direct acylic graph that is processed
+connects together arrays of transform steps into a directed acyclic graph that is processed
in parallel.
Example Message:
-```
+```json
{
"firstName" : "bob",
"age" : "25"
@@ -37,3 +37,109 @@ be done in a transform pipeline these include:
- Table based field translation
 - Outputting the message as a JSON Schema checked object
+
+# Script structure
+
+## Header
+Each sifter file starts with a set of fields that let the software know this is a sifter script, and not some random YAML file. There is also a `name` field for the script; this name will be used for output file creation and logging. Finally, there is an `outdir` field that defines the directory where all output files will be placed. All paths are relative to the script file, so an `outdir` of `my-results` will create the directory `my-results` in the same directory as the script file, regardless of where the sifter command is invoked.
+```yaml
+class: sifter
+name:
+outdir:
+```
+
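+For illustration, a filled-in header might look like this (the name and output directory are borrowed from the zipcode example elsewhere in these docs):
+```yaml
+class: sifter
+name: zipcode_map
+outdir: ./
+```
+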
+# Config and templating
+The `config` section is a set of defined keys that are used throughout the rest of the script.
+
+Example config:
+```yaml
+config:
+  sqlite: ../../source/chembl/chembl_33/chembl_33_sqlite/chembl_33.db
+  uniprot2ensembl: ../../tables/uniprot2ensembl.tsv
+  schema: ../../schema/
+```
+
+Various fields in the script file will be parsed using a [Mustache](https://mustache.github.io/) template engine. For example, the template `{{config.sqlite}}` accesses the `sqlite` value defined in the config block.
+
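+For instance, an input driver can pick up a config value through a template. A minimal sketch, assuming the `sqliteLoad` input driver described in the input docs (the query is illustrative):
+```yaml
+inputs:
+  sqlQuery:
+    sqliteLoad:
+      input: "{{config.sqlite}}"
+      query: "select * from drug_mechanism"
+```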
+
+# Inputs
+The input block defines the various data extractors that will be used to open resources and create streams of JSON messages for processing. The possible input engines include:
+ - AVRO
+ - JSON
+ - XML
+ - SQL-dump
+ - SQLite
+ - TSV/CSV
+ - GLOB
+
+For any other file types, there is also a plugin option to allow the user to call their own code for opening files.
+
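+A typical input block, sketched using the `jsonLoad` driver from the input docs (the input name and config key are illustrative):
+```yaml
+inputs:
+  caseData:
+    jsonLoad:
+      input: "{{config.casesJSON}}"
+```
+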
+# Pipeline
+The `pipelines` section defines a set of named processing pipelines that can be used to transform data. Each pipeline starts with a `from` statement that defines where its data comes from. It then defines a linear set of transforms that are chained together to do the processing. Pipelines may use `emit` steps to output messages to disk. The possible data transform steps include (a minimal pipeline sketch follows the list):
+- Accumulate
+- Clean
+- Distinct
+- DropNull
+- Field Parse
+- Field Process
+- Field Type
+- Filter
+- FlatMap
+- GraphBuild
+- Hash
+- JSON Parse
+- Lookup
+- Value Mapping
+- Object Validation
+- Project
+- Reduce
+- Regex
+- Split
+- UUID Generation
+
+Additionally, users are able to define their own transform step types using the `plugin` step.
+
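+A minimal pipeline sketch, with step parameters borrowed from the individual transform pages (the input and emit names are illustrative):
+```yaml
+pipelines:
+  profileProcess:
+    - from: profileReader
+    - filter:
+        field: table
+        match: source_statistics
+    - emit:
+        name: profiles
+```
+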
+# Example script
+```yaml
+class: sifter
+
+name: go
+outdir: ../../output/go/
+
+config:
+  oboFile: ../../source/go/go.obo
+  schema: ../../schema
+
+inputs:
+  oboData:
+    plugin:
+      commandLine: ../../util/obo_reader.py {{config.oboFile}}
+
+pipelines:
+  transform:
+    - from: oboData
+    - project:
+        mapping:
+          submitter_id: "{{row.id[0]}}"
+          case_id: "{{row.id[0]}}"
+          id: "{{row.id[0]}}"
+          go_id: "{{row.id[0]}}"
+          project_id: "gene_ontology"
+          namespace: "{{row.namespace[0]}}"
+          name: "{{row.name[0]}}"
+    - map:
+        method: fix
+        gpython: |
+          def fix(row):
+              row['definition'] = row['def'][0].strip('"')
+              if 'xref' not in row:
+                  row['xref'] = []
+              if 'synonym' not in row:
+                  row['synonym'] = []
+              return row
+    - objectValidate:
+        title: GeneOntologyTerm
+        schema: "{{config.schema}}"
+    - emit:
+        name: term
+```
\ No newline at end of file
diff --git a/website/content/docs/inputs/plugin.md b/website/content/docs/inputs/plugin.md
new file mode 100644
index 0000000..3b560e3
--- /dev/null
+++ b/website/content/docs/inputs/plugin.md
@@ -0,0 +1,74 @@
+---
+title: plugin
+menu:
+ main:
+ parent: inputs
+ weight: 100
+---
+
+# plugin
+Run a user program for customized data extraction.
+
+## Example
+
+```yaml
+inputs:
+  oboData:
+    plugin:
+      commandLine: ../../util/obo_reader.py {{config.oboFile}}
+```
+
+The plugin program is expected to write JSON messages, one per line, to STDOUT; these messages are then
+passed to the transform pipelines.
+
+## Example Plugin
+The `obo_reader.py` plugin reads an OBO file, such as the ones that describe the Gene Ontology, and emits the
+records as single-line JSON messages.
+```python
+#!/usr/bin/env python
+
+import re
+import sys
+import json
+
+re_section = re.compile(r'^\[(.*)\]')
+re_field = re.compile(r'^(\w+): (.*)$')
+
+def obo_parse(handle):
+    rec = None
+    for line in handle:
+        res = re_section.search(line)
+        if res:
+            # a new [Section] header closes out the current record
+            if rec is not None:
+                yield rec
+                rec = None
+            # only [Term] sections are collected
+            if res.group(1) == "Term":
+                rec = {"type": res.group(1)}
+        else:
+            if rec is not None:
+                res = re_field.search(line)
+                if res:
+                    key = res.group(1)
+                    val = res.group(2)
+                    # drop trailing " ! " comments and parenthesized notes
+                    val = re.split(r" ! | \(|\)", val)
+                    val = ":".join(val[0:3])
+                    if key in rec:
+                        rec[key].append(val)
+                    else:
+                        rec[key] = [val]
+
+    if rec is not None:
+        yield rec
+
+
+def unquote(s):
+    res = re.search(r'"(.*)"', s)
+    if res:
+        return res.group(1)
+    return s
+
+
+with open(sys.argv[1]) as handle:
+    for rec in obo_parse(handle):
+        print(json.dumps(rec))
+```
\ No newline at end of file
diff --git a/website/content/docs/inputs/gripperLoad.md b/website/content/docs/transforms/flatmap.md
similarity index 50%
rename from website/content/docs/inputs/gripperLoad.md
rename to website/content/docs/transforms/flatmap.md
index ac738a1..c880db2 100644
--- a/website/content/docs/inputs/gripperLoad.md
+++ b/website/content/docs/transforms/flatmap.md
@@ -1,7 +1,7 @@
---
-title: gripperLoad
+title: flatMap
menu:
main:
- parent: inputs
+ parent: transforms
weight: 100
---
diff --git a/website/content/docs/transforms/plugin.md b/website/content/docs/transforms/plugin.md
new file mode 100644
index 0000000..09cbf31
--- /dev/null
+++ b/website/content/docs/transforms/plugin.md
@@ -0,0 +1,7 @@
+---
+title: plugin
+menu:
+ main:
+ parent: transforms
+ weight: 100
+---
diff --git a/website/content/docs/transforms/tableWrite.md b/website/content/docs/transforms/tableWrite.md
new file mode 100644
index 0000000..7a39648
--- /dev/null
+++ b/website/content/docs/transforms/tableWrite.md
@@ -0,0 +1,7 @@
+---
+title: tableWrite
+menu:
+ main:
+ parent: transforms
+ weight: 100
+---
diff --git a/website/content/docs/transforms/uuid.md b/website/content/docs/transforms/uuid.md
new file mode 100644
index 0000000..0127b15
--- /dev/null
+++ b/website/content/docs/transforms/uuid.md
@@ -0,0 +1,7 @@
+---
+title: uuid
+menu:
+ main:
+ parent: transforms
+ weight: 100
+---