File tree

The file tree looks roughly like this:

- cache*
-- <short_name>
--- processed
---- cleaned   <- Gold standard clean data
---- merged    <- Intermediate files
--- raw        <- Raw copies of cached data
---- doi
---- pubmed
---- scopus
---- zotero
--- geodata    <- Derived geocoded organisations
- config       <- Config files
- data*        <- Output CSV files etc.
-- <short_name>
- html*        <- Output HTML files
-- <short_name>
- logs*
- source        <- All the source code
-- add          <- Add extra metadata (geocode, citations etc)
-- analyse      <- Do some stats
-- bibliography <- Create bibliographic files
-- clean        <- Clean up the metadata
-- config       <- Parse the config file.
-- get          <- Get the metadata
-- networks     <- Generate author network graph
-- plots        <- Generate some plots of the exported data
-- setup        <- Build the file tree (deleting caches as appropriate)
-- web_pages    <- Make the static html files

Folders marked with a * are generated when the program runs and can be safely deleted, although deleting the cache will make the next run take much longer.
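As a rough illustration of how the generated folders relate to the tree above, here is a minimal Python sketch. It is not the project's actual setup code: the function names `build_tree` and `clear_generated`, the `keep_cache` flag, and the choice of Python are all assumptions made for this example. Only the folder names come from the tree.

```python
from pathlib import Path
import shutil

# Top-level folders marked with * in the tree (generated at run time).
GENERATED = ("cache", "data", "html", "logs")


def build_tree(root: Path, short_name: str) -> list[Path]:
    """Create the generated folders for one project (hypothetical helper)."""
    cache = root / "cache" / short_name
    folders = [
        cache / "processed" / "cleaned",
        cache / "processed" / "merged",
        *(cache / "raw" / source for source in ("doi", "pubmed", "scopus", "zotero")),
        cache / "geodata",
        root / "data" / short_name,
        root / "html" / short_name,
        root / "logs",
    ]
    for folder in folders:
        # exist_ok=True leaves an existing cache untouched.
        folder.mkdir(parents=True, exist_ok=True)
    return folders


def clear_generated(root: Path, keep_cache: bool = True) -> None:
    """Delete generated folders; keep the cache by default (assumed policy)."""
    for name in GENERATED:
        if keep_cache and name == "cache":
            continue
        shutil.rmtree(root / name, ignore_errors=True)
```

Keeping the cache by default reflects the note above: the other generated folders are cheap to rebuild, while an empty cache makes the next run much slower.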
