Portage CEG Data Curation Survival Guide

This is the repository for the Data Curation Tools & Resources list, part of the Portage Curation Expert Group (CEG) Data Curation Survival Guide.

How to add resources and update this site:

Submissions to the list are made through a Google Form, which populates a Google Sheet. Members of the Portage Curation Expert Group are responsible for reviewing, editing, and approving entries. Once an item is approved, it appears on the Data Curation Tools & Resources list.

Submitting a resource

If you'd like to contribute a resource or tool, perform the following steps:

Check the previous form responses to ensure that your tool/resource has not been previously submitted.
Submit new entries using this Google Form.
Submitted entries will populate this Google Sheet, awaiting final review and approval.

Reviewing submissions (CEG Admins)

When a new submission arrives, review the submitted information. Once you're ready to publish the entry to the Survival Guide, set the value in the Ready to Publish? field (Column I) to YES. This will make it appear in the To Publish tab. To unpublish an entry, set the value in the Ready to Publish? field (Column I) to NO.

Some cleanup is carried out automatically in the sheet:

Colons (":") and forward slashes ("/") are automatically removed from the Resource Title field and populated into the Resource Title (for export) field, as those special characters make Jekyll do funny things.
The field pagename is generated automatically.
Entries in the Tags field are automatically normalized and created in the Tags (for export) field.
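
For anyone who needs to reproduce the title cleanup outside the sheet, the rule described above can be sketched in Python. This is only an illustration of the stated rule (colons and forward slashes are removed, not replaced); the function name clean_title is hypothetical and this is not the formula the sheet itself uses.

```python
def clean_title(title: str) -> str:
    """Drop colons and forward slashes, the characters the sheet strips before export."""
    return title.replace(":", "").replace("/", "")


if __name__ == "__main__":
    # Prints: Tools CSVTSV validators
    print(clean_title("Tools: CSV/TSV validators"))
```

Because the characters are removed rather than replaced with a space, words on either side of a slash run together, so a title may still need a manual edit in the Resource Title (for export) field.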

Some considerations for cleanup:

Ensure that the following fields have the desired information. Edit them, if needed:
- Resource Title (for export) - if editing is required copy (Ctrl+c) and paste as plain text (Ctrl+Shift+v) before editing
- Resource URL - ensure that the link is formatted correctly and works
- Tags (for export) - should be cleaned up, though new entries may have arrived at the end of the list as an "Other" entry. It's good to review these entries every so often to determine whether another tag is required (it can be added to the Google Form question).
- Ensure the Resource Description (Markdown supported) field is written in Markdown-formatted text, and that the resource URL is included in the text (using proper syntax).
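
Two of the checks above can be partly automated. The sketch below is a hypothetical helper, not part of this repo: it assumes each row is a dict keyed by the field names used in this README (the real CSV headers may differ), and it only checks that the URL is well formed and appears in the description. It does not test whether the link actually works.

```python
from urllib.parse import urlparse

# Field names as written in this README; the CSV headers are assumed to match.
URL_FIELD = "Resource URL"
DESCRIPTION_FIELD = "Resource Description (Markdown supported)"


def entry_problems(row: dict) -> list:
    """Return a list of human-readable problems found in one sheet row."""
    problems = []
    url = row.get(URL_FIELD, "").strip()
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("Resource URL is not a well-formed http(s) link")
    elif url not in row.get(DESCRIPTION_FIELD, ""):
        problems.append("Resource URL does not appear in the description")
    return problems
```

An empty list means neither problem was found; the entry still needs a human read before it is marked YES.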

Updating the survival guide pages (Site Admins)

The Python3 script csv_to_jekyll_portageceg.py updates the posts/pages by doing the following:

download the newest version of the ToPublish tab (in .csv format)
remove all existing markdown (.md) files in the _posts/ directory.
create a new markdown file in the _posts/ directory for each row in the downloaded csv, with proper yaml front matter and body.
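
The last two steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of csv_to_jekyll_portageceg.py: the row keys (title, date, tags, pagename, description), the comma-separated tags, and the function names are all hypothetical, and the download step is left out because the sheet's export URL is not given in this README.

```python
import csv
from pathlib import Path


def load_rows(csv_path: Path) -> list:
    """Read the downloaded CSV into a list of dicts, one per row."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def render_post(row: dict) -> str:
    """Build the text of one post: YAML front matter, then the Markdown body."""
    title = row["title"].replace('"', '\\"')  # keep the quoted YAML string valid
    tags = "\n".join(
        f"  - {tag.strip()}" for tag in row["tags"].split(",") if tag.strip()
    )
    return (
        "---\n"
        "layout: post\n"
        f'title: "{title}"\n'
        f"date: {row['date']}\n"
        f"tags:\n{tags}\n"
        f"permalink: /{row['pagename']}/\n"
        "---\n\n"
        f"{row['description']}\n"
    )


def rebuild_posts(rows: list, posts_dir: Path) -> int:
    """Delete every .md file in posts_dir, then write one file per row."""
    posts_dir.mkdir(parents=True, exist_ok=True)
    for old in posts_dir.glob("*.md"):
        old.unlink()
    for row in rows:
        name = f"{row['date']}-{row['pagename']}.md"  # posts need a dated filename
        (posts_dir / name).write_text(render_post(row), encoding="utf-8")
    return len(rows)
```

Deleting and regenerating every file keeps the _posts/ directory an exact mirror of the published rows, so an entry set back to NO disappears on the next run without any extra bookkeeping.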

The full process to update the pages is as follows:

pull changes from the remote repository
run the csv_to_jekyll_portageceg.py script in a local repository
add, commit, and push the changes to this GitHub repo.
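
The three steps above can be sketched in Python for anyone who prefers not to use a shell script. This is an illustration only and not the repo's auto_update-example.sh; the function name run_update and the default commit message are hypothetical, and it assumes git is on the PATH and that the machine already has push rights.

```python
import subprocess
import sys
from pathlib import Path

SCRIPT = "csv_to_jekyll_portageceg.py"


def run_update(repo_dir: Path, message: str = "Automatic update of posts") -> bool:
    """Pull, regenerate the posts, and push. Returns True if a commit was pushed."""

    def git(*args, check=True):
        return subprocess.run(["git", *args], cwd=repo_dir, check=check)

    git("pull")
    # Run the conversion script with the same interpreter that runs this file.
    subprocess.run([sys.executable, SCRIPT], cwd=repo_dir, check=True)
    git("add", "--all")
    # `git diff --cached --quiet` exits with 0 when nothing is staged.
    if git("diff", "--cached", "--quiet", check=False).returncode == 0:
        return False
    git("commit", "-m", message)
    git("push")
    return True


if __name__ == "__main__":
    run_update(Path.cwd())
```

Checking for staged changes before committing avoids a failed `git commit` on runs where the sheet has not changed.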

As of 28-Feb-2020, this script is run automatically every 60 minutes. This is currently taking place on Jay's local machine on a test basis. More information on the setup for automatic updating is found at the bottom of this document, and a generic form of the bash script used is included in this repo as auto_update-example.sh.

This process can be run manually on any machine of a user with appropriate push rights.

Credit: The csv_to_jekyll_portageceg.py script expands upon the code created by Evan Lovely and described here.

How this site works

This site is rendered by Jekyll. Alex Gil has written a descriptive tutorial on Jekyll, but in a nutshell, Jekyll is a 'static site generator' that relies on templates, metadata, and simplified code.

Markdown
Front Matter
Liquid Tags
General Formatting

Markdown

GitHub pages are primarily written in Markdown, a simplified markup language that turns plain text into HTML. There are a number of different 'flavours' of Markdown; this site is written in Kramdown. Here's a Kramdown Cheat Sheet. Markdown files can be created right in GitHub, or written in your favourite text editor. They must be saved with the extension .md. (fun fact: this README is written in GitHub flavoured Markdown, which doesn't have as many options as Kramdown, especially when it comes to the table of contents. Make sure you specify Kramdown if you are searching outside tutorials!)

Front matter

Each page needs to start with YAML front matter. Basically, this is metadata that provides information to Jekyll on how to render the page. It looks like this:

---
layout: post
title:  Hello World!
date:   2018-10-04
categories: tutorial
tags: website
      jekyll
      library
      portage
permalink: /hello-world/
---

There are more options available for metadata at the link above, but I'll break down what's here.

layout
- this tells Jekyll which layout to use, based on pre-defined designs. The most common layouts will be page and post. Posts are reserved for blog-like items and require a date in the filename, formatted like this: 2018-10-05-hosting.md. Pages don't need a date, but keep the filename simple and direct. These pages will rely on templates that provide items like menus and footers. The templates rarely need to be changed, but they often have names like default or layout.
title
- this is the title of the document. This title will appear in the header information (meaning it'll display the title in the browser tab). You can also 'call' on this title to have it display in the body of the page itself or in other places using something called a 'liquid tag'. More on that later :)
date
- this is the date the document was created. Like titles, this date field can be called on in the body of the text, and other places.
categories
- this describes the content of the page. There should be fewer categories than tags, and generally a page should only fit into one category, though multiple categories can be assigned. Categories can be called on to generate dynamic lists on other pages, populate menus, or just generally help to organize information. Categories can also be used to customize the URL. For example, the URL for this page could be http://portage-ceg.github.io/tutorial/hello-world. If the page has no set category, the URL would look like this: http://portage-ceg.github.io/hello-world.
tags
- another way of describing the page. Pages can have multiple tags, just make sure to format them as they are in the example above. This site has a search function that uses tags. They're handy!
permalink
- this is a helpful feature that essentially removes the .html from the URL. Be very careful not to set the same permalink on multiple pages because it will break the site. It's good practice to make the permalink the same as the page title, or a shortened version. See the page URL in the above definition of categories.
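
Since a repeated permalink breaks the site, it can be worth checking for duplicates before pushing. The sketch below is a hypothetical helper, not something in this repo. It takes a mapping of filename to file text and assumes the permalink sits on its own permalink: line inside the front matter, as in the example above; it does not parse YAML in general.

```python
from collections import defaultdict


def read_permalink(text: str):
    """Return the permalink declared in a page's front matter, or None."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # the file has no front matter
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the front matter
        if line.startswith("permalink:"):
            return line.split(":", 1)[1].strip()
    return None


def duplicate_permalinks(pages: dict) -> dict:
    """Map each permalink used more than once to the files that declare it."""
    seen = defaultdict(list)
    for filename, text in pages.items():
        permalink = read_permalink(text)
        if permalink is not None:
            seen[permalink].append(filename)
    return {link: files for link, files in seen.items() if len(files) > 1}
```

An empty result means every declared permalink is unique. To run it over a directory, build the mapping with something like {p.name: p.read_text(encoding="utf-8") for p in Path("_posts").glob("*.md")}, using Path from pathlib.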

Liquid Tags

This is probably the most complex part of Jekyll, but luckily, for a simple static site like this, most of the liquid tags have already been placed out of sight in the templates. If you want to know more, here is some basic Jekyll documentation on liquid tags. (fun fact: the Liquid language was created by Shopify!)

One way of using liquid tags is to call information from the Front Matter. A tag like {{ page.title }} in Markdown will display the title of the page. A tag like {{ page.date }} will display the date listed in the Front Matter. Liquid tags can also be used for more complex operations, like 'for' loops and 'if' conditions.
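
To make the idea concrete, here is a toy imitation in Python of what happens to a Liquid output tag, which is written with double curly braces, for example {{ page.title }}. This is not the Liquid engine and Jekyll does not run this code; the function name fill_page_tags is hypothetical, and the sketch only handles simple page.<key> lookups.

```python
import re

# Matches {{ page.key }} with optional spaces inside the braces.
OUTPUT_TAG = re.compile(r"\{\{\s*page\.(\w+)\s*\}\}")


def fill_page_tags(template: str, front_matter: dict) -> str:
    """Replace each {{ page.key }} with the matching front matter value."""

    def lookup(match):
        # Unknown keys become empty text, which is how Liquid behaves by default.
        return str(front_matter.get(match.group(1), ""))

    return OUTPUT_TAG.sub(lookup, template)


if __name__ == "__main__":
    page = {"title": "Hello World!", "date": "2018-10-04"}
    # Prints: # Hello World! (2018-10-04)
    print(fill_page_tags("# {{ page.title }} ({{ page.date }})", page))
```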

General Formatting

Markdown is really easy. Rather than listing everything here, you can check out the Kramdown Cheat Sheet OR you can look at the raw output of this readme file to see how this page has been rendered. To access that, view the file here, and click on 'Raw' in the top right corner. You can inspect any page in this repository by looking at the raw contents. This can be especially useful when creating new pages: simply copy and paste from the raw output that you want to emulate, and then fill it in with your own text.

Process to create auto-updating pages:

What JB did to autoschedule updates:

created bash file auto_update.sh
- made executable: chmod +x auto_update.sh
- see auto_update-example.sh as an example.
made python script executable
- chmod +x csv_to_jekyll_portageceg.py
created ssh key on local PC. Connected to GitHub.
cronjobbed auto_update.sh to run every 60 minutes
- added to /etc/crontab as root: 0 * * * * <username> /<path_to_local_repo>/auto_update.sh >> /<path_to_local_repo>/cronlog_autoupdater.txt
