Transition Guide

Jon Quandt edited this page Nov 1, 2018 · 3 revisions

In December 2018, FDsys will be retired and govinfo will be the only interface to retrieve newly added or updated content and metadata. This document provides examples of how to migrate to govinfo tools in order to get the most recently added or updated content.

FDsys and govinfo are two different interfaces to the same repository of content and metadata.

This guide is intended to help developers and data partners understand their options for accessing govinfo content and metadata.

Below are some examples for retrieving Congressional Bills and related Bill Status information from govinfo. Similar processes could be used to retrieve other collections.

Retrieving Congressional Bills data

There are two main options for programmatically retrieving content and metadata from the govinfo site:

  1. Use the govinfo collections and packages services - Swagger docs here. See the API tour for a decent overview.

  2. Replace the GPO sitemaps with the equivalent govinfo sitemaps and package URL links

    • govinfo is an AJAX-driven site, so data users should consider constructing links to the required formats based on the package ID in the sitemaps

govinfo API

You'll need an api.data.gov key. We recommend passing that key to the API via the X-Api-Key header. The examples in this document append a demo key to the URL for ease of use.
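As a minimal sketch of the header approach, the snippet below builds (but does not send) a request that carries the key in the X-Api-Key header rather than in the URL. The helper name `build_request` is made up for illustration, and the path is simply the collections example used later in this guide.

```python
import urllib.request

API_BASE = "https://api.govinfo.gov"


def build_request(path, api_key):
    """Return a Request for an API path with the key in a header, not the URL.

    build_request is a hypothetical helper, not part of any govinfo client.
    """
    req = urllib.request.Request(API_BASE + path)
    # urllib stores the header name as "X-api-key"; HTTP header names are
    # case-insensitive, so this is the same header on the wire.
    req.add_header("X-Api-Key", api_key)
    return req


req = build_request(
    "/collections/BILLS/2018-10-30T10:37:00Z/2018-10-31T10:38:00Z"
    "?pageSize=100&offset=0",
    "DEMO_KEY",
)
# To send it: urllib.request.urlopen(req).read()
```

Keeping the key out of the URL means it is less likely to end up in logs or in links copied from a response.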

Base request to show what collections are available:

Find new BILLS using the collections service:

  1. https://api.govinfo.gov/collections/BILLS/2018-10-30T10:37:00Z/2018-10-31T10:38:00Z?pageSize=100&offset=0&api_key=DEMO_KEY

The endTime is optional, but useful. You can also filter by docClass by adding it as a parameter. docClass is collection-specific: for Bills, docClass is equivalent to bill type (e.g. hr, hres, s, hconres, sconres).
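The URL pattern above can be assembled programmatically. This is a sketch only: the function name `collections_url` is hypothetical, and it assumes the docClass filter is passed as a query parameter named `docClass`, which the text implies but does not spell out.

```python
from urllib.parse import urlencode

API_BASE = "https://api.govinfo.gov"


def collections_url(collection, start, end=None, page_size=100, offset=0,
                    doc_class=None):
    """Build a collections-service URL (hypothetical helper).

    start and end are timestamps such as 2018-10-30T10:37:00Z.
    The API key is left out on purpose; send it in the X-Api-Key header.
    """
    path = f"{API_BASE}/collections/{collection}/{start}"
    if end:
        # The end time is optional, so only append it when given.
        path += f"/{end}"
    params = {"pageSize": page_size, "offset": offset}
    if doc_class:
        # Assumption: the filter is a query parameter named "docClass".
        params["docClass"] = doc_class
    return f"{path}?{urlencode(params)}"


url = collections_url("BILLS", "2018-10-30T10:37:00Z", "2018-10-31T10:38:00Z")
senate_only = collections_url("BILLS", "2018-10-30T10:37:00Z", doc_class="s")
```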

{
	"count": 969,
	"message": null,
	"nextPage": "https://api.govinfo.gov/collections/BILLS/2018-10-30T10:37:40Z/2018-10-31T10:37:46Z?offset=100&pageSize=100",
	"previousPage": null,
	"packages": [{
			"packageId": "BILLS-115hr2464enr",
			"lastModified": "2018-10-31T10:37:46Z",
			"packageLink": "https://api.govinfo.gov/packages/BILLS-115hr2464enr/summary",
			"docClass": "hr",
			"title": "An Act To designate the facility of the United States Postal Service located at 25 New Chardon Street Lobby in Boston, Massachusetts, as the John Fitzgerald Kennedy Post Office."
		}, {
			"packageId": "BILLS-115hr1208enr",
			"lastModified": "2018-10-31T10:37:46Z",
			"packageLink": "https://api.govinfo.gov/packages/BILLS-115hr1208enr/summary",
			"docClass": "hr",
			"title": "An Act To designate the facility of the United States Postal Service located at 9155 Schaefer Road, Converse, Texas, as the Converse Veterans Post Office Building."
		},
		...
		{
			"packageId": "BILLS-115s3021enr",
			"lastModified": "2018-10-31T10:37:45Z",
			"packageLink": "https://api.govinfo.gov/packages/BILLS-115s3021enr/summary",
			"docClass": "s",
			"title": "America’s Water Infrastructure Act of 2018"
		},
...

This returns a JSON list of packages, each of which you can use to retrieve content and metadata directly. We'll use the America's Water Infrastructure Act of 2018 as an example:

BILLS-115s3021enr
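Because the sample response reports a count of 969 with a pageSize of 100, a client usually has to follow nextPage until it is null. The sketch below shows one way to do that. `iter_packages` and the `fetch_json` callable are hypothetical names: `fetch_json` stands in for whatever HTTP code you use, and it must supply the API key itself, since the nextPage URL in the sample does not include one.

```python
def iter_packages(first_url, fetch_json):
    """Yield every package dict across all pages of a collections response.

    fetch_json is a caller-supplied function (an assumption of this sketch)
    that takes a URL and returns the decoded JSON page as a dict.
    """
    url = first_url
    while url:
        page = fetch_json(url)
        for package in page.get("packages", []):
            yield package
        # nextPage is null (None) on the last page, which ends the loop.
        url = page.get("nextPage")


# Usage with canned pages in place of real HTTP calls:
fake_pages = {
    "page-1": {"packages": [{"packageId": "BILLS-115hr2464enr"}],
               "nextPage": "page-2"},
    "page-2": {"packages": [{"packageId": "BILLS-115s3021enr"}],
               "nextPage": None},
}
package_ids = [p["packageId"]
               for p in iter_packages("page-1", fake_pages.__getitem__)]
```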

packageLink provides a JSON summary of metadata, similar to what you would find on the govinfo content detail page. Here's an abbreviated snippet; the actual summary includes even more information, such as bill version, type, session, and references to public laws.

{
    "title": "An Act To provide for improvements to the rivers and harbors of the United States, to provide for the conservation and development of water and related resources, to provide for water pollution control activities, and for other purposes.",
    "shortTitle": [
        {
            "type": "measure",
            "title": "America’s Water Infrastructure Act of 2018"
        },
        {
            "type": "level",
            "level": "title",
            "title": "Water Resources Development Act of 2018"
        }
    ],
    "collectionCode": "BILLS",
    "collectionName": "Congressional Bills",
    "category": "Bills and Statutes",
    "dateIssued": "2018-10-24",
    "detailsLink": "https://www.govinfo.gov/app/details/BILLS-115s3021enr",
    "packageId": "BILLS-115s3021enr",
    "download": {
        "txtLink": "https://api.govinfo.gov/packages/BILLS-115s3021enr/htm",
        "xmlLink": "https://api.govinfo.gov/packages/BILLS-115s3021enr/xml",
        "pdfLink": "https://api.govinfo.gov/packages/BILLS-115s3021enr/pdf",
        "modsLink": "https://api.govinfo.gov/packages/BILLS-115s3021enr/mods",
        "premisLink": "https://api.govinfo.gov/packages/BILLS-115s3021enr/premis",
        "zipLink": "https://api.govinfo.gov/packages/BILLS-115s3021enr/zip"
    },
    "related": {
        "billStatusLink": "https://www.govinfo.gov/bulkdata/BILLSTATUS/115/s/BILLSTATUS-115s3021.xml"
    },

Notice the download section. If you request one of those links, you'll retrieve the relevant content.

"xmlLink": "https://api.govinfo.gov/packages/BILLS-115s3021enr/xml?api_key=DEMO_KEY",

"pdfLink": "https://api.govinfo.gov/packages/BILLS-115s3021enr/pdf?api_key=DEMO_KEY",

"zipLink": "https://api.govinfo.gov/packages/BILLS-115s3021enr/zip?api_key=DEMO_KEY"

If you know which format you want to retrieve (and know that it will be available if the packageId exists), you can also construct the necessary links directly from the packageId element returned by the collections service.

https://api.govinfo.gov/packages/{{packageId}}/{{contentType}} (include your API key via header or query parameter)
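The template above translates into a one-line helper. This is a sketch with a hypothetical function name. The content types used here (xml, pdf, zip, summary) are the ones that appear in the sample responses in this guide; whether a given format exists for a given package is not checked.

```python
API_BASE = "https://api.govinfo.gov"


def package_url(package_id, content_type):
    """Build a packages-service URL for one package and one format.

    Hypothetical helper; it only formats the string and does not verify
    that the format is available for the package.
    """
    return f"{API_BASE}/packages/{package_id}/{content_type}"


xml_url = package_url("BILLS-115s3021enr", "xml")
summary_url = package_url("BILLS-115s3021enr", "summary")
```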

govinfo BILLS Sitemaps

Note that the sitemap organization has changed from FDsys to govinfo. There is no longer a top-level year sitemap. Sitemaps are organized by collection, then by collection year:

  • Robots.txt: https://www.govinfo.gov/robots.txt

  • https://www.govinfo.gov/sitemap/BILLS_sitemap_index.xml

  • https://www.govinfo.gov/sitemap/BILLS_2018_sitemap.xml

Sample entry:

<url>
	<loc>https://www.govinfo.gov/app/details/BILLS-115s3021enr</loc>
	<lastmod>2018-10-31T10:43:38.699Z</lastmod>
	<changefreq>monthly</changefreq>
	<priority>1.0</priority>
</url>

Links to content and metadata can be constructed from the package ID in the entry:

MODS: https://www.govinfo.gov/metadata/pkg/BILLS-115s3021enr/mods.xml

XML: https://www.govinfo.gov/content/pkg/BILLS-115s3021enr/xml/BILLS-115s3021enr.xml

PDF: https://www.govinfo.gov/content/pkg/BILLS-115s3021enr/pdf/BILLS-115s3021enr.pdf

ZIP: https://www.govinfo.gov/content/pkg/BILLS-115s3021enr.zip
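The link patterns above can be derived from a sitemap entry as sketched below. The function names are made up for illustration. The code takes the last path segment of each loc as the package ID, and it strips any XML namespace from tag names on the assumption that the live sitemap declares the standard sitemaps.org namespace (the sample entry above shows none).

```python
import xml.etree.ElementTree as ET

SITE = "https://www.govinfo.gov"


def package_ids(sitemap_xml):
    """Return the package ID from every <loc> in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    ids = []
    for element in root.iter():
        # Drop a "{namespace}" prefix if present so both forms match.
        if element.tag.rsplit("}", 1)[-1] == "loc" and element.text:
            ids.append(element.text.strip().rsplit("/", 1)[-1])
    return ids


def content_links(package_id):
    """Build the MODS, XML, PDF and ZIP links shown above for one package."""
    return {
        "mods": f"{SITE}/metadata/pkg/{package_id}/mods.xml",
        "xml": f"{SITE}/content/pkg/{package_id}/xml/{package_id}.xml",
        "pdf": f"{SITE}/content/pkg/{package_id}/pdf/{package_id}.pdf",
        "zip": f"{SITE}/content/pkg/{package_id}.zip",
    }


sample = """<urlset><url>
  <loc>https://www.govinfo.gov/app/details/BILLS-115s3021enr</loc>
  <lastmod>2018-10-31T10:43:38.699Z</lastmod>
</url></urlset>"""
links = [content_links(pid) for pid in package_ids(sample)]
```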

BILLSTATUS

Again, two options:

  1. Use the govinfo bulkdata repository. Handy XML and JSON versions are available.

  2. Use bulkdata BILLSTATUS sitemaps

Note: We're working on adding bulk data, particularly BILLSTATUS, to the API, but it likely won't be done before the end of the calendar year.

There is an RSS feed that announces batch updates to the BILLSTATUS directory: https://www.govinfo.gov/rss/billstatus-batch.xml

bulkdata repository via JSON

If you use the JSON endpoint, you'll see something like the following for the BILLSTATUS directory:

{
	"formattedLastModifiedTime": "26-Jan-2017 02:35",
	"name": "BILLSTATUS",
	"folder": true,
	"displayLabel": "Bill Status",
	"link": "http://www.govinfo.gov/bulkdata/json/BILLSTATUS",
	"justFileName": "BILLSTATUS"
},

Following the link takes you to the next level down:

{
	"files": [{
			"formattedLastModifiedTime": "18-Feb-2016 03:40",
			"name": "113",
			"folder": true,
			"displayLabel": "113th Congress (2013-2014)",
			"link": "http://www.govinfo.gov/bulkdata/json/BILLSTATUS/113",
			"justFileName": "113"
		}, {
			"formattedLastModifiedTime": "17-Feb-2016 04:58",
			"name": "114",
			"folder": true,
			"displayLabel": "114th Congress (2015-2016)",
			"link": "http://www.govinfo.gov/bulkdata/json/BILLSTATUS/114",
			"justFileName": "114"
		}, {
			"formattedLastModifiedTime": "04-Jan-2017 08:19",
			"name": "115",
			"folder": true,
			"displayLabel": "115th Congress (2017-2018)",
			"link": "http://www.govinfo.gov/bulkdata/json/BILLSTATUS/115",
			"justFileName": "115"
		}, {
			"formattedLastModifiedTime": "26-Jan-2017 02:35",
			"name": "resources",
			"folder": true,
			"displayLabel": "resources",
			"link": "http://www.govinfo.gov/bulkdata/json/BILLSTATUS/resources",
			"justFileName": "resources"
		}
	]
}

You can then proceed down to the lowest level and retrieve the bill status files you need.
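Descending the listing level by level can be written as a short recursive walk. This is a sketch under two assumptions not shown in the samples above: that file entries at the lowest level carry "folder": false, and that their "link" points at the file itself. `walk_bulkdata` and `fetch_json` are hypothetical names; `fetch_json` is whatever function you use to request a link and decode the JSON listing.

```python
def walk_bulkdata(link, fetch_json):
    """Yield the link of every non-folder entry below a bulkdata listing.

    fetch_json is caller-supplied: it takes a listing link and returns the
    decoded JSON dict with a "files" array, as in the samples above.
    """
    listing = fetch_json(link)
    for entry in listing.get("files", []):
        if entry.get("folder"):
            # Folders point to another listing; descend into it.
            yield from walk_bulkdata(entry["link"], fetch_json)
        else:
            # Assumption: leaf entries link directly to the file.
            yield entry["link"]


# Usage with a canned directory tree in place of real HTTP calls:
fake_tree = {
    "BILLSTATUS": {"files": [{"folder": True, "link": "BILLSTATUS/115"}]},
    "BILLSTATUS/115": {"files": [
        {"folder": False, "link": "BILLSTATUS-115s3021.xml"},
    ]},
}
files = list(walk_bulkdata("BILLSTATUS", fake_tree.__getitem__))
```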

BILLSTATUS sitemap

Alternatively, of course, you could use the BILLSTATUS bulkdata sitemaps.

https://www.govinfo.gov/sitemap/bulkdata/BILLSTATUS/sitemapindex.xml

That may be the easiest option overall, as it provides direct access to all of the XML files, along with a last-modified date for each.

https://www.govinfo.gov/sitemap/bulkdata/BILLSTATUS/113s/sitemap.xml

Sample entry:

<url>
	<loc>https://www.govinfo.gov/bulkdata/BILLSTATUS/113/s/BILLSTATUS-113s1422.xml</loc>
	<lastmod>2017-02-19T21:59:00.100Z</lastmod>
	<changefreq>monthly</changefreq>
	<priority>1.0</priority>
</url>
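Since each entry carries a lastmod value, an incremental harvest can skip files that have not changed since the previous run. The sketch below filters sitemap entries by timestamp. The function names are hypothetical, and the timestamp format is inferred from the sample entry (fractional seconds followed by Z), so values in another format would need extra handling.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def parse_lastmod(text):
    """Parse a value like 2017-02-19T21:59:00.100Z into an aware datetime."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def modified_since(sitemap_xml, since):
    """Return the <loc> of every entry whose <lastmod> is later than since."""
    root = ET.fromstring(sitemap_xml)
    hits = []
    for element in root.iter():
        # Strip any "{namespace}" prefix before comparing tag names.
        if element.tag.rsplit("}", 1)[-1] != "url":
            continue
        fields = {child.tag.rsplit("}", 1)[-1]: (child.text or "").strip()
                  for child in element}
        if parse_lastmod(fields["lastmod"]) > since:
            hits.append(fields["loc"])
    return hits


sample = """<urlset><url>
  <loc>https://www.govinfo.gov/bulkdata/BILLSTATUS/113/s/BILLSTATUS-113s1422.xml</loc>
  <lastmod>2017-02-19T21:59:00.100Z</lastmod>
</url></urlset>"""
last_run = datetime(2017, 1, 1, tzinfo=timezone.utc)
changed = modified_since(sample, last_run)
```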