Skip to content

NHANEStoMySQL

Tim Erickson edited this page Jun 4, 2019 · 7 revisions

The path from SAS transport files to CODAP

The NHANES site lets you download SAS transport files (.xpt). But for our system to use them, we need them in MySQL (or some other database that we will access from CODAP/JS).

Download SAS Transport file

On the NHANES site, click the download button. I have been keeping the relevant files to be uploaded in content/Projects/NHANES 2015–2016/nhanes15. Our example file is for biochemistry, the "standard biochemistry profile," which includes stuff like Sodium and Iron. Its file name is BIOPRO_I.XPT.

SAS transport to CSV

Do this is RStudio. Here is example code:

library(tidyverse)
library(data.table)

biochem <- read_xpt("BIOPRO_I.xpt")
fwrite(biochem, file="biochem.csv")

But if you haven't just done this, there will be errors. be sure tidyverse is loaded. That's in the Packages tab

Alternate path

See https://www.statmethods.net/input/importingdata.html.

library(Hmisc)
biochem <- sasxport.get("BIOPRO_I.xpt")

CSV to MySQL

Using Sequel Pro, choose File\Import and select the .csv. Check that the column headings are correct; the most common mistake for me has been to fail to indicate that this is going to be a new table.

Additional munging

For the data to appear in the NHANES plugin, you also need to do the following in MySQL (probably in Sequel Pro).

  • Make sure whatever (new) variables you want to appear in the interface are identified in the varlist table.
  • Be sure that any values that need translating are listed (keyed by the internal variable name) in the decoder table.
  • (Surprisingly important!) Make sure any new table has SEQN as a key!

Prepping the MySQL database

We often need to compute a new column in the MySQL so that we don't have to do it in JS. In NHANES, a good example is in sexual behavior, where they have two different items representing the same question, one asked of males, one of females, e.g., the number of opposite-sex partners. In that case, we might want a single column that collapses the two.

First we need to know the names of the relevant attributes; these are from the CDC at https://wwwn.cdc.gov/Nchs/Nhanes/2015-2016/SXQ_I.htm:

SXD101 - # male sex partners/lifetime (F)
SXD171 - # female sex partners/lifetime (M)
SXQ130 - # female sex partners/lifetime (F)
SXQ410 - # men anal/oral sex partners/lifetime(M)

Then we check the table. What's in the cell when it's the wrong sex? Looks like nothing, but maybe it's an empty string. We'll use UPDATE and IF, like this:

UPDATE sexbeh SET SXPOLIFE = IF(SXD101 IS NULL or SXD101 = '', SXD171, SXD101)
UPDATE sexbeh SET SXPSLIFE = IF(SXQ410 IS NULL or SXQ410 = '', SXQ130, SXQ410)
Clone this wiki locally