briatte edited this page Nov 7, 2014 · 12 revisions

Because R has the capacity to expand through packages, new useRs often look for lists of "most useful packages" to find out what is being used by the rest of the community. Here are some personal suggestions.

Visualization

R can do many things well, but it is especially good with graphics if you compare it with commercial software like SPSS or Stata. If you are looking for the plot package to install, you will want ggplot2, which also installs plyr and reshape2, two very useful packages for data manipulation.
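As a minimal sketch of what a ggplot2 plot looks like, the example below draws a scatterplot from the `mtcars` dataset that ships with R. The choice of variables and labels is mine, for illustration only.

```r
library(ggplot2)

# Scatterplot of car weight against fuel efficiency,
# coloured by number of cylinders (mtcars ships with base R).
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")

print(p)
```

The plot is built by adding layers with `+`, which is why small changes (another geom, a different scale) rarely require rewriting the whole call.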

If you want to draw maps with R, ggmap and rMaps are two excellent packages with different strengths. And if you want interactive graphics, take a look at ggvis and at the wonderful shiny package.

Data manipulation

I have mentioned plyr and reshape2. The dplyr package is the next iteration of plyr. It provides incredibly fast and easy-to-understand "verbs" (functions) to play with data, and also interfaces very well with SQL drivers like RPostgreSQL.
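To give an idea of what the dplyr "verbs" look like, here is a small sketch on a made-up data frame. The `survey_data` object and its values are hypothetical and only serve the example.

```r
library(dplyr)

# Hypothetical respondents (made-up values, for illustration only).
survey_data <- data.frame(
  country = c("FR", "FR", "DE", "DE", "DE"),
  age     = c(25, 40, 31, 58, 17),
  income  = c(1800, 2600, 2100, 3000, 0)
)

# Keep adults, then compute mean income and group size by country,
# sorted from highest to lowest mean income.
adults <- survey_data %>%
  filter(age >= 18) %>%
  group_by(country) %>%
  summarise(mean_income = mean(income), n = n()) %>%
  arrange(desc(mean_income))

print(adults)
```

Each verb takes a data frame and returns a data frame, so the steps chain with `%>%` and read top to bottom.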

For those who need to manipulate text, stringr and qdap are two excellent tools. stringr is an all-purpose package for strings, whereas qdap goes further into discourse analysis.

Last, for those who need to manipulate dates, the lubridate package is a must-have.
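A short sketch of what lubridate makes easier, using an arbitrary date: parsing functions are named after the order of the date components, and date arithmetic works with readable helpers.

```r
library(lubridate)

# Parse dates written in different component orders.
d1 <- ymd("2014-11-07")
d2 <- mdy("11/07/2014")

# Both strings describe the same day.
same_day <- d1 == d2

# Extract components and shift the date by three months.
y <- year(d1)
m <- month(d1)
later <- d1 + months(3)

print(later)
```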

Data importers

Several packages serve to import foreign formats that are not supported by the pre-installed foreign package, such as the xlsx package, which reads XLS(X) files, or the jsonlite package for JSON objects.
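As an illustration of the JSON case, the sketch below turns a small JSON string into a data frame with jsonlite. The JSON content is made up for the example; in practice it would come from a file or a URL.

```r
library(jsonlite)

# A made-up JSON array of records (illustrative values only).
json <- '[{"country": "FR", "score": 2.8}, {"country": "DE", "score": 3.9}]'

# An array of records with identical keys is simplified to a data frame.
df <- fromJSON(json)
str(df)

# And back to JSON.
toJSON(df)
```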

For web scraping, the downloader and httr packages will deal with HTTPS and more complex settings, and the XML (or the developing rvest) package will parse XML and HTML pages.

Several of the packages cited above depend on each other: installing httr, for instance, will also install RCurl (the R interface to HTTP/FTP/etc.), jsonlite, stringr and more.

R also supports several data APIs: have a look, for instance, at the WDI package to access World Bank indicators, or Quandl for tons of time series data from many sources.

Modelling

Social scientists use a wide array of models in their research. Here are three examples:

  • For those who work on surveys, the survey package is a wonderfully simple tool to produce weighted estimates. See also the usgsd repository and the SAScii package, both by Anthony Damico.
  • For those who spend their days doing regression, the arm, car, lme4, nlme and quantreg packages offer tons of diagnostic tools and support for linear and nonlinear multilevel models, among other things.
  • For those who work with networks, the igraph and network packages do the job perfectly, and packages like ergm will add complex network model capabilities.
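To make the survey example above a bit more concrete, here is a minimal sketch of a weighted estimate with the survey package. The `respondents` data frame and its weights are hypothetical, and the design assumes no clustering or stratification, which real surveys rarely allow.

```r
library(survey)

# Hypothetical respondents with sampling weights (made-up numbers).
respondents <- data.frame(
  income = c(1800, 2600, 2100, 3000),
  weight = c(1.5, 0.5, 1.0, 1.0)
)

# ids = ~1 declares no clustering; weights come from the `weight` column.
design <- svydesign(ids = ~1, weights = ~weight, data = respondents)

# Weighted mean of income, with its standard error.
est <- svymean(~income, design)
print(est)
```

The point estimate matches `weighted.mean(respondents$income, respondents$weight)`; what the design object adds is a standard error that accounts for the weights.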

Publishing

Those interested in writing slides with R code will want to take a look at slidify, and those who want to write full-fledged reports that can be reproduced in R will love knitr, as well as rapporter for further Markdown functions, and stargazer for printing out regression tables.
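As a quick sketch of the regression table part, the example below fits an arbitrary linear model on `mtcars` and prints it with stargazer. The model itself is only a placeholder.

```r
library(stargazer)

# A placeholder model on a dataset that ships with R.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# type = "text" prints a plain-text table to the console;
# the default output is LaTeX, and type = "html" is also available.
stargazer(fit, type = "text")
```

Inside a knitr report, the same call with LaTeX or HTML output can go in a chunk whose results are included as-is.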


The course installs tons of packages. If you are looking for a "top 10" list, the one below should be (almost) enough for most classes:

  1. MASS (weighted quantile functions and much more)
  2. devtools (install development functions)
  3. foreach (parallel computing)

See also that list.

Next in line:

  1. memisc (survey tools)
  2. questionr (survey tools)
  3. GGally (graphics)
  4. googleVis (animated graphics)
  5. mgcv (generalized additive modelling)
  6. Zelig (postestimation)
  7. FactoMineR or ClustOfVar (PCA)
  8. text manipulation (tm, topicmodels, qdap)
  9. Twitter (twitteR, streamR)
  10. maps (maptools, ggmap)