News Time Series

News Time Series is an R package for querying the New York Times API. It collects and processes articles, along with their images, published between 1851 and 2024.

Installation

You can install the development version of News Time Series from GitHub with:

# install.packages("devtools")
devtools::install_github("mateoservent/NewsTimeSeries")

nyt_timeseries()

nyt_timeseries() retrieves articles from the New York Times API that match a search term within a given date range. It returns a tibble containing details about each article.

library(NewsTimeSeries)

# Example usage of nyt_timeseries()
nyt_timeseries(api_key = "<your_api_key>", query = "search_term",
               begin_date = "YYYY-MM-DD", end_date = "YYYY-MM-DD")
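Rather than pasting the key into a script, it can be read from an environment variable. The sketch below is only an illustration: get_nyt_key() and the variable name NYT_API_KEY are hypothetical and not part of NewsTimeSeries.

```r
# Hypothetical helper (not part of NewsTimeSeries): read the API key from an
# environment variable so it does not have to be hard-coded in scripts.
get_nyt_key <- function(var = "NYT_API_KEY") {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable '", var, "' is not set.", call. = FALSE)
  }
  key
}

# Assumed usage, with the arguments shown in the example above:
# nyt_timeseries(api_key = get_nyt_key(), query = "search_term",
#                begin_date = "YYYY-MM-DD", end_date = "YYYY-MM-DD")
```

The variable can be set for the current session with Sys.setenv(NYT_API_KEY = "<your_api_key>") or stored in a .Renviron file.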

nyt_timeseries() is designed with API rate limits in mind. Even so, an extensive query covering a wide date range may need to be collected over multiple days.

To accommodate this, nyt_timeseries() includes the continue_loading argument. Setting continue_loading = TRUE allows the function to pick up where it left off, continuing to add articles to an existing articles tibble stored in the global environment.

For example, if you have already collected some data and need to continue from a specific point, you can use:

library(NewsTimeSeries)

# Continuing the article collection from a specific point
nyt_timeseries(api_key = "<your_api_key>", query = "search_term",
               begin_date = "1851-01-01", end_date = "2024-01-01",
               continue_loading = TRUE)
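The idea of picking up where a collection left off can be sketched in a few lines of base R. This is a conceptual illustration only, not the package's actual implementation: next_begin_date() and the pub_date column name are assumptions, and the sketch further assumes articles were collected in chronological order.

```r
# Hypothetical helper (not part of NewsTimeSeries): decide where a resumed
# collection should start, based on the articles gathered so far.
next_begin_date <- function(existing, fallback, date_col = "pub_date") {
  # Nothing collected yet: start from the originally requested date.
  if (is.null(existing) || nrow(existing) == 0) {
    return(as.Date(fallback))
  }
  # Otherwise restart at the latest date already seen. That day may be only
  # partially collected, so duplicates would need to be dropped afterwards.
  max(as.Date(existing[[date_col]]), na.rm = TRUE)
}

# Toy data standing in for a partially collected set of articles
collected <- data.frame(pub_date = c("1851-09-18", "1851-09-20", "1851-09-19"))
next_begin_date(collected, fallback = "1851-01-01")
```

Restarting at the latest collected date rather than the day after trades a few duplicate rows for not missing articles from a partially collected day.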

About

  • This experimental package was developed during a SICSS (Summer Institute in Computational Social Science) in 2023, in collaboration with Joel Martinez.