Get Event Information
When you want to find detailed information about a specific event, you can use the QueryEventArticlesIter and QueryEvent classes. The classes can be used to obtain all the information that Event Registry shows on the event page (e.g. http://eventregistry.org/event/eng-2940883).
The QueryEventArticlesIter
class is a helper class that allows one to quickly obtain the list of articles that are associated with a particular event.
Example of usage
A simple example that will list all English articles about event eng-2940883
is as follows:
from eventregistry import *
er = EventRegistry(apiKey = YOUR_API_KEY)
# iterate over all English articles assigned to the event, sorted by date
articleIter = QueryEventArticlesIter("eng-2940883", lang = "eng")
for art in articleIter.execQuery(er, sortBy = "date"):
    print(art)
The QueryEventArticlesIter constructor accepts the following arguments:
QueryEventArticlesIter(eventUri,
lang = None,
keywords = None,
conceptUri = None,
categoryUri = None,
sourceUri = None,
sourceLocationUri = None,
sourceGroupUri = None,
authorUri = None,
locationUri = None,
dateStart = None,
dateEnd = None,
dateMentionStart = None,
dateMentionEnd = None,
keywordsLoc = "body",
startSourceRankPercentile = 0,
endSourceRankPercentile = 100)
- eventUri: the URI of the event for which we want to obtain news articles.
- lang: return articles that are written in the specified language. If more than one language is specified, the resulting articles can be written in any of those languages.
- keywords: limit the event articles to those that mention the specified keywords. A single keyword/phrase can be provided as a string; multiple keywords/phrases can be provided as a list of strings. Use QueryItems.AND() if all provided keywords/phrases should be mentioned, or QueryItems.OR() if any of the keywords/phrases should be mentioned.
- conceptUri: limit the event articles to those where the concept with the given concept URI is mentioned. A single concept URI can be provided as a string; multiple concept URIs can be provided as a list of strings. Use QueryItems.AND() if all provided concepts should be mentioned, or QueryItems.OR() if any of the concepts should be mentioned. To obtain a concept URI from a concept label, use EventRegistry.getConceptUri().
- categoryUri: limit the event articles to those that are assigned to a particular category. A single category can be provided as a string, while multiple categories can be provided as a list in QueryItems.AND() or QueryItems.OR(). A category URI can be obtained from a category name using EventRegistry.getCategoryUri().
- sourceUri: limit the event articles to those that were written by the news source with the given source URI. If multiple sources should be considered, use QueryItems.OR() to provide a list of sources. The source URI for a given news source name can be obtained using EventRegistry.getNewsSourceUri().
- sourceLocationUri: limit the event articles to those that were written by news sources located in the given geographic location. If multiple source locations are provided, put them into a list inside QueryItems.OR(). A location URI can be either a city or a country. The location URI for a given name can be obtained using EventRegistry.getLocationUri().
- sourceGroupUri: limit the event articles to those that were written by news sources that are assigned to the specified source group. If multiple source groups are provided, put them into a list inside QueryItems.OR(). The source group URI for a given name can be obtained using EventRegistry.getSourceGroupUri().
- authorUri: find articles that were written by a specific author. If multiple authors should be considered, use QueryItems.OR() to provide a list of authors. The author URI for a given author name can be obtained using EventRegistry.getAuthorUri().
- locationUri: find articles that describe something that occurred at a particular location. The value can be a string or a list of strings provided in QueryItems.OR(). A location URI can be either a city or a country. The location URI for a given name can be obtained using EventRegistry.getLocationUri().
- dateStart: find articles that were written on or after dateStart. The date should be provided as a string in YYYY-MM-DD format, or as a datetime.date or datetime.datetime instance.
- dateEnd: find articles that were written on or before dateEnd. The date should be provided as a string in YYYY-MM-DD format, or as a datetime.date or datetime.datetime instance.
- dateMentionStart: limit the event articles to those that explicitly mention a date that is equal to or later than dateMentionStart.
- dateMentionEnd: limit the event articles to those that explicitly mention a date that is earlier than or equal to dateMentionEnd.
- keywordsLoc: where to look when searching for the keywords provided in the "keywords" parameter. Options: "body" (default), "title", or "body,title".
- startSourceRankPercentile: starting percentile of the sources to consider in the results (default: 0). The value should be in the range 0-100 and divisible by 10.
- endSourceRankPercentile: ending percentile of the sources to consider in the results (default: 100). The value should be in the range 0-100 and divisible by 10.
- sortBy: how the articles should be sorted before we decide which ones to return. Options: id (internal id), date (published date), cosSim (closeness to event centroid), socialScore (total shares in social media).
- returnInfo: sets the properties of the various types of data that are returned (articles, concepts, categories, news sources, ...).
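The constraints on the date and percentile arguments described above can be checked before a query is built. The following is a minimal sketch in plain Python; the helper names `normalize_date`, `check_percentiles` and `build_article_filters` are hypothetical and are not part of the eventregistry package.

```python
import datetime

def normalize_date(value):
    """Return a YYYY-MM-DD string for a date, datetime, or string value."""
    # datetime.datetime is a subclass of datetime.date, so test it first
    if isinstance(value, datetime.datetime):
        return value.date().isoformat()
    if isinstance(value, datetime.date):
        return value.isoformat()
    # strings are parsed to make sure they really are YYYY-MM-DD dates
    return datetime.datetime.strptime(value, "%Y-%m-%d").date().isoformat()

def check_percentiles(start=0, end=100):
    """Raise ValueError unless both values are in 0-100 and divisible by 10."""
    for name, val in (("start", start), ("end", end)):
        if not 0 <= val <= 100 or val % 10 != 0:
            raise ValueError("%s percentile must be in 0-100 and divisible by 10" % name)
    return start, end

def build_article_filters(date_start, date_end, start_pct=0, end_pct=100):
    """Hypothetical helper: collect validated keyword arguments in a dict."""
    check_percentiles(start_pct, end_pct)
    return {
        "dateStart": normalize_date(date_start),
        "dateEnd": normalize_date(date_end),
        "startSourceRankPercentile": start_pct,
        "endSourceRankPercentile": end_pct,
    }

filters = build_article_filters(datetime.date(2017, 5, 1), "2017-05-31", 0, 30)
print(filters)
```

The resulting dictionary uses the documented parameter names, so it could be passed on as keyword arguments, for example QueryEventArticlesIter("eng-2940883", **filters).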
Methods
The class has two main methods: count()
and execQuery()
.
count(er)
The count(er) method simply returns the number of articles assigned to the event that are in the specified language(s). Expected arguments are:
- er: an instance of the EventRegistry class.
execQuery(er,
sortBy = "cosSim", sortByAsc = False,
returnInfo = ReturnInfo(articleInfo = ArticleInfoFlags(bodyLen = -1)),
maxItems = -1)
The execQuery() method returns an iterator over the articles in the event. Most commonly, the parameters set are er, sortBy, returnInfo and potentially maxItems. The class also supports limiting results to a subset of articles about the event by specifying, in the constructor, parameters that are otherwise available when searching for articles in general. The full list of execQuery() parameters is described below:
- er: an instance of the EventRegistry class that should be used to obtain the necessary data.
- sortBy: the order in which event articles are sorted. Options: id (internal id), date (published date), cosSim (closeness to event centroid), sourceImportance (importance of the news source), socialScore (total shares in social media).
- sortByAsc: whether the results should be sorted in ascending (True) or descending (False) order.
- returnInfo: what details should be included in the returned information. See the ReturnInfo class for details.
- maxItems: maximum number of articles about the event to be returned by the iterator. Use the default (-1) to return all the articles.
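The effect of maxItems can be illustrated without an API key. The sketch below mimics the documented behaviour (-1 returns everything, a positive value stops early) on a stand-in generator; `take_articles` and `fake_articles` are hypothetical names, and the "uri" and "title" keys are only illustrative placeholders for real article objects.

```python
from itertools import islice

def take_articles(article_iter, max_items=-1):
    """Mimic the documented maxItems behaviour on any iterable of articles."""
    if max_items < 0:
        # the default (-1) consumes the whole iterator
        return list(article_iter)
    # otherwise stop after max_items articles without reading further
    return list(islice(article_iter, max_items))

def fake_articles():
    # stand-in generator; real code would iterate over the result of execQuery()
    for i in range(5):
        yield {"uri": str(i), "title": "Article %d" % i}

first_two = take_articles(fake_articles(), max_items=2)
everything = take_articles(fake_articles())
print([a["title"] for a in first_two])
print(len(everything))
```

Because the articles are produced lazily, stopping early in this way avoids reading items that are not needed.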
The QueryEvent class
provides a more extended set of functionalities for a given event. The class can be used to obtain not only the list of associated articles but also core event information, a timeline of reporting about the event, list of top news sources reporting about the event, related events, etc.
Example of usage
To start, let us look at a simple example of usage of the QueryEvent()
class to obtain information about event with URI eng-2940883
:
from eventregistry import *
er = EventRegistry(apiKey = YOUR_API_KEY)
# we are interested in event with URI eng-2940883
q = QueryEvent("eng-2940883")
# get core event information (location, date, top concepts, ...)
q.setRequestedResult(RequestEventInfo())
res = er.execQuery(q)
The resulting JSON object contained in res
will contain:
{
"eng-2940883": {
"info": { ... }, // details about the event
}
}
The returned information about articles in the event follows the Article data model.
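Since the result is keyed by event URI, reading the "info" object is a matter of two dictionary lookups. The following is a minimal sketch that guards against a missing key; `get_event_info` is a hypothetical helper, and the contents of the sample "info" object are placeholders rather than real API output.

```python
def get_event_info(res, event_uri):
    """Return the "info" object for event_uri, or None if it is missing."""
    event_data = res.get(event_uri)
    if not isinstance(event_data, dict):
        # the URI is not present in the response
        return None
    return event_data.get("info")

# placeholder response with the same shape as the JSON shown above
sample_res = {"eng-2940883": {"info": {"uri": "eng-2940883"}}}

info = get_event_info(sample_res, "eng-2940883")
missing = get_event_info(sample_res, "eng-0000000")
print(info)
print(missing)
```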
The QueryEvent constructor accepts the following arguments:
QueryEvent(eventUriOrList,
requestedResult = None)
- eventUriOrList: can be a string representing a single event URI or a list of event URIs (at most 50). For all requested results except RequestEventInfo(), only a single event URI can be provided.
- requestedResult: the information about the event to return. Can be any of the RequestEvent* classes described below. If None, then a RequestEventInfo() instance will be set.
The QueryEvent class provides a method setRequestedResult() that can be used to specify which details about the event you wish to obtain. The argument in the method call has to be an instance of a class derived from RequestEvent. Below are the classes that can be specified in the setRequestedResult() calls:
RequestEventInfo
RequestEventInfo(returnInfo = ReturnInfo())
RequestEventInfo
class can provide the core information about the event - the title, summary, location, date, concepts, categories and the number of articles reporting about the event.
- returnInfo: sets the properties of the various types of data that are returned (event details, concepts, categories, news sources, ...).
RequestEventArticles
RequestEventArticles(page = 1,
count = 100,
lang = None,
keywords = None,
conceptUri = None,
categoryUri = None,
sourceUri = None,
sourceLocationUri = None,
sourceGroupUri = None,
authorUri = None,
locationUri = None,
dateStart = None,
dateEnd = None,
dateMentionStart = None,
dateMentionEnd = None,
keywordsLoc = "body",
startSourceRankPercentile = 0,
endSourceRankPercentile = 100,
sortBy = "cosSim", sortByAsc = False,
returnInfo = ReturnInfo())
RequestEventArticles
returns details about the articles assigned to the event. Most commonly, you only need to set the page
, count
, sortBy
and returnInfo
parameters, but the class also supports limiting results to a subset of articles by specifying parameters that are otherwise available when searching for articles in general. The full list of parameters is described below:
- page: which page of the articles to return (starting from 1).
- count: number of articles to return (max 100).
- lang: return articles that are written in the specified language. If more than one language is specified, the resulting articles can be written in any of those languages.
- keywords: limit the event articles to those that mention the specified keywords. A single keyword/phrase can be provided as a string; multiple keywords/phrases can be provided as a list of strings. Use QueryItems.AND() if all provided keywords/phrases should be mentioned, or QueryItems.OR() if any of the keywords/phrases should be mentioned.
- conceptUri: limit the event articles to those where the concept with the given concept URI is mentioned. A single concept URI can be provided as a string; multiple concept URIs can be provided as a list of strings. Use QueryItems.AND() if all provided concepts should be mentioned, or QueryItems.OR() if any of the concepts should be mentioned. To obtain a concept URI from a concept label, use EventRegistry.getConceptUri().
- categoryUri: limit the event articles to those that are assigned to a particular category. A single category can be provided as a string, while multiple categories can be provided as a list in QueryItems.AND() or QueryItems.OR(). A category URI can be obtained from a category name using EventRegistry.getCategoryUri().
- sourceUri: limit the event articles to those that were written by the news source with the given source URI. If multiple sources should be considered, use QueryItems.OR() to provide a list of sources. The source URI for a given news source name can be obtained using EventRegistry.getNewsSourceUri().
- sourceLocationUri: limit the event articles to those that were written by news sources located in the given geographic location. If multiple source locations are provided, put them into a list inside QueryItems.OR(). A location URI can be either a city or a country. The location URI for a given name can be obtained using EventRegistry.getLocationUri().
- sourceGroupUri: limit the event articles to those that were written by news sources that are assigned to the specified source group. If multiple source groups are provided, put them into a list inside QueryItems.OR(). The source group URI for a given name can be obtained using EventRegistry.getSourceGroupUri().
- authorUri: find articles that were written by a specific author. If multiple authors should be considered, use QueryItems.OR() to provide a list of authors. The author URI for a given author name can be obtained using EventRegistry.getAuthorUri().
- locationUri: find articles that describe something that occurred at a particular location. The value can be a string or a list of strings provided in QueryItems.OR(). A location URI can be either a city or a country. The location URI for a given name can be obtained using EventRegistry.getLocationUri().
- dateStart: find articles that were written on or after dateStart. The date should be provided as a string in YYYY-MM-DD format, or as a datetime.date or datetime.datetime instance.
- dateEnd: find articles that were written on or before dateEnd. The date should be provided as a string in YYYY-MM-DD format, or as a datetime.date or datetime.datetime instance.
- dateMentionStart: limit the event articles to those that explicitly mention a date that is equal to or later than dateMentionStart.
- dateMentionEnd: limit the event articles to those that explicitly mention a date that is earlier than or equal to dateMentionEnd.
- keywordsLoc: where to look when searching for the keywords provided in the "keywords" parameter. Options: "body" (default), "title", or "body,title".
- startSourceRankPercentile: starting percentile of the sources to consider in the results (default: 0). The value should be in the range 0-100 and divisible by 10.
- endSourceRankPercentile: ending percentile of the sources to consider in the results (default: 100). The value should be in the range 0-100 and divisible by 10.
- sortBy: how the articles should be sorted before we decide which ones to return. Options: id (internal id), date (published date), cosSim (closeness to event centroid), socialScore (total shares in social media).
- returnInfo: sets the properties of the various types of data that are returned (articles, concepts, categories, news sources, ...).
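Because count is capped at 100 and page starts from 1, retrieving every article of a large event means requesting several pages in turn. The sketch below shows only the paging arithmetic. `fetch_all_articles` and `fake_fetch_page` are hypothetical names; in real use the fetch_page callable would set RequestEventArticles(page = page, count = count) on a QueryEvent, run er.execQuery(q), and pull the article list out of the response, a step this page does not describe and which is therefore left behind the callable.

```python
import math

MAX_PAGE_SIZE = 100  # documented maximum for the count parameter

def fetch_all_articles(fetch_page, total_articles, count=MAX_PAGE_SIZE):
    """Collect articles page by page; pages are numbered from 1."""
    if not 1 <= count <= MAX_PAGE_SIZE:
        raise ValueError("count must be between 1 and %d" % MAX_PAGE_SIZE)
    pages = math.ceil(total_articles / count)
    articles = []
    for page in range(1, pages + 1):
        articles.extend(fetch_page(page, count))
    return articles

# stand-in data source so the sketch runs without an API key
fake_store = [{"uri": str(i)} for i in range(250)]

def fake_fetch_page(page, count):
    start = (page - 1) * count
    return fake_store[start:start + count]

all_articles = fetch_all_articles(fake_fetch_page, len(fake_store))
print(len(all_articles))
```

When only the article list is needed, the QueryEventArticlesIter class described earlier is likely the simpler choice, since it hides the paging.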
RequestEventArticleUriWgts
RequestEventArticleUriWgts(lang = None,
sortBy = "cosSim", sortByAsc = False)
RequestEventArticleUriWgts
returns a simple list of article URIs for articles that are assigned to the event.
- lang: the languages in which the returned articles should be written. If None is used, the articles in the event will not be filtered by language.
- sortBy and sortByAsc: determine the order in which the URIs are returned.
RequestEventKeywordAggr
RequestEventKeywordAggr(lang = "eng")
RequestEventKeywordAggr returns the top keywords extracted from articles in the event.
- lang: if not None, the top keywords will only be computed from the articles in the specified language.
RequestEventSourceAggr
RequestEventSourceAggr
returns the information about the news sources that reported about the event. The class does not accept any additional arguments.
RequestEventDateMentionAggr
RequestEventDateMentionAggr
returns information about the dates that were mentioned in the articles about the event. The class does not accept any additional arguments.
RequestEventArticleTrend
RequestEventArticleTrend
provides a list of core article information that can be used to display how the intensity of reporting about the event has been changing over time.
RequestEventSimilarEvents
RequestEventSimilarEvents(conceptInfoList,
count = 50,
maxDayDiff = sys.maxsize,
addArticleTrendInfo = False,
aggrHours = 6,
includeSelf = False,
returnInfo = ReturnInfo())
RequestEventSimilarEvents
returns a list of events related to the given event.
- conceptInfoList: array of concepts and their importance, e.g. [{ "uri": "http://en.wikipedia.org/wiki/Barack_Obama", "wgt": 100 }, ...]. A list of at most 20 concepts is used to identify the related events.
- count: the number of similar events to return (max 50).
- maxDayDiff: the maximum time difference between the similar events and this one.
- addArticleTrendInfo: add information about how the articles in the similar events are distributed over time.
- aggrHours: if addArticleTrendInfo == True, then this is the aggregating time window.
- includeSelf: should the information about the event itself also be included among the results?
- returnInfo: what details should be included in the returned information. See the ReturnInfo class for details.
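The conceptInfoList argument has a simple structure that can be assembled from a mapping of concept URIs to weights. The following is a minimal sketch; `build_concept_info_list` is a hypothetical helper, the second concept URI is only illustrative, and keeping the highest-weighted concepts when truncating to 20 is an assumption rather than documented behaviour.

```python
MAX_CONCEPTS = 20  # documented: at most 20 concepts are used

def build_concept_info_list(weights):
    """Turn {conceptUri: weight} into a list of {"uri", "wgt"} dicts, heaviest first."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    # assumption: when there are too many concepts, keep the highest-weighted ones
    return [{"uri": uri, "wgt": wgt} for uri, wgt in ranked[:MAX_CONCEPTS]]

concepts = build_concept_info_list({
    "http://en.wikipedia.org/wiki/White_House": 60,
    "http://en.wikipedia.org/wiki/Barack_Obama": 100,
})
print(concepts)
```

The resulting list could then be passed as the first argument of RequestEventSimilarEvents().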