Python package for API access to news articles and events in the Event Registry
Added
Analytics.semanticSimilarity API call. It can be used to determine how semantically related two documents are. The documents can be in the same or different languages.
Analytics.extractArticleInfo API call. It provides functionality to extract the article title, body, date, author and other information from the given URL.
dataType parameter when searching for articles. Event Registry now separates collected content by data type. The possible values for data type are "news", "pr" (for PR content) and "blogs" (we will start indexing and providing blog content shortly). The dataType parameter can be set in the QueryArticles and QueryArticlesIter classes as well as in EventRegistry.getNewsSourceUri and EventRegistry.suggestNewsSources.
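As an illustration of the dataType parameter, the following minimal sketch validates a value before handing it to a query. The helper names (normalize_data_type, build_article_query) and the decision to reject unknown values locally are assumptions made for this example and are not part of the package; only the dataType parameter name and its three values come from the notes above.

```python
# Values listed in the release notes; "blogs" is announced but may not be indexed yet.
VALID_DATA_TYPES = ("news", "pr", "blogs")


def normalize_data_type(data_type="news"):
    """Return data_type as a list of validated strings (hypothetical helper)."""
    values = [data_type] if isinstance(data_type, str) else list(data_type)
    unknown = [v for v in values if v not in VALID_DATA_TYPES]
    if unknown:
        raise ValueError("unsupported dataType value(s): %s" % ", ".join(map(str, unknown)))
    return values


def build_article_query(keywords, data_type="news"):
    """Create a QueryArticlesIter restricted to the given data type(s).

    Assumes the eventregistry package is installed; it is imported lazily so
    the validation helper above can be used without it.
    """
    from eventregistry import QueryArticlesIter

    return QueryArticlesIter(keywords=keywords, dataType=normalize_data_type(data_type))
```

For example, build_article_query("Brexit", ["news", "pr"]) would request both news and PR content, while a misspelled value fails early with a ValueError instead of reaching the API.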
Changed
EventRegistry.suggestNewsSources() and EventRegistry.getNewsSourceUri() now also accept a dataType parameter, which is ["news", "pr"] by default. It determines what kind of data sources to include in the generated suggestions.
QueryArticles and QueryArticlesIter classes now support an additional parameter dataType that determines what type of data should be returned. By default, the value is news. For now it can also be pr or an array with both values.
Removed
QueryArticles.addRequestedResult(), QueryEvents.addRequestedResult(), QueryArticle.addRequestedResult(), QueryEvent.addRequestedResult(), and Query.clearRequestedResults(). As before, only a single result type can be requested per call, so these methods served no purpose. Use the setRequestedResult() methods instead.
id property from different returned data objects. Although the documentation clearly stated that the property is for internal use only, users commonly relied on it, which caused potential issues.
Added
Analytics class that can be used to semantically annotate a document, categorize the document into a predefined taxonomy of categories, or detect the language of a text. In the future, more analytics methods will be added to this class. NOTE: the functionality is currently in BETA. The API calls or the provided outputs may change in the future.
links into the output of the article format. It contains the list of URLs extracted from the article body (not from the whole HTML but just the part containing the body).
sentiment property will be added by default to the output format for the article. It can be null if the property is not set.
Removed
details from all the *InfoFlags that had it (ArticleInfoFlag, SourceInfoFlag, etc.). All the properties previously provided by this flag are still available through the other flags.
flags from all the *InfoFlags. The flag represents some internal properties that are not publicly useful.
Added
allowUseOfArchive to the EventRegistry constructor. The flag determines if queries made by that EventRegistry instance can use the archive data (data since Jan 2014) or just the recent data (last 31 days of content). Queries made on the archive use more of your data plan tokens, so if you just want to use the recent content, make sure that you set the flag to False. Note that archive data can be accessed only by paid subscribers.
EventRegistry.printLastReqStats() which prints to the console some stats regarding the latest executed request. It prints whether the archive was used in the query, the number of tokens used by the request, etc.
allowUseOfArchive to the EventRegistry.execQuery() method. It can be used to override the flag about the use of the archive that was set when constructing the EventRegistry class.
Changed
Deprecated
Removed
categoryIncludeSub and ignoreCategoryIncludeSub. The flag is now always set to true and cannot be changed.
maxItems from QueryArticlesIter.execQuery() and QueryEventIter.execQuery(). The iterator will always cache the maximum number of items that can be returned with a single query.
Fixed
Added
QueryArticles and QueryArticlesIter now support an additional constructor argument keywordsLoc, which specifies where the keywords provided using keywords should occur. The default is body (the keywords should be mentioned in the body of the article); other valid options are title (should be mentioned in the article's title) or title,body (should be mentioned anywhere in the article).
QueryArticles and QueryArticlesIter: just as keywordsLoc determines the keyword location for keywords, an ignoreKeywordsLoc parameter can also be specified to determine the location of the keywords to ignore, which are given by the ignoreKeywords parameter.
keywordLoc parameter in the BaseQuery.
EventRegistry.suggestLocationsAtCoordinate() method which returns geographic places near the given geo location.
EventRegistry.suggestSourcesAtCoordinate() method which returns the list of news sources that are close to the given geographic location.
EventRegistry.suggestSourcesAtPlace() method that returns a list of news sources that we are crawling at the specified place or country. The input argument has to be a location URI obtained by calling EventRegistry.getLocationUri().
EventRegistry.getUrl() method which for a given query object returns the URL that can be used to make a direct HTTP request.
videos property to the Article data model. When one or more videos were identified in an article, you can retrieve them by setting the video=True flag in ArticleInfoFlags.
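The keyword location options above can be sketched as a small helper that assembles constructor arguments. The function name keyword_filter_kwargs is hypothetical, and using body as the default for ignoreKeywordsLoc is an assumption mirroring the documented default of keywordsLoc; the parameter names and the three location values are taken from the notes above.

```python
# Location values named in the release notes.
VALID_KEYWORD_LOCS = ("body", "title", "title,body")


def keyword_filter_kwargs(keywords, keywords_loc="body",
                          ignore_keywords=None, ignore_keywords_loc="body"):
    """Build keyword-related constructor arguments (hypothetical helper).

    The "body" default for ignore_keywords_loc is an assumption, not a
    documented default.
    """
    for loc in (keywords_loc, ignore_keywords_loc):
        if loc not in VALID_KEYWORD_LOCS:
            raise ValueError("unsupported keyword location: %r" % (loc,))
    kwargs = {"keywords": keywords, "keywordsLoc": keywords_loc}
    # Only send the ignore parameters when there is something to ignore.
    if ignore_keywords is not None:
        kwargs["ignoreKeywords"] = ignore_keywords
        kwargs["ignoreKeywordsLoc"] = ignore_keywords_loc
    return kwargs
```

With the package installed, the result could be passed on as QueryArticles(**keyword_filter_kwargs("Brexit", "title")) to match only articles that mention the keyword in the title.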
Changed
ArticleMapper.getArticleUri() now returns None or string, no longer a list. We no longer store multiple versions of articles with the same URL.
ArticleInfoFlags. If you did not set parameter values by name, check that each value still matches the desired property. The change was made to reflect the importance and usability of individual parameters.
Removed
EventRegistry.getArticleUris() no longer accepts the parameter includeAllVersions.
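Because an article lookup now yields at most one URI (ArticleMapper.getArticleUri() returns None or a string, and includeAllVersions is gone), calling code that used to handle lists can be simplified. A minimal sketch of a tolerant wrapper follows; single_article_uri is a hypothetical name, and accepting a list at all is only an assumption about code that may still run against an older release.

```python
def single_article_uri(result):
    """Normalize a value returned by ArticleMapper.getArticleUri().

    Newer releases return None or a string. A list is tolerated here only
    for callers that may still run against an older release (assumption).
    """
    if result is None:
        return None
    if isinstance(result, str):
        return result
    # Legacy shape: a list of URIs; keep the first one, if any.
    items = list(result)
    return items[0] if items else None
```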
Added
QueryArticles and QueryEvents: When creating an instance of the class using a parameter that is a list (such as conceptUri, categoryUri, ...) you can (should) now provide the list using the QueryItems.AND() or QueryItems.OR() methods to explicitly define whether Boolean AND or OR should be used between the multiple items. If just a list is provided instead, a warning will be displayed in the console output. If a single value is used for the parameter, it is still perfectly fine to provide it directly as a string.
QueryArticles and QueryEvents: Added two new supported parameters, sourceLocationUri and sourceGroupUri. Parameter sourceLocationUri can be used to specify a location URI (obtained with EventRegistry.getLocationUri) to use a set of news sources from a specific geographic location. The locations used can be cities or countries. sourceGroupUri can be used to restrict the search to a set of news sources that belong to a manually curated list of news sources (such as top business related sources, top entertainment sources, ...). See the next item for how to find the values for this parameter.
EventRegistry class. Added methods suggestSourceGroups() and getSourceGroupUri() that can be used to get the list of news source groups that match a given name/uri (suggestSourceGroups()) or the single top suggestion (getSourceGroupUri()). Source groups can be used to find or filter content to a specific set of publishers.
sortBy values now also include sourceAlexaGlobalRank (global rank of the news source) and sourceAlexaCountryRank (country rank of the news source).
SourceInfoFlags flag image was added which, if True, adds image and thumbImage fields to the returned source information.
Changed
QueryArticles and QueryEvents: Default values for parameters conceptUri, categoryUri and other parameters that accept lists were changed from [] to None to reflect the preference for using the QueryItems class when specifying an array of values.
QueryArticles and QueryEvents: changed method setArticleUriList() to a static method initWithArticleUriList() to avoid mistakenly creating an instance with query parameters and additionally calling setArticleUriList().
QueryArticles and QueryEvents: method initWithComplexQuery() now also accepts the query as a string value, not only instances of ComplexArticleQuery and ComplexEventQuery.
SourceInfoFlags flag importance was changed to ranking since we now return multiple rankings for the source.
SourceInfoFlags flag tags was changed to sourceGroups since the term tags was too generic.
socialScore property is now named shares to better represent the content. The returned object can now also include shares on Google Plus, Pinterest and LinkedIn. The name of the parameter socialScore in ArticleInfoFlags was also changed to shares.
importance property was changed to an object ranking containing multiple indicators of source importance.
Deprecated
sortBy value sourceImportance is now deprecated. Use the value sourceImportanceRank. It is equivalent to the reversed value of sourceImportance, so also make sure to negate your existing sortByAsc value. The parameter was changed to make it comparable to the added sorting options sourceAlexaGlobalRank and sourceAlexaCountryRank, which also represent rankings (a lower value means a better rank).
Removed
QueryArticles and QueryEvents: removed the conceptOper parameter. Its functionality is now replaced by providing the array of values inside QueryItems.AND() or QueryItems.OR().
QueryArticles and QueryEvents: removed the utility methods addConcept(), addLocation(), addCategory(), addNewsSource(), addKeyword(), setDateLimit(), setDateMentionLimit(). The values of these parameters should be set when initializing the object. The methods were removed because users called the static method initWithComplexQuery() and then additionally called these methods, which had no effect on the results.
For power users we have added a query language that can use nested query objects and AND and OR operators on all query items.
All details about the query language are described on our documentation page:
In this release we introduce two major changes. The first change is the possibility of using iterators to iterate over search results containing events and articles. Details and an example of the iterator can be found in the blog post http://blog.eventregistry.org/2017/03/05/simplifying-the-data-access-with-iterators/ as well as in the documentation:
https://github.com/EventRegistry/event-registry-python/wiki/Searching-for-events#queryeventsiter
https://github.com/EventRegistry/event-registry-python/wiki/Searching-for-articles#queryarticlesiter
https://github.com/EventRegistry/event-registry-python/wiki/Get-event-information#queryeventarticlesiter
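A minimal sketch of the iterator style of access, assuming the eventregistry package is installed and that each returned article is a dict with a title key (an assumption about the result shape). The helper names first_titles and latest_titles are hypothetical; QueryArticlesIter and its execQuery() method are the ones named in these notes.

```python
from itertools import islice


def first_titles(articles, limit=5):
    """Collect up to `limit` titles from any iterable of article dicts."""
    return [art.get("title") for art in islice(articles, limit)]


def latest_titles(api_key, keywords, limit=5):
    """Run a keyword search and return the titles of the first few results.

    Assumes the eventregistry package is installed and api_key is valid;
    the import is lazy so first_titles() works without the package.
    """
    from eventregistry import EventRegistry, QueryArticlesIter

    er = EventRegistry(apiKey=api_key)
    query = QueryArticlesIter(keywords=keywords)
    # The iterator fetches further result pages on demand while we loop,
    # so stopping early avoids requesting pages that are never read.
    return first_titles(query.execQuery(er, sortBy="date"), limit)
```

Keeping the consumption logic in first_titles() separate from the network call makes it easy to exercise with plain lists or generators.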
The other significant change is that we have removed the EventRegistry.login() method. Users should now authenticate using their API key. You can specify your API key when you create an EventRegistry instance:
er = EventRegistry(apiKey = YOUR_API_KEY)
If you don't know how to obtain your API key, please check the documentation: https://github.com/EventRegistry/event-registry-python/wiki/EventRegistry-class#authorization
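To avoid hard-coding the key in source files, it can be read from the environment before the instance is created. This is only a sketch: the variable name ER_API_KEY and the helper names load_api_key and make_client are hypothetical, and the eventregistry package is assumed to be installed when make_client() is called.

```python
import os


def load_api_key(env=None, var_name="ER_API_KEY"):
    """Read the API key from an environment mapping (variable name is hypothetical)."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError("set %s to your Event Registry API key" % var_name)
    return key


def make_client(env=None):
    """Create an EventRegistry instance authenticated with the loaded key."""
    from eventregistry import EventRegistry  # lazy import; package assumed installed

    return EventRegistry(apiKey=load_api_key(env))
```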