Floris de Bijl edited this page May 28, 2021 · 1 revision

Welcome to the OpenTitles wiki!

Adding to the media definitions

To add a website to the list of media, fork this project, edit media.json according to the template below, and open a pull request. Add your suggested website under the corresponding country code, such as NL, US, BE, DE or FR. If that code is not present yet, add it in accordance with ISO 3166-1 alpha-2.

Make sure you also add the domain to manifest.json.

Format:

{
  "NAME": "MEDIA NAME",
  "PREFIX": "URL IN COMMON WITH ALL FEEDS",
  "SUFFIX": "SUFFIX IN COMMON WITH ALL FEEDS",
  "FEEDS": [
    "FEED NAME ONE"
    ,"FEED NAME TWO"
    ,"FEED NAME THREE"
  ],
  "ID_CONTAINER": "RSS CHILD ITEM THAT CONTAINS THE ID (USUALLY GUID)",
  "ID_MASK": "REGEX TO RETRIEVE THE ID FROM THE ID_CONTAINER",
  "PAGE_ID_LOCATION": "WHERE THE ID CAN BE FOUND ON THE ARTICLE WEBPAGE",
  "PAGE_ID_QUERY": "WHAT QUERY TO USE WHEN LOOKING FOR THE ID ON THE WEBPAGE",
  "MATCH_DOMAINS": ["DOMAIN ONE", "DOMAIN TWO"],
  "TITLE_QUERY": "CSS SELECTOR FOR THE TITLE ELEMENT"
}
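
Before opening a pull request it can help to sanity-check a new entry against the template above. The following is only a sketch: checkMediaEntry and REQUIRED_KEYS are hypothetical names, not part of OpenTitles, and it assumes that every key in the template is required and that PAGE_ID_LOCATION takes one of the values described in the notes on this page.

```javascript
// Hypothetical helper, not part of the OpenTitles code base.
// Assumption: every key shown in the template is required.
const REQUIRED_KEYS = [
  "NAME", "PREFIX", "SUFFIX", "FEEDS", "ID_CONTAINER", "ID_MASK",
  "PAGE_ID_LOCATION", "PAGE_ID_QUERY", "MATCH_DOMAINS", "TITLE_QUERY",
];

// Returns a list of human-readable problems; an empty list means no problem was found.
function checkMediaEntry(entry) {
  const problems = [];

  for (const key of REQUIRED_KEYS) {
    if (!(key in entry)) problems.push(`missing key: ${key}`);
  }

  // FEEDS and MATCH_DOMAINS are lists in the template.
  for (const key of ["FEEDS", "MATCH_DOMAINS"]) {
    if (key in entry && !Array.isArray(entry[key])) {
      problems.push(`${key} must be an array`);
    }
  }

  // The notes on this page list 'url', 'var' and 'page' as possible values.
  if ("PAGE_ID_LOCATION" in entry &&
      !["url", "var", "page"].includes(entry.PAGE_ID_LOCATION)) {
    problems.push("PAGE_ID_LOCATION must be 'url', 'var' or 'page'");
  }

  // ID_MASK is used as a regex, so it should at least compile.
  if (typeof entry.ID_MASK === "string") {
    try {
      new RegExp(entry.ID_MASK);
    } catch (err) {
      problems.push("ID_MASK is not a valid regular expression");
    }
  }

  return problems;
}
```

Running it over each new entry and printing the returned list is enough to catch a missing key or a broken regex before review.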

Some notes:

  • PREFIX is usually the domain followed by /feeds/ or /rss/. The API joins prefix + feed[i] + suffix to create the target URL for the feed parser. SUFFIX is usually empty, but some sites require .xml or .rss to be appended.
  • RSS feeds consist of <item>s, each describing one article. To identify an article regardless of its title, we need an ID that persists between title changes. Since the RSS standard describes GUID as "A string that uniquely identifies the item.", it is the most likely candidate for our ID. Most sites put the link to the article here, or a shortened permalink. In any event, we need to extract the ID from this element. The API uses ID_CONTAINER to determine which child of <item> contains the ID, and ID_MASK as the regex that extracts the ID from this container.
  • TITLE_QUERY should be a CSS selector that matches the element closest to the title, preferably the h1/span that contains the actual title text. The content script (colloquially 'the plugin' or 'extension') uses it with querySelector to inject the pop-up button.
  • PAGE_ID_LOCATION can be 'url', 'var' or 'page'. 'url' was previously the default: the ID is looked up in the URL using ID_MASK as the regex. 'var' is used when the ID is not present in the URL but in a global variable. 'page' means that the ID can be found as a property of an element on the page; this has not been implemented yet.
  • PAGE_ID_QUERY is currently only used when PAGE_ID_LOCATION is set to var. In that case it simply contains the window property path where the ID is located, for example: CBSNEWS.tracking.articleId.
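
The feed-side notes above (joining PREFIX, feed name and SUFFIX, then applying ID_MASK to the ID_CONTAINER element) can be sketched roughly as follows. This is not the API's actual code: buildFeedUrls and extractFeedId are hypothetical names, the <item> is assumed to have already been parsed into a plain object keyed by child element name, and the guid value is made up. The configuration values are taken from the NOS example on this page.

```javascript
// Hypothetical helpers illustrating the notes; not the actual OpenTitles API code.

// Join PREFIX + feed name + SUFFIX for every entry in FEEDS.
function buildFeedUrls(medium) {
  return medium.FEEDS.map((feed) => medium.PREFIX + feed + medium.SUFFIX);
}

// Apply ID_MASK to the text of the ID_CONTAINER child of an <item>.
// Assumption: `item` is a plain object such as { guid: "...", title: "..." }.
function extractFeedId(medium, item) {
  const container = item[medium.ID_CONTAINER];
  if (typeof container !== "string") return null;
  const match = container.match(new RegExp(medium.ID_MASK));
  return match ? match[0] : null;
}

const nos = {
  PREFIX: "http://feeds.nos.nl/",
  SUFFIX: "",
  FEEDS: ["nosnieuwsbinnenland", "nosnieuwsalgemeen"],
  ID_CONTAINER: "guid",
  ID_MASK: "[0-9]{7}",
};

console.log(buildFeedUrls(nos));
// [ 'http://feeds.nos.nl/nosnieuwsbinnenland', 'http://feeds.nos.nl/nosnieuwsalgemeen' ]

// A made-up guid, for illustration only.
console.log(extractFeedId(nos, { guid: "https://nos.nl/l/1234567" }));
// 1234567
```

Note that the regex returns the first match only, so an ID_MASK that is too loose (for example one that also matches a date in the guid) would pick up the wrong digits.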

Example:

{
  "NAME": "NOS",
  "PREFIX": "http://feeds.nos.nl/",
  "SUFFIX": "",
  "FEEDS": [
    "nosnieuwsbinnenland"
    ,"nosnieuwsalgemeen"
    ,"nosnieuwsbuitenland"
    ,"nosnieuwspolitiek"
    ,"nosnieuwseconomie"
    ,"nosnieuwscultuurenmedia"
    ,"nosnieuwstech"
    ,"nosnieuwskoningshuis"
    ,"nossportalgemeen"
  ],
  "ID_CONTAINER": "guid",
  "ID_MASK": "[0-9]{7}",
  "PAGE_ID_LOCATION": "url",
  "PAGE_ID_QUERY": "",
  "MATCH_DOMAINS": ["nos.nl", "jeugdjournaal.nl"],
  "TITLE_QUERY": ".article__title"
}
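
On the page side, the ID has to be found again so it can be matched against the one taken from the feed. The sketch below shows one way the 'url' and 'var' values of PAGE_ID_LOCATION could be handled. It is an illustration under assumptions, not the extension's real code: findPageId and resolvePath are hypothetical names, the page is passed in as a plain { url, window } object so the example runs outside a browser, and the article URL and ID values are made up.

```javascript
// Hypothetical helpers; not the actual OpenTitles content script.

// Walk a dotted property path such as "CBSNEWS.tracking.articleId" from a root object.
function resolvePath(root, path) {
  return path
    .split(".")
    .reduce((obj, key) => (obj == null ? undefined : obj[key]), root);
}

// Find the article ID according to PAGE_ID_LOCATION.
// Assumption: `page` stands in for the browser as { url, window }.
function findPageId(medium, page) {
  switch (medium.PAGE_ID_LOCATION) {
    case "url": {
      // Look the ID up in the URL, using ID_MASK as the regex.
      const match = page.url.match(new RegExp(medium.ID_MASK));
      return match ? match[0] : null;
    }
    case "var": {
      // PAGE_ID_QUERY holds the window property path.
      const value = resolvePath(page.window, medium.PAGE_ID_QUERY);
      return value == null ? null : String(value);
    }
    default:
      // 'page' is described as not implemented yet.
      return null;
  }
}

// Values from the NOS example; the article URL is made up.
const nos = { ID_MASK: "[0-9]{7}", PAGE_ID_LOCATION: "url", PAGE_ID_QUERY: "" };
console.log(findPageId(nos, { url: "https://nos.nl/artikel/1234567-voorbeeld", window: {} }));
// 1234567

// A made-up medium that keeps its ID in a global variable.
const viaVar = { ID_MASK: "", PAGE_ID_LOCATION: "var", PAGE_ID_QUERY: "CBSNEWS.tracking.articleId" };
const fakeWindow = { CBSNEWS: { tracking: { articleId: 42 } } };
console.log(findPageId(viaVar, { url: "https://example.com/", window: fakeWindow }));
// 42
```

In a real content script the page argument would presumably be replaced by window.location.href and window itself.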