Kafka Connect RSS and Atom Source Connector.
The connector supports polling multiple URLs and sends its output to a single topic. A sample configuration file can be found in the repository here.
URLs should be percent-encoded and separated by spaces. URLs are split evenly across tasks: for example, with 5 URLs and `tasks.max` set to 3, three tasks are created with 2, 2, and 1 URLs respectively. If `tasks.max` is higher than the number of URLs provided, only as many tasks as there are URLs are created, each with 1 URL.
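The splitting rule above can be illustrated with a short sketch. This is not the connector's actual code: the class name `UrlTaskSplit` and the method `split` are hypothetical, and assigning URLs to tasks in contiguous order is an assumption. Only the resulting group sizes follow from the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the connector: shows how N URLs
// could be spread over at most tasks.max groups of near-equal size.
public class UrlTaskSplit {

    public static List<List<String>> split(List<String> urls, int maxTasks) {
        // Never create more tasks than there are URLs.
        int groups = Math.min(urls.size(), maxTasks);
        List<List<String>> result = new ArrayList<>();
        if (groups <= 0) {
            return result;
        }
        int base = urls.size() / groups;   // minimum URLs per task
        int extra = urls.size() % groups;  // first 'extra' tasks get one more
        int start = 0;
        for (int i = 0; i < groups; i++) {
            int size = base + (i < extra ? 1 : 0);
            // Contiguous assignment is an assumption made for this sketch.
            result.add(new ArrayList<>(urls.subList(start, start + size)));
            start += size;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> urls = List.of("u1", "u2", "u3", "u4", "u5");
        // 5 URLs with tasks.max=3 yields groups of 2, 2 and 1 URLs.
        System.out.println(split(urls, 3));
        // tasks.max=10 with 5 URLs yields 5 groups of 1 URL each.
        System.out.println(split(urls, 10).size());
    }
}
```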
The connector has the following configuration options:
Name | Description | Type | Default Value | Importance |
---|---|---|---|---|
rss.urls | RSS or Atom feed URLs | string | | high |
topic | Topic to write to | string | | high |
sleep.seconds | Time in seconds that the connector waits before querying a feed again | int | 60 | medium |
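As an illustration of how these options fit together, a standalone-mode properties file might look like the sketch below. The connector name, topic name, and the second feed URL are made up for the example, and the `connector.class` value is an assumption that should be checked against the sample configuration in the repository. `name`, `connector.class`, and `tasks.max` are standard Kafka Connect properties.

```properties
# Illustrative values only.
name=rss-source-example
# Assumed class name; verify against the repository's sample configuration.
connector.class=org.kaliy.kafka.connect.rss.RssSourceConnector
tasks.max=2
# Percent-encoded URLs separated by a single space.
rss.urls=http://rss.cnn.com/rss/edition.rss https://example.com/feed%20with%20spaces.xml
topic=rss-items
sleep.seconds=60
```

With two URLs and `tasks.max=2`, each task would poll one feed.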
Each message has the following schema:
{
  "schema": {
    "type": "struct",
    "fields": [
      {
        "type": "struct",
        "fields": [
          {
            "type": "string",
            "optional": true,
            "field": "title"
          },
          {
            "type": "string",
            "optional": false,
            "field": "url"
          }
        ],
        "optional": false,
        "name": "org.kaliy.kafka.rss.Feed",
        "version": 1,
        "field": "feed"
      },
      {
        "type": "string",
        "optional": false,
        "field": "title"
      },
      {
        "type": "string",
        "optional": false,
        "field": "id"
      },
      {
        "type": "string",
        "optional": false,
        "field": "link"
      },
      {
        "type": "string",
        "optional": true,
        "field": "content"
      },
      {
        "type": "string",
        "optional": true,
        "field": "author"
      },
      {
        "type": "string",
        "optional": true,
        "field": "date"
      }
    ],
    "optional": false,
    "name": "org.kaliy.kafka.rss.Item",
    "version": 1
  }
}
Sample message produced with the JSON converter without an embedded schema:
{
  "feed": {
    "title": "CNN.com - RSS Channel - App International Edition",
    "url": "http://rss.cnn.com/rss/edition.rss"
  },
  "title": "The 56,000-mile electric car journey",
  "id": "https://www.cnn.com/2019/03/22/motorsport/electric-car-around-the-world-wiebe-wakker-spt-intl/index.html",
  "link": "https://www.cnn.com/2019/03/22/motorsport/electric-car-around-the-world-wiebe-wakker-spt-intl/index.html",
  "content": "For three years and 90,000 kilometers and counting, he's traveled the world powered both by electricity and strangers' kindness.",
  "author": "CNN",
  "date": "2019-03-22T13:34:17Z"
}
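Because `date` is an optional string field, a consumer has to handle both a missing value and a value it cannot parse. The sketch below assumes the value is an ISO-8601 UTC timestamp as in the sample message; the connector may emit other formats for other feeds. The class `ItemDate` and its `parse` method are hypothetical names used only for illustration.

```java
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical consumer-side helper, not part of the connector.
public class ItemDate {

    // Returns the parsed timestamp, or empty if the field is absent
    // or not in the ISO-8601 instant format assumed here.
    public static Optional<Instant> parse(String date) {
        if (date == null || date.trim().isEmpty()) {
            return Optional.empty();
        }
        try {
            return Optional.of(Instant.parse(date.trim()));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Value taken from the sample message above.
        System.out.println(parse("2019-03-22T13:34:17Z").isPresent());
        // A missing date field is represented as null by most JSON mappers.
        System.out.println(parse(null).isPresent());
    }
}
```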
- 0.1.0 (2019-03-24): Initial release
- 0.1.1 (2022-11-24):
- 0.1.2 (2022-11-24):
  - Support podcasts: the link is a URL for downloading the file. (#11)
Some development notes can be found here.
To compile the project and run the unit and integration tests, use the `mvn verify` command.