HyperFeeder generates a personally tailored podcast -- just for you (or your audience)!
HyperFeeder is a fully configurable and extensible framework for building automated podcasts however you want. HyperFeeder uses a plugin system that generates a podcast step by step from Intro and music to individual podcast segments and news stories to how the podcast will be arranged and presented. At each step you can either choose from an existing list of plugins that will generate different kinds of content or simply write your own plugin for any step of the podcast generation process.
With existing tools and plugins you can currently build a podcast with data from Hacker News, Reddit, any podcast with transcripts in the RSS Feed, RSS Newsletters like those on Substack. The plan is to expand this tool to ingest many different configurable sources for building podcast content from as well as new sources for augmenting the source content. See the Issues Tab for the planned and in-progress roadmap. Also, feel free to open new issues for feature requests or pull requests for new features you'd like to contribute back.
- The podcasts produced by this framework should be fully autonomous and need no human intervention to search for content, generate audio, compose audio segments, produce instrospective metadata about podcast content and publish to a podcast feed. (In progress)
- Anyone should be able to submit easily composable new features that can help make all autonomous podcasts better. (In Progress)
- Any source of data can be a source of podcast content. (In Progress)
-
For the text to speech generation script I use say for simplicity right now which is only available on MacOS. I welcome Pull Requests to update the script or add new versions of the script for text-to-speech compatabillity with other systems.
-
For audio Stitching this project uses FFMPEG which can be downloaded for any system here
-
You will need Python version 3.9.17 or newer. If your Python version is older, you can use
pyenv
to manage multiple versions of Python on your system. Here's how to do it:-
If you don't have
pyenv
installed, you can install it following the instructions on the pyenv GitHub. -
After
pyenv
is installed, you can install Python 3.9.17 with the following commands:pyenv install 3.9.17 pyenv global 3.9.17
-
Now, check your Python version:
python --version
You should see
Python 3.9.17
as the output.
-
-
To access some features, you will need an OpenAI API key. Here's how to get one:
-
After signing up and logging in, navigate to the API section.
-
Here, you'll find your API key. Make sure to keep this key secure and do not share it with anyone.
Clone this repository using git:
git clone <repo_url>
cd <repo_directory>
Run the setup script:
./setup_script.sh
HyperFeeder is made to be easily configurable and extensible with plugins. You can easily use existing plugins in different configurations by either modifying the plugins used in each step of the podcast generation process manually in the .config.env
file or you can run the configurePlugins.sh
script to use preset plugin configurations for generating a podcast based on either news
or podcasts
.
Different plugins require specific data sources and configuration options to be set in the .config.env
to work properly. These must be updated manually (for now).
Check the plugin directories for details on what each plugin requires in the .config.env
file to be run. You can also check your active plugins by looking at the very top of the .config.env file. The current Plugin Types are: NEW_PODCAST_DATA_SOURCE_PLUGINS
, NEW_PODCAST_INTRO_PLUGINS
, NEW_PODCAST_SCRAPER_PLUGINS
, NEW_PODCAST_SUMMARY_PLUGINS
(DEPRECATED), NEW_PODCAST_SEGMENT_WRITER_PLUGINS
.
After setting up, install the required dependencies:
pip install -r requirements.txt
Here's a brief overview of the main folders and files in this project:
generatePodcast.sh
: Runs all scripts in the correct sequence to build a podcastpodcastTextGenerationApp/
: The main Python application.podcastMetaInfoScripts/
: Scripts to manage podcast metadata (chapters and description).audioScripts/
: Scripts to manage audio files.audioScripts/podcast_intro_music.mp3
: Default intro music
To generate a new podcast you can run the generatePodcast script by running python generatePodcast.py
.
If the podcast generation process fails at some point you can re-run any of the failed scripts defined in generatePodcast.py directly.
Running generatePodcast.py
just runs the scripts in the correct sequence.
If you want to run / debug / modify the python app for podcast text generation (this is where most of the podcast script generation logic lives) you can follow these instructions:
To run the application, navigate to the podcastTextGenerationApp
directory and run app.py
:
cd podcastTextGenerationApp
python podcastTextGenerator.py
Every output produced by each plugin is saved in the output directory under a folder with the name of the current Date Time when the script was run. In order to retry the podcast generation simply pass the name of the folder created in the output directory like so:
./generatePodcast.sh Podcast-Jul01-2023-07AM
The podcastTextGenerationApp
directory is the heart of this framework. When modifying and creating your own podcasts this is most likely where you will want to start. The podcastTextGenerationApp
uses a plugin architecture so you can extend the functionality of this app and easily contribute your own plugins!
When the app is run by the generatePodcast.py script it will proceed to generate text for a podcast by invoking plugins in the following order:
-
podcastDataSourcePlugins: These plugins will be invoked to generate the urls for a set of "Stories" that will be used to generate the rest of the podcast. Any plugin used here must return data in the form of a "Story" class. Subclasses of Story are valid outputs of this plugin as well, see the HackerNewsStory class as an example.
-
podcastIntroPlugins: These plugins will be invoked to write the Intro for the podcast based on the "stories" picked by the
podcastDataSourcePlugins
. This is a good place to inject some personality or branding to the podcast based on your choice of plugins or modification to existing plugins. -
podcastScraperPlugins: These plugins will be invoked to scrape the text from the urls determined by the
podcastDataSourcePlugins
. This plugin adds a 'raw_text' directory filled with the raw text scraped from these urls that will be used by the next set of plugins. -
podcastSegmentWriterPlugins: These plugins generate the final text that will be used to produce spoken audio for the podcast. The output of this plugin will be written to the podcast's
segment_text
directory. -
podcastOutroWriterPlugins: These plugins generate a the outro to the podcast. This is another good place to inject some personality or branding to the podcast based on your choice of plugins or modification to existing plugins.
In order to build a useful long term podcast you're going to need to be able to store some state about stories covered in the past. To ensure the rest of this application can remain stateless to allow greater deployability options any state that plugins need to access should be stored in a simple Firebase Realtime DB. However, this app should work without firebase as a dependency and any firebase dependencies will remain optional for anyone wanting to run this app. Firebase is extremely simple to setup in your project by following a few simple steps:
- Go to your Firebase Console and add a new project: https://console.firebase.google.com/
- Once you've created your new project, go to "Project Settings" and open the "Service Accounts" Tab.
- Click the "Generate New Private Key" button
- Add the newly downloaded private key file to the
/secrets
folder in this directory. This folder should be ignored by git, but you should ensure this file is not uploaded to a git repository or anywhere public. - Update the
FIREBASE_SERVICE_ACCOUNT_KEY_PATH
variable in this directory's.config.env
with a path to the private key file. - Finally open the Realtime Database page in your new Firebase Project and copy the Database Reference Url ending with
.firebaseio.com
- Also add this url to the same
.config.env
file with the keyFIREBASE_DATABASE_URL
.
The NewsletterRSSFeedPlugin
fetches stories from newsletters in RSS feed format and uses Firebase to store the timestamp of the last fetched story for each feed. If a Firebase FIREBASE_DATABASE_URL
environment variable is not defined then this plugin simply returns a list of the most recent newsletter items in the RSS Feed.
+-------------------+
| podcastDataSource |
| Plugins |
+---------+---------+
|
|
+---------v---------+
| podcastIntro |
| Plugins |
+---------+---------+
|
|
+---------v---------+
| podcastScraper |
| Plugins |
+---------+---------+
|
|
|
+---------v---------+
| podcastSegment |
| WriterPlugins |
+-------------------+
|
|
+---------v---------+
| podcastOutro |
| WriterPlugins |
+-------------------+
|
|
+---------v---------+
| podcastProducer |
| Plugins |
+-------------------+
|
|
v
- You can easily change the intro background music by replacing the file
podcast_intro_music.mp3
with your own music. This file was generated with Suno.
- Modify the intro by modifying the prompt for the intro here:
podcastTextGenerationApp/podcastIntroWriter.py
- Modify the way that the news story segments are presented by modifying the prompt here:
podcastTextGenerationApp/storySegmentWriter.py
Contributions, issues and feature requests are welcome! Feel free to check issues page. If you'd like to contribute new features, open a pull request.
To run tests the $PYTHONPATH for the current session must include the podcastTextGenerationApp
directory.
Here's an example of how this can be set prior to running tests:
export PYTHONPATH=${PYTHONPATH}:/Users/<your username>/<your path to this app>/HyperFeeder/podcastTextGenerationApp
Change directory to the HyperFeeder
top level directory (if you're not there already).
Then you should be able to run python -m unittest
successfully.
If you have any questions, feel free to reach out to me at <[email protected]>
.
Enjoy your new podcast!
Thanks goes to these wonderful people (emoji key):
David Norman 🚇 |
This project follows the all-contributors specification. Contributions of any kind welcome!