Initial commit of linkyrss

nicokruger · Jun 16, 2023 · commit 9c397d0

Showing 31 changed files with 870 additions and 0 deletions.
diff --git a/.gitignore b/.gitignore
@@ -0,0 +1,34 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# Distribution / packaging
+dist/
+build/
+*.egg-info/
+*.egg
+*.whl
+
+# Virtual environments
+venv/
+env/
+.env/
+
+# IDE-specific files
+.idea/
+.vscode/
+*.sublime-project
+*.sublime-workspace
+
+# Test-related files
+*.pytest_cache/
+*.coverage
+htmlcov/
+
+# Other
+.DS_Store
+*.log
+*.swp
+*.swo
+
diff --git a/LICENSE.md b/LICENSE.md
@@ -0,0 +1,9 @@
+The MIT License (MIT)
+
+Copyright (c) Dan Flettre
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/README.md b/README.md
@@ -0,0 +1,120 @@
+# LinkyRSS: AI-Powered Bookmarking Tool 📚🔖
+
+Welcome to **LinkyRSS** - the combination of RSS, Bookmarks and AI. LinkyRSS will automatically summarise the actual content of the things you bookmark and make it available as an RSS feed for tagging/searching/consumption in your existing RSS flow.
+
+
+**Want to try LinkyRSS? Visit our [demo site](http://linkyrssdemo.inmytree.co.za)**
+
+> **Note:** Links posted to the demo site may be cleared out periodically.
+
+![Adding a new link](example_add.png)
+*Adding a new link*
+
+![View page on the web app](example_web.png)
+*View page on the web app*
+
+![Generated RSS feed in RSSGuard](example_rss.png)
+*Generated RSS feed in RSSGuard*
+
+
+
+## Key Features 🌟
+
+- **AI-Generated Summaries:** Don't just store links, get a concise and informative summary of the content. 
+
+- **RSS Feed:** Access your bookmarks anytime, anywhere using your favorite RSS reader. Stay updated on your saved content effortlessly.
+
+- **Automated Title and Summary:** Simply provide a link and LinkyRSS will automatically generate a suitable title and summary.
+
+- **Effective Categorization & Search:** Tired of losing important links in a disorganized mess? With LinkyRSS, you can leverage your existing RSS tagging mechanism.
+
+## System Design 🧩
+
+LinkyRSS is designed to be simple yet robust. At its core, it's a Flask application that uses LangChain to call AI language models such as OpenAI's GPT to generate a title and summary for each link. Data is currently stored in either an in-memory list, which is lost when the process restarts, or a DynamoDB table; the storage interface is small (`save_url` and `get_links`), so other databases are easy to add.
+
+Routes available in the application are:
+- `/` : Home page showing count of links saved.
+- `/api/extract` : POST a form-encoded `url` field here to extract its title and summary and save the link.
+- `/view` : View all stored links with their titles and summaries.
+- `/rss` : RSS feed of all stored links. RSS 2.0 XML compatible with all popular RSS readers.
+
+## Getting Started 🚀
+
+### Prerequisites
+
+- Python 3.6+
+- [OpenAI Key](https://platform.openai.com/signup/)
+- (optional) [AWS Account](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html) for DynamoDB and AWS Lambda support
+
+### Installation
+
+1. Clone the repository to your local machine:
+
+```bash
+git clone https://github.com/YOUR_GITHUB_USERNAME/LinkyRSS.git
+cd LinkyRSS
+```
+
+2. Install the necessary Python packages:
+
+```bash
+pip install -r requirements.txt
+```
+
+3. Export necessary environment variables:
+
+- For OpenAI key:
+
+```bash
+export OPENAI_KEY=your_openai_key
+```
+
+- For AWS DynamoDB (optional)
+
+```bash
+export DATABASE_TYPE=dynamo
+export DYNAMODB_TABLE=your_dynamodb_table
+```
+
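+By default, the application uses an in-memory database. To use DynamoDB, set `DATABASE_TYPE` to `dynamo` and `DYNAMODB_TABLE` to the name of your table. The table needs a partition key named `pk` and a sort key named `sk`, both strings.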
+
+4. Run the application:
+
+```bash
+python app.py
+```
+
+Navigate to `http://localhost:8080` in your web browser to start using LinkyRSS! Set the `PORT` environment variable to listen on a different port.
+
+## Adding LinkyRSS to Your Feed Reader 📑
+
+Adding your LinkyRSS feed to your favorite feed reader is easy. Here's how you can do it:
+
+1. Run your LinkyRSS application. Ensure it's either hosted on a server or running locally.
+
+2. Navigate to the `/rss` endpoint of your LinkyRSS application. For instance, if you're running it locally on port 8080, the URL would be `http://localhost:8080/rss`.
+
+3. Copy this URL.
+
+4. Open your feed reader. The process of adding a new feed may vary between different feed readers. Generally, look for an option to "Add New Feed" or "Subscribe".
+
+5. Paste the copied URL into the feed URL input box.
+
+6. Confirm and add the feed.
+
+Now, your LinkyRSS feed should be added to your feed reader. The reader will update with new bookmarks as they are added to your LinkyRSS.
+
+
+## Contribution Guidelines 👨‍💻👩‍💻
+
+LinkyRSS is an open source project. Feel free to submit any changes.
+
+## License 📄
+
+LinkyRSS is distributed under the MIT license. See [LICENSE](LICENSE.md) for more information.
+
+## Contact 📧
+
+*Your Contact Information*
+
+
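The routes listed in the README can also be driven from a script. The following is a minimal sketch using only the standard library. It assumes the app is running locally on the default port 8080 and that `/api/extract` reads a form-encoded `url` field, as `app.py` in this commit does. The helper name `build_extract_request` and the `BASE_URL` constant are illustrative and not part of the project.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: LinkyRSS is running locally on its default port.
BASE_URL = "http://localhost:8080"


def build_extract_request(link, base_url=BASE_URL):
    """Build (but do not send) the POST that submits a link for summarising."""
    # Flask reads this via request.form, so the body must be form-encoded.
    body = urlencode({"url": link}).encode("utf-8")
    return Request(
        f"{base_url}/api/extract",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

With a running instance, passing the returned object to `urllib.request.urlopen` submits the link, and fetching `BASE_URL + "/rss"` afterwards should show the new entry in the feed.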
diff --git a/app.py b/app.py
@@ -0,0 +1,83 @@
+from flask import Flask, render_template, request, Response
+import werkzeug.exceptions
+from serverless_wsgi import handle_request
+import os
+import logging
+logging.basicConfig(level=logging.DEBUG)
+logging.getLogger('botocore').setLevel(logging.WARNING)
+logging.getLogger('boto3').setLevel(logging.WARNING)
+logging.getLogger('urllib3').setLevel(logging.WARNING)
+logging.getLogger('werkzeug').setLevel(logging.WARNING)
+logging.getLogger('openai').setLevel(logging.WARNING)
+logger = logging.getLogger(__name__)
+
+import linkyai
+import database
+
+
+def make_app():
+    database_type = os.environ.get('DATABASE_TYPE', 'memory')
+    if database_type == 'dynamo':
+        db = database.DynamoDatabase()
+    elif database_type == 'memory':
+        db = database.InMemoryDatabase()
+    else:
+        # default
+        db = database.InMemoryDatabase()
+
+    app = Flask(
+        __name__,
+        static_folder='./static',
+        static_url_path='/',
+        template_folder='templates'
+    )
+
+    @app.route('/api/extract', methods=['POST'])
+    def extract_info():
+        url = request.form.get('url')
+
+        data = linkyai.get_summary(url)
+        logger.info(f"Extracted {data}")
+
+        db.save_url(url, data['title'], data['summary'])
+
+        return render_template('extract.html', data=data)
+
+    @app.route('/')
+    def home():
+        items = db.get_links()
+
+        return render_template('index.html', count=len(items))
+
+
+    @app.route('/view')
+    def view():
+        items = db.get_links()
+
+        return render_template('view.html', items=items)
+
+    @app.route('/rss')
+    def rss():
+        items = db.get_links()
+
+        return Response(render_template('rss.xml', items=items), mimetype='application/rss+xml')
+
+    @app.errorhandler(Exception)
+    def handle_error(e):
+        if isinstance(e, werkzeug.exceptions.NotFound):
+            return render_template('error.html', message='Page not found'), 404
+        else:
+            logger.exception(e)
+            return render_template('error.html', message=str(e)), 500
+
+    return app
+
+def lambda_handler(event, context):
+    if event.get('source', '') == 'keepwarm':
+        print("keepwarm")
+        return {'statusCode': 200}
+    return handle_request(make_app(), event, context)
+
+if __name__ == '__main__':
+    make_app().run(host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))
+
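Note on `extract_info` in `app.py`: the handler passes `request.form.get('url')` straight to `linkyai.get_summary`, so a missing or malformed field only surfaces later as a 500 from the generic error handler. One possible guard is sketched below. `validate_url` is a hypothetical helper, not something this commit contains, and restricting input to absolute `http`/`https` URLs is an assumption about what the summariser can fetch.

```python
from urllib.parse import urlparse


def validate_url(raw):
    """Return a stripped http(s) URL, or raise ValueError if it is unusable."""
    if raw is None or not raw.strip():
        raise ValueError("missing 'url' form field")
    candidate = raw.strip()
    parsed = urlparse(candidate)
    # Assumption: only absolute http(s) URLs are worth sending to the summariser.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {candidate!r}")
    return candidate
```

In the route, the `ValueError` could be caught and turned into a 400 response rendered with the existing `error.html` template, which keeps client mistakes out of the 500 path.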
diff --git a/database.py b/database.py
@@ -0,0 +1,49 @@
+import os
+from datetime import datetime
+from boto3.dynamodb.conditions import Key
+import boto3
+
+class DynamoDatabase:
+    def __init__(self):
+        dynamodb = boto3.resource('dynamodb')
+        self.table = dynamodb.Table(os.environ['DYNAMODB_TABLE'])
+
+        self.pk = 'URL'
+        self.sk_prefix = 'LINK#'
+
+    def save_url(self, url, title, summary):
+        self.table.put_item(
+            Item={
+                'pk': self.pk,
+                'sk': f'{self.sk_prefix}{datetime.now().isoformat()}#{url}',
+                'url': url,
+                'title': title,
+                'summary': summary,
+            }
+        )
+
+    def get_links(self):
+        response = self.table.query(
+            KeyConditionExpression=Key('pk').eq(self.pk) & Key('sk').begins_with(self.sk_prefix),
+            ScanIndexForward=False
+        )
+
+        return response['Items'] or []
+
+class InMemoryDatabase:
+    def __init__(self):
+        self.data = []
+
+    def save_url(self, url, title, summary):
+        self.data.append({
+            'url': url,
+            'title': title,
+            'summary': summary,
+        })
+
+    def get_links(self):
+        # reverse chronological order
+        return self.data[::-1]
+
+
+
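Note on `DynamoDatabase.get_links` in `database.py`: a single DynamoDB `Query` call returns at most 1 MB of data and signals that more results exist through `LastEvaluatedKey`. The method as committed reads only the first page, which is fine for a small table but would silently truncate a large one. A minimal sketch of following the pagination is shown below. The function name `query_all_links` is hypothetical; `table` is assumed to behave like a boto3 `Table` resource, and `key_condition` is the same expression `get_links` already builds.

```python
def query_all_links(table, key_condition):
    """Collect every page of a DynamoDB query, newest first.

    `table` is assumed to expose boto3's Table.query(**kwargs);
    `key_condition` is the KeyConditionExpression to apply.
    """
    kwargs = {
        "KeyConditionExpression": key_condition,
        "ScanIndexForward": False,  # descending sort-key order, as in get_links()
    }
    items = []
    while True:
        response = table.query(**kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            return items
        # Resume the query where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key
```

Taking the table and the condition as parameters keeps the sketch free of a boto3 import, so it can be exercised with a stand-in object in tests.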
diff --git a/example_add.png b/example_add.png
diff --git a/example_rss.png b/example_rss.png
diff --git a/example_web.png b/example_web.png
diff --git a/index.py b/index.py
@@ -0,0 +1,82 @@
+import os
+
+import langchain
+from langchain.chains import LLMChain, LLMRequestsChain
+from langchain.llms import OpenAI
+from langchain.prompts import PromptTemplate
+from langchain.llms import VertexAI
+from langchain import PromptTemplate, LLMChain
+
+#langchain.debug = True
+
+#template = """Question: {question}
+#
+#Answer: Let's think step by step."""
+#
+#prompt = PromptTemplate(template=template, input_variables=["question"])
+#
+#llm = VertexAI()
+#llm_chain = LLMChain(prompt=prompt, llm=llm)
+#
+#question = "What NFL team won the Super Bowl in the year Justin Beiber was born?"
+#
+#a = llm_chain.run(question)
+#print(a)
+
+
+template = """Within the markdown block below is the full content of a website I am interested in.
+
+
+```
+{requests_result}
+```
+
+{query}?"""
+
+
+#"http://feeds.hanselman.com/~/676711904/0/scotthanselman~Using-Home-Assistant-to-integrate-a-Unifi-Protect-G-Doorbell-and-Amazon-Alexa-to-announce-visitors",
+#"http://feeds.hanselman.com/~/673288256/0/scotthanselman~NET-Hot-Reload-and-Refused-to-connect-to-ws-because-it-violates-the-Content-Security-Policy-directive-because-Web-Sockets",
+#"http://feeds.hanselman.com/~/673288256/0/scotthanselman~NET-Hot-Reload-and-Refused-to-connect-to-ws-because-it-violates-the-Content-Security-Policy-directive-because-Web-Sockets",
+#"https://www.theverge.com/2023/6/2/23746354/apple-vr-headset-rumors-metaverse-potential",
+#"https://lifehacker.com/30-of-the-best-queer-movies-of-the-last-century-1850471612",
+#"https://slashdot.org/story/23/06/02/1039236/fidelity-cuts-reddit-valuation-by-41?utm_source=atom1.0mainlinkanon&utm_medium=feed",
+#"https://tech.slashdot.org/story/23/06/02/1237215/meta-requires-office-workers-to-return-to-desks-three-days-a-week?utm_source=atom1.0mainlinkanon&utm_medium=feed",
+#"https://browse.feddit.de/",
+#"https://fedia.io/",
+#"https://blurha.sh/",
+#"https://www.inmytree.co.za",
+#"https://generalrobots.substack.com/p/dimension-hopper-part-1",
+#"https://aws.amazon.com/blogs/machine-learning/technology-innovation-institute-trains-the-state-of-the-art-falcon-llm-40b-foundation-model-on-amazon-sagemaker/"
+
+for url in [x.strip() for x in open("urls.txt").readlines()]:
+    llm = VertexAI(max_output_tokens=1024)
+    PROMPT = PromptTemplate(
+        input_variables=["query", "requests_result"],
+        template=template,
+    )
+
+    chain = LLMRequestsChain(llm_chain = LLMChain(llm=llm, prompt=PROMPT))
+    inputs = {
+        "query": "What is the article about?",
+        "url": url
+    }
+    a = chain(inputs)
+
+#print(a)
+    print("---------")
+    print(url)
+    print(a['output'])
+    print("------------------")
+
+
+
+
+
+
+#from langchain.embeddings import VertexAIEmbeddings
+#
+#embeddings = VertexAIEmbeddings()
+#text = "This is a test document."
+#query_result = embeddings.embed_query(text)
+#doc_result = embeddings.embed_documents([text])
+#print(query_result)
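Note on the loop in `index.py`: it reads `urls.txt` with a bare `open(...)`, which never closes the file explicitly and passes blank lines through as empty URLs. A small loader along the following lines would avoid both. `load_urls` is a hypothetical name, and treating `#` lines as comments and dropping repeated URLs are assumptions, loosely mirroring the commented-out URL list in the script, which contains one entry twice.

```python
from pathlib import Path


def load_urls(path="urls.txt"):
    """Return the non-empty, non-comment lines of `path`, in order, without repeats."""
    seen = set()
    urls = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            url = line.strip()
            # Assumption: blank lines and lines starting with '#' are not URLs.
            if not url or url.startswith("#") or url in seen:
                continue
            seen.add(url)
            urls.append(url)
    return urls
```

The script's loop could then iterate over `load_urls()` instead of the inline list comprehension.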