Add parsers and blogs #169

Open
ccstan99 opened this issue Aug 27, 2023 · 1 comment
Labels
agisf Related to AGISF course

Comments

ccstan99 (Collaborator) commented Aug 27, 2023

To handle suggestions from agisf:

Add these sites so that their entire blogs are scraped:

  • epochai.org/blog
  • ai-risk-discussions.org
  • aiscc.org

Implement parsers for special_docs/indices:

  • 80000hours
  • givewell.org
  • twitter.com
  • csis.org
  • blog.heim.xyz
  • astralcodexten.substack.com
  • rethinkpriorities.org

ccstan99 added the agisf label Aug 27, 2023
ccstan99 (Collaborator, Author) commented Sep 9, 2023

Would it help to use LangChain's WebBaseLoader as the default until the remaining parsers are implemented?
https://python.langchain.com/docs/integrations/document_loaders/web_base

from langchain.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://epochai.org/blog/")
docs = loader.load()
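
One way the "default until implemented" idea could be wired in is sketched below. It assumes site-specific parsers are registered per domain, which may not match how this repository organizes them. The names `Parser`, `domain_of`, `pick_parser`, and `webbase_fallback` are hypothetical and exist only for illustration. The only LangChain call used is the WebBaseLoader one shown above.

```python
from typing import Callable, Dict, List
from urllib.parse import urlparse

# Hypothetical type: a parser takes a URL and returns the extracted texts.
Parser = Callable[[str], List[str]]


def domain_of(url: str) -> str:
    """Return the hostname of a URL, without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def pick_parser(url: str, parsers: Dict[str, Parser], default: Parser) -> Parser:
    """Use the site-specific parser when one is registered, else the default."""
    return parsers.get(domain_of(url), default)


def webbase_fallback(url: str) -> List[str]:
    """Generic fallback built on the WebBaseLoader call from this comment."""
    # Imported lazily so the dispatch logic works without LangChain installed.
    from langchain.document_loaders import WebBaseLoader

    return [doc.page_content for doc in WebBaseLoader(url).load()]
```

With this shape, a domain such as csis.org would go through `webbase_fallback` until its own parser is added to the registry, at which point `pick_parser` would return the specific one without any change to the calling code.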
