Clean up project #9

Merged
merged 40 commits into from
Apr 21, 2024
00b6794
initial commit
a-gleeson Apr 16, 2024
8c1542c
Requirements.txt: removed llama-cpp-python, added anthropic
James-Osmond Apr 16, 2024
71da0b3
Removed llama-cpp-python, added anthropic to `poetry.lock` and `pypro…
James-Osmond Apr 16, 2024
a469e10
Updated pyproject.toml to allow local running of app
James-Osmond Apr 16, 2024
c8e5ac0
Removed empty import
James-Osmond Apr 16, 2024
9709905
Linted with black
James-Osmond Apr 16, 2024
1178556
Added comment
James-Osmond Apr 16, 2024
1e5b28e
Reran pre-commit hooks
James-Osmond Apr 16, 2024
e2b9f10
Update README.md `pre-commit` instructions
James-Osmond Apr 16, 2024
95f577d
Added Transcript() class
James-Osmond Apr 16, 2024
3a67eea
Removed pandas import
James-Osmond Apr 16, 2024
016b7f1
Downloadable transcripts
James-Osmond Apr 16, 2024
63506a8
update poetry
Apr 16, 2024
d82faa1
New page for transcript
James-Osmond Apr 16, 2024
9b86b0f
Added download txt button
James-Osmond Apr 16, 2024
431ca32
formatting and edit attendees section
Apr 16, 2024
88ee055
api notebook
a-gleeson Apr 16, 2024
c1d72f6
update notebook
a-gleeson Apr 16, 2024
20c2849
Example `query_llm()` function
James-Osmond Apr 16, 2024
8ff3741
Summarise button
James-Osmond Apr 16, 2024
312826f
formatting
hannabh Apr 16, 2024
b7e8247
format transcript page
hannabh Apr 16, 2024
2f6b298
rename pages
hannabh Apr 16, 2024
bd53074
summary page formatting
hannabh Apr 16, 2024
750a6dc
New homepage with cabinet photo
James-Osmond Apr 16, 2024
7154b7e
Fixed typo
James-Osmond Apr 16, 2024
db199ac
Added `transcript` argument to `llm_summarise()` function
James-Osmond Apr 16, 2024
91c5cf3
api class
a-gleeson Apr 16, 2024
7a4b2d5
Fnc to interrogate model and return body text
PhilOutram Apr 16, 2024
4193deb
working API integration
a-gleeson Apr 16, 2024
74e7817
format response
a-gleeson Apr 16, 2024
4dcc573
api integrations
a-gleeson Apr 17, 2024
b68c96d
add logo
hannabh Apr 17, 2024
f3f8108
Conflicts fixed
James-Osmond Apr 17, 2024
2452958
Conversation works
James-Osmond Apr 17, 2024
cf890ae
Conversation bot (WIP)
James-Osmond Apr 17, 2024
997112a
Conversation works
James-Osmond Apr 17, 2024
1821559
clean up project
a-gleeson Apr 21, 2024
7446dd6
add in prompts into doc string
a-gleeson Apr 21, 2024
887ec74
missing files from rebase
a-gleeson Apr 21, 2024
7 changes: 7 additions & 0 deletions .dockerignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
*.env
models
models/*
data
data/*
notebooks
notebooks/*
16 changes: 16 additions & 0 deletions .env.template
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
export ENV="dev"
export PROJECT_PATH="/Users/<xx>/"
export OPENSEARCH_URL="https://localhost:9200"

export LOADER_CONFIG="file_loader" # defaults to s3_loader
export VECTOR_STORE_CONFIG="opensearch" # defaults to opensearch
export LLM_MODEL="local_llm" # defaults to hosted_llm

export SUMMARISE_API="xxxxxxx"
export SUMMARISE_URL="https://xxxx.amazonaws.com/api"
export FACTCHECK_API="xxxxxxx"
export FACTCHECK_URL="https://xxxx.amazonaws.com/api"
export GLOSSERY_API="xxxxxxx"
export GLOSSERY_URL="https://xxxx.amazonaws.com/api"
export CONVERSATION_API="xxxxxxx"
export CONVERSATION_URL="https://xxxx.amazonaws.com/api"
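
Review note: the template pairs each service with a `<NAME>_API` key and a `<NAME>_URL` endpoint. A minimal sketch of how the app could read one such pair is shown below. `Endpoint` and `load_endpoint` are hypothetical names used for illustration only; they are not taken from this PR, and the real app may load its settings differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """URL and API key for one hosted service (hypothetical helper type)."""

    url: str
    api_key: str


def load_endpoint(prefix: str, env=None) -> Endpoint:
    """Read <PREFIX>_URL and <PREFIX>_API, failing loudly if either is unset.

    `env` defaults to os.environ; passing a plain dict makes testing easier.
    """
    env = os.environ if env is None else env
    names = (f"{prefix}_URL", f"{prefix}_API")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return Endpoint(url=env[names[0]], api_key=env[names[1]])
```

Under these assumptions, `load_endpoint("SUMMARISE")` would return the summarise endpoint once `.env` has been loaded into the environment. The prefix has to match the template's spelling exactly, so the glossary service would be looked up as `"GLOSSERY"`.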
3 changes: 3 additions & 0 deletions .gitattributes
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
*.ipynb filter=nbstripout
*.zpln filter=nbstripout
*.ipynb diff=ipynb
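
Review note: the `filter=nbstripout` attribute routes notebooks through a clean filter so that cell outputs are not committed. The sketch below illustrates the general idea using only the standard library. It is a simplified illustration, not nbstripout's actual implementation, which handles more cases (metadata, configuration options); `strip_outputs` is a hypothetical name.

```python
import json


def strip_outputs(notebook_json: str) -> str:
    """Return notebook JSON with code-cell outputs and execution counts cleared."""
    nb = json.loads(notebook_json)
    for cell in nb.get("cells", []):
        # Only code cells carry outputs; markdown and raw cells are left alone.
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return json.dumps(nb, indent=1)
```

Committing the stripped form keeps notebook diffs small and avoids storing data or model responses in the repository history.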
10 changes: 10 additions & 0 deletions .github/workflows/black.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
name: Black

on: [push, pull_request]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: psf/black@stable
10 changes: 10 additions & 0 deletions .github/workflows/isort.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
name: isort
on:
  - push

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: isort/isort-action@v1
169 changes: 169 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,169 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

#mac
.DS_Store

#models
/models/*

#data
/data/*
/vacancies
2 changes: 2 additions & 0 deletions .isort.cfg
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
[settings]
profile = black
15 changes: 15 additions & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
repos:
  - repo: https://github.com/psf/black
    rev: 24.3.0
    hooks:
      - id: black
  - repo: https://github.com/pycqa/isort
    rev: 5.13.2
    hooks:
      - id: isort
        name: isort (python)
        args: ["--profile", "black"]
  - repo: https://github.com/kynan/nbstripout
    rev: 0.7.1
    hooks:
      - id: nbstripout
1 change: 1 addition & 0 deletions .python-version
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
3.10.13
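
Review note: `.python-version` pins 3.10.13, the README's conda step reads the same file, and the Dockerfile uses a `python:3.10-slim` base image. A small check such as the one below could confirm that the running interpreter agrees with the pin. This is a sketch under assumptions: `same_minor_version` is a hypothetical helper that is not part of this PR, and it compares only major and minor versions because the Docker image does not fix the patch release.

```python
import sys
from pathlib import Path


def same_minor_version(version_file, current=None) -> bool:
    """Compare the major.minor pinned in `version_file` with the interpreter's."""
    pinned = Path(version_file).read_text().strip()
    pinned_parts = tuple(int(part) for part in pinned.split(".")[:2])
    # Default to the interpreter running this code; a tuple can be passed for tests.
    current_parts = tuple(sys.version_info[:2] if current is None else current[:2])
    return pinned_parts == current_parts
```

Calling `same_minor_version(".python-version")` from the repo root would then report whether the active environment matches the pinned 3.10 series.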
1 change: 1 addition & 0 deletions .venv
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
hackathon
32 changes: 32 additions & 0 deletions Dockerfile
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
FROM python:3.10-slim AS builder

ENV PYTHONUNBUFFERED=1
ENV PYTHONDONTWRITEBYTECODE=1
ENV POETRY_VIRTUALENVS_CREATE=false

WORKDIR /app

RUN apt-get update -y && apt-get install -y gcc g++ && \
    # Prevent apt-get cache from being persisted to this layer.
    rm -rf /var/lib/apt/lists/*
RUN pip install --upgrade pip && pip install poetry

COPY poetry.lock* pyproject.toml /app/

COPY . /app/

RUN poetry install --no-interaction --only main --no-ansi
#--no-root --sync

# ---- Run Stage ----
FROM python:3.10-slim AS runner

WORKDIR /app

COPY --from=builder /app /app
COPY --from=builder /usr/local/lib/python3.10/site-packages /usr/local/lib/python3.10/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin

EXPOSE 8501

CMD ["streamlit", "run", "app/home.py"]
23 changes: 10 additions & 13 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
# hackathon



## Workflow

## How it works (for the hackathon)
@@ -12,11 +11,6 @@

# Set up

https://python.langchain.com/docs/integrations/chat/llama2_chat
https://python.langchain.com/docs/templates/llama2-functions
https://huggingface.co/blog/llama2#how-to-prompt-llama-2
https://python.langchain.com/docs/integrations/llms/llamacpp#grammars

## 1. pyenv

Install here: [https://github.com/pyenv/pyenv#homebrew-on-macos]
@@ -133,6 +127,14 @@ Pre commit hooks run after commit to fix up formatting and other issues. Install
pre-commit install
```

You can then run:

```sh
pre-commit run --all-files
```

and commit any changes made to files in the repo.

## 6. Add secrets into .env

- Run `cp .env.template .env` and update the secrets.
@@ -199,9 +201,7 @@ docker-compose down

Check OpenSearch by visiting http://localhost:5601/app/login? or by running `curl https://localhost:9200 -ku 'admin:admin'`

## Sagemaker setup
- Launch a SageMaker Notebook from SageMaker > Notebook > Notebook instances > Create notebook instance
- Select `ml.g4dn.xlarge` instance type (see [https://aws.amazon.com/sagemaker/pricing/] for pricing)
## Conda setup

### Install Python dependencies

@@ -211,11 +211,8 @@ Create a new terminal and run the following:
# Switch to a bash shell
bash

# Change to the repo root
cd ~/SageMaker/hackathon

# Activate a Python 3.10 environment pre-configured with PyTorch
conda create -n hackathon python=3.10.13
# Create and activate a Python 3.10 environment
conda create -n hackathon python=$(cat .python-version)
conda activate hackathon

Empty file added app/__init__.py
Empty file.