Commit 5d827ea

committed: add figs and deploy
1 parent 947d10f commit 5d827ea

File tree: 6 files changed (+92 −53 lines)

.github/workflows/update_space.yml (+66 −12)
```diff
@@ -1,28 +1,82 @@
-name: Run Python script
+name: Deploy to Hugging Face Spaces
 
 on:
   push:
     branches:
       - main
 
 jobs:
-  build:
+  deploy:
     runs-on: ubuntu-latest
 
     steps:
       - name: Checkout
-        uses: actions/checkout@v2
+        uses: actions/checkout@v3
 
-      - name: Set up Python
-        uses: actions/setup-python@v2
+      - name: Set up Conda
+        uses: conda-incubator/setup-miniconda@v2
         with:
-          python-version: '3.9'
+          activate-environment: kg4s
+          environment-file: environment.yml
+          auto-activate-base: false
+          use-mamba: true
 
-      - name: Install Gradio
-        run: python -m pip install gradio
+      - name: Verify Conda installation
+        shell: bash -l {0}
+        run: |
+          conda info
+          conda list
 
-      - name: Log in to Hugging Face
-        run: python -c 'import huggingface_hub; huggingface_hub.login(token="${{ secrets.hf_token }}")'
+      - name: Install huggingface_hub
+        shell: bash -l {0}
+        run: |
+          pip install huggingface_hub
+          pip list
 
-      - name: Deploy to Spaces
-        run: gradio deploy
+      - name: Deploy to Hugging Face Spaces
+        env:
+          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+        shell: bash -l {0}
+        run: |
+          python - <<EOF
+          import os
+          import sys
+          from huggingface_hub import HfApi
+
+          print("Python script started")
+          print(f"Python version: {sys.version}")
+          print(f"Current working directory: {os.getcwd()}")
+          print(f"Contents of current directory: {os.listdir('.')}")
+
+          sys.path.append('scripts')
+          print(f"Updated sys.path: {sys.path}")
+          print(f"Contents of scripts directory: {os.listdir('scripts')}")
+
+          print("Importing demo from run_db_interface")
+          from run_db_interface import demo
+          print("Demo imported successfully")
+
+          api = HfApi()
+          print("HfApi initialized")
+
+          print("Creating/verifying repository")
+          api.create_repo(
+              repo_id="abby101/xurveyor-0",
+              repo_type="space",
+              space_sdk="gradio",
+              token="$HF_TOKEN"
+          )
+          print("Repository created or verified")
+
+          print("Starting deployment")
+          demo.deploy(
+              repo_id="abby101/xurveyor-0",
+              hf_token="$HF_TOKEN",
+          )
+          print("Deployment completed")
+          EOF
+
+      - name: Check Hugging Face Space
+        run: |
+          echo "Deployment process completed. Please check your Hugging Face Space at https://huggingface.co/spaces/abby101/xurveyor-0"
+          echo "If the space is not updated, please check the logs above for any errors."
```
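One note on the token handling in the deploy step: because the heredoc delimiter is unquoted (`<<EOF`), bash substitutes `$HF_TOKEN` into the Python source before it runs. An alternative that keeps the secret out of the script text is to read it from the step's environment instead. A minimal sketch, assuming the step's `env:` block exports `HF_TOKEN` as in the diff; the `load_token` helper is hypothetical:

```python
import os

def load_token(env=os.environ):
    """Fetch the Space token exported by the workflow step's `env:` block."""
    token = env.get("HF_TOKEN", "")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; check the workflow's env block")
    return token

# The token can then be passed explicitly, e.g. HfApi(token=load_token()),
# instead of relying on shell expansion inside the unquoted heredoc.
```

Reading the secret via `os.environ` also avoids the token appearing in any echoed or logged copy of the generated script.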

.gitignore (+3 −1)

```diff
@@ -165,5 +165,7 @@ requirements.txt
 wandb/
 slurm_logs/
 notebooks/
-misc/
+misc/polymathic_data_files
+misc/notes
+misc/test.ipynb
 Meta-Llama-3-70B-Instruct/
```

README.md (+23 −25)

```diff
@@ -1,9 +1,3 @@
----
-title: git_config_-global_credential.helper_store
-app_file: scripts/run_db_interface.py
-sdk: gradio
-sdk_version: 4.40.0
----
 # Mapping the Data Landscape For Generalizable Scientific Models
 
 This is a WIP that builds a knowledge base to store structured information extracted from scientific publications, datasets and articles using LLMs.
@@ -14,25 +8,29 @@ This tool helps us identify the gaps where current foundation models lack covera
 
 We use the Llama-3-70B-Instruct model for structured information extraction.
 
-<div style="display: flex; justify-content: space-between; gap: 20px;">
-  <figure style="margin: 0; width: 48%;">
-    <img src="misc/eval_pipeline.png" alt="Fig 1" style="width: 100%; height: 300px; object-fit: contain;">
-    <figcaption style="font-size: 0.9em; text-align: center; margin-top: 10px;">
-      Prompt optimization pipeline to maximize precision of the model annotated
-      predictions by running on manually annotated subset of scientific corpora.
-      The tagged outputs can be generated as JSON or in a readable format, and be
-      generated using temperature and nucleus sampling (sweep hyperparams).
-    </figcaption>
-  </figure>
-  <figure style="margin: 0; width: 48%;">
-    <img src="misc/pipeline.png" alt="Fig 2" style="width: 100%; height: 300px; object-fit: contain;">
-    <figcaption style="font-size: 0.9em; text-align: center; margin-top: 10px;">
-      Illustration of the structured prediction pipeline on the full corpus of
-      scientific papers, which runs optimized prompts and stores the model's
-      outputs in a SQL db.
-    </figcaption>
-  </figure>
-</div>
+## Workflow
+
+<table>
+  <tr>
+    <td width="50%" valign="top">
+      <img src="misc/eval_pipeline.png" alt="Fig 1" width="100%">
+      <p align="center">
+        <em>Fig 1: Prompt optimization pipeline to maximize precision of the model annotated
+        predictions by running on manually annotated subset of scientific corpora. The
+        tagged outputs can be generated as JSON or in a readable format, and be
+        generated using temperature and nucleus sampling (sweep hyperparams).</em>
+      </p>
+    </td>
+    <td width="50%" valign="top">
+      <img src="misc/pipeline.png" alt="Fig 2" width="100%">
+      <p align="center">
+        <em>Fig 2: Illustration of the structured prediction pipeline on the full corpus of
+        scientific papers, which runs optimized prompts and stores the model's outputs in
+        a SQL db.</em>
+      </p>
+    </td>
+  </tr>
+</table>
 
 ## Installation
```

misc/eval_pipeline.png (binary image, 82.8 KB)

misc/pipeline.png (binary image, 171 KB)

requirements.txt (−15)

This file was deleted.
