
Commit 816f60c: "Initial Commit" (0 parents)


113 files changed: +171,745 / −0 lines


.gitignore

+8

```text
build
dist
*.egg-info
**/__pycache__/
*.log
**/*.lic
.vscode
.editorconfig
```

README.md

+64

# Gorilla: Large Language Model Connected with Massive APIs

By Shishir G. Patil, Tianjun Zhang, Xin Wang, and Joseph E. Gonzalez ([Project Website](https://shishirpatil.github.io/gorilla/))

![](https://github.com/ShishirPatil/gorilla/blob/gh-pages/assets/img/logo.png)

_See the paper!_ [https://arxiv.org/abs/2305.15334](https://arxiv.org/abs/2305.15334)

_Join our Discord!_ [https://discord.gg/3apqwwME](https://discord.gg/3apqwwME)

`Gorilla` enables LLMs to use tools by invoking APIs. Given a natural language query, Gorilla comes up with the semantically and syntactically correct API to invoke. With Gorilla, we are the first to demonstrate how to use LLMs to invoke 1,600+ (and growing) API calls accurately while reducing hallucination. Join us as we try to build the API store for LLMs! Hop on our Discord, open a PR, or email us if you would like to have your API incorporated as well.
## Repository Organization

We include the `APIBench` dataset created by self-instruct in `data/apibench`. Documentation for all 1,640 APIs is in `data/api`. We convert this into an LLM-friendly chat format for evaluation; the questions are in `eval/eval-data/questions`, and the corresponding responses are in `eval/eval-data/responses`. The evaluation scripts are in `eval/eval-scripts`. This is sufficient to train Gorilla yourself and reproduce our results.

Additionally, to make it more accessible, we will also release the model weights soon! Either way, if you run into any issues, please feel free to reach out to us through Discord or email, or raise a GitHub issue.
## Abstract

From our [paper](https://arxiv.org/abs/2305.15334):

```text
Large Language Models (LLMs) have seen an impressive wave of advances recently, with models
now excelling in a variety of tasks, such as mathematical reasoning and program synthesis.
However, their potential to effectively use tools via API calls remains unfulfilled.
This is a challenging task even for today's state-of-the-art LLMs such as GPT-4, largely
due to their inability to generate accurate input arguments and their tendency to hallucinate
the wrong usage of an API call. We release Gorilla, a finetuned LLaMA-based model that surpasses
the performance of GPT-4 on writing API calls. When combined with a document retriever,
Gorilla demonstrates a strong capability to adapt to test-time document changes, enabling
flexible user updates or version changes. It also substantially mitigates the issue of
hallucination, commonly encountered when prompting LLMs directly. To evaluate the model's
ability, we introduce APIBench, a comprehensive dataset consisting of HuggingFace, TorchHub,
and TensorHub APIs. The successful integration of the retrieval system with Gorilla demonstrates
the potential for LLMs to use tools more accurately, keep up with frequently updated documentation,
and consequently increase the reliability and applicability of their outputs.
```
## FAQ(s)

Can we use Gorilla with LangChain, Toolformer, AutoGPT, etc.?

Absolutely! You've highlighted a great aspect of our tools. Gorilla is an end-to-end model, specifically tailored to serve correct API calls without requiring any additional coding. It's designed to work as part of a wider ecosystem and can be flexibly integrated with other tools.

LangChain is a versatile developer tool. Its "agents" can efficiently swap in any LLM, Gorilla included, making it a highly adaptable solution for various needs.

AutoGPT, on the other hand, concentrates on the art of prompting GPT-series models. It's worth noting that Gorilla, as a fully fine-tuned model, consistently shows remarkable accuracy and lower hallucination, outperforming GPT-4 in making specific API calls.

Toolformer zeroes in on a select set of tools, providing specialized functionalities. Gorilla, in contrast, can manage thousands of API calls, offering broader coverage over a more extensive range of tools.

The beauty of these tools truly shines when they collaborate, complementing each other's strengths and capabilities to create an even more powerful and comprehensive solution. This is where your contribution can make a difference. We enthusiastically welcome any input to further refine and enhance these tools.
## Citation

```text
@article{patil2023gorilla,
  title={Gorilla: Large Language Model Connected with Massive APIs},
  author={Shishir G. Patil and Tianjun Zhang and Xin Wang and Joseph E. Gonzalez},
  year={2023},
  journal={arXiv preprint arXiv:2305.15334},
}
```

The following data files are large diffs and are not rendered by default:

- `data/api/huggingface_api.jsonl` (+936 lines)
- `data/api/tensorflowhub_api.jsonl` (+696 lines)
- `data/api/torchhub_api.jsonl` (+94 lines)
- `data/apibench/huggingface_eval.json` (+911 lines)
- `data/apibench/huggingface_train.json` (+8,191 lines)
- `data/apibench/tensorflow_eval.json` (+688 lines)
- `data/apibench/tensorflow_train.json` (+6,190 lines)
- `data/apibench/torchhub_eval.json` (+186 lines)
- `data/apibench/torchhub_train.json` (+837 lines)

eval/README.md

+34

# Gorilla

![](https://github.com/ShishirPatil/gorilla/blob/gh-pages/assets/img/gorilla_method.png)

## Get Started

### Getting GPT-3.5-turbo, GPT-4 and Claude Responses

To get an LLM's responses for the API-call questions, run:

```bash
python get_llm_responses.py --model gpt-3.5-turbo --api_key $API_KEY --output_file gpt-3.5-turbo_torchhub_0_shot.jsonl --question_data ../data/questions/questions_0_shot_torchhub.jsonl --api_name torchhub
```
### Evaluate the Response with AST Tree Matching

After the LLM responses are generated, evaluate them against our dataset:

```bash
cd eval-scripts
python ast_eval_th.py --api_dataset ../../data/api/torchhub_api.jsonl --apibench ../../data/apibench/torchhub_eval.json --llm_responses ../eval-data/responses/torchhub/response_torchhub_Gorilla_FT_0_shot.jsonl
```
## Citation

If you use Gorilla in your work, please cite us with:

```text
@article{patil2023gorilla,
  title={Gorilla: Large Language Model Connected with Massive APIs},
  author={Shishir G. Patil and Tianjun Zhang and Xin Wang and Joseph E. Gonzalez},
  year={2023},
  journal={arXiv preprint arXiv:2305.15334},
}
```
