Merge branch 'master' of github.com:binary-husky/chatgpt_academic
binary-husky committed Jul 15, 2023
2 parents b19a615 + 801f734 commit a3e938a
Showing 21 changed files with 1,531 additions and 192 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/build-with-chatglm.yml
@@ -1,5 +1,5 @@
# https://docs.github.com/en/actions/publishing-packages/publishing-docker-images#publishing-images-to-github-packages
-name: Create and publish a Docker image for ChatGLM support
+name: build-with-chatglm

on:
  push:
2 changes: 1 addition & 1 deletion .github/workflows/build-with-latex.yml
@@ -1,5 +1,5 @@
# https://docs.github.com/en/actions/publishing-packages/publishing-docker-images#publishing-images-to-github-packages
-name: Create and publish a Docker image for Latex support
+name: build-with-latex

on:
  push:
2 changes: 1 addition & 1 deletion .github/workflows/build-without-local-llms.yml
@@ -1,5 +1,5 @@
# https://docs.github.com/en/actions/publishing-packages/publishing-docker-images#publishing-images-to-github-packages
-name: Create and publish a Docker image
+name: build-without-local-llms

on:
  push:
1 change: 1 addition & 0 deletions .gitignore
@@ -150,3 +150,4 @@ request_llm/jittorllms
multi-language
request_llm/moss
media
flagged
30 changes: 20 additions & 10 deletions README.md
@@ -25,7 +25,7 @@ To translate this project to arbitrary language with GPT, read and run [`multi_la

<div align="center">

-Feature | Description
+Feature (⭐ = recently added) | Description
--- | ---
One-click polishing | One-click polishing and one-click grammar-error checking for papers
One-click Chinese-English translation | One-click translation between Chinese and English
@@ -45,11 +45,12 @@ One-click Latex proofreading | [Function Plugin] Grammarly-style grammar checking for Latex articles
[Google Scholar integration assistant](https://www.bilibili.com/video/BV19L411U7ia) | [Function Plugin] Given any Google Scholar search page URL, let GPT [write your related works](https://www.bilibili.com/video/BV1GP411U7Az/) for you
Internet information aggregation + GPT | [Function Plugin] One click to [let GPT fetch information from the Internet](https://www.bilibili.com/video/BV1om4y127ck) and answer questions, so information never goes stale
⭐Fine-grained Arxiv paper translation | [Function Plugin] One click to [translate arxiv papers at ultra-high quality](https://www.bilibili.com/video/BV1dz4y1v77A/), currently the best paper-translation tool
[Real-time voice input](https://github.com/binary-husky/gpt_academic/blob/master/docs/use_audio.md) | [Function Plugin] Asynchronously listens to audio, fully hands-free, with automatic sentence segmentation and automatic reply timing
Formula/image/table display | Shows both the [tex form and rendered form](https://user-images.githubusercontent.com/96192199/230598842-1d7fcddd-815d-40ee-af60-baf488a199df.png) of formulas, with formula and code highlighting
Multi-threaded function plugin support | Call chatgpt in multiple threads, one-click processing of [massive texts](https://www.bilibili.com/video/BV1FT411H7c5/) or programs
Dark [theme](https://github.com/binary-husky/gpt_academic/issues/173) on startup | Append ```/?__theme=dark``` to the browser URL to switch to the dark theme
-[Multi-LLM model](https://www.bilibili.com/video/BV1wT411p7yf) support | Being served by GPT3.5, GPT4, [Tsinghua ChatGLM](https://github.com/THUDM/ChatGLM-6B) and [Fudan MOSS](https://github.com/OpenLMLab/MOSS) all at once must feel great, right?
-ChatGLM2 fine-tuned model | Load a ChatGLM2 fine-tuned model, with a ChatGLM2 fine-tuning plugin
+[Multi-LLM model](https://www.bilibili.com/video/BV1wT411p7yf) support | Being served by GPT3.5, GPT4, [Tsinghua ChatGLM2](https://github.com/THUDM/ChatGLM2-6B) and [Fudan MOSS](https://github.com/OpenLMLab/MOSS) all at once must feel great, right?
+ChatGLM2 fine-tuned model | Load a ChatGLM2 fine-tuned model, with a ChatGLM2 fine-tuning helper plugin
More LLM models, with [huggingface deployment](https://huggingface.co/spaces/qingxu98/gpt-academic) support | Newbing (new Bing) interface added; Tsinghua [Jittorllms](https://github.com/Jittor/JittorLLMs) introduced, supporting [LLaMA](https://github.com/facebookresearch/llama) and [PanGu-α](https://openi.org.cn/pangu/)
More new features (image generation, etc.) …… | See the end of this document ……

@@ -115,12 +116,12 @@ python -m pip install -r requirements.txt # this step is the same as the pip installation step
```


-<details><summary>If you need Tsinghua ChatGLM/Fudan MOSS as backends, click here to expand</summary>
+<details><summary>If you need Tsinghua ChatGLM2/Fudan MOSS as backends, click here to expand</summary>
<p>

-【Optional step】If you need Tsinghua ChatGLM/Fudan MOSS as backends, extra dependencies must be installed (prerequisites: familiar with Python + have used Pytorch + a powerful enough machine):
+【Optional step】If you need Tsinghua ChatGLM2/Fudan MOSS as backends, extra dependencies must be installed (prerequisites: familiar with Python + have used Pytorch + a powerful enough machine):
```sh
-# 【Optional Step I】Support Tsinghua ChatGLM. ChatGLM note: if you hit the error "Call ChatGLM fail 不能正常加载ChatGLM的参数", see the following: 1: the default install above is the torch+cpu build; to use cuda, uninstall torch and reinstall the torch+cuda build; 2: if the model cannot be loaded because the local machine is not powerful enough, change the model precision in request_llm/bridge_chatglm.py, replacing every AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True) with AutoTokenizer.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True)
+# 【Optional Step I】Support Tsinghua ChatGLM2. ChatGLM note: if you hit the error "Call ChatGLM fail 不能正常加载ChatGLM的参数", see the following: 1: the default install above is the torch+cpu build; to use cuda, uninstall torch and reinstall the torch+cuda build; 2: if the model cannot be loaded because the local machine is not powerful enough, change the model precision in request_llm/bridge_chatglm.py, replacing every AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True) with AutoTokenizer.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True)
python -m pip install -r request_llm/requirements_chatglm.txt

# 【Optional Step II】Support Fudan MOSS
@@ -144,6 +145,8 @@ python main.py
### 安装方法II:使用Docker

1. ChatGPT only (recommended for most people; equivalent to docker-compose scheme 1)
[![basic](https://github.com/binary-husky/gpt_academic/actions/workflows/build-without-local-llms.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-without-local-llms.yml)
[![basic](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-latex.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-latex.yml)

``` sh
git clone https://github.com/binary-husky/gpt_academic.git # download the project
@@ -158,7 +161,8 @@ docker run --rm -it -e WEB_PORT=50923 -p 50923:50923 gpt-academic
```
P.S. For plugin features that depend on Latex, see the Wiki. Alternatively, you can get Latex support directly with docker-compose (edit docker-compose.yml: keep scheme 4 and delete the other schemes).

-2. ChatGPT + ChatGLM + MOSS (requires familiarity with Docker)
+2. ChatGPT + ChatGLM2 + MOSS (requires familiarity with Docker)
[![chatglm](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-chatglm.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-chatglm.yml)

``` sh
# Edit docker-compose.yml: keep scheme 2 and delete the other schemes. Then adjust scheme 2's configuration in docker-compose.yml, following the comments there
@@ -284,6 +288,7 @@ Tip: clicking `载入对话历史存档` without specifying a file lets you view the history h

### II: Versions:
- version 3.5 (Todo): invoke all of this project's function plugins with natural language (high priority)
- version 3.46: fully hands-free real-time voice conversation
- version 3.45: custom ChatGLM2 fine-tuned models supported
- version 3.44: official Azure support; improved UI usability
- version 3.4: + arxiv paper translation and latex paper proofreading features
@@ -306,13 +311,18 @@ gpt_academic developer QQ group 2: 610599535
- Some browser translation extensions interfere with the frontend of this software
- The official Gradio currently has many compatibility bugs; be sure to install Gradio via `requirement.txt`

-### III: References and learning
+### III: Themes
The theme can be changed by modifying the `THEME` option (config.py)
1. `Chuanhu-Small-and-Beautiful` [link](https://github.com/GaiZhenbiao/ChuanhuChatGPT/)


### IV: References and learning

```
Designs from many other excellent projects are referenced in the code, in no particular order:
# Tsinghua ChatGLM-6B:
https://github.com/THUDM/ChatGLM-6B
# Tsinghua ChatGLM2-6B:
https://github.com/THUDM/ChatGLM2-6B
# Tsinghua JittorLLMs:
https://github.com/Jittor/JittorLLMs
8 changes: 8 additions & 0 deletions config.py
@@ -89,6 +89,8 @@
# Whether to automatically clear the input box on submit
AUTO_CLEAR_TXT = False

# Color theme; options: ["Default", "Chuanhu-Small-and-Beautiful"]
THEME = "Default"

# Add a live2d decoration
ADD_WAIFU = False
@@ -123,3 +125,9 @@
NEWBING_COOKIES = """
put your new bing cookies here
"""


# Aliyun real-time speech recognition; configuration is fairly involved and recommended for advanced users only; see https://github.com/binary-husky/gpt_academic/blob/master/docs/use_audio.md
ENABLE_AUDIO = False
ALIYUN_TOKEN=""     # e.g. f37f30e0f9934c34a992f6f64f7eba4f
ALIYUN_APPKEY=""    # e.g. RoPlZrM88DnAFkZK
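Elsewhere in the project these new keys are read back through `toolbox.get_conf` (as in the `crazy_functional.py` hunk below). A minimal sketch of that pattern, with a hypothetical stand-in for the real config module (the stand-in `config` namespace and simplified `get_conf` are assumptions, not the project's actual implementation):

```python
import types

# Stand-in for config.py (the empty token/appkey values mirror the defaults above)
config = types.SimpleNamespace(ENABLE_AUDIO=False, ALIYUN_TOKEN="", ALIYUN_APPKEY="")

def get_conf(*names):
    # Simplified sketch of toolbox.get_conf: look each requested key up in config
    # and return all values as a tuple, so callers can unpack several at once
    return tuple(getattr(config, n) for n in names)

ENABLE_AUDIO, = get_conf('ENABLE_AUDIO')            # single-key unpack
TOKEN, APPKEY = get_conf('ALIYUN_TOKEN', 'ALIYUN_APPKEY')
assert ENABLE_AUDIO is False  # audio plugin stays unregistered by default
```

Returning a tuple even for a single key is why the callers below write `ENABLE_AUDIO, = get_conf('ENABLE_AUDIO')` with a trailing comma.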
20 changes: 18 additions & 2 deletions crazy_functional.py
@@ -392,7 +392,7 @@ def get_crazy_functions():
    })
    from crazy_functions.Latex输出PDF结果 import Latex翻译中文并重新编译PDF
    function_plugins.update({
-        "Arixv翻译(输入arxivID)[需Latex]": {
+        "Arxiv论文精细翻译(输入arxivID)[需Latex]": {
            "Color": "stop",
            "AsButton": False,
            "AdvancedArgs": True,
@@ -403,7 +403,7 @@
        }
    })
    function_plugins.update({
-        "本地论文翻译(上传Latex压缩包)[需Latex]": {
+        "本地Latex论文精细翻译(上传Latex项目)[需Latex]": {
            "Color": "stop",
            "AsButton": False,
            "AdvancedArgs": True,
@@ -416,6 +416,22 @@
    except:
        print('Load function plugin failed')


    try:
        from toolbox import get_conf
        ENABLE_AUDIO, = get_conf('ENABLE_AUDIO')
        if ENABLE_AUDIO:
            from crazy_functions.语音助手 import 语音助手
            function_plugins.update({
                "实时音频采集": {
                    "Color": "stop",
                    "AsButton": True,
                    "Function": HotReload(语音助手)
                }
            })
    except:
        print('Load function plugin failed')

    # try:
    #     from crazy_functions.虚空终端 import 终端
    #     function_plugins.update({
93 changes: 93 additions & 0 deletions crazy_functions/live_audio/aliyunASR.py
@@ -0,0 +1,93 @@
import time, threading, json


class AliyunASR():

    def test_on_sentence_begin(self, message, *args):
        # print("test_on_sentence_begin:{}".format(message))
        pass

    def test_on_sentence_end(self, message, *args):
        # print("test_on_sentence_end:{}".format(message))
        message = json.loads(message)
        self.parsed_sentence = message['payload']['result']
        self.event_on_entence_end.set()
        print(self.parsed_sentence)

    def test_on_start(self, message, *args):
        # print("test_on_start:{}".format(message))
        pass

    def test_on_error(self, message, *args):
        # print("on_error args=>{}".format(args))
        pass

    def test_on_close(self, *args):
        self.aliyun_service_ok = False
        pass

    def test_on_result_chg(self, message, *args):
        # print("test_on_chg:{}".format(message))
        message = json.loads(message)
        self.parsed_text = message['payload']['result']
        self.event_on_result_chg.set()

    def test_on_completed(self, message, *args):
        # print("on_completed:args=>{} message=>{}".format(args, message))
        pass


    def audio_convertion_thread(self, uuid):
        # capture audio in an asynchronous thread
        import nls  # pip install git+https://github.com/aliyun/alibabacloud-nls-python-sdk.git
        import tempfile
        from scipy import io
        from toolbox import get_conf
        from .audio_io import change_sample_rate
        from .audio_io import RealtimeAudioDistribution
        NEW_SAMPLERATE = 16000
        rad = RealtimeAudioDistribution()
        rad.clean_up()
        temp_folder = tempfile.gettempdir()
        TOKEN, APPKEY = get_conf('ALIYUN_TOKEN', 'ALIYUN_APPKEY')
        self.aliyun_service_ok = True
        URL = "wss://nls-gateway.aliyuncs.com/ws/v1"
        sr = nls.NlsSpeechTranscriber(
            url=URL,
            token=TOKEN,
            appkey=APPKEY,
            on_sentence_begin=self.test_on_sentence_begin,
            on_sentence_end=self.test_on_sentence_end,
            on_start=self.test_on_start,
            on_result_changed=self.test_on_result_chg,
            on_completed=self.test_on_completed,
            on_error=self.test_on_error,
            on_close=self.test_on_close,
            callback_args=[uuid.hex]
        )

        r = sr.start(aformat="pcm",
                     enable_intermediate_result=True,
                     enable_punctuation_prediction=True,
                     enable_inverse_text_normalization=True)

        while not self.stop:
            # time.sleep(self.capture_interval)
            audio = rad.read(uuid.hex)
            if audio is not None:
                # convert to a pcm file
                temp_file = f'{temp_folder}/{uuid.hex}.pcm'
                dsdata = change_sample_rate(audio, rad.rate, NEW_SAMPLERATE)  # 48000 --> 16000
                io.wavfile.write(temp_file, NEW_SAMPLERATE, dsdata)
                # read the pcm binary back
                with open(temp_file, "rb") as f: data = f.read()
                # print('audio len:', len(audio), '\t ds len:', len(dsdata), '\t need n send:', len(data)//640)
                slices = zip(*(iter(data),) * 640)  # groups of 640 bytes
                for i in slices: sr.send_audio(bytes(i))
            else:
                time.sleep(0.1)

        if not self.aliyun_service_ok:
            self.stop = True
            self.stop_msg = 'Aliyun音频服务异常,请检查ALIYUN_TOKEN和ALIYUN_APPKEY是否过期。'
        r = sr.stop()
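The `zip(*(iter(data),) * 640)` idiom above frames the PCM byte stream into 640-byte chunks (20 ms of 16 kHz, 16-bit mono audio) before each `sr.send_audio` call. A small standalone sketch of how it behaves, using fake PCM bytes (the payload sizes are illustrative assumptions), including the detail that `zip` silently drops a trailing partial frame:

```python
# Fake PCM payload: 6400 zero bytes plus a 100-byte remainder
data = bytes(6400) + bytes(100)

# Same framing idiom as in audio_convertion_thread above:
# one shared iterator repeated 640 times yields 640-byte groups
frames = [bytes(chunk) for chunk in zip(*(iter(data),) * 640)]

assert len(frames) == 10                   # 6400 // 640 full frames
assert all(len(f) == 640 for f in frames)
# The trailing 100 bytes are dropped by zip; the capture loop tolerates this
# because the next audio block delivers fresh samples, so the loss is bounded
```

`itertools.batched` (Python 3.12+) is the modern equivalent, with the difference that it keeps the short final batch rather than discarding it.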
51 changes: 51 additions & 0 deletions crazy_functions/live_audio/audio_io.py
@@ -0,0 +1,51 @@
import numpy as np
from scipy import interpolate


def Singleton(cls):
    _instance = {}

    def _singleton(*args, **kargs):
        if cls not in _instance:
            _instance[cls] = cls(*args, **kargs)
        return _instance[cls]

    return _singleton


@Singleton
class RealtimeAudioDistribution():
    def __init__(self) -> None:
        self.data = {}
        self.max_len = 1024*1024
        self.rate = 48000  # read-only; samples per second

    def clean_up(self):
        self.data = {}

    def feed(self, uuid, audio):
        self.rate, audio_ = audio
        # print('feed', len(audio_), audio_[-25:])
        if uuid not in self.data:
            self.data[uuid] = audio_
        else:
            new_arr = np.concatenate((self.data[uuid], audio_))
            if len(new_arr) > self.max_len: new_arr = new_arr[-self.max_len:]
            self.data[uuid] = new_arr

    def read(self, uuid):
        if uuid in self.data:
            res = self.data.pop(uuid)
            print('\r read-', len(res), '-', max(res), end='', flush=True)
        else:
            res = None
        return res


def change_sample_rate(audio, old_sr, new_sr):
    duration = audio.shape[0] / old_sr

    time_old = np.linspace(0, duration, audio.shape[0])
    time_new = np.linspace(0, duration, int(audio.shape[0] * new_sr / old_sr))

    interpolator = interpolate.interp1d(time_old, audio.T)
    new_audio = interpolator(time_new).T
    return new_audio.astype(np.int16)
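`change_sample_rate` above resamples by linear interpolation with scipy. An equivalent standalone sketch for 1-D mono audio, with `np.interp` standing in for `scipy.interpolate.interp1d` (this simplified variant is an assumption for illustration, not part of the project), shows the expected 48 kHz → 16 kHz length change:

```python
import numpy as np

# Hypothetical 1-D variant of change_sample_rate above; np.interp performs
# the same linear interpolation as interp1d for a one-dimensional signal
def change_sample_rate_1d(audio, old_sr, new_sr):
    duration = audio.shape[0] / old_sr
    time_old = np.linspace(0, duration, audio.shape[0])
    time_new = np.linspace(0, duration, int(audio.shape[0] * new_sr / old_sr))
    return np.interp(time_new, time_old, audio).astype(np.int16)

# One second of a 440 Hz tone at 48 kHz, int16 like the capture pipeline
one_second = (np.sin(np.linspace(0, 2 * np.pi * 440, 48000)) * 1000).astype(np.int16)
downsampled = change_sample_rate_1d(one_second, 48000, 16000)
assert downsampled.shape[0] == 16000  # one second of audio at the new rate
assert downsampled.dtype == np.int16
```

Note that plain interpolation does no low-pass filtering, so frequencies above the new Nyquist limit (8 kHz here) would alias; for speech fed to an ASR service this is usually an acceptable trade-off for simplicity.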
2 changes: 1 addition & 1 deletion crazy_functions/对话历史存档.py
@@ -12,7 +12,7 @@ def write_chat_to_file(chatbot, history=None, file_name=None):
        file_name = 'chatGPT对话历史' + time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime()) + '.html'
    os.makedirs('./gpt_log/', exist_ok=True)
    with open(f'./gpt_log/{file_name}', 'w', encoding='utf8') as f:
-        from theme import advanced_css
+        from theme.theme import advanced_css
        f.write(f'<!DOCTYPE html><head><meta charset="utf-8"><title>对话历史</title><style>{advanced_css}</style></head>')
        for i, contents in enumerate(chatbot):
            for j, content in enumerate(contents):