Merge branch 'master' of github.com:binary-husky/chatgpt_academic

binary-husky · Jul 15, 2023 · a3e938a
2 parents b19a615 + 801f734
Showing 21 changed files with 1,531 additions and 192 deletions.
diff --git a/.github/workflows/build-with-chatglm.yml b/.github/workflows/build-with-chatglm.yml
@@ -1,5 +1,5 @@
 # https://docs.github.com/en/actions/publishing-packages/publishing-docker-images#publishing-images-to-github-packages
-name: Create and publish a Docker image for ChatGLM support
+name: build-with-chatglm
 
 on:
   push:

diff --git a/.github/workflows/build-with-latex.yml b/.github/workflows/build-with-latex.yml
@@ -1,5 +1,5 @@
 # https://docs.github.com/en/actions/publishing-packages/publishing-docker-images#publishing-images-to-github-packages
-name: Create and publish a Docker image for Latex support
+name: build-with-latex
 
 on:
   push:

diff --git a/.github/workflows/build-without-local-llms.yml b/.github/workflows/build-without-local-llms.yml
@@ -1,5 +1,5 @@
 # https://docs.github.com/en/actions/publishing-packages/publishing-docker-images#publishing-images-to-github-packages
-name: Create and publish a Docker image
+name: build-without-local-llms
 
 on:
   push:

diff --git a/.gitignore b/.gitignore
@@ -150,3 +150,4 @@ request_llm/jittorllms
 multi-language
 request_llm/moss
 media
+flagged
diff --git a/README.md b/README.md
@@ -25,7 +25,7 @@ To translate this project to arbitary language with GPT, read and run [`multi_la
 
 <div align="center">
 
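-Feature | Description
+Feature (⭐ = recently added) | Description
 --- | ---
 One-click polishing | One-click polishing and one-click grammar checking for papers
 One-click Chinese-English translation | One-click translation between Chinese and English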
@@ -45,11 +45,12 @@ Latex论文一键校对 | [函数插件] 仿Grammarly对Latex文章进行语法
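 [Google Scholar integration assistant](https://www.bilibili.com/video/BV19L411U7ia) | [Function plugin] Given the URL of any Google Scholar search page, let gpt help you [write related works](https://www.bilibili.com/video/BV1GP411U7Az/)
 Internet information aggregation + GPT | [Function plugin] One-click [let GPT fetch information from the internet](https://www.bilibili.com/video/BV1om4y127ck) to answer questions, so the information never goes out of date
 ⭐Arxiv paper fine-grained translation | [Function plugin] One-click [very high-quality translation of arxiv papers](https://www.bilibili.com/video/BV1dz4y1v77A/), currently the best paper translation tool
+⭐[Real-time voice dialogue input](https://github.com/binary-husky/gpt_academic/blob/master/docs/use_audio.md) | [Function plugin] Listens to audio asynchronously, fully hands-free, with automatic sentence segmentation and automatic detection of when to answer
 Formula/image/table display | Shows formulas in both [tex source and rendered form](https://user-images.githubusercontent.com/96192199/230598842-1d7fcddd-815d-40ee-af60-baf488a199df.png); supports formula and code highlighting
 Multi-threaded function plugin support | Supports multi-threaded calls to chatgpt for one-click processing of [massive amounts of text](https://www.bilibili.com/video/BV1FT411H7c5/) or programs
 Dark [theme](https://github.com/binary-husky/gpt_academic/issues/173) at startup | Append ```/?__theme=dark``` to the browser url to switch to the dark theme
-[Multiple LLM models](https://www.bilibili.com/video/BV1wT411p7yf) support | Being served by GPT3.5, GPT4, [Tsinghua ChatGLM](https://github.com/THUDM/ChatGLM-6B) and [Fudan MOSS](https://github.com/OpenLMLab/MOSS) at the same time must feel great, right?
-ChatGLM2 fine-tuned models | Supports loading ChatGLM2 fine-tuned models and provides a ChatGLM2 fine-tuning plugin
+[Multiple LLM models](https://www.bilibili.com/video/BV1wT411p7yf) support | Being served by GPT3.5, GPT4, [Tsinghua ChatGLM2](https://github.com/THUDM/ChatGLM2-6B) and [Fudan MOSS](https://github.com/OpenLMLab/MOSS) at the same time must feel great, right?
+⭐ChatGLM2 fine-tuned models | Supports loading ChatGLM2 fine-tuned models and provides a ChatGLM2 fine-tuning helper plugin
 More LLM models, with [huggingface deployment](https://huggingface.co/spaces/qingxu98/gpt-academic) support | Adds the Newbing interface (new Bing) and introduces Tsinghua [Jittorllms](https://github.com/Jittor/JittorLLMs) with support for [LLaMA](https://github.com/facebookresearch/llama) and [PanGu-α](https://openi.org.cn/pangu/)
 More new feature demos (image generation, etc.) …… | See the end of this document ……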
 
@@ -115,12 +116,12 @@ python -m pip install -r requirements.txt # 这个步骤和pip安装一样的步
 ```
 
 
-<details><summary>Click here to expand if you need Tsinghua ChatGLM/Fudan MOSS as a backend</summary>
+<details><summary>Click here to expand if you need Tsinghua ChatGLM2/Fudan MOSS as a backend</summary>
 <p>
 
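-[Optional step] To support Tsinghua ChatGLM/Fudan MOSS as a backend, additional dependencies must be installed (prerequisites: familiarity with Python, experience with Pytorch, and a sufficiently powerful machine):
+[Optional step] To support Tsinghua ChatGLM2/Fudan MOSS as a backend, additional dependencies must be installed (prerequisites: familiarity with Python, experience with Pytorch, and a sufficiently powerful machine):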
 ```sh
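-# [Optional step I] Support Tsinghua ChatGLM. Note on Tsinghua ChatGLM: if you hit the error "Call ChatGLM fail 不能正常加载ChatGLM的参数" (ChatGLM parameters could not be loaded), check the following: 1: the default install above is the torch+cpu build, so to use cuda you need to uninstall torch and reinstall torch+cuda; 2: if the model cannot be loaded because the machine is not powerful enough, lower the model precision in request_llm/bridge_chatglm.py by changing every AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True) to AutoTokenizer.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True)
+# [Optional step I] Support Tsinghua ChatGLM2. Note on Tsinghua ChatGLM: if you hit the error "Call ChatGLM fail 不能正常加载ChatGLM的参数" (ChatGLM parameters could not be loaded), check the following: 1: the default install above is the torch+cpu build, so to use cuda you need to uninstall torch and reinstall torch+cuda; 2: if the model cannot be loaded because the machine is not powerful enough, lower the model precision in request_llm/bridge_chatglm.py by changing every AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True) to AutoTokenizer.from_pretrained("THUDM/chatglm-6b-int4", trust_remote_code=True)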
 python -m pip install -r request_llm/requirements_chatglm.txt  
 
 # [Optional step II] Support Fudan MOSS
@@ -144,6 +145,8 @@ python main.py
 ### Installation method II: Using Docker
 
 1. ChatGPT only (recommended for most users; equivalent to docker-compose scheme 1)
+[![basic](https://github.com/binary-husky/gpt_academic/actions/workflows/build-without-local-llms.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-without-local-llms.yml)
+[![basic](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-latex.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-latex.yml)
 
 ``` sh
 git clone https://github.com/binary-husky/gpt_academic.git  # download the project
@@ -158,7 +161,8 @@ docker run --rm -it -e WEB_PORT=50923 -p 50923:50923 gpt-academic
 ```
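 P.S. If you need the plugin features that depend on Latex, see the Wiki. Alternatively, you can get Latex support directly through docker-compose (edit docker-compose.yml, keep scheme 4, and delete the other schemes).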
 
-2. ChatGPT + ChatGLM + MOSS (requires familiarity with Docker)
+2. ChatGPT + ChatGLM2 + MOSS (requires familiarity with Docker)
+[![chatglm](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-chatglm.yml/badge.svg?branch=master)](https://github.com/binary-husky/gpt_academic/actions/workflows/build-with-chatglm.yml)
 
 ``` sh
 # Edit docker-compose.yml: keep scheme 2 and delete the other schemes. Then adjust the scheme 2 configuration in docker-compose.yml, following the comments there
@@ -284,6 +288,7 @@ Tip：不指定文件直接点击 `载入对话历史存档` 可以查看历史h
 
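 ### II: Versions:
 - version 3.5(Todo): Invoke all of this project's function plugins via natural language (high priority)
+- version 3.46: Support fully hands-free real-time voice dialogue
 - version 3.45: Support custom ChatGLM2 fine-tuned models
 - version 3.44: Official Azure support; improved interface usability
 - version 3.4: + arxiv paper translation and latex paper proofreading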
@@ -306,13 +311,18 @@ gpt_academic开发者QQ群-2：610599535
     - Some browser translation plugins interfere with this software's frontend
     - The official Gradio currently has many compatibility bugs; be sure to install Gradio via `requirements.txt`
 
-### III: References and Learning
+### III: Themes
+The theme can be changed via the `THEME` option (config.py)
+1. `Chuanhu-Small-and-Beautiful` [link](https://github.com/GaiZhenbiao/ChuanhuChatGPT/)
+
+
+### IV: References and Learning
 
 ```
 The code draws on the design of many other excellent projects, listed in no particular order:
 
-# Tsinghua ChatGLM-6B:
-https://github.com/THUDM/ChatGLM-6B
+# Tsinghua ChatGLM2-6B:
+https://github.com/THUDM/ChatGLM2-6B
 
 # Tsinghua JittorLLMs:
 https://github.com/Jittor/JittorLLMs

diff --git a/config.py b/config.py
@@ -89,6 +89,8 @@
 # Whether to automatically clear the input box on submit
 AUTO_CLEAR_TXT = False
 
+# Color theme, options: ["Default", "Chuanhu-Small-and-Beautiful"]
+THEME = "Default"
 
 # Add a live2d decoration
 ADD_WAIFU = False
@@ -123,3 +125,9 @@
 NEWBING_COOKIES = """
 put your new bing cookies here
 """
+
+
+# Aliyun real-time speech recognition. Configuration is fairly difficult and recommended for advanced users only. See https://github.com/binary-husky/gpt_academic/blob/master/docs/use_audio.md
+ENABLE_AUDIO = False
+ALIYUN_TOKEN=""    # e.g. f37f30e0f9934c34a992f6f64f7eba4f
+ALIYUN_APPKEY=""   # e.g. RoPlZrM88DnAFkZK
diff --git a/crazy_functional.py b/crazy_functional.py
@@ -392,7 +392,7 @@ def get_crazy_functions():
         })
         from crazy_functions.Latex输出PDF结果 import Latex翻译中文并重新编译PDF
         function_plugins.update({
-            "Arixv翻译（输入arxivID）[需Latex]": {
+            "Arixv论文精细翻译（输入arxivID）[需Latex]": {
                 "Color": "stop",
                 "AsButton": False,
                 "AdvancedArgs": True,
@@ -403,7 +403,7 @@ def get_crazy_functions():
             }
         })
         function_plugins.update({
-            "本地论文翻译（上传Latex压缩包）[需Latex]": {
+            "本地Latex论文精细翻译（上传Latex项目）[需Latex]": {
                 "Color": "stop",
                 "AsButton": False,
                 "AdvancedArgs": True,
@@ -416,6 +416,22 @@ def get_crazy_functions():
     except:
         print('Load function plugin failed')
 
+
+    try:
+        from toolbox import get_conf
+        ENABLE_AUDIO, = get_conf('ENABLE_AUDIO')
+        if ENABLE_AUDIO:
+            from crazy_functions.语音助手 import 语音助手
+            function_plugins.update({
+                "实时音频采集": {
+                    "Color": "stop",
+                    "AsButton": True,
+                    "Function": HotReload(语音助手)
+                }
+            })
+    except:
+        print('Load function plugin failed')
+
     # try:
     #     from crazy_functions.虚空终端 import 终端
     #     function_plugins.update({

diff --git a/crazy_functions/live_audio/aliyunASR.py b/crazy_functions/live_audio/aliyunASR.py
@@ -0,0 +1,93 @@
+import time, threading, json
+
+
+class AliyunASR():
+
+    def test_on_sentence_begin(self, message, *args):
+        # print("test_on_sentence_begin:{}".format(message))
+        pass
+
+    def test_on_sentence_end(self, message, *args):
+        # print("test_on_sentence_end:{}".format(message))
+        message = json.loads(message)
+        self.parsed_sentence = message['payload']['result']
+        self.event_on_entence_end.set()
+        print(self.parsed_sentence)
+
+    def test_on_start(self, message, *args):
+        # print("test_on_start:{}".format(message))
+        pass
+
+    def test_on_error(self, message, *args):
+        # print("on_error args=>{}".format(args))
+        pass
+
+    def test_on_close(self, *args):
+        self.aliyun_service_ok = False
+        pass
+
+    def test_on_result_chg(self, message, *args):
+        # print("test_on_chg:{}".format(message))
+        message = json.loads(message)
+        self.parsed_text = message['payload']['result']
+        self.event_on_result_chg.set()
+
+    def test_on_completed(self, message, *args):
+        # print("on_completed:args=>{} message=>{}".format(args, message))
+        pass
+
+
+    def audio_convertion_thread(self, uuid):
+        # Capture audio in an asynchronous thread
+        import nls  # pip install git+https://github.com/aliyun/alibabacloud-nls-python-sdk.git
+        import tempfile
+        from scipy import io
+        from toolbox import get_conf
+        from .audio_io import change_sample_rate
+        from .audio_io import RealtimeAudioDistribution
+        NEW_SAMPLERATE = 16000
+        rad = RealtimeAudioDistribution()
+        rad.clean_up()
+        temp_folder = tempfile.gettempdir()
+        TOKEN, APPKEY = get_conf('ALIYUN_TOKEN', 'ALIYUN_APPKEY')
+        self.aliyun_service_ok = True
+        URL="wss://nls-gateway.aliyuncs.com/ws/v1"
+        sr = nls.NlsSpeechTranscriber(
+                    url=URL,
+                    token=TOKEN,
+                    appkey=APPKEY,
+                    on_sentence_begin=self.test_on_sentence_begin,
+                    on_sentence_end=self.test_on_sentence_end,
+                    on_start=self.test_on_start,
+                    on_result_changed=self.test_on_result_chg,
+                    on_completed=self.test_on_completed,
+                    on_error=self.test_on_error,
+                    on_close=self.test_on_close,
+                    callback_args=[uuid.hex]
+                )
+
+        r = sr.start(aformat="pcm",
+                enable_intermediate_result=True,
+                enable_punctuation_prediction=True,
+                enable_inverse_text_normalization=True)
+
+        while not self.stop:
+            # time.sleep(self.capture_interval)
+            audio = rad.read(uuid.hex) 
+            if audio is not None:
+                # write the resampled audio to a temporary file (scipy writes a WAV container, header included)
+                temp_file = f'{temp_folder}/{uuid.hex}.pcm'
+                dsdata = change_sample_rate(audio, rad.rate, NEW_SAMPLERATE) # 48000 --> 16000
+                io.wavfile.write(temp_file, NEW_SAMPLERATE, dsdata)
+                # read the file back as raw bytes
+                with open(temp_file, "rb") as f: data = f.read()
+                # print('audio len:', len(audio), '\t ds len:', len(dsdata), '\t need n send:', len(data)//640)
+                slices = zip(*(iter(data),) * 640)    # groups of 640 bytes; a trailing partial group is dropped
+                for i in slices: sr.send_audio(bytes(i))
+            else:
+                time.sleep(0.1)
+
+            if not self.aliyun_service_ok:
+                self.stop = True
+                self.stop_msg = 'Aliyun音频服务异常，请检查ALIYUN_TOKEN和ALIYUN_APPKEY是否过期。'
+        r = sr.stop()
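
Note on the send loop above: `zip(*(iter(data),) * 640)` groups the byte stream into 640-byte frames and silently drops a trailing partial frame. The same behaviour can be sketched with explicit slicing, which avoids building a tuple of integers per frame. `iter_frames` and its `keep_tail` flag are hypothetical names introduced for this illustration, not part of the commit.

```python
from typing import Iterator


def iter_frames(data: bytes, frame_size: int = 640, keep_tail: bool = False) -> Iterator[bytes]:
    """Yield consecutive fixed-size frames from a byte string.

    By default a trailing partial frame is dropped, which matches the
    zip(*(iter(data),) * n) idiom. Pass keep_tail=True to yield it as well.
    """
    full_end = len(data) - len(data) % frame_size
    for start in range(0, full_end, frame_size):
        yield data[start:start + frame_size]
    if keep_tail and full_end < len(data):
        yield data[full_end:]
```

For a 1500-byte buffer this yields two 640-byte frames, plus a 220-byte tail when `keep_tail=True`.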
diff --git a/crazy_functions/live_audio/audio_io.py b/crazy_functions/live_audio/audio_io.py
@@ -0,0 +1,51 @@
+import numpy as np
+from scipy import interpolate
+
+def Singleton(cls):
+    _instance = {}
+
+    def _singleton(*args, **kargs):
+        if cls not in _instance:
+            _instance[cls] = cls(*args, **kargs)
+        return _instance[cls]
+
+    return _singleton
+
+
+@Singleton
+class RealtimeAudioDistribution():
+    def __init__(self) -> None:
+        self.data = {}
+        self.max_len = 1024*1024
+        self.rate = 48000   # read-only for callers: samples per second
+
+    def clean_up(self):
+        self.data = {}
+
+    def feed(self, uuid, audio):
+        self.rate, audio_ = audio
+        # print('feed', len(audio_), audio_[-25:])
+        if uuid not in self.data:
+            self.data[uuid] = audio_
+        else:
+            new_arr = np.concatenate((self.data[uuid], audio_))
+            if len(new_arr) > self.max_len: new_arr = new_arr[-self.max_len:]
+            self.data[uuid] = new_arr
+
+    def read(self, uuid):
+        if uuid in self.data:
+            res = self.data.pop(uuid)
+            print('\r read-', len(res), '-', max(res), end='', flush=True)
+        else:
+            res = None
+        return res
+
+def change_sample_rate(audio, old_sr, new_sr):
+    duration = audio.shape[0] / old_sr
+
+    time_old  = np.linspace(0, duration, audio.shape[0])
+    time_new  = np.linspace(0, duration, int(audio.shape[0] * new_sr / old_sr))
+
+    interpolator = interpolate.interp1d(time_old, audio.T)
+    new_audio = interpolator(time_new).T
+    return new_audio.astype(np.int16)
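
Note on `change_sample_rate` above: it resamples by linear interpolation over a shared time axis. For a one-dimensional (mono) signal the same idea can be sketched with `np.interp` alone, as below. `resample_linear` is a hypothetical name, and the sketch assumes mono int16 input. Linear interpolation applies no low-pass filter, so downsampling this way can introduce aliasing.

```python
import numpy as np


def resample_linear(audio: np.ndarray, old_sr: int, new_sr: int) -> np.ndarray:
    """Resample a mono signal by linear interpolation (no anti-aliasing filter)."""
    n_old = audio.shape[0]
    n_new = int(n_old * new_sr / old_sr)
    duration = n_old / old_sr
    # Old and new sample positions share the same start and end times.
    t_old = np.linspace(0, duration, n_old)
    t_new = np.linspace(0, duration, n_new)
    resampled = np.interp(t_new, t_old, audio.astype(np.float64))
    return resampled.astype(np.int16)
```

One second of audio at 48000 Hz becomes 16000 samples at 16000 Hz, with the first and last samples preserved.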
diff --git a/crazy_functions/对话历史存档.py b/crazy_functions/对话历史存档.py
@@ -12,7 +12,7 @@ def write_chat_to_file(chatbot, history=None, file_name=None):
         file_name = 'chatGPT对话历史' + time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime()) + '.html'
     os.makedirs('./gpt_log/', exist_ok=True)
     with open(f'./gpt_log/{file_name}', 'w', encoding='utf8') as f:
-        from theme import advanced_css
+        from theme.theme import advanced_css
         f.write(f'<!DOCTYPE html><head><meta charset="utf-8"><title>对话历史</title><style>{advanced_css}</style></head>')
         for i, contents in enumerate(chatbot):
             for j, content in enumerate(contents):