name 'googleapiclient' is not defined #3

Open
lucasgordon opened this issue Jun 8, 2023 · 3 comments


Cannot run: googleapiclient is not defined in cell 2, which results in an error.
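
For context, this is the NameError you get when a later cell uses the Google API client before the package has been imported. A minimal sketch of what that looks like, assuming (this is a guess, not the repo's exact code) that cell 2 builds a YouTube Data API client:

import googleapiclient.discovery  # requires: pip install google-api-python-client

# Without the import above, the next line raises:
# NameError: name 'googleapiclient' is not defined
# "YOUR_API_KEY" is a placeholder assumption, not a value from the repo.
youtube = googleapiclient.discovery.build("youtube", "v3", developerKey="YOUR_API_KEY")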


dilzilla commented Jun 8, 2023

This code fixed the problem for me. Note that it replaces blocks one and two.


import time
import openai
from langchain.chat_models.openai import ChatOpenAI
from concurrent.futures import ThreadPoolExecutor
import tiktoken
from pathlib import Path
from tqdm import tqdm
from langchain.schema import (
    HumanMessage,
    SystemMessage
)

YOUR_OPENAI_API_KEY = "YOUR_KEY_HERE"  # Replace with your actual OpenAI API key

chat = ChatOpenAI(
    model="gpt-3.5-turbo",
    temperature=0.2,
    max_tokens=500,
    openai_api_key=YOUR_OPENAI_API_KEY
)

def load_text(file_path):
    with Path(file_path).open("r") as file:
        return file.read()

def save_to_file(responses, output_file):
    with Path(output_file).open('w') as file:
        file.write("\n".join(responses))

def call_openai_api(chunk):
    messages = [
        SystemMessage(content="Clean the following transcripts of all grammatical mistakes, misplaced words, and identify the speakers."),
        HumanMessage(content=chunk)
    ]
    response = chat(messages)
    return response.content.strip()

def split_into_chunks(text, n_tokens=300):
    # Token-based chunking keeps each request comfortably within the model's context window
    encoding = tiktoken.encoding_for_model('gpt-3.5-turbo')
    tokens = encoding.encode(text)
    chunks = []
    for i in range(0, len(tokens), n_tokens):
        # decode() already returns a string; wrapping it in ' '.join() would insert a space between every character
        chunks.append(encoding.decode(tokens[i:i + n_tokens]))
    return chunks

def process_chunks(input_file, output_file, delay=0):  # delay in seconds (if you hit a rate limit error)
    text = load_text(input_file)
    chunks = split_into_chunks(text)[:5]  # only the first 5 chunks; drop the slice to process the whole transcript
    responses = []
    for chunk in tqdm(chunks):
        responses.append(call_openai_api(chunk))
        time.sleep(delay)  # optional pause between requests to back off from a rate limit

    save_to_file(responses, output_file)

if __name__ == "__main__":
    input_file = "YouTube.txt"
    output_file = "clean_transcript.txt"
    process_chunks(input_file, output_file)

    # Can take up to a few minutes to run depending on the size of your data input


Hanalia commented Jun 9, 2023

I added the lines below and it works:

!pip install google-api-python-client
import googleapiclient.discovery
from tqdm import tqdm
from youtube_transcript_api import YouTubeTranscriptApi
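
For anyone wiring the pieces together, here is a minimal sketch of how these imports are typically used to produce the YouTube.txt file that the cleaning script above reads; the video ID and the output filename are placeholder assumptions, not code from this repo:

from youtube_transcript_api import YouTubeTranscriptApi

VIDEO_ID = "YOUR_VIDEO_ID"  # placeholder: the ID from the video's URL

# get_transcript returns a list of {'text', 'start', 'duration'} segments
segments = YouTubeTranscriptApi.get_transcript(VIDEO_ID)
transcript = " ".join(segment["text"] for segment in segments)

with open("YouTube.txt", "w") as f:
    f.write(transcript)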

emmethalm (Owner) commented

Big thanks to you both for helping debug this error. I've just pushed this patch to main.
