This demo is a full-stack example that uses the following:
- A Next.js app with Prisma for the database.
- Trigger.dev Realtime to stream updates to the frontend.
- The AI SDK to work with multiple LLMs.
- The new `batch.triggerByTaskAndWait` method to distribute work across multiple tasks.
- After cloning the repo, run `npm install` to install the dependencies.
- Copy the `.env.example` file to `.env` and fill in the required environment variables. If you haven't already, sign up for a free Trigger.dev account here and create a new project.
- Run `npx prisma migrate dev` to create the database and generate the Prisma client.
- Copy the project ref from the Trigger.dev dashboard and add it to the `trigger.config.ts` file.
- Run the Next.js server with `npm run dev`.
- In a separate terminal, run the Trigger.dev dev CLI command with `npx trigger dev` (it may ask you to authorize the CLI if you haven't already).
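For the project ref step, the relevant part of `trigger.config.ts` might look roughly like the sketch below. It assumes the v3 SDK import path, and the `proj_...` value is a placeholder rather than a real ref. The file in this repo may set other options, so change only the `project` value and leave the rest as it is.

```typescript
import { defineConfig } from "@trigger.dev/sdk/v3";

export default defineConfig({
  // Placeholder: replace with the project ref copied from your Trigger.dev dashboard.
  project: "proj_xxxxxxxxxxxx",
  // Assumed task directory, based on the src/trigger/ai.ts path mentioned in this README.
  dirs: ["./src/trigger"],
});
```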
Now you should be able to visit http://localhost:3000
and see the app running. Enter a prompt and click "Evaluate" to see the LLM-generated responses.
- View the Trigger.dev task code in the src/trigger/ai.ts file.
- The `evaluateModels` task uses the `batch.triggerByTaskAndWait` method to distribute the task to the different LLMs.
- It then passes the results through to a `summarizeEvals` task that calculates some dummy "tags" for each LLM response.
- We use a useRealtimeRunsWithTag hook to subscribe to the different evaluation task runs in the src/components/llm-evaluator.tsx file.
- We then pass the relevant run down into three different components for the different models:
  - The `AnthropicEval` component: src/components/evals/Anthropic.tsx
  - The `XAIEval` component: src/components/evals/XAI.tsx
  - The `OpenAIEval` component: src/components/evals/OpenAI.tsx
- Each of these components then uses useRealtimeRunWithStreams to subscribe to the different LLM responses.
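The data flow described above can be illustrated with a small SDK-free sketch. It is not the code in src/trigger/ai.ts or src/components/llm-evaluator.tsx. The result shape, the tag rules, and the task identifiers (`eval-anthropic`, `eval-xai`, `eval-openai`) are all assumptions chosen for illustration. The first function stands in for the "dummy tags" step over the fanned-out results, and the second shows one way to route subscribed runs to the per-model components.

```typescript
// Assumed result shape: each child run either succeeded with an output
// or failed with an error.
type EvalResult =
  | { ok: true; model: string; output: string }
  | { ok: false; model: string; error: string };

// Hypothetical stand-in for the "dummy tags" step: derive a few labels
// from each successful response and skip failed runs.
function summarizeTags(results: EvalResult[]): Record<string, string[]> {
  const tags: Record<string, string[]> = {};
  for (const result of results) {
    if (!result.ok) continue;
    const wordCount = result.output.trim().split(/\s+/).filter(Boolean).length;
    const modelTags = [wordCount > 50 ? "long" : "short"];
    if (result.output.includes("?")) modelTags.push("asks-question");
    tags[result.model] = modelTags;
  }
  return tags;
}

// Minimal view of a subscribed run: only the fields this sketch needs.
type EvalRun = { id: string; taskIdentifier: string };

// Hypothetical task identifiers, one per provider component.
const PROVIDER_TASKS = {
  anthropic: "eval-anthropic",
  xai: "eval-xai",
  openai: "eval-openai",
} as const;

type Provider = keyof typeof PROVIDER_TASKS;

// Pick the run each provider component should receive, if one exists yet.
function pickRunsByProvider(runs: EvalRun[]): Partial<Record<Provider, EvalRun>> {
  const picked: Partial<Record<Provider, EvalRun>> = {};
  for (const provider of Object.keys(PROVIDER_TASKS) as Provider[]) {
    const match = runs.find((run) => run.taskIdentifier === PROVIDER_TASKS[provider]);
    if (match) picked[provider] = match;
  }
  return picked;
}
```

In the real app, the list of runs would come from the useRealtimeRunsWithTag hook, and each picked run would be handed to its matching component, which then subscribes to that run's stream.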
To learn more about Trigger.dev Realtime, take a look at the following resources:
- Trigger.dev Documentation - learn about Trigger.dev and its features.
- Batch Trigger docs - learn about the Batch Trigger feature of Trigger.dev.
- Realtime docs - learn about the Realtime feature of Trigger.dev.
- React hooks - learn about the React hooks provided by Trigger.dev.