
Conversation


@joepio joepio commented Jun 24, 2025

@AndreiPauseAI created it, I just opened this PR to have a place to track progress and discuss things.

Preview

…individually. This version is incomplete, as the backend logic has not yet been updated.
… to planning out the rest of the functionality. Prompt changes and which data gets passed should be the last things before the new system is operational.
… limits and a myriad of visual bugs, it seems to be working correctly on my end. The search functionality is so far only used for researching the target of the email. The code is messy, bug-filled and needs lots of cleaning.
…write, mostly type declarations, checking that the errors still present are, in fact, a result of outdated check files.

netlify bot commented Jun 24, 2025

Deploy Preview for pauseai ready!

| Name | Link |
|------|------|
| 🔨 Latest commit | 8716d46 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/pauseai/deploys/685ea3a168006800089f5637 |
| 😎 Deploy Preview | https://deploy-preview-382--pauseai.netlify.app |

@Wituareard

Huh, I thought we already had a PR but thanks

@AndreiPauseAI

Does anyone know any better ways to get around the netlify 30s timeout for serverless functions other than background functions? Those have a 15 minute timeout, but require a paid subscription and don't directly pass outputs.
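
(For reference, the usual pattern with Netlify background functions is fire-and-forget plus client polling: the function returns 202 immediately and cannot pass its output back directly, so the job writes its result somewhere a status endpoint can read, and the client polls. A minimal sketch of the polling half, with all names purely illustrative, not code from this PR:)

```typescript
// Hypothetical client-side helper: poll a status endpoint until a background
// job finishes. `fetchStatus` would wrap a fetch() to e.g. /api/write/status.
async function pollForResult(
  jobId: string,
  fetchStatus: (id: string) => Promise<{ done: boolean; result?: string }>,
  { intervalMs = 2000, maxAttempts = 150 } = {}
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus(jobId);
    if (status.done) return status.result ?? "";
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`job ${jobId} did not finish within ${maxAttempts} polls`);
}
```

(The background function itself would write its result to a blob store or KV that the status endpoint reads, which is exactly the "don't directly pass outputs" caveat above.)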


@AndreiPauseAI AndreiPauseAI left a comment

Only 3 files are changed: package.json (newer packages) and files for the /write page, which I've tested locally and which worked. The current version isn't very useful, so even if this is broken it's no great loss; in my eyes this pull is valid.


@AndreiPauseAI AndreiPauseAI left a comment

Necessary update, as this should have been included in the previous commit.

@EllaHughesPAUSE

Hi! My user feedback:

  1. When I click reset, the text fields still have my previous answers.
  2. I think we need to clarify the instructions a little so you use each tab one by one. "Autofill" is maybe a bad label, because it makes me think I should only press it if I haven't entered information.
  3. Maybe the autofill button should be underneath (as well as reset) so it's clear you enter first?
  4. There is a little emoji in the automatic answers; maybe it should be a pause symbol?
  5. The copy contents button appears to disappear the contents rather than copy them?
  6. When I went back to tab 1 and entered the politician's name specifically, the AI gave me this: "I'd be happy to help you find individuals involved with AI safety, particularly in relation to Sir Mark Hendrick MP. Let me search for relevant information.I didn't find specific information about Sir Mark Hendrick MP's involvement with AI safety in the initial search. Let me search for individuals who are prominent in the AI safety field, as that seems to be what you're looking for.I apologize for the error. Let me try a more specific query:I see I've reached my limit of search queries for this turn. Let me search for more information in a new turn."
  7. I feel like autofill should be separate from initiating the search/thinking. Autofill to me should take the relevant content from the prior tab and input it into the text boxes in the next tab.
  8. Is it that tab 1 is first, then autofill, then the next 3 tabs should all be filled in prior to autofill? Or should I autofill each page?
  9. The final letter is good, with some small weird parts.

Reset all button now calls "clear" instead of "runTest".
clear() now force-updates the arrays so the contents display immediately.
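
(For context, "force updates" is presumably the standard Svelte pattern: reactivity keys off assignments, so clearing by reassignment rather than in-place mutation makes the UI refresh immediately. A sketch with hypothetical names, not the actual component code:)

```typescript
// Hypothetical Svelte component state. Mutating answers in place
// (e.g. answers.length = 0) would not trigger a re-render, because
// Svelte's reactivity is assignment-based; reassigning a fresh array does.
let answers: string[] = ["draft one", "draft two"];

function clear(): void {
  answers = answers.map(() => ""); // fresh array -> reactive update
}
```
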
@anthonybailey

Copying from https://discord.com/channels/1100491867675709580/1222902730083799120/1386728025625727018 - Pro gets us background serverless functions.

Context for the short-term spend option.

It's quite plausible the builder API should move outside of Netlify altogether, but short-term that's non-trivial, and even long-term a single deployment system / web framework has upside.

Andrei also had other options to squeeze below the timeouts; basically they just drop the research quality a bit.

$20 for a trial to cover flexibility on this during the PauseCon opportunity seems a no-brainer.

But Joep owns the account - are there other complications?

@Pato-desu

First impressions:
Sorry if this is too harsh, but I feel using the tool would be more complicated than writing the email myself.
There are lots of buttons, lots of boxes on each tab, and a bunch of text. I don't really know where to begin, how everything connects, or whether the tabs communicate with each other.

The helper texts that say "Question x" make me think I may need to write a question there? But the headers for each textbox don't seem to imply that. I think those headers could be a lot clearer about what I'm supposed to write.

I think the tabs and the rest of the buttons should be clearly differentiated from each other; they are really confusing at the moment. I would expect the second row to be other tabs or sub-tabs. I don't really understand what the write button is supposed to do.

I could continue, but those are probably the most important points. Maybe a call would be faster if you want even more feedback.

@anthonybailey

anthonybailey commented Jun 25, 2025 via email

@anthonybailey

anthonybailey commented Jun 25, 2025 via email

@Pato-desu

Are you planning on finishing it before PauseCon? Maybe the whole Software team (+ me) should be working on this, then? @anthonybailey

@anthonybailey

anthonybailey commented Jun 25, 2025

I'm going to remote pair with Andrei tomorrow, then demo the result to Ella and fix up whatever we have, however is best, for the PauseCon session where folk play with it to send DeepMind stuff to representatives.

If particular pieces of work (e.g. weird tech investigations) separate out, then unblocking contributions are welcome, but for the most part I don't think this parallelizes beyond two people.

@anthonybailey

I've been stupidly sleepy, but did some work with Andrei yesterday/today, and there was a lot more from him.

I expect to code and make the yes/no call on one key unknown technical issue (background functions, for longer LLM think time and greater robustness) tonight.

Even if we get unlucky on that unknown, Andrei raised a possibility I had not considered, and it is a great idea: if we end up only having the highest-quality version of the writer running on a developer machine rather than the public website, we could still serve the website page that way at PauseCon, at minimum for a demo, and (idea occurring as I write) possibly across the local network so everyone could use it.

And we're taking various steps toward a simple UI for the tool in general and the PauseCon ask in particular.

@anthonybailey

Update for Andrei - Testing Session Results (2025-06-27)

Hi Andrei! Here's what I discovered during overnight testing:

Key Findings

1. Rate Limit Death Spiral 🚨

  • Web searches consume 80k-130k input tokens per request (vs 20k/minute limit on Max subscription)
  • This creates a "death spiral" - sustained requests prevent token recovery
  • Testing 96 requests: only 2 succeeded with web search, rest failed after ~2s
  • Conference demo risk: Multiple users = system appears broken
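
To make the arithmetic behind the spiral explicit (taking the midpoint of the observed range):

```typescript
// Figures from the findings above; 100k is the midpoint of 80k-130k.
const tokensPerWebSearchRequest = 100_000;
const inputTokenLimitPerMinute = 20_000; // Max subscription input limit

// Each web-search request consumes ~5 minutes' worth of quota, so any
// sustained rate above roughly one request per 5 minutes never recovers.
const minutesOfQuotaPerRequest =
  tokensPerWebSearchRequest / inputTokenLimitPerMinute;
console.log(minutesOfQuotaPerRequest); // 5
```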

2. Billing Discovery 💰

  • Found I have 2 Anthropic orgs (same email, different API keys)
  • Was accidentally burning credits (~$0.50/request) on wrong org
  • Switched back to Max subscription key - no charges now
  • Max limits: 20k input tokens/min, 8k output/min, 50 requests/min

3. Usage Logging Implemented 📊

  • Added comprehensive logging to track rate limits and usage
  • Opt-in via `touch write-usage.log`
  • Captures token usage, rate limits, errors, duration
  • Clean implementation - doesn't clutter +server.ts

4. Model Comparison Plan 🔄

  • Testing Claude 3.5 Haiku vs Sonnet for web searches
  • Haiku: 50k tokens/min (25% better), 73% cheaper
  • Next step: Compare research quality on identical queries
  • If quality acceptable, Haiku might solve rate limit issues

What I Didn't Test

  • Background serverless functions - These cost money and the rate limit questions were more urgent
  • The current setup works fine locally, just hits limits with web search

Recommendation for Demo

Without model switching or rate limit improvements, we should:

  1. Demo without web search OR
  2. Carefully manage single-user demos OR
  3. Switch to Haiku if quality testing passes

The AIXR content degradation issue remains (firstCut removes key terms), but that's a separate fix.

Let me know if you want the full testing details or have questions!

- Implement comprehensive usage logging with rate limit tracking
- Opt-in via write-usage.log file existence
- Capture token usage, rate limits, duration, and errors
- Clean wrapper pattern keeps server code uncluttered
- Logs both successful API calls and rate limit errors
- No performance impact when logging disabled

This helps monitor API usage patterns and debug rate limit issues
during conference demos and multi-user testing.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@anthonybailey

Update: Logging Implementation Committed

Hi Andrei - I've committed the usage logging implementation to help track API usage during demos.

To get the changes into your fork:

Since I pushed to PauseAI/pauseai-website (not your fork), you'll need to pull the changes:

```shell
# Add PauseAI repo as upstream if you haven't already
git remote add upstream https://github.com/PauseAI/pauseai-website.git

# Fetch and merge the logging changes
git fetch upstream locally-working-snapshot
git merge upstream/locally-working-snapshot

# Push to your fork to update this PR
git push origin locally-working-snapshot
```

What's included:

  • New file: src/lib/usage-logger.ts - Logging module
  • Updated: src/routes/api/write/+server.ts - Uses the logger
  • Opt-in via `touch write-usage.log`
  • Logs token usage, rate limits, errors, duration
  • No performance impact when disabled

The commit is: 2586555 on PauseAI/pauseai-website

@anthonybailey

Final Update: Max Subscription ≠ API Access

Resolution: Just got clarity from Anthropic support via Fin:

The Answer

  • Max subscriptions are chat-only (claude.ai + Claude Code)
  • API access requires separate credits purchased through Console
  • Max subscription cannot cover API usage - they're completely different products

What This Means for Demo

  • Web search API requires Console credits (~$10/1000 searches + token costs)
  • Rate limit math still problematic: 80k-130k tokens per search vs 20k/min limit
  • Conference demo options:
    1. Use Console credits for limited web search demos
    2. Demo without web search (safer for sustained use)
    3. Test Haiku 3.5 (better rate limits + 73% cheaper)

Technical Impact

  • Logging implementation still valuable for monitoring any API usage
  • Rate limit testing confirmed the death spiral behavior is real
  • Model comparison (Sonnet vs Haiku) becomes more important for cost control

The system works technically, but web search + sustained multi-user load requires careful cost/rate limit management.

Current Status

API functioning normally once we use the right billing model. Ready for demo with proper expectations about web search limitations.

@anthonybailey

Andrei in particular: I am making UX changes to improve the workflow on dev, particularly for the PauseCon DeepMind use case (but it's a good shift beyond that too).

Code will go on the locally-working branch; avoid big refactors there. Changes to the background functions branch aren't blocked, but won't go there until a subsequent round of productionization.

Production has a key enabled and I also have a local key of my own. Current cost, even without "be as quick as you can" prompting, is reasonable. Quality is acceptable but improvable.
