c cue / 0.1.0 Live · 127.0.0.1:7821
SHIPPING · v0.1 / apache 2.0 / by suryanand / macos · windows soon

cluely. cue.
open-source meeting assistant

Stealth overlay invisible to screen-share. Agent-driven answers wired to your model. Speech in, speech out. Local-first by default. Cloud opt-in per stage. A documented HTTP daemon under the GUI so any client can drive it — CLI, Raycast, OBS, your own.

$ pip install cue
./deploy →
cue · session · listening · openrouter:claude-3.5-sonnet · 84ms
$ cue daemon start
 daemon up  |  bind=127.0.0.1:7821  |  agents=5

$ cue agent use interview-coding
 default agent  interview-coding

$ cue listen
[ vad: silero-vad  |  stt: distil-whisper  |  tts: webspeech ]
▸  interviewer: "design a rate limiter for 1M qps"
▸  cue ⟶  restating: token-bucket vs sliding-window. clarify scope?
▸  cue ⟶  brute force: O(n) per request, redis hash. optimal: ...
 streaming · 240 tokens · $0.003
SECTION 01 // CAPABILITIES

cluely's feature set,
your control plane.

Same overlay UX as the closed-source incumbent. None of the lock-in. Every input, every output stays on your machine unless you wire it otherwise. Engineered for operators, not marketers.

01 STEALTH

Invisible to screen capture.

setContentProtection(true) on macOS and Windows hides the overlay from screen-recording and screen-share APIs. Always-on-top. Click-through optional. Hotkey kill.

02 AUDIO PIPE

STT in, TTS out.

Speech-to-speech pipeline modeled on huggingface/speech-to-speech. Optional Coqui XTTS-v2 voice-cloning — the AI whispers in your voice through your earpiece.

03 ROUTER

Two providers, one router.

Ollama and OpenRouter side by side. Per-agent model. Auto-fallback on first error.
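The auto-fallback above can be sketched in a few lines. This is illustrative only — `route` and the two provider stubs are invented names for this sketch, not cue's internals:

```python
def route(prompt, primary, fallback):
    """Try the primary provider; on the first error, retry once on the fallback."""
    try:
        return primary(prompt)
    except Exception:
        return fallback(prompt)

# stand-ins for real provider calls (hypothetical, for illustration)
def openrouter_down(prompt):
    raise ConnectionError("502 from provider")

def ollama_local(prompt):
    return f"[llama3:8b] {prompt}"

print(route("design a rate limiter", openrouter_down, ollama_local))
```

The real router presumably retries per the agent's configured fallback model rather than a hard-coded callable, but the shape is the same: first error, second provider.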

04 AGENTS

Every agent is a TOML file.

Five built-ins ship. Hot-swap via ⌘1-5. Override any prompt by dropping a file in ~/.cue/agents/.

05 DAEMON

HTTP under everything.

FastAPI with OpenAPI at /docs. The GUI calls it. The CLI calls it. Anyone can.

06 PRIVACY POSTURE

Strictly local, fully cloud, or anywhere between.

Run Ollama + on-device STT for zero-egress. Or route to OpenRouter for higher quality. Or mix per-agent — coding on Sonnet, notetaker on local Llama. The daemon binds to 127.0.0.1 by default; LAN-bind requires a bearer token. No telemetry. Ever.

SECTION 02 // PROVIDERS

ollama. openrouter.
side by side.

Bring your own. Run interview-coding through Sonnet on the cloud. Run notetaker on local Llama. Switch any time. cue model list gives you both catalogues in one shot.

Ollama

LOCAL

On-device inference. Zero egress. Free. The default for notetaker-class agents, where low latency matters more than peak quality.

  • llama3:8b — everyday default, ~30 tok/s on M-series
  • mistral:7b — concise, fast, low-VRAM
  • qwen2.5:14b — coding-strong
  • any GGUF — via cue model pull passthrough

OpenRouter

CLOUD

200+ frontier models behind one API key. Used when quality matters more than privacy — live coding, system design, tough behavioral.

  • claude-3.5-sonnet — flagship for coding + design
  • gpt-4o — behavioral interview default
  • gpt-4o-mini — cheap and quick
  • llama-3.1-8b:free — the always-fallback
SECTION 03 // AGENTS

five agents,
shipped on day one.

Each one is a TOML file you can read, edit, share. Override the system prompt, change the model, set a hotkey, define KB scope. The daemon picks up your changes immediately — no restart.

// num // slug // purpose // model // hotkey
001 // interview-behavioral // STAR-format answers, mirrors your resume + JD. Never invents. // claude-3.5-sonnet // ⌘1
002 // interview-coding // Brute force then optimal, with code, edges, complexity. // claude-3.5-sonnet // ⌘2
003 // interview-system-design // Layered: clarify, capacity, diagram, walkthrough, bottlenecks. // claude-3.5-sonnet // ⌘3
004 // sales-discovery // MEDDPICC nudges, next discovery question, what not to say. // gpt-4o-mini // ⌘4
005 // meeting-notetaker // Silent listener. After the call: minutes, decisions, action items. // gpt-4o-mini · llama3:8b // ⌘5
SECTION 04 // DEPLOY

install to first answer
in two minutes.

STAGE 01

Install the package

pip install cue for daemon + CLI. Add cue[stt,tts,kb] later for the heavier optional engines.

STAGE 02

Configure

cue init writes ~/.cue/config.toml. Drop in your OpenRouter key, or skip and run Ollama-only.

STAGE 03

Ask

cue ask "talk about a hard project" streams an answer through the default agent.

STAGE 04

Daemon up

cue daemon start exposes the full pipeline at 127.0.0.1:7821. OpenAPI at /docs.

STAGE 05

Overlay

Run cue start for the stealth Cluely-style overlay. Or skip the GUI and stay in the terminal.

~/projects/cue · zsh node-mac-01
$ pip install cue
collecting cue==0.1.0 ...
✓ installed

$ cue init
openrouter key: ********
✓ Cue initialised
  config:  ~/.cue/config.toml
  agents:  5 built-in
  daemon:  127.0.0.1:7821

$ cue ask "design a url shortener"
interview-system-design · sonnet · 92ms
restating: shortener with billion ops/day...
1) clarify: write QPS, read QPS, vanity?
2) capacity: ~10K writes/sec, ~100K reads/sec
3) diagram: client → edge cache → svc → kv ...

$ cue daemon start
✓ daemon started

$ cue start
launched ./Cue.app · stealth: on
SECTION 05 // ENDPOINT MAP

a daemon under everything.

The GUI calls these endpoints. The CLI calls these endpoints. Anyone can. OpenAPI docs live at /docs. Default bind 127.0.0.1:7821. LAN bind opt-in.

// verb   // path             // purpose
POST      /v1/chat            SSE token stream through any agent + model
GET       /v1/agents          List built-in + user agents
POST      /v1/agents/:slug    Create or update an agent
DELETE    /v1/agents/:slug    Remove a user agent
GET       /v1/models          Both Ollama + OpenRouter catalogues
POST      /v1/stt             Audio in → transcript chunks (planned · 0.2)
POST      /v1/tts             Text in → audio out (planned · 0.2)
GET       /v1/sessions        Past session transcripts + answers
GET       /v1/health          Daemon + provider status
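A client consuming the POST /v1/chat stream mostly just has to parse SSE lines. The sketch below works on lines already read from the wire; the {"token": ...} payload shape is an assumption for illustration, not cue's documented format:

```python
import json

def parse_sse(lines):
    """Collect token chunks from an SSE stream, stopping at the [DONE] sentinel.
    Payload shape is assumed ({"token": ...}); check /docs for the real schema."""
    tokens = []
    for line in lines:
        if line.startswith("data: "):
            payload = line[len("data: "):]
            if payload == "[DONE]":
                break
            tokens.append(json.loads(payload)["token"])
    return "".join(tokens)

stream = [
    'data: {"token": "token-"}',
    'data: {"token": "bucket"}',
    "data: [DONE]",
]
print(parse_sse(stream))  # token-bucket
```

Swap the hard-coded list for an iterator over the HTTP response body and the same loop drives a live session.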
SECTION 06 // CLI SURFACE

cli-first. always.

The terminal is the source of truth. Run interviews from a tmux pane if you want. The GUI is one client; the CLI is another.

cue / commands
# lifecycle
cue init                                    first-run setup wizard
cue daemon start | stop | status | logs     manage the FastAPI daemon
cue start                                   launch the Electron overlay
cue doctor                                  diagnose providers, paths, agents

# live
cue listen [--agent <name>]                 CLI listening mode (textual TUI)
cue ask "question" [--agent <name>]         one-shot Q&A
cue practice <agent>                        mock-interview TUI (0.2)

# agents
cue agent list / new / show / use / rm      manage ~/.cue/agents/*.toml

# models
cue model list                              both ollama + openrouter
cue model set <agent> <provider:model>      per-agent override
cue model benchmark                         rank by p50 latency from your network

# config / privacy
cue config get / set <key> [value]          persistent settings
cue stealth on / off                        screen-capture invisibility
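The p50 ranking behind cue model benchmark is just a median over timed round-trips. A hedged sketch with invented sample numbers — the real command measures live requests against your providers:

```python
import statistics

def p50_ms(samples):
    """Median round-trip latency in milliseconds: the stat the benchmark ranks by."""
    return statistics.median(samples)

# illustrative latencies for one model, in ms (made up for this example)
print(p50_ms([84, 92, 120, 240, 61]))  # 92
```

Median rather than mean keeps one cold-start or network hiccup from sinking a model's ranking.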
SECTION 07 // MAKER

built and maintained by
suryanand sunil.

Cue is part of the valthrax open-source studio — an independent operator's bench for tools at the LLM-software boundary. Sibling projects: pressmark, switchboard, murmur. All Apache-2.0. All written by one person, in public.

suryanand.com

PORTFOLIO

Founding engineer & technical architect. ~8 years shipping zero-to-one products in production: ex-Google logistics, Araxis AI, healthcare, ecommerce, AI/agent platforms.

github.com/Suryanandx

SOURCE

Every project ships open. Every commit is on GitHub. No private fork, no enterprise tier, no roadmap behind a wall. Pitch ideas, file issues, send PRs — all welcome.

// got an idea you wish existed? pitch it on valthrax — if a few engineers would use it, it gets built