cluely. cue.
open-source meeting assistant
Stealth overlay invisible to screen-share. Agent-driven answers wired to your model. Speech in, speech out. Local-first by default. Cloud opt-in per stage. A documented HTTP daemon under the GUI so any client can drive it — CLI, Raycast, OBS, your own.
```
$ cue daemon start
✓ daemon up | bind=127.0.0.1:7821 | agents=5

$ cue agent use interview-coding
✓ default agent → interview-coding

$ cue listen
[ vad: silero-vad | stt: distil-whisper | tts: webspeech ]
▸ interviewer: "design a rate limiter for 1M qps"
▸ cue ⟶ restating: token-bucket vs sliding-window. clarify scope?
▸ cue ⟶ brute force: O(n) per request, redis hash. optimal: ...
█ streaming · 240 tokens · $0.003
```
install & go.
no setup wizard, no signup, no card.
Three ways in: download the desktop app, install the Python CLI, or run from source. Each route ends with the same overlay and the same daemon. Pick what fits how you work.
Mac (DMG)
Apple Silicon and Intel. Drag to /Applications, open, paste your OpenRouter key, done.
Windows (EXE)
NSIS installer. Auto-creates desktop + start-menu shortcuts. Same onboarding as macOS.
Linux (AppImage)
Single-file portable build. Make executable, double-click to run. .deb also available for Debian / Ubuntu.
Python CLI
CLI · One pip install. Same daemon under the hood. Power users skip the GUI.
```
$ pip install cue
$ cue init
$ cue daemon start
$ cue ask "design a rate limiter"
```
From source
DEV · Clone, run, hack. Apache-2.0 — fork freely.
```
$ git clone https://github.com/Suryanandx/cue
$ cd cue/electron
$ npm install && npm start

# or build a redistributable
$ npm run build:mac
```
// first launch shows an onboarding wizard — paste your openrouter key, or connect to local ollama. test, save, done.
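The wizard persists everything to a plain TOML file you can edit by hand. A sketch of what `~/.cue/config.toml` might look like — the key names here are illustrative, not the actual schema:

```toml
# ~/.cue/config.toml — illustrative sketch; real keys may differ
[daemon]
bind = "127.0.0.1"   # loopback by default; LAN bind needs a token
port = 7821

[providers.openrouter]
api_key = "sk-or-..."            # paste from openrouter.ai, or omit

[providers.ollama]
host = "http://localhost:11434"  # stock ollama endpoint

[defaults]
agent = "notetaker"
```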
cluely's feature set,
your control plane.
Same overlay UX as the closed-source incumbent. None of the lock-in. Every input, every output stays on your machine unless you wire it otherwise. Engineered for operators, not marketers.
Invisible to screen capture.
setContentProtection(true) on Mac and Windows hides the overlay
from screen-record + screen-share APIs. Always-on-top. Click-through optional.
Hotkey kill.
STT in, TTS out.
Speech-to-speech pipeline modeled on huggingface/speech-to-speech.
Optional Coqui XTTS-v2 voice-cloning — the AI whispers in your
voice through your earpiece.
Two providers, one router.
Ollama and OpenRouter side by side. Per-agent model. Auto-fallback on first error.
Every agent is a TOML file.
Five built-ins ship. Hot-swap via ⌘1-5. Override any prompt by
dropping a file in ~/.cue/agents/.
HTTP under everything.
FastAPI with OpenAPI at /docs. The GUI calls it. The CLI calls it.
Anyone can.
Strictly local, fully cloud, or anywhere between.
Run Ollama + on-device STT for zero-egress. Or route to OpenRouter for higher
quality. Or mix per-agent — coding on Sonnet, notetaker on local Llama.
The daemon binds to 127.0.0.1 by default; LAN-bind requires a
bearer token. No telemetry. Ever.
ollama. openrouter.
side by side.
Bring your own. Run interview-coding through Sonnet on the cloud. Run notetaker
on local Llama. Switch any time. cue model list
gives you both catalogues in one shot.
Ollama
LOCAL · On-device inference. Zero egress. Free. The default for notetaker-class agents where latency matters more than raw quality.
- llama3:8b — everyday default, ~30 tok/s on M-series
- mistral:7b — concise, fast, low-VRAM
- qwen2.5:14b — coding-strong
- any GGUF — `cue model pull` passthrough
OpenRouter
CLOUD · 200+ frontier models behind one API key. Used when quality matters more than privacy — live coding, system design, tough behavioral.
- claude-3.5-sonnet — flagship for coding + design
- gpt-4o — behavioral interview default
- gpt-4o-mini — cheap and quick
- llama-3.1-8b:free — the always-fallback
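The "auto-fallback on first error" behavior mentioned above could be sketched like this — a minimal illustration of the routing idea, not the daemon's actual code:

```python
from typing import Callable


def ask_with_fallback(prompt: str,
                      primary: Callable[[str], str],
                      fallback: Callable[[str], str]) -> str:
    """Try the agent's configured model first; on any error,
    retry the same prompt once against the fallback provider
    (e.g. llama-3.1-8b:free on OpenRouter, or a local model)."""
    try:
        return primary(prompt)
    except Exception:
        return fallback(prompt)
```

The real router presumably also logs the failure and surfaces it via `cue doctor`, but the shape is the same: one call path, two providers behind it.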
five agents,
shipped on day one.
Each one is a TOML file you can read, edit, share. Override the system prompt, change the model, set a hotkey, define KB scope. The daemon picks up your changes immediately — no restart.
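A sketch of what such a file might contain — field names here are illustrative guesses at the schema, not the shipped format:

```toml
# ~/.cue/agents/interview-coding.toml — illustrative field names
name   = "interview-coding"
model  = "openrouter:claude-3.5-sonnet"
hotkey = "cmd+1"

system_prompt = """
Restate the question, sketch the brute force, then the optimal
approach with complexity. Be terse; the user is answering live."""

[kb]
scope = ["~/notes/dsa", "~/notes/system-design"]
```

Drop a file like this into `~/.cue/agents/` and the daemon picks it up without a restart.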
install to first answer
in two minutes.
Install the package
pip install cue for daemon + CLI. Add cue[stt,tts,kb] later for the heavier optional engines.
Configure
cue init writes ~/.cue/config.toml. Drop in your OpenRouter key, or skip and run Ollama-only.
Ask
cue ask "talk about a hard project" streams an answer through the default agent.
Daemon up
cue daemon start exposes the full pipeline at 127.0.0.1:7821. OpenAPI at /docs.
Overlay
Run cue start for the stealth Cluely-style overlay. Or skip the GUI and stay in the terminal.
```
$ pip install cue
collecting cue==0.1.0 ...
✓ installed

$ cue init
openrouter key: ********
✓ Cue initialised
  config: ~/.cue/config.toml
  agents: 5 built-in
  daemon: 127.0.0.1:7821

$ cue ask "design a url shortener"
interview-system-design · sonnet · 92ms
restating: shortener with billion ops/day...
1) clarify: write QPS, read QPS, vanity?
2) capacity: ~10K writes/sec, ~100K reads/sec
3) diagram: client → edge cache → svc → kv
...

$ cue daemon start
✓ daemon started

$ cue start
launched ./Cue.app · stealth: on █
```
a daemon under everything.
The GUI calls these endpoints. The CLI calls these endpoints. Anyone can.
OpenAPI docs live at /docs.
Default bind 127.0.0.1:7821.
LAN bind opt-in.
| // verb | // path | // purpose |
|---|---|---|
| POST | /v1/chat | SSE token stream through any agent + model |
| GET | /v1/agents | List built-in + user agents |
| POST | /v1/agents/:slug | Create or update an agent |
| DELETE | /v1/agents/:slug | Remove a user agent |
| GET | /v1/models | Both Ollama + OpenRouter catalogues |
| POST | /v1/stt | Audio in → transcript chunks (planned · 0.2) |
| POST | /v1/tts | Text in → audio out (planned · 0.2) |
| GET | /v1/sessions | Past session transcripts + answers |
| GET | /v1/health | Daemon + provider status |
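`/v1/chat` streams tokens over SSE. A sketch of a client-side parser for that stream — it assumes each event is `data: {"token": ...}` with a `[DONE]` sentinel, which is a guess at the wire format, not the documented one:

```python
import json
from typing import Iterable, Iterator


def sse_tokens(lines: Iterable[str]) -> Iterator[str]:
    """Yield token strings from the decoded lines of an SSE stream.
    Assumes `data: {"token": ...}` events and a `[DONE]` sentinel —
    check /docs for the daemon's actual event shape."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alives and SSE comments
        body = line[len("data:"):].strip()
        if body == "[DONE]":
            return  # end of stream
        yield json.loads(body)["token"]
```

Pair it with any HTTP client that exposes the response body line by line, and you have the same stream the overlay renders.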
cli-first. always.
The terminal is the source of truth. Run interviews from a tmux pane if you want. The GUI is one client; the CLI is another.
```
# lifecycle
cue init                                 first-run setup wizard
cue daemon start | stop | status | logs  manage the FastAPI daemon
cue start                                launch the Electron overlay
cue doctor                               diagnose providers, paths, agents

# live
cue listen [--agent <name>]              CLI listening mode (textual TUI)
cue ask "question" [--agent <name>]      one-shot Q&A
cue practice <agent>                     mock-interview TUI (0.2)

# agents
cue agent list / new / show / use / rm   manage ~/.cue/agents/*.toml

# models
cue model list                           both ollama + openrouter
cue model set <agent> <provider:model>   per-agent override
cue model benchmark                      rank by p50 latency from your network

# config / privacy
cue config get / set <key> [value]       persistent settings
cue stealth on / off                     screen-capture invisibility
```