privacy & security
poltergeist is a local-first product. nothing leaves your machine unless you flip an explicit switch — and even then, only in shapes you've consented to. this page lays out exactly what stays, what moves, and how to verify each claim for yourself.
what stays local
- every message, document, comment, meeting, and pr poltergeist ever indexes.
- the extracted entities, the vector index, the lexical index.
- every query you've ever run and every answer poltergeist has produced.
- the model weights themselves — the extraction and ranking model both run on-device.
you can verify this with a network monitor. with default settings, the only outbound requests poltergeist makes go directly to the connector providers (gmail api, slack api, etc.) using your own tokens. there is no "phone home" endpoint.
on macos, the simplest verification is little snitch or the built-in tcpdump, for example tcpdump -i any 'not host gmail.com and not host slack.com' (substitute the api hosts of your own connectors). on linux, use ss -tnp and the audit subsystem. you should see traffic only to the api hosts of the connectors you've authorised.
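as a rough sketch of that check, the snippet below takes the remote hostnames you observed (copied out of little snitch, or resolved from a capture) and flags any that fall outside an allowlist. the allowlist entries and the function name are illustrative assumptions, not a list of hosts poltergeist is documented to contact; replace them with the api hosts of the connectors you've authorised.

```python
# Hypothetical allowlist: replace with the API hosts of your authorised connectors.
ALLOWED_SUFFIXES = ("googleapis.com", "slack.com")


def unexpected_hosts(observed, allowed_suffixes=ALLOWED_SUFFIXES):
    """Return observed hostnames that match no allowed suffix, sorted and de-duplicated."""
    flagged = set()
    for host in observed:
        name = host.strip().lower().rstrip(".")
        if not name:
            continue
        # A host is allowed if it equals a suffix or is a subdomain of one.
        ok = any(name == s or name.endswith("." + s) for s in allowed_suffixes)
        if not ok:
            flagged.add(name)
    return sorted(flagged)


if __name__ == "__main__":
    sample = ["gmail.googleapis.com", "slack.com", "files.slack.com", "telemetry.example.net"]
    print(unexpected_hosts(sample))
```

matching on whole domain labels (rather than a plain substring test) keeps a look-alike such as evilslack.com from slipping through.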
where credentials live
every oauth token or api key sits in your operating system's secure credential store:
| os | store |
|---|---|
| macos | keychain (login keychain) |
| windows | windows credential manager |
| linux | gnome-keyring or kwallet (whichever is active); falls back to libsecret |
poltergeist never writes tokens to a plain file. when you disconnect a connector, the token is removed from the credential store immediately, not at the next sync.
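one way to spot-check the "no plain files" claim is to scan a directory for strings that look like tokens. the sketch below is an assumption-laden example: the function name is made up, the directory to scan is yours to choose (this page doesn't say where poltergeist keeps its data), and the pattern only covers a few widely used token prefixes (slack xoxb-/xoxp-, google ya29., github ghp_).

```python
import re
from pathlib import Path

# A few well-known token prefixes; extend this for the connectors you use.
TOKEN_PATTERN = re.compile(
    rb"(xox[bp]-[A-Za-z0-9-]{10,}|ya29\.[A-Za-z0-9_-]{10,}|ghp_[A-Za-z0-9]{36})"
)


def find_plaintext_tokens(root):
    """Return sorted paths of regular files under `root` containing a token-like string."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            data = path.read_bytes()
        except OSError:
            continue  # unreadable file: skip it rather than abort the scan
        if TOKEN_PATTERN.search(data):
            hits.append(str(path))
    return sorted(hits)
```

an empty result is only weak evidence, since a token in an unfamiliar format would not match; a hit is worth investigating.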
telemetry
poltergeist ships two kinds of telemetry, both off by default:
- anonymous crash reports — disabled. enable in settings → privacy to send stack traces (no user data) when the indexer panics.
- usage stats — disabled. enable to send daily counts of {queries run, items indexed, model latency p50/p99}. no content, no metadata about what was queried.
both are opt-in, both off out of the box, and both can be turned off again without losing functionality. the full schema of what each report contains is in docs/telemetry.md.
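to make "counts only" concrete, here is a sketch of what a daily usage report could look like when built from the three quantities listed above. the field names, the nearest-rank percentile method, and the function names are assumptions for illustration; the authoritative schema is the one in docs/telemetry.md.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def build_usage_report(queries_run, items_indexed, latencies_ms):
    """Assemble a report holding only aggregate numbers (hypothetical field names)."""
    return {
        "queries_run": queries_run,
        "items_indexed": items_indexed,
        "model_latency_ms_p50": percentile(latencies_ms, 50),
        "model_latency_ms_p99": percentile(latencies_ms, 99),
    }
```

every value in the report is a number; nothing in it carries query text or item metadata.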
does the model train on my data?
no. the extraction and ranking models are pre-trained, shipped as static weights, and run inference-only inside the indexer. there is no online learning, no gradient updates, no batched-then-uploaded fine-tuning. you can strace the indexer and confirm there is no write activity on the model files.
the model also can't see your data over the wire. it's loaded from disk on startup and runs in the same process; there is no rpc, no server, no inference api.
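if you'd rather not read strace output, a simpler check is to fingerprint the weight files, use poltergeist for a while, and fingerprint them again. the sketch below assumes you know the directory holding the weights (this page doesn't name it); the function names are illustrative.

```python
import hashlib
from pathlib import Path


def fingerprint(model_dir):
    """Map each file under `model_dir` (by relative path) to its SHA-256 hex digest."""
    root = Path(model_dir)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h = hashlib.sha256()
            with path.open("rb") as f:
                # Read in 1 MiB chunks so large weight files don't fill memory.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[str(path.relative_to(root))] = h.hexdigest()
    return digests


def changed_files(before, after):
    """Return sorted paths added, removed, or modified between two fingerprints."""
    return sorted(p for p in before.keys() | after.keys() if before.get(p) != after.get(p))
```

static weights should yield an empty list from changed_files no matter how many queries ran in between.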
optional end-to-end encrypted sync
if you want your vault on two machines, poltergeist ships an optional sync service. it is:
- opt-in — off by default; configure in settings → sync.
- end-to-end encrypted — keys are derived from a passphrase you pick, then split with shamir's secret sharing. we get the encrypted blobs; we cannot read them.
- self-hostable — the sync server is a small go binary; run it on your own box if you'd rather. spec is documented in docs/sync-protocol.md.
if you lose your sync passphrase, we cannot help you recover your data. that's the trade-off for not having a backdoor. write the passphrase down somewhere offline.
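for readers unfamiliar with the two building blocks named above, here is a minimal sketch of deriving a key from a passphrase and splitting it with shamir's secret sharing. it is not poltergeist's actual protocol (that is in docs/sync-protocol.md): the kdf (pbkdf2-hmac-sha256), the iteration count, the field modulus, the share counts, and all function names are assumptions chosen for illustration.

```python
import hashlib
import secrets

# Field modulus: 2**521 - 1 is a Mersenne prime, comfortably larger than a 256-bit key.
PRIME = 2**521 - 1


def derive_key(passphrase, salt, iterations=600_000):
    """Derive a 32-byte key from a passphrase (PBKDF2 is an assumed stand-in KDF)."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, iterations)


def split_secret(secret, shares, threshold):
    """Split bytes into `shares` points; any `threshold` of them recover the secret."""
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    value = int.from_bytes(secret, "big")
    if value >= PRIME:
        raise ValueError("secret too large for the field")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [value] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    points = []
    for x in range(1, shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        points.append((x, y))
    return points


def recover_secret(points, length=32):
    """Lagrange-interpolate the polynomial at x = 0 to get the secret bytes back."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total.to_bytes(length, "big")
```

with a threshold of 3 out of 5, any three shares rebuild the key and any two reveal nothing about it. nothing in the scheme can regenerate a forgotten passphrase, which is why losing it is unrecoverable.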
permissions per connector
each connector requests the narrowest scope its provider exposes:
| connector | scopes |
|---|---|
| gmail | gmail.readonly, gmail.metadata |
| slack | channels:history, groups:history, im:history, mpim:history + matching :read |
| notion | read on the pages you share with the integration |
| linear | personal api key, read scopes only |
| github | repo:read, issues:read, pull-requests:read |
| calendar | calendar.readonly |
| drive | drive.metadata.readonly (default), drive.readonly if you opt in |
no connector ever writes back to the source. poltergeist is read-only by design — we never want to be in a position to send a slack message on your behalf.
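if your provider's settings page lists the scopes granted to poltergeist, you can compare them against the table above. the sketch below copies a few rows from that table and reports anything extra; the function name is illustrative, and the exact scope strings a provider displays may differ from the short forms used here.

```python
# Scopes copied from the table above; providers may display longer or different strings.
EXPECTED_SCOPES = {
    "gmail": {"gmail.readonly", "gmail.metadata"},
    "calendar": {"calendar.readonly"},
    "drive": {"drive.metadata.readonly", "drive.readonly"},
    "github": {"repo:read", "issues:read", "pull-requests:read"},
}


def extra_scopes(connector, granted):
    """Return granted scopes that are not in the documented set for `connector`."""
    expected = EXPECTED_SCOPES.get(connector)
    if expected is None:
        raise KeyError(f"no documented scopes for connector {connector!r}")
    return sorted(set(granted) - expected)
```

for example, extra_scopes("gmail", ["gmail.readonly", "gmail.send"]) reports gmail.send, a write scope that should never appear.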
threat model
what poltergeist protects against, and what it doesn't:
protects against
- a third party reading your indexed content — there is none to read.
- a third party reading your tokens — they live in your os keychain.
- silent training on your data — no training is possible.
- vendor lock-in — your vault is plain markdown, on your disk.
does not protect against
- an attacker with root on your machine — they can read your keychain and your vault.
- malicious obsidian plugins — if you open the vault in obsidian, plugins installed there share the vault and the obsidian process. install obsidian plugins from people you trust.
- cloud sync of the underlying apps — if your gmail is in a breach, poltergeist can't pull it back.
responsible disclosure
found something? we'd rather hear about it than have it be public:
- email [email protected] with details and a reproduction.
- or open a github security advisory at github.com/nikrich/ghost-brain/security/advisories/new.
- we'll respond within 72 hours, fix critical issues within 14 days, and credit you in the release notes unless you'd rather we didn't.
no bug bounty program yet, but if your report saves us from shipping a real issue, we'll send you something nicer than a t-shirt.