Transparency

What we tell you. What we hold.

Halopen discloses more than the category leader does, on purpose. The transcription engine, the providers, the data retention, the audit-log architecture — all named. Below is the rule for what we show publicly and what we keep as intellectual property, and why we draw the line where we do.

Last updated 2026-05-11.

The thesis

The category leader treats transparency as risk. Their privacy policy names “third-party LLMs” without telling you which ones. Their model choice is buried on a non-discoverable URL. Their architecture is a black box. To careful users, that defensive posture reads as “they’re hiding something.”

Halopen’s posture is the opposite, on purpose. We name the transcription engine: OpenAI’s gpt-4o-transcribe in Cloud, Whisper Large v3 via WhisperKit on the Apple Neural Engine in Privacy mode. We name the subprocessors and their retention windows. We publish the audit-log specification. We explain what we build on top of the model rather than pretending the model itself is our secret sauce.

This works because the model is rentable by anyone with an API key. Hiding it doesn’t protect us; naming it tells you we picked the best and aren’t afraid to say so. The moat is the integration, which takes months to build properly. And the long-term defense is being the most-trusted dictation app on Mac, not the most-secret one.

What we tell everyone

The following items are public in our settings copy, marketing pages, support threads, and investor materials. If you ever find a Halopen surface where one of these is hidden or vague, it’s a bug.

Cloud transcription model. OpenAI’s gpt-4o-transcribe. The world’s most accurate publicly available speech-to-text engine. Named specifically in Settings → Transcription, on /privacy/, and on our comparison page.
Privacy-mode transcription model. OpenAI’s Whisper Large v3 — open-source, MIT-licensed — running via WhisperKit (Argmax’s open-source Swift library) on Apple’s Neural Engine through CoreML. The model file is 626 MB, downloaded once, stored at ~/Library/Application Support/Halopen/Models/.
Every subprocessor + every retention policy. Supabase (backend + auth + edge functions). OpenAI (Cloud STT — may retain audio up to 30 days for abuse monitoring per their standard API policy; does not train on it). Cloudflare (web hosting + CDN). Stripe (payments). Resend (transactional email). Full disclosure in /privacy/ §05.
Architecture pillars. Native Swift, not Electron. Verbatim by default — we don’t polish you into a corporate version of yourself unless you ask. No 6-minute cap. No screen capture. No always-on networking. Hold-to-talk via the function key.
Pricing. Free at 8,000 words per month forever, no card required. Pro Monthly $19. Pro Annual $179. Pro Lifetime $499 one-time. The free tier is a hard cap, not a stealth-unlimited; we say so on /pricing/.
Vocabulary biasing (Cloud mode). Halopen pre-feeds your personal dictionary, selected text, cursor-adjacent context, and (opt-in) clipboard contents to OpenAI’s transcription engine via the API’s prompt parameter, capped at 244 tokens. Proper nouns, brand names, and technical jargon land on the first pass rather than getting cleaned up after the model gets them wrong. The category of biasing is disclosed; the specific token allocation across context sources is held close (see §02).
Vocabulary biasing (Privacy mode). Currently unavailable because of a bug in WhisperKit’s prompt-token handling for Whisper Large v3, which we reproduced on 2026-05-11 against Argmax’s own command-line tool. We will restore Privacy-mode biasing when the upstream library ships a fix, or when we switch to an alternate implementation. Until then, Privacy-mode dictation transcribes without biasing. We chose to be upfront about the gap rather than ship the over-claim.
The audit log. Every cloud call Halopen makes is recorded locally in UserDefaults under halopen.auditLog.entries. Entries are metadata only: endpoint host + path, status code, duration, and audio seconds, with no audio and no transcripts. Open it from Settings → Privacy → Open audit log. The log is a ring buffer capped at 500 entries, so it can never grow unbounded. The audit log is what makes the next item falsifiable.
Issue #12 — the load-bearing brand promise. In Privacy mode, Halopen makes zero cloud calls during transcription. The brand promise is verifiable: switch to Privacy, dictate, open the audit log, confirm no transcription entries appeared. We verified this end-to-end on the v1.5.0 binary. If we ever ship a regression that adds a network call to the Privacy-mode transcription path, this promise breaks first — and we treat that as the highest-priority bug class Halopen can have.
The 8K free-tier cap mechanism. Word counts roll up across both Cloud and Privacy modes. The counter is offline-tolerant and bootstrap-flushes to the server on the next online launch. There’s no bypass via mode-switching; words dictated in Privacy mode count against the same cap.
The brand-cleared rule. Halopen never names competitors directly in user-facing copy. We say “the category leader” or “the leading dictation app.” This is a positioning choice (we compete against the category default, not one brand), not a hiding choice.
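
The cap mechanism described above can be sketched as a single counter shared by both modes. This is a minimal illustration in Python, not Halopen's implementation: the WordCounter class, its field names, the whitespace-based word count, and the choice to reject a dictation that would cross the cap are all assumptions made for the example.

```python
from dataclasses import dataclass

FREE_TIER_CAP = 8_000  # words per month, as stated in the pricing item


@dataclass
class WordCounter:
    """Hypothetical offline-tolerant counter shared by Cloud and Privacy modes."""

    synced: int = 0   # words the server already knows about
    pending: int = 0  # words counted locally, waiting for the next flush

    @property
    def used(self) -> int:
        return self.synced + self.pending

    def remaining(self) -> int:
        return max(0, FREE_TIER_CAP - self.used)

    def record(self, transcript: str, mode: str) -> bool:
        """Count one dictation. Returns False if it would exceed the cap."""
        if mode not in ("cloud", "privacy"):
            raise ValueError(f"unknown mode: {mode}")
        words = len(transcript.split())
        if words > self.remaining():
            return False
        # Same bucket regardless of mode, so switching modes is not a bypass.
        self.pending += words
        return True

    def flush(self, online: bool) -> int:
        """Push pending words to the server when online. Returns words flushed."""
        if not online:
            return 0
        flushed, self.pending = self.pending, 0
        self.synced += flushed
        return flushed
```

In this sketch the mode argument is validated but never changes the arithmetic, which is the whole point: a word is a word whichever engine transcribed it.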

What we hold close

The following are competitive intel. The category of each decision is disclosable; the specific tunings are not. None of these are hidden because they’d embarrass us. They’re held because publishing them would tell an engineer building a clone exactly what to copy.

Vocabulary biasing token allocation. We feed selected text, cursor-adjacent text, clipboard, personal dictionary, and recent dictations into the prompt. The specific token budget per source, the prioritization rules when total exceeds 244, the secret-likely heuristic we run over clipboard contents — held.
App-bundle-aware injection profiles. Halopen tunes its cursor-injection behavior per app: VS Code, Cursor, Slack, Notion, Mail, iMessage, terminals, IDE chat panels. The category is disclosed. The specific tuning per app (autocomplete-suspend windows, mention-syntax preservation, cursor-positioning logic) is held.
Edge function prompt templates. Polish (gpt-4o-mini cleanup), Editorial (gpt-4o full rewrite), Command Mode (claude-sonnet-4-6 highlight + speak + replace) all use carefully tuned system + user prompts behind the scenes. The model names are public; the exact prompt structure, the anti-injection hardening pattern, the chunking strategy for long-form Editorial — held.
Secret-likely heuristic filter. Halopen filters out content that looks like credentials before feeding it to the cloud (long high-entropy strings, common credential prefixes, JWT-shaped tokens). The protection is disclosed. The specific entropy threshold, the full prefix list, and the regex patterns are held, because disclosing them would teach attackers how to bypass the filter.
ANE compute split. Halopen runs Whisper Large v3 with encoder + decoder on Apple’s Neural Engine and mel-spectrogram on CPU + GPU. The architectural choice is disclosed. The benchmark data behind it, the fallback paths for failure modes, the specific configuration values are held.
Accuracy benchmark methodology. Halopen benchmarks against leading dictation apps on a standardized corpus. The category is disclosed (and reproducible benchmarks will land on /research/accuracy-benchmarks/ when v1.5.0 ships). The exact corpus composition, the WER measurement methodology details, the per-app tuning we apply to keep our numbers honest are held.
Customer support patterns + internal ops. Specific churn reasons, customer thread content, refund patterns, support metrics — held.
Internal decisions + roadmap. Halopen’s DECISIONS.md, GROWTH_PLAN.md, COMPETITIVE_PLAN.md — held. The fact that we track every decision is disclosable; the content of those decisions is not.
Marketing tactics + content strategy. Which keywords we rank for, the content calendar, the publishing cadence, the email sequence triggers — held.
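
For readers unfamiliar with the category of filter named in the secret-likely item above, here is a generic sketch of the three disclosed signals: entropy, prefixes, and JWT shape. Every constant in it (the 3.5 bits-per-character threshold, the length floors, the three example prefixes, the regex) is an illustrative assumption chosen for the example. None of it reflects Halopen's held values.

```python
import math
import re
from collections import Counter

# Illustrative values only. Halopen's real thresholds and patterns are held.
ENTROPY_BITS_PER_CHAR = 3.5
MIN_SECRET_LENGTH = 24
MIN_PREFIXED_LENGTH = 16
EXAMPLE_PREFIXES = ("sk-", "ghp_", "AKIA")
JWT_SHAPE = re.compile(r"^[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}\.[A-Za-z0-9_-]{8,}$")


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(token: str) -> bool:
    """True if a single whitespace-free token resembles a credential."""
    if len(token) >= MIN_PREFIXED_LENGTH and token.startswith(EXAMPLE_PREFIXES):
        return True
    if JWT_SHAPE.match(token):
        return True
    return (
        len(token) >= MIN_SECRET_LENGTH
        and shannon_entropy(token) >= ENTROPY_BITS_PER_CHAR
    )


def redact_secrets(text: str) -> str:
    """Drop whitespace-separated tokens that look like credentials."""
    return " ".join(t for t in text.split() if not looks_like_secret(t))
```

A filter this simple will also drop some harmless long strings, such as URLs with long paths. That trade-off is why real thresholds take tuning.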

The decision rule

For anything ambiguous, the rule fits in two sentences:

If a competitor could copy our implementation tonight from a two-sentence description and ship it in two weeks, that’s IP — hold it. If they’d need our brand, distribution, customer data, or months of tuning to match us, it’s a feature — disclose it.

The transcription model is a feature, not IP. Anyone can rent gpt-4o-transcribe or download Whisper Large v3. The audit log is a feature, not IP. Anyone could implement metadata-only request logging. The fact that we publish the decision rule itself is a feature, not IP — you couldn’t transplant Halopen’s judgment by copying this document.
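
To make the "anyone could implement it" point concrete, here is a minimal sketch of metadata-only request logging on a capped ring buffer, plus the kind of check a user performs by hand when verifying Privacy mode. It is written in Python for brevity and is not Halopen's code: the AuditEntry and AuditLog names, the total_recorded counter, and the "/transcribe" path marker are hypothetical. Only the recorded fields and the 500-entry cap come from the audit-log description.

```python
from collections import deque
from dataclasses import dataclass

MAX_ENTRIES = 500  # cap stated in the audit-log item


@dataclass(frozen=True)
class AuditEntry:
    """Metadata only: there is no field for audio bytes or transcript text."""

    host: str
    path: str
    status_code: int
    duration_ms: int
    audio_seconds: float


class AuditLog:
    def __init__(self, capacity: int = MAX_ENTRIES) -> None:
        # deque(maxlen=...) silently drops the oldest entry once full,
        # which gives ring-buffer behaviour without manual index math.
        self._entries = deque(maxlen=capacity)
        self.total_recorded = 0  # hypothetical counter, never wraps

    def record(self, entry: AuditEntry) -> None:
        self._entries.append(entry)
        self.total_recorded += 1

    def entries(self) -> list:
        return list(self._entries)

    def since(self, mark: int) -> list:
        """Entries recorded after total_recorded was equal to mark."""
        new = min(self.total_recorded - mark, len(self._entries))
        return list(self._entries)[len(self._entries) - new:]


def transcription_calls(entries, marker: str = "/transcribe") -> list:
    """Filter entries whose path contains a (hypothetical) transcription marker."""
    return [e for e in entries if marker in e.path]
```

Usage mirrors the manual check: note log.total_recorded before dictating in Privacy mode, dictate, then confirm that transcription_calls(log.since(mark)) is empty.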

The token-allocation budget for vocabulary biasing across five context sources, on the other hand, is months of tuning. The app-bundle injection profile for VS Code took weeks of testing against actual autocomplete-race edge cases. Those are the kinds of things we hold close.
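
The general shape of the budgeting problem, without any of the held tuning, looks like the sketch below. It is a naive greedy fill written for illustration: the source ordering, the build_prompt name, and the use of whitespace-separated words as a stand-in for real tokens are all assumptions. Only the 244-token cap comes from this page.

```python
PROMPT_TOKEN_CAP = 244  # cap stated in the Cloud-mode biasing item


def build_prompt(sources, cap: int = PROMPT_TOKEN_CAP) -> str:
    """Greedy fill over (name, text) pairs.

    Earlier sources win; later ones get whatever budget is left.
    Whitespace splitting is a crude proxy for a real tokenizer.
    """
    parts = []
    remaining = cap
    for _name, text in sources:
        if remaining <= 0:
            break
        taken = text.split()[:remaining]
        if taken:
            parts.append(" ".join(taken))
            remaining -= len(taken)
    return " ".join(parts)
```

A greedy fill like this starves every source after the first long one. Deciding how much each source deserves when the total exceeds the cap is the part that takes tuning.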

Every user value also ships as marketing

Internal operating rule: when Halopen ships a feature, the feature’s story ships in the same release. The audit log is a product feature and a brand pillar. The 8K word counter is a product feature and a brand-honesty signal. The fn-key hold model is a product choice and a positioning move.

Translation: any improvement we make to Halopen the product also becomes content on /learn/, a comparison row on /vs/, a settings explainer line, an audit-log entry users can verify, and where appropriate a public commitment we can be measured against later. Shipping a feature without telling users about it is leaving brand equity on the table; we treat that as a shipping defect.

When we get this wrong

Two failure modes we explicitly hedge against, both worth publishing here so users can hold us to them:

Over-claim. Saying we do something we don’t actually do, or used to but no longer do. The category leader does this on purpose for marketing. Halopen does not, and when it happens by accident we fix it immediately, in the same commit as the code change that invalidated the claim, never “later.” Most recent example: on 2026-05-11 we disabled Privacy-mode vocabulary biasing due to a WhisperKit library bug, and updated /learn/why-halopen-runs-on-your-mac/ in the same commit (f63f81d) so the page no longer claimed something that had just stopped being true.

Over-hide. Treating something as IP that should actually be in the “disclosed” column. Defaults to the user’s disadvantage. If you ever ask Halopen support “what does your app use for X?” and you get a non-answer, that’s a bug — please reply telling us, and we’ll fix the policy.

Why this page is public

We could keep this document internal — most companies do. Publishing it does three things for us:

It doubles the trust signal. We don’t just claim transparency; we publish the rules for what we disclose and what we hold, and we’re willing to be held to them.
It forces internal discipline. We know this page is public, so we can’t quietly bend it. The next time we’re tempted to over-claim or over-hide, this page is sitting there watching.
It does work the category leader cannot. They will not publish their version of this document. They cannot — their disclosure default is opacity, and publishing the rule would force them to honor it. Publishing ours is a thing the category structurally cannot copy.

Five edges where the disclose-vs-hold line is still moving. We name them rather than pretend the policy is finished — the policy is a working document, and the live edges are themselves part of the transparency surface.

Benchmark methodology vs corpus. The existence of our accuracy benchmark and the method we use (held-out reference transcripts, word error rate computed on full sentences not tokens, no per-model prompt tuning) are disclosable. The exact corpus stays held — disclosing it would let a competitor train against the same materials and game the comparison. When v1.5.0 benchmarks land at /research/accuracy-benchmarks/, the methodology will be open; the corpus pointers will be omitted.
The list of profiled apps vs the per-app tunings. The fact that Halopen ships injection profiles for VS Code, Cursor, Slack, Notion, Mail, iMessage, terminals, and IDE chat panels is disclosable. The specific tunings per app (autocomplete-suspend windows, mention-syntax preservation, cursor-positioning logic, race-condition handling) stay held. The list answers "does my app work well with Halopen?"; the tunings would teach an exact reimplementation.
Directional roadmap vs specific dates. "Halopen is investing in non-English transcription quality" is disclosable. "Halopen will ship X by Y date" is not — internal ship dates slip, public commitments calcify, and a missed public date erodes trust harder than a quiet pivot. We will share direction, not waterfall plans.
Customer feedback themes vs individual threads. "Halopen sees recurring requests for Windows parity and hands-free toggle mode" is disclosable as a feedback theme. The contents of a specific support email, the name of the customer who wrote it, the count of customers in a given churn bucket — held. Themes are a transparency surface; threads are private.
Voice + editorial standard. Halopen's BRAND.md banned-phrase list is partially disclosable (any reader can see we don't use "AI-powered" anywhere on this site, and the reasoning is documented in product context). The full voice rubric, including the editorial bar every content piece must clear before shipping, stays internal — it's a quality tool, not a marketing surface, and publishing it would invite gaming rather than improvement.

These five edges may move. If we add new ones, they appear here. If an item moves from held to disclosed (or the reverse), the move is logged in the change history at the top of this page rather than silently re-categorized.
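
For reference, the word error rate mentioned in the benchmark edge above is conventionally defined as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The sketch below computes it with a word-level edit distance over a full sentence. It is a textbook formulation, not Halopen's benchmark harness, and it assumes whitespace tokenization with no text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring the hypothesis "the quick fox" against the reference "the quick brown fox" yields one deletion over four reference words, a WER of 0.25. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.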

/privacy/ — the user-facing privacy disclosure that this policy backs.
/compare/halopen-vs-the-leading-dictation-app/ — the comparison page that names the model and explains the wedge.
/learn/why-halopen-runs-on-your-mac/ — the long-form explanation of the on-device path.
/learn/cloud-vs-privacy-mode/ — decision guide for picking the right transcription mode.
/learn/the-halopen-audit-log/ — what the audit log records and how to use it for verification.
/manifesto/ — the brand thesis this policy operationalizes.