Every call is logged here with its reasoning and, once resolved, its realized outcome. We do not claim a predictive edge or market-beating returns. What we stand behind is calibrated confidence, posture-tagged reads every day (act / watch / avoid), and total transparency. No ACT-tier call has resolved yet; the ACT row will populate as those calls hit their targets or stops. WATCH and AVOID cohorts are accruing.
How every call is made
The process, in the open
1 · Read the regime first
We classify the market state (trend / range / high-volatility) and how tradeable it is. Every day produces a read — the posture (act / watch / avoid) just reflects how confident we are.
2 · Grade the read
Every read carries a grade (A / B / C) and a posture. Grade-A reads get an 'act' posture; weaker reads stay 'watch' or 'avoid' — but they still ship, with their realized outcome, so the public surface is never empty.
3 · Commit the risk up front
Every call ships with an entry, an invalidation, a target, and a required reward:risk. We decide where we're wrong before we're wrong.
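A minimal sketch of what "committing the risk up front" could look like in code. The `Call` class and its field names are hypothetical illustrations, not the desk's actual schema; the only things taken from the text are the four ingredients (entry, invalidation, target, required reward:risk).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """A committed call. Names are illustrative assumptions, not a real schema."""
    entry: float
    invalidation: float      # price at which the read is considered wrong
    target: float
    min_reward_risk: float   # required reward:risk, e.g. 2.0 means 2:1

    def risk(self) -> float:
        # Distance from entry to the point where we admit we are wrong.
        return abs(self.entry - self.invalidation)

    def reward(self) -> float:
        # Distance from entry to the target.
        return abs(self.target - self.entry)

    def reward_risk(self) -> float:
        if self.risk() == 0:
            raise ValueError("entry and invalidation must differ")
        return self.reward() / self.risk()

    def is_valid(self) -> bool:
        # Target and invalidation must sit on opposite sides of the entry,
        # and the ratio must meet the required minimum.
        opposite_sides = (self.target - self.entry) * (self.invalidation - self.entry) < 0
        return opposite_sides and self.reward_risk() >= self.min_reward_risk
```

For example, `Call(entry=100, invalidation=95, target=115, min_reward_risk=2.0)` risks 5 to make 15, a 3:1 ratio, so it passes a 2:1 requirement.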
4 · Let the market resolve it
A triple-barrier rule (target / invalidation / time) closes each call objectively. Win or miss, it's logged — and confidence is calibrated against real outcomes.
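The triple-barrier rule can be sketched as a single function that walks forward through price bars and stops at whichever barrier is touched first. This is an illustration under stated assumptions, not the desk's resolver: the function name, the `(high, low)` bar format, and the tie-break (a bar that touches both barriers counts as an invalidation) are all assumptions.

```python
from typing import Iterable, Tuple


def resolve_call(
    entry: float,
    target: float,
    invalidation: float,
    bars: Iterable[Tuple[float, float]],  # (high, low) for each bar after entry
    max_bars: int,                        # the time barrier, in bars
) -> str:
    """Return 'target', 'invalidation', or 'time' for one call."""
    is_long = target > entry
    for i, (high, low) in enumerate(bars):
        if i >= max_bars:
            break
        hit_target = high >= target if is_long else low <= target
        hit_stop = low <= invalidation if is_long else high >= invalidation
        # Assumption: intrabar order is unknown, so when one bar touches
        # both barriers we take the conservative reading and log the stop.
        if hit_stop:
            return "invalidation"
        if hit_target:
            return "target"
    # Neither price barrier was touched within max_bars.
    return "time"
```

Because the outcome depends only on the three barriers and the price path, the same call always resolves the same way, which is what makes the rule objective.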
Research-mode signals: not for trading
internal R&D · transparency only
Behind the ledger
The work the desk is doing every day
Patterns we have learned to avoid
—
Each one is a specific market condition we have seen go wrong before. The next time a similar one shows up, the posture is tighter by default.
Edge claims we are testing in public
—
See the hypotheses page → We name every "edge" we are testing — including the ones the live data is now arguing against.
The gate, in numbers
What we filter, and what we'd produce if we didn't
The engine considers many candidate setups, then filters them through a gate before publishing. The numbers below answer one question: is the gate doing real work, or is it filtering out money?
How to read this.
ACT — published as calls. The realized number is real P&L if you risked one "trade-unit" per call (1 trade-unit = the risk you would put on a single call; +1 = a 1-for-1 winner, −1 = a stop-out).
WATCH / AVOID — calls the gate filtered out before publication. No reader could have taken these — they were never recommendations. The number is counterfactual: what they would have produced if the gate had let them through. It's a measure of gate-quality, not a P&L.
If WATCH/AVOID counterfactuals are negative → the gate caught losers, doing its job. If they're positive → the gate may be over-tight (it filtered candidates that would have worked); we accept that as the cost of capital discipline.
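The reading rule above can be sketched as a small aggregation over resolved candidates, with outcomes expressed in trade-units. The function names and the input format are hypothetical, and summing WATCH and AVOID together into one filtered total is an assumption made for brevity.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def cohort_totals(resolved: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum outcomes in trade-units per posture cohort.

    Each item is (posture, outcome), where +1.0 is a 1-for-1 winner
    and -1.0 is a stop-out.
    """
    totals: Dict[str, float] = defaultdict(float)
    for posture, outcome in resolved:
        totals[posture.upper()] += outcome
    return dict(totals)


def read_gate(totals: Dict[str, float]) -> str:
    """Interpret the counterfactual total of the filtered cohorts."""
    # Assumption: WATCH and AVOID are pooled into one filtered total.
    filtered = totals.get("WATCH", 0.0) + totals.get("AVOID", 0.0)
    if filtered < 0:
        return "gate caught losers"
    if filtered > 0:
        return "gate may be over-tight"
    return "neutral"
```

Only the ACT total is a P&L a reader could have realized; the filtered total is a counterfactual and feeds the gate-quality reading alone.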
Distribution of resolved candidates
Calibration plot: what our conviction says vs what happened
filter cohort, no ACT calls yet
The gold-standard "are we honest?" chart. Each dot is a bin of resolved calls grouped by what the model said the win-probability was. A perfectly calibrated system has dots ON the diagonal: "we said 60% and 60% of those won." Below the diagonal = overconfident (we claimed more certainty than reality delivered). Above = underconfident. Caveat: this plot covers the filter cohort (calls we declined to publish); no ACT calls have resolved yet. The ACT calibration plot will appear here once we have ≥30 resolved ACT bins.
Dashed line = perfect calibration. Below = overconfident.
Why we publish this: a calibration plot is the single hardest thing for a dishonest signal service to fake. If our conviction number drifts away from reality, this chart shows it before we do.
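A minimal sketch of how the dots on a calibration plot can be computed: group resolved calls by stated win-probability, then compare each bin's mean stated probability with its observed win rate. The function names, the equal-width binning, and the default of five bins are assumptions for illustration, not a description of the production chart.

```python
from typing import List, Sequence, Tuple


def calibration_bins(
    predicted: Sequence[float],  # stated win-probability per resolved call
    won: Sequence[bool],         # whether each call actually won
    n_bins: int = 5,
) -> List[Tuple[float, float, int]]:
    """Return (mean predicted, observed win rate, count) per non-empty bin."""
    if len(predicted) != len(won):
        raise ValueError("predicted and won must have the same length")
    prob_sum = [0.0] * n_bins
    wins = [0] * n_bins
    count = [0] * n_bins
    for p, w in zip(predicted, won):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bin
        prob_sum[idx] += p
        wins[idx] += int(w)
        count[idx] += 1
    return [
        (prob_sum[i] / count[i], wins[i] / count[i], count[i])
        for i in range(n_bins)
        if count[i] > 0
    ]


def overconfident_bins(
    bins: List[Tuple[float, float, int]],
) -> List[Tuple[float, float, int]]:
    """Bins below the diagonal: observed win rate < mean predicted."""
    return [b for b in bins if b[1] < b[0]]
```

Plotting the first element of each tuple on the x-axis and the second on the y-axis gives the dots; the dashed diagonal is simply y = x.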
How simpler alternatives would have done over the same window
The trivial alternative to any trading discipline is buy and hold (BH). For the same days our resolved-signals ledger covers, here is what each spot asset would have returned had you simply held it. If BH lost money, our gate's AVOID posture was avoiding a falling market, which is exactly what capital discipline is supposed to do.
Other baselines are coming: Bollinger-band breakout signals and 200-day moving-average (MA) crossings on the same window, each scored on its realized 5-day forward return. Pending V2 implementation.
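The buy-and-hold figure is the simplest number on this page to reproduce: the fractional change between the first and last price in the ledger window. A minimal sketch, where the function name and the plain list of closing prices are assumptions:

```python
from typing import Sequence


def buy_and_hold_return(prices: Sequence[float]) -> float:
    """Fractional return from holding over the whole window.

    `prices` is the asset's price series over the ledger window, oldest first.
    """
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    first, last = prices[0], prices[-1]
    if first <= 0:
        raise ValueError("first price must be positive")
    return (last - first) / first
```

A series that starts at 100 and ends at 80 returns -0.2, a 20% loss for the holder over that window.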
Forecast accuracy: checked against the dumbest baseline
measured vs not-yet-measured
For each thing we forecast, we ask the same question: does our call beat the dumbest baseline a coin or a calculator could produce? Three answers below — only one of them is currently passing.
The triptych above is the honest summary. We don't sell a directional edge. We tell you the regime, the conviction, and the gate decision — and we measure each one against the trivial alternative so you can see what's working and what isn't.
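One way to make "beats the dumbest baseline" concrete is to score probability forecasts with the Brier score and compare against a coin that always says 50%. This is a sketch under assumptions: the page does not state which metric is used, and the function names are hypothetical.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[bool]) -> float:
    """Mean squared error between stated probability and the 0/1 outcome."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("need equal-length, non-empty inputs")
    return sum((p - float(o)) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def beats_coin_flip(probs: Sequence[float], outcomes: Sequence[bool]) -> bool:
    """True if the forecasts score strictly better than always saying 50%."""
    # A constant 0.5 forecast always scores 0.25, whatever the outcomes.
    coin = brier_score([0.5] * len(outcomes), outcomes)
    return brier_score(probs, outcomes) < coin
```

Lower is better: a forecast has to score under 0.25 to be worth more than the coin.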
The ledger — every call, newest first
Outcome distribution across resolved calls
When the desk disagreed with itself
We don't run one model — we run four, in parallel, watching the same bar. When two of them disagree on what the market is doing, that disagreement gets logged here. Each one made us widen the posture before sizing. Last 7 days, most recent first.
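A minimal sketch of how a disagreement between parallel models could be detected and turned into a log record. The model names, the regime labels as plain strings, and the record layout are hypothetical; the text only says that four models watch the same bar and that disagreements are logged.

```python
from collections import Counter
from typing import Dict, List, Optional, Union

Record = Dict[str, Union[str, Dict[str, int], List[str]]]


def log_disagreement(reads: Dict[str, str]) -> Optional[Record]:
    """Return a log record if the models disagree on the regime, else None.

    `reads` maps a model name to the regime it sees on the current bar.
    """
    votes = Counter(reads.values())
    if len(votes) <= 1:
        return None  # every model sees the same regime
    # On a tie, most_common keeps the regime that was counted first.
    majority, _ = votes.most_common(1)[0]
    dissenters = sorted(name for name, regime in reads.items() if regime != majority)
    return {"votes": dict(votes), "majority": majority, "dissenters": dissenters}
```

A non-`None` record is the trigger for the posture adjustment described above; how much the posture changes is a separate decision this sketch does not model.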
Days we sat out
The honest read on some days is "no setup is good enough to size." We log those out loud, in public. A quiet ledger is a healthy desk, not a broken one. Last 14 days, one row per asset per day.