Source: backend/agents/notes/trade_failure_analysis.md

AUTHOR: Tony Mudau

Orchestration + Technical Agent Trade Failure Analysis

Scope and objective

This analysis focuses on:

  • backend/agents/orchestration_agent/main.py
  • backend/agents/technical_agent/main.py
  • Related storage/execution paths in backend/agents/common.py and backend/agents/execution_agent/main.py

Goal: explain why trades can be late or of poor quality, and whether these two agents currently have enough history/context to improve decisions.

Executive conclusion

No, the current orchestration + technical stack does not have enough closed-loop learning from past failures.

The system does persist lots of data (decisions, trade history, portfolio snapshots), but there is a disconnect:

  • Trade outcomes are written to SQLite (trade_history) and decision logs.
  • Technical memory uses JSON (*_trades.json and *_patterns.json) and influences confidence.
  • There is no production code path that updates technical memory from real closed trades.

Result: the technical agent appears "memory-aware" on paper, but in live operation it can repeatedly make similar mistakes because the memory dataset is stale or empty.


What each agent currently does

Orchestration agent behavior

The orchestration flow:

  1. Pulls M15 + H1 + D1 OHLCV and tick data for candidate symbols.
  2. Computes precheck score (spread, volatility, liquidity).
  3. Calls technical agent for directional hypothesis.
  4. Calls market/news agent.
  5. Applies MTF advisory and reversal guard (confidence adjustments, mostly soft penalties).
  6. Selects best symbol by combined score.
  7. Runs proposal through portfolio + risk + execution.
  8. Logs decisions and execution outcomes.

Key strength: good observability (agent_trace, decision logging, trade history).

Key weakness: no explicit "failure memory feedback" into the next signal generation.
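The eight-step cycle above can be sketched as pure scoring functions. Everything here is illustrative: the function names, weights, and score blending are assumptions for exposition, not the real identifiers or coefficients in orchestration_agent/main.py.

```python
def precheck_score(spread, atr_ratio, volume):
    # Step 2: crude spread/volatility/liquidity blend, each clamped to [0, 1].
    return round(
        max(0.0, 1.0 - spread * 10) * 0.4
        + min(atr_ratio * 50, 1.0) * 0.3
        + min(volume / 1000, 1.0) * 0.3,
        3,
    )

def combined_score(pre, tech_conf, news_bias, mtf_penalty):
    # Steps 3-5 collapsed into one additive blend; the MTF term is a
    # soft penalty, mirroring the advisory-only behavior described above.
    return round(0.3 * pre + 0.5 * tech_conf + 0.2 * news_bias - mtf_penalty, 3)

def run_cycle(candidates):
    scored = []
    for c in candidates:
        pre = precheck_score(c["spread"], c["atr_ratio"], c["volume"])
        scored.append(
            (c["symbol"],
             combined_score(pre, c["tech_conf"], c["news_bias"], c["mtf_penalty"]))
        )
    # Step 6: select the best symbol by combined score.
    return max(scored, key=lambda s: s[1])
```

Note what is missing from this loop, and from the real one: no term consults past outcomes, which is exactly the feedback gap this analysis is about.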

Technical agent behavior

Technical agent has strong local tooling:

  • Indicator context (EMA50/200, RSI, MACD, ATR)
  • MTF stack and alignment scoring
  • Swing structure and volatility regime
  • Session filter
  • Correlation/exposure overlay
  • Pattern memory ranking from past *_trades.json

Important: the memory engine only learns when record_trade_outcome() is called.

That function exists and works, but I only found usage in tests, not in live execution/close reconciliation paths.


Why trades are likely bad or late

1) Missing closed-loop learning from real outcomes (highest impact)

Evidence:

  • technical_agent.main defines record_trade_outcome() and refresh_pattern_memory().
  • execution_agent and orchestration_agent persist outcomes in SQLite (trade_history) but do not call record_trade_outcome().
  • Search result shows record_trade_outcome() usage only in test_technical_agent.py.

Impact:

  • Pattern win rates used in confidence are not updated from real wins/losses.
  • Failed setups are not penalized over time.
  • Agent can repeat the same pattern families that already lost.

2) Trade history and technical memory are split into two stores with no sync

Current stores:

  • SQLite: trade_history has robust trade lifecycle and PnL.
  • JSON memory: *_trades.json and *_patterns.json power technical confidence.

No bridge process currently syncs closed rows from SQLite into technical memory JSON.

Impact:

  • Technical agent's "learned patterns" can drift from real broker outcomes.
  • Runtime confidence can be detached from actual performance.

3) Entry timing sensitivity without recency/freshness guards

Orchestration chooses entries using the latest fetched bar/tick, but there is no explicit check that:

  • The latest M15 bar is fresh enough.
  • Signal age is still valid at execution time (especially for scheduled trades).
  • Price has not moved too far from the hypothesis entry before execution.

Impact:

  • "Late" entries can happen when the market regime has shifted but the proposal is still treated as valid.
  • Scheduled trades can execute old thesis on new price structure.

4) Confidence threshold logic can still permit weak edge

Current orchestration minimum effective confidence is around 0.52. Technical confidence is blended from context alignment + pattern rate + MTF + adjustments; this can pass even with limited real outcome grounding.

Impact:

  • If memory quality is weak, threshold gating does not fully protect against low-quality entries.

5) Multi-timeframe opposition is advisory, not hard risk gate

Current MTF logic mostly soft-penalizes confidence and gives hints (aligned, pullback, counter-HTF aggressive).

Impact:

  • Counter-HTF entries can still go through if other components keep confidence above threshold.
  • In trending periods this can create repeated fade attempts.

6) No explicit performance-aware throttling by pattern/session/symbol

System logs enough to compute:

  • loss streak by symbol
  • poor hit-rate by session
  • poor hit-rate by setup pattern

But no runtime policy blocks/reduces risk after persistent underperformance.

Impact:

  • Repeated losses can cluster before any adaptive braking occurs.

How technical agent tools currently work (practical explanation)

Context builder (build_market_context)

Requires >=220 bars and high/low/close columns. Produces:

  • trend via EMA50 vs EMA200
  • RSI state (oversold/overbought/neutral)
  • MACD signal (bullish/bearish)
  • volatility bucket from ATR/price

This is the core state used for rule gating and memory lookup.
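A toy version of that state derivation, assuming only closes and an ATR value are available. The real build_market_context requires >=220 bars and full high/low/close columns; the thresholds here (0.0015/0.004 ATR-to-price buckets) are invented for illustration.

```python
def ema(values, period):
    # Standard exponential moving average over a list of closes.
    k = 2 / (period + 1)
    e = values[0]
    for v in values[1:]:
        e = v * k + e * (1 - k)
    return e

def build_context(closes, atr):
    # Trend via EMA50 vs EMA200, volatility bucket via ATR/price,
    # mirroring two of the four context fields listed above.
    trend = "bullish" if ema(closes, 50) > ema(closes, 200) else "bearish"
    vol = atr / closes[-1]
    bucket = "high" if vol > 0.004 else "normal" if vol > 0.0015 else "low"
    return {"trend": trend, "volatility": bucket}
```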

MTF stack (build_mtf_stack)

Builds context for entry TF + H1 + D1 and outputs:

  • inferred directions (buy/sell/neutral)
  • alignment score
  • aligned/opposed timeframe counts

Used for strategy mode selection and confidence contribution.
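The three outputs above can be computed from per-timeframe directions alone. This is a simplified stand-in for build_mtf_stack, which derives those directions from full per-timeframe context rather than taking them as input.

```python
def alignment(directions):
    # directions: mapping like {"M15": "buy", "H1": "buy", "D1": "sell"}.
    # The entry timeframe is compared against the higher timeframes.
    entry = directions.get("M15", "neutral")
    others = [d for tf, d in directions.items() if tf != "M15"]
    aligned = sum(1 for d in others if d == entry)
    opposed = sum(1 for d in others if d not in (entry, "neutral"))
    score = (aligned - opposed) / max(len(others), 1)
    return {"aligned": aligned, "opposed": opposed, "score": score}
```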

Strategy mode selector (choose_strategy_mode)

Chooses one mode:

  • trend
  • htf_pullback
  • mean_reversion

Mode changes entry gates and TP multiplier.
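One plausible shape for the selection rule, assuming alignment drives the mode; the actual gating conditions in choose_strategy_mode may differ.

```python
def choose_mode(entry_dir, htf_dir, alignment_score):
    # Entry agrees with higher timeframes -> ride the trend.
    if entry_dir == htf_dir and alignment_score >= 0.5:
        return "trend"
    # Entry counter to a defined HTF direction -> treat as pullback entry.
    if entry_dir != htf_dir and htf_dir != "neutral" and alignment_score >= 0:
        return "htf_pullback"
    # No usable HTF structure -> fall back to mean reversion.
    return "mean_reversion"
```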

Pattern memory + scoring

Memory files:

  • {symbol}_trades.json: compact historical outcomes
  • {symbol}_patterns.json: aggregated setup buckets with win rate/sample size

At analysis time:

  • Relevant patterns are ranked by context + HTF relation + win rate.
  • Average pattern win rate influences confidence.
  • HTF relation can add small memory lift/penalty.

This is good design, but only if live outcomes continually feed memory.
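For concreteness, here is the kind of aggregate update that record_trade_outcome would have to perform against {symbol}_patterns.json. The bucket key layout and field names are assumptions based on the description above, not the real schema.

```python
def update_pattern(patterns, context, pnl):
    # Bucket setups by their entry context; each bucket tracks win rate
    # and sample size, the two values the ranking step consumes.
    key = f"{context['trend']}|{context['rsi']}|{context['macd']}"
    bucket = patterns.setdefault(key, {"wins": 0, "losses": 0})
    bucket["wins" if pnl > 0 else "losses"] += 1
    total = bucket["wins"] + bucket["losses"]
    bucket["win_rate"] = round(bucket["wins"] / total, 3)
    bucket["samples"] = total
    return patterns
```

The point of the sketch: this update is cheap and deterministic, so the only reason it does not happen in production is that nothing calls it with real outcomes.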

Correlation overlay

correlation_overlay() estimates same-side currency pressure using open positions and can:

  • apply confidence penalty
  • hard block high correlated exposure

This reduces concentration risk but does not solve timing quality by itself.
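A toy version of the same-side pressure check, to show the penalty-vs-hard-block split; the exposure counting, penalty size, and block threshold are invented for illustration.

```python
def correlation_overlay(open_positions, symbol, direction, block_at=3):
    # Count open positions on the same side that share a currency leg
    # with the candidate symbol (e.g. EURUSD and EURJPY both carry EUR).
    base, quote = symbol[:3], symbol[3:]
    pressure = sum(
        1 for p in open_positions
        if p["direction"] == direction
        and (base in p["symbol"] or quote in p["symbol"])
    )
    if pressure >= block_at:
        return {"blocked": True, "penalty": 0.0}   # hard block
    return {"blocked": False, "penalty": 0.02 * pressure}  # soft penalty
```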

Hypothesis generator (generate_trade_hypothesis)

Builds direction/SL/TP/confidence/explanation if setup qualifies. Confidence blends:

  • base score
  • pattern win rate
  • alignment
  • trend strength
  • MTF alignment
  • volatility adjustment
  • memory lift
  • correlation penalty

If no setup qualifies, returns diagnostic no-trade payload.
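One way the listed components could blend into a single confidence value. The weights here are invented for illustration and are not the real coefficients in generate_trade_hypothesis; what matters is the structure: weighted context terms plus additive adjustments, clamped to [0, 1].

```python
def blend_confidence(base, pattern_wr, alignment, trend_strength,
                     mtf, vol_adj, memory_lift, corr_penalty):
    # Weighted core terms (weights are illustrative assumptions).
    conf = (0.30 * base + 0.20 * pattern_wr + 0.15 * alignment
            + 0.10 * trend_strength + 0.15 * mtf)
    # Additive adjustments: volatility, memory lift, correlation penalty.
    conf += vol_adj + memory_lift - corr_penalty
    return max(0.0, min(1.0, round(conf, 3)))
```

Notice that pattern_wr and memory_lift are the only outcome-grounded terms; with stale memory they default to priors, and the blend degrades to a purely chart-state score.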


Why this can still lose money even with many safeguards

The system is good at static snapshot evaluation, but weak at adaptive learning:

  • It can diagnose current chart state.
  • It logs outcomes.
  • It does not reliably transform outcomes into future behavior changes.

In short: instrumented but not self-correcting.


Concrete remediation plan

Phase 1: Wire closed trades into technical memory (must-do)

Implement a sync job that:

  1. Reads newly closed rows from trade_history.
  2. Reconstructs/uses stored entry context (trend/rsi/macd/volatility + h1/d1 if available).
  3. Calls record_trade_outcome(symbol, context, direction, pnl=..., mtf_stack=...).
  4. Marks those history rows as memory_synced=1 (new DB column) to avoid duplicates.

Minimum schema extension:

  • trade_history.memory_synced INTEGER DEFAULT 0
  • Optional: trade_history.entry_context_json TEXT, trade_history.entry_mtf_json TEXT

Important: store context at entry time, not close time, so learning reflects the actual decision environment.
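The four steps above can be sketched against an in-memory SQLite copy of trade_history. Column names follow the minimum schema extension listed here; record_trade_outcome is passed in as a callable since the real one lives in technical_agent/main.py.

```python
import json
import sqlite3

def sync_closed_trades(conn, record_trade_outcome):
    # Step 1: newly closed, not-yet-synced rows.
    rows = conn.execute(
        "SELECT id, symbol, direction, pnl, entry_context_json "
        "FROM trade_history WHERE status='closed' AND memory_synced=0"
    ).fetchall()
    for trade_id, symbol, direction, pnl, ctx_json in rows:
        # Step 2: reconstruct the stored entry-time context.
        context = json.loads(ctx_json) if ctx_json else {}
        # Step 3: feed the outcome into technical memory.
        record_trade_outcome(symbol, context, direction, pnl=pnl)
        # Step 4: mark as synced to avoid duplicate learning.
        conn.execute(
            "UPDATE trade_history SET memory_synced=1 WHERE id=?", (trade_id,)
        )
    conn.commit()
    return len(rows)
```

The idempotency flag is the load-bearing part: without it, re-running the job would double-count outcomes and skew pattern win rates.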

Phase 2: Add freshness and stale-thesis guards

Before execution (especially scheduled):

  • reject if last bar too old for timeframe
  • reject if price moved beyond max slippage vs proposal entry
  • optional: recompute technical signal and require directional agreement

This directly addresses "late trades."
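A minimal sketch of the pre-execution gate, assuming Unix-timestamp bars; the 1.5-bar staleness tolerance and the timeframe table are illustrative policy values, not existing configuration.

```python
TF_SECONDS = {"M15": 900, "H1": 3600, "D1": 86400}

def thesis_is_fresh(last_bar_ts, now_ts, timeframe,
                    entry_price, current_price, max_slippage):
    # Guard 1: the latest bar must be recent for its timeframe.
    if now_ts - last_bar_ts > TF_SECONDS[timeframe] * 1.5:
        return False
    # Guard 2: price must not have run away from the proposed entry.
    if abs(current_price - entry_price) > max_slippage:
        return False
    return True
```

A scheduled trade would call this immediately before order placement and fall through to the optional signal recompute (or a no-trade log entry) on failure.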

Phase 3: Performance-aware adaptation rules

Use recent closed-trade stats to adapt:

  • if symbol win rate over last N < threshold, reduce lot multiplier or pause symbol
  • if setup pattern loss streak >= K, block pattern for cooldown window
  • if session underperforms, increase confidence threshold in that session
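A sketch of the first two rules as a single risk policy over recent closed PnL values; N, the win-rate threshold, the streak cutoff, and the multipliers are illustrative knobs, not existing configuration.

```python
def risk_policy(recent_results, n=20, min_win_rate=0.40, streak_block=4):
    # recent_results: closed-trade PnL values, oldest first.
    window = recent_results[-n:]
    wins = sum(1 for r in window if r > 0)
    win_rate = wins / len(window) if window else 1.0
    # Count the current trailing loss streak.
    streak = 0
    for r in reversed(window):
        if r > 0:
            break
        streak += 1
    if streak >= streak_block:
        return {"action": "pause", "lot_multiplier": 0.0}
    if win_rate < min_win_rate:
        return {"action": "reduce", "lot_multiplier": 0.5}
    return {"action": "normal", "lot_multiplier": 1.0}
```

The same function keyed by session or pattern bucket instead of symbol would cover the third rule.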

Phase 4: Improve failure traceability

Add a per-trade "decision snapshot" record containing:

  • precheck score and components
  • technical confidence and key indicators
  • MTF alignment + reversal guard result
  • final proposal confidence after LLM/portfolio/risk

This makes root-cause reviews deterministic and auditable.
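An example shape for that record, serialized next to the trade row; the field names are suggestions, not an existing schema.

```python
import json
import time

def decision_snapshot(symbol, precheck, technical, mtf, final_conf):
    return {
        "symbol": symbol,
        "ts": int(time.time()),
        "precheck": precheck,            # score plus spread/vol/liquidity parts
        "technical": technical,          # confidence plus key indicator values
        "mtf": mtf,                      # alignment plus reversal guard result
        "final_confidence": final_conf,  # after LLM/portfolio/risk adjustments
    }

snap = decision_snapshot(
    "EURUSD",
    {"score": 0.7},
    {"confidence": 0.58},
    {"alignment": 0.5, "reversal_guard": "pass"},
    0.55,
)
row = json.dumps(snap)  # ready to persist alongside the trade row
```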


Suggested metrics dashboard for ongoing control

Track these daily/rolling:

  • win rate, expectancy, net PnL by symbol
  • win rate by strategy mode (trend, htf_pullback, mean_reversion)
  • win rate by session (asia, london, new_york)
  • average adverse excursion before close
  • latency between signal generation and execution
  • stale-thesis rejection count (after guard rollout)
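The first metric can be computed directly from trade_history; a sketch of win rate and net PnL by symbol, using only the minimal column subset described earlier in this note.

```python
import sqlite3

def symbol_stats(conn):
    # AVG over a 0/1 win flag gives the win rate per symbol;
    # SUM(pnl) gives net PnL over closed trades only.
    return conn.execute(
        "SELECT symbol, "
        "       AVG(CASE WHEN pnl > 0 THEN 1.0 ELSE 0.0 END) AS win_rate, "
        "       SUM(pnl) AS net_pnl "
        "FROM trade_history WHERE status='closed' GROUP BY symbol"
    ).fetchall()
```

Analogous GROUP BY queries on strategy mode and session columns (if stored per trade) cover the next two rows of the dashboard.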

Direct answers to your questions

  1. Do orchestration + technical currently have enough history/context to execute well?
  • Not yet. They have partial memory structure but no full closed-loop learning from real outcomes.
  2. Why are trades bad/late?
  • Most likely from missing adaptive feedback, stale-thesis execution risk, and permissive confidence gating when memory quality is weak.
  3. How do technical tools work?
  • They are robust rule/context tools with pattern memory weighting, but they depend on continuous real outcome ingestion to become truly adaptive.

Next implementation target (recommended)

Build a trade_outcome_sync module that runs every scheduler cycle:

  • pulls unsynced closed trades
  • maps each trade to stored entry context
  • updates technical memory and pattern aggregates
  • writes sync status back to DB

This is the smallest high-impact change to make the agents actually learn from past failures.