AUTHOR: Tony Mudau
Orchestration + Technical Agent Trade Failure Analysis
Scope and objective
This analysis focuses on:
- `backend/agents/orchestration_agent/main.py`
- `backend/agents/technical_agent/main.py`
- Related storage/execution paths in `backend/agents/common.py` and `backend/agents/execution_agent/main.py`
Goal: explain why trades can be late/poor and whether these two agents currently have enough history/context to improve decisions.
Executive conclusion
No, the current orchestration + technical stack does not have enough closed-loop learning from past failures.
The system does persist lots of data (decisions, trade history, portfolio snapshots), but there is a disconnect:
- Trade outcomes are written to SQLite (`trade_history`) and decision logs.
- Technical memory uses JSON (`*_trades.json` and `*_patterns.json`) and influences confidence.
- There is no production code path that updates technical memory from real closed trades.
Result: the technical agent appears "memory-aware" on paper, but in live operation it can repeatedly make similar mistakes because the memory dataset is stale or empty.
What each agent currently does
Orchestration agent behavior
The orchestration flow:
- Pulls M15 + H1 + D1 OHLCV and tick data for candidate symbols.
- Computes precheck score (spread, volatility, liquidity).
- Calls technical agent for directional hypothesis.
- Calls market/news agent.
- Applies MTF advisory and reversal guard (confidence adjustments, mostly soft penalties).
- Selects best symbol by combined score.
- Runs proposal through portfolio + risk + execution.
- Logs decisions and execution outcomes.
Key strength: good observability (agent_trace, decision logging, trade history).
Key weakness: no explicit "failure memory feedback" into the next signal generation.
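The selection step in the flow above can be sketched as follows. This is an illustrative reconstruction, not the actual codebase API: the field names, precheck gate, and 0.4/0.6 blend weights are assumptions.

```python
def select_best_candidate(candidates):
    """Score each candidate symbol and pick the highest combined score, or None."""
    best = None
    for c in candidates:
        # Precheck gate (spread/volatility/liquidity) must pass first;
        # threshold is illustrative.
        if c["precheck_score"] < 0.5:
            continue
        # Blend precheck and technical confidence; MTF advisory and the
        # reversal guard have already applied soft penalties to confidence.
        combined = 0.4 * c["precheck_score"] + 0.6 * c["confidence"]
        if best is None or combined > best[1]:
            best = (c["symbol"], combined)
    return best

candidates = [
    {"symbol": "EURUSD", "precheck_score": 0.8, "confidence": 0.61},
    {"symbol": "GBPUSD", "precheck_score": 0.4, "confidence": 0.90},  # fails precheck
    {"symbol": "USDJPY", "precheck_score": 0.7, "confidence": 0.55},
]
```

Note that a high-confidence signal (GBPUSD here) is still discarded when the precheck gate fails, which matches the flow order described above.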
Technical agent behavior
Technical agent has strong local tooling:
- Indicator context (`EMA50/200`, `RSI`, `MACD`, `ATR`)
- MTF stack and alignment scoring
- Swing structure and volatility regime
- Session filter
- Correlation/exposure overlay
- Pattern memory ranking from past `*_trades.json`
Important: the memory engine only learns when `record_trade_outcome()` is called.
That function exists and works, but I only found usage in tests, not in live execution/close reconciliation paths.
Why trades are likely bad or late
1) Missing closed-loop learning from real outcomes (highest impact)
Evidence:
- `technical_agent.main` defines `record_trade_outcome()` and `refresh_pattern_memory()`.
- `execution_agent` and `orchestration_agent` persist outcomes in SQLite (`trade_history`) but do not call `record_trade_outcome()`.
- A code search shows `record_trade_outcome()` used only in `test_technical_agent.py`.
Impact:
- Pattern win rates used in confidence are not updated from real wins/losses.
- Failed setups are not penalized over time.
- Agent can repeat the same pattern families that already lost.
2) Trade history and technical memory are split into two stores with no sync
Current stores:
- SQLite: `trade_history` has robust trade lifecycle and PnL.
- JSON memory: `*_trades.json` and `*_patterns.json` power technical confidence.
No bridge process currently syncs closed rows from SQLite into technical memory JSON.
Impact:
- Technical agent's "learned patterns" can drift from real broker outcomes.
- Runtime confidence can be detached from actual performance.
3) Entry timing sensitivity without recency/freshness guards
Orchestration chooses entries using latest fetched bar/tick, but there is no explicit check that:
- The latest M15 bar is fresh enough.
- Signal age is still valid at execution time (especially for scheduled trades).
- Price moved too far from hypothesis entry before execution.
Impact:
- "Late" entries can happen when market regime shifted but proposal remains valid.
- Scheduled trades can execute old thesis on new price structure.
4) Confidence threshold logic can still permit weak edge
Current orchestration minimum effective confidence is around 0.52.
Technical confidence is blended from context alignment + pattern rate + MTF + adjustments; this can pass even with limited real outcome grounding.
Impact:
- If memory quality is weak, threshold gating does not fully protect against low-quality entries.
5) Multi-timeframe opposition is advisory, not hard risk gate
Current MTF logic mostly soft-penalizes confidence and gives hints (aligned, pullback, counter-HTF aggressive).
Impact:
- Counter-HTF entries can still go through if other components keep confidence above threshold.
- In trending periods this can create repeated fade attempts.
6) No explicit performance-aware throttling by pattern/session/symbol
System logs enough to compute:
- loss streak by symbol
- poor hit-rate by session
- poor hit-rate by setup pattern
But no runtime policy blocks/reduces risk after persistent underperformance.
Impact:
- Repeated losses can cluster before any adaptive braking occurs.
How technical agent tools currently work (practical explanation)
Context builder (`build_market_context`)
Requires >=220 bars and high/low/close columns. Produces:
- trend via EMA50 vs EMA200
- RSI state (oversold/overbought/neutral)
- MACD signal (bullish/bearish)
- volatility bucket from ATR/price
This is the core state used for rule gating and memory lookup.
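As a rough illustration of that discrete state mapping (thresholds and bucket names here are hypothetical, not the actual `build_market_context` implementation):

```python
def build_context(ema50, ema200, rsi, macd_hist, atr, price):
    """Map raw indicator values to the discrete context states described above."""
    return {
        "trend": "bullish" if ema50 > ema200 else "bearish",
        "rsi_state": ("oversold" if rsi < 30
                      else "overbought" if rsi > 70 else "neutral"),
        "macd_signal": "bullish" if macd_hist > 0 else "bearish",
        # Volatility bucket from ATR relative to price; cutoffs are illustrative.
        "volatility": ("high" if atr / price > 0.004
                       else "low" if atr / price < 0.001 else "normal"),
    }
```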
MTF stack (`build_mtf_stack`)
Builds context for entry TF + H1 + D1 and outputs:
- inferred directions (buy/sell/neutral)
- alignment score
- aligned/opposed timeframe counts
Used for strategy mode selection and confidence contribution.
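A minimal sketch of the alignment computation, assuming per-timeframe directions have already been inferred (the scoring formula is illustrative):

```python
def mtf_alignment(directions):
    """Count higher timeframes aligned/opposed vs the entry TF and score them."""
    entry = directions["entry"]
    higher = [d for tf, d in directions.items() if tf != "entry"]
    aligned = sum(1 for d in higher if d == entry)
    opposed = sum(1 for d in higher if d not in (entry, "neutral"))
    # Normalized net agreement in [-1, 1]; neutral TFs contribute nothing.
    score = (aligned - opposed) / max(len(higher), 1)
    return {"aligned": aligned, "opposed": opposed, "score": score}
```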
Strategy mode selector (`choose_strategy_mode`)
Chooses one mode:
- `trend`
- `htf_pullback`
- `mean_reversion`
Mode changes entry gates and TP multiplier.
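The mode decision can be sketched as a simple cascade over MTF alignment and context trend (thresholds are assumptions, not the actual `choose_strategy_mode` logic):

```python
def choose_mode(alignment_score, trend):
    """Pick one of the three modes from MTF alignment and the context trend."""
    if alignment_score >= 0.5:
        return "trend"            # HTFs agree with the entry TF: trade with flow
    if alignment_score >= 0.0 and trend in ("bullish", "bearish"):
        return "htf_pullback"     # partial agreement: trade pullbacks to HTF trend
    return "mean_reversion"       # opposed or ranging: fade extremes only
```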
Pattern memory + scoring
Memory files:
- `{symbol}_trades.json`: compact historical outcomes
- `{symbol}_patterns.json`: aggregated setup buckets with win rate/sample size
At analysis time:
- Relevant patterns are ranked by context + HTF relation + win rate.
- Average pattern win rate influences confidence.
- HTF relation can add small memory lift/penalty.
This is good design, but only if live outcomes continually feed memory.
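The ranking-and-weighting step can be sketched like this; the bucket fields, matching rule, and neutral prior of 0.5 are illustrative assumptions:

```python
def rank_patterns(patterns, context, min_samples=5):
    """Rank stored pattern buckets matching the current context by win rate."""
    relevant = [
        p for p in patterns
        if p["trend"] == context["trend"] and p["samples"] >= min_samples
    ]
    return sorted(relevant, key=lambda p: p["win_rate"], reverse=True)

def memory_confidence(ranked):
    """Average win rate of matching patterns, used as a confidence input."""
    if not ranked:
        # Neutral prior when memory is empty: this is exactly the staleness
        # problem described above, where confidence floats free of outcomes.
        return 0.5
    return sum(p["win_rate"] for p in ranked) / len(ranked)
```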
Correlation overlay
`correlation_overlay()` estimates same-side currency pressure using open positions and can:
- apply confidence penalty
- hard block high correlated exposure
This reduces concentration risk but does not solve timing quality by itself.
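A toy version of the penalty/block decision, assuming exposure is counted per shared currency leg (the counting rule and thresholds are hypothetical):

```python
def correlation_pressure(open_positions, symbol, direction):
    """Estimate same-side exposure to the currencies in `symbol` (illustrative)."""
    base, quote = symbol[:3], symbol[3:]
    same_side = sum(
        1 for p in open_positions
        if p["direction"] == direction
        and (base in p["symbol"] or quote in p["symbol"])
    )
    if same_side >= 3:
        return {"action": "block"}                       # hard concentration stop
    if same_side >= 1:
        return {"action": "penalize", "penalty": 0.05 * same_side}
    return {"action": "none"}
```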
Hypothesis generator (`generate_trade_hypothesis`)
Builds direction/SL/TP/confidence/explanation if setup qualifies. Confidence blends:
- base score
- pattern win rate
- alignment
- trend strength
- MTF alignment
- volatility adjustment
- memory lift
- correlation penalty
If no setup qualifies, returns diagnostic no-trade payload.
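The confidence blend listed above can be sketched as a clipped weighted sum; the specific weights are my assumption, not the actual formula:

```python
def blend_confidence(base, pattern_win_rate, alignment, trend_strength,
                     vol_adj, memory_lift, corr_penalty):
    """Illustrative weighted blend of the confidence components listed above."""
    conf = (0.35 * base
            + 0.25 * pattern_win_rate
            + 0.20 * alignment
            + 0.20 * trend_strength)
    # Additive adjustments: volatility regime, HTF memory lift, correlation penalty.
    conf += vol_adj + memory_lift - corr_penalty
    return max(0.0, min(1.0, conf))
```

Because `pattern_win_rate` sits directly in this blend, a stale or empty memory silently shifts confidence toward its prior, which is why the closed-loop wiring matters.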
Why this can still lose money even with many safeguards
The system is good at static snapshot evaluation, but weak at adaptive learning:
- It can diagnose current chart state.
- It logs outcomes.
- It does not reliably transform outcomes into future behavior changes.
In short: instrumented but not self-correcting.
Concrete remediation plan
Phase 1: Wire closed trades into technical memory (must-do)
Implement a sync job that:
- Reads newly `closed` rows from `trade_history`.
- Reconstructs/uses stored entry context (trend/rsi/macd/volatility + h1/d1 if available).
- Calls `record_trade_outcome(symbol, context, direction, pnl=..., mtf_stack=...)`.
- Marks those history rows as `memory_synced=1` (new DB column) to avoid duplicates.
Minimum schema extension:
- `trade_history.memory_synced INTEGER DEFAULT 0`
- Optional: `trade_history.entry_context_json TEXT`, `trade_history.entry_mtf_json TEXT`
Important: store context at entry time, not close time, so learning reflects the actual decision environment.
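A minimal sketch of the sync job, assuming the Phase 1 schema extension above; `record_trade_outcome`'s exact signature is simplified here for illustration:

```python
import json
import sqlite3

def sync_closed_trades(conn, record_trade_outcome):
    """Replay unsynced closed trades from trade_history into technical memory."""
    rows = conn.execute(
        "SELECT id, symbol, direction, pnl, entry_context_json "
        "FROM trade_history WHERE status = 'closed' AND memory_synced = 0"
    ).fetchall()
    for row_id, symbol, direction, pnl, ctx_json in rows:
        # Entry-time context (Phase 1 optional column) so learning reflects
        # the decision environment, not the close-time chart state.
        context = json.loads(ctx_json) if ctx_json else {}
        record_trade_outcome(symbol, context, direction, pnl=pnl)
        # Mark as synced so reruns are idempotent.
        conn.execute(
            "UPDATE trade_history SET memory_synced = 1 WHERE id = ?", (row_id,)
        )
    conn.commit()
    return len(rows)
```

Returning the row count makes the job easy to monitor from the scheduler; a second run over the same data should report zero.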
Phase 2: Add freshness and stale-thesis guards
Before execution (especially scheduled):
- reject if last bar too old for timeframe
- reject if price moved beyond max slippage vs proposal entry
- optional: recompute technical signal and require directional agreement
This directly addresses "late trades."
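The first two guards can be sketched as a single pre-execution check (timeframe multiple and slippage cap are illustrative defaults):

```python
from datetime import datetime, timedelta, timezone

def thesis_is_fresh(last_bar_time, entry_price, current_price,
                    timeframe_minutes=15, max_slippage=0.0010):
    """Reject stale bars or prices that drifted too far from the proposal entry."""
    now = datetime.now(timezone.utc)
    # Bar older than ~2 timeframes: the data feed is stale.
    if now - last_bar_time > timedelta(minutes=2 * timeframe_minutes):
        return False
    # Price moved beyond the slippage budget: the thesis no longer applies.
    if abs(current_price - entry_price) > max_slippage:
        return False
    return True
```

The third guard (recomputing the technical signal and requiring directional agreement) would sit after this check, since it is more expensive.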
Phase 3: Performance-aware adaptation rules
Use recent closed-trade stats to adapt:
- if symbol win rate over last N < threshold, reduce lot multiplier or pause symbol
- if setup pattern loss streak >= K, block pattern for cooldown window
- if session underperforms, increase confidence threshold in that session
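A sketch of such an adaptive brake for one symbol or pattern, where `True`/`False` entries are recent wins/losses (the thresholds and multipliers are assumptions to be tuned):

```python
def risk_multiplier(recent_results, min_trades=10, win_floor=0.40, streak_cap=4):
    """Scale or pause risk based on recent closed-trade results (True = win)."""
    if len(recent_results) < min_trades:
        return 1.0  # not enough evidence to adapt yet
    win_rate = sum(recent_results) / len(recent_results)
    # Length of the current trailing loss streak.
    streak = 0
    for r in reversed(recent_results):
        if r:
            break
        streak += 1
    if streak >= streak_cap:
        return 0.0  # pause: block new entries until cooldown expires
    if win_rate < win_floor:
        return 0.5  # halve lot size until performance recovers
    return 1.0
```

Multiplying the proposed lot size by this value gives exactly the "adaptive braking" the system currently lacks.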
Phase 4: Improve failure traceability
Add a per-trade "decision snapshot" record containing:
- precheck score and components
- technical confidence and key indicators
- MTF alignment + reversal guard result
- final proposal confidence after LLM/portfolio/risk
This makes root-cause reviews deterministic and auditable.
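One possible shape for that record, kept deliberately flat so it can be stored as a JSON column next to the trade row (field names are suggestions):

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class DecisionSnapshot:
    """Per-trade record of every score that shaped the final proposal."""
    symbol: str
    precheck_score: float
    technical_confidence: float
    mtf_alignment: float
    reversal_guard_passed: bool
    final_confidence: float

    def to_json(self) -> str:
        # Flat JSON keeps the snapshot queryable with plain SQL json functions.
        return json.dumps(asdict(self))
```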
Suggested metrics dashboard for ongoing control
Track these daily/rolling:
- win rate, expectancy, net PnL by symbol
- win rate by strategy mode (`trend`, `htf_pullback`, `mean_reversion`)
- win rate by session (`asia`, `london`, `new_york`)
- average adverse excursion before close
- latency between signal generation and execution
- stale-thesis rejection count (after guard rollout)
Direct answers to your questions
- Do orchestration + technical currently have enough history/context to execute well?
- Not yet. They have partial memory structure but no full closed-loop learning from real outcomes.
- Why are trades bad/late?
- Most likely from missing adaptive feedback, stale-thesis execution risk, and permissive confidence gating when memory quality is weak.
- How do technical tools work?
- They are robust rule/context tools with pattern memory weighting, but they depend on continuous real outcome ingestion to become truly adaptive.
Next implementation target (recommended)
Build a `trade_outcome_sync` module that runs every scheduler cycle:
- pulls unsynced closed trades
- maps each trade to stored entry context
- updates technical memory and pattern aggregates
- writes sync status back to DB
This is the smallest high-impact change to make the agents actually learn from past failures.