TRADING4AI

Financial-agent reliability layer: failure patterns, runner differences, and claim safety

Reliability Corpus

Failure patterns as a reusable asset.

This corpus keeps known backtest mistakes, assumption drift, and claim-safety overreach in one place so agents can search, cite, and revalidate against it.

Limitations

This corpus packages failure patterns and review notes; it does not replace human judgment or runner verification.

A corpus entry is evidence of a known failure shape, not proof that a strategy is invalid in every context.

Source availability, runner behavior, and assumptions can change over time.

Delivery contract

Reliability Corpus: HTML page plus static JSON artifact

A static reference library of failure patterns, runner-difference notes, assumption drift, and claim-safety examples.

Boundary semantics

  • artifactStatus: static_public_sample
  • evidenceScope: warning_pattern
  • reviewMode: reference_only
  • citationPolicy: warning_reference_only
  • This corpus is a warning reference, not a final verdict about a strategy or system.
  • Use the corpus to trigger deeper verification and review.
  • Do not upgrade a corpus hit into a universal invalidation claim.

Intended users

  • AI agents checking generated financial claims against known failure patterns
  • Human reviewers comparing a strategy, report, or action against reusable warnings
  • Crawlers indexing TRADING4AI's public reliability materials

Artifact provenance

  • schemaVersion: trading4ai-public-reliability-v1
  • artifactVersion: reliability-corpus-sample-2026-05-23
  • generatedAt: 2026-05-23T00:00:00.000Z
  • staticSnapshotAt: 2026-05-23T00:00:00.000Z
  • artifactUrl: /reliability/reliability-corpus-sample.json
  • citationFields: artifactVersion, generatedAt, entries[].firstSeenAt, entries[].lastUpdatedAt, entries[].sourceRefs
  • When citing a corpus entry, keep artifactVersion, generatedAt, firstSeenAt, lastUpdatedAt, and sourceRefs in the review trace.
  • Treat a corpus hit as a warning reference only; it is not a final verdict about a strategy, report, or system.
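The citation rule above can be sketched as a small helper that copies the listed fields into one review-trace record. This is a minimal sketch, not part of the published service: the function name `build_citation_trace` and the flat trace layout are assumptions made for illustration, and `ARTIFACT` is an abbreviated copy of the sample artifact that keeps only the fields the trace needs.

```python
from typing import Any

# Abbreviated copy of the sample artifact; only the cited fields are kept.
ARTIFACT: dict[str, Any] = {
    "data": {
        "contract": {
            "artifactProvenance": {
                "artifactVersion": "reliability-corpus-sample-2026-05-23",
                "generatedAt": "2026-05-23T00:00:00.000Z",
            },
            "citationPolicy": {"mode": "warning_reference_only"},
        },
        "entries": [
            {
                "id": "repainting",
                "sourceRefs": [
                    "https://www.tradingview.com/support/solutions/43000599874-what-is-repainting/"
                ],
                "firstSeenAt": "2026-05-22T00:00:00.000Z",
                "lastUpdatedAt": "2026-05-23T00:00:00.000Z",
            }
        ],
    }
}


def build_citation_trace(artifact: dict[str, Any], entry_id: str) -> dict[str, Any]:
    """Hypothetical helper: gather the listed citation fields for one entry."""
    data = artifact["data"]
    provenance = data["contract"]["artifactProvenance"]
    matches = [e for e in data["entries"] if e["id"] == entry_id]
    if not matches:
        raise KeyError(f"no corpus entry with id {entry_id!r}")
    entry = matches[0]
    return {
        "artifactVersion": provenance["artifactVersion"],
        "generatedAt": provenance["generatedAt"],
        "entryId": entry["id"],
        "firstSeenAt": entry["firstSeenAt"],
        "lastUpdatedAt": entry["lastUpdatedAt"],
        "sourceRefs": list(entry["sourceRefs"]),
        # Carry the policy along so the trace cannot be read as a verdict.
        "citationPolicy": data["contract"]["citationPolicy"]["mode"],
    }


trace = build_citation_trace(ARTIFACT, "repainting")
print(trace["artifactVersion"], trace["citationPolicy"])
```

Storing the policy mode next to the timestamps is a design choice in this sketch: a reviewer who sees only the trace still learns that the hit is a warning reference.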

Inputs: Corpus lookup context

  • domain: optional; backtest_assumptions, runner_difference, claim_safety, or a related category
  • severity: optional; low, medium, or high
  • source: optional; a source-ref or text search
  • updatedSince: optional; a date filter for changed entries
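The four lookup inputs can be applied to a local copy of the entries roughly as follows. The page does not define matching semantics, so everything here is an assumption made for illustration: the function name `filter_entries`, exact matching for domain and severity, case-insensitive substring matching for source, and an inclusive comparison of `lastUpdatedAt` against `updatedSince`. The three sample entries are abbreviated from the artifact.

```python
from datetime import datetime, timezone
from typing import Any, Optional

# Three entries abbreviated from the sample artifact.
ENTRIES: list[dict[str, Any]] = [
    {"id": "lookahead_bias", "pattern": "Lookahead bias and future leakage",
     "domain": "backtest_assumptions", "severity": "high",
     "sourceRefs": ["https://www.freqtrade.io/en/stable/lookahead-analysis/"],
     "lastUpdatedAt": "2026-05-23T00:00:00.000Z"},
    {"id": "sample_vs_live_gap", "pattern": "Sample-versus-live gap",
     "domain": "runner_difference", "severity": "medium",
     "sourceRefs": ["https://www.quantconnect.com/docs/v2/writing-algorithms/backtesting",
                    "https://www.freqtrade.io/en/stable/backtesting/"],
     "lastUpdatedAt": "2026-05-23T00:00:00.000Z"},
    {"id": "claim_safety_overreach", "pattern": "Claim safety overreach",
     "domain": "claim_safety", "severity": "high",
     "sourceRefs": ["https://www.sec.gov/oiea/investor-alerts-and-bulletins",
                    "https://consumer.ftc.gov/articles/what-know-about-cryptocurrency-and-scams"],
     "lastUpdatedAt": "2026-05-23T00:00:00.000Z"},
]


def parse_ts(value: str) -> datetime:
    """Parse an ISO date or timestamp; a trailing 'Z' or a missing zone means UTC."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def filter_entries(
    entries: list[dict[str, Any]],
    domain: Optional[str] = None,
    severity: Optional[str] = None,
    source: Optional[str] = None,
    updated_since: Optional[str] = None,
) -> list[dict[str, Any]]:
    """Assumed semantics: every supplied filter must match (logical AND)."""
    needle = source.lower() if source else None
    cutoff = parse_ts(updated_since) if updated_since else None
    result = []
    for entry in entries:
        if domain and entry["domain"] != domain:
            continue
        if severity and entry["severity"] != severity:
            continue
        if needle:
            haystack = " ".join([entry["id"], entry["pattern"], *entry["sourceRefs"]]).lower()
            if needle not in haystack:
                continue
        if cutoff and parse_ts(entry["lastUpdatedAt"]) < cutoff:
            continue
        result.append(entry)
    return result


high_ids = [e["id"] for e in filter_entries(ENTRIES, severity="high")]
print(high_ids)
```

A filtered result is still only a list of warning references; the filter narrows what to review and does not score anything.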

Outputs: Reference corpus result

  • entries: failure-pattern records with id, pattern, domain, severity, and notes
  • sourceRefs: public references for each pattern
  • firstSeenAt and lastUpdatedAt
  • filters, generatedAt, and limitations
  • machine-readable JSON artifact

Suitable for

  • Checking whether a backtest or strategy report has known failure modes
  • Teaching financial agents to avoid unsafe or overconfident claim patterns
  • Providing crawlable reference material for reliability and claim-safety review

Not suitable for

  • Live strategy scoring
  • A complete financial-risk ontology
  • Execution approval
  • A guarantee that all possible failure modes are covered

Source / limitation policy

  • A corpus hit is a warning shape, not proof that every related strategy is invalid
  • The corpus does not replace runner verification or human judgment
  • Source behavior, platform documentation, and assumptions can change over time

Crawler and agent guidance

  • Index the HTML page for browsable failure patterns
  • Use the JSON artifact for stable machine retrieval
  • Treat entries as reference warnings, not final verdicts

How to use this page

Read the HTML page first. Use JSON only for bounded handoff.

Read the HTML page first to understand scope, limitations, and non-goals. Open the linked scenario pack only when the task is already concrete. Download the JSON only after the boundary is clear, and escalate to human review before irreversible action or clearance-like claims.

Start here

  • Read the HTML page first when the task is to compare a financial claim, report, or workflow against known warning patterns.
  • Treat corpus entries as reusable review hints that trigger deeper verification, not as final proof by themselves.
  • Use the filters and sample collections as a static reference workflow, not as a scoring engine.

Open the linked pack when

  • Open the linked pack when the review has become concrete, especially for publication wording, claim safety, or scenario-level escalation logic.
  • Use the pack when you need a task-shaped checklist instead of a broad warning library.

Download the JSON when

  • Download the JSON only after the HTML page makes the warning-only scope and limitations explicit.
  • Use the artifact for retrieval, audit storage, or pattern benchmarking when another agent needs the stable failure-pattern dataset.

Escalate to human review when

  • A corpus hit is being treated as final proof, execution approval, or a universal invalidation claim.
  • The review would block or approve money movement, compliance language, or public claims without source-specific verification.
  • Runner behavior, assumptions, or time windows need fresh confirmation before any high-stakes conclusion.

This page does not do

  • It does not score a live strategy, rank systems, or certify that a report is safe.
  • It does not replace runner verification, human judgment, or scenario-specific evidence review.
  • It does not run backtests or produce request-time backend judgments.

Related public material packs

Use the concrete scenario pack when the task is already specific

The service page explains the general contract. These linked packs are the next step when the user, crawler, or agent already knows the real-world review scenario it needs to inspect.

Start here next

Claim-safety review before publication

An agent is about to restate marketing, evidence, or strategy claims in public-facing language and needs a warning-layer review first.

Why it matters: This is one of the most reusable reliability tasks because many financial-agent failures come from overclaiming, not from missing raw data.

Why start here: Start here next when an agent is about to restate evidence, strategy language, or marketing claims in public-facing form.

Fallback

Browse full catalog

Use the Public Materials index when the task is still broad, or when the service is right but the scenario pack is not obvious yet.

Index page: /agent-verification/materials

JSON artifact: /reliability/public-materials-sample.json

Filters

Corpus summary

All entries: 6
Filtered entries: 6
JSON path: /reliability/reliability-corpus-sample.json
Default domain: all

Usage guidance

The corpus is for warning and review, not for final verdicts.

Do not upgrade a corpus hit into a claim that the whole strategy is invalid.

How to use a corpus hit

A corpus hit is a reusable warning reference about a known failure shape.

A hit should trigger deeper verification, runner comparison, or human review before reuse.

A hit does not mean the entire strategy, report, or workflow is permanently invalid.

Citation checklist

  • Keep firstSeenAt, lastUpdatedAt, and sourceRefs visible when citing a corpus entry.
  • Preserve the domain and severity so the warning is not taken out of context.
  • Carry notes and limitations forward when the hit influences a human or agent decision.

Do not upgrade to

  • Do not upgrade a corpus hit into final proof, universal invalidation, or execution approval.
  • Do not present a corpus entry as legal advice, enforcement action, or formal compliance determination.
  • Do not use the corpus as a live strategy score or a replacement for runner-specific evidence.

Sample inventory

Pick the scenario that matches the risk review task

These named corpus packs show which review path a crawler, analyst, or agent should inspect first.

backtest_assumption_audit

Backtest assumption audit

Use this collection when an agent or reviewer needs to sanity-check a strategy report for lookahead, repainting, and cost-assumption mistakes.

Why it matters: Shows how the corpus can catch common reasons a backtest result should not be trusted at face value.

runner_difference_review

Runner difference review

Use this collection when comparing results across runners, synthetic samples, or mismatched execution contracts.

Why it matters: Shows why agent-facing results need runner and data-contract context before any comparison is treated as meaningful.

claim_safety_review

Claim safety review

Use this collection when a financial agent is about to restate marketing or evidence into public-facing conclusions.

Why it matters: Shows how the corpus can block overconfident or unsafe public statements before they are repeated downstream.

Public review collections

Named corpus packs for common audit tasks

These static collections show how the corpus can be used as a reference pack for backtest, runner, and claim-safety review.

backtest_assumption_audit

Backtest assumption audit

Use this collection when an agent or reviewer needs to sanity-check a strategy report for lookahead, repainting, and cost-assumption mistakes.

Entries: 3

Filter hint: backtest_assumptions / high

runner_difference_review

Runner difference review

Use this collection when comparing results across runners, synthetic samples, or mismatched execution contracts.

Entries: 2

Filter hint: runner_difference

claim_safety_review

Claim safety review

Use this collection when a financial agent is about to restate marketing or evidence into public-facing conclusions.

Entries: 1

Filter hint: claim_safety / high
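The relationship between a named collection, its filter hint, and its entries can be sketched as a lookup plus a consistency check. This is a minimal sketch under stated assumptions: the helper names `resolve_collection` and `filter_mismatches` are hypothetical, and the idea that a filter hint should agree with every listed entry is a reading of the sample, not a documented rule. The data is abbreviated from the artifact's `entries` and `sampleCollections`.

```python
from typing import Any

# Entry ids, domains, and severities as published in the sample artifact.
ENTRIES: list[dict[str, str]] = [
    {"id": "lookahead_bias", "domain": "backtest_assumptions", "severity": "high"},
    {"id": "repainting", "domain": "backtest_assumptions", "severity": "high"},
    {"id": "fee_slippage_drift", "domain": "backtest_assumptions", "severity": "high"},
    {"id": "sample_vs_live_gap", "domain": "runner_difference", "severity": "medium"},
    {"id": "runner_contract_mismatch", "domain": "runner_difference", "severity": "high"},
    {"id": "claim_safety_overreach", "domain": "claim_safety", "severity": "high"},
]

COLLECTIONS: list[dict[str, Any]] = [
    {"id": "backtest_assumption_audit",
     "filters": {"domain": "backtest_assumptions", "severity": "high"},
     "entryIds": ["lookahead_bias", "repainting", "fee_slippage_drift"]},
    {"id": "runner_difference_review",
     "filters": {"domain": "runner_difference"},
     "entryIds": ["sample_vs_live_gap", "runner_contract_mismatch"]},
    {"id": "claim_safety_review",
     "filters": {"domain": "claim_safety", "severity": "high"},
     "entryIds": ["claim_safety_overreach"]},
]


def resolve_collection(collection: dict[str, Any],
                       entries: list[dict[str, str]]) -> list[dict[str, str]]:
    """Hypothetical helper: return the collection's entries in listed order."""
    by_id = {entry["id"]: entry for entry in entries}
    missing = [eid for eid in collection["entryIds"] if eid not in by_id]
    if missing:
        raise KeyError(f"collection {collection['id']!r} lists unknown ids: {missing}")
    return [by_id[eid] for eid in collection["entryIds"]]


def filter_mismatches(collection: dict[str, Any],
                      entries: list[dict[str, str]]) -> list[str]:
    """Return ids of listed entries that disagree with the collection's filter hint."""
    return [
        entry["id"]
        for entry in resolve_collection(collection, entries)
        if any(entry.get(key) != value for key, value in collection["filters"].items())
    ]


counts = {c["id"]: len(resolve_collection(c, ENTRIES)) for c in COLLECTIONS}
mismatches = {c["id"]: filter_mismatches(c, ENTRIES) for c in COLLECTIONS}
print(counts)
print(mismatches)
```

On the sample data the resolved counts are 3, 2, and 1, matching the entry counts listed for each collection, and no entry disagrees with its collection's filter hint.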

Entries

Corpus entries and failure patterns

backtest_assumptions

Severity: high

ID: lookahead_bias

Lookahead bias and future leakage

artifactStatus: static_public_sample

evidenceScope: warning_pattern

reviewIntent: Trigger deeper timestamp and runner verification before any public or operational reuse.

First seen: 2026-05-22T00:00:00.000Z

Last updated: 2026-05-23T00:00:00.000Z

Usage guidance

  • Use this entry to request deeper timestamp-alignment and feature-construction review.

Do not upgrade to

  • Do not present this entry as final proof that every related strategy is invalid.

Notes

  • The strategy appears to see future bars or derived values before they are available.
  • Treat as invalid until timestamp alignment and indicator construction are verified.

backtest_assumptions

Severity: high

ID: repainting

Repainting indicator or chart logic

artifactStatus: static_public_sample

evidenceScope: warning_pattern

reviewIntent: Trigger deeper indicator-timing verification before any public performance restatement.

First seen: 2026-05-22T00:00:00.000Z

Last updated: 2026-05-23T00:00:00.000Z

Usage guidance

  • Use this entry to ask whether indicator values change after bar close or after recomputation.

Do not upgrade to

  • Do not present this entry as automatic proof of fraud or intentional deception.

Notes

  • Historical values can change after the bar closes or after the indicator recalculates.
  • Do not present a repainting series as stable historical evidence.

backtest_assumptions

Severity: high

ID: fee_slippage_drift

Fee and slippage assumption drift

artifactStatus: static_public_sample

evidenceScope: warning_pattern

reviewIntent: Trigger cost-model verification before publishing backtest conclusions.

First seen: 2026-05-22T00:00:00.000Z

Last updated: 2026-05-23T00:00:00.000Z

Usage guidance

  • Use this entry to ask for explicit fee, slippage, and fill-assumption documentation.

Do not upgrade to

  • Do not present this entry as final proof that the whole strategy is worthless.

Notes

  • Backtest results can degrade sharply when fee and slippage assumptions are made more realistic.
  • A result without explicit fee and slippage assumptions is not auditable.

runner_difference

Severity: medium

ID: sample_vs_live_gap

Sample-versus-live gap

artifactStatus: static_public_sample

evidenceScope: warning_pattern

reviewIntent: Trigger scope and dataset-boundary review before sample outputs are reused as live evidence.

First seen: 2026-05-22T00:00:00.000Z

Last updated: 2026-05-23T00:00:00.000Z

Usage guidance

  • Use this entry to ask whether a result came from sample, offline, synthetic, or live conditions.

Do not upgrade to

  • Do not present this entry as proof that a system can never work live.

Notes

  • Synthetic or offline samples can validate flow but do not prove live performance.
  • Always label sample data as sample data.

runner_difference

Severity: high

ID: runner_contract_mismatch

Runner contract mismatch

artifactStatus: static_public_sample

evidenceScope: warning_pattern

reviewIntent: Trigger runner-contract diff review before comparing outputs across environments.

First seen: 2026-05-22T00:00:00.000Z

Last updated: 2026-05-23T00:00:00.000Z

Usage guidance

  • Use this entry to ask whether fill, timing, and execution assumptions were matched across runners.

Do not upgrade to

  • Do not present this entry as automatic proof that one runner is correct and the other is wrong.

Notes

  • Different runners can disagree on order fill behavior, bar-close logic, or execution timing.
  • Do not compare results across runners without a documented contract diff.

claim_safety

Severity: high

ID: claim_safety_overreach

Claim safety overreach

artifactStatus: static_public_sample

evidenceScope: warning_pattern

reviewIntent: Trigger claim-boundary review before evidence packaging is upgraded into public-facing conclusions.

First seen: 2026-05-22T00:00:00.000Z

Last updated: 2026-05-23T00:00:00.000Z

Usage guidance

  • Use this entry to ask whether a statement exceeds what the cited evidence can support.

Do not upgrade to

  • Do not present this entry as legal advice, enforcement action, or final compliance determination.

Notes

  • A packaging layer can summarize evidence, but it must not claim legal clearance or guaranteed safety.
  • Do not rewrite marketing claims into verified facts.

Machine-readable

Corpus JSON artifact

The filtered corpus is exposed in a stable JSON shape for downstream agents and audit storage.

{
  "status": "success",
  "data": {
    "contract": {
      "id": "reliability-corpus",
      "name": "Reliability Corpus",
      "summary": "A static reference library of failure patterns, runner-difference notes, assumption drift, and claim-safety examples.",
      "htmlPath": "/agent-verification/reliability-corpus",
      "artifactPath": "/reliability/reliability-corpus-sample.json",
      "artifactProvenance": {
        "schemaVersion": "trading4ai-public-reliability-v1",
        "artifactVersion": "reliability-corpus-sample-2026-05-23",
        "generatedAt": "2026-05-23T00:00:00.000Z",
        "staticSnapshotAt": "2026-05-23T00:00:00.000Z",
        "artifactUrl": "/reliability/reliability-corpus-sample.json",
        "citationFields": [
          "artifactVersion",
          "generatedAt",
          "entries[].firstSeenAt",
          "entries[].lastUpdatedAt",
          "entries[].sourceRefs"
        ],
        "howToCite": [
          "When citing a corpus entry, keep artifactVersion, generatedAt, firstSeenAt, lastUpdatedAt, and sourceRefs in the review trace.",
          "Treat a corpus hit as a warning reference only; it is not a final verdict about a strategy, report, or system."
        ]
      },
      "artifactStatus": "static_public_sample",
      "evidenceScope": "warning_pattern",
      "reviewMode": "reference_only",
      "citationPolicy": {
        "mode": "warning_reference_only",
        "summary": "This corpus is a warning reference, not a final verdict about a strategy or system.",
        "guidance": [
          "Use the corpus to trigger deeper verification and review.",
          "Do not upgrade a corpus hit into a universal invalidation claim."
        ]
      },
      "intendedUsers": [
        "AI agents checking generated financial claims against known failure patterns",
        "Human reviewers comparing a strategy, report, or action against reusable warnings",
        "Crawlers indexing TRADING4AI's public reliability materials"
      ],
      "acceptedInput": {
        "label": "Corpus lookup context",
        "items": [
          "domain: optional backtest_assumptions, runner_difference, claim_safety, or related category",
          "severity: optional low, medium, or high",
          "source: optional source-ref or text search",
          "updatedSince: optional date filter for changed entries"
        ]
      },
      "outputShape": {
        "label": "Reference corpus result",
        "items": [
          "entries: failure-pattern records with id, pattern, domain, severity, and notes",
          "sourceRefs: public references for each pattern",
          "firstSeenAt and lastUpdatedAt",
          "filters, generatedAt, and limitations",
          "machine-readable JSON artifact"
        ]
      },
      "suitableUseCases": [
        "Checking whether a backtest or strategy report has known failure modes",
        "Teaching financial agents to avoid unsafe or overconfident claim patterns",
        "Providing crawlable reference material for reliability and claim-safety review"
      ],
      "limitations": [
        "A corpus hit is a warning shape, not proof that every related strategy is invalid",
        "The corpus does not replace runner verification or human judgment",
        "Source behavior, platform documentation, and assumptions can change over time"
      ],
      "nonGoals": [
        "No live strategy scoring",
        "No complete financial-risk ontology",
        "No execution approval",
        "No guarantee that all possible failure modes are covered"
      ],
      "crawlerGuidance": [
        "Index the HTML page for browsable failure patterns",
        "Use the JSON artifact for stable machine retrieval",
        "Treat entries as reference warnings, not final verdicts"
      ]
    },
    "usageRules": {
      "serviceId": "reliability-corpus",
      "startHere": [
        "Read the HTML page first when the task is to compare a financial claim, report, or workflow against known warning patterns.",
        "Treat corpus entries as reusable review hints that trigger deeper verification, not as final proof by themselves.",
        "Use the filters and sample collections as a static reference workflow, not as a scoring engine."
      ],
      "openLinkedPackWhen": [
        "Open the linked pack when the review has become concrete, especially for publication wording, claim safety, or scenario-level escalation logic.",
        "Use the pack when you need a task-shaped checklist instead of a broad warning library."
      ],
      "downloadJsonWhen": [
        "Download the JSON only after the HTML page makes the warning-only scope and limitations explicit.",
        "Use the artifact for retrieval, audit storage, or pattern benchmarking when another agent needs the stable failure-pattern dataset."
      ],
      "escalateToHumanReviewWhen": [
        "A corpus hit is being treated as final proof, execution approval, or a universal invalidation claim.",
        "The review would block or approve money movement, compliance language, or public claims without source-specific verification.",
        "Runner behavior, assumptions, or time windows need fresh confirmation before any high-stakes conclusion."
      ],
      "whatThisPageDoesNotDo": [
        "It does not score a live strategy, rank systems, or certify that a report is safe.",
        "It does not replace runner verification, human judgment, or scenario-specific evidence review.",
        "It does not run backtests or produce request-time backend judgments."
      ]
    },
    "interpretationGuide": {
      "hitMeaning": [
        "A corpus hit is a reusable warning reference about a known failure shape.",
        "A hit should trigger deeper verification, runner comparison, or human review before reuse.",
        "A hit does not mean the entire strategy, report, or workflow is permanently invalid."
      ],
      "citationChecklist": [
        "Keep firstSeenAt, lastUpdatedAt, and sourceRefs visible when citing a corpus entry.",
        "Preserve the domain and severity so the warning is not taken out of context.",
        "Carry notes and limitations forward when the hit influences a human or agent decision."
      ],
      "doNotUpgradeTo": [
        "Do not upgrade a corpus hit into final proof, universal invalidation, or execution approval.",
        "Do not present a corpus entry as legal advice, enforcement action, or formal compliance determination.",
        "Do not use the corpus as a live strategy score or a replacement for runner-specific evidence."
      ]
    },
    "entries": [
      {
        "id": "lookahead_bias",
        "pattern": "Lookahead bias and future leakage",
        "domain": "backtest_assumptions",
        "severity": "high",
        "artifactStatus": "static_public_sample",
        "evidenceScope": "warning_pattern",
        "reviewIntent": "Trigger deeper timestamp and runner verification before any public or operational reuse.",
        "usageGuidance": [
          "Use this entry to request deeper timestamp-alignment and feature-construction review."
        ],
        "doNotUpgradeTo": [
          "Do not present this entry as final proof that every related strategy is invalid."
        ],
        "sourceRefs": [
          "https://www.freqtrade.io/en/stable/lookahead-analysis/"
        ],
        "firstSeenAt": "2026-05-22T00:00:00.000Z",
        "lastUpdatedAt": "2026-05-23T00:00:00.000Z",
        "notes": [
          "The strategy appears to see future bars or derived values before they are available.",
          "Treat as invalid until timestamp alignment and indicator construction are verified."
        ]
      },
      {
        "id": "repainting",
        "pattern": "Repainting indicator or chart logic",
        "domain": "backtest_assumptions",
        "severity": "high",
        "artifactStatus": "static_public_sample",
        "evidenceScope": "warning_pattern",
        "reviewIntent": "Trigger deeper indicator-timing verification before any public performance restatement.",
        "usageGuidance": [
          "Use this entry to ask whether indicator values change after bar close or after recomputation."
        ],
        "doNotUpgradeTo": [
          "Do not present this entry as automatic proof of fraud or intentional deception."
        ],
        "sourceRefs": [
          "https://www.tradingview.com/support/solutions/43000599874-what-is-repainting/"
        ],
        "firstSeenAt": "2026-05-22T00:00:00.000Z",
        "lastUpdatedAt": "2026-05-23T00:00:00.000Z",
        "notes": [
          "Historical values can change after the bar closes or after the indicator recalculates.",
          "Do not present a repainting series as stable historical evidence."
        ]
      },
      {
        "id": "fee_slippage_drift",
        "pattern": "Fee and slippage assumption drift",
        "domain": "backtest_assumptions",
        "severity": "high",
        "artifactStatus": "static_public_sample",
        "evidenceScope": "warning_pattern",
        "reviewIntent": "Trigger cost-model verification before publishing backtest conclusions.",
        "usageGuidance": [
          "Use this entry to ask for explicit fee, slippage, and fill-assumption documentation."
        ],
        "doNotUpgradeTo": [
          "Do not present this entry as final proof that the whole strategy is worthless."
        ],
        "sourceRefs": [
          "https://www.quantconnect.com/docs/v2/writing-algorithms/backtesting/performance"
        ],
        "firstSeenAt": "2026-05-22T00:00:00.000Z",
        "lastUpdatedAt": "2026-05-23T00:00:00.000Z",
        "notes": [
          "Backtest results can degrade sharply when fee and slippage assumptions are made more realistic.",
          "A result without explicit fee and slippage assumptions is not auditable."
        ]
      },
      {
        "id": "sample_vs_live_gap",
        "pattern": "Sample-versus-live gap",
        "domain": "runner_difference",
        "severity": "medium",
        "artifactStatus": "static_public_sample",
        "evidenceScope": "warning_pattern",
        "reviewIntent": "Trigger scope and dataset-boundary review before sample outputs are reused as live evidence.",
        "usageGuidance": [
          "Use this entry to ask whether a result came from sample, offline, synthetic, or live conditions."
        ],
        "doNotUpgradeTo": [
          "Do not present this entry as proof that a system can never work live."
        ],
        "sourceRefs": [
          "https://www.quantconnect.com/docs/v2/writing-algorithms/backtesting",
          "https://www.freqtrade.io/en/stable/backtesting/"
        ],
        "firstSeenAt": "2026-05-22T00:00:00.000Z",
        "lastUpdatedAt": "2026-05-23T00:00:00.000Z",
        "notes": [
          "Synthetic or offline samples can validate flow but do not prove live performance.",
          "Always label sample data as sample data."
        ]
      },
      {
        "id": "runner_contract_mismatch",
        "pattern": "Runner contract mismatch",
        "domain": "runner_difference",
        "severity": "high",
        "artifactStatus": "static_public_sample",
        "evidenceScope": "warning_pattern",
        "reviewIntent": "Trigger runner-contract diff review before comparing outputs across environments.",
        "usageGuidance": [
          "Use this entry to ask whether fill, timing, and execution assumptions were matched across runners."
        ],
        "doNotUpgradeTo": [
          "Do not present this entry as automatic proof that one runner is correct and the other is wrong."
        ],
        "sourceRefs": [
          "https://www.quantconnect.com/docs/v2/writing-algorithms/backtesting",
          "https://www.freqtrade.io/en/stable/lookahead-analysis/"
        ],
        "firstSeenAt": "2026-05-22T00:00:00.000Z",
        "lastUpdatedAt": "2026-05-23T00:00:00.000Z",
        "notes": [
          "Different runners can disagree on order fill behavior, bar-close logic, or execution timing.",
          "Do not compare results across runners without a documented contract diff."
        ]
      },
      {
        "id": "claim_safety_overreach",
        "pattern": "Claim safety overreach",
        "domain": "claim_safety",
        "severity": "high",
        "artifactStatus": "static_public_sample",
        "evidenceScope": "warning_pattern",
        "reviewIntent": "Trigger claim-boundary review before evidence packaging is upgraded into public-facing conclusions.",
        "usageGuidance": [
          "Use this entry to ask whether a statement exceeds what the cited evidence can support."
        ],
        "doNotUpgradeTo": [
          "Do not present this entry as legal advice, enforcement action, or final compliance determination."
        ],
        "sourceRefs": [
          "https://www.sec.gov/oiea/investor-alerts-and-bulletins",
          "https://consumer.ftc.gov/articles/what-know-about-cryptocurrency-and-scams"
        ],
        "firstSeenAt": "2026-05-22T00:00:00.000Z",
        "lastUpdatedAt": "2026-05-23T00:00:00.000Z",
        "notes": [
          "A packaging layer can summarize evidence, but it must not claim legal clearance or guaranteed safety.",
          "Do not rewrite marketing claims into verified facts."
        ]
      }
    ],
    "sampleInventory": [
      {
        "id": "backtest_assumption_audit",
        "title": "Backtest assumption audit",
        "scenario": "Use this collection when an agent or reviewer needs to sanity-check a strategy report for lookahead, repainting, and cost-assumption mistakes.",
        "whyItMatters": "Shows how the corpus can catch common reasons a backtest result should not be trusted at face value."
      },
      {
        "id": "runner_difference_review",
        "title": "Runner difference review",
        "scenario": "Use this collection when comparing results across runners, synthetic samples, or mismatched execution contracts.",
        "whyItMatters": "Shows why agent-facing results need runner and data-contract context before any comparison is treated as meaningful."
      },
      {
        "id": "claim_safety_review",
        "title": "Claim safety review",
        "scenario": "Use this collection when a financial agent is about to restate marketing or evidence into public-facing conclusions.",
        "whyItMatters": "Shows how the corpus can block overconfident or unsafe public statements before they are repeated downstream."
      }
    ],
    "sampleCollections": [
      {
        "id": "backtest_assumption_audit",
        "title": "Backtest assumption audit",
        "scenario": "Use this collection when an agent or reviewer needs to sanity-check a strategy report for lookahead, repainting, and cost-assumption mistakes.",
        "filters": {
          "domain": "backtest_assumptions",
          "severity": "high"
        },
        "entryIds": [
          "lookahead_bias",
          "repainting",
          "fee_slippage_drift"
        ]
      },
      {
        "id": "runner_difference_review",
        "title": "Runner difference review",
        "scenario": "Use this collection when comparing results across runners, synthetic samples, or mismatched execution contracts.",
        "filters": {
          "domain": "runner_difference"
        },
        "entryIds": [
          "sample_vs_live_gap",
          "runner_contract_mismatch"
        ]
      },
      {
        "id": "claim_safety_review",
        "title": "Claim safety review",
        "scenario": "Use this collection when a financial agent is about to restate marketing or evidence into public-facing conclusions.",
        "filters": {
          "domain": "claim_safety",
          "severity": "high"
        },
        "entryIds": [
          "claim_safety_overreach"
        ]
      }
    ],
    "totalEntries": 6,
    "filteredEntries": 6,
    "filters": {},
    "generatedAt": "2026-05-23T00:00:00.000Z",
    "limitations": [
      "This corpus packages failure patterns and review notes; it does not replace human judgment or runner verification.",
      "A corpus entry is evidence of a known failure shape, not proof that a strategy is invalid in every context.",
      "Source availability, runner behavior, and assumptions can change over time."
    ]
  },
  "request_id": "reliability_corpus_2a5c4fff-c6f1-45a1-af68-e8edb3c17a80",
  "timestamp": "2026-05-23T00:00:00.000Z"
}
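Before an agent stores or hands off the artifact, it may help to confirm that the envelope still carries the warning-only boundary fields. The following is a minimal sketch, not a published validator: the function name `check_envelope` is hypothetical, the inline `RAW` string is an abbreviated copy of the JSON above, and the check that `filteredEntries` equals the length of `entries` is an assumption drawn from this one sample. Fetching the artifact over HTTP is deliberately left out.

```python
import json
from typing import Any

# Abbreviated copy of the sample artifact; entries are reduced to their ids.
RAW = """
{
  "status": "success",
  "data": {
    "contract": {
      "artifactStatus": "static_public_sample",
      "evidenceScope": "warning_pattern",
      "reviewMode": "reference_only",
      "citationPolicy": {"mode": "warning_reference_only"}
    },
    "entries": [
      {"id": "lookahead_bias"}, {"id": "repainting"}, {"id": "fee_slippage_drift"},
      {"id": "sample_vs_live_gap"}, {"id": "runner_contract_mismatch"},
      {"id": "claim_safety_overreach"}
    ],
    "totalEntries": 6,
    "filteredEntries": 6,
    "filters": {}
  }
}
"""

# Boundary values as published in the sample artifact.
EXPECTED_BOUNDARY = {
    "artifactStatus": "static_public_sample",
    "evidenceScope": "warning_pattern",
    "reviewMode": "reference_only",
}


def check_envelope(payload: dict[str, Any]) -> list[str]:
    """Hypothetical check: return a list of problems; an empty list means none found."""
    problems: list[str] = []
    if payload.get("status") != "success":
        problems.append(f"status is {payload.get('status')!r}, expected 'success'")
    data = payload.get("data", {})
    contract = data.get("contract", {})
    for key, expected in EXPECTED_BOUNDARY.items():
        if contract.get(key) != expected:
            problems.append(f"{key} is {contract.get(key)!r}, expected {expected!r}")
    if contract.get("citationPolicy", {}).get("mode") != "warning_reference_only":
        problems.append("citationPolicy.mode is not 'warning_reference_only'")
    entries = data.get("entries", [])
    # Assumption from the sample: filteredEntries counts the entries actually returned.
    if data.get("filteredEntries") != len(entries):
        problems.append("filteredEntries does not match the number of entries")
    if data.get("filteredEntries", 0) > data.get("totalEntries", 0):
        problems.append("filteredEntries exceeds totalEntries")
    return problems


payload = json.loads(RAW)
problems = check_envelope(payload)
print(problems)
```

An empty problem list says only that the envelope looks like the published sample; it is not a statement about any strategy the entries are later compared against.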