Keith Kitchen

Tag: evaluation

8 items with this tag.

Date, Title, Tags
  • Aug 03, 2025

    eval debt registers become standard in quarterly planning

    • generalnote
    • ai
    • evaluation
    • 2025
  • Jul 05, 2025

    eval set version locks prevent accidental benchmark inflation

    • techjournal
    • ai
    • evaluation
    • 2025
  • Mar 03, 2025

    meta review of agent rollback benchmark methodologies

    • researchdigest
    • ai
    • evaluation
    • 2025
  • Jan 27, 2025

    evidence review on post deployment eval drift

    • researchdigest
    • ai
    • evaluation
    • 2025
  • Dec 14, 2024

    evidence review on long context degradation patterns

    • researchdigest
    • ai
    • evaluation
    • 2024
  • Dec 09, 2024

    meta analysis on llm judge reliability across domains

    • researchdigest
    • ai
    • evaluation
    • 2024
  • Nov 23, 2024

    evidence review on retrieval eval methods in production

    • researchdigest
    • ai
    • evaluation
    • 2024
  • Nov 12, 2024

    benchmark leaderboards now hide real risk

    • thoughtpiece
    • ai
    • evaluation
    • 2024

Keith Kitchen is a curated atheneum for market systems, research notes, and working ideas. © 2026
