survey of ai incident taxonomies and reporting quality

Public AI incident repositories have expanded, yet reporting standards remain inconsistent across severity definitions, root-cause labels, and remediation detail (OECD AI Incidents Monitor).

evidence stack

  • Reported incident volume is growing faster than schema quality is improving.
  • Severity labels are often incomparable across sources (see the normalization sketch after this list).
  • Remediation detail is frequently too shallow for other teams to reuse.
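
A minimal sketch of the severity-comparability problem, assuming two hypothetical source schemas (`repo_a`, `repo_b`); the label sets and the shared 1-5 scale are illustrative, not drawn from any real repository:

```python
# hypothetical mapping from (source, label) onto a shared 1-5 scale
SEVERITY_MAP: dict[tuple[str, str], int] = {
    ("repo_a", "negligible"): 1,
    ("repo_a", "moderate"): 3,
    ("repo_a", "severe"): 5,
    ("repo_b", "near-miss"): 1,      # repo_b labels outcomes, not magnitude
    ("repo_b", "realized-harm"): 4,
}

def normalize_severity(source: str, label: str) -> int | None:
    """Map a source-specific severity label onto a shared 1-5 scale.

    Returns None when no agreed mapping exists, which is the common case
    the bullet above points at: many labels are simply incomparable.
    """
    return SEVERITY_MAP.get((source, label.strip().lower()))

print(normalize_severity("repo_a", "Severe"))    # 5
print(normalize_severity("repo_b", "moderate"))  # None: not in repo_b's vocabulary
```

The None branch is the point: without a field that anchors labels to a common scale, cross-source aggregation silently mixes incommensurable values.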

method boundary

Taxonomy quality improves when incident reporting is tied to operational postmortems rather than external PR summaries.
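
To make the boundary concrete, here is a sketch of the kind of record whose key fields only an operational postmortem can fill reliably; the field names are illustrative assumptions, not any repository's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class IncidentRecord:
    """Hypothetical incident record; postmortem-derived fields default to None."""
    incident_id: str
    summary: str                    # recoverable from public reporting alone
    severity: int | None = None    # shared 1-5 scale, None if unmapped
    root_cause: str | None = None  # typically requires internal postmortem access
    remediation: str | None = None # a concrete fix, not a PR statement
    sources: list[str] = field(default_factory=list)

    def reusable(self) -> bool:
        # an entry supports shared learning only when the postmortem-derived
        # fields are actually populated
        return self.root_cause is not None and self.remediation is not None
```

Entries built from PR summaries tend to leave `root_cause` and `remediation` empty, so `reusable()` returns False for most of the dataset.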

my take

Incident transparency is improving, but shared learning still lags because category definitions are not stable enough across repositories for robust comparison.

linkage

  • [[ai incident reporting datasets are still sparse]]
  • [[ai safety evals move into procurement checklists]]
  • [[governance sandboxes speed ai rollouts]]

ending questions

which taxonomy field is most critical to make ai incident datasets genuinely comparable?