
2026-04-03-yomiya-content-intake-benchmark-design

Date: 2026-04-03
Scope: product/YomiyaContentSystem/30-Research-and-Evidence/
Status: Approved design

Simplify the research-stage intake flow so the system does not repeat value judgments that humans have already made. The system should focus on transcript acquisition, classification, and fallback handling.

The previous benchmark still gave the system too much gatekeeping responsibility.

The corrected model is:

  1. human preselects the resource
  2. system gets usable text
  3. AI classifies the resource
  4. failures go to manual fallback

At this stage the system should not re-judge:

  • whether a resource is worth studying
  • whether it is a strong Phase 1 candidate
  • whether it belongs in a research pool

Those are treated as upstream human decisions.

Capture only enough metadata to process the resource.

Choose the first usable path:

  • native transcript, captions, article, or description
  • ASR
  • failure fallback
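The "first usable path" rule can be sketched as an ordered fallback chain. This is a minimal illustration, not the actual implementation: the function `acquire_text`, the path labels, and the stub fetchers are all hypothetical names introduced here.

```python
from typing import Callable, Optional

# Each fetcher returns usable text, or None when that path is unavailable.
Fetcher = Callable[[str], Optional[str]]


def acquire_text(
    resource_id: str, paths: list[tuple[str, Fetcher]]
) -> tuple[str, Optional[str]]:
    """Try each path in order; return (path_name, text) for the first usable one."""
    for name, fetch in paths:
        text = fetch(resource_id)
        if text and text.strip():
            return name, text
    # No path produced text: signal the failure fallback.
    return "failure_fallback", None


# Hypothetical stand-in fetchers, for illustration only.
def fetch_native(resource_id: str) -> Optional[str]:
    # Pretend no transcript, captions, article, or description exists.
    return None


def fetch_asr(resource_id: str) -> Optional[str]:
    return "recognized speech text"


path, text = acquire_text("res-001", [("native", fetch_native), ("asr", fetch_asr)])
print(path)  # asr
```

Keeping the paths in an ordered list makes the priority explicit: ASR is attempted only when the native sources yield nothing.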

Once text exists, output:

  • level
  • scene
  • content_structure
  • recommended_collection_direction
  • other lightweight classification fields
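One possible shape for the classification output is a small record type. The four named fields come from the list above; the `extra` mapping and every sample value are placeholders assumed for illustration, since the allowed values are not defined here.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class Classification:
    level: str
    scene: str
    content_structure: str
    recommended_collection_direction: str
    # Hypothetical catch-all for the "other lightweight classification fields".
    extra: dict[str, str] = field(default_factory=dict)


# Placeholder values only; the real vocabularies are not specified in this note.
record = Classification(
    level="beginner",
    scene="daily conversation",
    content_structure="dialogue",
    recommended_collection_direction="listening practice",
)
print(asdict(record)["level"])  # beginner
```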

If text cannot be obtained, mark the resource for manual supplement or hold. Do not force full classification.
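A minimal sketch of this rule, assuming a hypothetical `can_supplement` flag decides between manual supplement and hold (this note does not say how that choice is made). The point it illustrates is that the classifier is never invoked without text.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class IntakeResult:
    status: str  # "classified", "manual_supplement", or "hold"
    classification: Optional[dict] = None


def finalize(
    text: Optional[str],
    classify: Callable[[str], dict],
    can_supplement: bool = True,  # assumption: how supplement vs. hold is chosen
) -> IntakeResult:
    if not text or not text.strip():
        # No usable text: skip classification entirely.
        return IntakeResult("manual_supplement" if can_supplement else "hold")
    return IntakeResult("classified", classify(text))


# Stub classifier that records whether it was called.
calls: list[str] = []


def stub_classify(text: str) -> dict:
    calls.append(text)
    return {"level": "placeholder"}


result = finalize(None, stub_classify)
print(result.status, len(calls))  # manual_supplement 0
```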

The simplified flow should be:

human preselection -> transcript -> classify

not:

system value gate -> transcript -> classify
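The difference between the two flows can be made concrete: in the simplified flow, every preselected resource reaches the transcript step, and no system-side filter drops items on perceived value. The sketch below uses hypothetical names (`run_intake`, the stub text lookup, and the stub classifier) and is only one way to express this.

```python
from typing import Callable, Optional


def run_intake(
    preselected: list[str],
    get_text: Callable[[str], Optional[str]],
    classify: Callable[[str], dict],
) -> dict[str, Optional[dict]]:
    """Process every preselected resource; there is no value gate before the transcript step."""
    results: dict[str, Optional[dict]] = {}
    for resource_id in preselected:
        text = get_text(resource_id)
        # None marks a resource routed to manual fallback instead of classification.
        results[resource_id] = classify(text) if text else None
    return results


# Hypothetical stubs: resource "b" has no obtainable text.
texts = {"a": "some transcript", "b": None}
out = run_intake(["a", "b"], texts.get, lambda t: {"level": "placeholder"})
print(out)  # {'a': {'level': 'placeholder'}, 'b': None}
```

The output has one entry per input, which is the property the simplified flow is meant to guarantee: resources leave the pipeline either classified or flagged for fallback, never silently rejected.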

  1. product/YomiyaContentSystem/30-Research-and-Evidence/foundations/yomiya-content-intake-benchmark.md
  2. product/YomiyaContentSystem/30-Research-and-Evidence/foundations/yomiya-content-research-workflow.md
  3. aligned usage guidance in the unified sample list