2026-04-03-yomiya-content-intake-benchmark-design

Yomiya Content Intake Flow Design

Date: 2026-04-03
Scope: product/YomiyaContentSystem/30-Research-and-Evidence/
Status: Approved design

Goal

Simplify the research-stage intake flow so the system does not repeat value judgments that humans have already made. The system should focus on transcript acquisition, classification, and fallback handling.

Core Decision

The previous benchmark still gave the system too much gatekeeping responsibility.

The corrected model is:

human preselects the resource
system gets usable text
AI classifies the resource
failures go to manual fallback

What The System Should Not Do

At this stage the system should not re-judge:

whether a resource is worth studying
whether it is a strong Phase 1 candidate
whether it belongs in a research pool

Those are treated as upstream human decisions.

What The System Should Do

1. Minimal registration

Capture only enough metadata to process the resource.
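
As a minimal sketch of what "only enough metadata" might look like, the record below carries just an identifier, a source location, and a media type. All field names and sample values are assumptions made for illustration, not part of the approved design. The record deliberately has no score, priority, or worth-studying field, since those judgments are made upstream.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResourceRegistration:
    """Hypothetical minimal intake record; all field names are assumptions."""
    resource_id: str
    source_url: str
    media_type: str              # e.g. "video" or "article" (illustrative values)
    title: Optional[str] = None  # optional, kept only as a processing aid


def register(resource_id: str, source_url: str, media_type: str,
             title: Optional[str] = None) -> ResourceRegistration:
    """Create a record, rejecting only input that cannot be processed at all."""
    if not resource_id or not source_url:
        raise ValueError("resource_id and source_url are required")
    return ResourceRegistration(resource_id, source_url, media_type, title)


# Usage with made-up values.
record = register("res-001", "https://example.com/watch?v=abc", "video")
print(record)
```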

2. Transcript-path resolution

Try the following paths in order and use the first one that yields usable text:

native transcript, captions, article, or description
ASR
failure fallback
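
The path order above can be sketched as an ordered fallback chain. This is an illustration under stated assumptions: the fetcher functions, the path names, and the "non-empty after stripping" test for usable text are all hypothetical, and a real implementation would define usability more carefully.

```python
from typing import Callable, Optional, Sequence, Tuple

# A fetcher takes a source URL and returns text, or None when it has nothing.
Fetcher = Callable[[str], Optional[str]]

FAILURE_FALLBACK = "failure_fallback"  # hypothetical label for the last path


def resolve_transcript(
    source_url: str,
    paths: Sequence[Tuple[str, Fetcher]],
) -> Tuple[str, Optional[str]]:
    """Return (path_name, text) for the first path that yields usable text."""
    for name, fetch in paths:
        try:
            text = fetch(source_url)
        except Exception:
            text = None  # a failing path just hands over to the next one
        if text and text.strip():
            return name, text.strip()
    return FAILURE_FALLBACK, None


# Stub fetchers standing in for real integrations (assumptions).
def fetch_existing_text(url: str) -> Optional[str]:
    return None  # no native transcript, captions, article, or description


def run_asr(url: str) -> Optional[str]:
    return "transcribed speech"


path, text = resolve_transcript(
    "https://example.com/watch?v=abc",
    [("existing_text", fetch_existing_text), ("asr", run_asr)],
)
print(path, text)
```

Keeping the paths as an ordered list means the priority is data, so reordering or adding a path does not require changing the resolution logic.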

3. Classification

Once text exists, output:

level
scene
content_structure
recommended_collection_direction
other lightweight classification fields
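
One possible shape for the classification output is sketched below. The four named fields come from the list above; the `extra` container for the other lightweight fields, the validation rule, and all sample values are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Mapping

REQUIRED_FIELDS = (
    "level",
    "scene",
    "content_structure",
    "recommended_collection_direction",
)


@dataclass
class Classification:
    level: str
    scene: str
    content_structure: str
    recommended_collection_direction: str
    # Hypothetical holder for the other lightweight classification fields.
    extra: Dict[str, Any] = field(default_factory=dict)


def parse_classification(raw: Mapping[str, Any]) -> Classification:
    """Validate a raw classifier result and split off any extra fields."""
    missing = [key for key in REQUIRED_FIELDS if not raw.get(key)]
    if missing:
        raise ValueError(f"missing classification fields: {missing}")
    extra = {k: v for k, v in raw.items() if k not in REQUIRED_FIELDS}
    required = {key: str(raw[key]) for key in REQUIRED_FIELDS}
    return Classification(**required, extra=extra)


# Usage with made-up values.
result = parse_classification({
    "level": "beginner",
    "scene": "daily conversation",
    "content_structure": "dialogue",
    "recommended_collection_direction": "listening practice",
    "language": "ja",
})
print(result)
```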

4. Failure fallback

If text cannot be obtained, mark the resource for manual supplementation or place it on hold. Do not force a full classification.
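
The fallback rule can be sketched as a guard that runs before the classifier. The status names and the `can_supplement` flag used to choose between manual supplementation and hold are assumptions; the design does not specify how that choice is made.

```python
from enum import Enum
from typing import Any, Callable, Dict, Optional


class IntakeStatus(Enum):
    CLASSIFIED = "classified"
    MANUAL_SUPPLEMENT = "manual_supplement"
    HOLD = "hold"


def finish_intake(
    text: Optional[str],
    classify: Callable[[str], Any],
    can_supplement: bool = True,  # hypothetical switch, not from the design
) -> Dict[str, Any]:
    """Classify only when text exists; otherwise mark the resource and stop."""
    if text is None or not text.strip():
        status = (IntakeStatus.MANUAL_SUPPLEMENT if can_supplement
                  else IntakeStatus.HOLD)
        # The classifier is never called without text.
        return {"status": status, "classification": None}
    return {"status": IntakeStatus.CLASSIFIED, "classification": classify(text)}


# Usage with a stub classifier (assumption).
outcome = finish_intake(None, classify=lambda t: {"level": "unknown"})
print(outcome["status"].value)
```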

Main Principle

The simplified flow should be:

human preselection -> transcript -> classify

not:

system value gate -> transcript -> classify

Deliverables

product/YomiyaContentSystem/30-Research-and-Evidence/foundations/yomiya-content-intake-benchmark.md
product/YomiyaContentSystem/30-Research-and-Evidence/foundations/yomiya-content-research-workflow.md
aligned usage guidance in the unified sample list