Source Prioritization
Overview
Given limited time to extract procedures from sources, prioritize which sources to process for maximum procedure value.
Steps
Step 1: Gather source metadata
For each candidate source, collect or estimate:
- Basic information:
  - Title, creator, type, URL
  - Length (duration/pages)
  - Publication date
  - Initial reason for adding to queue
- Quick assessment (don't deep-dive yet):
  - Skim description/abstract/table of contents
  - Check creator credentials
  - Note topic areas covered
  - Estimate procedure density (HIGH/MEDIUM/LOW)
- Volume metrics:
  - Total videos/items (for channels)
  - Total hours of content
  - Estimated transcript characters (hours x 9000)
Create standardized metadata for each source.
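For instance, a minimal sketch of such a standardized record in Python; the field names are illustrative assumptions, not part of the procedure:

```python
from dataclasses import dataclass

@dataclass
class SourceMetadata:
    """Standardized metadata for one candidate source (illustrative fields)."""
    title: str
    creator: str
    source_type: str        # e.g. "youtube_channel", "book", "course"
    url: str
    content_hours: float    # duration, or pages converted to hours
    published: str          # publication date
    queue_reason: str       # initial reason for adding to the queue
    density_estimate: str   # quick estimate: "HIGH", "MEDIUM", or "LOW"

    @property
    def est_transcript_chars(self) -> int:
        # Volume rule of thumb from this step: hours x 9000 characters
        return int(self.content_hours * 9000)
```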
Step 2: Score procedure density
Rate how rich each source is in extractable procedures (1-5):
- 5 - Almost entirely procedural (tutorials, courses, how-tos)
- 4 - Mostly procedural with some context
- 3 - Mix of procedural and informational
- 2 - Mostly informational with some procedures
- 1 - Almost entirely informational/entertainment
High density signals:
- Tutorial channels, course creators
- “How to” in most titles
- Step-by-step format common
- Practical demonstrations
Low density signals:
- News/commentary
- Entertainment focus
- Opinion/reaction content
- Abstract theory without application
Step 3: Score uniqueness
Assess whether this source has knowledge unavailable elsewhere (1-5):
- 5 - Completely unique - only source for this knowledge
- 4 - Rare - few others have this perspective/access
- 3 - Somewhat unique - different angle on common topics
- 2 - Common - similar to many other sources
- 1 - Commodity - same as everyone else
High uniqueness signals:
- Original research/data
- Unique professional access
- Contrarian successful approaches
- Proprietary methods
- Decades of specialized experience
Low uniqueness signals:
- Summarizes others’ work
- Common knowledge packaged
- Follows trends
- No original insight
Step 4: Score relevance
Assess how relevant to YOUR specific goals (1-5):
- 5 - Directly applicable to current goals
- 4 - Highly relevant to goals
- 3 - Moderately relevant
- 2 - Tangentially relevant
- 1 - Interesting but not relevant
NOTE: This is PERSONAL - depends on what you’re trying to achieve. Cross-reference with extraction_goals and library_gaps.
Step 5: Score credibility
Rate creator’s track record of producing results (1-5):
- 5 - Proven exceptional results, widely recognized
- 4 - Strong track record, credible in field
- 3 - Some evidence of results
- 2 - Claims results but limited evidence
- 1 - No evidence, unverified claims
Credibility signals:
- Verifiable achievements
- Peer recognition
- Students/followers with results
- Published/cited work
- Relevant professional credentials
Low credibility signals:
- Only self-reported success
- No verifiable outcomes
- Sells without substance
- Contradicted by evidence
Step 6: Score extractability
Rate how easy it is to extract procedures (1-5, higher = easier):
- 5 - Very easy - clear explanations, transcripts, structured
- 4 - Easy - good explanations, some structure
- 3 - Moderate - requires inference
- 2 - Hard - implicit, scattered, unstructured
- 1 - Very hard - no transcripts, unclear, heavily visual
Factors that improve extractability:
- Transcripts available
- Clear step-by-step explanations
- Written summaries/notes
- Consistent format
- Explicit about methods
Factors that reduce extractability:
- No transcripts, heavy accent
- Visual demonstrations without explanation
- Scattered across many videos
- Personality-heavy, procedure-light
- Assumes high prior knowledge
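One way to keep the five scores auditable (the Verification section asks for evidence behind every score) is a per-source record like this sketch; Python and these field names are my assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class DimensionScores:
    """The five 1-5 scores for one source, each backed by an evidence note."""
    procedure_density: int
    uniqueness: int
    relevance: int
    credibility: int
    extractability: int
    evidence: dict = field(default_factory=dict)  # dimension name -> rationale

scores = DimensionScores(
    procedure_density=4, uniqueness=3, relevance=5, credibility=4, extractability=4,
    evidence={"procedure_density": "tutorial channel; 'how to' in most titles"},
)
```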
Step 7: Calculate composite scores and ROI
Calculate weighted score for each source:
raw_score = (procedure_density x 0.25) + (uniqueness x 0.25) + (relevance x 0.20) + (credibility x 0.15) + (extractability x 0.15)
Then estimate extraction metrics:
estimated_procedures = content_hours x density_factor
- density_factor: score_5=3.0, score_4=2.0, score_3=1.0, score_2=0.5, score_1=0.2
unique_procedures = estimated_procedures x (uniqueness / 5)
extraction_value = unique_procedures x (relevance / 5)
extraction_hours = content_hours x extraction_multiplier
- manual_deep: 3.0 hours per 1 hour content
- manual_quick: 1.5 hours per 1 hour content
- automated_review: 0.5 hours per 1 hour content
adjusted_cost = extraction_hours / extractability_score
ROI = extraction_value / adjusted_cost
ROI interpretation:
- excellent: > 2.0 procedures per hour
- good: 1.0 - 2.0 procedures per hour
- acceptable: 0.5 - 1.0 procedures per hour
- poor: < 0.5 procedures per hour
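A sketch of the full Step 7 calculation in Python, using the weights, density factors, and extraction-mode multipliers given above; everything else (names, signatures) is illustrative:

```python
DENSITY_FACTOR = {5: 3.0, 4: 2.0, 3: 1.0, 2: 0.5, 1: 0.2}
EXTRACTION_MULTIPLIER = {
    "manual_deep": 3.0,
    "manual_quick": 1.5,
    "automated_review": 0.5,
}

def raw_score(density: int, uniqueness: int, relevance: int,
              credibility: int, extractability: int) -> float:
    return (density * 0.25 + uniqueness * 0.25 + relevance * 0.20
            + credibility * 0.15 + extractability * 0.15)

def roi(content_hours: float, density: int, uniqueness: int, relevance: int,
        extractability: int, extraction_mode: str = "manual_quick") -> float:
    estimated_procedures = content_hours * DENSITY_FACTOR[density]
    unique_procedures = estimated_procedures * (uniqueness / 5)
    extraction_value = unique_procedures * (relevance / 5)
    extraction_hours = content_hours * EXTRACTION_MULTIPLIER[extraction_mode]
    adjusted_cost = extraction_hours / extractability
    return extraction_value / adjusted_cost  # procedures per extraction hour

# Example: a 10-hour source scored density 4, uniqueness 4, relevance 5,
# extractability 4, extracted via manual_quick:
# roi(10, 4, 4, 5, 4) = 16 / 3.75 ≈ 4.27 -> excellent (> 2.0)
```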
Step 8: Assign tiers
Assign each source to a tier based on scores:
TIER 1 - EXTRACT NOW:
- Criteria: ROI > 2.0 OR (raw_score > 4.0 AND relevance = 5)
- Action: Extract immediately, high priority
TIER 2 - EXTRACT SOON:
- Criteria: ROI 1.0-2.0 OR raw_score 3.5-4.0
- Action: Extract when Tier 1 complete
TIER 3 - EXTRACT LATER:
- Criteria: ROI 0.5-1.0 OR raw_score 3.0-3.5
- Action: Extract if time permits
TIER 4 - MAYBE:
- Criteria: ROI 0.25-0.5 OR raw_score 2.5-3.0
- Action: Only if specifically needed
TIER 5 - SKIP:
- Criteria: ROI < 0.25 OR raw_score < 2.5
- Action: Do not extract
Special considerations:
- Foundational sources: bump to Tier 1 if Tier 2-3
- Time-sensitive: bump priority if may become unavailable
- Bundled value: extract early if unlocks other sources
- Diminishing returns: after 10 procedures from one source, reassess
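As a sketch, the tier criteria translate to a simple cascade; the special considerations above still require a manual bump afterward. Python and the function name are assumptions:

```python
def assign_tier(roi: float, raw_score: float, relevance: int) -> int:
    """Map Step 7 outputs to a tier per the criteria above."""
    if roi > 2.0 or (raw_score > 4.0 and relevance == 5):
        return 1  # EXTRACT NOW
    if roi >= 1.0 or raw_score >= 3.5:
        return 2  # EXTRACT SOON
    if roi >= 0.5 or raw_score >= 3.0:
        return 3  # EXTRACT LATER
    if roi >= 0.25 or raw_score >= 2.5:
        return 4  # MAYBE
    return 5      # SKIP
```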
Step 9: Create extraction schedule
Build ordered extraction schedule:
- Within Tier 1, sequence by (see the sketch after this list):
  - Foundational sources first
  - Quick wins early (high ROI, low volume)
  - Batch similar sources for efficiency
- Allocate time based on budget:
  - Sum hours needed for each tier
  - If budget is limited, you may not reach lower tiers
  - Leave buffer for unexpected difficulty
- Set checkpoints:
  - After every 10 procedures extracted
  - After completing each source
  - Weekly review of priorities
- Create schedule document with:
  - Rank, source name, ROI, hours, expected procedures
  - Cumulative totals
  - Summary statistics
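A sketch of the Tier 1 sequencing rule as a sort key (batching similar sources is left as a manual pass), assuming each source is a dict with these hypothetical keys:

```python
def schedule_order(sources: list[dict]) -> list[dict]:
    """Order sources: foundational first, then quick wins (ROI per content hour)."""
    return sorted(
        sources,
        key=lambda s: (
            not s.get("foundational", False),          # foundational sorts first
            -s["roi"] / max(s["content_hours"], 0.1),  # high ROI, low volume next
        ),
    )

tier1 = [
    {"name": "A", "roi": 2.5, "content_hours": 12, "foundational": False},
    {"name": "B", "roi": 2.2, "content_hours": 2,  "foundational": True},
]
# B comes first: it is foundational and a quick win (higher ROI per hour)
print([s["name"] for s in schedule_order(tier1)])  # ['B', 'A']
```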
Step 10: Document skip list
For Tier 5 sources and any dropped from lower tiers:
- Record why skipped:
  - Below ROI threshold
  - Insufficient time
  - Redundant with existing library
  - Low credibility
- Decide retention:
  - KEEP FOR LATER: Still valuable, review next quarter
  - DROP: Remove from queue entirely
  - CONDITIONAL: Keep if specific need arises
- Create a skip list table with the columns:

| Source | Raw Score | ROI | Reason | Disposition |
|--------|-----------|-----|--------|-------------|
When to Use
- When you have multiple sources waiting for extraction
- Before starting a procedure extraction sprint
- When building extraction queue from content backlog
- When deciding whether to process new source vs queued sources
- During quarterly library planning
- When extraction time is limited and must be optimized
- After accumulating “watch later” or “read later” items
- When onboarding to GOSM and choosing initial extraction targets
Verification
- All sources scored on all dimensions with evidence
- Weights reflect current priorities
- ROI calculation is consistent across sources
- Top priorities align with actual needs
- Time allocation is realistic and complete
- Skipped sources have clear rationale
Input: $ARGUMENTS
Apply this procedure to the input provided.