JobSourceAgent/jobsource/pipeline.py
2026-06-17 08:38:15 -04:00


"""Batch orchestration: dedup, per-record isolation, cascade, persistence, summary.
Scaffold stub -- not implemented yet.
"""
# TODO (pipeline): implement run_batch() per CLAUDE.md "Pipeline stages".
# run_batch() contract:
# - Accept batch_size, search terms, location, hours_old overrides.
# - Call the job source, dedup by job_id against the DB (skip already-seen jobs).
# - For each new RawJob, run the full cascade (resolve -> careers -> extract) in isolation:
#   one failing record must NEVER abort the batch; catch the error, record the job as
#   failed/needs_review, and continue.
# - Persist each JobResult to the DB and export output/results.csv when done.
# - Print a run summary: per-stage counts + % of new jobs reaching position_found.
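
# Illustrative sketch only, not the project's implementation: one way the contract above
# could be met. The RawJob/JobResult field layouts, the injected callables (fetch_jobs,
# cascade, persist), the seen_ids set standing in for the DB lookup, and the csv_path
# parameter are all assumptions made for this example. Counts are grouped by final
# status here, which may differ from the per-stage counts CLAUDE.md intends.

```python
from __future__ import annotations

import csv
from collections import Counter
from dataclasses import asdict, dataclass, fields
from typing import Callable, Iterable


@dataclass
class RawJob:  # hypothetical shape; only job_id is implied by the contract
    job_id: str
    title: str = ""


@dataclass
class JobResult:  # hypothetical shape
    job_id: str
    status: str  # e.g. "position_found", "needs_review", "failed"
    detail: str = ""


def run_batch(
    fetch_jobs: Callable[..., Iterable[RawJob]],
    cascade: Callable[[RawJob], JobResult],
    seen_ids: set[str],
    persist: Callable[[JobResult], None],
    *,
    batch_size: int = 25,
    search_terms: str | None = None,
    location: str | None = None,
    hours_old: int | None = None,
    csv_path: str | None = None,
) -> dict:
    """Run one batch and return a summary dict (assumed return shape)."""
    raw_jobs = fetch_jobs(
        batch_size=batch_size,
        search_terms=search_terms,
        location=location,
        hours_old=hours_old,
    )

    # Dedup by job_id: skip jobs already seen, including repeats within this batch.
    new_jobs = []
    for job in raw_jobs:
        if job.job_id in seen_ids:
            continue
        seen_ids.add(job.job_id)
        new_jobs.append(job)

    # Per-record isolation: a failing record is recorded as failed, never re-raised.
    results = []
    for job in new_jobs:
        try:
            result = cascade(job)
        except Exception as exc:
            result = JobResult(job.job_id, "failed", repr(exc))
        persist(result)
        results.append(result)

    # Optional CSV export; the target directory is assumed to exist already.
    if csv_path is not None:
        with open(csv_path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=[f.name for f in fields(JobResult)])
            writer.writeheader()
            writer.writerows(asdict(r) for r in results)

    # Run summary: counts by final status plus the share reaching position_found.
    counts = Counter(r.status for r in results)
    pct = 100.0 * counts["position_found"] / len(new_jobs) if new_jobs else 0.0
    print(f"new jobs: {len(new_jobs)}")
    for status, n in sorted(counts.items()):
        print(f"  {status}: {n}")
    print(f"position_found: {pct:.1f}% of new jobs")
    return {"new": len(new_jobs), "counts": dict(counts), "pct_position_found": pct}
```

# Passing the job source, cascade, and persistence step in as callables keeps the sketch
# testable without a real database or network; a real run_batch() would presumably bind
# these to the project's own modules and pass csv_path="output/results.csv".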