Files
JobSourceAgent/jobsource/extract.py
2026-06-17 08:38:15 -04:00

13 lines
779 B
Python

"""Extract one open position URL from a careers page (Stage 3).
Scaffold stub -- not implemented yet.
"""
# TODO (Stage 3): implement per CLAUDE.md "Stage 3 — Extract one open position (return on first hit)".
# Cascade order (return early on first hit):
# 1. ATS JSON — if ATS is already known from Stage 2, return first posting URL directly.
# 2. JobPosting JSON-LD — parse application/ld+json for a `url` field.
# 3. Job-like anchors — first <a> matching /job, /position, /opening, /vacancy in href.
# 4. Cheap-LLM classification — Pydantic AI typed output (classifier_model).
# 5. Browser-agent fallback — handled inside the fused Stage-2 agent call in agent_fallback.py.
# Returns (url: str | None, method: str) so callers know which tier resolved it.