CLI Reference
Command reference for honestroles.
Top-Level Commands
$ honestroles --help
Available commands:
- run
- ingest sync
- ingest validate
- ingest sync-all
- plugins validate
- config validate
- report-quality
- init
- doctor
- reliability check
- adapter infer
- runs list
- runs show
- recommend build-index
- recommend match
- recommend evaluate
- recommend feedback add
- recommend feedback summarize
- publish neondb migrate
- publish neondb sync
- publish neondb verify
- scaffold-plugin
- eda generate
- eda diff
- eda gate
- eda dashboard
Output Formats
Structured-output commands default to JSON and accept --format {json,table}.
- json: stable machine-readable payloads (default).
- table: concise human-readable summaries for terminals and CI logs.
honestroles eda dashboard launches Streamlit and does not use payload formatting.
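A minimal sketch of how a script might call a structured-output command and decode its JSON payload. The helper names (build_argv, decode_payload, run_json) are hypothetical, and the sketch assumes the payload is written to stdout, which this page does not state explicitly.

```python
import json
import subprocess


def build_argv(command, flags, fmt="json"):
    """Assemble an argv list; `command` is e.g. ["runs", "list"]."""
    if fmt not in ("json", "table"):
        raise ValueError(f"unsupported format: {fmt}")
    argv = ["honestroles", *command]
    for name, value in flags.items():
        argv += [name, str(value)]
    return argv + ["--format", fmt]


def decode_payload(stdout):
    """Decode a JSON payload; returns None for empty output."""
    text = stdout.strip()
    return json.loads(text) if text else None


def run_json(command, flags):
    """Run the CLI and return (exit_code, payload).

    Assumption: the JSON payload is printed on stdout.
    """
    proc = subprocess.run(
        build_argv(command, flags), capture_output=True, text=True
    )
    payload = decode_payload(proc.stdout)
    return proc.returncode, payload
```

For example, run_json(["runs", "list"], {"--limit": 10}) would invoke honestroles runs list --limit 10 --format json.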
Command Matrix
| Command | Required flags | Description | Output |
|---|---|---|---|
| honestroles run | --pipeline-config, optional --plugins | Runs runtime pipeline | JSON/table diagnostics |
| honestroles plugins validate | --manifest | Validates and loads plugin manifest | JSON/table plugin listing |
| honestroles config validate | --pipeline | Validates pipeline config | JSON/table normalized config |
| honestroles report-quality | --pipeline-config, optional --plugins | Runs runtime and computes quality report | JSON/table quality summary |
| honestroles ingest sync | --source, --source-ref, optional --output-parquet, --report-file, --state-file, --write-raw, --max-pages, --max-jobs, --full-refresh, --timeout-seconds, --max-retries, --base-backoff-seconds, --user-agent, --quality-policy, --strict-quality, --merge-policy, --retain-snapshots, --prune-inactive-days | Fetches one public ATS source and writes latest parquet + snapshot/report artifacts | JSON/table sync summary |
| honestroles ingest validate | --source, --source-ref, optional --report-file, --write-raw, --max-pages, --max-jobs, --timeout-seconds, --max-retries, --base-backoff-seconds, --user-agent, --quality-policy, --strict-quality | Fetches + normalizes + evaluates ingestion quality without overwriting latest parquet | JSON/table validation summary |
| honestroles ingest sync-all | --manifest, optional --report-file, --fail-fast | Runs multi-source ingestion from ingest.toml in manifest order | JSON/table batch summary |
| honestroles init | --input-parquet, optional --pipeline-config, --plugins-manifest, --output-parquet, --sample-rows, --force | Scaffolds pipeline config + plugin manifest from sample data | JSON/table scaffold summary |
| honestroles doctor | --pipeline-config, optional --plugins, --sample-rows, --policy, --strict | Validates environment, config, schema readiness, output path, and reliability policy thresholds | JSON/table checks + summary |
| honestroles reliability check | --pipeline-config, optional --plugins, --sample-rows, --policy, --output-file, --strict | Runs policy-aware reliability checks and writes gate artifact | JSON/table checks + summary + artifact |
| honestroles adapter infer | --input-parquet, optional --output-file, --sample-rows, --top-candidates, --min-confidence, optional --print | Infers draft [input.adapter] mapping/coercion config from parquet input | JSON/table artifact summary |
| honestroles runs list | optional --limit, --status, --command, --since, --contains-code | Lists recorded run lineage entries with filters | JSON/table run rows |
| honestroles runs show | --run-id | Shows one recorded run lineage payload | JSON/table run record |
| honestroles recommend build-index | --input-parquet, optional --output-dir, --policy | Builds deterministic retrieval index artifacts for API serving | JSON/table index summary |
| honestroles recommend match | --index-dir, exactly one of --candidate-json or --resume-text, optional --profile-id, --top-k, --policy, --include-excluded | Produces ranked, explainable matches from index artifacts | JSON/table match summary |
| honestroles recommend evaluate | --index-dir, --golden-set, optional --thresholds, optional --policy | Computes precision@k / recall@k and enforces thresholds | JSON/table eval summary |
| honestroles recommend feedback add | --profile-id, --job-id, --event, optional --meta-json | Appends feedback event and updates profile weights | JSON/table feedback event summary |
| honestroles recommend feedback summarize | optional --profile-id | Summarizes feedback history and effective profile weights | JSON/table feedback summary |
| honestroles publish neondb migrate | optional --database-url-env, optional --schema | Applies versioned DB migrations under the HonestRoles Neon schema | JSON/table migration summary |
| honestroles publish neondb sync | --jobs-parquet, --index-dir, optional --database-url-env, --schema, --sync-report, --require-quality-pass, --full-refresh, --batch-id | Publishes canonical jobs + features + facets to Neon and records batch metadata | JSON/table sync summary |
| honestroles publish neondb verify | optional --database-url-env, optional --schema | Verifies required tables/functions/migration state for API queries | JSON/table check summary |
| honestroles scaffold-plugin | --name, optional --output-dir | Copies bundled plugin template | JSON/table scaffold path + package name |
| honestroles eda generate | --input-parquet, optional --output-dir, --quality-profile, repeated --quality-weight, --top-k, --max-rows, optional --rules-file | Builds deterministic profile artifacts (summary.json, tables, figures, report) | JSON/table artifact summary |
| honestroles eda diff | --baseline-dir, --candidate-dir, optional --output-dir, optional --rules-file | Compares two profile artifact dirs and writes diff artifacts (diff.json, drift tables) | JSON/table diff summary |
| honestroles eda gate | --candidate-dir, optional --baseline-dir, optional --rules-file, optional --fail-on, optional --warn-on | Evaluates gate policy and drift thresholds for CI | JSON/table gate summary + exit status |
| honestroles eda dashboard | --artifacts-dir, optional --diff-dir, optional --host, --port | Launches Streamlit artifact viewer | Process exit code |
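The "exactly one of --candidate-json or --resume-text" rule for recommend match is the kind of constraint a required mutually exclusive group expresses. The sketch below illustrates that pattern with Python's argparse. It is not taken from the honestroles source, and the parser name is hypothetical.

```python
import argparse


def build_match_parser():
    """Illustrative parser mirroring the documented recommend match flags."""
    parser = argparse.ArgumentParser(prog="recommend-match-sketch")
    parser.add_argument("--index-dir", required=True)
    # required=True plus mutual exclusion means: exactly one must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--candidate-json")
    source.add_argument("--resume-text")
    # No default is documented for --top-k, so none is assumed here.
    parser.add_argument("--top-k", type=int, default=None)
    return parser
```

With this parser, supplying both flags or neither makes argparse print a usage error and exit, while supplying exactly one parses cleanly.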
ingest sync, ingest validate, and ingest sync-all
--source-ref values:
- greenhouse: board token
- lever: site/company handle
- ashby: job board name
- workable: company subdomain with public Workable careers API access
Default per-source output locations:
- latest parquet: dist/ingest/<source>/<source_ref>/jobs.parquet
- source report: dist/ingest/<source>/<source_ref>/sync_report.json
- state file: .honestroles/ingest/state.json
- snapshots directory: dist/ingest/<source>/<source_ref>/snapshots/
- catalog parquet: dist/ingest/<source>/<source_ref>/catalog.parquet
- optional raw payload: dist/ingest/<source>/<source_ref>/raw.jsonl with --write-raw (or adjacent to --output-parquet when that flag is set)
Default batch report location:
- dist/ingest/sync_all_report.json when --report-file is omitted.
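The default per-source locations above follow one pattern, which a downstream script could reproduce as in this sketch. The function name and the dictionary keys are hypothetical; only the path layout comes from the list above.

```python
from pathlib import PurePosixPath


def default_ingest_paths(source, source_ref):
    """Return the documented default artifact paths for one source."""
    root = PurePosixPath("dist/ingest") / source / source_ref
    return {
        "latest_parquet": str(root / "jobs.parquet"),
        "report": str(root / "sync_report.json"),
        "snapshots_dir": str(root / "snapshots"),
        "catalog_parquet": str(root / "catalog.parquet"),
        # Only written when --write-raw is passed.
        "raw": str(root / "raw.jsonl"),
        # The state file is shared and not per-source.
        "state_file": ".honestroles/ingest/state.json",
    }
```

These defaults no longer apply to the flags you override, such as --output-parquet, --report-file, or --state-file.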
ingest sync report payload fields include:
- schema_version
- status
- source, source_ref
- request_count, fetched_count, normalized_count, dedup_dropped
- new_count, updated_count, unchanged_count
- skipped_by_state, tombstoned_count, coverage_complete
- retry_count, http_status_counts
- quality_status, quality_summary, quality_check_codes
- key_field_completeness (company_non_null_pct, posted_at_non_null_pct, description_text_non_null_pct, location_or_remote_signal_pct)
- stage_timings_ms, warnings
- warning codes can include:
  - INGEST_TRUNCATED (run hit limits or could not fully cover source)
  - INGEST_PAGE_REPEAT_DETECTED (source pagination repeated the same page payload)
- merge_policy, retained_snapshot_count, pruned_snapshot_count, pruned_inactive_count
- quality_policy_source, quality_policy_hash
- high_watermark_before, high_watermark_after
- output_paths (latest parquet, report, snapshot parquet, catalog parquet, state file, optional raw)
- optional error (type, message) on failures
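As a sketch of how a CI step might inspect a sync report, the helper below collects the coverage, warning, and error signals listed above. It assumes warnings is a list of code strings and coverage_complete is a boolean; the exact value shapes are not specified on this page, and the function name is hypothetical.

```python
KNOWN_WARNING_CODES = ("INGEST_TRUNCATED", "INGEST_PAGE_REPEAT_DETECTED")


def sync_report_issues(report):
    """Return human-readable issues found in a decoded sync report dict."""
    issues = []
    error = report.get("error")
    if error:
        issues.append(f"error: {error.get('type')}: {error.get('message')}")
    if report.get("coverage_complete") is False:
        issues.append("coverage incomplete")
    # Assumption: `warnings` holds plain warning-code strings.
    for code in report.get("warnings", []):
        if code in KNOWN_WARNING_CODES:
            issues.append(f"warning: {code}")
    return issues
```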
ingest validate payload fields include:
- schema_version, status
- source, source_ref
- request_count, fetched_count, normalized_count, dedup_dropped
- quality_status, quality_summary, quality_check_codes
- key_field_completeness
- rows_evaluated
- stage_timings_ms, warnings
- output_paths (validation report, optional raw JSONL)
- optional error (type, message) on failures
ingest sync-all batch payload fields include:
- schema_version, status
- started_at_utc, finished_at_utc, duration_ms
- total_sources, pass_count, fail_count
- total_rows_written, total_fetched_count, total_request_count
- quality_summary
- key_field_completeness (weighted aggregate across successful source runs)
- stage_timings_ms
- sources (one entry per attempted source)
- report_file
- check_codes (aggregate warn codes)
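The batch key_field_completeness is described as a weighted aggregate, but this page does not say what the weights are. The sketch below shows one plausible reading, a row-count-weighted mean over successful runs. Both the weighting choice and the function name are assumptions, not the tool's documented behavior.

```python
def weighted_completeness(runs):
    """Row-weighted mean of completeness percentages.

    `runs` is an iterable of (row_count, {metric_name: pct}) pairs,
    one per successful source run. Weighting by row count is an
    assumption made for illustration.
    """
    totals = {}
    weight_sum = 0
    for rows, metrics in runs:
        if rows <= 0:
            continue  # an empty run contributes no weight
        weight_sum += rows
        for name, pct in metrics.items():
            totals[name] = totals.get(name, 0.0) + pct * rows
    if weight_sum == 0:
        return {}
    return {name: total / weight_sum for name, total in totals.items()}
```

Under this reading, a source with 300 rows at 80% and one with 100 rows at 100% aggregate to 85%, not the unweighted 90%.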
For full manifest schema details, see Ingest Manifest Schema. For quality policy schema details, see Ingest Quality Policy Schema.
recommend build-index, recommend match, recommend evaluate, recommend feedback
recommend build-index default output root is:
- dist/recommend/index/<index_id>/manifest.json
- dist/recommend/index/<index_id>/jobs_latest.jsonl
- dist/recommend/index/<index_id>/facets.json
- dist/recommend/index/<index_id>/quality_summary.json
- dist/recommend/index/<index_id>/shards/*.json
recommend match agent-facing result fields:
- job_id
- score
- match_reasons
- required_missing_skills
- apply_url
- posted_at
- source
- quality_flags
With --include-excluded, payload additionally includes excluded_jobs and
deterministic exclude_reasons codes.
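An agent consuming match results might post-filter them on the documented fields, as in this sketch. It takes a plain list of result dicts and assumes required_missing_skills is a list and score is numeric. How results are nested inside the full payload is not shown on this page, so extracting that list is left to the caller. The function name is hypothetical.

```python
def fully_qualified(results, min_score=0.0):
    """Keep results with no missing required skills, best score first."""
    keep = [
        r for r in results
        if not r.get("required_missing_skills")
        and r.get("score", 0.0) >= min_score
    ]
    return sorted(keep, key=lambda r: r.get("score", 0.0), reverse=True)
```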
recommend evaluate computes averaged precision@k and recall@k from a
golden-set JSON and enforces recommend_eval.toml thresholds.
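For reference, the textbook definitions of precision@k and recall@k, averaged over cases, can be sketched as follows. This uses the common convention of dividing precision by k. Whether honestroles uses exactly this convention, and how the golden-set JSON is structured, are not stated here, so the input shape below is hypothetical.

```python
def precision_recall_at_k(ranked_ids, relevant_ids, k):
    """Standard precision@k and recall@k for one ranked list."""
    top = ranked_ids[:k]
    hits = sum(1 for job_id in top if job_id in relevant_ids)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall


def averaged_metrics(cases, k):
    """Average both metrics over (ranked_ids, relevant_ids) cases."""
    pairs = [precision_recall_at_k(r, rel, k) for r, rel in cases]
    if not pairs:
        return 0.0, 0.0
    n = len(pairs)
    return sum(p for p, _ in pairs) / n, sum(r for _, r in pairs) / n
```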
recommend feedback add/summarize persists feedback state in:
- .honestroles/recommend/feedback/events.jsonl
- .honestroles/recommend/feedback/weights/<profile_id>.json
publish neondb migrate, publish neondb sync, and publish neondb verify
Default schema: honestroles_api.
Managed DB objects include:
- tables: jobs_live, job_features, job_facets, publish_batches, feedback_events, profile_weights, profile_cache, migration_history
- function: match_jobs_v1(candidate jsonb, top_k int, include_excluded boolean, policy_override jsonb)
publish neondb sync notes:
- --require-quality-pass defaults to enabled.
- If --sync-report is omitted, the command attempts <jobs_parquet_dir>/sync_report.json.
- --full-refresh forces a truncation-based rebuild before upsert.
- Feedback sidecar state from .honestroles/recommend/feedback/ is synchronized into DB feedback/weight tables when present.
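The --sync-report fallback described above can be sketched as a small path rule: an explicit value wins, otherwise the report is looked up next to the jobs parquet. The function name is hypothetical, and the sketch only computes the path without checking that the file exists.

```python
from pathlib import PurePosixPath


def resolve_sync_report(jobs_parquet, sync_report=None):
    """Return the sync report path the command would look for."""
    if sync_report is not None:
        return str(sync_report)
    # Fallback: <jobs_parquet_dir>/sync_report.json
    return str(PurePosixPath(jobs_parquet).parent / "sync_report.json")
```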
Run Lineage
Tracked commands write a run record to:
.honestroles/runs/<run_id>/run.json
Tracked commands:
- run
- report-quality
- adapter infer
- eda generate
- eda diff
- eda gate
- reliability check
- ingest sync
- ingest validate
- ingest sync-all
- recommend build-index
- recommend match
- recommend evaluate
- recommend feedback add
- recommend feedback summarize
- publish neondb migrate
- publish neondb sync
- publish neondb verify
Run schema fields include:
- schema_version
- run_id
- command
- status
- started_at_utc, finished_at_utc, duration_ms
- input_hash, input_hashes
- config_hash
- artifact_paths
- check_codes
- ingest_metrics (for ingest commands)
- recommend_metrics (for recommend commands)
- publish_metrics (for publish commands)
- error (present on failures)
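Because run records are plain JSON files under .honestroles/runs/<run_id>/run.json, they can also be read directly, as in this sketch. The helper names are hypothetical, and the filter assumes check_codes is a list of code strings. The runs list and runs show commands remain the supported interface.

```python
import json
from pathlib import Path


def load_runs(root=".honestroles/runs"):
    """Load every run.json found one directory below `root`."""
    records = []
    for path in sorted(Path(root).glob("*/run.json")):
        with path.open(encoding="utf-8") as handle:
            records.append(json.load(handle))
    return records


def runs_with_code(records, code):
    """Keep records whose check_codes contain `code`."""
    # Assumption: check_codes is a list of plain strings.
    return [r for r in records if code in r.get("check_codes", [])]
```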
Exit Codes
| Exit code | Meaning |
|---|---|
| 0 | Success. Includes doctor/reliability check statuses pass and warn when --strict is not set. |
| 1 | Generic HonestRolesError, failed eda gate policy, doctor/reliability check status fail, strict escalation of warn, or failed ingestion batch/source. |
| 2 | ConfigValidationError (invalid args, invalid/missing config, bad run lookup, manifest/state parse errors, etc.). |
| 3 | Plugin load/validation/execution failure. |
| 4 | StageExecutionError. |
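A wrapper script could translate these exit codes into short log messages, as in the sketch below. The mapping paraphrases the table above; the names EXIT_CODE_MEANINGS and describe_exit_code are hypothetical.

```python
EXIT_CODE_MEANINGS = {
    0: "success",
    1: "generic error, failed gate or check, strict escalation of warn, "
       "or failed ingestion",
    2: "config validation error (invalid args or config, bad run lookup, "
       "parse errors)",
    3: "plugin load, validation, or execution failure",
    4: "stage execution error",
}


def describe_exit_code(code):
    """Map a process exit code to the meaning documented above."""
    return EXIT_CODE_MEANINGS.get(code, f"unexpected exit code {code}")
```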
Examples
$ honestroles init --input-parquet data/jobs.parquet --pipeline-config pipeline.toml --plugins-manifest plugins.toml
$ honestroles ingest sync --source greenhouse --source-ref stripe --quality-policy ingest_quality.toml --strict-quality --merge-policy updated_hash --retain-snapshots 30 --prune-inactive-days 90 --timeout-seconds 20 --max-retries 4 --base-backoff-seconds 0.5 --user-agent "honestroles-batch/1.0" --format table
$ honestroles ingest validate --source greenhouse --source-ref stripe --quality-policy ingest_quality.toml --strict-quality --format table
$ honestroles ingest sync-all --manifest ingest.toml --format table
$ honestroles doctor --pipeline-config pipeline.toml --plugins plugins.toml --policy reliability.toml --format table
$ honestroles reliability check --pipeline-config pipeline.toml --plugins plugins.toml --strict --output-file dist/reliability/latest/gate_result.json --format table
$ honestroles run --pipeline-config pipeline.toml --plugins plugins.toml --format table
$ honestroles adapter infer --input-parquet data/jobs.parquet --output-file dist/adapters/adapter-draft.toml
$ honestroles runs list --limit 10 --command ingest.sync-all --contains-code INGEST_TRUNCATED --format table
$ honestroles runs show --run-id <run_id>
$ honestroles recommend build-index --input-parquet dist/ingest/greenhouse/stripe/jobs.parquet --policy recommendation.toml
$ honestroles recommend match --index-dir dist/recommend/index/<index_id> --candidate-json examples/candidate.json --top-k 25 --include-excluded
$ honestroles recommend evaluate --index-dir dist/recommend/index/<index_id> --golden-set examples/recommend_golden_set.json --thresholds recommend_eval.toml
$ honestroles recommend feedback add --profile-id jane_doe --job-id 12345 --event interviewed
$ honestroles recommend feedback summarize --profile-id jane_doe
$ honestroles publish neondb migrate --database-url-env NEON_DATABASE_URL --schema honestroles_api
$ honestroles publish neondb sync --database-url-env NEON_DATABASE_URL --schema honestroles_api --jobs-parquet dist/ingest/greenhouse/stripe/jobs.parquet --index-dir dist/recommend/index/<index_id> --sync-report dist/ingest/greenhouse/stripe/sync_report.json --require-quality-pass --format table
$ honestroles publish neondb verify --database-url-env NEON_DATABASE_URL --schema honestroles_api --format table
$ honestroles eda generate --input-parquet data/jobs.parquet --output-dir dist/eda/latest
$ honestroles eda diff --baseline-dir dist/eda/baseline --candidate-dir dist/eda/candidate --output-dir dist/eda/diff
$ honestroles eda gate --candidate-dir dist/eda/candidate --baseline-dir dist/eda/baseline --rules-file eda-rules.toml
$ honestroles eda dashboard --artifacts-dir dist/eda/candidate --diff-dir dist/eda/diff