
Pipeline run

5f26dd63-d84b-41e3-a586-2d142c8e3fd1

Pipeline LLM cost (USD)
API 1: $0.0029 · API 2: $0.0000 · API 3: $0.0000 · Total: $0.0029

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description
SPARSE JD
Nature of work
no_db_connection
Tech stack maturity
Mainstream Modern
AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)
0.20 / 5
· Title match
Has AI skill
· AI skill (primary)
· AI skill (secondary)
· On AI team
· Builds AI products
vocab breakdown (legacy)
Assistants (×1):
Frameworks (×2):
Models / concepts (×3): ML
Evidence — skills matched in JD (11)
Python, SQL, Apache Spark, Apache Airflow, Snowflake, dbt, Apache Kafka, AWS, Amazon RDS, Amazon S3, Apache Flink
Skill cluster (0 dimension groups, role-scoped)
No dimension groups computed for this JD.
KRA description
- Build and maintain Airflow DAGs for ETL pipelines from RDS to S3 to Snowflake
- Design and optimize data warehouse schemas in Snowflake
- Manage Spark jobs for large-scale data transformations
- Build streaming pipelines using Kafka and Flink
- Ensure data quality and observability across all pipelines
- Partner with analytics to expose curated datasets
Python, SQL, Spark, Airflow, Snowflake, dbt, Kafka, AWS

Signals

Skill: data-engineer (0.82)
Alias: ml-engineer (1.00)
KRA: data-engineer (0.45)
Status: extract_from_jd_done · Created: 2026-05-18T20:25:53.886179Z · Updated: 2026-05-18T20:25:53.886179Z
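
The Skill signal appears to be a plain overlap ratio: in the API 1 response for this run, every `skill_match_roles` score equals `matched_count / total_count` rounded to four decimals (9 of 11 for data-engineer). A minimal sketch under that assumption follows; the function name is hypothetical and the service may compute the score differently.

```python
def skill_match_score(matched_count: int, total_count: int) -> float:
    """Overlap ratio between JD skills and a role's skill set.

    Assumption: score = matched / total, rounded to 4 decimals, which
    reproduces the values reported for this run.
    """
    if total_count <= 0:
        return 0.0  # no JD skills extracted, so nothing can match
    return round(matched_count / total_count, 4)


# Values taken from this run's skill_match_roles entries.
data_engineer = skill_match_score(9, 11)    # 0.8182
backend_engineer = skill_match_score(3, 11) # 0.2727
ml_engineer = skill_match_score(2, 11)      # 0.1818
```

Under this reading, the 0.82 shown for data-engineer is the rounded display of 0.8182, while the title alias alone points to ml-engineer.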
Flow: current 3-step pipeline

1 POST /skills/extract-from-jd

2 POST /skills/extract-details

3 POST /skills/final-role-output
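
The three calls above run in order. The sketch below shows one way such a chain could be driven, assuming each step's JSON response is forwarded as the next step's request body and that an empty response halts the run. The `post` callable, the `jd_text` payload key, and the halting rule are all hypothetical; only the endpoint paths come from this page.

```python
from typing import Callable, List, Tuple

# Endpoint paths as listed in the flow; everything else is illustrative.
STEPS: List[str] = [
    "/skills/extract-from-jd",
    "/skills/extract-details",
    "/skills/final-role-output",
]

# Hypothetical transport: takes (path, body) and returns the parsed JSON reply.
Post = Callable[[str, dict], dict]


def run_pipeline(post: Post, jd_text: str) -> List[Tuple[str, dict]]:
    """Call each step in order, stopping after the first empty response."""
    results: List[Tuple[str, dict]] = []
    body: dict = {"jd_text": jd_text}  # assumed request shape
    for path in STEPS:
        response = post(path, body)
        results.append((path, response))
        if not response:
            break  # nothing to forward to the next step
        body = response
    return results
```

Injecting `post` keeps the sketch independent of any HTTP client, so it can be exercised with a stub before being pointed at a real service.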

Role: chosen role & resolution

No chosen role stored for this run.

Job description

ML Engineer — DataCo

We're hiring an ML Engineer to own our data infrastructure.

Responsibilities:
- Build and maintain Airflow DAGs for ETL pipelines from RDS to S3 to Snowflake
- Design and optimize data warehouse schemas in Snowflake
- Manage Spark jobs for large-scale data transformations
- Build streaming pipelines using Kafka and Flink
- Ensure data quality and observability across all pipelines
- Partner with analytics to expose curated datasets

Required skills: Python, SQL, Spark, Airflow, Snowflake, dbt, Kafka, AWS

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

Python · Primary · No API 2 row (run stopped after API 1 or history missing)
SQL · Primary · No API 2 row (run stopped after API 1 or history missing)
Apache Spark · Primary · No API 2 row (run stopped after API 1 or history missing)
Apache Airflow · Primary · No API 2 row (run stopped after API 1 or history missing)
Snowflake · Primary · No API 2 row (run stopped after API 1 or history missing)
dbt · Primary · No API 2 row (run stopped after API 1 or history missing)
Apache Kafka · Primary · No API 2 row (run stopped after API 1 or history missing)
AWS · Primary · No API 2 row (run stopped after API 1 or history missing)
Amazon RDS · Primary · No API 2 row (run stopped after API 1 or history missing)
Amazon S3 · Primary · No API 2 row (run stopped after API 1 or history missing)
Apache Flink · Primary · No API 2 row (run stopped after API 1 or history missing)

Library artifacts (this run)

No artifact rows for this run.
nano JD Parser — gpt-4.1-nano
Role: ML Engineer
Company: DataCo
Domain: Other
JD type: pass
Raw JSON
{
  "JD_type": "pass",
  "about_company": null,
  "certifications": [],
  "company_name": "DataCo",
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [],
      "domain": "Other"
    },
    "secondary": null
  },
  "education": [],
  "experience": {
    "max": null,
    "min": null,
    "raw": null
  },
  "job_locations": [],
  "role": "ML Engineer",
  "role_archetype": "Data",
  "roles_and_responsibilities": [
    {
      "bullet_count": 6,
      "heading": "Responsibilities",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "Build and maintain Airflow DAGs",
        "last_5_words": "to expose curated datasets"
      },
      "text": "Build and maintain Airflow DAGs for ETL pipelines from RDS to S3 to Snowflake\nDesign and optimize data warehouse schemas in Snowflake\nManage Spark jobs for large-scale data transformations\nBuild streaming pipelines using Kafka and Flink\nEnsure data quality and observability across all pipelines\nPartner with analytics to expose curated datasets",
      "word_count": 56
    },
    {
      "bullet_count": 0,
      "heading": "Required skills",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "Python, SQL, Spark, Airflow,",
        "last_5_words": "Snowflake, dbt, Kafka, AWS"
      },
      "text": "Python, SQL, Spark, Airflow, Snowflake, dbt, Kafka, AWS",
      "word_count": 8
    }
  ],
  "urls": []
}
API 1 — extract-from-jd
{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Python"
    },
    {
      "is_primary": true,
      "skill_name": "SQL"
    },
    {
      "is_primary": true,
      "skill_name": "Apache Spark"
    },
    {
      "is_primary": true,
      "skill_name": "Apache Airflow"
    },
    {
      "is_primary": true,
      "skill_name": "Snowflake"
    },
    {
      "is_primary": true,
      "skill_name": "dbt"
    },
    {
      "is_primary": true,
      "skill_name": "Apache Kafka"
    },
    {
      "is_primary": true,
      "skill_name": "AWS"
    },
    {
      "is_primary": true,
      "skill_name": "Amazon RDS"
    },
    {
      "is_primary": true,
      "skill_name": "Amazon S3"
    },
    {
      "is_primary": true,
      "skill_name": "Apache Flink"
    }
  ],
  "jd_role": {
    "display_name": "ML Engineer",
    "rationale": null,
    "role_archetype": "Data",
    "slug": ""
  },
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": null,
    "certifications": [],
    "company_name": "DataCo",
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [],
        "domain": "Other"
      },
      "secondary": null
    },
    "education": [],
    "experience": {
      "max": null,
      "min": null,
      "raw": null
    },
    "job_locations": [],
    "role": "ML Engineer",
    "role_archetype": "Data",
    "roles_and_responsibilities": [
      {
        "bullet_count": 6,
        "heading": "Responsibilities",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "Build and maintain Airflow DAGs",
          "last_5_words": "to expose curated datasets"
        },
        "text": "Build and maintain Airflow DAGs for ETL pipelines from RDS to S3 to Snowflake\nDesign and optimize data warehouse schemas in Snowflake\nManage Spark jobs for large-scale data transformations\nBuild streaming pipelines using Kafka and Flink\nEnsure data quality and observability across all pipelines\nPartner with analytics to expose curated datasets",
        "word_count": 56
      },
      {
        "bullet_count": 0,
        "heading": "Required skills",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "Python, SQL, Spark, Airflow,",
          "last_5_words": "Snowflake, dbt, Kafka, AWS"
        },
        "text": "Python, SQL, Spark, Airflow, Snowflake, dbt, Kafka, AWS",
        "word_count": 8
      }
    ],
    "urls": []
  },
  "run_id": null,
  "stage3_signals": {
    "alias_match_roles": [
      {
        "display_name": "ML Engineer",
        "matched_count": null,
        "role_id": 3,
        "score": 1.0,
        "slug": "ml-engineer",
        "total_count": null
      }
    ],
    "kra_match_roles": [
      {
        "display_name": "Data Engineer",
        "matched_count": null,
        "role_id": 2,
        "score": 0.4495,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "DevOps Engineer",
        "matched_count": null,
        "role_id": 10,
        "score": 0.374,
        "slug": "devops-engineer",
        "total_count": null
      },
      {
        "display_name": "Cloud Architect",
        "matched_count": null,
        "role_id": 9,
        "score": 0.3738,
        "slug": "cloud-architect",
        "total_count": null
      },
      {
        "display_name": "ML Engineer",
        "matched_count": null,
        "role_id": 3,
        "score": 0.372,
        "slug": "ml-engineer",
        "total_count": null
      },
      {
        "display_name": "Backend Engineer",
        "matched_count": null,
        "role_id": 1,
        "score": 0.3593,
        "slug": "backend-engineer",
        "total_count": null
      }
    ],
    "skill_match_roles": [
      {
        "display_name": "Data Engineer",
        "matched_count": 9,
        "role_id": 2,
        "score": 0.8182,
        "slug": "data-engineer",
        "total_count": 11
      },
      {
        "display_name": "Backend Engineer",
        "matched_count": 3,
        "role_id": 1,
        "score": 0.2727,
        "slug": "backend-engineer",
        "total_count": 11
      },
      {
        "display_name": "Cybersecurity Engineer",
        "matched_count": 2,
        "role_id": 5,
        "score": 0.1818,
        "slug": "cybersecurity-engineer",
        "total_count": 11
      },
      {
        "display_name": "ML Engineer",
        "matched_count": 2,
        "role_id": 3,
        "score": 0.1818,
        "slug": "ml-engineer",
        "total_count": 11
      },
      {
        "display_name": "Cloud Architect",
        "matched_count": 2,
        "role_id": 9,
        "score": 0.1818,
        "slug": "cloud-architect",
        "total_count": 11
      }
    ],
    "stage35_ran": false
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "E",
    "chosen_role": null,
    "confidence": 0.0,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "queued": true,
    "reasoning": "low_kra: top KRA 0.45 \u003c 0.55"
  },
  "stage5_updates": null
}
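
The `stage4_decision` above queues the JD without a chosen role because the best KRA match (0.4495, shown as 0.45) falls below 0.55. A minimal sketch of that single check is given here, assuming the threshold is the 0.55 quoted in the `reasoning` string. The function name is hypothetical, and the other decision cases and the meaning of case "E" are not modeled because this run does not show them.

```python
from typing import List, Optional

KRA_THRESHOLD = 0.55  # taken from the reasoning string; assumed to be a fixed cutoff


def low_kra_decision(
    kra_match_roles: List[dict], threshold: float = KRA_THRESHOLD
) -> Optional[dict]:
    """Return a 'queued' decision when the top KRA score is below the cutoff.

    Returns None when the top score clears the threshold, leaving the
    remaining (unmodeled) decision logic to the caller.
    """
    top = max((role["score"] for role in kra_match_roles), default=0.0)
    if top < threshold:
        return {
            "chosen_role": None,
            "confidence": 0.0,
            "queued": True,
            "reasoning": f"low_kra: top KRA {top:.2f} < {threshold:.2f}",
        }
    return None


# KRA scores from this run's kra_match_roles.
decision = low_kra_decision([
    {"slug": "data-engineer", "score": 0.4495},
    {"slug": "devops-engineer", "score": 0.374},
    {"slug": "cloud-architect", "score": 0.3738},
    {"slug": "ml-engineer", "score": 0.372},
    {"slug": "backend-engineer", "score": 0.3593},
])
```

In this run the check fires even though the skill signal favors data-engineer at 0.8182 and the title alias matches ml-engineer at 1.0, which is consistent with the run stopping at `extract_from_jd_done`.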
API 2 — extract-details
{}
API 3 — final-role-output
{}
