Pipeline run
70c135a4-f43f-4d60-86bc-d84ce710a6b2
Pipeline LLM cost (USD)
API 1: $0.0029
API 2: $0.0000
API 3: $0.0000
Total: $0.0029
Client output enrichment
v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description
SPARSE JD
Nature of work
—
Tech stack maturity
Mainstream Modern
AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)
0.20 / 5
· Title match
✓ Has AI skill
· AI skill (primary)
· AI skill (secondary)
· On AI team
· Builds AI products
vocab breakdown (legacy)
Assistants (×1):
—
Frameworks (×2):
—
Models / concepts (×3):
ML
Evidence — skills matched in JD (11)
Python
SQL
Apache Spark
Apache Airflow
Snowflake
dbt
Apache Kafka
AWS
Amazon RDS
Amazon S3
Apache Flink
Skill cluster (0 dimension groups, role-scoped)
KRA description
Build and maintain Airflow DAGs for ETL pipelines from RDS to S3 to Snowflake
Design and optimize data warehouse schemas in Snowflake
Manage Spark jobs for large-scale data transformations
Build streaming pipelines using Kafka and Flink
Ensure data quality and observability across all pipelines
Partner with analytics to expose curated datasets
Python, SQL, Spark, Airflow, Snowflake, dbt, Kafka, AWS
Signals
Skill
data-engineer
0.82
Alias
ml-engineer
1.00
KRA
data-engineer
0.36
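The three values in this panel appear to be the top-ranked entry of each list under `stage3_signals` in the API 1 response, shown to two decimals. For the skill signal, the reported score also matches `matched_count / total_count` (9 of 11 extracted skills). The sketch below reproduces those numbers under that assumption; the helper name `top_signal` is hypothetical and is not taken from the pipeline.

```python
# Hypothetical helper (not the pipeline's code): pick the highest-scoring
# role from one signal list and format it the way the Signals panel shows it.

def top_signal(rows):
    """Return (slug, score formatted to 2 decimals) for the best row."""
    best = max(rows, key=lambda r: r["score"])
    return best["slug"], f"{best['score']:.2f}"

# Top rows copied from stage3_signals in this run's API 1 response.
skill_rows = [
    {"slug": "data-engineer", "score": 0.8182, "matched_count": 9, "total_count": 11},
    {"slug": "backend-engineer", "score": 0.2727, "matched_count": 3, "total_count": 11},
    {"slug": "ml-engineer", "score": 0.1818, "matched_count": 2, "total_count": 11},
]
alias_rows = [{"slug": "ml-engineer", "score": 1.0}]
kra_rows = [
    {"slug": "data-engineer", "score": 0.3554},
    {"slug": "cybersecurity-engineer", "score": 0.3274},
]

skill_top = top_signal(skill_rows)
alias_top = top_signal(alias_rows)
kra_top = top_signal(kra_rows)

# Assumption: the skill score is the matched/total ratio, rounded to 4 places.
skill_ratios = [round(r["matched_count"] / r["total_count"], 4) for r in skill_rows]
```

Note that the alias signal points at `ml-engineer` while the skill and KRA signals both point at `data-engineer`, which is the disagreement this run surfaces.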
Status:
extract_from_jd_done
Created: 2026-05-18T18:20:37.521229Z
Updated: 2026-05-18T18:20:37.521229Z
Flow
Current 3-step pipeline
1 POST /skills/extract-from-jd
2 POST /skills/extract-details
3 POST /skills/final-role-output
Role
Chosen role & resolution
No chosen role stored for this run.
Job description
ML Engineer — DataCo
We're hiring an ML Engineer to own our data infrastructure.
Responsibilities:
- Build and maintain Airflow DAGs for ETL pipelines from RDS to S3 to Snowflake
- Design and optimize data warehouse schemas in Snowflake
- Manage Spark jobs for large-scale data transformations
- Build streaming pipelines using Kafka and Flink
- Ensure data quality and observability across all pipelines
- Partner with analytics to expose curated datasets
Required skills: Python, SQL, Spark, Airflow, Snowflake, dbt, Kafka, AWS
Skills from this JD
Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.
Python
Primary
No API 2 row (run stopped after API 1 or history missing)
SQL
Primary
No API 2 row (run stopped after API 1 or history missing)
Apache Spark
Primary
No API 2 row (run stopped after API 1 or history missing)
Apache Airflow
Primary
No API 2 row (run stopped after API 1 or history missing)
Snowflake
Primary
No API 2 row (run stopped after API 1 or history missing)
dbt
Primary
No API 2 row (run stopped after API 1 or history missing)
Apache Kafka
Primary
No API 2 row (run stopped after API 1 or history missing)
AWS
Primary
No API 2 row (run stopped after API 1 or history missing)
Amazon RDS
Primary
No API 2 row (run stopped after API 1 or history missing)
Amazon S3
Primary
No API 2 row (run stopped after API 1 or history missing)
Apache Flink
Primary
No API 2 row (run stopped after API 1 or history missing)
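The merge described at the top of this section can be sketched as a lookup keyed by skill name, with the fallback message used whenever API 2 returned nothing for that skill. This is only an illustration: the shape of an API 2 row is not shown anywhere in this run (the response is `{}`), so the `api2_rows` mapping, the `merge_rows` helper, and the "Secondary" label are all assumptions.

```python
# Hypothetical sketch of the per-skill row merge. Only the API 1 fields
# (skill_name, is_primary) and the fallback text come from this run.

MISSING_API2 = "No API 2 row (run stopped after API 1 or history missing)"

def merge_rows(final_skills, api2_rows):
    """Combine API 1 skills with API 2 details, if any.

    api2_rows is an assumed mapping of skill_name -> details; it is empty
    for this run because API 2 never produced output.
    """
    rows = []
    for skill in final_skills:
        name = skill["skill_name"]
        rows.append({
            "skill": name,
            # "Secondary" is an assumed label; this run only shows "Primary".
            "tier": "Primary" if skill["is_primary"] else "Secondary",
            "api2": api2_rows.get(name, MISSING_API2),
        })
    return rows

# First three skills from this run's API 1 response; API 2 returned {}.
final_skills = [
    {"is_primary": True, "skill_name": "Python"},
    {"is_primary": True, "skill_name": "SQL"},
    {"is_primary": True, "skill_name": "Apache Spark"},
]
merged = merge_rows(final_skills, {})
```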
Library artifacts (this run)
No artifact rows for this run.
nano JD Parser — gpt-4.1-nano
Role: ML Engineer
Company: DataCo
Domain: Other
JD type: pass
Raw JSON
{
"JD_type": "pass",
"about_company": null,
"certifications": [],
"company_name": "DataCo",
"ctc": null,
"domain": {
"primary": {
"aliases": [],
"domain": "Other"
},
"secondary": null
},
"education": [],
"experience": {
"max": null,
"min": null,
"raw": null
},
"job_locations": [],
"role": "ML Engineer",
"role_archetype": "Data",
"roles_and_responsibilities": [
{
"bullet_count": 6,
"heading": "Responsibilities",
"heading_was_present": true,
"source_marker": {
"first_5_words": "Build and maintain Airflow DAGs",
"last_5_words": "expose curated datasets"
},
"text": "Build and maintain Airflow DAGs for ETL pipelines from RDS to S3 to Snowflake\nDesign and optimize data warehouse schemas in Snowflake\nManage Spark jobs for large-scale data transformations\nBuild streaming pipelines using Kafka and Flink\nEnsure data quality and observability across all pipelines\nPartner with analytics to expose curated datasets",
"word_count": 56
},
{
"bullet_count": 0,
"heading": "Required skills",
"heading_was_present": true,
"source_marker": {
"first_5_words": "Python, SQL, Spark, Airflow,",
"last_5_words": "Snowflake, dbt, Kafka, AWS"
},
"text": "Python, SQL, Spark, Airflow, Snowflake, dbt, Kafka, AWS",
"word_count": 8
}
],
"urls": []
}
API 1 — extract-from-jd
{
"final_skills": [
{
"is_primary": true,
"skill_name": "Python"
},
{
"is_primary": true,
"skill_name": "SQL"
},
{
"is_primary": true,
"skill_name": "Apache Spark"
},
{
"is_primary": true,
"skill_name": "Apache Airflow"
},
{
"is_primary": true,
"skill_name": "Snowflake"
},
{
"is_primary": true,
"skill_name": "dbt"
},
{
"is_primary": true,
"skill_name": "Apache Kafka"
},
{
"is_primary": true,
"skill_name": "AWS"
},
{
"is_primary": true,
"skill_name": "Amazon RDS"
},
{
"is_primary": true,
"skill_name": "Amazon S3"
},
{
"is_primary": true,
"skill_name": "Apache Flink"
}
],
"jd_role": {
"display_name": "ML Engineer",
"rationale": null,
"role_archetype": "Data",
"slug": ""
},
"nano_parsed": {
"JD_type": "pass",
"about_company": null,
"certifications": [],
"company_name": "DataCo",
"ctc": null,
"domain": {
"primary": {
"aliases": [],
"domain": "Other"
},
"secondary": null
},
"education": [],
"experience": {
"max": null,
"min": null,
"raw": null
},
"job_locations": [],
"role": "ML Engineer",
"role_archetype": "Data",
"roles_and_responsibilities": [
{
"bullet_count": 6,
"heading": "Responsibilities",
"heading_was_present": true,
"source_marker": {
"first_5_words": "Build and maintain Airflow DAGs",
"last_5_words": "expose curated datasets"
},
"text": "Build and maintain Airflow DAGs for ETL pipelines from RDS to S3 to Snowflake\nDesign and optimize data warehouse schemas in Snowflake\nManage Spark jobs for large-scale data transformations\nBuild streaming pipelines using Kafka and Flink\nEnsure data quality and observability across all pipelines\nPartner with analytics to expose curated datasets",
"word_count": 56
},
{
"bullet_count": 0,
"heading": "Required skills",
"heading_was_present": true,
"source_marker": {
"first_5_words": "Python, SQL, Spark, Airflow,",
"last_5_words": "Snowflake, dbt, Kafka, AWS"
},
"text": "Python, SQL, Spark, Airflow, Snowflake, dbt, Kafka, AWS",
"word_count": 8
}
],
"urls": []
},
"run_id": null,
"stage3_signals": {
"alias_match_roles": [
{
"display_name": "ML Engineer",
"matched_count": null,
"role_id": 3,
"score": 1.0,
"slug": "ml-engineer",
"total_count": null
}
],
"kra_match_roles": [
{
"display_name": "Data Engineer",
"matched_count": null,
"role_id": 2,
"score": 0.3554,
"slug": "data-engineer",
"total_count": null
},
{
"display_name": "Cybersecurity Engineer",
"matched_count": null,
"role_id": 5,
"score": 0.3274,
"slug": "cybersecurity-engineer",
"total_count": null
},
{
"display_name": "DevOps Engineer",
"matched_count": null,
"role_id": 10,
"score": 0.3058,
"slug": "devops-engineer",
"total_count": null
},
{
"display_name": "AR/VR Engineer",
"matched_count": null,
"role_id": 8,
"score": 0.2991,
"slug": "ar-vr-engineer",
"total_count": null
},
{
"display_name": "Cloud Architect",
"matched_count": null,
"role_id": 9,
"score": 0.2898,
"slug": "cloud-architect",
"total_count": null
}
],
"skill_match_roles": [
{
"display_name": "Data Engineer",
"matched_count": 9,
"role_id": 2,
"score": 0.8182,
"slug": "data-engineer",
"total_count": 11
},
{
"display_name": "Backend Engineer",
"matched_count": 3,
"role_id": 1,
"score": 0.2727,
"slug": "backend-engineer",
"total_count": 11
},
{
"display_name": "Cybersecurity Engineer",
"matched_count": 2,
"role_id": 5,
"score": 0.1818,
"slug": "cybersecurity-engineer",
"total_count": 11
},
{
"display_name": "ML Engineer",
"matched_count": 2,
"role_id": 3,
"score": 0.1818,
"slug": "ml-engineer",
"total_count": 11
},
{
"display_name": "Cloud Architect",
"matched_count": 2,
"role_id": 9,
"score": 0.1818,
"slug": "cloud-architect",
"total_count": 11
}
],
"stage35_ran": false
},
"stage4_decision": {
"alias_collision_detected": false,
"case": "E",
"chosen_role": null,
"confidence": 0.0,
"llm2_fired": false,
"llm2_reasoning": null,
"queued": true,
"reasoning": "low_kra: top KRA 0.36 \u003c 0.7"
},
"stage5_updates": null
}
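The `stage4_decision` block above records that no role was chosen because the top KRA score fell below 0.7. A minimal sketch of that single branch follows, assuming the gate compares the highest KRA score against a fixed threshold. The function name `kra_gate` is hypothetical, the meaning of case "E" is not documented in this run, and any other branches of the real decision logic are unknown and left out.

```python
# Hypothetical sketch of the low-KRA branch only. The 0.7 threshold is read
# off the reasoning string in stage4_decision; everything else is assumed.

KRA_THRESHOLD = 0.7

def kra_gate(kra_rows, threshold=KRA_THRESHOLD):
    """Return a queued, no-role decision when the top KRA score is too low.

    Returns None when the gate passes; what the pipeline does in that
    case is not visible in this run.
    """
    top = max((r["score"] for r in kra_rows), default=0.0)
    if top < threshold:
        return {
            "chosen_role": None,
            "confidence": 0.0,
            "queued": True,
            "reasoning": f"low_kra: top KRA {top:.2f} < {threshold}",
        }
    return None

# KRA scores copied from kra_match_roles in this run.
kra_rows = [
    {"slug": "data-engineer", "score": 0.3554},
    {"slug": "cybersecurity-engineer", "score": 0.3274},
    {"slug": "devops-engineer", "score": 0.3058},
    {"slug": "ar-vr-engineer", "score": 0.2991},
    {"slug": "cloud-architect", "score": 0.2898},
]
decision = kra_gate(kra_rows)
```

Under these assumptions the gate yields the same reasoning string as the run, which is consistent with the alias match (score 1.0 for `ml-engineer`) not being enough on its own to select a role.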
API 2 — extract-details
{}
API 3 — final-role-output
{}
LLM Calls
Every model call made for this run, in pipeline order. Click a card to see the model's response.