Pipeline run

b90fffca-a264-4009-a738-d1f55f9cc3ad

Pipeline LLM cost (USD)

API 1: $0.0026 API 2: $0.0000 API 3: $0.0000 Total: $0.0026

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description

SPARSE JD role baseline loaded sources · ai_index: role_baseline · nature_of_work: jd · tech_stack_maturity: role_baseline

Nature of work · Data pipeline development

Build and maintain Databricks/Spark pipelines that ingest and process data, while working with analysts and data scientists to tune workflows and enforce data quality and engineering best practices.

"Develop and maintain Databricks pipelines for data ingestion and processing"

Tech stack maturity

Modern Cloud Native

Data engineers typically build cloud-based batch and streaming pipelines and warehouse models, but AI is usually incidental rather than central to the role.

AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)

1.20 / 5

· Title match

✓ Has AI skill

· AI skill (primary)

· AI skill (secondary)

· On AI team

· Builds AI products

vocab breakdown (legacy)

Assistants (×1): —

Frameworks (×2): —

Models / concepts (×3): Machine Learning

Evidence — skills matched in JD (2)

Databricks Apache Spark

Skill cluster (2 dimension groups, role-scoped)

ETL and ELT Tooling

Apache Spark

Cross-cutting / unaligned

Databricks

Show KRA description ↓

• Develop and maintain Databricks pipelines for data ingestion and processing • Collaborate with data scientists and analysts to optimize data workflows • Implement data engineering best practices and ensure data quality and integrity • Utilize Apache Spark for data processing and analysis tasks

Signals

Skill data-engineer

0.50

Alias data-engineer

1.00

KRA data-engineer

0.66

Post-classification

Centroidupdated · n=347

Alias collision log—

New-role queue—

New skills captured0

New KRA captured—

Status: completed Created: 2026-05-27T15:47:05.916480Z Updated: 2026-06-12T15:58:51.268232Z API 3 duration: 6202 ms

Flow Current 3-step pipeline

1 POST /skills/extract-from-jd

2 POST /skills/extract-details

3 POST /skills/final-role-output

Role Chosen role & resolution

Data Engineer

CASE A

slug: data-engineer · id: 2 · source: db

Exact alias hit on data-engineer (1.0) — no other alias at this confidence; skill_top data-engineer 0.50 does not contradict

Resolution: in_db — role exists in library; skill↔dim and role↔dim links saved when applicable.

New skills

Skill↔dim saved

Role↔dim saved

Skipped

Job description

Skills:
Apache Spark, Data Engineering, Python, SQL, Machine Learning, Data Analysis, ETL,

Company Overview

CN Solutions partners with clients to provide top-notch solutions for their manpower needs. Specializing in Web Technologies, Databases, ERP, Data warehousing, and more. Offering staffing solutions, leadership hiring, RPO services, and more.

Job Overview

Senior Databricks role with 7 to 10 years of experience in Hyderabad. Full-Time employment with

Qualifications And Skills

• 7-10 years of experience in Data Engineering and Databricks
• Proficiency in Python, SQL, and data analysis tools
• Strong knowledge of Apache Spark and ETL processes
• Experience in Machine Learning and working with large datasets


Roles And Responsibilities

• Develop and maintain Databricks pipelines for data ingestion and processing
• Collaborate with data scientists and analysts to optimize data workflows
• Implement data engineering best practices and ensure data quality and integrity
• Utilize Apache Spark for data processing and analysis tasks

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

Databricks Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)

Canonical: Databricks id=1202 · databricks

Aliases — catalog

Databricks (CANONICAL)

Context tags (catalog)

Apache Spark Databricks Runtime Delta Lake MLflow SQL Analytics Spark cloud integration collaborative workspace data engineering data lakes data pipelines data visualization job scheduling machine learning notebooks real-time analytics

Stored enrichment (catalog DB)

Category: Platform
Sub-category: Data Analytics Platform
Vendor: Databricks, Inc.
License: other_open
Year introduced: 2013
Confidence: 0.97
Version strategy: NOT_APPLICABLE

Maturity reasoning: Databricks appears frequently in data engineering and analytics job postings, especially alongside Spark, Delta Lake, and lakehouse stacks; strong vendor adoption and broad enterprise usage signal mainstream demand.

Skill profile (library / DB)

Skill nature: PLATFORM
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 9
Sub-category id: 911
Extractable: True
Also category: False

Dimensions (API 2 worklist)

React Frontend Development Catalog dimension db id 96

Library dimension (catalog)

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
React Frontend Development d_init_01	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

Apache Spark Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)

Canonical: Apache Spark id=1350 · apache-spark

Aliases — catalog

Apache Spark (CANONICAL)
apache spark 3 (VERSION)
spark (VERSION)
spark 3 (VERSION)
spark 3.x (VERSION)
spark3 (VERSION)

Context tags (catalog)

Apache Kafka Cluster Manager DAGScheduler Data Lake DataFrame ETL Hadoop MLlib Machine Learning PySpark RDD Scala Spark SQL Spark Streaming SparkSession

Stored enrichment (catalog DB)

Category: Framework
Sub-category: Distributed Data Processing Framework
Vendor: Apache Software Foundation
License: apache_2
Year introduced: 2010
Confidence: 0.94
Version strategy: SEPARATE_ENTITY
Version tag: 3.x

Maturity reasoning: Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it.

Skill profile (library / DB)

Skill nature: FRAMEWORK
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 5
Sub-category id: 1021
Extractable: True
Also category: False

Dimensions (API 2 worklist)

ETL and ELT Tooling Catalog dimension db id 24

Library dimension (catalog)

Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
ETL and ELT Tooling etl-and-elt-tooling	✓	✓	Existing dimension (library) · Role↔dimension saved

All API 3 persistence rows

Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.

Skill	Tag	Dimension	Skill↔dim	Role↔dim	Outcome	Notes
Databricks	in_db	React Frontend Development d_init_01	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Apache Spark	in_db	ETL and ELT Tooling etl-and-elt-tooling	✓	✓	Existing dimension (library) · Role↔dimension saved

Library artifacts (this run)

No artifact rows for this run.

nano JD Parser — gpt-4.1-nano click to toggle

RoleSenior Databricks

CompanyCN Solutions

Experience7-10 years of experience

DomainIT Services & Consulting

Location Hyderabad, India (onsite)

JD type pass

Show raw JSON

{
  "JD_type": "pass",
  "about_company": {
    "source_marker": {
      "first_5_words": "CN Solutions partners with clients",
      "last_5_words": "leadership hiring, RPO services, and more."
    },
    "text": "CN Solutions partners with clients to provide top-notch solutions for their manpower needs. Specializing in Web Technologies, Databases, ERP, Data warehousing, and more. Offering staffing solutions, leadership hiring, RPO services, and more.",
    "word_count": 42
  },
  "certifications": [],
  "company_name": "CN Solutions",
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [
        "ITES",
        "BPO"
      ],
      "domain": "IT Services \u0026 Consulting"
    },
    "secondary": null
  },
  "education": [],
  "experience": {
    "max": 10,
    "min": 7,
    "raw": "7-10 years of experience"
  },
  "job_locations": [
    {
      "aliases": [
        "Hyderabad, AP"
      ],
      "city": "Hyderabad",
      "country": "India",
      "state": null,
      "work_mode": "onsite"
    }
  ],
  "role": "Senior Databricks",
  "role_aliases": [
    "Databricks Engineer",
    "Data Engineer",
    "Senior Data Engineer"
  ],
  "role_archetype": "Data",
  "roles_and_responsibilities": [
    {
      "bullet_count": 4,
      "heading": "Roles And Responsibilities",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 Develop and maintain Databricks",
        "last_5_words": "for data processing and analysis tasks"
      },
      "text": "\u2022 Develop and maintain Databricks pipelines for data ingestion and processing\n\u2022 Collaborate with data scientists and analysts to optimize data workflows\n\u2022 Implement data engineering best practices and ensure data quality and integrity\n\u2022 Utilize Apache Spark for data processing and analysis tasks",
      "word_count": 42
    }
  ],
  "urls": []
}

API 1 — extract-from-jd click to toggle

{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Databricks"
    },
    {
      "is_primary": true,
      "skill_name": "Apache Spark"
    }
  ],
  "jd_role": {
    "display_name": "Senior Databricks",
    "rationale": null,
    "role_aliases": [
      "Databricks Engineer",
      "Data Engineer",
      "Senior Data Engineer"
    ],
    "role_archetype": "Data",
    "slug": ""
  },
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": {
      "source_marker": {
        "first_5_words": "CN Solutions partners with clients",
        "last_5_words": "leadership hiring, RPO services, and more."
      },
      "text": "CN Solutions partners with clients to provide top-notch solutions for their manpower needs. Specializing in Web Technologies, Databases, ERP, Data warehousing, and more. Offering staffing solutions, leadership hiring, RPO services, and more.",
      "word_count": 42
    },
    "certifications": [],
    "company_name": "CN Solutions",
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [
          "ITES",
          "BPO"
        ],
        "domain": "IT Services \u0026 Consulting"
      },
      "secondary": null
    },
    "education": [],
    "experience": {
      "max": 10,
      "min": 7,
      "raw": "7-10 years of experience"
    },
    "job_locations": [
      {
        "aliases": [
          "Hyderabad, AP"
        ],
        "city": "Hyderabad",
        "country": "India",
        "state": null,
        "work_mode": "onsite"
      }
    ],
    "role": "Senior Databricks",
    "role_aliases": [
      "Databricks Engineer",
      "Data Engineer",
      "Senior Data Engineer"
    ],
    "role_archetype": "Data",
    "roles_and_responsibilities": [
      {
        "bullet_count": 4,
        "heading": "Roles And Responsibilities",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 Develop and maintain Databricks",
          "last_5_words": "for data processing and analysis tasks"
        },
        "text": "\u2022 Develop and maintain Databricks pipelines for data ingestion and processing\n\u2022 Collaborate with data scientists and analysts to optimize data workflows\n\u2022 Implement data engineering best practices and ensure data quality and integrity\n\u2022 Utilize Apache Spark for data processing and analysis tasks",
        "word_count": 42
      }
    ],
    "urls": []
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "b90fffca-a264-4009-a738-d1f55f9cc3ad",
  "stage3_signals": {
    "alias_found": true,
    "alias_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 1.0,
        "slug": "data-engineer",
        "total_count": null
      }
    ],
    "kra_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": [
          {
            "kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
            "sentence": "Utilize Apache Spark for data processing and analysis tasks",
            "similarity": 0.673
          },
          {
            "kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
            "sentence": "Develop and maintain Databricks pipelines for data ingestion and processing",
            "similarity": 0.6581
          },
          {
            "kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
            "sentence": "Collaborate with data scientists and analysts to optimize data workflows",
            "similarity": 0.6448
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 0.6586,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "ML Engineer",
        "kra_matches": [
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "Develop and maintain Databricks pipelines for data ingestion and processing",
            "similarity": 0.4983
          },
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "Implement data engineering best practices and ensure data quality and integrity",
            "similarity": 0.4685
          },
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "Collaborate with data scientists and analysts to optimize data workflows",
            "similarity": 0.4487
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 3,
        "score": 0.4719,
        "slug": "ml-engineer",
        "total_count": null
      },
      {
        "display_name": "Svelte Frontend Developer",
        "kra_matches": [
          {
            "kra_text": "backend data integration",
            "sentence": "Implement data engineering best practices and ensure data quality and integrity",
            "similarity": 0.4955
          },
          {
            "kra_text": "backend data integration",
            "sentence": "Develop and maintain Databricks pipelines for data ingestion and processing",
            "similarity": 0.4824
          },
          {
            "kra_text": "backend data integration",
            "sentence": "Collaborate with data scientists and analysts to optimize data workflows",
            "similarity": 0.4256
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 92,
        "score": 0.4678,
        "slug": "svelte-frontend-developer",
        "total_count": null
      },
      {
        "display_name": "MLOps Engineer",
        "kra_matches": [
          {
            "kra_text": "Validates model performance benchmarks, data schema contracts, and system integration health before signing off on production release readiness.",
            "sentence": "Implement data engineering best practices and ensure data quality and integrity",
            "similarity": 0.4984
          },
          {
            "kra_text": "Automates ML platform operations including scheduled retraining triggers, pipeline orchestration, evaluation workflows, and alerting configuration.",
            "sentence": "Develop and maintain Databricks pipelines for data ingestion and processing",
            "similarity": 0.4411
          },
          {
            "kra_text": "Coordinates model promotion workflows across development, staging, and production environments including integration testing and data contract validation.",
            "sentence": "Collaborate with data scientists and analysts to optimize data workflows",
            "similarity": 0.4344
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 16,
        "score": 0.4579,
        "slug": "ml-ops-engineer",
        "total_count": null
      },
      {
        "display_name": "DevOps Engineer",
        "kra_matches": [
          {
            "kra_text": "Builds and maintains CI/CD pipelines using Jenkins, GitHub Actions, GitLab CI, or CircleCI to automate build, test, security scanning, and deployment workflows.",
            "sentence": "Develop and maintain Databricks pipelines for data ingestion and processing",
            "similarity": 0.4589
          },
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Collaborate with data scientists and analysts to optimize data workflows",
            "similarity": 0.4518
          },
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Implement data engineering best practices and ensure data quality and integrity",
            "similarity": 0.3892
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 10,
        "score": 0.4333,
        "slug": "devops-engineer",
        "total_count": null
      }
    ],
    "skill_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "Apache Spark"
        ],
        "role_id": 2,
        "score": 0.5,
        "slug": "data-engineer",
        "total_count": 2
      }
    ]
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "A",
    "chosen_role": {
      "display_name": "Data Engineer",
      "kra_matches": null,
      "matched_count": null,
      "matched_skills": null,
      "role_id": 2,
      "score": 1.0,
      "slug": "data-engineer",
      "total_count": null
    },
    "confidence": 1.0,
    "is_new_role": false,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "matched_dimensions": [],
    "matched_kras": [],
    "matched_skills": [],
    "new_role_display_name": null,
    "new_role_slug": null,
    "queued": false,
    "reasoning": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.50 does not contradict",
    "sub_role": null
  },
  "stage5_updates": {
    "centroid_n_after": 347,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": null,
    "new_skills_attached": [],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}

API 2 — extract-details

{
  "alias_matches": [
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1838,
      "existing_alias_text": "Databricks",
      "input_term": "Databricks",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "Databricks",
        "id": 1202,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "databricks",
        "sub_category_id": 911,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 2004,
      "existing_alias_text": "Apache Spark",
      "input_term": "Apache Spark",
      "matched_canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    }
  ],
  "candidate_roles": [
    {
      "display_name": "Data Engineer",
      "id": 2,
      "rationale": null,
      "role_archetype": null,
      "slug": "data-engineer",
      "source": "db"
    }
  ],
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.50 does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "dimensions": [
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "Databricks",
      "llm_role": null,
      "roles_from_db": []
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "ETL and ELT Tooling",
        "id": 24,
        "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
        "slug": "etl-and-elt-tooling",
        "source": "db"
      },
      "input_skill": "Apache Spark",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    }
  ],
  "input_final_skills": [
    "Databricks",
    "Apache Spark"
  ],
  "input_llm_skills": [
    "Databricks",
    "Apache Spark"
  ],
  "new_aliases_persisted": 0,
  "run_id": "b90fffca-a264-4009-a738-d1f55f9cc3ad",
  "skills_detail": [
    {
      "aliases_in_db": [
        {
          "alias_text": "Databricks",
          "alias_type": "CANONICAL",
          "id": 1838,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "Databricks",
        "id": 1202,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "databricks",
        "sub_category_id": 911,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "Databricks",
          "llm_role": null,
          "roles_from_db": []
        }
      ],
      "input_skill": "Databricks",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Apache Spark",
          "alias_type": "CANONICAL",
          "id": 2004,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "apache spark 3",
          "alias_type": "VERSION",
          "id": 2006,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark",
          "alias_type": "VERSION",
          "id": 2510,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3",
          "alias_type": "VERSION",
          "id": 2007,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3.x",
          "alias_type": "VERSION",
          "id": 2009,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark3",
          "alias_type": "VERSION",
          "id": 2008,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "ETL and ELT Tooling",
            "id": 24,
            "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
            "slug": "etl-and-elt-tooling",
            "source": "db"
          },
          "input_skill": "Apache Spark",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Apache Spark",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    }
  ],
  "unmatched_skills": []
}

API 3 — final-role-output

{
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.50 does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "chosen_role_resolution": "in_db",
  "final_input_skills": [
    {
      "skill": "Databricks",
      "tag": "in_db"
    },
    {
      "skill": "Apache Spark",
      "tag": "in_db"
    }
  ],
  "llm_cost_api1_usd": null,
  "llm_cost_api2_usd": null,
  "llm_cost_api3_usd": null,
  "llm_cost_total_usd": null,
  "persistence": {
    "items": [
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "Databricks",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1202,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "ETL and ELT Tooling",
          "id": 24,
          "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
          "slug": "etl-and-elt-tooling",
          "source": "db"
        },
        "dimension_id": 24,
        "input_skill": "Apache Spark",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1350,
        "skill_tag": "in_db",
        "skipped_reason": null
      }
    ],
    "new_skills_created": 0,
    "role_dimension_saved": 0,
    "skill_dimension_saved": 0,
    "skipped": 0
  },
  "planner_output": null,
  "run_id": "b90fffca-a264-4009-a738-d1f55f9cc3ad"
}

LLM Calls

Every model call made for this run, in pipeline order. Click a card to see the model's response.

Loading…