Pipeline run

77354b2c-e0d4-4a36-97d0-e45a80123e96

Pipeline LLM cost (USD)

API 1: $0.0032 API 2: $0.0002 API 3: $0.0000 Total: $0.0035

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description

role baseline loaded sources · ai_index: jd · nature_of_work: jd · tech_stack_maturity: jd

Nature of work · Data pipeline development

Designs and maintains Databricks-based data pipelines and ETL flows, works with cross-functional teams to move data across systems, and enforces data quality while improving pipeline efficiency.

"creating data pipelines, ensuring data quality, and implementing ETL processes"

Tech stack maturity

Modern Cloud Native

Databricks is a cloud-native data engineering platform strongly associated with modern lakehouse architectures and managed distributed processing.

AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)

0.00 / 5

· Title match

· Has AI skill

· AI skill (primary)

· AI skill (secondary)

· On AI team

· Builds AI products

vocab breakdown (legacy)

Assistants (×1): —

Frameworks (×2): —

Models / concepts (×3): —

Evidence — skills matched in JD (8)

Databricks ETL Data Pipelines Data Integration Data Quality Microsoft Azure Azure Databricks PySpark

Skill cluster (2 dimension groups, role-scoped)

Cloud Platforms

Microsoft Azure

Cross-cutting / unaligned

Databricks ETL Data Pipelines Data Integration Data Quality Azure Databricks PySpark

Show KRA description ↓

As a Data Engineer, you will design, develop, and maintain data solutions that facilitate data generation, collection, and processing. Your typical day will involve creating data pipelines, ensuring data quality, and implementing ETL processes to migrate and deploy data across various systems. You will collaborate with cross-functional teams to understand data requirements and deliver effective solutions that meet business needs. - Expected to be an SME. - Collaborate and manage the team to perform. - Responsible for team decisions. - Engage with multiple teams and contribute on key decisions. - Provide solutions to problems for their immediate team and across multiple teams. - Mentor junior team members to enhance their skills and knowledge. - Continuously evaluate and improve data processes to enhance efficiency. - Must To Have Skills: Proficiency in Databricks Unified Data Analytics Platform. - Good To Have Skills: Experience with Microsoft Azure Analytics Services, Microsoft Azure Databricks, PySpark. - Strong understanding of data pipeline architecture and design. - Experience with ETL processes and data integration techniques. - Familiarity with data quality frameworks and best practices.

Signals

Skill —

—

Alias data-engineer

1.00

KRA data-engineer

0.58

Post-classification

Centroidupdated · n=244

Alias collision log—

New-role queue—

New skills captured6

New KRA captured—

Captured for admin review

Azure Databricks ↔ Data Engineer pending

PySpark ↔ Data Engineer pending

ETL primary ↔ Data Engineer pending

Data Pipelines primary ↔ Data Engineer pending

Data Integration primary ↔ Data Engineer pending

Data Quality primary ↔ Data Engineer pending

Status: completed Created: 2026-05-27T15:01:38.292578Z Updated: 2026-06-12T16:58:29.793899Z API 3 duration: 13405 ms

Flow Current 3-step pipeline

1 POST /skills/extract-from-jd

2 POST /skills/extract-details

3 POST /skills/final-role-output

Role Chosen role & resolution

Data Engineer

CASE A

slug: data-engineer · id: 2 · source: db

Exact alias hit on data-engineer (1.0) — no other alias at this confidence; skill_top absent does not contradict

Resolution: in_db — role exists in library; skill↔dim and role↔dim links saved when applicable.

New skills

Skill↔dim saved

Role↔dim saved

Skipped

Job description

Project Role : Data Engineer

Project Role Description : Design, develop and maintain data solutions for data generation, collection, and processing. Create data pipelines, ensure data quality, and implement ETL (extract, transform and load) processes to migrate and deploy data across systems.

Must have skills : Databricks Unified Data Analytics Platform

Good to have skills : Microsoft Azure Databricks, Microsoft Azure Analytics Services, PySpark

Minimum 5 Year(s) Of Experience Is Required

Educational Qualification : 15 years full time education

Summary: As a Data Engineer, you will design, develop, and maintain data solutions that facilitate data generation, collection, and processing. Your typical day will involve creating data pipelines, ensuring data quality, and implementing ETL processes to migrate and deploy data across various systems. You will collaborate with cross-functional teams to understand data requirements and deliver effective solutions that meet business needs. Roles & Responsibilities: - Expected to be an SME. - Collaborate and manage the team to perform. - Responsible for team decisions. - Engage with multiple teams and contribute on key decisions. - Provide solutions to problems for their immediate team and across multiple teams. - Mentor junior team members to enhance their skills and knowledge. - Continuously evaluate and improve data processes to enhance efficiency. Professional & Technical Skills: - Must To Have Skills: Proficiency in Databricks Unified Data Analytics Platform. - Good To Have Skills: Experience with Microsoft Azure Analytics Services, Microsoft Azure Databricks, PySpark. - Strong understanding of data pipeline architecture and design. - Experience with ETL processes and data integration techniques. - Familiarity with data quality frameworks and best practices. Additional Information: - The candidate should have minimum 5 years of experience in Databricks Unified Data Analytics Platform. - This position is based at our Pune office. - A 15 years full time education is required.

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

Databricks Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)

Canonical: Databricks id=1202 · databricks

Aliases — catalog

Databricks (CANONICAL)

Context tags (catalog)

Apache Spark Databricks Runtime Delta Lake MLflow SQL Analytics Spark cloud integration collaborative workspace data engineering data lakes data pipelines data visualization job scheduling machine learning notebooks real-time analytics

Stored enrichment (catalog DB)

Category: Platform
Sub-category: Data Analytics Platform
Vendor: Databricks, Inc.
License: other_open
Year introduced: 2013
Confidence: 0.97
Version strategy: NOT_APPLICABLE

Maturity reasoning: Databricks appears frequently in data engineering and analytics job postings, especially alongside Spark, Delta Lake, and lakehouse stacks; strong vendor adoption and broad enterprise usage signal mainstream demand.

Skill profile (library / DB)

Skill nature: PLATFORM
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 9
Sub-category id: 911
Extractable: True
Also category: False

Dimensions (API 2 worklist)

React Frontend Development Catalog dimension db id 96

Library dimension (catalog)

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
React Frontend Development d_init_01	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

Microsoft Azure Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)

Canonical: Microsoft Azure id=97 · microsoft-azure

Aliases — catalog

Microsoft Azure (CANONICAL) primary

Context tags (catalog)

AKS ARM templates App Service Azure Active Directory Azure App Service Azure Blob Storage Azure Cosmos DB Azure DevOps Azure Functions Azure Kubernetes Service Azure Logic Apps Azure Monitor Azure Resource Manager Azure SQL Azure Virtual Machines Bicep Cloud Services Entra ID Functions IaaS Infrastructure as Code Key Vault Logic Apps PaaS Resource Group Serverless Computing Service Bus Storage Account Virtual Machines cloud migration microservices serverless

Stored enrichment (catalog DB)

Category: Platform
Sub-category: Cloud Platform
Vendor: Microsoft
License: other_open
Year introduced: 2010
Confidence: 0.99
Version strategy: NOT_APPLICABLE

Maturity reasoning: Azure appears in large volumes of cloud/DevOps job descriptions and is a core hyperscaler alongside AWS/GCP; Microsoft’s continued product investment and broad enterprise adoption signal mainstream demand.

Skill profile (library / DB)

Skill nature: PLATFORM
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 9
Sub-category id: 46
Extractable: True
Also category: False

Dimensions (API 2 worklist)

Cloud & Hosting Providers Catalog dimension db id 414

Library dimension (catalog)

Roles linked in library: PHP Backend Developer
Cloud Platforms Catalog dimension db id 20

Library dimension (catalog)

Roles linked in library: .NET Backend Developer, Backend Developer, Cyber Security Engineer, Data Engineer, DevOps Engineer, Fullstack Developer, Go Backend Developer, Java Backend Developer, Kotlin Backend Developer, ML Engineer, MLOps Engineer, Node.js Backend Developer, Python Backend Developer, Scala Backend Developer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
Cloud & Hosting Providers cloud-hosting-providers	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Cloud Platforms cloud-platforms	✓	✓	Existing dimension (library) · Role↔dimension saved

Azure Databricks Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Cloud Platforms
Sub-category: general
Skill nature: PLATFORM
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

PySpark Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)

Canonical: Apache Spark id=1350 · apache-spark

Aliases — catalog

Apache Spark (CANONICAL)
apache spark 3 (VERSION)
spark (VERSION)
spark 3 (VERSION)
spark 3.x (VERSION)
spark3 (VERSION)

Context tags (catalog)

Apache Kafka Cluster Manager DAGScheduler Data Lake DataFrame ETL Hadoop MLlib Machine Learning PySpark RDD Scala Spark SQL Spark Streaming SparkSession

Stored enrichment (catalog DB)

Category: Framework
Sub-category: Distributed Data Processing Framework
Vendor: Apache Software Foundation
License: apache_2
Year introduced: 2010
Confidence: 0.94
Version strategy: SEPARATE_ENTITY
Version tag: 3.x

Maturity reasoning: Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it.

Skill profile (library / DB)

Skill nature: FRAMEWORK
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 5
Sub-category id: 1021
Extractable: True
Also category: False

Dimensions (API 2 worklist)

ETL and ELT Tooling Catalog dimension db id 24

Library dimension (catalog)

Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
ETL and ELT Tooling etl-and-elt-tooling	—	—	Skipped — no persistable v3 meta for new skill skill_not_in_db_v3_proposed

ETL Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Data Pipelines Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: CONCEPT
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Data Integration Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: CONCEPT
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Data Quality Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: CONCEPT
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

All API 3 persistence rows

Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.

Skill	Tag	Dimension	Skill↔dim	Role↔dim	Outcome	Notes
Databricks	in_db	React Frontend Development d_init_01	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Microsoft Azure	in_db	Cloud & Hosting Providers cloud-hosting-providers	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Microsoft Azure	in_db	Cloud Platforms cloud-platforms	✓	✓	Existing dimension (library) · Role↔dimension saved
PySpark	new	ETL and ELT Tooling etl-and-elt-tooling	—	—	Skipped — no persistable v3 meta for new skill	skill_not_in_db_v3_proposed

Library artifacts (this run)

Kind	Detail	DB id
canonical_skill_proposed	Azure Databricks \| type=Cloud Platforms subtype=general nature=PLATFORM lifespan=MULTI_YEAR
canonical_skill_proposed	ETL \| type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed	Data Pipelines \| type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR
canonical_skill_proposed	Data Integration \| type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR
canonical_skill_proposed	Data Quality \| type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR
dimension_skill_link_proposed	PySpark ↔ ETL and ELT Tooling
role_dimension_link_proposed	Data Engineer ↔ ETL and ELT Tooling

nano JD Parser — gpt-4.1-nano click to toggle

RoleData Engineer

ExperienceMinimum 5 Year(s) Of Experience Is Required

DomainOther

Location Pune, India (null)

JD type pass

Show raw JSON

{
  "JD_type": "pass",
  "about_company": null,
  "certifications": [],
  "company_name": null,
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [],
      "domain": "Other"
    },
    "secondary": null
  },
  "education": [
    {
      "level": "null",
      "qualification": "null - null",
      "raw": "15 years full time education",
      "requirement": "required"
    }
  ],
  "experience": {
    "max": null,
    "min": 5,
    "raw": "Minimum 5 Year(s) Of Experience Is Required"
  },
  "job_locations": [
    {
      "aliases": [
        "Puna"
      ],
      "city": "Pune",
      "country": "India",
      "state": null,
      "work_mode": "null"
    }
  ],
  "role": "Data Engineer",
  "role_aliases": [
    "Data Engineer",
    "Data Developer",
    "ETL Developer"
  ],
  "role_archetype": "Data",
  "roles_and_responsibilities": [
    {
      "bullet_count": 0,
      "heading": "Summary",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "As a Data Engineer, you",
        "last_5_words": "that meet business needs."
      },
      "text": "As a Data Engineer, you will design, develop, and maintain data solutions that facilitate data generation, collection, and processing. Your typical day will involve creating data pipelines, ensuring data quality, and implementing ETL processes to migrate and deploy data across various systems. You will collaborate with cross-functional teams to understand data requirements and deliver effective solutions that meet business needs.",
      "word_count": 64
    },
    {
      "bullet_count": 7,
      "heading": "Roles \u0026 Responsibilities",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "- Expected to be an",
        "last_5_words": "to enhance efficiency."
      },
      "text": "- Expected to be an SME.\n- Collaborate and manage the team to perform.\n- Responsible for team decisions.\n- Engage with multiple teams and contribute on key decisions.\n- Provide solutions to problems for their immediate team and across multiple teams.\n- Mentor junior team members to enhance their skills and knowledge.\n- Continuously evaluate and improve data processes to enhance efficiency.",
      "word_count": 56
    },
    {
      "bullet_count": 5,
      "heading": "Professional \u0026 Technical Skills",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "- Must To Have Skills:",
        "last_5_words": "and best practices."
      },
      "text": "- Must To Have Skills: Proficiency in Databricks Unified Data Analytics Platform.\n- Good To Have Skills: Experience with Microsoft Azure Analytics Services, Microsoft Azure Databricks, PySpark.\n- Strong understanding of data pipeline architecture and design.\n- Experience with ETL processes and data integration techniques.\n- Familiarity with data quality frameworks and best practices.",
      "word_count": 56
    }
  ],
  "urls": []
}

API 1 — extract-from-jd click to toggle

{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Databricks"
    },
    {
      "is_primary": false,
      "skill_name": "Microsoft Azure"
    },
    {
      "is_primary": false,
      "skill_name": "Azure Databricks"
    },
    {
      "is_primary": false,
      "skill_name": "PySpark"
    },
    {
      "is_primary": true,
      "skill_name": "ETL"
    },
    {
      "is_primary": true,
      "skill_name": "Data Pipelines"
    },
    {
      "is_primary": true,
      "skill_name": "Data Integration"
    },
    {
      "is_primary": true,
      "skill_name": "Data Quality"
    }
  ],
  "jd_role": {
    "display_name": "Data Engineer",
    "rationale": null,
    "role_aliases": [
      "Data Engineer",
      "Data Developer",
      "ETL Developer"
    ],
    "role_archetype": "Data",
    "slug": ""
  },
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": null,
    "certifications": [],
    "company_name": null,
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [],
        "domain": "Other"
      },
      "secondary": null
    },
    "education": [
      {
        "level": "null",
        "qualification": "null - null",
        "raw": "15 years full time education",
        "requirement": "required"
      }
    ],
    "experience": {
      "max": null,
      "min": 5,
      "raw": "Minimum 5 Year(s) Of Experience Is Required"
    },
    "job_locations": [
      {
        "aliases": [
          "Puna"
        ],
        "city": "Pune",
        "country": "India",
        "state": null,
        "work_mode": "null"
      }
    ],
    "role": "Data Engineer",
    "role_aliases": [
      "Data Engineer",
      "Data Developer",
      "ETL Developer"
    ],
    "role_archetype": "Data",
    "roles_and_responsibilities": [
      {
        "bullet_count": 0,
        "heading": "Summary",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "As a Data Engineer, you",
          "last_5_words": "that meet business needs."
        },
        "text": "As a Data Engineer, you will design, develop, and maintain data solutions that facilitate data generation, collection, and processing. Your typical day will involve creating data pipelines, ensuring data quality, and implementing ETL processes to migrate and deploy data across various systems. You will collaborate with cross-functional teams to understand data requirements and deliver effective solutions that meet business needs.",
        "word_count": 64
      },
      {
        "bullet_count": 7,
        "heading": "Roles \u0026 Responsibilities",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "- Expected to be an",
          "last_5_words": "to enhance efficiency."
        },
        "text": "- Expected to be an SME.\n- Collaborate and manage the team to perform.\n- Responsible for team decisions.\n- Engage with multiple teams and contribute on key decisions.\n- Provide solutions to problems for their immediate team and across multiple teams.\n- Mentor junior team members to enhance their skills and knowledge.\n- Continuously evaluate and improve data processes to enhance efficiency.",
        "word_count": 56
      },
      {
        "bullet_count": 5,
        "heading": "Professional \u0026 Technical Skills",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "- Must To Have Skills:",
          "last_5_words": "and best practices."
        },
        "text": "- Must To Have Skills: Proficiency in Databricks Unified Data Analytics Platform.\n- Good To Have Skills: Experience with Microsoft Azure Analytics Services, Microsoft Azure Databricks, PySpark.\n- Strong understanding of data pipeline architecture and design.\n- Experience with ETL processes and data integration techniques.\n- Familiarity with data quality frameworks and best practices.",
        "word_count": 56
      }
    ],
    "urls": []
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "77354b2c-e0d4-4a36-97d0-e45a80123e96",
  "stage3_signals": {
    "alias_found": true,
    "alias_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 1.0,
        "slug": "data-engineer",
        "total_count": null
      }
    ],
    "kra_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": [
          {
            "kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
            "sentence": "You will collaborate with cross-functional teams to understand data requirements and deliver effective solutions that meet business needs.",
            "similarity": 0.6078
          },
          {
            "kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
            "sentence": "As a Data Engineer, you will design, develop, and maintain data solutions that facilitate data generation, collection, and processing.",
            "similarity": 0.5868
          },
          {
            "kra_text": "Builds data ingestion pipelines to collect data from transactional databases, third-party APIs, event streams, and file sources into centralized data platforms.",
            "sentence": "Your typical day will involve creating data pipelines, ensuring data quality, and implementing ETL processes to migrate and deploy data across various systems.",
            "similarity": 0.5553
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 0.5833,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "Flutter Developer",
        "kra_matches": [
          {
            "kra_text": "collaborate with design, product, and backend teams",
            "sentence": "Collaborate and manage the team to perform.",
            "similarity": 0.5903
          },
          {
            "kra_text": "collaborate with design, product, and backend teams",
            "sentence": "You will collaborate with cross-functional teams to understand data requirements and deliver effective solutions that meet business needs.",
            "similarity": 0.5424
          },
          {
            "kra_text": "collaborate with design, product, and backend teams",
            "sentence": "Engage with multiple teams and contribute on key decisions.",
            "similarity": 0.5194
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 74,
        "score": 0.5507,
        "slug": "flutter-developer",
        "total_count": null
      },
      {
        "display_name": "Engineering Manager",
        "kra_matches": [
          {
            "kra_text": "Set team goals and delivery plans",
            "sentence": "Collaborate and manage the team to perform.",
            "similarity": 0.5505
          },
          {
            "kra_text": "Set team goals and delivery plans",
            "sentence": "Provide solutions to problems for their immediate team and across multiple teams.",
            "similarity": 0.5218
          },
          {
            "kra_text": "Set team goals and delivery plans",
            "sentence": "Engage with multiple teams and contribute on key decisions.",
            "similarity": 0.4744
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 121,
        "score": 0.5155,
        "slug": "engineering-manager",
        "total_count": null
      },
      {
        "display_name": "DevOps Engineer",
        "kra_matches": [
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Collaborate and manage the team to perform.",
            "similarity": 0.5164
          },
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Provide solutions to problems for their immediate team and across multiple teams.",
            "similarity": 0.4728
          },
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Engage with multiple teams and contribute on key decisions.",
            "similarity": 0.4561
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 10,
        "score": 0.4818,
        "slug": "devops-engineer",
        "total_count": null
      },
      {
        "display_name": "ML Engineer",
        "kra_matches": [
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "Your typical day will involve creating data pipelines, ensuring data quality, and implementing ETL processes to migrate and deploy data across various systems.",
            "similarity": 0.5004
          },
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "As a Data Engineer, you will design, develop, and maintain data solutions that facilitate data generation, collection, and processing.",
            "similarity": 0.4797
          },
          {
            "kra_text": "Translates product requirements into machine learning system specifications including feature definitions, model architecture choices, and success metric definitions.",
            "sentence": "You will collaborate with cross-functional teams to understand data requirements and deliver effective solutions that meet business needs.",
            "similarity": 0.4323
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 3,
        "score": 0.4708,
        "slug": "ml-engineer",
        "total_count": null
      }
    ],
    "skill_match_roles": []
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "A",
    "chosen_role": {
      "display_name": "Data Engineer",
      "kra_matches": null,
      "matched_count": null,
      "matched_skills": null,
      "role_id": 2,
      "score": 1.0,
      "slug": "data-engineer",
      "total_count": null
    },
    "confidence": 1.0,
    "is_new_role": false,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "matched_dimensions": [],
    "matched_kras": [],
    "matched_skills": [],
    "new_role_display_name": null,
    "new_role_slug": null,
    "queued": false,
    "reasoning": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
    "sub_role": null
  },
  "stage5_updates": {
    "centroid_n_after": 244,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": null,
    "new_skills_attached": [
      {
        "is_primary": false,
        "queue_id": 12205,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Azure Databricks",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 12206,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "PySpark",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12207,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "ETL",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12208,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Pipelines",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12209,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Integration",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12210,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Quality",
        "status": "pending"
      }
    ],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}

API 2 — extract-details

{
  "alias_matches": [
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1838,
      "existing_alias_text": "Databricks",
      "input_term": "Databricks",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "Databricks",
        "id": 1202,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "databricks",
        "sub_category_id": 911,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 258,
      "existing_alias_text": "Microsoft Azure",
      "input_term": "Microsoft Azure",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "Microsoft Azure",
        "id": 97,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "microsoft-azure",
        "sub_category_id": 46,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
      "alias_persisted": false,
      "existing_alias_id": 2004,
      "existing_alias_text": "Apache Spark",
      "input_term": "PySpark",
      "matched_canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "embedding_alias"
    }
  ],
  "candidate_roles": [
    {
      "display_name": "PHP Backend Developer",
      "id": 86,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "php-backend-developer",
      "source": "db"
    },
    {
      "display_name": ".NET Backend Developer",
      "id": 83,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "dotnet-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Backend Developer",
      "id": 1,
      "rationale": null,
      "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
      "slug": "backend-engineer",
      "source": "db"
    },
    {
      "display_name": "Cyber Security Engineer",
      "id": 5,
      "rationale": null,
      "role_archetype": null,
      "slug": "cybersecurity-engineer",
      "source": "db"
    },
    {
      "display_name": "Data Engineer",
      "id": 2,
      "rationale": null,
      "role_archetype": null,
      "slug": "data-engineer",
      "source": "db"
    },
    {
      "display_name": "DevOps Engineer",
      "id": 10,
      "rationale": null,
      "role_archetype": null,
      "slug": "devops-engineer",
      "source": "db"
    },
    {
      "display_name": "Fullstack Developer",
      "id": 15,
      "rationale": null,
      "role_archetype": null,
      "slug": "full-stack-engineer",
      "source": "db"
    },
    {
      "display_name": "Go Backend Developer",
      "id": 81,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "go-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Java Backend Developer",
      "id": 79,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "java-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Kotlin Backend Developer",
      "id": 84,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "kotlin-server-backend-developer",
      "source": "db"
    },
    {
      "display_name": "ML Engineer",
      "id": 3,
      "rationale": null,
      "role_archetype": null,
      "slug": "ml-engineer",
      "source": "db"
    },
    {
      "display_name": "MLOps Engineer",
      "id": 16,
      "rationale": null,
      "role_archetype": null,
      "slug": "ml-ops-engineer",
      "source": "db"
    },
    {
      "display_name": "Node.js Backend Developer",
      "id": 82,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "node-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Python Backend Developer",
      "id": 80,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "python-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Scala Backend Developer",
      "id": 87,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "scala-backend-developer",
      "source": "db"
    }
  ],
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "dimensions": [
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "Databricks",
      "llm_role": null,
      "roles_from_db": []
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud \u0026 Hosting Providers",
        "id": 414,
        "rationale": "Knowledge of major cloud and hosting vendor platforms for deploying and managing PHP applications.",
        "slug": "cloud-hosting-providers",
        "source": "db"
      },
      "input_skill": "Microsoft Azure",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "PHP Backend Developer",
          "id": 86,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "php-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Platforms",
        "id": 20,
        "rationale": "Underlying cloud providers that host the managed services or infrastructure used by the role, such as AWS, Azure, and GCP.",
        "slug": "cloud-platforms",
        "source": "db"
      },
      "input_skill": "Microsoft Azure",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": ".NET Backend Developer",
          "id": 83,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "dotnet-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Cyber Security Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        },
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        },
        {
          "display_name": "DevOps Engineer",
          "id": 10,
          "rationale": null,
          "role_archetype": null,
          "slug": "devops-engineer",
          "source": "db"
        },
        {
          "display_name": "Fullstack Developer",
          "id": 15,
          "rationale": null,
          "role_archetype": null,
          "slug": "full-stack-engineer",
          "source": "db"
        },
        {
          "display_name": "Go Backend Developer",
          "id": 81,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "go-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Java Backend Developer",
          "id": 79,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "java-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Kotlin Backend Developer",
          "id": 84,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "kotlin-server-backend-developer",
          "source": "db"
        },
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        },
        {
          "display_name": "Node.js Backend Developer",
          "id": 82,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "node-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Python Backend Developer",
          "id": 80,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "python-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Scala Backend Developer",
          "id": 87,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "scala-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "ETL and ELT Tooling",
        "id": 24,
        "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
        "slug": "etl-and-elt-tooling",
        "source": "db"
      },
      "input_skill": "PySpark",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    }
  ],
  "input_final_skills": [
    "Databricks",
    "Microsoft Azure",
    "Azure Databricks",
    "PySpark",
    "ETL",
    "Data Pipelines",
    "Data Integration",
    "Data Quality"
  ],
  "input_llm_skills": [
    "Databricks",
    "Microsoft Azure",
    "Azure Databricks",
    "PySpark",
    "ETL",
    "Data Pipelines",
    "Data Integration",
    "Data Quality"
  ],
  "new_aliases_persisted": 0,
  "run_id": "77354b2c-e0d4-4a36-97d0-e45a80123e96",
  "skills_detail": [
    {
      "aliases_in_db": [
        {
          "alias_text": "Databricks",
          "alias_type": "CANONICAL",
          "id": 1838,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "Databricks",
        "id": 1202,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "databricks",
        "sub_category_id": 911,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "Databricks",
          "llm_role": null,
          "roles_from_db": []
        }
      ],
      "input_skill": "Databricks",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Microsoft Azure",
          "alias_type": "CANONICAL",
          "id": 258,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "Microsoft Azure",
        "id": 97,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "microsoft-azure",
        "sub_category_id": 46,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud \u0026 Hosting Providers",
            "id": 414,
            "rationale": "Knowledge of major cloud and hosting vendor platforms for deploying and managing PHP applications.",
            "slug": "cloud-hosting-providers",
            "source": "db"
          },
          "input_skill": "Microsoft Azure",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "PHP Backend Developer",
              "id": 86,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "php-backend-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Platforms",
            "id": 20,
            "rationale": "Underlying cloud providers that host the managed services or infrastructure used by the role, such as AWS, Azure, and GCP.",
            "slug": "cloud-platforms",
            "source": "db"
          },
          "input_skill": "Microsoft Azure",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": ".NET Backend Developer",
              "id": 83,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "dotnet-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Cyber Security Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            },
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            },
            {
              "display_name": "DevOps Engineer",
              "id": 10,
              "rationale": null,
              "role_archetype": null,
              "slug": "devops-engineer",
              "source": "db"
            },
            {
              "display_name": "Fullstack Developer",
              "id": 15,
              "rationale": null,
              "role_archetype": null,
              "slug": "full-stack-engineer",
              "source": "db"
            },
            {
              "display_name": "Go Backend Developer",
              "id": 81,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "go-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Java Backend Developer",
              "id": 79,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "java-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Kotlin Backend Developer",
              "id": 84,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "kotlin-server-backend-developer",
              "source": "db"
            },
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            },
            {
              "display_name": "Node.js Backend Developer",
              "id": 82,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "node-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Python Backend Developer",
              "id": 80,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "python-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Scala Backend Developer",
              "id": 87,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "scala-backend-developer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Microsoft Azure",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Azure Databricks",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Cloud Platforms",
          "skill_nature": "PLATFORM",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "azure-databricks",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Apache Spark",
          "alias_type": "CANONICAL",
          "id": 2004,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "apache spark 3",
          "alias_type": "VERSION",
          "id": 2006,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark",
          "alias_type": "VERSION",
          "id": 2510,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3",
          "alias_type": "VERSION",
          "id": 2007,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3.x",
          "alias_type": "VERSION",
          "id": 2009,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark3",
          "alias_type": "VERSION",
          "id": 2008,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "ETL and ELT Tooling",
            "id": 24,
            "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
            "slug": "etl-and-elt-tooling",
            "source": "db"
          },
          "input_skill": "PySpark",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "PySpark",
      "matched_via": "embedding_alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "ETL",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "etl",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Pipelines",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-pipelines",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Integration",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-integration",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Quality",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-quality",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    }
  ],
  "unmatched_skills": [
    "Azure Databricks",
    "ETL",
    "Data Pipelines",
    "Data Integration",
    "Data Quality"
  ]
}

API 3 — final-role-output

{
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "chosen_role_resolution": "in_db",
  "final_input_skills": [
    {
      "skill": "Databricks",
      "tag": "in_db"
    },
    {
      "skill": "Microsoft Azure",
      "tag": "in_db"
    },
    {
      "skill": "Azure Databricks",
      "tag": "new"
    },
    {
      "skill": "PySpark",
      "tag": "in_db"
    },
    {
      "skill": "ETL",
      "tag": "new"
    },
    {
      "skill": "Data Pipelines",
      "tag": "new"
    },
    {
      "skill": "Data Integration",
      "tag": "new"
    },
    {
      "skill": "Data Quality",
      "tag": "new"
    }
  ],
  "llm_cost_api1_usd": null,
  "llm_cost_api2_usd": null,
  "llm_cost_api3_usd": null,
  "llm_cost_total_usd": null,
  "persistence": {
    "items": [
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "Databricks",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1202,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud \u0026 Hosting Providers",
          "id": 414,
          "rationale": "Knowledge of major cloud and hosting vendor platforms for deploying and managing PHP applications.",
          "slug": "cloud-hosting-providers",
          "source": "db"
        },
        "dimension_id": 414,
        "input_skill": "Microsoft Azure",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "PHP Backend Developer",
            "id": 86,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "php-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 97,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Platforms",
          "id": 20,
          "rationale": "Underlying cloud providers that host the managed services or infrastructure used by the role, such as AWS, Azure, and GCP.",
          "slug": "cloud-platforms",
          "source": "db"
        },
        "dimension_id": 20,
        "input_skill": "Microsoft Azure",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": ".NET Backend Developer",
            "id": 83,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "dotnet-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Cyber Security Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          },
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          },
          {
            "display_name": "DevOps Engineer",
            "id": 10,
            "rationale": null,
            "role_archetype": null,
            "slug": "devops-engineer",
            "source": "db"
          },
          {
            "display_name": "Fullstack Developer",
            "id": 15,
            "rationale": null,
            "role_archetype": null,
            "slug": "full-stack-engineer",
            "source": "db"
          },
          {
            "display_name": "Go Backend Developer",
            "id": 81,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "go-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Java Backend Developer",
            "id": 79,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "java-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Kotlin Backend Developer",
            "id": 84,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "kotlin-server-backend-developer",
            "source": "db"
          },
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          },
          {
            "display_name": "Node.js Backend Developer",
            "id": 82,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "node-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Python Backend Developer",
            "id": 80,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "python-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Scala Backend Developer",
            "id": 87,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "scala-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 97,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "ETL and ELT Tooling",
          "id": 24,
          "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
          "slug": "etl-and-elt-tooling",
          "source": "db"
        },
        "dimension_id": 24,
        "input_skill": "PySpark",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": false,
        "skill_id": null,
        "skill_tag": "new",
        "skipped_reason": "skill_not_in_db_v3_proposed"
      }
    ],
    "new_skills_created": 0,
    "role_dimension_saved": 0,
    "skill_dimension_saved": 0,
    "skipped": 1
  },
  "planner_output": null,
  "run_id": "77354b2c-e0d4-4a36-97d0-e45a80123e96"
}

LLM Calls

Every model call made for this run, in pipeline order. Click a card to see the model's response.

Loading…