Pipeline run

c42220f8-c0cc-4546-85e0-3207f16c8f75

Pipeline LLM cost (USD)

API 1: $0.0038 API 2: $0.0006 API 3: $0.0000 Total: $0.0044

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description

role baseline loaded sources · ai_index: jd · nature_of_work: jd · tech_stack_maturity: jd

Nature of work · Data pipeline development

Build and automate batch/streaming data pipelines, ETL and data models, then source, cleanse, validate and monitor large-scale data in a governed bank environment while reducing manual steps and supporting cost-aware, reusable solutions.

"“Building advanced automation of data engineering pipelines through removal of manual stages”"

Tech stack maturity

Mainstream Modern

AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)

0.20 / 5

· Title match

✓ Has AI skill

· AI skill (primary)

· AI skill (secondary)

· On AI team

· Builds AI products

vocab breakdown (legacy)

Assistants (×1): —

Frameworks (×2): —

Models / concepts (×3): embeddings

Evidence — skills matched in JD (17)

ETL Data Modeling Data Pipelines Streaming Data Data Ingestion Data Warehousing Data Quality Testing Data Cleansing Data Monitoring Data Analysis Data Exploration Programming Languages Software Engineering Fundamentals Code Development Practices Automation Data Architecture Experiment Design

Skill cluster (1 dimension groups, role-scoped)

Cross-cutting / unaligned

Show KRA description ↓

As a Data Engineer, you’ll be looking to simplify our organisation by developing innovative data driven solutions through data pipelines, modelling and ETL design, inspiring to be commercially successful while keeping our customers, and the bank’s data, safe and secure. You’ll drive customer value by understanding complex business problems and requirements to correctly apply the most appropriate and reusable tool to gather and build data solutions. You’ll support our strategic direction by engaging with the data engineering community to deliver opportunities, along with carrying out complex data engineering tasks to build a scalable data architecture. • Building advanced automation of data engineering pipelines through removal of manual stages • Embedding new data techniques into our business through role modelling, training, and experiment design oversight • Delivering a clear understanding of data platform costs to meet your departments cost saving and income targets • Sourcing new data using the most appropriate tooling for the situation • Developing solutions for streaming data ingestion and transformations in line with our streaming strategy To thrive in this role, you’ll need a strong understanding of data usage and dependencies and experience of extracting value and features from large scale data. You’ll also bring practical experience of programming languages alongside knowledge of data and software engineering fundamentals. Additionally, you’ll need: • Experience of ETL technical design, data quality testing, cleansing and monitoring, data sourcing, and exploration and analysis • Data warehousing and data modelling capabilities • A good understanding of modern code development practices • Experience of working in a governed, and regulatory environment • Strong communication skills with the ability to proactively engage and manage a wide range of stakeholders

Signals

Skill —

—

Alias data-engineer

1.00

KRA data-engineer

0.60

Post-classification

Centroidupdated · n=465

Alias collision log—

New-role queue—

New skills captured17

New KRA captured—

Captured for admin review

ETL primary ↔ Data Engineer pending

Data Modeling primary ↔ Data Engineer pending

Data Pipelines primary ↔ Data Engineer pending

Streaming Data primary ↔ Data Engineer pending

Data Ingestion primary ↔ Data Engineer pending

Data Warehousing primary ↔ Data Engineer pending

Data Quality Testing primary ↔ Data Engineer pending

Data Cleansing primary ↔ Data Engineer pending

Data Monitoring primary ↔ Data Engineer pending

Data Analysis primary ↔ Data Engineer pending

Data Exploration primary ↔ Data Engineer pending

Programming Languages primary ↔ Data Engineer pending

Software Engineering Fundamentals primary ↔ Data Engineer pending

Code Development Practices primary ↔ Data Engineer pending

Automation primary ↔ Data Engineer pending

Experiment Design ↔ Data Engineer pending

Data Architecture primary ↔ Data Engineer pending

Status: completed Created: 2026-05-27T16:44:55.211187Z Updated: 2026-05-27T16:46:32.973418Z API 3 duration: 1829 ms

Flow Current 3-step pipeline

1 POST /skills/extract-from-jd

2 POST /skills/extract-details

3 POST /skills/final-role-output

Role Chosen role & resolution

Data Engineer

CASE A

slug: data-engineer · id: 2 · source: db

Exact alias hit on data-engineer (1.0) — no other alias at this confidence; skill_top absent does not contradict

Resolution: in_db — role exists in library; skill↔dim and role↔dim links saved when applicable.

New skills

Skill↔dim saved

Role↔dim saved

Skipped

Job description

Our people work differently depending on their jobs and needs. From hybrid working to flexible hours , we have plenty of options that help our people to thrive.

This role is based in India and as such all normal working days must be carried out in India.

Join us as a Data Engineer


• You’ll be the voice of our customers, using data to tell their stories and put them at the heart of all decision-making
• We’ll look to you to drive the build of effortless, digital first customer experiences
• If you’re ready for a new challenge and want to make a far-reaching impact through your work, this could be the opportunity you’re looking for
• We're offering this role at vice president level


What you'll do

As a Data Engineer, you’ll be looking to simplify our organisation by developing innovative data driven solutions through data pipelines, modelling and ETL design, inspiring to be commercially successful while keeping our customers, and the bank’s data, safe and secure.

You’ll drive customer value by understanding complex business problems and requirements to correctly apply the most appropriate and reusable tool to gather and build data solutions. You’ll support our strategic direction by engaging with the data engineering community to deliver opportunities, along with carrying out complex data engineering tasks to build a scalable data architecture.

Your responsibilities will also include:


• Building advanced automation of data engineering pipelines through removal of manual stages
• Embedding new data techniques into our business through role modelling, training, and experiment design oversight
• Delivering a clear understanding of data platform costs to meet your departments cost saving and income targets
• Sourcing new data using the most appropriate tooling for the situation
• Developing solutions for streaming data ingestion and transformations in line with our streaming strategy


The skills you'll need

To thrive in this role, you’ll need a strong understanding of data usage and dependencies and experience of extracting value and features from large scale data. You’ll also bring practical experience of programming languages alongside knowledge of data and software engineering fundamentals.

Additionally, you’ll need:


• Experience of ETL technical design, data quality testing, cleansing and monitoring, data sourcing, and exploration and analysis
• Data warehousing and data modelling capabilities
• A good understanding of modern code development practices
• Experience of working in a governed, and regulatory environment
• Strong communication skills with the ability to proactively engage and manage a wide range of stakeholders




Apply for this job

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

ETL Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Data Modeling Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)

Canonical: domain modeling id=2379 · domain-modeling

Aliases — catalog

domain modeling (CANONICAL) primary
Domain Modeling (CANONICAL)

Context tags (catalog)

CQRS DDD ERD UML aggregate bounded context business logic context map context mapping data modeling domain events domain-driven design entities entity event sourcing event storming microservices repositories repository pattern service layer services value object value objects

Stored enrichment (catalog DB)

Category: Methodology
Sub-category: Domain Modeling
Confidence: 0.90
Version strategy: NOT_APPLICABLE

Maturity reasoning: Common in software JDs under DDD/business analysis; many roles ask for domain modeling or domain-driven design, and it remains a standard design skill rather than a niche tool.

Skill profile (library / DB)

Skill nature: METHODOLOGY
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 8
Sub-category id: 2831
Extractable: True
Also category: False

Dimensions (API 2 worklist)

Application Architecture Patterns Catalog dimension db id 293

Library dimension (catalog)

Roles linked in library: .NET Backend Developer, Python Backend Developer
Service Architecture and Design Patterns Catalog dimension db id 18

Library dimension (catalog)

Roles linked in library: Backend Developer, Java Backend Developer, Kotlin Backend Developer, Node.js Backend Developer, PHP Backend Developer, Ruby Backend Developer, Scala Backend Developer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
Application Architecture Patterns application-architecture-patterns	—	—	Skipped — no persistable v3 meta for new skill skill_not_in_db_v3_proposed
Service Architecture and Design Patterns service-architecture-and-design-patterns	—	—	Skipped — no persistable v3 meta for new skill skill_not_in_db_v3_proposed

Data Pipelines Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: CONCEPT
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Streaming Data Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: CONCEPT
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Data Ingestion Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Data Warehousing Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Databases
Sub-category: general
Skill nature: CONCEPT
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Data Quality Testing Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Data Cleansing Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Data Monitoring Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Data Analysis Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Data Exploration Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Programming Languages Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Programming Languages
Sub-category: general
Skill nature: CONCEPT
Volatility: STABLE
Typical lifespan: EVERGREEN
Version strategy: UNVERSIONED

Software Engineering Fundamentals Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Software Engineering
Sub-category: general
Skill nature: CONCEPT
Volatility: STABLE
Typical lifespan: EVERGREEN
Version strategy: UNVERSIONED

Code Development Practices Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Software Engineering
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Automation Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Software Engineering
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Experiment Design Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Science
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Data Architecture Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: CONCEPT
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

All API 3 persistence rows

Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.

Skill	Tag	Dimension	Skill↔dim	Role↔dim	Outcome	Notes
Data Modeling	new	Application Architecture Patterns application-architecture-patterns	—	—	Skipped — no persistable v3 meta for new skill	skill_not_in_db_v3_proposed
Data Modeling	new	Service Architecture and Design Patterns service-architecture-and-design-patterns	—	—	Skipped — no persistable v3 meta for new skill	skill_not_in_db_v3_proposed

Library artifacts (this run)

Kind	Detail	DB id
canonical_skill_proposed	ETL \| type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed	Data Pipelines \| type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR
canonical_skill_proposed	Streaming Data \| type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR
canonical_skill_proposed	Data Ingestion \| type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed	Data Warehousing \| type=Databases subtype=general nature=CONCEPT lifespan=MULTI_YEAR
canonical_skill_proposed	Data Quality Testing \| type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed	Data Cleansing \| type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed	Data Monitoring \| type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed	Data Analysis \| type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed	Data Exploration \| type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed	Programming Languages \| type=Programming Languages subtype=general nature=CONCEPT lifespan=EVERGREEN
canonical_skill_proposed	Software Engineering Fundamentals \| type=Software Engineering subtype=general nature=CONCEPT lifespan=EVERGREEN
canonical_skill_proposed	Code Development Practices \| type=Software Engineering subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed	Automation \| type=Software Engineering subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed	Experiment Design \| type=Data Science subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed	Data Architecture \| type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR
dimension_skill_link_proposed	Data Modeling ↔ Application Architecture Patterns
dimension_skill_link_proposed	Data Modeling ↔ Service Architecture and Design Patterns

nano JD Parser — gpt-4.1-nano click to toggle

RoleData Engineer

DomainOther

Location India (onsite)

JD type pass

Show raw JSON

{
  "JD_type": "pass",
  "about_company": null,
  "certifications": [],
  "company_name": null,
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [],
      "domain": "Other"
    },
    "secondary": null
  },
  "education": [],
  "experience": null,
  "job_locations": [
    {
      "aliases": [],
      "city": null,
      "country": "India",
      "state": null,
      "work_mode": "onsite"
    }
  ],
  "role": "Data Engineer",
  "role_aliases": [
    "Data Engineer",
    "Data Developer",
    "ETL Developer"
  ],
  "role_archetype": "Data",
  "roles_and_responsibilities": [
    {
      "bullet_count": 0,
      "heading": "What you\u0027ll do",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "As a Data Engineer, you\u2019ll",
        "last_5_words": "build a scalable data architecture."
      },
      "text": "As a Data Engineer, you\u2019ll be looking to simplify our organisation by developing innovative data driven solutions through data pipelines, modelling and ETL design, inspiring to be commercially successful while keeping our customers, and the bank\u2019s data, safe and secure.\n\nYou\u2019ll drive customer value by understanding complex business problems and requirements to correctly apply the most appropriate and reusable tool to gather and build data solutions. You\u2019ll support our strategic direction by engaging with the data engineering community to deliver opportunities, along with carrying out complex data engineering tasks to build a scalable data architecture.",
      "word_count": 104
    },
    {
      "bullet_count": 5,
      "heading": "Your responsibilities will also include",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 Building advanced automation of",
        "last_5_words": "in line with our streaming strategy"
      },
      "text": "\u2022 Building advanced automation of data engineering pipelines through removal of manual stages\n\u2022 Embedding new data techniques into our business through role modelling, training, and experiment design oversight\n\u2022 Delivering a clear understanding of data platform costs to meet your departments cost saving and income targets\n\u2022 Sourcing new data using the most appropriate tooling for the situation\n\u2022 Developing solutions for streaming data ingestion and transformations in line with our streaming strategy",
      "word_count": 56
    },
    {
      "bullet_count": 5,
      "heading": "The skills you\u0027ll need",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "To thrive in this role,",
        "last_5_words": "manage a wide range of stakeholders"
      },
      "text": "To thrive in this role, you\u2019ll need a strong understanding of data usage and dependencies and experience of extracting value and features from large scale data. You\u2019ll also bring practical experience of programming languages alongside knowledge of data and software engineering fundamentals.\n\nAdditionally, you\u2019ll need:\n\n\u2022 Experience of ETL technical design, data quality testing, cleansing and monitoring, data sourcing, and exploration and analysis\n\u2022 Data warehousing and data modelling capabilities\n\u2022 A good understanding of modern code development practices\n\u2022 Experience of working in a governed, and regulatory environment\n\u2022 Strong communication skills with the ability to proactively engage and manage a wide range of stakeholders",
      "word_count": 139
    }
  ],
  "urls": []
}

API 1 — extract-from-jd click to toggle

{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "ETL"
    },
    {
      "is_primary": true,
      "skill_name": "Data Modeling"
    },
    {
      "is_primary": true,
      "skill_name": "Data Pipelines"
    },
    {
      "is_primary": true,
      "skill_name": "Streaming Data"
    },
    {
      "is_primary": true,
      "skill_name": "Data Ingestion"
    },
    {
      "is_primary": true,
      "skill_name": "Data Warehousing"
    },
    {
      "is_primary": true,
      "skill_name": "Data Quality Testing"
    },
    {
      "is_primary": true,
      "skill_name": "Data Cleansing"
    },
    {
      "is_primary": true,
      "skill_name": "Data Monitoring"
    },
    {
      "is_primary": true,
      "skill_name": "Data Analysis"
    },
    {
      "is_primary": true,
      "skill_name": "Data Exploration"
    },
    {
      "is_primary": true,
      "skill_name": "Programming Languages"
    },
    {
      "is_primary": true,
      "skill_name": "Software Engineering Fundamentals"
    },
    {
      "is_primary": true,
      "skill_name": "Code Development Practices"
    },
    {
      "is_primary": true,
      "skill_name": "Automation"
    },
    {
      "is_primary": false,
      "skill_name": "Experiment Design"
    },
    {
      "is_primary": true,
      "skill_name": "Data Architecture"
    }
  ],
  "jd_role": {
    "display_name": "Data Engineer",
    "rationale": null,
    "role_aliases": [
      "Data Engineer",
      "Data Developer",
      "ETL Developer"
    ],
    "role_archetype": "Data",
    "slug": ""
  },
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": null,
    "certifications": [],
    "company_name": null,
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [],
        "domain": "Other"
      },
      "secondary": null
    },
    "education": [],
    "experience": null,
    "job_locations": [
      {
        "aliases": [],
        "city": null,
        "country": "India",
        "state": null,
        "work_mode": "onsite"
      }
    ],
    "role": "Data Engineer",
    "role_aliases": [
      "Data Engineer",
      "Data Developer",
      "ETL Developer"
    ],
    "role_archetype": "Data",
    "roles_and_responsibilities": [
      {
        "bullet_count": 0,
        "heading": "What you\u0027ll do",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "As a Data Engineer, you\u2019ll",
          "last_5_words": "build a scalable data architecture."
        },
        "text": "As a Data Engineer, you\u2019ll be looking to simplify our organisation by developing innovative data driven solutions through data pipelines, modelling and ETL design, inspiring to be commercially successful while keeping our customers, and the bank\u2019s data, safe and secure.\n\nYou\u2019ll drive customer value by understanding complex business problems and requirements to correctly apply the most appropriate and reusable tool to gather and build data solutions. You\u2019ll support our strategic direction by engaging with the data engineering community to deliver opportunities, along with carrying out complex data engineering tasks to build a scalable data architecture.",
        "word_count": 104
      },
      {
        "bullet_count": 5,
        "heading": "Your responsibilities will also include",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 Building advanced automation of",
          "last_5_words": "in line with our streaming strategy"
        },
        "text": "\u2022 Building advanced automation of data engineering pipelines through removal of manual stages\n\u2022 Embedding new data techniques into our business through role modelling, training, and experiment design oversight\n\u2022 Delivering a clear understanding of data platform costs to meet your departments cost saving and income targets\n\u2022 Sourcing new data using the most appropriate tooling for the situation\n\u2022 Developing solutions for streaming data ingestion and transformations in line with our streaming strategy",
        "word_count": 56
      },
      {
        "bullet_count": 5,
        "heading": "The skills you\u0027ll need",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "To thrive in this role,",
          "last_5_words": "manage a wide range of stakeholders"
        },
        "text": "To thrive in this role, you\u2019ll need a strong understanding of data usage and dependencies and experience of extracting value and features from large scale data. You\u2019ll also bring practical experience of programming languages alongside knowledge of data and software engineering fundamentals.\n\nAdditionally, you\u2019ll need:\n\n\u2022 Experience of ETL technical design, data quality testing, cleansing and monitoring, data sourcing, and exploration and analysis\n\u2022 Data warehousing and data modelling capabilities\n\u2022 A good understanding of modern code development practices\n\u2022 Experience of working in a governed, and regulatory environment\n\u2022 Strong communication skills with the ability to proactively engage and manage a wide range of stakeholders",
        "word_count": 139
      }
    ],
    "urls": []
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "c42220f8-c0cc-4546-85e0-3207f16c8f75",
  "stage3_signals": {
    "alias_found": true,
    "alias_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 1.0,
        "slug": "data-engineer",
        "total_count": null
      }
    ],
    "kra_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": [
          {
            "kra_text": "Designs dimensional models, star schemas, data vault structures, and curated data mart tables to support BI tools and self-service analytics consumption.",
            "sentence": "Data warehousing and data modelling capabilities",
            "similarity": 0.623
          },
          {
            "kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
            "sentence": "Developing solutions for streaming data ingestion and transformations in line with our streaming strategy",
            "similarity": 0.613
          },
          {
            "kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
            "sentence": "Experience of ETL technical design, data quality testing, cleansing and monitoring, data sourcing, and exploration and analysis",
            "similarity": 0.5706
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 0.6022,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "Java Backend Developer",
        "kra_matches": [
          {
            "kra_text": "persistence and data modeling",
            "sentence": "Data warehousing and data modelling capabilities",
            "similarity": 0.5691
          },
          {
            "kra_text": "persistence and data modeling",
            "sentence": "Embedding new data techniques into our business through role modelling, training, and experiment design oversight",
            "similarity": 0.5151
          },
          {
            "kra_text": "code refactoring and defect fixes",
            "sentence": "A good understanding of modern code development practices",
            "similarity": 0.4735
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 79,
        "score": 0.5192,
        "slug": "java-backend-developer",
        "total_count": null
      },
      {
        "display_name": "Node.js Backend Developer",
        "kra_matches": [
          {
            "kra_text": "data modeling and persistence access",
            "sentence": "Data warehousing and data modelling capabilities",
            "similarity": 0.5597
          },
          {
            "kra_text": "code review and refactoring",
            "sentence": "A good understanding of modern code development practices",
            "similarity": 0.5247
          },
          {
            "kra_text": "data modeling and persistence access",
            "sentence": "Embedding new data techniques into our business through role modelling, training, and experiment design oversight",
            "similarity": 0.4542
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 82,
        "score": 0.5129,
        "slug": "node-backend-developer",
        "total_count": null
      },
      {
        "display_name": "Scala Backend Developer",
        "kra_matches": [
          {
            "kra_text": "application data modeling",
            "sentence": "Data warehousing and data modelling capabilities",
            "similarity": 0.5262
          },
          {
            "kra_text": "application data modeling",
            "sentence": "Embedding new data techniques into our business through role modelling, training, and experiment design oversight",
            "similarity": 0.4731
          },
          {
            "kra_text": "backend workflow orchestration",
            "sentence": "Building advanced automation of data engineering pipelines through removal of manual stages",
            "similarity": 0.4578
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 87,
        "score": 0.4857,
        "slug": "scala-backend-developer",
        "total_count": null
      },
      {
        "display_name": "Ruby Backend Developer",
        "kra_matches": [
          {
            "kra_text": "refactoring and code organization",
            "sentence": "A good understanding of modern code development practices",
            "similarity": 0.5176
          },
          {
            "kra_text": "data access and persistence",
            "sentence": "Data warehousing and data modelling capabilities",
            "similarity": 0.4635
          },
          {
            "kra_text": "automated backend checks",
            "sentence": "Building advanced automation of data engineering pipelines through removal of manual stages",
            "similarity": 0.4293
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 85,
        "score": 0.4701,
        "slug": "ruby-backend-developer",
        "total_count": null
      }
    ],
    "skill_match_roles": []
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "A",
    "chosen_role": {
      "display_name": "Data Engineer",
      "kra_matches": null,
      "matched_count": null,
      "matched_skills": null,
      "role_id": 2,
      "score": 1.0,
      "slug": "data-engineer",
      "total_count": null
    },
    "confidence": 1.0,
    "is_new_role": false,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "matched_dimensions": [],
    "matched_kras": [],
    "matched_skills": [],
    "new_role_display_name": null,
    "new_role_slug": null,
    "queued": false,
    "reasoning": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
    "sub_role": null
  },
  "stage5_updates": {
    "centroid_n_after": 465,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": null,
    "new_skills_attached": [
      {
        "is_primary": true,
        "queue_id": 21796,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "ETL",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21797,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Modeling",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21799,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Pipelines",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21801,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Streaming Data",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21803,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Ingestion",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21804,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Warehousing",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21806,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Quality Testing",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21808,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Cleansing",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21811,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Monitoring",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21813,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Analysis",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21815,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Exploration",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21816,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Programming Languages",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21818,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Software Engineering Fundamentals",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21820,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Code Development Practices",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21822,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Automation",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 21824,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Experiment Design",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21826,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Architecture",
        "status": "pending"
      }
    ],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}

API 2 — extract-details

{
  "alias_matches": [
    {
      "alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
      "alias_persisted": false,
      "existing_alias_id": 5644,
      "existing_alias_text": "Domain Modeling",
      "input_term": "Data Modeling",
      "matched_canonical": {
        "category_id": 8,
        "display_name": "domain modeling",
        "id": 2379,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "METHODOLOGY",
        "slug": "domain-modeling",
        "sub_category_id": 2831,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "embedding_alias"
    }
  ],
  "candidate_roles": [
    {
      "display_name": ".NET Backend Developer",
      "id": 83,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "dotnet-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Python Backend Developer",
      "id": 80,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "python-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Backend Developer",
      "id": 1,
      "rationale": null,
      "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
      "slug": "backend-engineer",
      "source": "db"
    },
    {
      "display_name": "Java Backend Developer",
      "id": 79,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "java-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Kotlin Backend Developer",
      "id": 84,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "kotlin-server-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Node.js Backend Developer",
      "id": 82,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "node-backend-developer",
      "source": "db"
    },
    {
      "display_name": "PHP Backend Developer",
      "id": 86,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "php-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Ruby Backend Developer",
      "id": 85,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "ruby-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Scala Backend Developer",
      "id": 87,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "scala-backend-developer",
      "source": "db"
    }
  ],
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "dimensions": [
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Application Architecture Patterns",
        "id": 293,
        "rationale": "Structural patterns for organizing Python backend code into maintainable modules, layers, and feature boundaries. This is a coherent cluster because senior backend developers are expected to refactor and shape service internals over time.",
        "slug": "application-architecture-patterns",
        "source": "db"
      },
      "input_skill": "Data Modeling",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": ".NET Backend Developer",
          "id": 83,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "dotnet-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Python Backend Developer",
          "id": 80,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "python-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Service Architecture and Design Patterns",
        "id": 18,
        "rationale": "Reusable backend design patterns used to structure service code and boundaries. Covers layering, dependency management, domain modeling, and maintainable service organization.",
        "slug": "service-architecture-and-design-patterns",
        "source": "db"
      },
      "input_skill": "Data Modeling",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Java Backend Developer",
          "id": 79,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "java-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Kotlin Backend Developer",
          "id": 84,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "kotlin-server-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Node.js Backend Developer",
          "id": 82,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "node-backend-developer",
          "source": "db"
        },
        {
          "display_name": "PHP Backend Developer",
          "id": 86,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "php-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Ruby Backend Developer",
          "id": 85,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "ruby-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Scala Backend Developer",
          "id": 87,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "scala-backend-developer",
          "source": "db"
        }
      ]
    }
  ],
  "input_final_skills": [
    "ETL",
    "Data Modeling",
    "Data Pipelines",
    "Streaming Data",
    "Data Ingestion",
    "Data Warehousing",
    "Data Quality Testing",
    "Data Cleansing",
    "Data Monitoring",
    "Data Analysis",
    "Data Exploration",
    "Programming Languages",
    "Software Engineering Fundamentals",
    "Code Development Practices",
    "Automation",
    "Experiment Design",
    "Data Architecture"
  ],
  "input_llm_skills": [
    "ETL",
    "Data Modeling",
    "Data Pipelines",
    "Streaming Data",
    "Data Ingestion",
    "Data Warehousing",
    "Data Quality Testing",
    "Data Cleansing",
    "Data Monitoring",
    "Data Analysis",
    "Data Exploration",
    "Programming Languages",
    "Software Engineering Fundamentals",
    "Code Development Practices",
    "Automation",
    "Experiment Design",
    "Data Architecture"
  ],
  "new_aliases_persisted": 0,
  "run_id": "c42220f8-c0cc-4546-85e0-3207f16c8f75",
  "skills_detail": [
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "ETL",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "etl",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "domain modeling",
          "alias_type": "CANONICAL",
          "id": 3675,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Domain Modeling",
          "alias_type": "CANONICAL",
          "id": 5644,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 8,
        "display_name": "domain modeling",
        "id": 2379,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "METHODOLOGY",
        "slug": "domain-modeling",
        "sub_category_id": 2831,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Application Architecture Patterns",
            "id": 293,
            "rationale": "Structural patterns for organizing Python backend code into maintainable modules, layers, and feature boundaries. This is a coherent cluster because senior backend developers are expected to refactor and shape service internals over time.",
            "slug": "application-architecture-patterns",
            "source": "db"
          },
          "input_skill": "Data Modeling",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": ".NET Backend Developer",
              "id": 83,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "dotnet-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Python Backend Developer",
              "id": 80,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "python-backend-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Service Architecture and Design Patterns",
            "id": 18,
            "rationale": "Reusable backend design patterns used to structure service code and boundaries. Covers layering, dependency management, domain modeling, and maintainable service organization.",
            "slug": "service-architecture-and-design-patterns",
            "source": "db"
          },
          "input_skill": "Data Modeling",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Java Backend Developer",
              "id": 79,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "java-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Kotlin Backend Developer",
              "id": 84,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "kotlin-server-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Node.js Backend Developer",
              "id": 82,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "node-backend-developer",
              "source": "db"
            },
            {
              "display_name": "PHP Backend Developer",
              "id": 86,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "php-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Ruby Backend Developer",
              "id": 85,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "ruby-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Scala Backend Developer",
              "id": 87,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "scala-backend-developer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Data Modeling",
      "matched_via": "embedding_alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Pipelines",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-pipelines",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Streaming Data",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "streaming-data",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Ingestion",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-ingestion",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Warehousing",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Databases",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-warehousing",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Quality Testing",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-quality-testing",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Cleansing",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-cleansing",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Monitoring",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-monitoring",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Analysis",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-analysis",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Exploration",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-exploration",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Programming Languages",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Programming Languages",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "UNVERSIONED",
          "volatility": "STABLE"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "programming-languages",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Software Engineering Fundamentals",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Software Engineering",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "UNVERSIONED",
          "volatility": "STABLE"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "software-engineering-fundamentals",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Code Development Practices",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Software Engineering",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "code-development-practices",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Automation",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Software Engineering",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "automation",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Experiment Design",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Science",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "experiment-design",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Architecture",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-architecture",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    }
  ],
  "unmatched_skills": [
    "ETL",
    "Data Pipelines",
    "Streaming Data",
    "Data Ingestion",
    "Data Warehousing",
    "Data Quality Testing",
    "Data Cleansing",
    "Data Monitoring",
    "Data Analysis",
    "Data Exploration",
    "Programming Languages",
    "Software Engineering Fundamentals",
    "Code Development Practices",
    "Automation",
    "Experiment Design",
    "Data Architecture"
  ]
}

API 3 — final-role-output

{
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "chosen_role_resolution": "in_db",
  "final_input_skills": [
    {
      "skill": "ETL",
      "tag": "new"
    },
    {
      "skill": "Data Modeling",
      "tag": "in_db"
    },
    {
      "skill": "Data Pipelines",
      "tag": "new"
    },
    {
      "skill": "Streaming Data",
      "tag": "new"
    },
    {
      "skill": "Data Ingestion",
      "tag": "new"
    },
    {
      "skill": "Data Warehousing",
      "tag": "new"
    },
    {
      "skill": "Data Quality Testing",
      "tag": "new"
    },
    {
      "skill": "Data Cleansing",
      "tag": "new"
    },
    {
      "skill": "Data Monitoring",
      "tag": "new"
    },
    {
      "skill": "Data Analysis",
      "tag": "new"
    },
    {
      "skill": "Data Exploration",
      "tag": "new"
    },
    {
      "skill": "Programming Languages",
      "tag": "new"
    },
    {
      "skill": "Software Engineering Fundamentals",
      "tag": "new"
    },
    {
      "skill": "Code Development Practices",
      "tag": "new"
    },
    {
      "skill": "Automation",
      "tag": "new"
    },
    {
      "skill": "Experiment Design",
      "tag": "new"
    },
    {
      "skill": "Data Architecture",
      "tag": "new"
    }
  ],
  "llm_cost_api1_usd": null,
  "llm_cost_api2_usd": null,
  "llm_cost_api3_usd": null,
  "llm_cost_total_usd": null,
  "persistence": {
    "items": [
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Application Architecture Patterns",
          "id": 293,
          "rationale": "Structural patterns for organizing Python backend code into maintainable modules, layers, and feature boundaries. This is a coherent cluster because senior backend developers are expected to refactor and shape service internals over time.",
          "slug": "application-architecture-patterns",
          "source": "db"
        },
        "dimension_id": 293,
        "input_skill": "Data Modeling",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": ".NET Backend Developer",
            "id": 83,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "dotnet-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Python Backend Developer",
            "id": 80,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "python-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": false,
        "skill_id": null,
        "skill_tag": "new",
        "skipped_reason": "skill_not_in_db_v3_proposed"
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Service Architecture and Design Patterns",
          "id": 18,
          "rationale": "Reusable backend design patterns used to structure service code and boundaries. Covers layering, dependency management, domain modeling, and maintainable service organization.",
          "slug": "service-architecture-and-design-patterns",
          "source": "db"
        },
        "dimension_id": 18,
        "input_skill": "Data Modeling",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Java Backend Developer",
            "id": 79,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "java-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Kotlin Backend Developer",
            "id": 84,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "kotlin-server-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Node.js Backend Developer",
            "id": 82,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "node-backend-developer",
            "source": "db"
          },
          {
            "display_name": "PHP Backend Developer",
            "id": 86,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "php-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Ruby Backend Developer",
            "id": 85,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "ruby-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Scala Backend Developer",
            "id": 87,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "scala-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": false,
        "skill_id": null,
        "skill_tag": "new",
        "skipped_reason": "skill_not_in_db_v3_proposed"
      }
    ],
    "new_skills_created": 0,
    "role_dimension_saved": 0,
    "skill_dimension_saved": 0,
    "skipped": 2
  },
  "planner_output": null,
  "run_id": "c42220f8-c0cc-4546-85e0-3207f16c8f75"
}