
Pipeline run

c42220f8-c0cc-4546-85e0-3207f16c8f75

Pipeline LLM cost (USD)
API 1: $0.0038 · API 2: $0.0006 · API 3: $0.0000 · Total: $0.0044
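The total can be checked against the per-API figures. The sketch below is only a sanity check on the numbers shown; the dictionary keys are illustrative labels, not field names taken from the pipeline.

```python
from decimal import Decimal

# Per-API LLM cost in USD, copied from the cost line above.
# Decimal avoids binary floating-point noise when adding small amounts.
costs = {
    "API 1": Decimal("0.0038"),
    "API 2": Decimal("0.0006"),
    "API 3": Decimal("0.0000"),
}

total = sum(costs.values(), Decimal("0"))
print(f"Total: ${total}")
```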

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description
role baseline loaded sources · ai_index: jd · nature_of_work: jd · tech_stack_maturity: jd
Nature of work · Data pipeline development
Build and automate batch/streaming data pipelines, ETL and data models, then source, cleanse, validate and monitor large-scale data in a governed bank environment while reducing manual steps and supporting cost-aware, reusable solutions.
“Building advanced automation of data engineering pipelines through removal of manual stages”
Tech stack maturity
Mainstream Modern
AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)
0.20 / 5
· Title match
Has AI skill
· AI skill (primary)
· AI skill (secondary)
· On AI team
· Builds AI products
vocab breakdown (legacy)
Assistants (×1):
Frameworks (×2):
Models / concepts (×3): embeddings
Evidence — skills matched in JD (17)
ETL · Data Modeling · Data Pipelines · Streaming Data · Data Ingestion · Data Warehousing · Data Quality Testing · Data Cleansing · Data Monitoring · Data Analysis · Data Exploration · Programming Languages · Software Engineering Fundamentals · Code Development Practices · Automation · Data Architecture · Experiment Design
Skill cluster (1 dimension group, role-scoped)
Cross-cutting / unaligned
ETL · Data Modeling · Data Pipelines · Streaming Data · Data Ingestion · Data Warehousing · Data Quality Testing · Data Cleansing · Data Monitoring · Data Analysis · Data Exploration · Programming Languages · Software Engineering Fundamentals · Code Development Practices · Automation · Data Architecture · Experiment Design
KRA description

As a Data Engineer, you’ll be looking to simplify our organisation by developing innovative data driven solutions through data pipelines, modelling and ETL design, inspiring to be commercially successful while keeping our customers, and the bank’s data, safe and secure.

You’ll drive customer value by understanding complex business problems and requirements to correctly apply the most appropriate and reusable tool to gather and build data solutions. You’ll support our strategic direction by engaging with the data engineering community to deliver opportunities, along with carrying out complex data engineering tasks to build a scalable data architecture.

• Building advanced automation of data engineering pipelines through removal of manual stages
• Embedding new data techniques into our business through role modelling, training, and experiment design oversight
• Delivering a clear understanding of data platform costs to meet your departments cost saving and income targets
• Sourcing new data using the most appropriate tooling for the situation
• Developing solutions for streaming data ingestion and transformations in line with our streaming strategy

To thrive in this role, you’ll need a strong understanding of data usage and dependencies and experience of extracting value and features from large scale data. You’ll also bring practical experience of programming languages alongside knowledge of data and software engineering fundamentals.

Additionally, you’ll need:

• Experience of ETL technical design, data quality testing, cleansing and monitoring, data sourcing, and exploration and analysis
• Data warehousing and data modelling capabilities
• A good understanding of modern code development practices
• Experience of working in a governed, and regulatory environment
• Strong communication skills with the ability to proactively engage and manage a wide range of stakeholders

Signals

Skill: no match
Alias: data-engineer · 1.00
KRA: data-engineer · 0.60
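The KRA signal of 0.60 is consistent with a plain average of the three per-sentence similarities listed under `kra_match_roles` in the API 1 response (0.623, 0.613, 0.5706 for data-engineer). The sketch below reproduces that arithmetic; that the pipeline aggregates by mean is an inference from the numbers, not something the page states, and `role_kra_score` is a hypothetical helper name.

```python
from typing import Dict, List

def role_kra_score(similarities: List[float]) -> float:
    """Average the per-sentence KRA similarities and round to 4 places.

    Assumption: the role-level score is the arithmetic mean of its matches.
    """
    return round(sum(similarities) / len(similarities), 4)

# Similarities copied from the kra_matches entries in the API 1 JSON.
kra_similarities: Dict[str, List[float]] = {
    "data-engineer": [0.623, 0.613, 0.5706],
    "java-backend-developer": [0.5691, 0.5151, 0.4735],
    "node-backend-developer": [0.5597, 0.5247, 0.4542],
}

scores = {slug: role_kra_score(sims) for slug, sims in kra_similarities.items()}
for slug, score in scores.items():
    print(slug, score)
```

Each result matches the `score` field reported for that role in the same JSON, which is why the mean is a plausible reading.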

Post-classification

Centroid updated · n=465
Alias collision log
New-role queue
New skills captured 17
New KRA captured

Captured for admin review

ETL primary Data Engineer pending
Data Modeling primary Data Engineer pending
Data Pipelines primary Data Engineer pending
Streaming Data primary Data Engineer pending
Data Ingestion primary Data Engineer pending
Data Warehousing primary Data Engineer pending
Data Quality Testing primary Data Engineer pending
Data Cleansing primary Data Engineer pending
Data Monitoring primary Data Engineer pending
Data Analysis primary Data Engineer pending
Data Exploration primary Data Engineer pending
Programming Languages primary Data Engineer pending
Software Engineering Fundamentals primary Data Engineer pending
Code Development Practices primary Data Engineer pending
Automation primary Data Engineer pending
Experiment Design secondary Data Engineer pending
Data Architecture primary Data Engineer pending
Status: completed · Created: 2026-05-27T16:44:55.211187Z · Updated: 2026-05-27T16:46:32.973418Z · API 3 duration: 1829 ms
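The gap between the two timestamps can be computed directly. This is a minimal sketch; whether Created and Updated bracket the whole run is an assumption, and `parse_utc` is a hypothetical helper.

```python
from datetime import datetime

def parse_utc(ts: str) -> datetime:
    # Map the trailing "Z" to an explicit offset so that
    # datetime.fromisoformat also accepts it on older Python versions.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

created = parse_utc("2026-05-27T16:44:55.211187Z")
updated = parse_utc("2026-05-27T16:46:32.973418Z")

elapsed = updated - created
print(f"{elapsed.total_seconds():.3f} s between Created and Updated")
```

Under that assumption the run spans roughly 98 seconds, of which the reported API 3 duration (1829 ms) is a small part.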
Flow Current 3-step pipeline

1 POST /skills/extract-from-jd

2 POST /skills/extract-details

3 POST /skills/final-role-output
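The three steps above can be sketched as a simple chain in which each response feeds the next request. Only the endpoint paths come from this page; the request shape (`jd_text`), the idea that each step receives the previous step's output unchanged, and the injected `post` callable are assumptions made so the example runs without a network.

```python
from typing import Any, Callable, Dict, List

# Endpoint paths exactly as listed in the flow above.
STEPS: List[str] = [
    "/skills/extract-from-jd",
    "/skills/extract-details",
    "/skills/final-role-output",
]

PostFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]

def run_pipeline(jd_text: str, post: PostFn) -> Dict[str, Any]:
    """Call the three steps in order, passing each response to the next step."""
    payload: Dict[str, Any] = {"jd_text": jd_text}  # hypothetical request shape
    for path in STEPS:
        payload = post(path, payload)
    return payload

# Usage with a fake transport that records which paths were called.
calls: List[str] = []

def fake_post(path: str, body: Dict[str, Any]) -> Dict[str, Any]:
    calls.append(path)
    return {**body, "last_step": path}

result = run_pipeline("Join us as a Data Engineer", fake_post)
print(calls)
```

Injecting the transport keeps the ordering logic testable; a real client would replace `fake_post` with an HTTP call against whatever host serves these routes.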

Role Chosen role & resolution

Data Engineer

CASE A

slug: data-engineer · id: 2 · source: db

Exact alias hit on data-engineer (1.0) — no other alias at this confidence; skill_top absent does not contradict

Resolution: in_db — role exists in library; skill↔dim and role↔dim links saved when applicable.
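The CASE A rationale above can be read as a small predicate: exactly one alias scores 1.0, and the top skill-based role, when present, does not point elsewhere. The sketch below encodes only that reading; the function name, argument shapes, and the treatment of a disagreeing `skill_top` are assumptions, and the other cases are not shown on this page.

```python
from typing import List, Optional, Tuple

def is_case_a(alias_hits: List[Tuple[str, float]], skill_top: Optional[str]) -> bool:
    """Return True for an uncontested exact alias hit.

    alias_hits: (role slug, alias score) pairs.
    skill_top:  slug of the best skill-matched role, or None when absent.
    """
    exact = [slug for slug, score in alias_hits if score == 1.0]
    if len(exact) != 1:
        # Either no exact hit, or another alias at the same confidence.
        return False
    # An absent skill_top does not contradict the alias hit.
    return skill_top is None or skill_top == exact[0]

# This run: one exact alias hit, no skill-based match.
this_run = is_case_a([("data-engineer", 1.0)], None)
print(this_run)
```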

New skills: 0
Skill↔dim saved: 0
Role↔dim saved: 0
Skipped: 2

Job description

Our people work differently depending on their jobs and needs. From hybrid working to flexible hours, we have plenty of options that help our people to thrive.

This role is based in India and as such all normal working days must be carried out in India.

Join us as a Data Engineer


• You’ll be the voice of our customers, using data to tell their stories and put them at the heart of all decision-making
• We’ll look to you to drive the build of effortless, digital first customer experiences
• If you’re ready for a new challenge and want to make a far-reaching impact through your work, this could be the opportunity you’re looking for
• We're offering this role at vice president level


What you'll do

As a Data Engineer, you’ll be looking to simplify our organisation by developing innovative data driven solutions through data pipelines, modelling and ETL design, inspiring to be commercially successful while keeping our customers, and the bank’s data, safe and secure.

You’ll drive customer value by understanding complex business problems and requirements to correctly apply the most appropriate and reusable tool to gather and build data solutions. You’ll support our strategic direction by engaging with the data engineering community to deliver opportunities, along with carrying out complex data engineering tasks to build a scalable data architecture.

Your responsibilities will also include:


• Building advanced automation of data engineering pipelines through removal of manual stages
• Embedding new data techniques into our business through role modelling, training, and experiment design oversight
• Delivering a clear understanding of data platform costs to meet your departments cost saving and income targets
• Sourcing new data using the most appropriate tooling for the situation
• Developing solutions for streaming data ingestion and transformations in line with our streaming strategy


The skills you'll need

To thrive in this role, you’ll need a strong understanding of data usage and dependencies and experience of extracting value and features from large scale data. You’ll also bring practical experience of programming languages alongside knowledge of data and software engineering fundamentals.

Additionally, you’ll need:


• Experience of ETL technical design, data quality testing, cleansing and monitoring, data sourcing, and exploration and analysis
• Data warehousing and data modelling capabilities
• A good understanding of modern code development practices
• Experience of working in a governed, and regulatory environment
• Strong communication skills with the ability to proactively engage and manage a wide range of stakeholders





Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

ETL Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Data Modeling Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: domain modeling id=2379 · domain-modeling

Aliases — catalog

  • domain modeling (CANONICAL) primary
  • Domain Modeling (CANONICAL)

Context tags (catalog)

CQRS · DDD · ERD · UML · aggregate · bounded context · business logic · context map · context mapping · data modeling · domain events · domain-driven design · entities · entity · event sourcing · event storming · microservices · repositories · repository pattern · service layer · services · value object · value objects

Stored enrichment (catalog DB)

Category
Methodology
Sub-category
Domain Modeling
Confidence
0.90
Version strategy
NOT_APPLICABLE

Maturity reasoning: Common in software JDs under DDD/business analysis; many roles ask for domain modeling or domain-driven design, and it remains a standard design skill rather than a niche tool.

Skill profile (library / DB)

Skill nature
METHODOLOGY
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
8
Sub-category id
2831
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Application Architecture Patterns Catalog dimension db id 293

    Library dimension (catalog)

    Roles linked in library: .NET Backend Developer, Python Backend Developer

  • Service Architecture and Design Patterns Catalog dimension db id 18

    Library dimension (catalog)

    Roles linked in library: Backend Developer, Java Backend Developer, Kotlin Backend Developer, Node.js Backend Developer, PHP Backend Developer, Ruby Backend Developer, Scala Backend Developer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Application Architecture Patterns
application-architecture-patterns
Skipped — no persistable v3 meta for new skill
skill_not_in_db_v3_proposed
Service Architecture and Design Patterns
service-architecture-and-design-patterns
Skipped — no persistable v3 meta for new skill
skill_not_in_db_v3_proposed
Data Pipelines Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
CONCEPT
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Streaming Data Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
CONCEPT
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Data Ingestion Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Data Warehousing Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Databases
Sub-category
general
Skill nature
CONCEPT
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Data Quality Testing Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Data Cleansing Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Data Monitoring Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Data Analysis Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Data Exploration Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Programming Languages Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Programming Languages
Sub-category
general
Skill nature
CONCEPT
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
UNVERSIONED
Software Engineering Fundamentals Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Software Engineering
Sub-category
general
Skill nature
CONCEPT
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
UNVERSIONED
Code Development Practices Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Software Engineering
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Automation Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Software Engineering
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Experiment Design Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Science
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Data Architecture Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
CONCEPT
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED

All API 3 persistence rows

Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.

Skill Tag Dimension Skill↔dim Role↔dim Outcome Notes
Data Modeling new
Application Architecture Patterns
application-architecture-patterns
Skipped — no persistable v3 meta for new skill skill_not_in_db_v3_proposed
Data Modeling new
Service Architecture and Design Patterns
service-architecture-and-design-patterns
Skipped — no persistable v3 meta for new skill skill_not_in_db_v3_proposed
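The "one row per (skill × dimension)" rule explains why the grid has only two rows: of the 17 skills, only Data Modeling carries dimensions in this run. A minimal sketch of that expansion follows; `WorkItem` and `persistence_rows` are hypothetical names, not identifiers from the pipeline.

```python
from typing import Dict, List, NamedTuple

class WorkItem(NamedTuple):
    skill: str
    dimension_slug: str

def persistence_rows(dims_by_skill: Dict[str, List[str]]) -> List[WorkItem]:
    """Expand each skill into one work item per attached dimension."""
    return [
        WorkItem(skill, dim)
        for skill, dims in dims_by_skill.items()
        for dim in dims
    ]

# Two skills from this run: ETL has no dimensions, Data Modeling has two.
rows = persistence_rows({
    "ETL": [],
    "Data Modeling": [
        "application-architecture-patterns",
        "service-architecture-and-design-patterns",
    ],
})
for row in rows:
    print(row.skill, "->", row.dimension_slug)
```

Skills without dimensions contribute no rows, which matches the grid above.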

Library artifacts (this run)

Kind Detail DB id
canonical_skill_proposed ETL | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed Data Pipelines | type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR
canonical_skill_proposed Streaming Data | type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR
canonical_skill_proposed Data Ingestion | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed Data Warehousing | type=Databases subtype=general nature=CONCEPT lifespan=MULTI_YEAR
canonical_skill_proposed Data Quality Testing | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed Data Cleansing | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed Data Monitoring | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed Data Analysis | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed Data Exploration | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed Programming Languages | type=Programming Languages subtype=general nature=CONCEPT lifespan=EVERGREEN
canonical_skill_proposed Software Engineering Fundamentals | type=Software Engineering subtype=general nature=CONCEPT lifespan=EVERGREEN
canonical_skill_proposed Code Development Practices | type=Software Engineering subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed Automation | type=Software Engineering subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed Experiment Design | type=Data Science subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed Data Architecture | type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR
dimension_skill_link_proposed Data Modeling ↔ Application Architecture Patterns
dimension_skill_link_proposed Data Modeling ↔ Service Architecture and Design Patterns
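The Detail column of the `canonical_skill_proposed` rows follows a `name | key=value ...` pattern whose values may contain spaces. The sketch below parses one such string; it is inferred from the rows shown here rather than from any documented format, and `parse_detail` is a hypothetical helper.

```python
import re
from typing import Dict, Tuple

def parse_detail(detail: str) -> Tuple[str, Dict[str, str]]:
    """Split 'Name | k1=v1 k2=v2' into the name and a dict of fields."""
    name, _, rest = detail.partition(" | ")
    # Split only on whitespace that precedes a 'key=' token,
    # so multi-word values such as 'Data Engineering Tools' stay intact.
    parts = re.split(r"\s+(?=\w+=)", rest.strip())
    fields = dict(part.split("=", 1) for part in parts if part)
    return name, fields

name, fields = parse_detail(
    "ETL | type=Data Engineering Tools subtype=general "
    "nature=PRACTICE lifespan=MULTI_YEAR"
)
print(name, fields)
```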
nano JD Parser · gpt-4.1-nano
Role: Data Engineer
Domain: Other
Location: India (onsite)
JD type: pass
Raw JSON
{
  "JD_type": "pass",
  "about_company": null,
  "certifications": [],
  "company_name": null,
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [],
      "domain": "Other"
    },
    "secondary": null
  },
  "education": [],
  "experience": null,
  "job_locations": [
    {
      "aliases": [],
      "city": null,
      "country": "India",
      "state": null,
      "work_mode": "onsite"
    }
  ],
  "role": "Data Engineer",
  "role_aliases": [
    "Data Engineer",
    "Data Developer",
    "ETL Developer"
  ],
  "role_archetype": "Data",
  "roles_and_responsibilities": [
    {
      "bullet_count": 0,
      "heading": "What you\u0027ll do",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "As a Data Engineer, you\u2019ll",
        "last_5_words": "build a scalable data architecture."
      },
      "text": "As a Data Engineer, you\u2019ll be looking to simplify our organisation by developing innovative data driven solutions through data pipelines, modelling and ETL design, inspiring to be commercially successful while keeping our customers, and the bank\u2019s data, safe and secure.\n\nYou\u2019ll drive customer value by understanding complex business problems and requirements to correctly apply the most appropriate and reusable tool to gather and build data solutions. You\u2019ll support our strategic direction by engaging with the data engineering community to deliver opportunities, along with carrying out complex data engineering tasks to build a scalable data architecture.",
      "word_count": 104
    },
    {
      "bullet_count": 5,
      "heading": "Your responsibilities will also include",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 Building advanced automation of",
        "last_5_words": "in line with our streaming strategy"
      },
      "text": "\u2022 Building advanced automation of data engineering pipelines through removal of manual stages\n\u2022 Embedding new data techniques into our business through role modelling, training, and experiment design oversight\n\u2022 Delivering a clear understanding of data platform costs to meet your departments cost saving and income targets\n\u2022 Sourcing new data using the most appropriate tooling for the situation\n\u2022 Developing solutions for streaming data ingestion and transformations in line with our streaming strategy",
      "word_count": 56
    },
    {
      "bullet_count": 5,
      "heading": "The skills you\u0027ll need",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "To thrive in this role,",
        "last_5_words": "manage a wide range of stakeholders"
      },
      "text": "To thrive in this role, you\u2019ll need a strong understanding of data usage and dependencies and experience of extracting value and features from large scale data. You\u2019ll also bring practical experience of programming languages alongside knowledge of data and software engineering fundamentals.\n\nAdditionally, you\u2019ll need:\n\n\u2022 Experience of ETL technical design, data quality testing, cleansing and monitoring, data sourcing, and exploration and analysis\n\u2022 Data warehousing and data modelling capabilities\n\u2022 A good understanding of modern code development practices\n\u2022 Experience of working in a governed, and regulatory environment\n\u2022 Strong communication skills with the ability to proactively engage and manage a wide range of stakeholders",
      "word_count": 139
    }
  ],
  "urls": []
}
API 1: extract-from-jd
{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "ETL"
    },
    {
      "is_primary": true,
      "skill_name": "Data Modeling"
    },
    {
      "is_primary": true,
      "skill_name": "Data Pipelines"
    },
    {
      "is_primary": true,
      "skill_name": "Streaming Data"
    },
    {
      "is_primary": true,
      "skill_name": "Data Ingestion"
    },
    {
      "is_primary": true,
      "skill_name": "Data Warehousing"
    },
    {
      "is_primary": true,
      "skill_name": "Data Quality Testing"
    },
    {
      "is_primary": true,
      "skill_name": "Data Cleansing"
    },
    {
      "is_primary": true,
      "skill_name": "Data Monitoring"
    },
    {
      "is_primary": true,
      "skill_name": "Data Analysis"
    },
    {
      "is_primary": true,
      "skill_name": "Data Exploration"
    },
    {
      "is_primary": true,
      "skill_name": "Programming Languages"
    },
    {
      "is_primary": true,
      "skill_name": "Software Engineering Fundamentals"
    },
    {
      "is_primary": true,
      "skill_name": "Code Development Practices"
    },
    {
      "is_primary": true,
      "skill_name": "Automation"
    },
    {
      "is_primary": false,
      "skill_name": "Experiment Design"
    },
    {
      "is_primary": true,
      "skill_name": "Data Architecture"
    }
  ],
  "jd_role": {
    "display_name": "Data Engineer",
    "rationale": null,
    "role_aliases": [
      "Data Engineer",
      "Data Developer",
      "ETL Developer"
    ],
    "role_archetype": "Data",
    "slug": ""
  },
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": null,
    "certifications": [],
    "company_name": null,
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [],
        "domain": "Other"
      },
      "secondary": null
    },
    "education": [],
    "experience": null,
    "job_locations": [
      {
        "aliases": [],
        "city": null,
        "country": "India",
        "state": null,
        "work_mode": "onsite"
      }
    ],
    "role": "Data Engineer",
    "role_aliases": [
      "Data Engineer",
      "Data Developer",
      "ETL Developer"
    ],
    "role_archetype": "Data",
    "roles_and_responsibilities": [
      {
        "bullet_count": 0,
        "heading": "What you\u0027ll do",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "As a Data Engineer, you\u2019ll",
          "last_5_words": "build a scalable data architecture."
        },
        "text": "As a Data Engineer, you\u2019ll be looking to simplify our organisation by developing innovative data driven solutions through data pipelines, modelling and ETL design, inspiring to be commercially successful while keeping our customers, and the bank\u2019s data, safe and secure.\n\nYou\u2019ll drive customer value by understanding complex business problems and requirements to correctly apply the most appropriate and reusable tool to gather and build data solutions. You\u2019ll support our strategic direction by engaging with the data engineering community to deliver opportunities, along with carrying out complex data engineering tasks to build a scalable data architecture.",
        "word_count": 104
      },
      {
        "bullet_count": 5,
        "heading": "Your responsibilities will also include",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 Building advanced automation of",
          "last_5_words": "in line with our streaming strategy"
        },
        "text": "\u2022 Building advanced automation of data engineering pipelines through removal of manual stages\n\u2022 Embedding new data techniques into our business through role modelling, training, and experiment design oversight\n\u2022 Delivering a clear understanding of data platform costs to meet your departments cost saving and income targets\n\u2022 Sourcing new data using the most appropriate tooling for the situation\n\u2022 Developing solutions for streaming data ingestion and transformations in line with our streaming strategy",
        "word_count": 56
      },
      {
        "bullet_count": 5,
        "heading": "The skills you\u0027ll need",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "To thrive in this role,",
          "last_5_words": "manage a wide range of stakeholders"
        },
        "text": "To thrive in this role, you\u2019ll need a strong understanding of data usage and dependencies and experience of extracting value and features from large scale data. You\u2019ll also bring practical experience of programming languages alongside knowledge of data and software engineering fundamentals.\n\nAdditionally, you\u2019ll need:\n\n\u2022 Experience of ETL technical design, data quality testing, cleansing and monitoring, data sourcing, and exploration and analysis\n\u2022 Data warehousing and data modelling capabilities\n\u2022 A good understanding of modern code development practices\n\u2022 Experience of working in a governed, and regulatory environment\n\u2022 Strong communication skills with the ability to proactively engage and manage a wide range of stakeholders",
        "word_count": 139
      }
    ],
    "urls": []
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "c42220f8-c0cc-4546-85e0-3207f16c8f75",
  "stage3_signals": {
    "alias_found": true,
    "alias_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 1.0,
        "slug": "data-engineer",
        "total_count": null
      }
    ],
    "kra_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": [
          {
            "kra_text": "Designs dimensional models, star schemas, data vault structures, and curated data mart tables to support BI tools and self-service analytics consumption.",
            "sentence": "Data warehousing and data modelling capabilities",
            "similarity": 0.623
          },
          {
            "kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
            "sentence": "Developing solutions for streaming data ingestion and transformations in line with our streaming strategy",
            "similarity": 0.613
          },
          {
            "kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
            "sentence": "Experience of ETL technical design, data quality testing, cleansing and monitoring, data sourcing, and exploration and analysis",
            "similarity": 0.5706
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 0.6022,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "Java Backend Developer",
        "kra_matches": [
          {
            "kra_text": "persistence and data modeling",
            "sentence": "Data warehousing and data modelling capabilities",
            "similarity": 0.5691
          },
          {
            "kra_text": "persistence and data modeling",
            "sentence": "Embedding new data techniques into our business through role modelling, training, and experiment design oversight",
            "similarity": 0.5151
          },
          {
            "kra_text": "code refactoring and defect fixes",
            "sentence": "A good understanding of modern code development practices",
            "similarity": 0.4735
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 79,
        "score": 0.5192,
        "slug": "java-backend-developer",
        "total_count": null
      },
      {
        "display_name": "Node.js Backend Developer",
        "kra_matches": [
          {
            "kra_text": "data modeling and persistence access",
            "sentence": "Data warehousing and data modelling capabilities",
            "similarity": 0.5597
          },
          {
            "kra_text": "code review and refactoring",
            "sentence": "A good understanding of modern code development practices",
            "similarity": 0.5247
          },
          {
            "kra_text": "data modeling and persistence access",
            "sentence": "Embedding new data techniques into our business through role modelling, training, and experiment design oversight",
            "similarity": 0.4542
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 82,
        "score": 0.5129,
        "slug": "node-backend-developer",
        "total_count": null
      },
      {
        "display_name": "Scala Backend Developer",
        "kra_matches": [
          {
            "kra_text": "application data modeling",
            "sentence": "Data warehousing and data modelling capabilities",
            "similarity": 0.5262
          },
          {
            "kra_text": "application data modeling",
            "sentence": "Embedding new data techniques into our business through role modelling, training, and experiment design oversight",
            "similarity": 0.4731
          },
          {
            "kra_text": "backend workflow orchestration",
            "sentence": "Building advanced automation of data engineering pipelines through removal of manual stages",
            "similarity": 0.4578
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 87,
        "score": 0.4857,
        "slug": "scala-backend-developer",
        "total_count": null
      },
      {
        "display_name": "Ruby Backend Developer",
        "kra_matches": [
          {
            "kra_text": "refactoring and code organization",
            "sentence": "A good understanding of modern code development practices",
            "similarity": 0.5176
          },
          {
            "kra_text": "data access and persistence",
            "sentence": "Data warehousing and data modelling capabilities",
            "similarity": 0.4635
          },
          {
            "kra_text": "automated backend checks",
            "sentence": "Building advanced automation of data engineering pipelines through removal of manual stages",
            "similarity": 0.4293
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 85,
        "score": 0.4701,
        "slug": "ruby-backend-developer",
        "total_count": null
      }
    ],
    "skill_match_roles": []
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "A",
    "chosen_role": {
      "display_name": "Data Engineer",
      "kra_matches": null,
      "matched_count": null,
      "matched_skills": null,
      "role_id": 2,
      "score": 1.0,
      "slug": "data-engineer",
      "total_count": null
    },
    "confidence": 1.0,
    "is_new_role": false,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "matched_dimensions": [],
    "matched_kras": [],
    "matched_skills": [],
    "new_role_display_name": null,
    "new_role_slug": null,
    "queued": false,
    "reasoning": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
    "sub_role": null
  },
  "stage5_updates": {
    "centroid_n_after": 465,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": null,
    "new_skills_attached": [
      {
        "is_primary": true,
        "queue_id": 21796,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "ETL",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21797,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Modeling",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21799,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Pipelines",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21801,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Streaming Data",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21803,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Ingestion",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21804,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Warehousing",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21806,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Quality Testing",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21808,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Cleansing",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21811,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Monitoring",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21813,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Analysis",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21815,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Exploration",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21816,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Programming Languages",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21818,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Software Engineering Fundamentals",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21820,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Code Development Practices",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21822,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Automation",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 21824,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Experiment Design",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 21826,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Architecture",
        "status": "pending"
      }
    ],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}
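Note on the role scores above: for the four backend roles shown with their kra_matches, the reported "score" is consistent with the arithmetic mean of the three listed "similarity" values, rounded to four decimals. This is an observation from this run's output, not confirmed pipeline logic. A minimal sketch that reproduces those numbers follows; the helper name role_score is hypothetical.

```python
from statistics import mean

# Similarity values copied from the kra_matches arrays in the API 1 output above.
kra_similarities = {
    "java-backend-developer": [0.5691, 0.5151, 0.4735],
    "node-backend-developer": [0.5597, 0.5247, 0.4542],
    "scala-backend-developer": [0.5262, 0.4731, 0.4578],
    "ruby-backend-developer": [0.5176, 0.4635, 0.4293],
}


def role_score(similarities):
    """Hypothetical helper: mean KRA similarity, rounded to 4 decimals.

    Assumption: the pipeline averages the listed matches. The real scoring
    code is not shown in this dump.
    """
    return round(mean(similarities), 4)


scores = {slug: role_score(sims) for slug, sims in kra_similarities.items()}

for slug, score in scores.items():
    print(f"{slug}: {score}")
```

The Data Engineer entry in stage4_decision carries a score of 1.0 from the alias hit, so this averaging would not apply to it.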
API 2 — extract-details
{
  "alias_matches": [
    {
      "alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
      "alias_persisted": false,
      "existing_alias_id": 5644,
      "existing_alias_text": "Domain Modeling",
      "input_term": "Data Modeling",
      "matched_canonical": {
        "category_id": 8,
        "display_name": "domain modeling",
        "id": 2379,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "METHODOLOGY",
        "slug": "domain-modeling",
        "sub_category_id": 2831,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "embedding_alias"
    }
  ],
  "candidate_roles": [
    {
      "display_name": ".NET Backend Developer",
      "id": 83,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "dotnet-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Python Backend Developer",
      "id": 80,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "python-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Backend Developer",
      "id": 1,
      "rationale": null,
      "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
      "slug": "backend-engineer",
      "source": "db"
    },
    {
      "display_name": "Java Backend Developer",
      "id": 79,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "java-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Kotlin Backend Developer",
      "id": 84,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "kotlin-server-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Node.js Backend Developer",
      "id": 82,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "node-backend-developer",
      "source": "db"
    },
    {
      "display_name": "PHP Backend Developer",
      "id": 86,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "php-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Ruby Backend Developer",
      "id": 85,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "ruby-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Scala Backend Developer",
      "id": 87,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "scala-backend-developer",
      "source": "db"
    }
  ],
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "dimensions": [
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Application Architecture Patterns",
        "id": 293,
        "rationale": "Structural patterns for organizing Python backend code into maintainable modules, layers, and feature boundaries. This is a coherent cluster because senior backend developers are expected to refactor and shape service internals over time.",
        "slug": "application-architecture-patterns",
        "source": "db"
      },
      "input_skill": "Data Modeling",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": ".NET Backend Developer",
          "id": 83,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "dotnet-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Python Backend Developer",
          "id": 80,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "python-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Service Architecture and Design Patterns",
        "id": 18,
        "rationale": "Reusable backend design patterns used to structure service code and boundaries. Covers layering, dependency management, domain modeling, and maintainable service organization.",
        "slug": "service-architecture-and-design-patterns",
        "source": "db"
      },
      "input_skill": "Data Modeling",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Java Backend Developer",
          "id": 79,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "java-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Kotlin Backend Developer",
          "id": 84,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "kotlin-server-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Node.js Backend Developer",
          "id": 82,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "node-backend-developer",
          "source": "db"
        },
        {
          "display_name": "PHP Backend Developer",
          "id": 86,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "php-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Ruby Backend Developer",
          "id": 85,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "ruby-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Scala Backend Developer",
          "id": 87,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "scala-backend-developer",
          "source": "db"
        }
      ]
    }
  ],
  "input_final_skills": [
    "ETL",
    "Data Modeling",
    "Data Pipelines",
    "Streaming Data",
    "Data Ingestion",
    "Data Warehousing",
    "Data Quality Testing",
    "Data Cleansing",
    "Data Monitoring",
    "Data Analysis",
    "Data Exploration",
    "Programming Languages",
    "Software Engineering Fundamentals",
    "Code Development Practices",
    "Automation",
    "Experiment Design",
    "Data Architecture"
  ],
  "input_llm_skills": [
    "ETL",
    "Data Modeling",
    "Data Pipelines",
    "Streaming Data",
    "Data Ingestion",
    "Data Warehousing",
    "Data Quality Testing",
    "Data Cleansing",
    "Data Monitoring",
    "Data Analysis",
    "Data Exploration",
    "Programming Languages",
    "Software Engineering Fundamentals",
    "Code Development Practices",
    "Automation",
    "Experiment Design",
    "Data Architecture"
  ],
  "new_aliases_persisted": 0,
  "run_id": "c42220f8-c0cc-4546-85e0-3207f16c8f75",
  "skills_detail": [
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "ETL",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "etl",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "domain modeling",
          "alias_type": "CANONICAL",
          "id": 3675,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Domain Modeling",
          "alias_type": "CANONICAL",
          "id": 5644,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 8,
        "display_name": "domain modeling",
        "id": 2379,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "METHODOLOGY",
        "slug": "domain-modeling",
        "sub_category_id": 2831,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Application Architecture Patterns",
            "id": 293,
            "rationale": "Structural patterns for organizing Python backend code into maintainable modules, layers, and feature boundaries. This is a coherent cluster because senior backend developers are expected to refactor and shape service internals over time.",
            "slug": "application-architecture-patterns",
            "source": "db"
          },
          "input_skill": "Data Modeling",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": ".NET Backend Developer",
              "id": 83,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "dotnet-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Python Backend Developer",
              "id": 80,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "python-backend-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Service Architecture and Design Patterns",
            "id": 18,
            "rationale": "Reusable backend design patterns used to structure service code and boundaries. Covers layering, dependency management, domain modeling, and maintainable service organization.",
            "slug": "service-architecture-and-design-patterns",
            "source": "db"
          },
          "input_skill": "Data Modeling",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Java Backend Developer",
              "id": 79,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "java-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Kotlin Backend Developer",
              "id": 84,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "kotlin-server-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Node.js Backend Developer",
              "id": 82,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "node-backend-developer",
              "source": "db"
            },
            {
              "display_name": "PHP Backend Developer",
              "id": 86,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "php-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Ruby Backend Developer",
              "id": 85,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "ruby-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Scala Backend Developer",
              "id": 87,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "scala-backend-developer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Data Modeling",
      "matched_via": "embedding_alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Pipelines",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-pipelines",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Streaming Data",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "streaming-data",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Ingestion",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-ingestion",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Warehousing",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Databases",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-warehousing",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Quality Testing",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-quality-testing",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Cleansing",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-cleansing",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Monitoring",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-monitoring",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Analysis",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-analysis",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Exploration",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-exploration",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Programming Languages",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Programming Languages",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "UNVERSIONED",
          "volatility": "STABLE"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "programming-languages",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Software Engineering Fundamentals",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Software Engineering",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "UNVERSIONED",
          "volatility": "STABLE"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "software-engineering-fundamentals",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Code Development Practices",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Software Engineering",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "code-development-practices",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Automation",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Software Engineering",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "automation",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Experiment Design",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Science",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "experiment-design",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Architecture",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-architecture",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    }
  ],
  "unmatched_skills": [
    "ETL",
    "Data Pipelines",
    "Streaming Data",
    "Data Ingestion",
    "Data Warehousing",
    "Data Quality Testing",
    "Data Cleansing",
    "Data Monitoring",
    "Data Analysis",
    "Data Exploration",
    "Programming Languages",
    "Software Engineering Fundamentals",
    "Code Development Practices",
    "Automation",
    "Experiment Design",
    "Data Architecture"
  ]
}
API 3 — final-role-output
{
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "chosen_role_resolution": "in_db",
  "final_input_skills": [
    {
      "skill": "ETL",
      "tag": "new"
    },
    {
      "skill": "Data Modeling",
      "tag": "in_db"
    },
    {
      "skill": "Data Pipelines",
      "tag": "new"
    },
    {
      "skill": "Streaming Data",
      "tag": "new"
    },
    {
      "skill": "Data Ingestion",
      "tag": "new"
    },
    {
      "skill": "Data Warehousing",
      "tag": "new"
    },
    {
      "skill": "Data Quality Testing",
      "tag": "new"
    },
    {
      "skill": "Data Cleansing",
      "tag": "new"
    },
    {
      "skill": "Data Monitoring",
      "tag": "new"
    },
    {
      "skill": "Data Analysis",
      "tag": "new"
    },
    {
      "skill": "Data Exploration",
      "tag": "new"
    },
    {
      "skill": "Programming Languages",
      "tag": "new"
    },
    {
      "skill": "Software Engineering Fundamentals",
      "tag": "new"
    },
    {
      "skill": "Code Development Practices",
      "tag": "new"
    },
    {
      "skill": "Automation",
      "tag": "new"
    },
    {
      "skill": "Experiment Design",
      "tag": "new"
    },
    {
      "skill": "Data Architecture",
      "tag": "new"
    }
  ],
  "llm_cost_api1_usd": null,
  "llm_cost_api2_usd": null,
  "llm_cost_api3_usd": null,
  "llm_cost_total_usd": null,
  "persistence": {
    "items": [
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Application Architecture Patterns",
          "id": 293,
          "rationale": "Structural patterns for organizing Python backend code into maintainable modules, layers, and feature boundaries. This is a coherent cluster because senior backend developers are expected to refactor and shape service internals over time.",
          "slug": "application-architecture-patterns",
          "source": "db"
        },
        "dimension_id": 293,
        "input_skill": "Data Modeling",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": ".NET Backend Developer",
            "id": 83,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "dotnet-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Python Backend Developer",
            "id": 80,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "python-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": false,
        "skill_id": null,
        "skill_tag": "new",
        "skipped_reason": "skill_not_in_db_v3_proposed"
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Service Architecture and Design Patterns",
          "id": 18,
          "rationale": "Reusable backend design patterns used to structure service code and boundaries. Covers layering, dependency management, domain modeling, and maintainable service organization.",
          "slug": "service-architecture-and-design-patterns",
          "source": "db"
        },
        "dimension_id": 18,
        "input_skill": "Data Modeling",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Java Backend Developer",
            "id": 79,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "java-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Kotlin Backend Developer",
            "id": 84,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "kotlin-server-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Node.js Backend Developer",
            "id": 82,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "node-backend-developer",
            "source": "db"
          },
          {
            "display_name": "PHP Backend Developer",
            "id": 86,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "php-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Ruby Backend Developer",
            "id": 85,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "ruby-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Scala Backend Developer",
            "id": 87,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "scala-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": false,
        "skill_id": null,
        "skill_tag": "new",
        "skipped_reason": "skill_not_in_db_v3_proposed"
      }
    ],
    "new_skills_created": 0,
    "role_dimension_saved": 0,
    "skill_dimension_saved": 0,
    "skipped": 2
  },
  "planner_output": null,
  "run_id": "c42220f8-c0cc-4546-85e0-3207f16c8f75"
}
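The `persistence` block above reports summary counters (`role_dimension_saved`, `skill_dimension_saved`, `skipped`) next to the per-item records. A minimal sketch of how those counters could be cross-checked against `items` follows. The helper name `check_persistence_summary` is hypothetical, the sample is trimmed to the fields the check reads, and it assumes that `skipped` counts items with a non-null `skipped_reason`, which the output suggests but does not confirm.

```python
import json


def check_persistence_summary(payload: dict) -> list[str]:
    """Return one message per counter that disagrees with the item records.

    Assumption: "skipped" counts items whose "skipped_reason" is non-null.
    """
    persistence = payload["persistence"]
    items = persistence["items"]
    observed = {
        "role_dimension_saved": sum(1 for i in items if i["role_dimension_saved"]),
        "skill_dimension_saved": sum(1 for i in items if i["skill_dimension_saved"]),
        "skipped": sum(1 for i in items if i.get("skipped_reason")),
    }
    return [
        f"{name}: summary={persistence[name]} items={count}"
        for name, count in observed.items()
        if persistence[name] != count
    ]


# Trimmed copy of the API 3 response: only the fields the check reads.
sample = json.loads("""
{
  "persistence": {
    "items": [
      {"dimension_id": 293, "input_skill": "Data Modeling",
       "role_dimension_saved": false, "skill_dimension_saved": false,
       "skipped_reason": "skill_not_in_db_v3_proposed"},
      {"dimension_id": 18, "input_skill": "Data Modeling",
       "role_dimension_saved": false, "skill_dimension_saved": false,
       "skipped_reason": "skill_not_in_db_v3_proposed"}
    ],
    "new_skills_created": 0,
    "role_dimension_saved": 0,
    "skill_dimension_saved": 0,
    "skipped": 2
  }
}
""")

mismatches = check_persistence_summary(sample)
print(mismatches or "summary counters match items")
```

Under that assumption, both items in this run count as skipped and neither saved flag is set, so the counters and the item records agree.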