Pipeline run

02b5b632-bd0f-4c80-bf47-e28f87b2cf45

Pipeline LLM cost (USD)

API 1: $0.0044 · API 2: $0.0004 · API 3: $0.0000 · Total: $0.0048
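The total is the sum of the three per-API costs. A minimal check, using `Decimal` so the four-decimal figures add exactly (the variable names are illustrative and not part of the pipeline):

```python
from decimal import Decimal

# Per-API LLM costs in USD, copied from the run summary above.
api_costs = {
    "API 1": Decimal("0.0044"),
    "API 2": Decimal("0.0004"),
    "API 3": Decimal("0.0000"),
}

# Start from Decimal("0") so sum() never mixes in a float or int zero.
total_cost = sum(api_costs.values(), Decimal("0"))
print(f"Total: ${total_cost}")  # Total: $0.0048
```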

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description

role baseline loaded sources · ai_index: jd · nature_of_work: jd · tech_stack_maturity: jd

Nature of work · Data pipeline development

Build and maintain scalable ETL pipelines in Apache Airflow, automate data-platform deployment/monitoring, and troubleshoot reliability, performance, and data-quality issues with engineering/DevOps teams.

"Design, develop, and maintain robust and scalable data pipelines"

Tech stack maturity

Mainstream Modern

Apache Airflow is a widely adopted data orchestration tool commonly used in modern data engineering stacks.

AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)

0.00 / 5

· Title match

· Has AI skill

· AI skill (primary)

· AI skill (secondary)

· On AI team

· Builds AI products

vocab breakdown (legacy)

Assistants (×1): —

Frameworks (×2): —

Models / concepts (×3): —

Evidence — skills matched in JD (16)

Apache Airflow · ETL · Data Pipelines · Infrastructure as Code · Auto-scaling · Monitoring · Alerting · Failover · High Availability · Reliability · Performance · Data Security · Access Control · DevOps · Testing · Validation

Skill cluster (4 dimension groups, role-scoped)

CI/CD Pipeline Platforms

DevOps

Data Pipeline Orchestration

Apache Airflow

Observability and Incident Response

Alerting

Cross-cutting / unaligned

ETL · Data Pipelines · Infrastructure as Code · Auto-scaling · Monitoring · Failover · High Availability · Reliability · Performance · Data Security · Access Control · Testing · Validation

KRA description

• Design, develop, and maintain robust and scalable data pipelines that support the extraction, transformation, and loading of data from various sources into our data platform.
• Build and extend our automation tools for infrastructure provisioning, auto-scaling, code deployment, monitoring, alerting, reporting, and failover.
• Collaborate with the engineering team to build and maintain a resilient data platform, ensuring high availability, reliability, and performance.
• Leverage your expertise in Apache Airflow to orchestrate and manage complex ETL workflows efficiently.
• Optimize data pipeline performance, troubleshoot issues, and ensure data quality and accuracy through monitoring, testing, and validation.
• Keep up to date with the latest trends in data engineering and evaluate and introduce new technologies and tools to improve our data infrastructure.
• Drive the design and implementation of data security and access control policies to protect sensitive information.
• Collaborate with the DevOps team to ensure seamless deployment and monitoring of data pipelines and workflows.

Signals

Skill data-engineer

0.33

Alias data-engineer

1.00

KRA data-engineer

0.63

Post-classification

Centroid: updated · n=405

Alias collision log: —

New-role queue: —

New skills captured: 11

New KRA captured: —

Captured for admin review

ETL primary ↔ Data Engineer pending

Data Pipelines primary ↔ Data Engineer pending

Infrastructure as Code ↔ Data Engineer pending

Auto-scaling ↔ Data Engineer pending

Failover ↔ Data Engineer pending

Reliability ↔ Data Engineer pending

Performance ↔ Data Engineer pending

Data Security ↔ Data Engineer pending

Access Control ↔ Data Engineer pending

Testing ↔ Data Engineer pending

Validation ↔ Data Engineer pending

Status: completed · Created: 2026-05-27T16:09:38.590513Z · Updated: 2026-05-27T16:11:04.992043Z · API 3 duration: 24516 ms
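To put the API 3 duration in context, the gap between the two timestamps can be computed directly. This is a small sketch; treating the Created-to-Updated window as the run's wall-clock time is an assumption, since the page does not say what triggers the Updated timestamp.

```python
from datetime import datetime


def parse_utc(ts: str) -> datetime:
    # Swap the trailing "Z" for an explicit offset so this also works on
    # Python versions whose fromisoformat() does not accept "Z".
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


created = parse_utc("2026-05-27T16:09:38.590513Z")
updated = parse_utc("2026-05-27T16:11:04.992043Z")

elapsed_s = (updated - created).total_seconds()
api3_s = 24516 / 1000  # API 3 duration is reported in milliseconds

print(f"created -> updated: {elapsed_s:.3f} s")
print(f"API 3 share of that window: {api3_s / elapsed_s:.1%}")
```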

Flow Current 3-step pipeline

1 POST /skills/extract-from-jd

2 POST /skills/extract-details

3 POST /skills/final-role-output
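The three steps above can be sketched as a simple chain. Only the endpoint paths come from this page; the `run_pipeline` name, the injected `post` callable, and the payload shape are hypothetical, chosen so the example runs without a network.

```python
from typing import Any, Callable, Dict, List

# Endpoint paths as listed in the flow above; everything else is assumed.
PIPELINE_STEPS: List[str] = [
    "/skills/extract-from-jd",
    "/skills/extract-details",
    "/skills/final-role-output",
]

PostFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_pipeline(post: PostFn, jd_text: str) -> Dict[str, Any]:
    """Call the three steps in order, feeding each response into the next.

    The {"jd_text": ...} payload and the pass-through of each response are
    illustrative assumptions, not the real request contract.
    """
    payload: Dict[str, Any] = {"jd_text": jd_text}
    for path in PIPELINE_STEPS:
        payload = post(path, payload)
    return payload


# Stand-in transport that records the call order instead of doing HTTP.
calls: List[str] = []


def fake_post(path: str, body: Dict[str, Any]) -> Dict[str, Any]:
    calls.append(path)
    return {**body, "last_step": path}


result = run_pipeline(fake_post, "Senior Data Platform Engineer ...")
print(calls)
```

Injecting the transport keeps the ordering logic testable; a real client would pass a function that performs the HTTP POST.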

Role Chosen role & resolution

Data Engineer

CASE A

slug: data-engineer · id: 2 · source: db

Exact alias hit on data-engineer (1.0) — no other alias at this confidence; skill_top data-engineer 0.33 does not contradict

Resolution: in_db — role exists in library; skill↔dim and role↔dim links saved when applicable.
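The resolution note above reads like a rule: accept a role when exactly one alias scores 1.0 and the top skill signal does not point to a different role. A rough sketch of that reading follows; the function name and the interpretation of "contradict" are assumptions, and the real resolver may also apply thresholds that are not shown on this page.

```python
from typing import Dict, Optional, Tuple


def resolve_by_exact_alias(
    alias_scores: Dict[str, float],
    skill_top: Tuple[str, float],
) -> Optional[str]:
    """Return a role slug when exactly one alias scores 1.0 and the top
    skill-signal role does not point elsewhere; otherwise return None."""
    exact_hits = [slug for slug, score in alias_scores.items() if score == 1.0]
    if len(exact_hits) != 1:
        return None  # no exact hit, or a tie at full confidence
    chosen = exact_hits[0]
    skill_slug, _skill_score = skill_top
    # Assumption: "contradict" means the skill signal prefers another role,
    # regardless of how low its score is.
    if skill_slug != chosen:
        return None
    return chosen


# Values taken from the Signals panel of this run.
role = resolve_by_exact_alias({"data-engineer": 1.00}, ("data-engineer", 0.33))
print(role)  # data-engineer
```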

New skills

Skill↔dim saved

Role↔dim saved

Skipped

Job description

Job Description

NIQ is looking for a Senior Data Platform Engineer to join our Financial Services Engineering team.

At NIQ, the Financial Services team uses alternative datasets to help global public equity investors (hedge funds, mutual funds, pension funds) make better investment decisions. We work with some of the largest hedge funds in the world. As an Infrastructure Engineer, you will be at the cutting edge of the alternative data space where you will help maintain and improve our data infrastructure, which enables us to develop market research products and deliver data to our customers. In this role, you would also get the opportunity to work with world-class big data and cloud services, such as: AWS, Azure, Snowflake, Databricks, DBT, Airflow, and Looker. Apply now to start taking your career to the next level.

Who we are looking for:

• You have a strong entrepreneurial spirit and a thirst to solve difficult challenges through innovation and creativity with a strong focus on results
• You have a passion for data and the insights it can deliver
• You are intellectually curious with a broad range of interests and hobbies
• You take ownership of your deliverables
• You have excellent analytical, communication, and interpersonal skills
• You have excellent communication skills with both technical and non-technical audiences
• You can work with distributed teams situated globally in different geographies
• You want to work in a small team with a start-up mentality
• You can work well under pressure, prioritize work and be well organized. Relish tackling new challenges, paying attention to details, and, ultimately, growing professionally.

Responsibilities:

• Design, develop, and maintain robust and scalable data pipelines that support the extraction, transformation, and loading of data from various sources into our data platform.
• Build and extend our automation tools for infrastructure provisioning, auto-scaling, code deployment, monitoring, alerting, reporting, and failover.
• Collaborate with the engineering team to build and maintain a resilient data platform, ensuring high availability, reliability, and performance.
• Leverage your expertise in Apache Airflow to orchestrate and manage complex ETL workflows efficiently.
• Optimize data pipeline performance, troubleshoot issues, and ensure data quality and accuracy through monitoring, testing, and validation.
• Keep up to date with the latest trends in data engineering and evaluate and introduce new technologies and tools to improve our data infrastructure.
• Drive the design and implementation of data security and access control policies to protect sensitive information.
• Collaborate with the DevOps team to ensure seamless deployment and monitoring of data pipelines and workflows.

Qualifications

• Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
• 7+ years of experience as a Data Engineer with a strong background in ETL processes and data warehousing.
• 4+ years of experience with Apache Airflow for orchestrating data pipelines.
• Proficiency in using ETL tools and frameworks such as Apache Spark, Matillion.
• Strong software engineering fundamentals, including proficiency in Python, Java, or other relevant languages.
• Knowledge of data modeling, data warehousing, and SQL.
• Expertise in working with both structured and unstructured data.
• Experience with cloud-based data solutions, such as AWS, GCP, or Azure.
• Solid understanding of data storage, data transformation, and data integration concepts.
• Excellent problem-solving skills and the ability to work in a fast-paced, collaborative environment.
• Strong communication and team collaboration skills.
• Knowledge of data security and encryption best practices is a plus.

Additional Information

• Enjoy a flexible and rewarding work environment with peer-to-peer recognition platforms
• Recharge and revitalize with help of wellness plans made for you and your family
• Plan your future with financial wellness tools
• Stay relevant and upskill yourself with career development opportunities.

Our Benefits

• Flexible working environment
• Volunteer time off
• LinkedIn Learning
• Employee-Assistance-Program (EAP)

About NIQ

NIQ is the world’s leading consumer intelligence company, delivering the most complete understanding of consumer buying behavior and revealing new pathways to growth. In 2023, NIQ combined with GfK, bringing together the two industry leaders with unparalleled global reach. With a holistic retail read and the most comprehensive consumer insights—delivered with advanced analytics through state-of-the-art platforms—NIQ delivers the Full View™. NIQ is an Advent International portfolio company with operations in 100+ markets, covering more than 90% of the world’s population.

For more information, visit NIQ.com

Our commitment to Diversity, Equity, and Inclusion

NIQ is committed to reflecting the diversity of the clients, communities, and markets we measure within our own workforce. We exist to count everyone and are on a mission to systematically embed inclusion and diversity into all aspects of our workforce, measurement, and products. We enthusiastically invite candidates who share that mission to join us. We are proud to be an Equal Opportunity/Affirmative Action-Employer, making decisions without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability status, age, marital status, protected veteran status or any other protected class. Our global non-discrimination policy covers these protected classes in every market in which we do business worldwide. Learn more about how we are driving diversity and inclusion in everything we do by visiting the NIQ News Center: https://nielseniq.com/global/en/news-center/diversity-inclusion

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.
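A minimal sketch of that merge, keyed by skill name. The `skill_name` and `is_primary` fields match the API 1 output shown further down this page; the two lookup dictionaries and the output field names are hypothetical stand-ins for the API 2 and API 3 results.

```python
from typing import Any, Dict, List


def merge_skill_rows(
    api1_skills: List[Dict[str, Any]],
    api2_match: Dict[str, str],
    api3_tag: Dict[str, str],
) -> List[Dict[str, str]]:
    """Build one display row per extracted skill, in API 1 order."""
    rows = []
    for skill in api1_skills:
        name = skill["skill_name"]
        rows.append({
            "skill_name": name,
            "priority": "Primary" if skill["is_primary"] else "Secondary",
            "match": api2_match.get(name, "unknown"),
            "tag": api3_tag.get(name, "unknown"),
        })
    return rows


api1_skills = [
    {"skill_name": "Apache Airflow", "is_primary": True},
    {"skill_name": "ETL", "is_primary": True},
    {"skill_name": "Monitoring", "is_primary": False},
]
# Hypothetical per-skill lookups for the other two stages.
api2_match = {"Apache Airflow": "Library skill", "ETL": "New / orchestrated",
              "Monitoring": "Library skill"}
api3_tag = {"Apache Airflow": "in_db", "ETL": "new", "Monitoring": "in_db"}

for row in merge_skill_rows(api1_skills, api2_match, api3_tag):
    print(row["skill_name"], row["priority"], row["match"], row["tag"])
```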

Apache Airflow Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)

Canonical: Apache Airflow id=110 · apache-airflow

Aliases — catalog

Apache Airflow (CANONICAL) primary

Context tags (catalog)

CeleryExecutor · DAG · ETL · KubernetesExecutor · Sensors · XCom · backfill · catchup · cron · data pipelines · executor · hooks · operators · scheduler · task dependencies

Stored enrichment (catalog DB)

Category: Tool
Sub-category: Workflow Orchestration Tool
Vendor: Apache Software Foundation
License: apache_2
Year introduced: 2015
Confidence: 0.98
Version strategy: NOT_APPLICABLE

Maturity reasoning: Frequently listed in data engineering JDs and widely adopted for workflow orchestration; strong GitHub activity and managed offerings from AWS/GCP/Azure signal broad market demand.

Skill profile (library / DB)

Skill nature: TOOL
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 13
Sub-category id: 130
Extractable: True
Also category: False

Dimensions (API 2 worklist)

Data Pipeline Orchestration Catalog dimension db id 23

Library dimension (catalog)

Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
Data Pipeline Orchestration data-pipeline-orchestration	✓	✓	Existing dimension (library) · Role↔dimension saved

ETL Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Data Pipelines Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Infrastructure as Code Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Infrastructure Tools
Sub-category: general
Skill nature: CONCEPT
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Auto-scaling Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)

Canonical: autoscaling id=858 · autoscaling

Aliases — catalog

autoscaling (CANONICAL) primary

Context tags (catalog)

AWS Auto Scaling · Kubernetes · capacity planning · cloud infrastructure · container orchestration · cost efficiency · dynamic scaling · elasticity · horizontal scaling · load balancing · performance tuning · resource optimization · scaling policies · serverless architecture · vertical scaling

Stored enrichment (catalog DB)

Category: Concept
Sub-category: Scaling Concept
Confidence: 0.93
Version strategy: NOT_APPLICABLE

Maturity reasoning: Autoscaling is a standard cloud/Kubernetes capability and appears routinely in AWS, GCP, Azure, and Kubernetes job descriptions, with vendor docs and managed services built around it.

Skill profile (library / DB)

Skill nature: CONCEPT
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 2
Sub-category id: 604
Extractable: True
Also category: False

Dimensions (API 2 worklist)

Container Orchestration Platforms Catalog dimension db id 134

Library dimension (catalog)

Roles linked in library: Cloud Architect, DevOps Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
Container Orchestration Platforms container-orchestration-platforms	—	—	Skipped — no persistable v3 meta for new skill skill_not_in_db_v3_proposed

Monitoring Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)

Canonical: Monitoring id=1218 · monitoring

Aliases — catalog

Monitoring (CANONICAL)

Context tags (catalog)

ELK Stack · Grafana · Prometheus · SLI · SLO · alerting · anomaly detection · dashboards · health checks · incident response · logging · metrics · monitoring as code · observability · tracing

Stored enrichment (catalog DB)

Category: Concept
Sub-category: Observability Monitoring
Confidence: 0.88
Version strategy: NOT_APPLICABLE

Maturity reasoning: Monitoring is a standard requirement in most SRE/DevOps job descriptions and is bundled into major platforms like AWS CloudWatch, Datadog, and Prometheus, indicating broad market adoption.

Skill profile (library / DB)

Skill nature: CONCEPT
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 2
Sub-category id: 924
Extractable: True
Also category: False

Dimensions (API 2 worklist)

Observability and Incident Triage Catalog dimension db id 155

Library dimension (catalog)

Roles linked in library: DevOps Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
Observability and Incident Triage observability-and-incident-triage	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

Alerting Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)

Canonical: alerting id=882 · alerting

Aliases — catalog

alerting (CANONICAL) primary

Context tags (catalog)

Grafana · SLA · SLA compliance · SLAs · SLIs · SLOs · alert fatigue · alert management · alert prioritization · alerting frameworks · alerting policies · alerting rules · alerting systems · alertmanager · anomaly detection · dashboard · dashboards · escalation policies · grafana · incident response · log analysis · metrics · monitoring · notifications · observability · prometheus · real-time alerts · root cause analysis · thresholds · webhooks

Stored enrichment (catalog DB)

Category: Concept
Sub-category: Alerting
Confidence: 0.90
Version strategy: NOT_APPLICABLE

Maturity reasoning: Alerting is a standard SRE/DevOps requirement and appears in many JDs alongside Prometheus, Grafana, PagerDuty, and Datadog; vendors actively market alerting features rather than sunsetting them.

Skill profile (library / DB)

Skill nature: CONCEPT
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 2
Sub-category id: 3472
Extractable: True
Also category: False

Dimensions (API 2 worklist)

Backend Observability, Logging, and Diagnostics Catalog dimension db id 388

Library dimension (catalog)

Roles linked in library: Kotlin Backend Developer, Scala Backend Developer

Observability and Incident Response Catalog dimension db id 10

Library dimension (catalog)

Roles linked in library: .NET Backend Developer, Backend Developer, Node.js Backend Developer, PHP Backend Developer

Observability and Incident Triage Catalog dimension db id 155

Library dimension (catalog)

Roles linked in library: DevOps Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
Backend Observability, Logging, and Diagnostics backend-observability-logging-and-diagnostics	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Observability and Incident Response observability-and-incident-response	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Observability and Incident Triage observability-and-incident-triage	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

Failover Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Infrastructure Tools
Sub-category: general
Skill nature: CONCEPT
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

High Availability Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)

Canonical: high availability id=764 · high-availability

Aliases — catalog

high availability (CANONICAL) primary

Context tags (catalog)

RPO · RTO · SLA · active-active · active-passive · clustering · disaster recovery · failover · fault tolerance · heartbeat · load balancing · redundancy · replication · rolling upgrade · zero downtime

Stored enrichment (catalog DB)

Category: Concept
Sub-category: Reliability Concept
Confidence: 0.92
Version strategy: NOT_APPLICABLE

Maturity reasoning: High availability is a standard requirement in cloud/SRE job descriptions and vendor docs; AWS, Azure, and GCP all publish HA reference architectures, showing broad market adoption.

Skill profile (library / DB)

Skill nature: CONCEPT
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 2
Sub-category id: 535
Extractable: True
Also category: False

Dimensions (API 2 worklist)

Availability and Disaster Recovery Catalog dimension db id 141

Library dimension (catalog)

Roles linked in library: Cloud Architect

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
Availability and Disaster Recovery availability-and-disaster-recovery	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

Reliability Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Soft Skills
Sub-category: general
Skill nature: CONCEPT
Volatility: STABLE
Typical lifespan: EVERGREEN
Version strategy: UNVERSIONED

Performance Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Soft Skills
Sub-category: general
Skill nature: CONCEPT
Volatility: STABLE
Typical lifespan: EVERGREEN
Version strategy: UNVERSIONED

Data Security Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Security Tools
Sub-category: general
Skill nature: CONCEPT
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Access Control Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Security Tools
Sub-category: general
Skill nature: CONCEPT
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

DevOps Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)

Canonical: DevOps id=1216 · devops

Aliases — catalog

DevOps (CANONICAL)

Context tags (catalog)

Agile · Ansible · Automation · CI/CD · Cloud-native · Continuous Deployment · Continuous Integration · Docker · GitOps · Infrastructure as Code · Jenkins · Kubernetes · Microservices · Monitoring · SRE · Terraform

Stored enrichment (catalog DB)

Category: Methodology
Sub-category: Devops Methodology
Confidence: 0.97
Version strategy: NOT_APPLICABLE

Maturity reasoning: DevOps appears in a large share of software and platform engineering job descriptions, often alongside CI/CD, Kubernetes, and cloud tooling; it is a standard hiring-pipeline keyword rather than a niche specialty.

Skill profile (library / DB)

Skill nature: METHODOLOGY
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 8
Sub-category id: 922
Extractable: True
Also category: False

Dimensions (API 2 worklist)

CI/CD Pipeline Platforms Catalog dimension db id 150

Library dimension (catalog)

Roles linked in library: DevOps Engineer

Deployment and Release Patterns Catalog dimension db id 140

Library dimension (catalog)

Roles linked in library: Cloud Architect

Infrastructure as Code Catalog dimension db id 132

Library dimension (catalog)

Roles linked in library: Cloud Architect, DevOps Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
CI/CD Pipeline Platforms ci-cd-pipeline-platforms	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Deployment and Release Patterns deployment-and-release-patterns	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Infrastructure as Code infrastructure-as-code	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

Testing Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Testing Tools
Sub-category: general
Skill nature: PRACTICE
Volatility: STABLE
Typical lifespan: EVERGREEN
Version strategy: UNVERSIONED

Validation Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Testing Tools
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

All API 3 persistence rows

Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.

Skill	Tag	Dimension	Skill↔dim	Role↔dim	Outcome	Notes
Apache Airflow	in_db	Data Pipeline Orchestration data-pipeline-orchestration	✓	✓	Existing dimension (library) · Role↔dimension saved
Auto-scaling	new	Container Orchestration Platforms container-orchestration-platforms	—	—	Skipped — no persistable v3 meta for new skill	skill_not_in_db_v3_proposed
Monitoring	in_db	Observability and Incident Triage observability-and-incident-triage	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Alerting	in_db	Backend Observability, Logging, and Diagnostics backend-observability-logging-and-diagnostics	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Alerting	in_db	Observability and Incident Response observability-and-incident-response	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Alerting	in_db	Observability and Incident Triage observability-and-incident-triage	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
High Availability	in_db	Availability and Disaster Recovery availability-and-disaster-recovery	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
DevOps	in_db	CI/CD Pipeline Platforms ci-cd-pipeline-platforms	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
DevOps	in_db	Deployment and Release Patterns deployment-and-release-patterns	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
DevOps	in_db	Infrastructure as Code infrastructure-as-code	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
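The pattern in the grid above can be sketched as follows: one row per (skill × dimension) pair, a Skill↔dim link only for skills tagged `in_db`, and a Role↔dim link only when the dimension is also listed under the chosen role. These rules are inferred from the rows shown here, not taken from the API 3 implementation, and all names in the code are hypothetical.

```python
from typing import Dict, List, NamedTuple, Set


class PersistenceRow(NamedTuple):
    skill: str
    dimension: str
    skill_dim_saved: bool
    role_dim_saved: bool


def build_rows(
    skill_tags: Dict[str, str],
    skill_dims: Dict[str, List[str]],
    dim_roles: Dict[str, Set[str]],
    chosen_role: str,
) -> List[PersistenceRow]:
    """Expand skills into one work item per (skill, dimension) pair."""
    rows = []
    for skill, dims in skill_dims.items():
        for dim in dims:
            # Inferred rule: skills tagged "new" are skipped entirely.
            in_db = skill_tags[skill] == "in_db"
            # Inferred rule: role link needs the dimension under the role.
            under_role = chosen_role in dim_roles.get(dim, set())
            rows.append(PersistenceRow(skill, dim, in_db, in_db and under_role))
    return rows


# Three rows from the grid above, with roles taken from the skill cards.
skill_tags = {"Apache Airflow": "in_db", "Auto-scaling": "new", "Monitoring": "in_db"}
skill_dims = {
    "Apache Airflow": ["Data Pipeline Orchestration"],
    "Auto-scaling": ["Container Orchestration Platforms"],
    "Monitoring": ["Observability and Incident Triage"],
}
dim_roles = {
    "Data Pipeline Orchestration": {"Data Engineer"},
    "Container Orchestration Platforms": {"Cloud Architect", "DevOps Engineer"},
    "Observability and Incident Triage": {"DevOps Engineer"},
}

rows = build_rows(skill_tags, skill_dims, dim_roles, "Data Engineer")
for row in rows:
    print(row)
```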

Library artifacts (this run)

Kind	Detail	DB id
canonical_skill_proposed	ETL | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed	Data Pipelines | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed	Infrastructure as Code | type=Infrastructure Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR
canonical_skill_proposed	Failover | type=Infrastructure Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR
canonical_skill_proposed	Reliability | type=Soft Skills subtype=general nature=CONCEPT lifespan=EVERGREEN
canonical_skill_proposed	Performance | type=Soft Skills subtype=general nature=CONCEPT lifespan=EVERGREEN
canonical_skill_proposed	Data Security | type=Security Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR
canonical_skill_proposed	Access Control | type=Security Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR
canonical_skill_proposed	Testing | type=Testing Tools subtype=general nature=PRACTICE lifespan=EVERGREEN
canonical_skill_proposed	Validation | type=Testing Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
dimension_skill_link_proposed	Auto-scaling ↔ Container Orchestration Platforms

nano JD Parser — gpt-4.1-nano

Role: Senior Data Platform Engineer

Company: NIQ

Experience: 7+ years of experience as a Data Engineer

Domain: IT Services & Consulting

JD type: pass

Raw JSON

{
  "JD_type": "pass",
  "about_company": {
    "source_marker": {
      "first_5_words": "NIQ is the world\u2019s leading",
      "last_5_words": "the world\u2019s population."
    },
    "text": "NIQ is the world\u2019s leading consumer intelligence company, delivering the most complete understanding of consumer buying behavior and revealing new pathways to growth. In 2023, NIQ combined with GfK, bringing together the two industry leaders with unparalleled global reach. With a holistic retail read and the most comprehensive consumer insights\u2014delivered with advanced analytics through state-of-the-art platforms\u2014NIQ delivers the Full View\u2122. NIQ is an Advent International portfolio company with operations in 100+ markets, covering more than 90% of the world\u2019s population.",
    "word_count": 64
  },
  "certifications": [],
  "company_name": "NIQ",
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [
        "Tech Consulting",
        "Data Services"
      ],
      "domain": "IT Services \u0026 Consulting"
    },
    "secondary": null
  },
  "education": [
    {
      "level": "Bachelor\u0027s",
      "qualification": "BTECH/BE/BSC - Computer Science (or related)",
      "raw": "Bachelor\u0027s or Master\u0027s degree in Computer Science, Data Engineering, or a related field.",
      "requirement": "required"
    }
  ],
  "experience": {
    "max": null,
    "min": 7,
    "raw": "7+ years of experience as a Data Engineer"
  },
  "job_locations": [],
  "role": "Senior Data Platform Engineer",
  "role_aliases": [
    "Data Engineer",
    "Senior Data Engineer",
    "Data Platform Engineer"
  ],
  "role_archetype": "Data",
  "roles_and_responsibilities": [
    {
      "bullet_count": 8,
      "heading": "Responsibilities",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 Design, develop, and maintain",
        "last_5_words": "deployment and monitoring of data"
      },
      "text": "\u2022 Design, develop, and maintain robust and scalable data pipelines that support the extraction, transformation, and loading of data from various sources into our data platform.\n\u2022 Build and extend our automation tools for infrastructure provisioning, auto-scaling, code deployment, monitoring, alerting, reporting, and failover.\n\u2022 Collaborate with the engineering team to build and maintain a resilient data platform, ensuring high availability, reliability, and performance.\n\u2022 Leverage your expertise in Apache Airflow to orchestrate and manage complex ETL workflows efficiently.\n\u2022 Optimize data pipeline performance, troubleshoot issues, and ensure data quality and accuracy through monitoring, testing, and validation.\n\u2022 Keep up to date with the latest trends in data engineering and evaluate and introduce new technologies and tools to improve our data infrastructure.\n\u2022 Drive the design and implementation of data security and access control policies to protect sensitive information.\n\u2022 Collaborate with the DevOps team to ensure seamless deployment and monitoring of data pipelines and workflows.",
      "word_count": 203
    }
  ],
  "urls": [
    {
      "type": "other",
      "url": "https://nielseniq.com/global/en/news-center/diversity-inclusion"
    },
    {
      "type": "website",
      "url": "https://niq.com"
    }
  ]
}

API 1 — extract-from-jd

{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Apache Airflow"
    },
    {
      "is_primary": true,
      "skill_name": "ETL"
    },
    {
      "is_primary": true,
      "skill_name": "Data Pipelines"
    },
    {
      "is_primary": false,
      "skill_name": "Infrastructure as Code"
    },
    {
      "is_primary": false,
      "skill_name": "Auto-scaling"
    },
    {
      "is_primary": false,
      "skill_name": "Monitoring"
    },
    {
      "is_primary": false,
      "skill_name": "Alerting"
    },
    {
      "is_primary": false,
      "skill_name": "Failover"
    },
    {
      "is_primary": false,
      "skill_name": "High Availability"
    },
    {
      "is_primary": false,
      "skill_name": "Reliability"
    },
    {
      "is_primary": false,
      "skill_name": "Performance"
    },
    {
      "is_primary": false,
      "skill_name": "Data Security"
    },
    {
      "is_primary": false,
      "skill_name": "Access Control"
    },
    {
      "is_primary": false,
      "skill_name": "DevOps"
    },
    {
      "is_primary": false,
      "skill_name": "Testing"
    },
    {
      "is_primary": false,
      "skill_name": "Validation"
    }
  ],
  "jd_role": {
    "display_name": "Senior Data Platform Engineer",
    "rationale": null,
    "role_aliases": [
      "Data Engineer",
      "Senior Data Engineer",
      "Data Platform Engineer"
    ],
    "role_archetype": "Data",
    "slug": ""
  },
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": {
      "source_marker": {
        "first_5_words": "NIQ is the world\u2019s leading",
        "last_5_words": "the world\u2019s population."
      },
      "text": "NIQ is the world\u2019s leading consumer intelligence company, delivering the most complete understanding of consumer buying behavior and revealing new pathways to growth. In 2023, NIQ combined with GfK, bringing together the two industry leaders with unparalleled global reach. With a holistic retail read and the most comprehensive consumer insights\u2014delivered with advanced analytics through state-of-the-art platforms\u2014NIQ delivers the Full View\u2122. NIQ is an Advent International portfolio company with operations in 100+ markets, covering more than 90% of the world\u2019s population.",
      "word_count": 64
    },
    "certifications": [],
    "company_name": "NIQ",
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [
          "Tech Consulting",
          "Data Services"
        ],
        "domain": "IT Services \u0026 Consulting"
      },
      "secondary": null
    },
    "education": [
      {
        "level": "Bachelor\u0027s",
        "qualification": "BTECH/BE/BSC - Computer Science (or related)",
        "raw": "Bachelor\u0027s or Master\u0027s degree in Computer Science, Data Engineering, or a related field.",
        "requirement": "required"
      }
    ],
    "experience": {
      "max": null,
      "min": 7,
      "raw": "7+ years of experience as a Data Engineer"
    },
    "job_locations": [],
    "role": "Senior Data Platform Engineer",
    "role_aliases": [
      "Data Engineer",
      "Senior Data Engineer",
      "Data Platform Engineer"
    ],
    "role_archetype": "Data",
    "roles_and_responsibilities": [
      {
        "bullet_count": 8,
        "heading": "Responsibilities",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 Design, develop, and maintain",
          "last_5_words": "deployment and monitoring of data"
        },
        "text": "\u2022 Design, develop, and maintain robust and scalable data pipelines that support the extraction, transformation, and loading of data from various sources into our data platform.\n\u2022 Build and extend our automation tools for infrastructure provisioning, auto-scaling, code deployment, monitoring, alerting, reporting, and failover.\n\u2022 Collaborate with the engineering team to build and maintain a resilient data platform, ensuring high availability, reliability, and performance.\n\u2022 Leverage your expertise in Apache Airflow to orchestrate and manage complex ETL workflows efficiently.\n\u2022 Optimize data pipeline performance, troubleshoot issues, and ensure data quality and accuracy through monitoring, testing, and validation.\n\u2022 Keep up to date with the latest trends in data engineering and evaluate and introduce new technologies and tools to improve our data infrastructure.\n\u2022 Drive the design and implementation of data security and access control policies to protect sensitive information.\n\u2022 Collaborate with the DevOps team to ensure seamless deployment and monitoring of data pipelines and workflows.",
        "word_count": 203
      }
    ],
    "urls": [
      {
        "type": "other",
        "url": "https://nielseniq.com/global/en/news-center/diversity-inclusion"
      },
      {
        "type": "website",
        "url": "https://niq.com"
      }
    ]
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "02b5b632-bd0f-4c80-bf47-e28f87b2cf45",
  "stage3_signals": {
    "alias_found": true,
    "alias_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 1.0,
        "slug": "data-engineer",
        "total_count": null
      }
    ],
    "kra_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": [
          {
            "kra_text": "Builds data ingestion pipelines to collect data from transactional databases, third-party APIs, event streams, and file sources into centralized data platforms.",
            "sentence": "Design, develop, and maintain robust and scalable data pipelines that support the extraction, transformation, and loading of data from various sources into our data platform.",
            "similarity": 0.6693
          },
          {
            "kra_text": "Monitors pipeline health, SLA breach alerts, and job failure notifications, and performs root cause analysis for data pipeline incidents.",
            "sentence": "Optimize data pipeline performance, troubleshoot issues, and ensure data quality and accuracy through monitoring, testing, and validation.",
            "similarity": 0.6324
          },
          {
            "kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
            "sentence": "Leverage your expertise in Apache Airflow to orchestrate and manage complex ETL workflows efficiently.",
            "similarity": 0.5999
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 0.6339,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "DevOps Engineer",
        "kra_matches": [
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Collaborate with the DevOps team to ensure seamless deployment and monitoring of data pipelines and workflows.",
            "similarity": 0.68
          },
          {
            "kra_text": "Monitors CI/CD pipeline reliability, identifies bottlenecks in delivery workflows, and improves deployment frequency, lead time, and failure recovery rate.",
            "sentence": "Optimize data pipeline performance, troubleshoot issues, and ensure data quality and accuracy through monitoring, testing, and validation.",
            "similarity": 0.597
          },
          {
            "kra_text": "Builds and maintains CI/CD pipelines using Jenkins, GitHub Actions, GitLab CI, or CircleCI to automate build, test, security scanning, and deployment workflows.",
            "sentence": "Build and extend our automation tools for infrastructure provisioning, auto-scaling, code deployment, monitoring, alerting, reporting, and failover.",
            "similarity": 0.5897
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 10,
        "score": 0.6222,
        "slug": "devops-engineer",
        "total_count": null
      },
      {
        "display_name": "MLOps Engineer",
        "kra_matches": [
          {
            "kra_text": "Automates ML platform operations including scheduled retraining triggers, pipeline orchestration, evaluation workflows, and alerting configuration.",
            "sentence": "Build and extend our automation tools for infrastructure provisioning, auto-scaling, code deployment, monitoring, alerting, reporting, and failover.",
            "similarity": 0.5569
          },
          {
            "kra_text": "Validates model performance benchmarks, data schema contracts, and system integration health before signing off on production release readiness.",
            "sentence": "Optimize data pipeline performance, troubleshoot issues, and ensure data quality and accuracy through monitoring, testing, and validation.",
            "similarity": 0.5554
          },
          {
            "kra_text": "Coordinates model promotion workflows across development, staging, and production environments including integration testing and data contract validation.",
            "sentence": "Collaborate with the DevOps team to ensure seamless deployment and monitoring of data pipelines and workflows.",
            "similarity": 0.5478
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 16,
        "score": 0.5534,
        "slug": "ml-ops-engineer",
        "total_count": null
      },
      {
        "display_name": "Cloud Architect",
        "kra_matches": [
          {
            "kra_text": "Designs IAM policies, service control policies, VPC segmentation, private endpoints, and zero-trust network access boundaries for cloud environments.",
            "sentence": "Drive the design and implementation of data security and access control policies to protect sensitive information.",
            "similarity": 0.5684
          },
          {
            "kra_text": "Designs multi-region and multi-availability-zone cloud infrastructure architectures for high availability, fault tolerance, and horizontal scalability.",
            "sentence": "Collaborate with the engineering team to build and maintain a resilient data platform, ensuring high availability, reliability, and performance.",
            "similarity": 0.512
          },
          {
            "kra_text": "Architects blue-green, canary, and immutable infrastructure deployment patterns for zero-downtime releases and fast rollback capabilities.",
            "sentence": "Build and extend our automation tools for infrastructure provisioning, auto-scaling, code deployment, monitoring, alerting, reporting, and failover.",
            "similarity": 0.5107
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 9,
        "score": 0.5304,
        "slug": "cloud-architect",
        "total_count": null
      },
      {
        "display_name": "Fullstack Developer",
        "kra_matches": [
          {
            "kra_text": "Delivers features through CI/CD pipelines using automated tests, staged rollouts, feature flags, and incremental deployments.",
            "sentence": "Build and extend our automation tools for infrastructure provisioning, auto-scaling, code deployment, monitoring, alerting, reporting, and failover.",
            "similarity": 0.5316
          },
          {
            "kra_text": "Delivers features through CI/CD pipelines using automated tests, staged rollouts, feature flags, and incremental deployments.",
            "sentence": "Collaborate with the DevOps team to ensure seamless deployment and monitoring of data pipelines and workflows.",
            "similarity": 0.513
          },
          {
            "kra_text": "Designs and queries relational databases like PostgreSQL and document stores like MongoDB, writing migrations, indexes, and optimized queries.",
            "sentence": "Design, develop, and maintain robust and scalable data pipelines that support the extraction, transformation, and loading of data from various sources into our data platform.",
            "similarity": 0.498
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 15,
        "score": 0.5142,
        "slug": "full-stack-engineer",
        "total_count": null
      }
    ],
    "skill_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "Apache Airflow"
        ],
        "role_id": 2,
        "score": 0.3333,
        "slug": "data-engineer",
        "total_count": 3
      }
    ]
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "A",
    "chosen_role": {
      "display_name": "Data Engineer",
      "kra_matches": null,
      "matched_count": null,
      "matched_skills": null,
      "role_id": 2,
      "score": 1.0,
      "slug": "data-engineer",
      "total_count": null
    },
    "confidence": 1.0,
    "is_new_role": false,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "matched_dimensions": [],
    "matched_kras": [],
    "matched_skills": [],
    "new_role_display_name": null,
    "new_role_slug": null,
    "queued": false,
    "reasoning": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.33 does not contradict",
    "sub_role": null
  },
  "stage5_updates": {
    "centroid_n_after": 405,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": null,
    "new_skills_attached": [
      {
        "is_primary": true,
        "queue_id": 18642,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "ETL",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 18643,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Pipelines",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 18644,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Infrastructure as Code",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 18645,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Auto-scaling",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 18646,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Failover",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 18647,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Reliability",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 18648,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Performance",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 18649,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Security",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 18650,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Access Control",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 18651,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Testing",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 18652,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Validation",
        "status": "pending"
      }
    ],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}
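
Note on the stage 3 scores above: each `score` under `kra_match_roles` matches the arithmetic mean of that role's three listed `similarity` values, rounded to four decimals, and the `skill_match_roles` score matches `matched_count / total_count`. The pipeline's real scoring code is not shown in this dump, so the sketch below is only a consistency check under that assumption. The function names `kra_score` and `skill_score` are hypothetical; the numbers are copied from the JSON.

```python
from statistics import mean

# Similarities copied from stage3_signals.kra_match_roles in the dump above.
kra_similarities = {
    "data-engineer": [0.6693, 0.6324, 0.5999],
    "devops-engineer": [0.68, 0.597, 0.5897],
    "ml-ops-engineer": [0.5569, 0.5554, 0.5478],
    "cloud-architect": [0.5684, 0.512, 0.5107],
    "full-stack-engineer": [0.5316, 0.513, 0.498],
}

# Scores reported for the same roles in the dump.
reported_scores = {
    "data-engineer": 0.6339,
    "devops-engineer": 0.6222,
    "ml-ops-engineer": 0.5534,
    "cloud-architect": 0.5304,
    "full-stack-engineer": 0.5142,
}


def kra_score(similarities):
    """Assumed rule (hypothetical): mean of the listed similarities, 4 decimals."""
    return round(mean(similarities), 4)


def skill_score(matched_count, total_count):
    """Assumed rule (hypothetical): fraction of role skills matched, 4 decimals."""
    return round(matched_count / total_count, 4)


for slug, sims in kra_similarities.items():
    print(slug, "computed:", kra_score(sims), "reported:", reported_scores[slug])

# skill_match_roles lists matched_count 1 and total_count 3 for data-engineer.
print("data-engineer skill score:", skill_score(1, 3))
```

If the assumption holds, the KRA ranking is close (0.6339 for Data Engineer against 0.6222 for DevOps Engineer), and the final choice in `stage4_decision` rests on the exact alias hit rather than on the KRA score, as its `reasoning` field states.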

API 2 — extract-details

{
  "alias_matches": [
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 304,
      "existing_alias_text": "Apache Airflow",
      "input_term": "Apache Airflow",
      "matched_canonical": {
        "category_id": 13,
        "display_name": "Apache Airflow",
        "id": 110,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "apache-airflow",
        "sub_category_id": 130,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
      "alias_persisted": false,
      "existing_alias_id": 1406,
      "existing_alias_text": "autoscaling",
      "input_term": "Auto-scaling",
      "matched_canonical": {
        "category_id": 2,
        "display_name": "autoscaling",
        "id": 858,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "autoscaling",
        "sub_category_id": 604,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "embedding_alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1854,
      "existing_alias_text": "Monitoring",
      "input_term": "Monitoring",
      "matched_canonical": {
        "category_id": 2,
        "display_name": "Monitoring",
        "id": 1218,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "monitoring",
        "sub_category_id": 924,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1444,
      "existing_alias_text": "alerting",
      "input_term": "Alerting",
      "matched_canonical": {
        "category_id": 2,
        "display_name": "alerting",
        "id": 882,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "alerting",
        "sub_category_id": 3472,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1309,
      "existing_alias_text": "high availability",
      "input_term": "High Availability",
      "matched_canonical": {
        "category_id": 2,
        "display_name": "high availability",
        "id": 764,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "high-availability",
        "sub_category_id": 535,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1852,
      "existing_alias_text": "DevOps",
      "input_term": "DevOps",
      "matched_canonical": {
        "category_id": 8,
        "display_name": "DevOps",
        "id": 1216,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "METHODOLOGY",
        "slug": "devops",
        "sub_category_id": 922,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    }
  ],
  "candidate_roles": [
    {
      "display_name": "Data Engineer",
      "id": 2,
      "rationale": null,
      "role_archetype": null,
      "slug": "data-engineer",
      "source": "db"
    },
    {
      "display_name": "Cloud Architect",
      "id": 9,
      "rationale": null,
      "role_archetype": null,
      "slug": "cloud-architect",
      "source": "db"
    },
    {
      "display_name": "DevOps Engineer",
      "id": 10,
      "rationale": null,
      "role_archetype": null,
      "slug": "devops-engineer",
      "source": "db"
    },
    {
      "display_name": "Kotlin Backend Developer",
      "id": 84,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "kotlin-server-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Scala Backend Developer",
      "id": 87,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "scala-backend-developer",
      "source": "db"
    },
    {
      "display_name": ".NET Backend Developer",
      "id": 83,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "dotnet-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Backend Developer",
      "id": 1,
      "rationale": null,
      "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
      "slug": "backend-engineer",
      "source": "db"
    },
    {
      "display_name": "Node.js Backend Developer",
      "id": 82,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "node-backend-developer",
      "source": "db"
    },
    {
      "display_name": "PHP Backend Developer",
      "id": 86,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "php-backend-developer",
      "source": "db"
    }
  ],
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.33 does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "dimensions": [
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Data Pipeline Orchestration",
        "id": 23,
        "rationale": "Workflow engines that schedule, coordinate, and recover batch data jobs. This cluster covers dependency management, retries, backfills, sensors, and operational control of pipeline DAGs.",
        "slug": "data-pipeline-orchestration",
        "source": "db"
      },
      "input_skill": "Apache Airflow",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Container Orchestration Platforms",
        "id": 134,
        "rationale": "Platforms that schedule and manage containerized workloads across clusters and environments. Cloud Architects need these to define workload placement standards, cluster boundaries, and platform capabilities.",
        "slug": "container-orchestration-platforms",
        "source": "db"
      },
      "input_skill": "Auto-scaling",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        },
        {
          "display_name": "DevOps Engineer",
          "id": 10,
          "rationale": null,
          "role_archetype": null,
          "slug": "devops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Observability and Incident Triage",
        "id": 155,
        "rationale": "Telemetry, alerting, and troubleshooting practices used to diagnose failed builds, broken deployments, and unhealthy release environments. This is a coherent cluster because delivery reliability depends on quickly identifying where the workflow failed.",
        "slug": "observability-and-incident-triage",
        "source": "db"
      },
      "input_skill": "Monitoring",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "DevOps Engineer",
          "id": 10,
          "rationale": null,
          "role_archetype": null,
          "slug": "devops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Backend Observability, Logging, and Diagnostics",
        "id": 388,
        "rationale": "Instrumentation and troubleshooting practices used to understand and improve backend service behavior in production and lower environments. This includes logs, metrics, traces, alerting, dashboards, structured logging, distributed tracing, health checks, and root-cause analysis using ecosystem-specific tools such as SLF4J, Logback, Micrometer, OpenTelemetry, Prometheus, Grafana, ILogger, Serilog, and Application Insights.",
        "slug": "backend-observability-logging-and-diagnostics",
        "source": "db"
      },
      "input_skill": "Alerting",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Kotlin Backend Developer",
          "id": 84,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "kotlin-server-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Scala Backend Developer",
          "id": 87,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "scala-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Observability and Incident Response",
        "id": 10,
        "rationale": "Instrumentation and production troubleshooting practices used to keep backend services reliable. Includes logs, metrics, traces, alerting, dashboards, and incident diagnosis.",
        "slug": "observability-and-incident-response",
        "source": "db"
      },
      "input_skill": "Alerting",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": ".NET Backend Developer",
          "id": 83,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "dotnet-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Node.js Backend Developer",
          "id": 82,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "node-backend-developer",
          "source": "db"
        },
        {
          "display_name": "PHP Backend Developer",
          "id": 86,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "php-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Observability and Incident Triage",
        "id": 155,
        "rationale": "Telemetry, alerting, and troubleshooting practices used to diagnose failed builds, broken deployments, and unhealthy release environments. This is a coherent cluster because delivery reliability depends on quickly identifying where the workflow failed.",
        "slug": "observability-and-incident-triage",
        "source": "db"
      },
      "input_skill": "Alerting",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "DevOps Engineer",
          "id": 10,
          "rationale": null,
          "role_archetype": null,
          "slug": "devops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Availability and Disaster Recovery",
        "id": 141,
        "rationale": "Resilience architecture for uptime, failover, backup, and recovery objectives. This cluster is coherent because cloud architects must translate business continuity needs into platform guardrails.",
        "slug": "availability-and-disaster-recovery",
        "source": "db"
      },
      "input_skill": "High Availability",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "CI/CD Pipeline Platforms",
        "id": 150,
        "rationale": "Systems used to define, run, and maintain automated build and deployment workflows. This cluster is coherent because the role owns delivery automation end to end, including pipeline reliability and promotion logic.",
        "slug": "ci-cd-pipeline-platforms",
        "source": "db"
      },
      "input_skill": "DevOps",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "DevOps Engineer",
          "id": 10,
          "rationale": null,
          "role_archetype": null,
          "slug": "devops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Deployment and Release Patterns",
        "id": 140,
        "rationale": "Patterns for promoting changes safely across environments, including rollout, rollback, and release gating strategies. Cloud Architects define these patterns so teams can deploy consistently across the platform.",
        "slug": "deployment-and-release-patterns",
        "source": "db"
      },
      "input_skill": "DevOps",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Infrastructure as Code",
        "id": 132,
        "rationale": "Declarative provisioning and environment definition tools used to codify cloud infrastructure, repeatable environments, and platform standards. Cloud Architects use these to express reference architectures and guardrails.",
        "slug": "infrastructure-as-code",
        "source": "db"
      },
      "input_skill": "DevOps",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        },
        {
          "display_name": "DevOps Engineer",
          "id": 10,
          "rationale": null,
          "role_archetype": null,
          "slug": "devops-engineer",
          "source": "db"
        }
      ]
    }
  ],
  "input_final_skills": [
    "Apache Airflow",
    "ETL",
    "Data Pipelines",
    "Infrastructure as Code",
    "Auto-scaling",
    "Monitoring",
    "Alerting",
    "Failover",
    "High Availability",
    "Reliability",
    "Performance",
    "Data Security",
    "Access Control",
    "DevOps",
    "Testing",
    "Validation"
  ],
  "input_llm_skills": [
    "Apache Airflow",
    "ETL",
    "Data Pipelines",
    "Infrastructure as Code",
    "Auto-scaling",
    "Monitoring",
    "Alerting",
    "Failover",
    "High Availability",
    "Reliability",
    "Performance",
    "Data Security",
    "Access Control",
    "DevOps",
    "Testing",
    "Validation"
  ],
  "new_aliases_persisted": 0,
  "run_id": "02b5b632-bd0f-4c80-bf47-e28f87b2cf45",
  "skills_detail": [
    {
      "aliases_in_db": [
        {
          "alias_text": "Apache Airflow",
          "alias_type": "CANONICAL",
          "id": 304,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 13,
        "display_name": "Apache Airflow",
        "id": 110,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "apache-airflow",
        "sub_category_id": 130,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Data Pipeline Orchestration",
            "id": 23,
            "rationale": "Workflow engines that schedule, coordinate, and recover batch data jobs. This cluster covers dependency management, retries, backfills, sensors, and operational control of pipeline DAGs.",
            "slug": "data-pipeline-orchestration",
            "source": "db"
          },
          "input_skill": "Apache Airflow",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Apache Airflow",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "ETL",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "etl",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Pipelines",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-pipelines",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Infrastructure as Code",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Infrastructure Tools",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "infrastructure-as-code",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "autoscaling",
          "alias_type": "CANONICAL",
          "id": 1406,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 2,
        "display_name": "autoscaling",
        "id": 858,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "autoscaling",
        "sub_category_id": 604,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Container Orchestration Platforms",
            "id": 134,
            "rationale": "Platforms that schedule and manage containerized workloads across clusters and environments. Cloud Architects need these to define workload placement standards, cluster boundaries, and platform capabilities.",
            "slug": "container-orchestration-platforms",
            "source": "db"
          },
          "input_skill": "Auto-scaling",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            },
            {
              "display_name": "DevOps Engineer",
              "id": 10,
              "rationale": null,
              "role_archetype": null,
              "slug": "devops-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Auto-scaling",
      "matched_via": "embedding_alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Monitoring",
          "alias_type": "CANONICAL",
          "id": 1854,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 2,
        "display_name": "Monitoring",
        "id": 1218,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "monitoring",
        "sub_category_id": 924,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Observability and Incident Triage",
            "id": 155,
            "rationale": "Telemetry, alerting, and troubleshooting practices used to diagnose failed builds, broken deployments, and unhealthy release environments. This is a coherent cluster because delivery reliability depends on quickly identifying where the workflow failed.",
            "slug": "observability-and-incident-triage",
            "source": "db"
          },
          "input_skill": "Monitoring",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "DevOps Engineer",
              "id": 10,
              "rationale": null,
              "role_archetype": null,
              "slug": "devops-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Monitoring",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "alerting",
          "alias_type": "CANONICAL",
          "id": 1444,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 2,
        "display_name": "alerting",
        "id": 882,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "alerting",
        "sub_category_id": 3472,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Backend Observability, Logging, and Diagnostics",
            "id": 388,
            "rationale": "Instrumentation and troubleshooting practices used to understand and improve backend service behavior in production and lower environments. This includes logs, metrics, traces, alerting, dashboards, structured logging, distributed tracing, health checks, and root-cause analysis using ecosystem-specific tools such as SLF4J, Logback, Micrometer, OpenTelemetry, Prometheus, Grafana, ILogger, Serilog, and Application Insights.",
            "slug": "backend-observability-logging-and-diagnostics",
            "source": "db"
          },
          "input_skill": "Alerting",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Kotlin Backend Developer",
              "id": 84,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "kotlin-server-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Scala Backend Developer",
              "id": 87,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "scala-backend-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Observability and Incident Response",
            "id": 10,
            "rationale": "Instrumentation and production troubleshooting practices used to keep backend services reliable. Includes logs, metrics, traces, alerting, dashboards, and incident diagnosis.",
            "slug": "observability-and-incident-response",
            "source": "db"
          },
          "input_skill": "Alerting",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": ".NET Backend Developer",
              "id": 83,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "dotnet-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Node.js Backend Developer",
              "id": 82,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "node-backend-developer",
              "source": "db"
            },
            {
              "display_name": "PHP Backend Developer",
              "id": 86,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "php-backend-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Observability and Incident Triage",
            "id": 155,
            "rationale": "Telemetry, alerting, and troubleshooting practices used to diagnose failed builds, broken deployments, and unhealthy release environments. This is a coherent cluster because delivery reliability depends on quickly identifying where the workflow failed.",
            "slug": "observability-and-incident-triage",
            "source": "db"
          },
          "input_skill": "Alerting",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "DevOps Engineer",
              "id": 10,
              "rationale": null,
              "role_archetype": null,
              "slug": "devops-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Alerting",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Failover",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Infrastructure Tools",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "failover",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "high availability",
          "alias_type": "CANONICAL",
          "id": 1309,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 2,
        "display_name": "high availability",
        "id": 764,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "high-availability",
        "sub_category_id": 535,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Availability and Disaster Recovery",
            "id": 141,
            "rationale": "Resilience architecture for uptime, failover, backup, and recovery objectives. This cluster is coherent because cloud architects must translate business continuity needs into platform guardrails.",
            "slug": "availability-and-disaster-recovery",
            "source": "db"
          },
          "input_skill": "High Availability",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "High Availability",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Reliability",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Soft Skills",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "UNVERSIONED",
          "volatility": "STABLE"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "reliability",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Performance",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Soft Skills",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "UNVERSIONED",
          "volatility": "STABLE"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "performance",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Security",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Security Tools",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-security",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Access Control",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Security Tools",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "access-control",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "DevOps",
          "alias_type": "CANONICAL",
          "id": 1852,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 8,
        "display_name": "DevOps",
        "id": 1216,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "METHODOLOGY",
        "slug": "devops",
        "sub_category_id": 922,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "CI/CD Pipeline Platforms",
            "id": 150,
            "rationale": "Systems used to define, run, and maintain automated build and deployment workflows. This cluster is coherent because the role owns delivery automation end to end, including pipeline reliability and promotion logic.",
            "slug": "ci-cd-pipeline-platforms",
            "source": "db"
          },
          "input_skill": "DevOps",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "DevOps Engineer",
              "id": 10,
              "rationale": null,
              "role_archetype": null,
              "slug": "devops-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Deployment and Release Patterns",
            "id": 140,
            "rationale": "Patterns for promoting changes safely across environments, including rollout, rollback, and release gating strategies. Cloud Architects define these patterns so teams can deploy consistently across the platform.",
            "slug": "deployment-and-release-patterns",
            "source": "db"
          },
          "input_skill": "DevOps",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Infrastructure as Code",
            "id": 132,
            "rationale": "Declarative provisioning and environment definition tools used to codify cloud infrastructure, repeatable environments, and platform standards. Cloud Architects use these to express reference architectures and guardrails.",
            "slug": "infrastructure-as-code",
            "source": "db"
          },
          "input_skill": "DevOps",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            },
            {
              "display_name": "DevOps Engineer",
              "id": 10,
              "rationale": null,
              "role_archetype": null,
              "slug": "devops-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "DevOps",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Testing",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Testing Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "UNVERSIONED",
          "volatility": "STABLE"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "testing",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Validation",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Testing Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "validation",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    }
  ],
  "unmatched_skills": [
    "ETL",
    "Data Pipelines",
    "Infrastructure as Code",
    "Failover",
    "Reliability",
    "Performance",
    "Data Security",
    "Access Control",
    "Testing",
    "Validation"
  ]
}
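
In the response above, every `skills_detail` entry whose `canonical` is `null` also appears in `unmatched_skills`, in the same order. The Python sketch below reproduces that relationship on a trimmed sample with the same field names. It assumes the list is derived from `canonical` being null, which the payload suggests but does not confirm; `split_matched` is a hypothetical helper, not part of the pipeline.

```python
import json


def split_matched(skills_detail):
    """Partition entries by whether a canonical DB record is attached.

    Assumption: an entry counts as unmatched exactly when its
    "canonical" field is null. The pipeline's real rule is not shown.
    """
    matched, unmatched = [], []
    for entry in skills_detail:
        target = matched if entry.get("canonical") is not None else unmatched
        target.append(entry["input_skill"])
    return matched, unmatched


# Trimmed sample using the same field names as the response above.
sample = json.loads("""
{
  "skills_detail": [
    {"input_skill": "Apache Airflow", "canonical": {"id": 110}, "matched_via": "alias"},
    {"input_skill": "ETL", "canonical": null, "matched_via": null},
    {"input_skill": "Auto-scaling", "canonical": {"id": 858}, "matched_via": "embedding_alias"},
    {"input_skill": "Failover", "canonical": null, "matched_via": null}
  ],
  "unmatched_skills": ["ETL", "Failover"]
}
""")

matched, unmatched = split_matched(sample["skills_detail"])

# The recomputed list should agree with the one reported in the payload.
assert unmatched == sample["unmatched_skills"]
print("matched:", matched)
print("unmatched:", unmatched)
```

Running the same check against the full response is one way to confirm that `unmatched_skills` and `skills_detail` have not drifted apart between runs.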

API 3 — final-role-output

{
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.33 does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "chosen_role_resolution": "in_db",
  "final_input_skills": [
    {
      "skill": "Apache Airflow",
      "tag": "in_db"
    },
    {
      "skill": "ETL",
      "tag": "new"
    },
    {
      "skill": "Data Pipelines",
      "tag": "new"
    },
    {
      "skill": "Infrastructure as Code",
      "tag": "new"
    },
    {
      "skill": "Auto-scaling",
      "tag": "in_db"
    },
    {
      "skill": "Monitoring",
      "tag": "in_db"
    },
    {
      "skill": "Alerting",
      "tag": "in_db"
    },
    {
      "skill": "Failover",
      "tag": "new"
    },
    {
      "skill": "High Availability",
      "tag": "in_db"
    },
    {
      "skill": "Reliability",
      "tag": "new"
    },
    {
      "skill": "Performance",
      "tag": "new"
    },
    {
      "skill": "Data Security",
      "tag": "new"
    },
    {
      "skill": "Access Control",
      "tag": "new"
    },
    {
      "skill": "DevOps",
      "tag": "in_db"
    },
    {
      "skill": "Testing",
      "tag": "new"
    },
    {
      "skill": "Validation",
      "tag": "new"
    }
  ],
  "llm_cost_api1_usd": null,
  "llm_cost_api2_usd": null,
  "llm_cost_api3_usd": null,
  "llm_cost_total_usd": null,
  "persistence": {
    "items": [
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Data Pipeline Orchestration",
          "id": 23,
          "rationale": "Workflow engines that schedule, coordinate, and recover batch data jobs. This cluster covers dependency management, retries, backfills, sensors, and operational control of pipeline DAGs.",
          "slug": "data-pipeline-orchestration",
          "source": "db"
        },
        "dimension_id": 23,
        "input_skill": "Apache Airflow",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 110,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Container Orchestration Platforms",
          "id": 134,
          "rationale": "Platforms that schedule and manage containerized workloads across clusters and environments. Cloud Architects need these to define workload placement standards, cluster boundaries, and platform capabilities.",
          "slug": "container-orchestration-platforms",
          "source": "db"
        },
        "dimension_id": 134,
        "input_skill": "Auto-scaling",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          },
          {
            "display_name": "DevOps Engineer",
            "id": 10,
            "rationale": null,
            "role_archetype": null,
            "slug": "devops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": false,
        "skill_id": null,
        "skill_tag": "new",
        "skipped_reason": "skill_not_in_db_v3_proposed"
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Observability and Incident Triage",
          "id": 155,
          "rationale": "Telemetry, alerting, and troubleshooting practices used to diagnose failed builds, broken deployments, and unhealthy release environments. This is a coherent cluster because delivery reliability depends on quickly identifying where the workflow failed.",
          "slug": "observability-and-incident-triage",
          "source": "db"
        },
        "dimension_id": 155,
        "input_skill": "Monitoring",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "DevOps Engineer",
            "id": 10,
            "rationale": null,
            "role_archetype": null,
            "slug": "devops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1218,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Backend Observability, Logging, and Diagnostics",
          "id": 388,
          "rationale": "Instrumentation and troubleshooting practices used to understand and improve backend service behavior in production and lower environments. This includes logs, metrics, traces, alerting, dashboards, structured logging, distributed tracing, health checks, and root-cause analysis using ecosystem-specific tools such as SLF4J, Logback, Micrometer, OpenTelemetry, Prometheus, Grafana, ILogger, Serilog, and Application Insights.",
          "slug": "backend-observability-logging-and-diagnostics",
          "source": "db"
        },
        "dimension_id": 388,
        "input_skill": "Alerting",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Kotlin Backend Developer",
            "id": 84,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "kotlin-server-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Scala Backend Developer",
            "id": 87,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "scala-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 882,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Observability and Incident Response",
          "id": 10,
          "rationale": "Instrumentation and production troubleshooting practices used to keep backend services reliable. Includes logs, metrics, traces, alerting, dashboards, and incident diagnosis.",
          "slug": "observability-and-incident-response",
          "source": "db"
        },
        "dimension_id": 10,
        "input_skill": "Alerting",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": ".NET Backend Developer",
            "id": 83,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "dotnet-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Node.js Backend Developer",
            "id": 82,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "node-backend-developer",
            "source": "db"
          },
          {
            "display_name": "PHP Backend Developer",
            "id": 86,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "php-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 882,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Observability and Incident Triage",
          "id": 155,
          "rationale": "Telemetry, alerting, and troubleshooting practices used to diagnose failed builds, broken deployments, and unhealthy release environments. This is a coherent cluster because delivery reliability depends on quickly identifying where the workflow failed.",
          "slug": "observability-and-incident-triage",
          "source": "db"
        },
        "dimension_id": 155,
        "input_skill": "Alerting",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "DevOps Engineer",
            "id": 10,
            "rationale": null,
            "role_archetype": null,
            "slug": "devops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 882,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Availability and Disaster Recovery",
          "id": 141,
          "rationale": "Resilience architecture for uptime, failover, backup, and recovery objectives. This cluster is coherent because cloud architects must translate business continuity needs into platform guardrails.",
          "slug": "availability-and-disaster-recovery",
          "source": "db"
        },
        "dimension_id": 141,
        "input_skill": "High Availability",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 764,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "CI/CD Pipeline Platforms",
          "id": 150,
          "rationale": "Systems used to define, run, and maintain automated build and deployment workflows. This cluster is coherent because the role owns delivery automation end to end, including pipeline reliability and promotion logic.",
          "slug": "ci-cd-pipeline-platforms",
          "source": "db"
        },
        "dimension_id": 150,
        "input_skill": "DevOps",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "DevOps Engineer",
            "id": 10,
            "rationale": null,
            "role_archetype": null,
            "slug": "devops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1216,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Deployment and Release Patterns",
          "id": 140,
          "rationale": "Patterns for promoting changes safely across environments, including rollout, rollback, and release gating strategies. Cloud Architects define these patterns so teams can deploy consistently across the platform.",
          "slug": "deployment-and-release-patterns",
          "source": "db"
        },
        "dimension_id": 140,
        "input_skill": "DevOps",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1216,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Infrastructure as Code",
          "id": 132,
          "rationale": "Declarative provisioning and environment definition tools used to codify cloud infrastructure, repeatable environments, and platform standards. Cloud Architects use these to express reference architectures and guardrails.",
          "slug": "infrastructure-as-code",
          "source": "db"
        },
        "dimension_id": 132,
        "input_skill": "DevOps",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          },
          {
            "display_name": "DevOps Engineer",
            "id": 10,
            "rationale": null,
            "role_archetype": null,
            "slug": "devops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1216,
        "skill_tag": "in_db",
        "skipped_reason": null
      }
    ],
    "new_skills_created": 0,
    "role_dimension_saved": 0,
    "skill_dimension_saved": 0,
    "skipped": 1
  },
  "planner_output": null,
  "run_id": "02b5b632-bd0f-4c80-bf47-e28f87b2cf45"
}