Pipeline run

f4313d5f-0537-4f83-a2db-fdff084c63c1

Pipeline LLM cost (USD)
API 1: $0.0041 API 2: $0.0006 API 3: $0.0000 Total: $0.0046

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description
role baseline loaded sources · ai_index: jd · nature_of_work: jd · tech_stack_maturity: jd
Nature of work · Data pipeline development
Design, build, and monitor Hadoop/Spark data pipelines on Cloudera, optimizing Hive/Impala performance and handling batch/real-time ingestion, ETL, and SQL/HQL-based data processing with coding standards. Also support data quality, workflow orchestration, and platform tuning in a team of Data Engineers.
"Contribute to designing, developing, and monitoring systems and solutions for collecting, storing, processing, and analyzing large data sets"
Tech stack maturity
Mainstream Legacy
The stack centers on Hadoop, Hive, and Spark/Scala, which are established big-data technologies that are widely used but generally considered mature rather than cloud-native or bleeding-edge.
AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)
0.00 / 5
· Title match
· Has AI skill
· AI skill (primary)
· AI skill (secondary)
· On AI team
· Builds AI products
vocab breakdown (legacy)
Assistants (×1):
Frameworks (×2):
Models / concepts (×3):
Evidence — skills matched in JD (25)
Big Data Hadoop Hive Impala Flume HDFS Spark Scala Cloudera SQL HQL IBM WebSphere MQ Pentaho Bash Perl Oracle SQL Server SSRS SSIS AWS Kafka Guidewire CA Erwin Sparx Oozie
Skill cluster (5 dimension groups, role-scoped)
Programming Languages for Data Work
Scala SQL Bash
ETL and ELT Tooling
Hadoop Spark
Cloud Platforms
AWS
Messaging and Event Streaming
Kafka
Cross-cutting / unaligned
Big Data Hive Impala Flume HDFS Cloudera HQL IBM WebSphere MQ Pentaho Perl Oracle SQL Server SSRS SSIS Guidewire CA Erwin Sparx Oozie
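A minimal sketch of how a role-scoped cluster like the one above could be derived, assuming each skill carries a list of catalog dimensions and the chosen role has a set of linked dimensions. The function name `cluster_skills` and the input shapes are hypothetical; the skill and dimension names are taken from this run.

```python
from collections import defaultdict

UNALIGNED = "Cross-cutting / unaligned"


def cluster_skills(skills, skill_dimensions, role_dimensions):
    """Group skills under the chosen role's dimensions (hypothetical helper)."""
    clusters = defaultdict(list)
    for skill in skills:
        # Keep only the dimensions that are linked to the chosen role.
        scoped = [d for d in skill_dimensions.get(skill, []) if d in role_dimensions]
        if scoped:
            for dim in scoped:
                clusters[dim].append(skill)
        else:
            # No role-scoped dimension: the skill lands in the catch-all group.
            clusters[UNALIGNED].append(skill)
    return dict(clusters)


# Dimensions linked to Data Engineer in this run.
role_dimensions = {
    "Programming Languages for Data Work",
    "ETL and ELT Tooling",
    "Cloud Platforms",
    "Messaging and Event Streaming",
}
# A subset of this run's skills with the dimensions listed for them.
skill_dimensions = {
    "Hadoop": ["ETL and ELT Tooling"],
    "Hive": ["Local Persistence and Offline Behavior"],
    "Impala": [],  # new skill, no catalog dimension yet
    "Spark": ["ETL and ELT Tooling"],
    "Scala": [
        "Programming Languages for Data Work",
        "Programming Languages for ML Systems",
    ],
    "Kafka": [
        "Asynchronous Messaging and Event Streaming",
        "Messaging and Background Jobs",
        "Messaging and Event Streaming",
    ],
}

clusters = cluster_skills(
    ["Hadoop", "Hive", "Impala", "Spark", "Scala", "Kafka"],
    skill_dimensions,
    role_dimensions,
)
for dimension, members in clusters.items():
    print(dimension, "->", ", ".join(members))
```

Under these assumptions Hive falls into the unaligned group because its only catalog dimension is not linked to the chosen role, which matches the cluster shown above.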
KRA description
Contribute to designing, developing, and monitoring systems and solutions for collecting, storing, processing, and analyzing large data sets Work in a team made up of other Data Engineers Understand and implement certain platform capabilities Understand coding standards and implement them within code being developed 5+ years of Big Data engineering experience Hands-on experience with Cloudera Hadoop stack including Hive and Impala performance optimization Data Quality management knowledge is highly desirable Experience with real-time data processing technologies: Flume, Message Queues (preferably IBM WebSphere MQ) 3+ years' experience with Hadoop stack: HDFS, Hive, Impala 3+ years' experience with Spark Scala 2+ year of ETL tools experience, preferably Pentaho Experience with Cloudera 6+ Knowledge of SQL/HQL Experience with scripting (bash preferably or Perl) Knowledge of RDBMS technologies (Oracle preferably and experience with SQL Server, SSRS - reporting, SSIS - ETL) Experience with AWS or Cloud technologies Experience in messaging tech – MQ or Kafka or both and general cloud awareness would be extremely helpful Knowledge of Insurance field and Guidewire system Data Modelling CA Erwin/Sparx Experience in OOZIE workflow

Signals

Skill data-engineer
0.36
Alias data-engineer
1.00
KRA data-engineer
0.54

Post-classification

Centroid updated · n=243
Alias collision log
New-role queue
New skills captured 15
New KRA captured

Captured for admin review

Big Data primary Data Engineer pending
Impala primary Data Engineer pending
Flume primary Data Engineer pending
IBM WebSphere MQ Data Engineer pending
HDFS primary Data Engineer pending
Pentaho Data Engineer pending
Cloudera primary Data Engineer pending
HQL primary Data Engineer pending
Oracle Data Engineer pending
SSRS Data Engineer pending
SSIS Data Engineer pending
Guidewire Data Engineer pending
CA Erwin Data Engineer pending
Sparx Data Engineer pending
Oozie Data Engineer pending
Status: completed · Created: 2026-05-27T15:01:22.834889Z · Updated: 2026-06-12T16:58:19.905401Z · API 3 duration: 45609 ms
Flow Current 3-step pipeline

1 POST /skills/extract-from-jd

2 POST /skills/extract-details

3 POST /skills/final-role-output

Role Chosen role & resolution

Data Engineer

CASE A

slug: data-engineer · id: 2 · source: db

Exact alias hit on data-engineer (1.0) — no other alias at this confidence; skill_top data-engineer 0.36 does not contradict

Resolution: in_db — role exists in library; skill↔dim and role↔dim links saved when applicable.
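The resolution note above can be read as a small decision rule. The sketch below is one possible reading, not the pipeline's actual code: `resolve_role` is a hypothetical name, and it assumes that the skill signal "contradicts" the alias only when its top-scoring role is a different slug.

```python
def resolve_role(alias_scores, skill_scores, exact=1.0):
    """Return (slug, reason) for an unambiguous exact alias hit, else (None, reason)."""
    exact_hits = [slug for slug, score in alias_scores.items() if score >= exact]
    if len(exact_hits) != 1:
        return None, f"{len(exact_hits)} aliases at confidence {exact}"
    slug = exact_hits[0]
    # Assumption: contradiction means the top skill-signal role is another slug.
    skill_top = max(skill_scores, key=skill_scores.get) if skill_scores else slug
    if skill_top != slug:
        return None, f"skill_top {skill_top} contradicts alias {slug}"
    score = skill_scores.get(skill_top, 0.0)
    return slug, f"exact alias hit on {slug}; skill_top {skill_top} {score:.2f} does not contradict"


# Signal values reported for this run.
slug, reason = resolve_role({"data-engineer": 1.00}, {"data-engineer": 0.36})
print(slug, "|", reason)

# Hypothetical conflicting case: the skill signal points at a different role.
conflict_slug, conflict_reason = resolve_role(
    {"data-engineer": 1.00},
    {"data-engineer": 0.36, "ml-engineer": 0.80},
)
print(conflict_slug, "|", conflict_reason)
```

With the values from this run the rule picks `data-engineer`; a low skill score (0.36) alone does not block the alias match in this reading, only a competing top role would.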

New skills: 0
Skill↔dim saved: 0
Role↔dim saved: 0
Skipped: 0

Job description

Currently, we are looking for a remote Lead Big Data Engineer with 5+ years of Big Data engineering experience and hands-on experience with the Cloudera Hadoop stack to join our team.

Our customer offers automobile, property, liability, agriculture, and surety insurance.

Please note that even though you are applying for this position, you may be offered other projects to join within EPAM Anywhere.

We accept CVs only in English. 

Responsibilities
  • Contribute to designing, developing, and monitoring systems and solutions for collecting, storing, processing, and analyzing large data sets
  • Work in a team made up of other Data Engineers
  • Understand and implement certain platform capabilities
  • Understand coding standards and implement them within code being developed

Requirements
  • 5+ years of Big Data engineering experience
  • Hands-on experience with Cloudera Hadoop stack including Hive and Impala performance optimization
  • Data Quality management knowledge is highly desirable
  • Experience with real-time data processing technologies: Flume, Message Queues (preferably IBM WebSphere MQ)
  • 3+ years' experience with Hadoop stack: HDFS, Hive, Impala
  • 3+ years' experience with Spark Scala
  • 2+ years of ETL tools experience, preferably Pentaho
  • Experience with Cloudera 6+
  • Knowledge of SQL/HQL
  • Experience with scripting (bash preferably or Perl)
  • Knowledge of RDBMS technologies (Oracle preferably and experience with SQL Server, SSRS - reporting, SSIS - ETL)
  • Experience with AWS or Cloud technologies
  • Experience in messaging tech – MQ or Kafka or both and general cloud awareness would be extremely helpful
  • Knowledge of Insurance field and Guidewire system
  • Data Modelling CA Erwin/Sparx
  • Experience in OOZIE workflow

We offer
  • Competitive compensation depending on experience and skills
  • Work on enterprise-level projects on a long-term basis
  • Full-time remote work
  • Unlimited access to learning resources (EPAM training courses, English classes, Internal Library)
  • Community of 38,000+ industry's top professionals

This is a remote position and we welcome applications from anywhere in India

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.
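As an illustration of the merge described above, the sketch below joins three per-skill mappings into one row per skill. The function name `merge_skill_rows` and the dictionary shapes are assumptions made for this example; the skill names, catalog ids, and tags are the ones reported in this run.

```python
def merge_skill_rows(extracted, matched, persisted):
    """Join API 1, API 2 and API 3 results into one row per extracted skill."""
    rows = []
    for name, priority in extracted.items():
        match = matched.get(name)  # None when API 2 found no library skill
        rows.append(
            {
                "skill": name,
                "priority": priority,
                "canonical": match["canonical"] if match else None,
                "catalog_id": match["id"] if match else None,
                "tag": persisted.get(name, "unknown"),
            }
        )
    return rows


# API 1: extracted skills with their priority.
extracted = {"Big Data": "Primary", "Hadoop": "Primary", "Spark": "Primary", "Bash": "Secondary"}
# API 2: library matches (no entry for Big Data, which was not in the library).
matched = {
    "Hadoop": {"canonical": "Hadoop", "id": 1351},
    "Spark": {"canonical": "Apache Spark", "id": 1350},
    "Bash": {"canonical": "Bash", "id": 103},
}
# API 3: persistence tags.
persisted = {"Big Data": "new", "Hadoop": "in_db", "Spark": "in_db", "Bash": "in_db"}

rows = merge_skill_rows(extracted, matched, persisted)
for row in rows:
    print(row)
```

Keying all three mappings by the extracted skill name is a simplification; a real merge would more likely key on a stable identifier, since "Spark" resolves to the canonical name "Apache Spark".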

Big Data Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
CONCEPT
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Hadoop Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Hadoop id=1351 · hadoop

Aliases — catalog

  • Hadoop (CANONICAL)

Context tags (catalog)

Big Data Data Lake Distributed Computing ELT ETL Flume HDFS Hive Kafka MapReduce NoSQL Oozie Pig Spark Sqoop YARN

Stored enrichment (catalog DB)

Category
Framework
Sub-category
Data Processing Framework
Vendor
Apache Software Foundation
License
apache_2
Year introduced
2006
Confidence
0.90
Version strategy
NOT_APPLICABLE

Maturity reasoning: Job postings still mention Hadoop for legacy big-data stacks, but JD volume has fallen as Spark and cloud warehouses replaced MapReduce-era clusters.

Skill profile (library / DB)

Skill nature
FRAMEWORK
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
5
Sub-category id
91
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • ETL and ELT Tooling Catalog dimension db id 24

    Library dimension (catalog)

    Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
ETL and ELT Tooling · etl-and-elt-tooling · Existing dimension (library) · Role↔dimension saved
Hive Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Hive id=2754 · hive

Aliases — catalog

  • Hive (CANONICAL) primary

Context tags (catalog)

Apache Apache Hive Bucketing ETL HQL Hive Metastore Hive SerDe HiveQL MapReduce SQL SQL-on-Hadoop big data bucketing columnar storage data lakes data warehousing integration metadata partitioning schema evolution

Stored enrichment (catalog DB)

Category
Datastore
Sub-category
Local Key Value Store
Vendor
Apache Software Foundation
License
apache_2
Year introduced
2010
Confidence
0.90
Version strategy
NOT_APPLICABLE

Maturity reasoning: Hive appears in Flutter/mobile JDs and package docs, but JD volume is far below SQLite/Realm and it’s mainly used for local key-value storage in Flutter apps.

Skill profile (library / DB)

Skill nature
TOOL
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
3
Sub-category id
2242
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Local Persistence and Offline Behavior Catalog dimension db id 85

    Library dimension (catalog)

    Roles linked in library: Android Developer, Flutter Developer, Hybrid Mobile Developer, Native Mobile Developer, React Native Developer, iOS Developer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Local Persistence and Offline Behavior · local-persistence-and-offline-behavior · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Impala Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Databases
Sub-category
general
Skill nature
TOOL
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Flume Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
TOOL
Volatility
FAST
Typical lifespan
SHORT_LIVED
Version strategy
VERSIONED
IBM WebSphere MQ Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Messaging Tools
Sub-category
general
Skill nature
TOOL
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
HDFS Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
TOOL
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Spark Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Apache Spark id=1350 · apache-spark

Aliases — catalog

  • Apache Spark (CANONICAL)
  • apache spark 3 (VERSION)
  • spark (VERSION)
  • spark 3 (VERSION)
  • spark 3.x (VERSION)
  • spark3 (VERSION)

Context tags (catalog)

Apache Kafka Cluster Manager DAGScheduler Data Lake DataFrame ETL Hadoop MLlib Machine Learning PySpark RDD Scala Spark SQL Spark Streaming SparkSession

Stored enrichment (catalog DB)

Category
Framework
Sub-category
Distributed Data Processing Framework
Vendor
Apache Software Foundation
License
apache_2
Year introduced
2010
Confidence
0.94
Version strategy
SEPARATE_ENTITY
Version tag
3.x

Maturity reasoning: Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it.

Skill profile (library / DB)

Skill nature
FRAMEWORK
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
5
Sub-category id
1021
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • ETL and ELT Tooling Catalog dimension db id 24

    Library dimension (catalog)

    Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
ETL and ELT Tooling · etl-and-elt-tooling · Existing dimension (library) · Role↔dimension saved
Scala Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Scala id=102 · scala

Aliases — catalog

  • Scala (CANONICAL) primary

Context tags (catalog)

Akka Apache Kafka Cats Flink JVM Monads Play Framework SBT ScalaTest Shapeless Spark Spark SQL ZIO case class for-comprehension functional programming implicit pattern matching typeclass

Stored enrichment (catalog DB)

Category
Language
Sub-category
Programming Language
Vendor
EPFL
License
apache_2
Year introduced
2004
Confidence
0.99
Version strategy
NOT_APPLICABLE

Maturity reasoning: Scala still appears in many backend/data engineering JDs, especially with Spark and Akka, and remains supported by major JVM ecosystems; it’s not a sunset technology.

Skill profile (library / DB)

Skill nature
LANGUAGE
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
6
Sub-category id
96
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Programming Languages for Data Work Catalog dimension db id 21

    Library dimension (catalog)

    Roles linked in library: Data Engineer

  • Programming Languages for ML Systems Catalog dimension db id 39

    Library dimension (catalog)

    Roles linked in library: ML Engineer, MLOps Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Programming Languages for Data Work · programming-languages-for-data-work · Existing dimension (library) · Role↔dimension saved
Programming Languages for ML Systems · programming-languages-for-ml-systems · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Pentaho Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
TOOL
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Cloudera Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
PLATFORM
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
SQL Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: SQL id=101 · sql

Aliases — catalog

  • SQL (CANONICAL) primary

Context tags (catalog)

ACID CTE DDL DML ETL JOIN MySQL NoSQL OLAP ORM PostgreSQL SQL injection SQLite T-SQL data modeling data warehousing database normalization execution plan indexing joins normalization query optimization stored procedures subquery transaction isolation transaction management window functions

Stored enrichment (catalog DB)

Category
Language
Sub-category
Query Language
Vendor
ANSI
License
unknown
Year introduced
1974
Confidence
0.99
Version strategy
NOT_APPLICABLE

Maturity reasoning: SQL appears in a large share of data, backend, and analytics job descriptions and remains the default query language for PostgreSQL, MySQL, and cloud warehouses like Snowflake/BigQuery.

Skill profile (library / DB)

Skill nature
LANGUAGE
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
6
Sub-category id
97
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Pega Programming Languages & DSLs Catalog dimension db id 267

    Library dimension (catalog)

    Roles linked in library: Pega Developer

  • Programming Languages & DSLs Catalog dimension db id 475

    Library dimension (catalog)

    Roles linked in library: Engineering Manager

  • Programming Languages for Data Work Catalog dimension db id 21

    Library dimension (catalog)

    Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Pega Programming Languages & DSLs · pega-programming-languages-dsls · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages & DSLs · programming-languages-dsls · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages for Data Work · programming-languages-for-data-work · Existing dimension (library) · Role↔dimension saved
HQL Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Query Languages
Sub-category
general
Skill nature
LANGUAGE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Bash Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Bash id=103 · bash

Aliases — catalog

  • Bash (VERSION)
  • Bash 3.x (VERSION)
  • Bash 4.x (VERSION)
  • Bash 5.x (VERSION)
  • GNU Bash (VERSION)
  • bash (VERSION)
  • bash 3 (VERSION)
  • bash 3.x (VERSION)
  • bash 4 (VERSION)
  • bash 4.x (VERSION)
  • bash 5 (VERSION)
  • bash 5.x (VERSION)

Context tags (catalog)

Linux POSIX Unix alias awk chmod cron environment variables grep here-doc pipes sed shebang shell scripting ssh stdin stdout xargs

Stored enrichment (catalog DB)

Category
Language
Sub-category
Shell Language
Vendor
GNU Project
License
gpl_v3
Year introduced
1989
Confidence
0.99
Version strategy
SEPARATE_ENTITY
Version tag
5.x

Maturity reasoning: Bash appears in many DevOps, SRE, and Linux admin job descriptions and remains the default shell on most Unix-like systems, with no vendor sunset or clear replacement in mainstream hiring.

Skill profile (library / DB)

Skill nature
LANGUAGE
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
6
Sub-category id
238
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Programming Languages and Scripting Catalog dimension db id 59

    Library dimension (catalog)

    Roles linked in library: Cyber Security Engineer

  • Programming Languages for Data Work Catalog dimension db id 21

    Library dimension (catalog)

    Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Programming Languages and Scripting · programming-languages-and-scripting · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages for Data Work · programming-languages-for-data-work · Existing dimension (library) · Role↔dimension saved
Perl Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Perl id=1001 · perl

Aliases — catalog

  • Perl (CANONICAL)

Context tags (catalog)

BioPerl CPAN Catalyst DBI Dancer Mojolicious Moose Object-Oriented Perl6 Perlbrew Plack Regex Template Toolkit Test::More Tidy

Stored enrichment (catalog DB)

Category
Language
Sub-category
Scripting Language
Vendor
Perl Foundation
License
unknown
Year introduced
1987
Confidence
0.99
Version strategy
NOT_APPLICABLE

Maturity reasoning: Perl still appears in some legacy-maintenance JDs, but far fewer than Python/JavaScript; GitHub activity and new-project adoption are much lower, with many orgs having migrated to Python or Ruby.

Skill profile (library / DB)

Skill nature
LANGUAGE
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
6
Sub-category id
38
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • React Frontend Development Catalog dimension db id 96

    Library dimension (catalog)

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
React Frontend Development · d_init_01 · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Oracle Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Databases
Sub-category
general
Skill nature
TOOL
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
SQL Server Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: SQL Server id=18 · sql-server

Aliases — catalog

  • SQL Server (CANONICAL) primary
  • SQL Server 2000 (VERSION)
  • SQL Server 2005 (VERSION)
  • SQL Server 2008 (VERSION)
  • SQL Server 2012 (VERSION)
  • SQL Server 2014 (VERSION)
  • SQL Server 2016 (VERSION)
  • SQL Server 2017 (VERSION)
  • SQL Server 2019 (VERSION)
  • SQL Server 2022 (VERSION)
  • SQL Server 6.5 (VERSION)
  • SQL Server 7.0 (VERSION)

Context tags (catalog)

Always On CLR Integration Clustered Index ETL Execution Plan Linked Servers Query Store Replication SQL Agent SQL Server Agent SQL Server Integration Services SQL Server Management Studio SQL Server Reporting Services SSIS SSMS SSRS Stored Procedures T-SQL TempDB backup and recovery backup and restore clustering data migration data warehousing database design database normalization indexing performance tuning query optimization replication stored procedures transaction log transaction logs

Stored enrichment (catalog DB)

Category
Datastore
Sub-category
Relational Database
Vendor
Microsoft
License
proprietary
Year introduced
1989
Confidence
0.99
Version strategy
NOT_APPLICABLE

Maturity reasoning: SQL Server appears in many enterprise job descriptions and remains a major Microsoft-supported RDBMS with active Azure SQL/SQL Server demand; it is a common hiring-pipeline staple, not a sunset technology.

Skill profile (library / DB)

Skill nature
TOOL
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
3
Sub-category id
29
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Relational Database Design Catalog dimension db id 4

    Library dimension (catalog)

    Roles linked in library: .NET Backend Developer, Backend Developer, Kotlin Backend Developer, Node.js Backend Developer, Python Backend Developer, Ruby Backend Developer, Scala Backend Developer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Relational Database Design · relational-database-design · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
SSRS Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Reporting Tools
Sub-category
general
Skill nature
TOOL
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
SSIS Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
TOOL
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
AWS Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: AWS id=187 · aws

Aliases — catalog

  • AWS (CANONICAL) primary

Context tags (catalog)

API Gateway AWS CLI Auto Scaling CloudFormation CloudFront CloudTrail CloudWatch Cognito DynamoDB EC2 ECS EKS Elastic Beanstalk Elastic Load Balancing IAM KMS Lambda RDS Route 53 S3 SNS SQS Serverless VPC

Stored enrichment (catalog DB)

Category
Platform
Sub-category
Cloud Platform
Vendor
Amazon
License
other_open
Year introduced
2006
Confidence
0.99
Version strategy
NOT_APPLICABLE

Maturity reasoning: AWS is a hiring-pipeline staple: it appears in a large share of cloud/DevOps job descriptions and dominates public cloud market share, with broad certification and vendor ecosystem support.

Skill profile (library / DB)

Skill nature
PLATFORM
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
9
Sub-category id
46
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Cloud Platforms Catalog dimension db id 20

    Library dimension (catalog)

    Roles linked in library: .NET Backend Developer, Backend Developer, Cyber Security Engineer, Data Engineer, DevOps Engineer, Fullstack Developer, Go Backend Developer, Java Backend Developer, Kotlin Backend Developer, ML Engineer, MLOps Engineer, Node.js Backend Developer, Python Backend Developer, Scala Backend Developer

  • Cloud Platforms for AI Deployment Catalog dimension db id 211

    Library dimension (catalog)

    Roles linked in library: AI Engineer

  • Cloud Provider Platforms Catalog dimension db id 131

    Library dimension (catalog)

    Roles linked in library: Cloud Architect, Cloud Security Engineer

  • Cloud Security Posture Tools Catalog dimension db id 64

    Library dimension (catalog)

    Roles linked in library: Cloud Security Engineer, Cyber Security Engineer

  • Vendor Product Families Catalog dimension db id 477

    Library dimension (catalog)

    Roles linked in library: Engineering Manager

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Cloud Platforms · cloud-platforms · Existing dimension (library) · Role↔dimension saved
Cloud Platforms for AI Deployment · cloud-platforms-for-ai-deployment · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Cloud Provider Platforms · cloud-provider-platforms · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Cloud Security Posture Tools · cloud-security-posture-tools · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Vendor Product Families · vendor-product-families · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Kafka Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Kafka id=36 · kafka

Aliases — catalog

  • Kafka (CANONICAL) primary

Context tags (catalog)

Apache Flink Apache Kafka Apache Pulsar Apache Spark Avro KSQL Kafka API Kafka Connect Kafka Streams ZooKeeper Zookeeper backpressure brokers consumer consumer group consumer groups event sourcing event-driven architecture exactly-once semantics fault tolerance high throughput log compaction message broker message queue microservices offsets partition partitioning partitions producer producer API real-time analytics real-time data replication schema registry stream processing topic topic partitioning topics

Stored enrichment (catalog DB)

Category
Datastore
Sub-category
Event Stream Store
Vendor
Confluent
License
apache_2
Year introduced
2011
Confidence
0.90
Version strategy
NOT_APPLICABLE

Maturity reasoning: Kafka appears in many production JDs for event streaming and data pipelines, and remains a standard platform in cloud/vendor offerings (e.g., Confluent, AWS MSK), indicating broad hiring demand.

Skill profile (library / DB)

Skill nature
TOOL
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
3
Sub-category id
3533
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Asynchronous Messaging and Event Streaming Catalog dimension db id 297

    Library dimension (catalog)

    Roles linked in library: .NET Backend Developer, Go Backend Developer, Kotlin Backend Developer, Node.js Backend Developer, Scala Backend Developer

  • Messaging and Background Jobs Catalog dimension db id 291

    Library dimension (catalog)

    Roles linked in library: PHP Backend Developer, Python Backend Developer, Ruby Backend Developer

  • Messaging and Event Streaming Catalog dimension db id 8

    Library dimension (catalog)

    Roles linked in library: Backend Developer, Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Asynchronous Messaging and Event Streaming · asynchronous-messaging-and-event-streaming · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Messaging and Background Jobs · messaging-and-background-jobs · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Messaging and Event Streaming · messaging-and-event-streaming · Existing dimension (library) · Role↔dimension saved
Guidewire Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Insurance Software
Sub-category
general
Skill nature
TOOL
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
CA Erwin Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Modeling Tools
Sub-category
general
Skill nature
TOOL
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Sparx Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Modeling Tools
Sub-category
general
Skill nature
TOOL
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Oozie Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
TOOL
Volatility
FAST
Typical lifespan
SHORT_LIVED
Version strategy
VERSIONED

All API 3 persistence rows

Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.

Skill Tag Dimension Skill↔dim Role↔dim Outcome Notes
Hadoop | in_db | ETL and ELT Tooling (etl-and-elt-tooling) | Existing dimension (library) · Role↔dimension saved
Hive | in_db | Local Persistence and Offline Behavior (local-persistence-and-offline-behavior) | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Spark | in_db | ETL and ELT Tooling (etl-and-elt-tooling) | Existing dimension (library) · Role↔dimension saved
Scala | in_db | Programming Languages for Data Work (programming-languages-for-data-work) | Existing dimension (library) · Role↔dimension saved
Scala | in_db | Programming Languages for ML Systems (programming-languages-for-ml-systems) | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
SQL | in_db | Pega Programming Languages & DSLs (pega-programming-languages-dsls) | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
SQL | in_db | Programming Languages & DSLs (programming-languages-dsls) | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
SQL | in_db | Programming Languages for Data Work (programming-languages-for-data-work) | Existing dimension (library) · Role↔dimension saved
Bash | in_db | Programming Languages and Scripting (programming-languages-and-scripting) | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Bash | in_db | Programming Languages for Data Work (programming-languages-for-data-work) | Existing dimension (library) · Role↔dimension saved
Perl | in_db | React Frontend Development (d_init_01) | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
SQL Server | in_db | Relational Database Design (relational-database-design) | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
AWS | in_db | Cloud Platforms (cloud-platforms) | Existing dimension (library) · Role↔dimension saved
AWS | in_db | Cloud Platforms for AI Deployment (cloud-platforms-for-ai-deployment) | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
AWS | in_db | Cloud Provider Platforms (cloud-provider-platforms) | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
AWS | in_db | Cloud Security Posture Tools (cloud-security-posture-tools) | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
AWS | in_db | Vendor Product Families (vendor-product-families) | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Kafka | in_db | Asynchronous Messaging and Event Streaming (asynchronous-messaging-and-event-streaming) | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Kafka | in_db | Messaging and Background Jobs (messaging-and-background-jobs) | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Kafka | in_db | Messaging and Event Streaming (messaging-and-event-streaming) | Existing dimension (library) · Role↔dimension saved
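
The persistence grid above has one row per (skill × dimension) pair, and only some rows end with the role↔dimension link saved. A minimal sketch of how such rows could be grouped by dimension follows. The tuple layout `(skill, dimension_slug, outcome)` and the function name are assumptions made for illustration, not the pipeline's actual schema; the values are copied from a subset of the rows in this run.

```python
from collections import defaultdict

# Hypothetical row layout: (skill, dimension_slug, role_link_outcome).
# Values are a subset of the persistence rows listed for this run.
ROWS = [
    ("Hadoop", "etl-and-elt-tooling", "saved"),
    ("Hive", "local-persistence-and-offline-behavior", "skipped"),
    ("Spark", "etl-and-elt-tooling", "saved"),
    ("Scala", "programming-languages-for-data-work", "saved"),
    ("Scala", "programming-languages-for-ml-systems", "skipped"),
    ("SQL", "programming-languages-for-data-work", "saved"),
    ("Bash", "programming-languages-for-data-work", "saved"),
    ("AWS", "cloud-platforms", "saved"),
    ("AWS", "cloud-provider-platforms", "skipped"),
    ("Kafka", "messaging-and-event-streaming", "saved"),
    ("Kafka", "messaging-and-background-jobs", "skipped"),
]


def saved_skills_by_dimension(rows):
    """Group skills by dimension slug, keeping only rows whose link was saved."""
    grouped = defaultdict(list)
    for skill, slug, outcome in rows:
        if outcome == "saved":
            grouped[slug].append(skill)
    return dict(grouped)


if __name__ == "__main__":
    for slug, skills in saved_skills_by_dimension(ROWS).items():
        print(slug, "->", ", ".join(skills))
```

Under these assumptions the saved rows fall into four dimensions, and skipped rows (dimension not under the chosen role) drop out of the grouping.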

Library artifacts (this run)

Kind Detail DB id
canonical_skill_proposed Big Data | type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR
canonical_skill_proposed Impala | type=Databases subtype=general nature=TOOL lifespan=MULTI_YEAR
canonical_skill_proposed Flume | type=Data Engineering Tools subtype=general nature=TOOL lifespan=SHORT_LIVED
canonical_skill_proposed IBM WebSphere MQ | type=Messaging Tools subtype=general nature=TOOL lifespan=MULTI_YEAR
canonical_skill_proposed HDFS | type=Data Engineering Tools subtype=general nature=TOOL lifespan=MULTI_YEAR
canonical_skill_proposed Pentaho | type=Data Engineering Tools subtype=general nature=TOOL lifespan=MULTI_YEAR
canonical_skill_proposed Cloudera | type=Data Engineering Tools subtype=general nature=PLATFORM lifespan=MULTI_YEAR
canonical_skill_proposed HQL | type=Query Languages subtype=general nature=LANGUAGE lifespan=MULTI_YEAR
canonical_skill_proposed Oracle | type=Databases subtype=general nature=TOOL lifespan=MULTI_YEAR
canonical_skill_proposed SSRS | type=Reporting Tools subtype=general nature=TOOL lifespan=MULTI_YEAR
canonical_skill_proposed SSIS | type=Data Engineering Tools subtype=general nature=TOOL lifespan=MULTI_YEAR
canonical_skill_proposed Guidewire | type=Insurance Software subtype=general nature=TOOL lifespan=MULTI_YEAR
canonical_skill_proposed CA Erwin | type=Data Modeling Tools subtype=general nature=TOOL lifespan=MULTI_YEAR
canonical_skill_proposed Sparx | type=Modeling Tools subtype=general nature=TOOL lifespan=MULTI_YEAR
canonical_skill_proposed Oozie | type=Data Engineering Tools subtype=general nature=TOOL lifespan=SHORT_LIVED
nano JD Parser — gpt-4.1-nano
Role: Lead Big Data Engineer
Company: EPAM Anywhere
Experience: 5+ years of Big Data engineering experience
Domain: Insurance
Location: India (remote)
JD type: pass
Raw JSON
{
  "JD_type": "pass",
  "about_company": null,
  "certifications": [],
  "company_name": "EPAM Anywhere",
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [
        "Automobile Insurance",
        "Property Insurance",
        "Liability Insurance",
        "Agriculture Insurance",
        "Surety Insurance"
      ],
      "domain": "Insurance"
    },
    "secondary": null
  },
  "education": [],
  "experience": {
    "max": null,
    "min": 5,
    "raw": "5+ years of Big Data engineering experience"
  },
  "job_locations": [
    {
      "aliases": [],
      "city": null,
      "country": "India",
      "state": null,
      "work_mode": "remote"
    }
  ],
  "role": "Lead Big Data Engineer",
  "role_aliases": [
    "Big Data Engineer",
    "Lead Data Engineer",
    "Senior Big Data Engineer"
  ],
  "role_archetype": "Data",
  "roles_and_responsibilities": [
    {
      "bullet_count": 4,
      "heading": "Responsibilities",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "Contribute to designing, developing,",
        "last_5_words": "within code being developed"
      },
      "text": "Contribute to designing, developing, and monitoring systems and solutions for collecting, storing, processing, and analyzing large data sets\nWork in a team made up of other Data Engineers\nUnderstand and implement certain platform capabilities\nUnderstand coding standards and implement them within code being developed",
      "word_count": 48
    },
    {
      "bullet_count": 13,
      "heading": "Requirements",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "5+ years of Big Data",
        "last_5_words": "and Guidewire system"
      },
      "text": "5+ years of Big Data engineering experience\nHands-on experience with Cloudera Hadoop stack including Hive and Impala performance optimization\nData Quality management knowledge is highly desirable\nExperience with real-time data processing technologies: Flume, Message Queues (preferably IBM WebSphere MQ)\n3+ years\u0027 experience with Hadoop stack: HDFS, Hive, Impala\n3+ years\u0027 experience with Spark Scala\n2+ year of ETL tools experience, preferably Pentaho\nExperience with Cloudera 6+\nKnowledge of SQL/HQL\nExperience with scripting (bash preferably or Perl)\nKnowledge of RDBMS technologies (Oracle preferably and experience with SQL Server, SSRS - reporting, SSIS - ETL)\nExperience with AWS or Cloud technologies\nExperience in messaging tech \u2013 MQ or Kafka or both and general cloud awareness would be extremely helpful\nKnowledge of Insurance field and Guidewire system\nData Modelling CA Erwin/Sparx\nExperience in OOZIE workflow",
      "word_count": 174
    }
  ],
  "urls": []
}
API 1 — extract-from-jd
{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Big Data"
    },
    {
      "is_primary": true,
      "skill_name": "Hadoop"
    },
    {
      "is_primary": true,
      "skill_name": "Hive"
    },
    {
      "is_primary": true,
      "skill_name": "Impala"
    },
    {
      "is_primary": true,
      "skill_name": "Flume"
    },
    {
      "is_primary": false,
      "skill_name": "IBM WebSphere MQ"
    },
    {
      "is_primary": true,
      "skill_name": "HDFS"
    },
    {
      "is_primary": true,
      "skill_name": "Spark"
    },
    {
      "is_primary": true,
      "skill_name": "Scala"
    },
    {
      "is_primary": false,
      "skill_name": "Pentaho"
    },
    {
      "is_primary": true,
      "skill_name": "Cloudera"
    },
    {
      "is_primary": true,
      "skill_name": "SQL"
    },
    {
      "is_primary": true,
      "skill_name": "HQL"
    },
    {
      "is_primary": false,
      "skill_name": "Bash"
    },
    {
      "is_primary": false,
      "skill_name": "Perl"
    },
    {
      "is_primary": false,
      "skill_name": "Oracle"
    },
    {
      "is_primary": false,
      "skill_name": "SQL Server"
    },
    {
      "is_primary": false,
      "skill_name": "SSRS"
    },
    {
      "is_primary": false,
      "skill_name": "SSIS"
    },
    {
      "is_primary": false,
      "skill_name": "AWS"
    },
    {
      "is_primary": false,
      "skill_name": "Kafka"
    },
    {
      "is_primary": false,
      "skill_name": "Guidewire"
    },
    {
      "is_primary": false,
      "skill_name": "CA Erwin"
    },
    {
      "is_primary": false,
      "skill_name": "Sparx"
    },
    {
      "is_primary": false,
      "skill_name": "Oozie"
    }
  ],
  "jd_role": {
    "display_name": "Lead Big Data Engineer",
    "rationale": null,
    "role_aliases": [
      "Big Data Engineer",
      "Lead Data Engineer",
      "Senior Big Data Engineer"
    ],
    "role_archetype": "Data",
    "slug": ""
  },
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": null,
    "certifications": [],
    "company_name": "EPAM Anywhere",
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [
          "Automobile Insurance",
          "Property Insurance",
          "Liability Insurance",
          "Agriculture Insurance",
          "Surety Insurance"
        ],
        "domain": "Insurance"
      },
      "secondary": null
    },
    "education": [],
    "experience": {
      "max": null,
      "min": 5,
      "raw": "5+ years of Big Data engineering experience"
    },
    "job_locations": [
      {
        "aliases": [],
        "city": null,
        "country": "India",
        "state": null,
        "work_mode": "remote"
      }
    ],
    "role": "Lead Big Data Engineer",
    "role_aliases": [
      "Big Data Engineer",
      "Lead Data Engineer",
      "Senior Big Data Engineer"
    ],
    "role_archetype": "Data",
    "roles_and_responsibilities": [
      {
        "bullet_count": 4,
        "heading": "Responsibilities",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "Contribute to designing, developing,",
          "last_5_words": "within code being developed"
        },
        "text": "Contribute to designing, developing, and monitoring systems and solutions for collecting, storing, processing, and analyzing large data sets\nWork in a team made up of other Data Engineers\nUnderstand and implement certain platform capabilities\nUnderstand coding standards and implement them within code being developed",
        "word_count": 48
      },
      {
        "bullet_count": 13,
        "heading": "Requirements",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "5+ years of Big Data",
          "last_5_words": "and Guidewire system"
        },
        "text": "5+ years of Big Data engineering experience\nHands-on experience with Cloudera Hadoop stack including Hive and Impala performance optimization\nData Quality management knowledge is highly desirable\nExperience with real-time data processing technologies: Flume, Message Queues (preferably IBM WebSphere MQ)\n3+ years\u0027 experience with Hadoop stack: HDFS, Hive, Impala\n3+ years\u0027 experience with Spark Scala\n2+ year of ETL tools experience, preferably Pentaho\nExperience with Cloudera 6+\nKnowledge of SQL/HQL\nExperience with scripting (bash preferably or Perl)\nKnowledge of RDBMS technologies (Oracle preferably and experience with SQL Server, SSRS - reporting, SSIS - ETL)\nExperience with AWS or Cloud technologies\nExperience in messaging tech \u2013 MQ or Kafka or both and general cloud awareness would be extremely helpful\nKnowledge of Insurance field and Guidewire system\nData Modelling CA Erwin/Sparx\nExperience in OOZIE workflow",
        "word_count": 174
      }
    ],
    "urls": []
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "f4313d5f-0537-4f83-a2db-fdff084c63c1",
  "stage3_signals": {
    "alias_found": true,
    "alias_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 1.0,
        "slug": "data-engineer",
        "total_count": null
      }
    ],
    "kra_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": [
          {
            "kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
            "sentence": "Contribute to designing, developing, and monitoring systems and solutions for collecting, storing, processing, and analyzing large data sets",
            "similarity": 0.5713
          },
          {
            "kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
            "sentence": "Work in a team made up of other Data Engineers",
            "similarity": 0.5378
          },
          {
            "kra_text": "Implements data quality validation rules, reconciliation checks, and anomaly detection to ensure data completeness, accuracy, and consistency.",
            "sentence": "Data Quality management knowledge is highly desirable",
            "similarity": 0.5123
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 0.5405,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "React Native Developer",
        "kra_matches": [
          {
            "kra_text": "maintain code quality",
            "sentence": "Understand coding standards and implement them within code being developed",
            "similarity": 0.6284
          },
          {
            "kra_text": "handle permissions and device behaviors",
            "sentence": "Understand and implement certain platform capabilities",
            "similarity": 0.4655
          },
          {
            "kra_text": "maintain code quality",
            "sentence": "Data Quality management knowledge is highly desirable",
            "similarity": 0.3634
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 73,
        "score": 0.4858,
        "slug": "react-native-developer",
        "total_count": null
      },
      {
        "display_name": "Java Backend Developer",
        "kra_matches": [
          {
            "kra_text": "persistence and data modeling",
            "sentence": "Data Quality management knowledge is highly desirable",
            "similarity": 0.5178
          },
          {
            "kra_text": "code refactoring and defect fixes",
            "sentence": "Understand coding standards and implement them within code being developed",
            "similarity": 0.4842
          },
          {
            "kra_text": "persistence and data modeling",
            "sentence": "Contribute to designing, developing, and monitoring systems and solutions for collecting, storing, processing, and analyzing large data sets",
            "similarity": 0.4242
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 79,
        "score": 0.4754,
        "slug": "java-backend-developer",
        "total_count": null
      },
      {
        "display_name": "Node.js Backend Developer",
        "kra_matches": [
          {
            "kra_text": "code review and refactoring",
            "sentence": "Understand coding standards and implement them within code being developed",
            "similarity": 0.5381
          },
          {
            "kra_text": "data modeling and persistence access",
            "sentence": "Data Quality management knowledge is highly desirable",
            "similarity": 0.4432
          },
          {
            "kra_text": "external system integration",
            "sentence": "Understand and implement certain platform capabilities",
            "similarity": 0.3865
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 82,
        "score": 0.4559,
        "slug": "node-backend-developer",
        "total_count": null
      },
      {
        "display_name": "Ruby Backend Developer",
        "kra_matches": [
          {
            "kra_text": "refactoring and code organization",
            "sentence": "Understand coding standards and implement them within code being developed",
            "similarity": 0.4753
          },
          {
            "kra_text": "Server-side feature implementation",
            "sentence": "Understand and implement certain platform capabilities",
            "similarity": 0.4454
          },
          {
            "kra_text": "data access and persistence",
            "sentence": "Data Quality management knowledge is highly desirable",
            "similarity": 0.4191
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 85,
        "score": 0.4466,
        "slug": "ruby-backend-developer",
        "total_count": null
      }
    ],
    "skill_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": 4,
        "matched_skills": [
          "Apache Spark",
          "Hadoop",
          "SQL",
          "Scala"
        ],
        "role_id": 2,
        "score": 0.3636,
        "slug": "data-engineer",
        "total_count": 11
      },
      {
        "display_name": "Android Developer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "Hive"
        ],
        "role_id": 4,
        "score": 0.0909,
        "slug": "android-engineer",
        "total_count": 11
      },
      {
        "display_name": "iOS Developer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "Hive"
        ],
        "role_id": 6,
        "score": 0.0909,
        "slug": "ios-engineer",
        "total_count": 11
      },
      {
        "display_name": "Hybrid Mobile Developer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "Hive"
        ],
        "role_id": 11,
        "score": 0.0909,
        "slug": "hybrid-mobile-developer",
        "total_count": 11
      },
      {
        "display_name": "ML Engineer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "Scala"
        ],
        "role_id": 3,
        "score": 0.0909,
        "slug": "ml-engineer",
        "total_count": 11
      }
    ]
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "A",
    "chosen_role": {
      "display_name": "Data Engineer",
      "kra_matches": null,
      "matched_count": null,
      "matched_skills": null,
      "role_id": 2,
      "score": 1.0,
      "slug": "data-engineer",
      "total_count": null
    },
    "confidence": 1.0,
    "is_new_role": false,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "matched_dimensions": [],
    "matched_kras": [],
    "matched_skills": [],
    "new_role_display_name": null,
    "new_role_slug": null,
    "queued": false,
    "reasoning": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.36 does not contradict",
    "sub_role": null
  },
  "stage5_updates": {
    "centroid_n_after": 243,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": null,
    "new_skills_attached": [
      {
        "is_primary": true,
        "queue_id": 12178,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Big Data",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12179,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Impala",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12180,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Flume",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 12181,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "IBM WebSphere MQ",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12182,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "HDFS",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 12183,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Pentaho",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12184,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Cloudera",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12185,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "HQL",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 12186,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Oracle",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 12187,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "SSRS",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 12188,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "SSIS",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 12189,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Guidewire",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 12190,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "CA Erwin",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 12191,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Sparx",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 12192,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Oozie",
        "status": "pending"
      }
    ],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}
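
The role scores in the `stage3_signals` block above appear consistent with two simple rules: each `skill_match_roles` score equals `matched_count / total_count`, and each `kra_match_roles` score equals the mean of its three listed similarities. This is an inference from the numbers in this run, not documented pipeline behavior. The sketch below reproduces those values. The `alias_hit_stands` helper is a hypothetical reading of the `stage4_decision.reasoning` string (a single alias hit at score 1.0 that the top skill-match role does not contradict), and all function names are illustrative.

```python
from statistics import mean


def skill_match_score(matched_count, total_count):
    """Fraction of the role's skills found in the JD, rounded to 4 places."""
    return round(matched_count / total_count, 4)


def kra_match_score(similarities):
    """Mean of the listed KRA sentence similarities, rounded to 4 places."""
    return round(mean(similarities), 4)


def alias_hit_stands(alias_roles, skill_roles, threshold=1.0):
    """Hypothetical Case A check: return the alias slug if it is the only
    alias at the threshold and the top skill-match role is the same role."""
    top_aliases = [r for r in alias_roles if r["score"] >= threshold]
    if len(top_aliases) != 1:
        return None
    chosen = top_aliases[0]["slug"]
    skill_top = max(skill_roles, key=lambda r: r["score"], default=None)
    if skill_top is not None and skill_top["slug"] != chosen:
        return None
    return chosen


if __name__ == "__main__":
    print(skill_match_score(4, 11))                   # Data Engineer skill match
    print(kra_match_score([0.5713, 0.5378, 0.5123]))  # Data Engineer KRA match
    print(alias_hit_stands(
        [{"slug": "data-engineer", "score": 1.0}],
        [{"slug": "data-engineer", "score": 0.3636},
         {"slug": "android-engineer", "score": 0.0909}],
    ))
```

With the inputs from this run, the two score helpers return the values logged for Data Engineer (0.3636 and 0.5405), and the alias check returns `data-engineer`.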
API 2 — extract-details
{
  "alias_matches": [
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 2010,
      "existing_alias_text": "Hadoop",
      "input_term": "Hadoop",
      "matched_canonical": {
        "category_id": 5,
        "display_name": "Hadoop",
        "id": 1351,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "hadoop",
        "sub_category_id": 91,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 4198,
      "existing_alias_text": "Hive",
      "input_term": "Hive",
      "matched_canonical": {
        "category_id": 3,
        "display_name": "Hive",
        "id": 2754,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "hive",
        "sub_category_id": 2242,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 2510,
      "existing_alias_text": "spark",
      "input_term": "Spark",
      "matched_canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 272,
      "existing_alias_text": "Scala",
      "input_term": "Scala",
      "matched_canonical": {
        "category_id": 6,
        "display_name": "Scala",
        "id": 102,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "scala",
        "sub_category_id": 96,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 271,
      "existing_alias_text": "SQL",
      "input_term": "SQL",
      "matched_canonical": {
        "category_id": 6,
        "display_name": "SQL",
        "id": 101,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "sql",
        "sub_category_id": 97,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 273,
      "existing_alias_text": "Bash",
      "input_term": "Bash",
      "matched_canonical": {
        "category_id": 6,
        "display_name": "Bash",
        "id": 103,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "bash",
        "sub_category_id": 238,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1612,
      "existing_alias_text": "Perl",
      "input_term": "Perl",
      "matched_canonical": {
        "category_id": 6,
        "display_name": "Perl",
        "id": 1001,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "perl",
        "sub_category_id": 38,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 135,
      "existing_alias_text": "SQL Server",
      "input_term": "SQL Server",
      "matched_canonical": {
        "category_id": 3,
        "display_name": "SQL Server",
        "id": 18,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "sql-server",
        "sub_category_id": 29,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 406,
      "existing_alias_text": "AWS",
      "input_term": "AWS",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "AWS",
        "id": 187,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "aws",
        "sub_category_id": 46,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 173,
      "existing_alias_text": "Kafka",
      "input_term": "Kafka",
      "matched_canonical": {
        "category_id": 3,
        "display_name": "Kafka",
        "id": 36,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "kafka",
        "sub_category_id": 3533,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    }
  ],
  "candidate_roles": [
    {
      "display_name": "Data Engineer",
      "id": 2,
      "rationale": null,
      "role_archetype": null,
      "slug": "data-engineer",
      "source": "db"
    },
    {
      "display_name": "Android Developer",
      "id": 4,
      "rationale": null,
      "role_archetype": null,
      "slug": "android-engineer",
      "source": "db"
    },
    {
      "display_name": "Flutter Developer",
      "id": 74,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "flutter-developer",
      "source": "db"
    },
    {
      "display_name": "Hybrid Mobile Developer",
      "id": 11,
      "rationale": null,
      "role_archetype": null,
      "slug": "hybrid-mobile-developer",
      "source": "db"
    },
    {
      "display_name": "Native Mobile Developer",
      "id": 75,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "native-mobile-developer",
      "source": "db"
    },
    {
      "display_name": "React Native Developer",
      "id": 73,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "react-native-developer",
      "source": "db"
    },
    {
      "display_name": "iOS Developer",
      "id": 6,
      "rationale": null,
      "role_archetype": null,
      "slug": "ios-engineer",
      "source": "db"
    },
    {
      "display_name": "ML Engineer",
      "id": 3,
      "rationale": null,
      "role_archetype": null,
      "slug": "ml-engineer",
      "source": "db"
    },
    {
      "display_name": "MLOps Engineer",
      "id": 16,
      "rationale": null,
      "role_archetype": null,
      "slug": "ml-ops-engineer",
      "source": "db"
    },
    {
      "display_name": "Pega Developer",
      "id": 24,
      "rationale": null,
      "role_archetype": null,
      "slug": "pega-developer",
      "source": "db"
    },
    {
      "display_name": "Engineering Manager",
      "id": 121,
      "rationale": null,
      "role_archetype": null,
      "slug": "engineering-manager",
      "source": "db"
    },
    {
      "display_name": "Cyber Security Engineer",
      "id": 5,
      "rationale": null,
      "role_archetype": null,
      "slug": "cybersecurity-engineer",
      "source": "db"
    },
    {
      "display_name": ".NET Backend Developer",
      "id": 83,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "dotnet-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Backend Developer",
      "id": 1,
      "rationale": null,
      "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
      "slug": "backend-engineer",
      "source": "db"
    },
    {
      "display_name": "Kotlin Backend Developer",
      "id": 84,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "kotlin-server-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Node.js Backend Developer",
      "id": 82,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "node-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Python Backend Developer",
      "id": 80,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "python-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Ruby Backend Developer",
      "id": 85,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "ruby-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Scala Backend Developer",
      "id": 87,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "scala-backend-developer",
      "source": "db"
    },
    {
      "display_name": "DevOps Engineer",
      "id": 10,
      "rationale": null,
      "role_archetype": null,
      "slug": "devops-engineer",
      "source": "db"
    },
    {
      "display_name": "Fullstack Developer",
      "id": 15,
      "rationale": null,
      "role_archetype": null,
      "slug": "full-stack-engineer",
      "source": "db"
    },
    {
      "display_name": "Go Backend Developer",
      "id": 81,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "go-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Java Backend Developer",
      "id": 79,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "java-backend-developer",
      "source": "db"
    },
    {
      "display_name": "AI Engineer",
      "id": 13,
      "rationale": null,
      "role_archetype": null,
      "slug": "ai-engineer",
      "source": "db"
    },
    {
      "display_name": "Cloud Architect",
      "id": 9,
      "rationale": null,
      "role_archetype": null,
      "slug": "cloud-architect",
      "source": "db"
    },
    {
      "display_name": "Cloud Security Engineer",
      "id": 23,
      "rationale": null,
      "role_archetype": null,
      "slug": "cloud-security-engineer",
      "source": "db"
    },
    {
      "display_name": "PHP Backend Developer",
      "id": 86,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "php-backend-developer",
      "source": "db"
    }
  ],
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.36 does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "dimensions": [
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "ETL and ELT Tooling",
        "id": 24,
        "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
        "slug": "etl-and-elt-tooling",
        "source": "db"
      },
      "input_skill": "Hadoop",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Local Persistence and Offline Behavior",
        "id": 85,
        "rationale": "On-device storage used for caching, offline support, and durable client state. This cluster is coherent because iOS apps often need to preserve user progress and data when connectivity is limited.",
        "slug": "local-persistence-and-offline-behavior",
        "source": "db"
      },
      "input_skill": "Hive",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Android Developer",
          "id": 4,
          "rationale": null,
          "role_archetype": null,
          "slug": "android-engineer",
          "source": "db"
        },
        {
          "display_name": "Flutter Developer",
          "id": 74,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "flutter-developer",
          "source": "db"
        },
        {
          "display_name": "Hybrid Mobile Developer",
          "id": 11,
          "rationale": null,
          "role_archetype": null,
          "slug": "hybrid-mobile-developer",
          "source": "db"
        },
        {
          "display_name": "Native Mobile Developer",
          "id": 75,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "native-mobile-developer",
          "source": "db"
        },
        {
          "display_name": "React Native Developer",
          "id": 73,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "react-native-developer",
          "source": "db"
        },
        {
          "display_name": "iOS Developer",
          "id": 6,
          "rationale": null,
          "role_archetype": null,
          "slug": "ios-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "ETL and ELT Tooling",
        "id": 24,
        "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
        "slug": "etl-and-elt-tooling",
        "source": "db"
      },
      "input_skill": "Spark",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for Data Work",
        "id": 21,
        "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
        "slug": "programming-languages-for-data-work",
        "source": "db"
      },
      "input_skill": "Scala",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for ML Systems",
        "id": 39,
        "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
        "slug": "programming-languages-for-ml-systems",
        "source": "db"
      },
      "input_skill": "Scala",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Pega Programming Languages \u0026 DSLs",
        "id": 267,
        "rationale": "Programming languages and domain-specific languages used in Pega development.",
        "slug": "pega-programming-languages-dsls",
        "source": "db"
      },
      "input_skill": "SQL",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Pega Developer",
          "id": 24,
          "rationale": null,
          "role_archetype": null,
          "slug": "pega-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages \u0026 DSLs",
        "id": 475,
        "rationale": "Oversee and guide the selection and effective use of programming and domain\u2010specific languages in software projects.",
        "slug": "programming-languages-dsls",
        "source": "db"
      },
      "input_skill": "SQL",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Engineering Manager",
          "id": 121,
          "rationale": null,
          "role_archetype": null,
          "slug": "engineering-manager",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for Data Work",
        "id": 21,
        "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
        "slug": "programming-languages-for-data-work",
        "source": "db"
      },
      "input_skill": "SQL",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages and Scripting",
        "id": 59,
        "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
        "slug": "programming-languages-and-scripting",
        "source": "db"
      },
      "input_skill": "Bash",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cyber Security Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for Data Work",
        "id": 21,
        "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
        "slug": "programming-languages-for-data-work",
        "source": "db"
      },
      "input_skill": "Bash",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "Perl",
      "llm_role": null,
      "roles_from_db": []
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Relational Database Design",
        "id": 4,
        "rationale": "Modeling and operating relational persistence for backend services. Includes schema design, normalization, indexing, transactions, and query tuning for operational data stores.",
        "slug": "relational-database-design",
        "source": "db"
      },
      "input_skill": "SQL Server",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": ".NET Backend Developer",
          "id": 83,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "dotnet-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Kotlin Backend Developer",
          "id": 84,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "kotlin-server-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Node.js Backend Developer",
          "id": 82,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "node-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Python Backend Developer",
          "id": 80,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "python-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Ruby Backend Developer",
          "id": 85,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "ruby-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Scala Backend Developer",
          "id": 87,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "scala-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Platforms",
        "id": 20,
        "rationale": "Underlying cloud providers that host the managed services or infrastructure used by the role, such as AWS, Azure, and GCP.",
        "slug": "cloud-platforms",
        "source": "db"
      },
      "input_skill": "AWS",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": ".NET Backend Developer",
          "id": 83,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "dotnet-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Cyber Security Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        },
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        },
        {
          "display_name": "DevOps Engineer",
          "id": 10,
          "rationale": null,
          "role_archetype": null,
          "slug": "devops-engineer",
          "source": "db"
        },
        {
          "display_name": "Fullstack Developer",
          "id": 15,
          "rationale": null,
          "role_archetype": null,
          "slug": "full-stack-engineer",
          "source": "db"
        },
        {
          "display_name": "Go Backend Developer",
          "id": 81,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "go-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Java Backend Developer",
          "id": 79,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "java-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Kotlin Backend Developer",
          "id": 84,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "kotlin-server-backend-developer",
          "source": "db"
        },
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        },
        {
          "display_name": "Node.js Backend Developer",
          "id": 82,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "node-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Python Backend Developer",
          "id": 80,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "python-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Scala Backend Developer",
          "id": 87,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "scala-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Platforms for AI Deployment",
        "id": 211,
        "rationale": "Major cloud services that provide infrastructure and managed services for AI workloads.",
        "slug": "cloud-platforms-for-ai-deployment",
        "source": "db"
      },
      "input_skill": "AWS",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "AI Engineer",
          "id": 13,
          "rationale": null,
          "role_archetype": null,
          "slug": "ai-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Provider Platforms",
        "id": 131,
        "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
        "slug": "cloud-provider-platforms",
        "source": "db"
      },
      "input_skill": "AWS",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        },
        {
          "display_name": "Cloud Security Engineer",
          "id": 23,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-security-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Security Posture Tools",
        "id": 64,
        "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
        "slug": "cloud-security-posture-tools",
        "source": "db"
      },
      "input_skill": "AWS",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Security Engineer",
          "id": 23,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-security-engineer",
          "source": "db"
        },
        {
          "display_name": "Cyber Security Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Vendor Product Families",
        "id": 477,
        "rationale": "Coordinate usage, licensing, and architecture decisions for major vendor software and cloud product families.",
        "slug": "vendor-product-families",
        "source": "db"
      },
      "input_skill": "AWS",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Engineering Manager",
          "id": 121,
          "rationale": null,
          "role_archetype": null,
          "slug": "engineering-manager",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Asynchronous Messaging and Event Streaming",
        "id": 297,
        "rationale": "Asynchronous communication patterns and broker technologies used to decouple backend services and move work off the request path. Includes queues, pub/sub, event streams, consumer groups, dead-letter queues, and delivery semantics across systems such as Kafka, RabbitMQ, NATS, SQS/SNS, Pulsar, and ActiveMQ.",
        "slug": "asynchronous-messaging-and-event-streaming",
        "source": "db"
      },
      "input_skill": "Kafka",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": ".NET Backend Developer",
          "id": 83,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "dotnet-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Go Backend Developer",
          "id": 81,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "go-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Kotlin Backend Developer",
          "id": 84,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "kotlin-server-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Node.js Backend Developer",
          "id": 82,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "node-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Scala Backend Developer",
          "id": 87,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "scala-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Messaging and Background Jobs",
        "id": 291,
        "rationale": "Asynchronous processing patterns and worker systems used to decouple backend work from request handling. This is a coherent cluster because the role supports background jobs, retries, and deferred processing.",
        "slug": "messaging-and-background-jobs",
        "source": "db"
      },
      "input_skill": "Kafka",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "PHP Backend Developer",
          "id": 86,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "php-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Python Backend Developer",
          "id": 80,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "python-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Ruby Backend Developer",
          "id": 85,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "ruby-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Messaging and Event Streaming",
        "id": 8,
        "rationale": "Transport-layer systems used to move events and decouple producers from consumers. Data engineers use these systems to ingest, buffer, and distribute event data before downstream processing.",
        "slug": "messaging-and-event-streaming",
        "source": "db"
      },
      "input_skill": "Kafka",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    }
  ],
  "input_final_skills": [
    "Big Data",
    "Hadoop",
    "Hive",
    "Impala",
    "Flume",
    "IBM WebSphere MQ",
    "HDFS",
    "Spark",
    "Scala",
    "Pentaho",
    "Cloudera",
    "SQL",
    "HQL",
    "Bash",
    "Perl",
    "Oracle",
    "SQL Server",
    "SSRS",
    "SSIS",
    "AWS",
    "Kafka",
    "Guidewire",
    "CA Erwin",
    "Sparx",
    "Oozie"
  ],
  "input_llm_skills": [
    "Big Data",
    "Hadoop",
    "Hive",
    "Impala",
    "Flume",
    "IBM WebSphere MQ",
    "HDFS",
    "Spark",
    "Scala",
    "Pentaho",
    "Cloudera",
    "SQL",
    "HQL",
    "Bash",
    "Perl",
    "Oracle",
    "SQL Server",
    "SSRS",
    "SSIS",
    "AWS",
    "Kafka",
    "Guidewire",
    "CA Erwin",
    "Sparx",
    "Oozie"
  ],
  "new_aliases_persisted": 0,
  "run_id": "f4313d5f-0537-4f83-a2db-fdff084c63c1",
  "skills_detail": [
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Big Data",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "big-data",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Hadoop",
          "alias_type": "CANONICAL",
          "id": 2010,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 5,
        "display_name": "Hadoop",
        "id": 1351,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "hadoop",
        "sub_category_id": 91,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "ETL and ELT Tooling",
            "id": 24,
            "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
            "slug": "etl-and-elt-tooling",
            "source": "db"
          },
          "input_skill": "Hadoop",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Hadoop",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Hive",
          "alias_type": "CANONICAL",
          "id": 4198,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 3,
        "display_name": "Hive",
        "id": 2754,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "hive",
        "sub_category_id": 2242,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Local Persistence and Offline Behavior",
            "id": 85,
            "rationale": "On-device storage used for caching, offline support, and durable client state. This cluster is coherent because iOS apps often need to preserve user progress and data when connectivity is limited.",
            "slug": "local-persistence-and-offline-behavior",
            "source": "db"
          },
          "input_skill": "Hive",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Android Developer",
              "id": 4,
              "rationale": null,
              "role_archetype": null,
              "slug": "android-engineer",
              "source": "db"
            },
            {
              "display_name": "Flutter Developer",
              "id": 74,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "flutter-developer",
              "source": "db"
            },
            {
              "display_name": "Hybrid Mobile Developer",
              "id": 11,
              "rationale": null,
              "role_archetype": null,
              "slug": "hybrid-mobile-developer",
              "source": "db"
            },
            {
              "display_name": "Native Mobile Developer",
              "id": 75,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "native-mobile-developer",
              "source": "db"
            },
            {
              "display_name": "React Native Developer",
              "id": 73,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "react-native-developer",
              "source": "db"
            },
            {
              "display_name": "iOS Developer",
              "id": 6,
              "rationale": null,
              "role_archetype": null,
              "slug": "ios-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Hive",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Impala",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Databases",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "impala",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Flume",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "SHORT_LIVED",
          "version_strategy": "VERSIONED",
          "volatility": "FAST"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "flume",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "IBM WebSphere MQ",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Messaging Tools",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "ibm-websphere-mq",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "HDFS",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "hdfs",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Apache Spark",
          "alias_type": "CANONICAL",
          "id": 2004,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "apache spark 3",
          "alias_type": "VERSION",
          "id": 2006,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark",
          "alias_type": "VERSION",
          "id": 2510,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3",
          "alias_type": "VERSION",
          "id": 2007,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3.x",
          "alias_type": "VERSION",
          "id": 2009,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark3",
          "alias_type": "VERSION",
          "id": 2008,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "ETL and ELT Tooling",
            "id": 24,
            "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
            "slug": "etl-and-elt-tooling",
            "source": "db"
          },
          "input_skill": "Spark",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Spark",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Scala",
          "alias_type": "CANONICAL",
          "id": 272,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 6,
        "display_name": "Scala",
        "id": 102,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "scala",
        "sub_category_id": 96,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for Data Work",
            "id": 21,
            "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
            "slug": "programming-languages-for-data-work",
            "source": "db"
          },
          "input_skill": "Scala",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for ML Systems",
            "id": 39,
            "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
            "slug": "programming-languages-for-ml-systems",
            "source": "db"
          },
          "input_skill": "Scala",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Scala",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Pentaho",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "pentaho",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Cloudera",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PLATFORM",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "cloudera",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "SQL",
          "alias_type": "CANONICAL",
          "id": 271,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 6,
        "display_name": "SQL",
        "id": 101,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "sql",
        "sub_category_id": 97,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Pega Programming Languages & DSLs",
            "id": 267,
            "rationale": "Programming languages and domain-specific languages used in Pega development.",
            "slug": "pega-programming-languages-dsls",
            "source": "db"
          },
          "input_skill": "SQL",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Pega Developer",
              "id": 24,
              "rationale": null,
              "role_archetype": null,
              "slug": "pega-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages & DSLs",
            "id": 475,
            "rationale": "Oversee and guide the selection and effective use of programming and domain‐specific languages in software projects.",
            "slug": "programming-languages-dsls",
            "source": "db"
          },
          "input_skill": "SQL",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Engineering Manager",
              "id": 121,
              "rationale": null,
              "role_archetype": null,
              "slug": "engineering-manager",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for Data Work",
            "id": 21,
            "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
            "slug": "programming-languages-for-data-work",
            "source": "db"
          },
          "input_skill": "SQL",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "SQL",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "HQL",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Query Languages",
          "skill_nature": "LANGUAGE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "hql",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Bash",
          "alias_type": "VERSION",
          "id": 273,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Bash 3.x",
          "alias_type": "VERSION",
          "id": 279,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Bash 4.x",
          "alias_type": "VERSION",
          "id": 280,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Bash 5.x",
          "alias_type": "VERSION",
          "id": 281,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "GNU Bash",
          "alias_type": "VERSION",
          "id": 282,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "bash",
          "alias_type": "VERSION",
          "id": 275,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "bash 3",
          "alias_type": "VERSION",
          "id": 276,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "bash 3.x",
          "alias_type": "VERSION",
          "id": 283,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "bash 4",
          "alias_type": "VERSION",
          "id": 277,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "bash 4.x",
          "alias_type": "VERSION",
          "id": 284,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "bash 5",
          "alias_type": "VERSION",
          "id": 278,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "bash 5.x",
          "alias_type": "VERSION",
          "id": 285,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 6,
        "display_name": "Bash",
        "id": 103,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "bash",
        "sub_category_id": 238,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages and Scripting",
            "id": 59,
            "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
            "slug": "programming-languages-and-scripting",
            "source": "db"
          },
          "input_skill": "Bash",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cyber Security Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for Data Work",
            "id": 21,
            "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
            "slug": "programming-languages-for-data-work",
            "source": "db"
          },
          "input_skill": "Bash",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Bash",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Perl",
          "alias_type": "CANONICAL",
          "id": 1612,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 6,
        "display_name": "Perl",
        "id": 1001,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "perl",
        "sub_category_id": 38,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "Perl",
          "llm_role": null,
          "roles_from_db": []
        }
      ],
      "input_skill": "Perl",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Oracle",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Databases",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "oracle",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "SQL Server",
          "alias_type": "CANONICAL",
          "id": 135,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "SQL Server 2000",
          "alias_type": "VERSION",
          "id": 138,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "SQL Server 2005",
          "alias_type": "VERSION",
          "id": 139,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "SQL Server 2008",
          "alias_type": "VERSION",
          "id": 140,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "SQL Server 2012",
          "alias_type": "VERSION",
          "id": 141,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "SQL Server 2014",
          "alias_type": "VERSION",
          "id": 142,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "SQL Server 2016",
          "alias_type": "VERSION",
          "id": 143,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "SQL Server 2017",
          "alias_type": "VERSION",
          "id": 144,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "SQL Server 2019",
          "alias_type": "VERSION",
          "id": 145,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "SQL Server 2022",
          "alias_type": "VERSION",
          "id": 146,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "SQL Server 6.5",
          "alias_type": "VERSION",
          "id": 136,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "SQL Server 7.0",
          "alias_type": "VERSION",
          "id": 137,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 3,
        "display_name": "SQL Server",
        "id": 18,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "sql-server",
        "sub_category_id": 29,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Relational Database Design",
            "id": 4,
            "rationale": "Modeling and operating relational persistence for backend services. Includes schema design, normalization, indexing, transactions, and query tuning for operational data stores.",
            "slug": "relational-database-design",
            "source": "db"
          },
          "input_skill": "SQL Server",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": ".NET Backend Developer",
              "id": 83,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "dotnet-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Kotlin Backend Developer",
              "id": 84,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "kotlin-server-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Node.js Backend Developer",
              "id": 82,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "node-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Python Backend Developer",
              "id": 80,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "python-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Ruby Backend Developer",
              "id": 85,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "ruby-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Scala Backend Developer",
              "id": 87,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "scala-backend-developer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "SQL Server",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "SSRS",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Reporting Tools",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "ssrs",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "SSIS",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "ssis",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "AWS",
          "alias_type": "CANONICAL",
          "id": 406,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "AWS",
        "id": 187,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "aws",
        "sub_category_id": 46,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Platforms",
            "id": 20,
            "rationale": "Underlying cloud providers that host the managed services or infrastructure used by the role, such as AWS, Azure, and GCP.",
            "slug": "cloud-platforms",
            "source": "db"
          },
          "input_skill": "AWS",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": ".NET Backend Developer",
              "id": 83,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "dotnet-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Cyber Security Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            },
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            },
            {
              "display_name": "DevOps Engineer",
              "id": 10,
              "rationale": null,
              "role_archetype": null,
              "slug": "devops-engineer",
              "source": "db"
            },
            {
              "display_name": "Fullstack Developer",
              "id": 15,
              "rationale": null,
              "role_archetype": null,
              "slug": "full-stack-engineer",
              "source": "db"
            },
            {
              "display_name": "Go Backend Developer",
              "id": 81,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "go-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Java Backend Developer",
              "id": 79,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "java-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Kotlin Backend Developer",
              "id": 84,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "kotlin-server-backend-developer",
              "source": "db"
            },
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            },
            {
              "display_name": "Node.js Backend Developer",
              "id": 82,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "node-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Python Backend Developer",
              "id": 80,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "python-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Scala Backend Developer",
              "id": 87,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "scala-backend-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Platforms for AI Deployment",
            "id": 211,
            "rationale": "Major cloud services that provide infrastructure and managed services for AI workloads.",
            "slug": "cloud-platforms-for-ai-deployment",
            "source": "db"
          },
          "input_skill": "AWS",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "AI Engineer",
              "id": 13,
              "rationale": null,
              "role_archetype": null,
              "slug": "ai-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Provider Platforms",
            "id": 131,
            "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
            "slug": "cloud-provider-platforms",
            "source": "db"
          },
          "input_skill": "AWS",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            },
            {
              "display_name": "Cloud Security Engineer",
              "id": 23,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-security-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Security Posture Tools",
            "id": 64,
            "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
            "slug": "cloud-security-posture-tools",
            "source": "db"
          },
          "input_skill": "AWS",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Security Engineer",
              "id": 23,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-security-engineer",
              "source": "db"
            },
            {
              "display_name": "Cyber Security Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Vendor Product Families",
            "id": 477,
            "rationale": "Coordinate usage, licensing, and architecture decisions for major vendor software and cloud product families.",
            "slug": "vendor-product-families",
            "source": "db"
          },
          "input_skill": "AWS",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Engineering Manager",
              "id": 121,
              "rationale": null,
              "role_archetype": null,
              "slug": "engineering-manager",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "AWS",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Kafka",
          "alias_type": "CANONICAL",
          "id": 173,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 3,
        "display_name": "Kafka",
        "id": 36,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "kafka",
        "sub_category_id": 3533,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Asynchronous Messaging and Event Streaming",
            "id": 297,
            "rationale": "Asynchronous communication patterns and broker technologies used to decouple backend services and move work off the request path. Includes queues, pub/sub, event streams, consumer groups, dead-letter queues, and delivery semantics across systems such as Kafka, RabbitMQ, NATS, SQS/SNS, Pulsar, and ActiveMQ.",
            "slug": "asynchronous-messaging-and-event-streaming",
            "source": "db"
          },
          "input_skill": "Kafka",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": ".NET Backend Developer",
              "id": 83,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "dotnet-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Go Backend Developer",
              "id": 81,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "go-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Kotlin Backend Developer",
              "id": 84,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "kotlin-server-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Node.js Backend Developer",
              "id": 82,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "node-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Scala Backend Developer",
              "id": 87,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "scala-backend-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Messaging and Background Jobs",
            "id": 291,
            "rationale": "Asynchronous processing patterns and worker systems used to decouple backend work from request handling. This is a coherent cluster because the role supports background jobs, retries, and deferred processing.",
            "slug": "messaging-and-background-jobs",
            "source": "db"
          },
          "input_skill": "Kafka",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "PHP Backend Developer",
              "id": 86,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "php-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Python Backend Developer",
              "id": 80,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "python-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Ruby Backend Developer",
              "id": 85,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "ruby-backend-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Messaging and Event Streaming",
            "id": 8,
            "rationale": "Transport-layer systems used to move events and decouple producers from consumers. Data engineers use these systems to ingest, buffer, and distribute event data before downstream processing.",
            "slug": "messaging-and-event-streaming",
            "source": "db"
          },
          "input_skill": "Kafka",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Kafka",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Guidewire",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Insurance Software",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "guidewire",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "CA Erwin",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Modeling Tools",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "ca-erwin",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Sparx",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Modeling Tools",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "sparx",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Oozie",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "SHORT_LIVED",
          "version_strategy": "VERSIONED",
          "volatility": "FAST"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "oozie",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    }
  ],
  "unmatched_skills": [
    "Big Data",
    "Impala",
    "Flume",
    "IBM WebSphere MQ",
    "HDFS",
    "Pentaho",
    "Cloudera",
    "HQL",
    "Oracle",
    "SSRS",
    "SSIS",
    "Guidewire",
    "CA Erwin",
    "Sparx",
    "Oozie"
  ]
}
API 3 — final-role-output
{
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.36 does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "chosen_role_resolution": "in_db",
  "final_input_skills": [
    {
      "skill": "Big Data",
      "tag": "new"
    },
    {
      "skill": "Hadoop",
      "tag": "in_db"
    },
    {
      "skill": "Hive",
      "tag": "in_db"
    },
    {
      "skill": "Impala",
      "tag": "new"
    },
    {
      "skill": "Flume",
      "tag": "new"
    },
    {
      "skill": "IBM WebSphere MQ",
      "tag": "new"
    },
    {
      "skill": "HDFS",
      "tag": "new"
    },
    {
      "skill": "Spark",
      "tag": "in_db"
    },
    {
      "skill": "Scala",
      "tag": "in_db"
    },
    {
      "skill": "Pentaho",
      "tag": "new"
    },
    {
      "skill": "Cloudera",
      "tag": "new"
    },
    {
      "skill": "SQL",
      "tag": "in_db"
    },
    {
      "skill": "HQL",
      "tag": "new"
    },
    {
      "skill": "Bash",
      "tag": "in_db"
    },
    {
      "skill": "Perl",
      "tag": "in_db"
    },
    {
      "skill": "Oracle",
      "tag": "new"
    },
    {
      "skill": "SQL Server",
      "tag": "in_db"
    },
    {
      "skill": "SSRS",
      "tag": "new"
    },
    {
      "skill": "SSIS",
      "tag": "new"
    },
    {
      "skill": "AWS",
      "tag": "in_db"
    },
    {
      "skill": "Kafka",
      "tag": "in_db"
    },
    {
      "skill": "Guidewire",
      "tag": "new"
    },
    {
      "skill": "CA Erwin",
      "tag": "new"
    },
    {
      "skill": "Sparx",
      "tag": "new"
    },
    {
      "skill": "Oozie",
      "tag": "new"
    }
  ],
  "llm_cost_api1_usd": null,
  "llm_cost_api2_usd": null,
  "llm_cost_api3_usd": null,
  "llm_cost_total_usd": null,
  "persistence": {
    "items": [
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "ETL and ELT Tooling",
          "id": 24,
          "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
          "slug": "etl-and-elt-tooling",
          "source": "db"
        },
        "dimension_id": 24,
        "input_skill": "Hadoop",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1351,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Local Persistence and Offline Behavior",
          "id": 85,
          "rationale": "On-device storage used for caching, offline support, and durable client state. This cluster is coherent because iOS apps often need to preserve user progress and data when connectivity is limited.",
          "slug": "local-persistence-and-offline-behavior",
          "source": "db"
        },
        "dimension_id": 85,
        "input_skill": "Hive",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Android Developer",
            "id": 4,
            "rationale": null,
            "role_archetype": null,
            "slug": "android-engineer",
            "source": "db"
          },
          {
            "display_name": "Flutter Developer",
            "id": 74,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "flutter-developer",
            "source": "db"
          },
          {
            "display_name": "Hybrid Mobile Developer",
            "id": 11,
            "rationale": null,
            "role_archetype": null,
            "slug": "hybrid-mobile-developer",
            "source": "db"
          },
          {
            "display_name": "Native Mobile Developer",
            "id": 75,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "native-mobile-developer",
            "source": "db"
          },
          {
            "display_name": "React Native Developer",
            "id": 73,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "react-native-developer",
            "source": "db"
          },
          {
            "display_name": "iOS Developer",
            "id": 6,
            "rationale": null,
            "role_archetype": null,
            "slug": "ios-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 2754,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "ETL and ELT Tooling",
          "id": 24,
          "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
          "slug": "etl-and-elt-tooling",
          "source": "db"
        },
        "dimension_id": 24,
        "input_skill": "Spark",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1350,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for Data Work",
          "id": 21,
          "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
          "slug": "programming-languages-for-data-work",
          "source": "db"
        },
        "dimension_id": 21,
        "input_skill": "Scala",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 102,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for ML Systems",
          "id": 39,
          "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
          "slug": "programming-languages-for-ml-systems",
          "source": "db"
        },
        "dimension_id": 39,
        "input_skill": "Scala",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 102,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Pega Programming Languages \u0026 DSLs",
          "id": 267,
          "rationale": "Programming languages and domain-specific languages used in Pega development.",
          "slug": "pega-programming-languages-dsls",
          "source": "db"
        },
        "dimension_id": 267,
        "input_skill": "SQL",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Pega Developer",
            "id": 24,
            "rationale": null,
            "role_archetype": null,
            "slug": "pega-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 101,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages \u0026 DSLs",
          "id": 475,
          "rationale": "Oversee and guide the selection and effective use of programming and domain\u2010specific languages in software projects.",
          "slug": "programming-languages-dsls",
          "source": "db"
        },
        "dimension_id": 475,
        "input_skill": "SQL",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Engineering Manager",
            "id": 121,
            "rationale": null,
            "role_archetype": null,
            "slug": "engineering-manager",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 101,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for Data Work",
          "id": 21,
          "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
          "slug": "programming-languages-for-data-work",
          "source": "db"
        },
        "dimension_id": 21,
        "input_skill": "SQL",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 101,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages and Scripting",
          "id": 59,
          "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
          "slug": "programming-languages-and-scripting",
          "source": "db"
        },
        "dimension_id": 59,
        "input_skill": "Bash",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cyber Security Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 103,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for Data Work",
          "id": 21,
          "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
          "slug": "programming-languages-for-data-work",
          "source": "db"
        },
        "dimension_id": 21,
        "input_skill": "Bash",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 103,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "Perl",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1001,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Relational Database Design",
          "id": 4,
          "rationale": "Modeling and operating relational persistence for backend services. Includes schema design, normalization, indexing, transactions, and query tuning for operational data stores.",
          "slug": "relational-database-design",
          "source": "db"
        },
        "dimension_id": 4,
        "input_skill": "SQL Server",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": ".NET Backend Developer",
            "id": 83,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "dotnet-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Kotlin Backend Developer",
            "id": 84,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "kotlin-server-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Node.js Backend Developer",
            "id": 82,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "node-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Python Backend Developer",
            "id": 80,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "python-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Ruby Backend Developer",
            "id": 85,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "ruby-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Scala Backend Developer",
            "id": 87,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "scala-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 18,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Platforms",
          "id": 20,
          "rationale": "Underlying cloud providers that host the managed services or infrastructure used by the role, such as AWS, Azure, and GCP.",
          "slug": "cloud-platforms",
          "source": "db"
        },
        "dimension_id": 20,
        "input_skill": "AWS",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": ".NET Backend Developer",
            "id": 83,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "dotnet-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Cyber Security Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          },
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          },
          {
            "display_name": "DevOps Engineer",
            "id": 10,
            "rationale": null,
            "role_archetype": null,
            "slug": "devops-engineer",
            "source": "db"
          },
          {
            "display_name": "Fullstack Developer",
            "id": 15,
            "rationale": null,
            "role_archetype": null,
            "slug": "full-stack-engineer",
            "source": "db"
          },
          {
            "display_name": "Go Backend Developer",
            "id": 81,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "go-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Java Backend Developer",
            "id": 79,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "java-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Kotlin Backend Developer",
            "id": 84,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "kotlin-server-backend-developer",
            "source": "db"
          },
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          },
          {
            "display_name": "Node.js Backend Developer",
            "id": 82,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "node-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Python Backend Developer",
            "id": 80,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "python-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Scala Backend Developer",
            "id": 87,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "scala-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 187,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Platforms for AI Deployment",
          "id": 211,
          "rationale": "Major cloud services that provide infrastructure and managed services for AI workloads.",
          "slug": "cloud-platforms-for-ai-deployment",
          "source": "db"
        },
        "dimension_id": 211,
        "input_skill": "AWS",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "AI Engineer",
            "id": 13,
            "rationale": null,
            "role_archetype": null,
            "slug": "ai-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 187,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Provider Platforms",
          "id": 131,
          "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
          "slug": "cloud-provider-platforms",
          "source": "db"
        },
        "dimension_id": 131,
        "input_skill": "AWS",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          },
          {
            "display_name": "Cloud Security Engineer",
            "id": 23,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-security-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 187,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Security Posture Tools",
          "id": 64,
          "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
          "slug": "cloud-security-posture-tools",
          "source": "db"
        },
        "dimension_id": 64,
        "input_skill": "AWS",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Security Engineer",
            "id": 23,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-security-engineer",
            "source": "db"
          },
          {
            "display_name": "Cyber Security Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 187,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Vendor Product Families",
          "id": 477,
          "rationale": "Coordinate usage, licensing, and architecture decisions for major vendor software and cloud product families.",
          "slug": "vendor-product-families",
          "source": "db"
        },
        "dimension_id": 477,
        "input_skill": "AWS",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Engineering Manager",
            "id": 121,
            "rationale": null,
            "role_archetype": null,
            "slug": "engineering-manager",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 187,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Asynchronous Messaging and Event Streaming",
          "id": 297,
          "rationale": "Asynchronous communication patterns and broker technologies used to decouple backend services and move work off the request path. Includes queues, pub/sub, event streams, consumer groups, dead-letter queues, and delivery semantics across systems such as Kafka, RabbitMQ, NATS, SQS/SNS, Pulsar, and ActiveMQ.",
          "slug": "asynchronous-messaging-and-event-streaming",
          "source": "db"
        },
        "dimension_id": 297,
        "input_skill": "Kafka",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": ".NET Backend Developer",
            "id": 83,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "dotnet-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Go Backend Developer",
            "id": 81,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "go-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Kotlin Backend Developer",
            "id": 84,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "kotlin-server-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Node.js Backend Developer",
            "id": 82,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "node-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Scala Backend Developer",
            "id": 87,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "scala-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 36,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Messaging and Background Jobs",
          "id": 291,
          "rationale": "Asynchronous processing patterns and worker systems used to decouple backend work from request handling. This is a coherent cluster because the role supports background jobs, retries, and deferred processing.",
          "slug": "messaging-and-background-jobs",
          "source": "db"
        },
        "dimension_id": 291,
        "input_skill": "Kafka",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "PHP Backend Developer",
            "id": 86,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "php-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Python Backend Developer",
            "id": 80,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "python-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Ruby Backend Developer",
            "id": 85,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "ruby-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 36,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Messaging and Event Streaming",
          "id": 8,
          "rationale": "Transport-layer systems used to move events and decouple producers from consumers. Data engineers use these systems to ingest, buffer, and distribute event data before downstream processing.",
          "slug": "messaging-and-event-streaming",
          "source": "db"
        },
        "dimension_id": 8,
        "input_skill": "Kafka",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 36,
        "skill_tag": "in_db",
        "skipped_reason": null
      }
    ],
    "new_skills_created": 0,
    "role_dimension_saved": 0,
    "skill_dimension_saved": 0,
    "skipped": 0
  },
  "planner_output": null,
  "run_id": "f4313d5f-0537-4f83-a2db-fdff084c63c1"
}

LLM Calls

Every model call made for this run, in pipeline order.
