
Pipeline run

1f106d71-338e-40ee-a69a-09957abcd98f

Pipeline LLM cost (USD)
API 1: $0.0036 API 2: $0.1689 API 3: $0.0000 Total: $0.1726
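The three displayed per-API costs add up to $0.1725, which is $0.0001 less than the displayed total. One plausible explanation, not confirmed anywhere on this page, is that the total is computed from unrounded per-API costs and rounded once at the end. A minimal check using only the figures shown above:

```python
from decimal import Decimal

# Per-API costs exactly as displayed on the cost line (USD, 4 decimal places).
displayed_costs = {
    "API 1": Decimal("0.0036"),
    "API 2": Decimal("0.1689"),
    "API 3": Decimal("0.0000"),
}
displayed_total = Decimal("0.1726")

# Decimal avoids binary floating-point noise when comparing currency values.
component_sum = sum(displayed_costs.values(), Decimal("0"))
discrepancy = displayed_total - component_sum

print(f"sum of displayed components: ${component_sum}")
print(f"displayed total:             ${displayed_total}")
print(f"difference:                  ${discrepancy}")
```

A difference of one unit in the last displayed digit is consistent with rounding, so it does not by itself indicate a billing error.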

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description
role baseline loaded sources · ai_index: jd · nature_of_work: jd · tech_stack_maturity: jd
Nature of work · Data pipeline development
Design and optimize large-scale Spark/Kafka/Hadoop data pipelines and core AdCloud platform services, processing 10B+ events/day with a focus on scalability, reliability, and cost efficiency.
"• Develop high-throughput data pipelines processing 10B+ events per day"
Tech stack maturity
Mainstream Modern
The stack centers on widely adopted distributed data and cloud technologies like Spark, Kafka, AWS/Azure/GCP, Cassandra, and Java/Scala, which are characteristic of mainstream modern data engineering.
AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)
0.50 / 5
Title match · Has AI skill · AI skill (primary) · AI skill (secondary) · On AI team · Builds AI products
vocab breakdown (legacy)
Assistants (×1):
Frameworks (×2):
Models / concepts (×3): AI, ML, Machine Learning, Artificial Intelligence
Evidence — skills matched in JD (18)
Apache Spark Kafka Hadoop HBase Aerospike Cassandra Java Scala AWS GCP Azure RDBMS NoSQL Machine Learning Artificial Intelligence Data Lakes Lakehouse Event-Driven Architecture
Skill cluster (6 dimension groups, role-scoped)
Cloud Platforms
AWS GCP Azure
ETL and ELT Tooling
Apache Spark Hadoop
Programming Languages for Data Work
Java Scala
AI Governance and Model Security
Machine Learning
Messaging and Event Streaming
Kafka
Cross-cutting / unaligned
HBase Aerospike Cassandra RDBMS NoSQL Artificial Intelligence Data Lakes Lakehouse Event-Driven Architecture
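The grouping above can be read as a lookup from skill to dimension group, with a fallback bucket for skills that have no role-scoped group. The sketch below reproduces the six groups from the 18 matched skills. The mapping is copied from this page, while the function name `cluster_skills` and the fallback rule are assumptions for illustration, not the pipeline's confirmed logic.

```python
from collections import defaultdict
from typing import Dict, List

UNALIGNED = "Cross-cutting / unaligned"

# Skill -> dimension group, copied from the role-scoped groups listed above.
SKILL_TO_GROUP: Dict[str, str] = {
    "AWS": "Cloud Platforms",
    "GCP": "Cloud Platforms",
    "Azure": "Cloud Platforms",
    "Apache Spark": "ETL and ELT Tooling",
    "Hadoop": "ETL and ELT Tooling",
    "Java": "Programming Languages for Data Work",
    "Scala": "Programming Languages for Data Work",
    "Machine Learning": "AI Governance and Model Security",
    "Kafka": "Messaging and Event Streaming",
}

# The 18 skills matched in the JD, in the order shown in the evidence line.
MATCHED_SKILLS: List[str] = [
    "Apache Spark", "Kafka", "Hadoop", "HBase", "Aerospike", "Cassandra",
    "Java", "Scala", "AWS", "GCP", "Azure", "RDBMS", "NoSQL",
    "Machine Learning", "Artificial Intelligence", "Data Lakes",
    "Lakehouse", "Event-Driven Architecture",
]


def cluster_skills(skills: List[str], skill_to_group: Dict[str, str]) -> Dict[str, List[str]]:
    """Group skills by dimension group; unmapped skills go to the fallback bucket."""
    clusters: Dict[str, List[str]] = defaultdict(list)
    for skill in skills:
        clusters[skill_to_group.get(skill, UNALIGNED)].append(skill)
    return dict(clusters)


clusters = cluster_skills(MATCHED_SKILLS, SKILL_TO_GROUP)
for group, members in clusters.items():
    print(f"{group}: {', '.join(members)}")
```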
KRA description
• Design and build scalable, distributed data systems for real-time and batch processing
• Develop high-throughput data pipelines processing 10B+ events per day
• Contribute to and own key components in the technical design and implementation of core AdCloud platform components
• Work with technologies such as Apache Spark, Kafka, Hadoop ecosystem, and modern data platforms
• Ensure performance, scalability, reliability, and cost efficiency of data systems
• Collaborate with Product, Engineering, and Data Science teams to deliver end-to-end solutions
• Participate in design and code reviews to maintain high engineering standards
• Continuously improve system robustness, scalability, and developer productivity
• 10+ years of experience in designing and developing large-scale data-driven systems
• Strong experience with distributed data processing frameworks (Spark, Kafka, Hadoop, etc.)
• Experience with NoSQL systems (HBase, Aerospike, Cassandra) and RDBMS
• Strong programming skills in Java, Scala, or similar languages
• Solid understanding of data structures, algorithms, and system design
• Experience building scalable systems on cloud platforms (AWS/GCP/Azure)
• Strong focus on performance optimization and cost efficiency
• Excellent communication and collaboration skills
• Experience in AdTech or high-scale event-driven systems
• Exposure to data governance, data quality, and metadata systems
• Experience supporting ML/AI data pipelines
• Familiarity with modern data architectures (data lakes, lakehouse, etc.)
• Contributions to open-source or technical communities

Signals

Skill backend-engineer
0.33
Alias
KRA cloud-architect
0.43

Post-classification

Centroid updated · n=16
Alias collision log
New-role queue
New skills captured: 11
New KRA captured

Captured for admin review

Apache Spark primary Data Engineer pending
Hadoop primary Data Engineer pending
HBase primary Data Engineer pending
Aerospike primary Data Engineer pending
Cassandra primary Data Engineer pending
RDBMS primary Data Engineer pending
Machine Learning Data Engineer pending
Artificial Intelligence Data Engineer pending
Data Lakes Data Engineer pending
Lakehouse Data Engineer pending
Event-Driven Architecture Data Engineer pending
Status: completed · Created: 2026-05-19T06:11:09.061650Z · Updated: 2026-05-19T06:13:10.453152Z · API 3 duration: 10374 ms
Flow Current 3-step pipeline

1 POST /skills/extract-from-jd

2 POST /skills/extract-details

3 POST /skills/final-role-output
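The three steps above run in sequence, so one way to picture the flow is a loop that posts to each endpoint and feeds the response into the next call. This is only a sketch. The endpoint paths come from this page, but the request and response shapes, the `run_pipeline` function, and the `fake_post` transport are hypothetical and stand in for whatever the real service uses.

```python
from typing import Any, Callable, Dict, List

# Endpoint paths are taken from the flow above; everything else is hypothetical.
PIPELINE_STEPS: List[str] = [
    "/skills/extract-from-jd",
    "/skills/extract-details",
    "/skills/final-role-output",
]

# A transport takes (path, request body) and returns the response body.
Transport = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_pipeline(jd_text: str, post: Transport) -> Dict[str, Any]:
    """Call the three endpoints in order, passing each response to the next step."""
    payload: Dict[str, Any] = {"job_description": jd_text}  # assumed request shape
    for path in PIPELINE_STEPS:
        payload = post(path, payload)
    return payload


def fake_post(path: str, body: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in transport that only records which paths were visited."""
    visited = list(body.get("visited", []))
    visited.append(path)
    return {"visited": visited}


result = run_pipeline("Design and build scalable data systems", fake_post)
print(result["visited"])
```

Injecting the transport keeps the sketch runnable without network access; a real client would swap `fake_post` for an HTTP call.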

Role Chosen role & resolution

Data Engineer

CASE F

slug: data-engineer · id: 2 · source: db

The primary skills indicate a strong focus on data processing technologies and cloud platforms, aligning well with a Data Engineer role.

Resolution: in_db — role exists in library; skill↔dim and role↔dim links saved when applicable.

New skills: 11 · Skill↔dim saved: 13 · Role↔dim saved: 0 · Skipped: 0

Job description

Key Responsibilities
• Design and build scalable, distributed data systems for real-time and batch processing
• Develop high-throughput data pipelines processing 10B+ events per day
• Contribute to and own key components in the technical design and implementation of core AdCloud platform components
• Work with technologies such as Apache Spark, Kafka, Hadoop ecosystem, and modern data platforms
• Ensure performance, scalability, reliability, and cost efficiency of data systems
• Collaborate with Product, Engineering, and Data Science teams to deliver end-to-end solutions
• Participate in design and code reviews to maintain high engineering standards
• Continuously improve system robustness, scalability, and developer productivity

Qualifications
• 10+ years of experience in designing and developing large-scale data-driven systems
• Strong experience with distributed data processing frameworks (Spark, Kafka, Hadoop, etc.)
• Experience with NoSQL systems (HBase, Aerospike, Cassandra) and RDBMS
• Strong programming skills in Java, Scala, or similar languages
• Solid understanding of data structures, algorithms, and system design
• Experience building scalable systems on cloud platforms (AWS/GCP/Azure)
• Strong focus on performance optimization and cost efficiency
• Excellent communication and collaboration skills

Nice to Have
• Experience in AdTech or high-scale event-driven systems
• Exposure to data governance, data quality, and metadata systems
• Experience supporting ML/AI data pipelines
• Familiarity with modern data architectures (data lakes, lakehouse, etc.)
• Contributions to open-source or technical communities

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

Apache Spark Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

Maturity well_known confidence 0.95

Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it.

Vendor & license

Apache Software Foundation · apache_2 · since 2010 (0.95)

Context keywords
RDD DataFrame Spark SQL MLlib Spark Streaming DAGScheduler Cluster Manager Apache Kafka Hadoop PySpark Scala SparkSession ETL Data Lake Machine Learning
Ambiguity low

“Apache Spark” is a specific, widely recognized distributed data processing framework; typical JDs won’t confuse it with other distinct ETL/streaming tools.

Versioning

Versioned 3.x

{
  "apache spark 3": "3.x",
  "spark": "3.x",
  "spark 3": "3.x",
  "spark 3.x": "3.x",
  "spark3": "3.x"
}
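The alias table above maps raw mentions to a single version tag. Because every key is lowercase, a lookup presumably normalizes the mention first. The sketch below assumes lowercasing and whitespace collapsing; the function name `resolve_version` and that normalization rule are illustrative, not taken from the pipeline.

```python
from typing import Optional

# Alias table copied from the JSON above.
SPARK_VERSION_ALIASES = {
    "apache spark 3": "3.x",
    "spark": "3.x",
    "spark 3": "3.x",
    "spark 3.x": "3.x",
    "spark3": "3.x",
}


def resolve_version(mention: str) -> Optional[str]:
    """Return the version tag for a raw JD mention, or None when no alias matches."""
    key = " ".join(mention.lower().split())  # lowercase and collapse whitespace
    return SPARK_VERSION_ALIASES.get(key)


print(resolve_version("  Spark   3.X "))
print(resolve_version("Spark 2"))
```

Returning `None` for an unknown mention leaves the decision about unversioned or unseen variants to the caller instead of guessing a tag.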
Type assignment

Framework · distributed_data_processing_framework confidence 0.94

Apache Spark is a structured codebase that users build data applications and pipelines inside, so by the Tool vs Framework rule it is a Framework rather than a Tool.

Derived legacy fields
Category
Framework
Sub-category
distributed_data_processing_framework
Skill nature
FRAMEWORK
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
SEPARATE_ENTITY

Dimensions (API 2 worklist)

  • ETL and ELT Tooling Catalog dimension db id 24

    Library dimension (catalog)

    Roles linked in library: Data Engineer

Locked dimensions (v3 placement)

  • Distributed Data Processing Frameworks

    Reuses catalog slug

    Frameworks used to process large datasets in batch or streaming pipelines, often as part of ETL/ELT workflows. Apache Spark belongs here because it is a core engine for distributed transformation, aggregation, and data movement.

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
ETL and ELT Tooling
etl-and-elt-tooling
New skill saved · Existing dimension (library) · Role↔dimension saved
Kafka Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Kafka id=36 · kafka

Aliases — catalog

  • Kafka (CANONICAL) primary

Context tags (catalog)

Apache Flink Apache Kafka Apache Pulsar Apache Spark Avro KSQL Kafka API Kafka Connect Kafka Streams ZooKeeper Zookeeper backpressure brokers consumer consumer group consumer groups event sourcing event-driven architecture exactly-once semantics fault tolerance high throughput log compaction message broker message queue microservices offsets partition partitioning partitions producer producer API real-time analytics real-time data replication schema registry stream processing topic topic partitioning topics

Stored enrichment (catalog DB)

Category
Datastore
Sub-category
Event Stream Store
Vendor
Confluent
License
apache_2
Year introduced
2011
Confidence
0.90
Version strategy
NOT_APPLICABLE

Maturity reasoning: Kafka appears in many production JDs for event streaming and data pipelines, and remains a standard platform in cloud/vendor offerings (e.g., Confluent, AWS MSK), indicating broad hiring demand.

Skill profile (library / DB)

Skill nature
PLATFORM
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
9
Sub-category id
47
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Messaging and Event Streaming Catalog dimension db id 8

    Library dimension (catalog)

    Roles linked in library: Backend Engineer, Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Messaging and Event Streaming
messaging-and-event-streaming
Existing dimension (library) · Role↔dimension saved
Hadoop Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

Maturity niche confidence 0.91

Job postings still mention Hadoop for legacy big-data stacks, but JD volume has fallen as Spark and cloud warehouses replaced MapReduce-era clusters.

Vendor & license

Apache Software Foundation · apache_2 · since 2006 (0.95)

Context keywords
MapReduce HDFS YARN Hive Pig Spark Sqoop Flume Oozie Kafka NoSQL Big Data Data Lake ETL ELT Distributed Computing
Ambiguity low

“Hadoop” is a specific data processing framework; typical JDs distinguish it from other big data tools like Spark or Hive.

Versioning

Not versioned

Type assignment

Framework · data_processing_framework confidence 0.90

Hadoop is fundamentally a structured software stack that users build distributed data applications and jobs within, so it fits the Framework category rather than a Tool or Platform.

Derived legacy fields
Category
Framework
Sub-category
data_processing_framework
Skill nature
FRAMEWORK
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
NOT_APPLICABLE

Dimensions (API 2 worklist)

  • ETL and ELT Tooling Catalog dimension db id 24

    Library dimension (catalog)

    Roles linked in library: Data Engineer

Locked dimensions (v3 placement)

  • Distributed Data Processing Platforms

    Reuses catalog slug

    Tools and frameworks used to ingest, store, and process large-scale batch data across clusters. Hadoop belongs here because it is a foundational platform for distributed storage and computation in data engineering workflows.

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
ETL and ELT Tooling
etl-and-elt-tooling
New skill saved · Existing dimension (library) · Role↔dimension saved
HBase Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

Maturity niche confidence 0.86

HBase appears in a limited set of big-data/legacy Hadoop job postings, while newer JDs more often specify DynamoDB, Bigtable, or Cassandra; its market demand is specialized rather than broad.

Vendor & license

Apache Software Foundation · apache_2 · since 2010 (0.95)

Context keywords
Hadoop NoSQL Bigtable MapReduce Apache column family scalability distributed data model real-time table design region server Thrift REST API data replication
Ambiguity low

HBase is a specific Apache wide-column NoSQL datastore; JDs typically distinguish it from other datastores.

Versioning

Not versioned

Type assignment

Datastore · wide_column_store confidence 0.98

HBase is fundamentally a persistent data system, so by the Datastore vs Format rule it is a datastore rather than a tool or framework.

Derived legacy fields
Category
Datastore
Sub-category
wide_column_store
Skill nature
TOOL
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
NOT_APPLICABLE

Dimensions (API 2 worklist)

  • Cloud Storage and Data Services Catalog dimension db id 144

    Library dimension (catalog)

    Roles linked in library: Cloud Architect

Locked dimensions (v3 placement)

  • Distributed Data Storage Systems

    Reuses catalog slug

    Storage systems used to persist large-scale application and analytical data with low-latency access and horizontal scaling. HBase fits here as a distributed NoSQL store built on top of Hadoop storage infrastructure.

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Cloud Storage and Data Services
cloud-storage-and-data-services
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Aerospike Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

Maturity niche confidence 0.86

Aerospike appears in a limited set of high-scale datastore JDs and vendor case studies, but it is far less common than PostgreSQL, Redis, or MongoDB in general hiring pipelines.

Vendor & license

Aerospike Inc. · apache_2 · since 2012 (0.95)

Context keywords
NoSQL distributed database high availability scalability data model key-value store latency replication cluster management data partitioning Aerospike client TTL secondary indexes real-time analytics data persistence
Ambiguity low

Aerospike is a specific distributed NoSQL database name; unlikely to be confused with other catalog datastore skills.

Versioning

Not versioned

Type assignment

Datastore · distributed_nosql_datastore confidence 0.98

Aerospike is fundamentally a system that persists and serves data, so by the Datastore vs Format rule it is a Datastore rather than a tool or platform.

Derived legacy fields
Category
Datastore
Sub-category
distributed_nosql_datastore
Skill nature
TOOL
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
NOT_APPLICABLE

Dimensions (API 2 worklist)

  • React Frontend Development Catalog dimension db id 96

    Library dimension (catalog)

Locked dimensions (v3 placement)

  • Distributed NoSQL Databases

    Pipeline tentative id

    Distributed NoSQL databases used for low-latency key-value access, horizontal scaling, and high availability. Aerospike belongs here because it is a distributed database platform rather than a general storage or cloud service.

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
React Frontend Development
d_init_01
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Cassandra Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

Maturity well_known confidence 0.84

Apache Cassandra appears in many production data-platform JDs and is a common choice for high-write, distributed workloads; GitHub and vendor docs show sustained activity rather than sunset signals.

Vendor & license

Apache Software Foundation · apache_2 · since 2008 (0.95)

Context keywords
CQL DataStax TinkerPop Spark ScyllaDB Replication Partitioning Cluster NoSQL Wide Column Consistency Data Modeling DSE Thrift Eventual Consistency
Ambiguity low

“Cassandra” is a specific wide-column NoSQL database name; unlikely to be confused with other catalog datastore skills.

Versioning

Not versioned

Type assignment

Datastore · wide_column_store confidence 0.99

Cassandra is fundamentally a distributed database that persists data, so by the Datastore vs Format rule it is a Datastore.

Derived legacy fields
Category
Datastore
Sub-category
wide_column_store
Skill nature
TOOL
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
NOT_APPLICABLE

Dimensions (API 2 worklist)

  • Cloud Storage and Data Services Catalog dimension db id 144

    Library dimension (catalog)

    Roles linked in library: Cloud Architect

Locked dimensions (v3 placement)

  • Distributed Data Storage Systems

    Reuses catalog slug

    Managed and self-hosted data stores used to persist application data with high availability and horizontal scale. Cassandra belongs here because it is a distributed wide-column database chosen for partitioning, replication, and fault-tolerant storage.

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Cloud Storage and Data Services
cloud-storage-and-data-services
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Java Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Java id=1 · java

Aliases — catalog

  • Java (CANONICAL) primary
  • JDK (VERSION)
  • JDK 10 (VERSION)
  • JDK 11 (VERSION)
  • JDK 12 (VERSION)
  • JDK 13 (VERSION)
  • JDK 14 (VERSION)
  • JDK 15 (VERSION)
  • JDK 16 (VERSION)
  • JDK 17 (VERSION)
  • JDK 18 (VERSION)
  • JDK 19 (VERSION)
  • JDK 20 (VERSION)
  • JDK 21 (VERSION)
  • JDK 5 (VERSION)
  • JDK 6 (VERSION)
  • JDK 7 (VERSION)
  • JDK 8 (VERSION)
  • JDK 9 (VERSION)
  • Java 1.0 (VERSION)
  • Java 1.1 (VERSION)
  • Java 1.2 (VERSION)
  • Java 1.3 (VERSION)
  • Java 1.4 (VERSION)
  • Java 1.5 (VERSION)
  • Java 1.6 (VERSION)
  • Java 1.7 (VERSION)
  • Java 1.8 (VERSION)
  • Java 10 (VERSION)
  • Java 11 (VERSION)
  • Java 12 (VERSION)
  • Java 13 (VERSION)
  • Java 14 (VERSION)
  • Java 15 (VERSION)
  • Java 16 (VERSION)
  • Java 17 (VERSION)
  • Java 18 (VERSION)
  • Java 19 (VERSION)
  • Java 20 (VERSION)
  • Java 21 (VERSION)
  • Java 5 (VERSION)
  • Java 6 (VERSION)
  • Java 7 (VERSION)
  • Java 8 (VERSION)
  • Java 9 (VERSION)
  • Java11 (VERSION)
  • Java17 (VERSION)
  • Java21 (VERSION)
  • Java8 (VERSION)
  • OpenJDK 11 (VERSION)
  • OpenJDK 17 (VERSION)
  • OpenJDK 21 (VERSION)
  • OpenJDK 8 (VERSION)
  • java 11 (VERSION)
  • java 17 (VERSION)
  • java 21 (VERSION)
  • java 4 (VERSION)
  • java 5 (VERSION)
  • java 6 (VERSION)
  • java 7 (VERSION)
  • java 8 (VERSION)
  • java lts (VERSION)
  • java-11 (VERSION)
  • java-17 (VERSION)
  • java-21 (VERSION)
  • java-4 (VERSION)
  • java-5 (VERSION)
  • java-6 (VERSION)
  • java-7 (VERSION)
  • java-8 (VERSION)
  • java11 (VERSION)
  • java17 (VERSION)
  • java21 (VERSION)
  • java4 (VERSION)
  • java5 (VERSION)
  • java6 (VERSION)
  • java7 (VERSION)
  • java8 (VERSION)
  • jdk 11 (VERSION)
  • jdk 17 (VERSION)
  • jdk 21 (VERSION)
  • jdk 4 (VERSION)
  • jdk 5 (VERSION)
  • jdk 6 (VERSION)
  • jdk 7 (VERSION)
  • jdk 8 (VERSION)
  • jdk11 (VERSION)
  • jdk17 (VERSION)
  • jdk21 (VERSION)
  • jdk4 (VERSION)
  • jdk5 (VERSION)
  • jdk6 (VERSION)
  • jdk7 (VERSION)
  • jdk8 (VERSION)
  • jvm21 (VERSION)

Context tags (catalog)

APIs Apache Tomcat Concurrency Design patterns Garbage collection GraalVM Gradle Hibernate JDBC JDK JPA JUnit JVM Java 8 Java EE JavaFX Kafka Lambda expressions Maven Microservices Mockito Object-oriented REST RESTful SOAP Servlets Spring Spring Boot Tomcat microservices

Stored enrichment (catalog DB)

Category
Language
Sub-category
Programming Language
Vendor
Oracle
License
other_open
Year introduced
1995
Confidence
0.99
Version strategy
SEPARATE_ENTITY
Version tag
21

Maturity reasoning: Java is a hiring-pipeline staple with very high JD volume across enterprise backend, Android, and cloud roles; it remains widely supported by major vendors and frameworks like Spring.

Skill profile (library / DB)

Skill nature
LANGUAGE
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
6
Sub-category id
96
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Kotlin and Java Catalog dimension db id 161

    Library dimension (catalog)

    Roles linked in library: Android Engineer

  • Programming Languages Catalog dimension db id 1

    Library dimension (catalog)

    Roles linked in library: Backend Engineer

  • Programming Languages for Data Work Catalog dimension db id 21

    Library dimension (catalog)

    Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Kotlin and Java
kotlin-and-java
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages
programming-languages
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages for Data Work
programming-languages-for-data-work
Existing dimension (library) · Role↔dimension saved
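The three outcomes above follow a pattern that also holds for the other skills on this page: the role↔dimension link is saved when the chosen role is among the roles linked to that dimension in the library, and skipped otherwise. The sketch below applies that inferred rule to the Java row. The function name `role_link_outcome` is hypothetical, and the rule is read off the displayed outcomes rather than confirmed from the pipeline's code.

```python
from typing import Dict, List

CHOSEN_ROLE = "Data Engineer"


def role_link_outcome(dimension_roles: List[str], chosen_role: str) -> str:
    """Inferred rule: save the link only if the dimension is linked to the chosen role."""
    if chosen_role in dimension_roles:
        return "Role↔dimension saved"
    return "Role↔dimension skipped (dimension not under chosen role)"


# Dimensions and their library-linked roles, copied from the Java worklist above.
java_dimensions: Dict[str, List[str]] = {
    "Kotlin and Java": ["Android Engineer"],
    "Programming Languages": ["Backend Engineer"],
    "Programming Languages for Data Work": ["Data Engineer"],
}

outcomes = {
    name: role_link_outcome(roles, CHOSEN_ROLE)
    for name, roles in java_dimensions.items()
}
for name, outcome in outcomes.items():
    print(f"{name}: {outcome}")
```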
Scala Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Scala id=102 · scala

Aliases — catalog

  • Scala (CANONICAL) primary

Context tags (catalog)

Akka Apache Kafka Cats Flink JVM Monads Play Framework SBT ScalaTest Shapeless Spark Spark SQL ZIO case class for-comprehension functional programming implicit pattern matching typeclass

Stored enrichment (catalog DB)

Category
Language
Sub-category
Programming Language
Vendor
EPFL
License
apache_2
Year introduced
2004
Confidence
0.99
Version strategy
NOT_APPLICABLE

Maturity reasoning: Scala still appears in many backend/data engineering JDs, especially with Spark and Akka, and remains supported by major JVM ecosystems; it’s not a sunset technology.

Skill profile (library / DB)

Skill nature
LANGUAGE
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
6
Sub-category id
96
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Programming Languages for Data Work Catalog dimension db id 21

    Library dimension (catalog)

    Roles linked in library: Data Engineer

  • Programming Languages for ML Systems Catalog dimension db id 39

    Library dimension (catalog)

    Roles linked in library: ML Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Programming Languages for Data Work
programming-languages-for-data-work
Existing dimension (library) · Role↔dimension saved
Programming Languages for ML Systems
programming-languages-for-ml-systems
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
AWS Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: AWS id=187 · aws

Aliases — catalog

  • AWS (CANONICAL) primary

Context tags (catalog)

API Gateway AWS CLI Auto Scaling CloudFormation CloudFront CloudTrail CloudWatch Cognito DynamoDB EC2 ECS EKS Elastic Beanstalk Elastic Load Balancing IAM KMS Lambda RDS Route 53 S3 SNS SQS Serverless VPC

Stored enrichment (catalog DB)

Category
Platform
Sub-category
Cloud Platform
Vendor
Amazon
License
other_open
Year introduced
2006
Confidence
0.99
Version strategy
NOT_APPLICABLE

Maturity reasoning: AWS is a hiring-pipeline staple: it appears in a large share of cloud/DevOps job descriptions and dominates public cloud market share, with broad certification and vendor ecosystem support.

Skill profile (library / DB)

Skill nature
PLATFORM
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
9
Sub-category id
46
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Cloud Platforms Catalog dimension db id 20

    Library dimension (catalog)

    Roles linked in library: Backend Engineer, Cybersecurity Engineer, Data Engineer, DevOps Engineer, ML Engineer

  • Cloud Platforms for AI Deployment Catalog dimension db id 211

    Library dimension (catalog)

    Roles linked in library: AI Engineer

  • Cloud Provider Platforms Catalog dimension db id 131

    Library dimension (catalog)

    Roles linked in library: Cloud Architect

  • Cloud Security Posture Tools Catalog dimension db id 64

    Library dimension (catalog)

    Roles linked in library: Cybersecurity Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Cloud Platforms
cloud-platforms
Existing dimension (library) · Role↔dimension saved
Cloud Platforms for AI Deployment
cloud-platforms-for-ai-deployment
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Cloud Provider Platforms
cloud-provider-platforms
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Cloud Security Posture Tools
cloud-security-posture-tools
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
GCP Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: GCP id=186 · gcp

Aliases — catalog

  • GCP (CANONICAL) primary

Context tags (catalog)

Anthos App Engine Artifact Registry BigQuery Cloud Build Cloud Composer Cloud Functions Cloud Logging Cloud Monitoring Cloud Run Cloud SQL Cloud Spanner Cloud Storage Compute Engine Dataflow GKE IAM Kubernetes Pub/Sub Service Accounts Stackdriver Terraform VPC

Stored enrichment (catalog DB)

Category
Platform
Sub-category
Cloud Platform
Vendor
Google
License
other_open
Year introduced
2011
Confidence
0.99
Version strategy
NOT_APPLICABLE

Maturity reasoning: GCP appears frequently in cloud/platform job descriptions and is a major hyperscaler alongside AWS/Azure, with broad enterprise adoption and active vendor investment.

Skill profile (library / DB)

Skill nature
PLATFORM
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
9
Sub-category id
46
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Cloud Platforms Catalog dimension db id 20

    Library dimension (catalog)

    Roles linked in library: Backend Engineer, Cybersecurity Engineer, Data Engineer, DevOps Engineer, ML Engineer

  • Cloud Platforms for AI Deployment Catalog dimension db id 211

    Library dimension (catalog)

    Roles linked in library: AI Engineer

  • Cloud Security Posture Tools Catalog dimension db id 64

    Library dimension (catalog)

    Roles linked in library: Cybersecurity Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Cloud Platforms
cloud-platforms
Existing dimension (library) · Role↔dimension saved
Cloud Platforms for AI Deployment
cloud-platforms-for-ai-deployment
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Cloud Security Posture Tools
cloud-security-posture-tools
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Azure Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Azure id=188 · azure

Aliases — catalog

  • Azure (CANONICAL) primary

Context tags (catalog)

AKS ARM templates App Service Azure AD Azure Active Directory Azure App Service Azure Blob Azure Blob Storage Azure Cognitive Services Azure Cosmos DB Azure DevOps Azure DevTest Labs Azure Functions Azure Kubernetes Service Azure Logic Apps Azure Monitor Azure Networking Azure Resource Manager Azure SQL Azure SQL Database Azure Security Center Azure Storage Azure Storage Explorer Azure Virtual Machines Bicep Blob Storage Cloud Services Cosmos DB Entra ID Functions Infrastructure as Code Key Vault Log Analytics Logic Apps Resource Groups Serverless Computing Service Bus Storage Account Terraform Virtual Machines

Stored enrichment (catalog DB)

Category
Platform
Sub-category
Cloud Platform
Vendor
Microsoft
License
proprietary
Year introduced
2010
Confidence
0.99
Version strategy
NOT_APPLICABLE

Maturity reasoning: Azure is broadly adopted and frequently appears in cloud/platform job descriptions alongside AWS and GCP; Microsoft’s ongoing enterprise investment and Azure certification demand signal strong hiring-pipeline relevance.

Skill profile (library / DB)

Skill nature
PLATFORM
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
9
Sub-category id
46
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Cloud Platforms Catalog dimension db id 20

    Library dimension (catalog)

    Roles linked in library: Backend Engineer, Cybersecurity Engineer, Data Engineer, DevOps Engineer, ML Engineer

  • Cloud Platforms for AI Deployment Catalog dimension db id 211

    Library dimension (catalog)

    Roles linked in library: AI Engineer

  • Cloud Provider Platforms Catalog dimension db id 131

    Library dimension (catalog)

    Roles linked in library: Cloud Architect

  • Cloud Security Posture Tools Catalog dimension db id 64

    Library dimension (catalog)

    Roles linked in library: Cybersecurity Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Cloud Platforms
cloud-platforms
Existing dimension (library) · Role↔dimension saved
Cloud Platforms for AI Deployment
cloud-platforms-for-ai-deployment
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Cloud Provider Platforms
cloud-provider-platforms
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Cloud Security Posture Tools
cloud-security-posture-tools
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
RDBMS Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

Maturity well_known confidence 0.98

RDBMS is a core requirement in many job descriptions across backend, data, and DBA roles; PostgreSQL, MySQL, and SQL Server remain standard enterprise stacks.

Vendor & license

(0.90)

Context keywords
SQL ACID normalization indexes transactions joins stored procedures views foreign keys data integrity schema design ER diagrams database tuning backup and recovery query optimization data modeling
Ambiguity low

RDBMS is a standard, specific datastore category (relational DBMS) with little overlap with other distinct skills in typical JDs.

Versioning

Not versioned

Type assignment

Datastore · relational_database_management_system confidence 0.98

RDBMS is fundamentally a system that persists and manages data, so under the Datastore vs Format rule it is a Datastore rather than a tool or concept.

Derived legacy fields
Category
Datastore
Sub-category
relational_database_management_system
Skill nature
TOOL
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
NOT_APPLICABLE

Dimensions (API 2 worklist)

  • React Frontend Development Catalog dimension db id 96

    Library dimension (catalog)

Locked dimensions (v3 placement)

  • Relational Database Systems

    Pipeline tentative id

    Relational database management systems used to store, query, and maintain structured data with tables, keys, constraints, and SQL. RDBMS fits here because it names the core database engine category rather than a specific vendor or data workflow tool.

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
React Frontend Development
d_init_01
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
NoSQL Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: NoSQL id=1346 · nosql

Aliases — catalog

  • NoSQL (CANONICAL)

Context tags (catalog)

CAP theorem · Cassandra · DynamoDB · MongoDB · Redis · column-family · data modeling · document store · eventual consistency · graph database · horizontal scaling · key-value store · query language · schema-less · sharding

Stored enrichment (catalog DB)

Category
Concept
Sub-category
Database Paradigm
Confidence
0.93
Version strategy
NOT_APPLICABLE

Maturity reasoning: NoSQL is broadly listed in job descriptions across backend/data roles, with MongoDB, DynamoDB, and Cassandra appearing as common market signals; it remains a hiring-pipeline staple rather than a niche or sunset tech.

Skill profile (library / DB)

Skill nature
CONCEPT
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
2
Sub-category id
1019
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • NoSQL Databases Catalog dimension db id 19

    Library dimension (catalog)

    Roles linked in library: Backend Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
NoSQL Databases
nosql-databases
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Machine Learning · Secondary · New / orchestrated · API 3: new canonical path (new) · New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

Maturity well_known confidence 0.97

Machine Learning appears in large volumes of job descriptions across data, product, and platform roles, and major cloud vendors (AWS, Google Cloud, Azure) offer dedicated ML services and certifications, indicating broad adoption.

Vendor & license

(0.95)

Context keywords
TensorFlow · scikit-learn · Keras · PyTorch · neural networks · supervised learning · unsupervised learning · reinforcement learning · feature engineering · model evaluation · hyperparameter tuning · data preprocessing · cross-validation · ensemble methods · natural language processing
Ambiguity low

“Machine Learning” is a standard, specific concept and is unlikely to be confused with other distinct catalog skills in typical job descriptions.

Versioning

Not versioned

Type assignment

Concept · machine_learning confidence 0.98

Machine Learning is a named knowledge unit about building models that learn from data, so by the Concept vs Methodology rule it is a Concept rather than an Architecture or Methodology.

Derived legacy fields
Category
Concept
Sub-category
machine_learning
Skill nature
CONCEPT
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
NOT_APPLICABLE

Dimensions (API 2 worklist)

  • React Frontend Development Catalog dimension db id 96

    Library dimension (catalog)

  • AI Governance and Model Security Catalog dimension db id 50

    Library dimension (catalog)

    Roles linked in library: AI Engineer, ML Engineer

  • AI Governance and Model Security Catalog dimension db id 50

    Library dimension (catalog)

    Roles linked in library: AI Engineer, ML Engineer

Locked dimensions (v3 placement)

  • Machine Learning Fundamentals

    Pipeline tentative id

    Core concepts, methods, and workflows for building predictive models from data. This fits the target skill because machine learning is the umbrella discipline covering model selection, training, validation, and deployment-oriented thinking.

  • AI Governance and Model Security

    Reuses catalog slug

    Controls and documentation used to make models safer, auditable, and compliant. Machine learning practitioners may need this when training or deploying models in regulated or risk-sensitive environments.

  • AI Governance and Model Security

    Reuses catalog slug

    Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
React Frontend Development
d_init_01
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
AI Governance and Model Security
ai-governance-and-model-security
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Artificial Intelligence · Secondary · New / orchestrated · API 3: new canonical path (new) · New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

Maturity well_known confidence 0.96

AI appears in a large and growing share of job descriptions across software, data, and product roles, and major vendors (Microsoft, Google, AWS) have standardized AI offerings, signaling broad market adoption.

Vendor & license

(1.00)

Context keywords
machine learning · neural networks · deep learning · natural language processing · computer vision · reinforcement learning · TensorFlow · PyTorch · data mining · predictive analytics · algorithm optimization · AI ethics · supervised learning · unsupervised learning · model training
Ambiguity low

“Artificial Intelligence” is a broad, standard concept and is unlikely to be confused with a different catalog skill in typical job descriptions.

Versioning

Not versioned

Type assignment

Concept · artificial_intelligence confidence 0.98

Artificial Intelligence is a named knowledge unit about a field of techniques and theory, so by the Concept vs Methodology rule it is a Concept rather than a tool, platform, or methodology.

Derived legacy fields
Category
Concept
Sub-category
artificial_intelligence
Skill nature
CONCEPT
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
NOT_APPLICABLE

Dimensions (API 2 worklist)

  • React Frontend Development Catalog dimension db id 96

    Library dimension (catalog)

Locked dimensions (v3 placement)

  • Artificial Intelligence Concepts

    Pipeline tentative id

    Core concepts, methods, and terminology for building AI systems across symbolic, statistical, and machine-learning approaches. This skill is broad enough to stand as a top-level conceptual dimension when the intent is general AI literacy rather than a specific subdomain.

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
React Frontend Development
d_init_01
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Data Lakes · Secondary · New / orchestrated · API 3: new canonical path (new) · New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

Maturity well_known confidence 0.93

Data lakes are widely listed in cloud/data platform job descriptions and are a standard architecture in AWS, Azure, and GCP ecosystems; they’re a common hiring-pipeline staple rather than a niche pattern.

Vendor & license

(0.80)

Context keywords
AWS Lake Formation · Azure Data Lake · data ingestion · ETL · data governance · schema evolution · data catalog · big data · data warehousing · real-time analytics · data pipelines · data modeling · partitioning · data lakes vs data warehouses · serverless architecture
Ambiguity low

“Data Lakes” is a specific architecture pattern (data lake storage/processing) and is unlikely to be confused with other distinct catalog skills.

Versioning

Not versioned

Type assignment

Architecture · data_lake_architecture confidence 0.90

By the Architecture vs Concept rule, data lakes describe a system-shape for organizing and storing data rather than a specific knowledge unit or product.

Derived legacy fields
Category
Architecture
Sub-category
data_lake_architecture
Skill nature
PATTERN
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
NOT_APPLICABLE

Dimensions (API 2 worklist)

  • Cloud Storage and Data Services Catalog dimension db id 144

    Library dimension (catalog)

    Roles linked in library: Cloud Architect

  • React Frontend Development Catalog dimension db id 96

    Library dimension (catalog)

  • Cloud Storage and Data Services Catalog dimension db id 144

    Library dimension (catalog)

    Roles linked in library: Cloud Architect

Locked dimensions (v3 placement)

  • Cloud Storage and Data Services

    Reuses catalog slug

    Cloud-native storage and managed data services used to store large analytical datasets, define retention, and support lake-style architectures. Data Lakes fit here because they are typically built on object storage and adjacent managed services for durable, scalable data storage.

  • Lakehouse Data Architecture

    Pipeline tentative id

    Architectural patterns for organizing analytical data across raw, curated, and consumption-ready layers in a lake or lakehouse. This fits Data Lakes when the skill is used to design how data is structured, governed, and accessed for analytics.

  • Cloud Storage and Data Services

    Reuses catalog slug

    Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Cloud Storage and Data Services
cloud-storage-and-data-services
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
React Frontend Development
d_init_01
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Lakehouse · Secondary · New / orchestrated · API 3: new canonical path (new) · New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

Maturity emerging confidence 0.86

Lakehouse is increasingly listed in data-platform JDs and vendor docs (Databricks, Snowflake, Microsoft Fabric), but it is not yet as universal as core warehouse or lake skills.

Vendor & license

(0.80)

Context keywords
Delta Lake · Apache Spark · data warehouse · data lake · ETL · streaming analytics · data governance · cloud storage · SQL · data modeling · real-time processing · data integration · analytics · data pipeline · metadata management
Ambiguity low

“Lakehouse” is a specific data platform architecture term and is unlikely to be confused with other catalog skills.

Versioning

Not versioned

Type assignment

Architecture · data_platform_architecture confidence 0.90

Lakehouse is fundamentally a system-shape pattern that combines data lake and warehouse characteristics, so by the Architecture vs Concept rule it fits Architecture rather than a tool or datastore.

Derived legacy fields
Category
Architecture
Sub-category
data_platform_architecture
Skill nature
PATTERN
Volatility
EMERGING
Typical lifespan
EVERGREEN
Version strategy
NOT_APPLICABLE

Dimensions (API 2 worklist)

  • React Frontend Development Catalog dimension db id 96

    Library dimension (catalog)

Locked dimensions (v3 placement)

  • Lakehouse Architecture

    Pipeline tentative id

    Unified data platform patterns that combine data lake storage with warehouse-style management, governance, and analytics. Lakehouse belongs here because it refers to the architectural approach and platform capabilities used to store, process, and serve analytical data.

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
React Frontend Development
d_init_01
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Event-Driven Architecture · Secondary · New / orchestrated · API 3: new canonical path (new) · New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

Maturity well_known confidence 0.92

Common in cloud-native JDs and vendor docs; AWS, Azure, and Confluent all market event-driven patterns with Kafka/PubSub, showing broad hiring demand.

Vendor & license

(0.90)

Context keywords
microservices · Kafka · RabbitMQ · event sourcing · CQRS · asynchronous messaging · publish-subscribe · stream processing · event bus · serverless · event-driven programming · message broker · real-time data · data pipeline · event schema
Ambiguity low

Event-Driven Architecture is a specific architecture pattern; typical JDs won’t confuse it with other distinct architecture skills.

Versioning

Not versioned

Type assignment

Architecture · event_driven_architecture confidence 0.99

By the Architecture vs Concept rule, Event-Driven Architecture is a system-shape pattern that influences how systems are built, not just a knowledge unit.

Derived legacy fields
Category
Architecture
Sub-category
event_driven_architecture
Skill nature
PATTERN
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
NOT_APPLICABLE

Dimensions (API 2 worklist)

  • React Frontend Development Catalog dimension db id 96

    Library dimension (catalog)

Locked dimensions (v3 placement)

  • Event-Driven Architecture

    Pipeline tentative id

    Architectural patterns for building systems around events, asynchronous messaging, and decoupled producers and consumers. This fits the target skill because it covers how services publish, route, process, and react to domain and integration events.

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
React Frontend Development
d_init_01
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

All API 3 persistence rows

Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.

Skill Tag Dimension Skill↔dim Role↔dim Outcome Notes
Kafka in_db
Messaging and Event Streaming
messaging-and-event-streaming
Existing dimension (library) · Role↔dimension saved
Java in_db
Kotlin and Java
kotlin-and-java
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Java in_db
Programming Languages
programming-languages
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Java in_db
Programming Languages for Data Work
programming-languages-for-data-work
Existing dimension (library) · Role↔dimension saved
Scala in_db
Programming Languages for Data Work
programming-languages-for-data-work
Existing dimension (library) · Role↔dimension saved
Scala in_db
Programming Languages for ML Systems
programming-languages-for-ml-systems
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
AWS in_db
Cloud Platforms
cloud-platforms
Existing dimension (library) · Role↔dimension saved
AWS in_db
Cloud Platforms for AI Deployment
cloud-platforms-for-ai-deployment
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
AWS in_db
Cloud Provider Platforms
cloud-provider-platforms
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
AWS in_db
Cloud Security Posture Tools
cloud-security-posture-tools
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
GCP in_db
Cloud Platforms
cloud-platforms
Existing dimension (library) · Role↔dimension saved
GCP in_db
Cloud Platforms for AI Deployment
cloud-platforms-for-ai-deployment
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
GCP in_db
Cloud Security Posture Tools
cloud-security-posture-tools
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Azure in_db
Cloud Platforms
cloud-platforms
Existing dimension (library) · Role↔dimension saved
Azure in_db
Cloud Platforms for AI Deployment
cloud-platforms-for-ai-deployment
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Azure in_db
Cloud Provider Platforms
cloud-provider-platforms
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Azure in_db
Cloud Security Posture Tools
cloud-security-posture-tools
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
NoSQL in_db
NoSQL Databases
nosql-databases
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Apache Spark in_db
ETL and ELT Tooling
etl-and-elt-tooling
New skill saved · Existing dimension (library) · Role↔dimension saved
Hadoop in_db
ETL and ELT Tooling
etl-and-elt-tooling
New skill saved · Existing dimension (library) · Role↔dimension saved
HBase in_db
Cloud Storage and Data Services
cloud-storage-and-data-services
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Aerospike in_db
React Frontend Development
d_init_01
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Cassandra in_db
Cloud Storage and Data Services
cloud-storage-and-data-services
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
RDBMS in_db
React Frontend Development
d_init_01
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Machine Learning in_db
React Frontend Development
d_init_01
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Machine Learning in_db
AI Governance and Model Security
ai-governance-and-model-security
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Artificial Intelligence in_db
React Frontend Development
d_init_01
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Data Lakes in_db
Cloud Storage and Data Services
cloud-storage-and-data-services
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Data Lakes in_db
React Frontend Development
d_init_01
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Lakehouse in_db
React Frontend Development
d_init_01
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Event-Driven Architecture in_db
React Frontend Development
d_init_01
New skill saved · Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

Library artifacts (this run)

Kind Detail DB id
canonical_skill_added Apache Spark 1350
canonical_skill_added Hadoop 1351
canonical_skill_added HBase 1352
canonical_skill_added Aerospike 1353
canonical_skill_added Cassandra 1354
canonical_skill_added RDBMS 1355
canonical_skill_added Machine Learning 1356
canonical_skill_added Artificial Intelligence 1357
canonical_skill_added Data Lakes 1358
canonical_skill_added Lakehouse 1359
canonical_skill_added Event-Driven Architecture 1360
dimension_skill_link Apache Spark ↔ ETL and ELT Tooling 24
dimension_skill_link Hadoop ↔ ETL and ELT Tooling 24
dimension_skill_link HBase ↔ Cloud Storage and Data Services 144
dimension_skill_link Aerospike ↔ React Frontend Development 96
dimension_skill_link Cassandra ↔ Cloud Storage and Data Services 144
dimension_skill_link RDBMS ↔ React Frontend Development 96
dimension_skill_link Machine Learning ↔ React Frontend Development 96
dimension_skill_link Machine Learning ↔ AI Governance and Model Security 50
dimension_skill_link Artificial Intelligence ↔ React Frontend Development 96
dimension_skill_link Data Lakes ↔ Cloud Storage and Data Services 144
dimension_skill_link Data Lakes ↔ React Frontend Development 96
dimension_skill_link Lakehouse ↔ React Frontend Development 96
dimension_skill_link Event-Driven Architecture ↔ React Frontend Development 96
nano JD Parser — gpt-4.1-nano
Experience 10+ years of experience in designing and developing large-scale data-driven systems
Domain Other
JD type pass
Raw JSON
{
  "JD_type": "pass",
  "about_company": null,
  "certifications": [],
  "company_name": null,
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [],
      "domain": "Other"
    },
    "secondary": null
  },
  "education": [],
  "experience": {
    "max": null,
    "min": 10,
    "raw": "10+ years of experience in designing and developing large-scale data-driven systems"
  },
  "job_locations": [],
  "role": null,
  "role_archetype": "Data",
  "roles_and_responsibilities": [
    {
      "bullet_count": 8,
      "heading": "Responsibilities",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 Design and build scalable,",
        "last_5_words": "robustness, scalability, and developer productivity"
      },
      "text": "\u2022 Design and build scalable, distributed data systems for real-time and batch processing\n\u2022 Develop high-throughput data pipelines processing 10B+ events per day\n\u2022 Contribute to and own key components in the technical design and implementation of core AdCloud platform components\n\u2022 Work with technologies such as Apache Spark, Kafka, Hadoop ecosystem, and modern data platforms\n\u2022 Ensure performance, scalability, reliability, and cost efficiency of data systems\n\u2022 Collaborate with Product, Engineering, and Data Science teams to deliver end-to-end solutions\n\u2022 Participate in design and code reviews to maintain high engineering standards\n\u2022 Continuously improve system robustness, scalability, and developer productivity",
      "word_count": 108
    },
    {
      "bullet_count": 8,
      "heading": "Qualifications",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 10+ years of experience in",
        "last_5_words": "communication and collaboration skills"
      },
      "text": "\u2022 10+ years of experience in designing and developing large-scale data-driven systems\n\u2022 Strong experience with distributed data processing frameworks (Spark, Kafka, Hadoop, etc.)\n\u2022 Experience with NoSQL systems (HBase, Aerospike, Cassandra) and RDBMS\n\u2022 Strong programming skills in Java, Scala, or similar languages\n\u2022 Solid understanding of data structures, algorithms, and system design\n\u2022 Experience building scalable systems on cloud platforms (AWS/GCP/Azure)\n\u2022 Strong focus on performance optimization and cost efficiency\n\u2022 Excellent communication and collaboration skills",
      "word_count": 108
    },
    {
      "bullet_count": 5,
      "heading": "Nice to Have",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 Experience in AdTech or",
        "last_5_words": "or technical communities"
      },
      "text": "\u2022 Experience in AdTech or high-scale event-driven systems\n\u2022 Exposure to data governance, data quality, and metadata systems\n\u2022 Experience supporting ML/AI data pipelines\n\u2022 Familiarity with modern data architectures (data lakes, lakehouse, etc.)\n\u2022 Contributions to open-source or technical communities",
      "word_count": 56
    }
  ],
  "urls": []
}
API 1 — extract-from-jd
{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Apache Spark"
    },
    {
      "is_primary": true,
      "skill_name": "Kafka"
    },
    {
      "is_primary": true,
      "skill_name": "Hadoop"
    },
    {
      "is_primary": true,
      "skill_name": "HBase"
    },
    {
      "is_primary": true,
      "skill_name": "Aerospike"
    },
    {
      "is_primary": true,
      "skill_name": "Cassandra"
    },
    {
      "is_primary": true,
      "skill_name": "Java"
    },
    {
      "is_primary": true,
      "skill_name": "Scala"
    },
    {
      "is_primary": true,
      "skill_name": "AWS"
    },
    {
      "is_primary": true,
      "skill_name": "GCP"
    },
    {
      "is_primary": true,
      "skill_name": "Azure"
    },
    {
      "is_primary": true,
      "skill_name": "RDBMS"
    },
    {
      "is_primary": true,
      "skill_name": "NoSQL"
    },
    {
      "is_primary": false,
      "skill_name": "Machine Learning"
    },
    {
      "is_primary": false,
      "skill_name": "Artificial Intelligence"
    },
    {
      "is_primary": false,
      "skill_name": "Data Lakes"
    },
    {
      "is_primary": false,
      "skill_name": "Lakehouse"
    },
    {
      "is_primary": false,
      "skill_name": "Event-Driven Architecture"
    }
  ],
  "jd_role": null,
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": null,
    "certifications": [],
    "company_name": null,
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [],
        "domain": "Other"
      },
      "secondary": null
    },
    "education": [],
    "experience": {
      "max": null,
      "min": 10,
      "raw": "10+ years of experience in designing and developing large-scale data-driven systems"
    },
    "job_locations": [],
    "role": null,
    "role_archetype": "Data",
    "roles_and_responsibilities": [
      {
        "bullet_count": 8,
        "heading": "Responsibilities",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 Design and build scalable,",
          "last_5_words": "robustness, scalability, and developer productivity"
        },
        "text": "\u2022 Design and build scalable, distributed data systems for real-time and batch processing\n\u2022 Develop high-throughput data pipelines processing 10B+ events per day\n\u2022 Contribute to and own key components in the technical design and implementation of core AdCloud platform components\n\u2022 Work with technologies such as Apache Spark, Kafka, Hadoop ecosystem, and modern data platforms\n\u2022 Ensure performance, scalability, reliability, and cost efficiency of data systems\n\u2022 Collaborate with Product, Engineering, and Data Science teams to deliver end-to-end solutions\n\u2022 Participate in design and code reviews to maintain high engineering standards\n\u2022 Continuously improve system robustness, scalability, and developer productivity",
        "word_count": 108
      },
      {
        "bullet_count": 8,
        "heading": "Qualifications",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 10+ years of experience in",
          "last_5_words": "communication and collaboration skills"
        },
        "text": "\u2022 10+ years of experience in designing and developing large-scale data-driven systems\n\u2022 Strong experience with distributed data processing frameworks (Spark, Kafka, Hadoop, etc.)\n\u2022 Experience with NoSQL systems (HBase, Aerospike, Cassandra) and RDBMS\n\u2022 Strong programming skills in Java, Scala, or similar languages\n\u2022 Solid understanding of data structures, algorithms, and system design\n\u2022 Experience building scalable systems on cloud platforms (AWS/GCP/Azure)\n\u2022 Strong focus on performance optimization and cost efficiency\n\u2022 Excellent communication and collaboration skills",
        "word_count": 108
      },
      {
        "bullet_count": 5,
        "heading": "Nice to Have",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 Experience in AdTech or",
          "last_5_words": "or technical communities"
        },
        "text": "\u2022 Experience in AdTech or high-scale event-driven systems\n\u2022 Exposure to data governance, data quality, and metadata systems\n\u2022 Experience supporting ML/AI data pipelines\n\u2022 Familiarity with modern data architectures (data lakes, lakehouse, etc.)\n\u2022 Contributions to open-source or technical communities",
        "word_count": 56
      }
    ],
    "urls": []
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "1f106d71-338e-40ee-a69a-09957abcd98f",
  "stage3_signals": {
    "alias_match_roles": [],
    "kra_match_roles": [
      {
        "display_name": "Cloud Architect",
        "matched_count": null,
        "role_id": 9,
        "score": 0.429,
        "slug": "cloud-architect",
        "total_count": null
      },
      {
        "display_name": "Data Engineer",
        "matched_count": null,
        "role_id": 2,
        "score": 0.4131,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "AR/VR Engineer",
        "matched_count": null,
        "role_id": 8,
        "score": 0.4098,
        "slug": "ar-vr-engineer",
        "total_count": null
      },
      {
        "display_name": "Android Engineer",
        "matched_count": null,
        "role_id": 4,
        "score": 0.3901,
        "slug": "android-engineer",
        "total_count": null
      },
      {
        "display_name": "Backend Engineer",
        "matched_count": null,
        "role_id": 1,
        "score": 0.3778,
        "slug": "backend-engineer",
        "total_count": null
      }
    ],
    "skill_match_roles": [
      {
        "display_name": "Backend Engineer",
        "matched_count": 6,
        "role_id": 1,
        "score": 0.3333,
        "slug": "backend-engineer",
        "total_count": 18
      },
      {
        "display_name": "Data Engineer",
        "matched_count": 6,
        "role_id": 2,
        "score": 0.3333,
        "slug": "data-engineer",
        "total_count": 18
      },
      {
        "display_name": "ML Engineer",
        "matched_count": 4,
        "role_id": 3,
        "score": 0.2222,
        "slug": "ml-engineer",
        "total_count": 18
      },
      {
        "display_name": "AI Engineer",
        "matched_count": 3,
        "role_id": 13,
        "score": 0.1667,
        "slug": "ai-engineer",
        "total_count": 18
      },
      {
        "display_name": "Cybersecurity Engineer",
        "matched_count": 3,
        "role_id": 5,
        "score": 0.1667,
        "slug": "cybersecurity-engineer",
        "total_count": 18
      }
    ],
    "stage35_ran": false
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "F",
    "chosen_role": {
      "display_name": "Data Engineer",
      "matched_count": null,
      "role_id": 2,
      "score": 0.4131,
      "slug": "data-engineer",
      "total_count": null
    },
    "confidence": 0.95,
    "llm2_fired": true,
    "llm2_reasoning": "The JD focuses heavily on building large-scale, distributed data pipelines with Spark, Kafka, and Hadoop, which aligns directly with typical Data Engineer responsibilities.",
    "queued": false,
    "reasoning": "LLM2 picked data-engineer (confidence 0.95)"
  },
  "stage5_updates": {
    "centroid_n_after": 16,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": null,
    "new_skills_attached": [
      {
        "is_primary": true,
        "queue_id": 1088,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Apache Spark",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 1089,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Hadoop",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 1090,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "HBase",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 1091,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Aerospike",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 1092,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Cassandra",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 1093,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "RDBMS",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 1094,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Machine Learning",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 1095,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Artificial Intelligence",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 1096,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Lakes",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 1097,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Lakehouse",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 1098,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Event-Driven Architecture",
        "status": "pending"
      }
    ],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}
API 2 — extract-details
{
  "alias_matches": [
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 173,
      "existing_alias_text": "Kafka",
      "input_term": "Kafka",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "Kafka",
        "id": 36,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "kafka",
        "sub_category_id": 47,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1,
      "existing_alias_text": "Java",
      "input_term": "Java",
      "matched_canonical": {
        "category_id": 6,
        "display_name": "Java",
        "id": 1,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "java",
        "sub_category_id": 96,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 272,
      "existing_alias_text": "Scala",
      "input_term": "Scala",
      "matched_canonical": {
        "category_id": 6,
        "display_name": "Scala",
        "id": 102,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "scala",
        "sub_category_id": 96,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 406,
      "existing_alias_text": "AWS",
      "input_term": "AWS",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "AWS",
        "id": 187,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "aws",
        "sub_category_id": 46,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 405,
      "existing_alias_text": "GCP",
      "input_term": "GCP",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "GCP",
        "id": 186,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "gcp",
        "sub_category_id": 46,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 407,
      "existing_alias_text": "Azure",
      "input_term": "Azure",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "Azure",
        "id": 188,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "azure",
        "sub_category_id": 46,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1989,
      "existing_alias_text": "NoSQL",
      "input_term": "NoSQL",
      "matched_canonical": {
        "category_id": 2,
        "display_name": "NoSQL",
        "id": 1346,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "nosql",
        "sub_category_id": 1019,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    }
  ],
  "candidate_roles": [
    {
      "display_name": "Backend Engineer",
      "id": 1,
      "rationale": null,
      "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
      "slug": "backend-engineer",
      "source": "db"
    },
    {
      "display_name": "Data Engineer",
      "id": 2,
      "rationale": null,
      "role_archetype": null,
      "slug": "data-engineer",
      "source": "db"
    },
    {
      "display_name": "Android Engineer",
      "id": 4,
      "rationale": null,
      "role_archetype": null,
      "slug": "android-engineer",
      "source": "db"
    },
    {
      "display_name": "ML Engineer",
      "id": 3,
      "rationale": null,
      "role_archetype": null,
      "slug": "ml-engineer",
      "source": "db"
    },
    {
      "display_name": "Cybersecurity Engineer",
      "id": 5,
      "rationale": null,
      "role_archetype": null,
      "slug": "cybersecurity-engineer",
      "source": "db"
    },
    {
      "display_name": "DevOps Engineer",
      "id": 10,
      "rationale": null,
      "role_archetype": null,
      "slug": "devops-engineer",
      "source": "db"
    },
    {
      "display_name": "AI Engineer",
      "id": 13,
      "rationale": null,
      "role_archetype": null,
      "slug": "ai-engineer",
      "source": "db"
    },
    {
      "display_name": "Cloud Architect",
      "id": 9,
      "rationale": null,
      "role_archetype": null,
      "slug": "cloud-architect",
      "source": "db"
    }
  ],
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "The primary skills indicate a strong focus on data processing technologies and cloud platforms, aligning well with a Data Engineer role.",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "dimensions": [
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Messaging and Event Streaming",
        "id": 8,
        "rationale": "Transport-layer systems used to move events and decouple producers from consumers. Data engineers use these systems to ingest, buffer, and distribute event data before downstream processing.",
        "slug": "messaging-and-event-streaming",
        "source": "db"
      },
      "input_skill": "Kafka",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Engineer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Kotlin and Java",
        "id": 161,
        "rationale": "Primary implementation languages for Android app features, platform integration, and client-side business logic. Android engineers use these languages to build screens, state flows, service adapters, and device-aware behavior.",
        "slug": "kotlin-and-java",
        "source": "db"
      },
      "input_skill": "Java",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Android Engineer",
          "id": 4,
          "rationale": null,
          "role_archetype": null,
          "slug": "android-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages",
        "id": 1,
        "rationale": "Core server-side languages used to implement backend business logic, integrations, and service internals. This is the primary coding surface for the role across application layers.",
        "slug": "programming-languages",
        "source": "db"
      },
      "input_skill": "Java",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Engineer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for Data Work",
        "id": 21,
        "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
        "slug": "programming-languages-for-data-work",
        "source": "db"
      },
      "input_skill": "Java",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for Data Work",
        "id": 21,
        "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
        "slug": "programming-languages-for-data-work",
        "source": "db"
      },
      "input_skill": "Scala",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for ML Systems",
        "id": 39,
        "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
        "slug": "programming-languages-for-ml-systems",
        "source": "db"
      },
      "input_skill": "Scala",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Platforms",
        "id": 20,
        "rationale": "Proficiency in major cloud service provider platforms and their core services.",
        "slug": "cloud-platforms",
        "source": "db"
      },
      "input_skill": "AWS",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Engineer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Cybersecurity Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        },
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        },
        {
          "display_name": "DevOps Engineer",
          "id": 10,
          "rationale": null,
          "role_archetype": null,
          "slug": "devops-engineer",
          "source": "db"
        },
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Platforms for AI Deployment",
        "id": 211,
        "rationale": "Major cloud services that provide infrastructure and managed services for AI workloads.",
        "slug": "cloud-platforms-for-ai-deployment",
        "source": "db"
      },
      "input_skill": "AWS",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "AI Engineer",
          "id": 13,
          "rationale": null,
          "role_archetype": null,
          "slug": "ai-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Provider Platforms",
        "id": 131,
        "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
        "slug": "cloud-provider-platforms",
        "source": "db"
      },
      "input_skill": "AWS",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Security Posture Tools",
        "id": 64,
        "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
        "slug": "cloud-security-posture-tools",
        "source": "db"
      },
      "input_skill": "AWS",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cybersecurity Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Platforms",
        "id": 20,
        "rationale": "Proficiency in major cloud service provider platforms and their core services.",
        "slug": "cloud-platforms",
        "source": "db"
      },
      "input_skill": "GCP",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Engineer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Cybersecurity Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        },
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        },
        {
          "display_name": "DevOps Engineer",
          "id": 10,
          "rationale": null,
          "role_archetype": null,
          "slug": "devops-engineer",
          "source": "db"
        },
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Platforms for AI Deployment",
        "id": 211,
        "rationale": "Major cloud services that provide infrastructure and managed services for AI workloads.",
        "slug": "cloud-platforms-for-ai-deployment",
        "source": "db"
      },
      "input_skill": "GCP",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "AI Engineer",
          "id": 13,
          "rationale": null,
          "role_archetype": null,
          "slug": "ai-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Security Posture Tools",
        "id": 64,
        "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
        "slug": "cloud-security-posture-tools",
        "source": "db"
      },
      "input_skill": "GCP",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cybersecurity Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Platforms",
        "id": 20,
        "rationale": "Proficiency in major cloud service provider platforms and their core services.",
        "slug": "cloud-platforms",
        "source": "db"
      },
      "input_skill": "Azure",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Engineer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Cybersecurity Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        },
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        },
        {
          "display_name": "DevOps Engineer",
          "id": 10,
          "rationale": null,
          "role_archetype": null,
          "slug": "devops-engineer",
          "source": "db"
        },
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Platforms for AI Deployment",
        "id": 211,
        "rationale": "Major cloud services that provide infrastructure and managed services for AI workloads.",
        "slug": "cloud-platforms-for-ai-deployment",
        "source": "db"
      },
      "input_skill": "Azure",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "AI Engineer",
          "id": 13,
          "rationale": null,
          "role_archetype": null,
          "slug": "ai-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Provider Platforms",
        "id": 131,
        "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
        "slug": "cloud-provider-platforms",
        "source": "db"
      },
      "input_skill": "Azure",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Security Posture Tools",
        "id": 64,
        "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
        "slug": "cloud-security-posture-tools",
        "source": "db"
      },
      "input_skill": "Azure",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cybersecurity Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "NoSQL Databases",
        "id": 19,
        "rationale": "Models and manages data using non-relational database systems.",
        "slug": "nosql-databases",
        "source": "db"
      },
      "input_skill": "NoSQL",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Engineer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "ETL and ELT Tooling",
        "id": 24,
        "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
        "slug": "etl-and-elt-tooling",
        "source": "db"
      },
      "input_skill": "Apache Spark",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "ETL and ELT Tooling",
        "id": 24,
        "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
        "slug": "etl-and-elt-tooling",
        "source": "db"
      },
      "input_skill": "Hadoop",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Storage and Data Services",
        "id": 144,
        "rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
        "slug": "cloud-storage-and-data-services",
        "source": "db"
      },
      "input_skill": "HBase",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "Aerospike",
      "llm_role": null,
      "roles_from_db": []
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Storage and Data Services",
        "id": 144,
        "rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
        "slug": "cloud-storage-and-data-services",
        "source": "db"
      },
      "input_skill": "Cassandra",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "RDBMS",
      "llm_role": null,
      "roles_from_db": []
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "Machine Learning",
      "llm_role": null,
      "roles_from_db": []
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "AI Governance and Model Security",
        "id": 50,
        "rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
        "slug": "ai-governance-and-model-security",
        "source": "db"
      },
      "input_skill": "Machine Learning",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "AI Engineer",
          "id": 13,
          "rationale": null,
          "role_archetype": null,
          "slug": "ai-engineer",
          "source": "db"
        },
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "AI Governance and Model Security",
        "id": 50,
        "rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
        "slug": "ai-governance-and-model-security",
        "source": "db"
      },
      "input_skill": "Machine Learning",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "AI Engineer",
          "id": 13,
          "rationale": null,
          "role_archetype": null,
          "slug": "ai-engineer",
          "source": "db"
        },
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "Artificial Intelligence",
      "llm_role": null,
      "roles_from_db": []
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Storage and Data Services",
        "id": 144,
        "rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
        "slug": "cloud-storage-and-data-services",
        "source": "db"
      },
      "input_skill": "Data Lakes",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "Data Lakes",
      "llm_role": null,
      "roles_from_db": []
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Storage and Data Services",
        "id": 144,
        "rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
        "slug": "cloud-storage-and-data-services",
        "source": "db"
      },
      "input_skill": "Data Lakes",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "Lakehouse",
      "llm_role": null,
      "roles_from_db": []
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "Event-Driven Architecture",
      "llm_role": null,
      "roles_from_db": []
    }
  ],
  "input_final_skills": [
    "Apache Spark",
    "Kafka",
    "Hadoop",
    "HBase",
    "Aerospike",
    "Cassandra",
    "Java",
    "Scala",
    "AWS",
    "GCP",
    "Azure",
    "RDBMS",
    "NoSQL",
    "Machine Learning",
    "Artificial Intelligence",
    "Data Lakes",
    "Lakehouse",
    "Event-Driven Architecture"
  ],
  "input_llm_skills": [
    "Apache Spark",
    "Kafka",
    "Hadoop",
    "HBase",
    "Aerospike",
    "Cassandra",
    "Java",
    "Scala",
    "AWS",
    "GCP",
    "Azure",
    "RDBMS",
    "NoSQL",
    "Machine Learning",
    "Artificial Intelligence",
    "Data Lakes",
    "Lakehouse",
    "Event-Driven Architecture"
  ],
  "new_aliases_persisted": 0,
  "run_id": "1f106d71-338e-40ee-a69a-09957abcd98f",
  "skills_detail": [
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "ETL and ELT Tooling",
            "id": 24,
            "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
            "slug": "etl-and-elt-tooling",
            "source": "db"
          },
          "input_skill": "Apache Spark",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Apache Spark",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Framework",
          "skill_nature": "FRAMEWORK",
          "sub_category": "distributed_data_processing_framework",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "SEPARATE_ENTITY",
          "volatility": "STABLE"
        },
        "enrichment": {
          "ambiguity": {
            "ambiguity_flag": false,
            "confused_with": [],
            "reasoning": "\u201cApache Spark\u201d is a specific, widely recognized distributed data processing framework; typical JDs won\u2019t confuse it with other distinct ETL/streaming tools."
          },
          "context_keywords": {
            "context_keywords": [
              "RDD",
              "DataFrame",
              "Spark SQL",
              "MLlib",
              "Spark Streaming",
              "DAGScheduler",
              "Cluster Manager",
              "Apache Kafka",
              "Hadoop",
              "PySpark",
              "Scala",
              "SparkSession",
              "ETL",
              "Data Lake",
              "Machine Learning"
            ]
          },
          "maturity": {
            "confidence": 0.95,
            "maturity": "well_known",
            "reasoning": "Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it."
          },
          "skill_id": "apache-spark",
          "vendor_license": {
            "confidence": 0.95,
            "license": "apache_2",
            "vendor": "Apache Software Foundation",
            "year_introduced": 2010
          },
          "versioning": {
            "current_version": "3.x",
            "version_aliases": {
              "apache spark 3": "3.x",
              "spark": "3.x",
              "spark 3": "3.x",
              "spark 3.x": "3.x",
              "spark3": "3.x"
            },
            "versioned": true
          }
        },
        "keep_log": [],
        "locked_dimensions": [
          {
            "description": "Frameworks used to process large datasets in batch or streaming pipelines, often as part of ETL/ELT workflows. Apache Spark belongs here because it is a core engine for distributed transformation, aggregation, and data movement.",
            "exemplar_skills": [
              "Apache Spark",
              "Spark SQL",
              "Spark Structured Streaming",
              "Spark DataFrame API",
              "Spark Dataset API",
              "distributed batch processing",
              "distributed joins",
              "windowed aggregations"
            ],
            "in_scope": "Apache Spark, Spark SQL, Spark Structured Streaming, DataFrame and Dataset APIs, batch ETL jobs, distributed joins and aggregations, window functions, partitioning strategies, cluster execution on YARN/Kubernetes/Databricks",
            "name": "Distributed Data Processing Frameworks",
            "out_of_scope": "BI dashboards and semantic layers, storage systems like data lakes and warehouses, low-level programming language syntax, workflow orchestration platforms, model training frameworks",
            "overlap_flags": [
              {
                "reason": "Spark commonly reads from and writes to cloud object stores and managed data services, but the processing engine itself is the primary fit here.",
                "with_dim_id": "cloud-storage-and-data-services",
                "with_dim_name": null,
                "with_role": "Cloud Architect"
              },
              {
                "reason": "Spark jobs are written in languages like Scala, Python, and Java, but language fluency is a separate dimension from the Spark framework itself.",
                "with_dim_id": "programming-languages-and-scripting",
                "with_dim_name": null,
                "with_role": "Cybersecurity Engineer"
              }
            ],
            "tentative_id": "etl-and-elt-tooling"
          }
        ],
        "merge_log": [],
        "placed": {
          "name": "Apache Spark",
          "placement_confidence": 0.92,
          "primary_dimension": "etl-and-elt-tooling",
          "reasoning": "Deterministic JD placement: locked_dimensions has 1 dimension(s) from skill-driven dimension generation after reconciliation; primary_dimension is the first locked dim.",
          "secondary_dimensions": [],
          "skill_id": "apache-spark"
        },
        "relationships": {
          "child_skills": [],
          "parent_skills": [
            "spark"
          ],
          "related_to": [
            "flink",
            "databricks",
            "kubernetes",
            "aws",
            "azure"
          ],
          "requires": [],
          "skill_id": "apache-spark",
          "suppress_on_match": []
        },
        "skill_id": "apache-spark",
        "split_log": [],
        "typed": {
          "alternatives_considered": [],
          "confidence": 0.94,
          "name": "Apache Spark",
          "reasoning": "Apache Spark is a structured codebase that users build data applications and pipelines inside, so by the Tool vs Framework rule it is a Framework rather than a Tool.",
          "skill_id": "apache-spark",
          "subtype": "distributed_data_processing_framework",
          "type": "Framework"
        },
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Kafka",
          "alias_type": "CANONICAL",
          "id": 173,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "Kafka",
        "id": 36,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "kafka",
        "sub_category_id": 47,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Messaging and Event Streaming",
            "id": 8,
            "rationale": "Transport-layer systems used to move events and decouple producers from consumers. Data engineers use these systems to ingest, buffer, and distribute event data before downstream processing.",
            "slug": "messaging-and-event-streaming",
            "source": "db"
          },
          "input_skill": "Kafka",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Engineer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Kafka",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "ETL and ELT Tooling",
            "id": 24,
            "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
            "slug": "etl-and-elt-tooling",
            "source": "db"
          },
          "input_skill": "Hadoop",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Hadoop",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Framework",
          "skill_nature": "FRAMEWORK",
          "sub_category": "data_processing_framework",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "NOT_APPLICABLE",
          "volatility": "STABLE"
        },
        "enrichment": {
          "ambiguity": {
            "ambiguity_flag": false,
            "confused_with": [],
            "reasoning": "\u201cHadoop\u201d is a specific data processing framework; typical JDs distinguish it from other big data tools like Spark or Hive."
          },
          "context_keywords": {
            "context_keywords": [
              "MapReduce",
              "HDFS",
              "YARN",
              "Hive",
              "Pig",
              "Spark",
              "Sqoop",
              "Flume",
              "Oozie",
              "Kafka",
              "NoSQL",
              "Big Data",
              "Data Lake",
              "ETL",
              "ELT",
              "Distributed Computing"
            ]
          },
          "maturity": {
            "confidence": 0.91,
            "maturity": "niche",
            "reasoning": "Job postings still mention Hadoop for legacy big-data stacks, but JD volume has fallen as Spark and cloud warehouses replaced MapReduce-era clusters."
          },
          "skill_id": "hadoop",
          "vendor_license": {
            "confidence": 0.95,
            "license": "apache_2",
            "vendor": "Apache Software Foundation",
            "year_introduced": 2006
          },
          "versioning": {
            "current_version": null,
            "version_aliases": {},
            "versioned": false
          }
        },
        "keep_log": [],
        "locked_dimensions": [
          {
            "description": "Tools and frameworks used to ingest, store, and process large-scale batch data across clusters. Hadoop belongs here because it is a foundational platform for distributed storage and computation in data engineering workflows.",
            "exemplar_skills": [
              "Hadoop",
              "HDFS",
              "MapReduce",
              "YARN",
              "Hive",
              "Spark on Hadoop"
            ],
            "in_scope": "Hadoop, HDFS, MapReduce, YARN, Hive on Hadoop, Spark on Hadoop clusters, cluster-based batch ingestion, distributed file storage, job scheduling on Hadoop",
            "name": "Distributed Data Processing Platforms",
            "out_of_scope": "Cloud object storage and managed warehouses, streaming-only systems like Kafka and Flink, BI dashboards and reporting, application backend APIs and services",
            "overlap_flags": [
              {
                "reason": "Hadoop includes distributed storage concepts via HDFS, which can overlap with broader storage platform skills.",
                "with_dim_id": "cloud-storage-and-data-services",
                "with_dim_name": null,
                "with_role": "Cloud Architect"
              },
              {
                "reason": "MapReduce and cluster execution rely on parallel processing concepts, though the dimension is primarily data-platform oriented.",
                "with_dim_id": "concurrency-and-parallel-processing",
                "with_dim_name": null,
                "with_role": "Backend Engineer"
              }
            ],
            "tentative_id": "etl-and-elt-tooling"
          }
        ],
        "merge_log": [],
        "placed": {
          "name": "Hadoop",
          "placement_confidence": 0.92,
          "primary_dimension": "etl-and-elt-tooling",
          "reasoning": "Deterministic JD placement: locked_dimensions has 1 dimension(s) from skill-driven dimension generation after reconciliation; primary_dimension is the first locked dim.",
          "secondary_dimensions": [],
          "skill_id": "hadoop"
        },
        "relationships": {
          "child_skills": [],
          "parent_skills": [],
          "related_to": [
            "spark",
            "databricks",
            "jvm",
            "nosql",
            "kubernetes",
            "jenkins",
            "git",
            "gradle"
          ],
          "requires": [],
          "skill_id": "hadoop",
          "suppress_on_match": []
        },
        "skill_id": "hadoop",
        "split_log": [],
        "typed": {
          "alternatives_considered": [],
          "confidence": 0.9,
          "name": "Hadoop",
          "reasoning": "Hadoop is fundamentally a structured software stack that users build distributed data applications and jobs within, so it fits the Framework category rather than a Tool or Platform.",
          "skill_id": "hadoop",
          "subtype": "data_processing_framework",
          "type": "Framework"
        },
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Storage and Data Services",
            "id": 144,
            "rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
            "slug": "cloud-storage-and-data-services",
            "source": "db"
          },
          "input_skill": "HBase",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "HBase",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Datastore",
          "skill_nature": "TOOL",
          "sub_category": "wide_column_store",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "NOT_APPLICABLE",
          "volatility": "STABLE"
        },
        "enrichment": {
          "ambiguity": {
            "ambiguity_flag": false,
            "confused_with": [],
            "reasoning": "HBase is a specific Apache wide-column NoSQL datastore; JDs typically distinguish it from other datastores."
          },
          "context_keywords": {
            "context_keywords": [
              "Hadoop",
              "NoSQL",
              "Bigtable",
              "MapReduce",
              "Apache",
              "column family",
              "scalability",
              "distributed",
              "data model",
              "real-time",
              "table design",
              "region server",
              "Thrift",
              "REST API",
              "data replication"
            ]
          },
          "maturity": {
            "confidence": 0.86,
            "maturity": "niche",
            "reasoning": "HBase appears in a limited set of big-data/legacy Hadoop job postings, while newer JDs more often specify DynamoDB, Bigtable, or Cassandra; its market demand is specialized rather than broad."
          },
          "skill_id": "hbase",
          "vendor_license": {
            "confidence": 0.95,
            "license": "apache_2",
            "vendor": "Apache Software Foundation",
            "year_introduced": 2010
          },
          "versioning": {
            "current_version": null,
            "version_aliases": {},
            "versioned": false
          }
        },
        "keep_log": [],
        "locked_dimensions": [
          {
            "description": "Storage systems used to persist large-scale application and analytical data with low-latency access and horizontal scaling. HBase fits here as a distributed NoSQL store built on top of Hadoop storage infrastructure.",
            "exemplar_skills": [
              "HBase",
              "HBase data modeling",
              "row-key design in HBase",
              "column families",
              "region server management",
              "HBase replication"
            ],
            "in_scope": "HBase, distributed NoSQL tables, column-family storage, row-key design, region splitting and balancing, replication, compactions, scans and gets, Hadoop ecosystem storage services",
            "name": "Distributed Data Storage Systems",
            "out_of_scope": "Relational databases and SQL schema design, stream processing engines, object storage buckets, in-memory caches, query orchestration and BI tools",
            "overlap_flags": [
              {
                "reason": "HBase is often deployed on cloud infrastructure, but this dimension is about the storage system itself rather than the hosting platform.",
                "with_dim_id": "cloud-platforms",
                "with_dim_name": null,
                "with_role": "Backend Engineer, Cybersecurity Engineer, Data Engineer, DevOps Engineer, ML Engineer"
              },
              {
                "reason": "HBase is frequently used in ingestion pipelines, but pipeline orchestration belongs to ETL/ELT tooling rather than the database dimension.",
                "with_dim_id": "etl-and-elt-tooling",
                "with_dim_name": null,
                "with_role": "Data Engineer"
              }
            ],
            "tentative_id": "cloud-storage-and-data-services"
          }
        ],
        "merge_log": [],
        "placed": {
          "name": "HBase",
          "placement_confidence": 0.92,
          "primary_dimension": "cloud-storage-and-data-services",
          "reasoning": "Deterministic JD placement: locked_dimensions has 1 dimension(s) from skill-driven dimension generation after reconciliation; primary_dimension is the first locked dim.",
          "secondary_dimensions": [],
          "skill_id": "hbase"
        },
        "relationships": {
          "child_skills": [],
          "parent_skills": [],
          "related_to": [
            "nosql",
            "sqlite",
            "rds",
            "vector-db",
            "databricks",
            "flink",
            "kubernetes",
            "jvm"
          ],
          "requires": [],
          "skill_id": "hbase",
          "suppress_on_match": []
        },
        "skill_id": "hbase",
        "split_log": [],
        "typed": {
          "alternatives_considered": [],
          "confidence": 0.98,
          "name": "HBase",
          "reasoning": "HBase is fundamentally a persistent data system, so by the Datastore vs Format rule it is a datastore rather than a tool or framework.",
          "skill_id": "hbase",
          "subtype": "wide_column_store",
          "type": "Datastore"
        },
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "Aerospike",
          "llm_role": null,
          "roles_from_db": []
        }
      ],
      "input_skill": "Aerospike",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Datastore",
          "skill_nature": "TOOL",
          "sub_category": "distributed_nosql_datastore",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "NOT_APPLICABLE",
          "volatility": "STABLE"
        },
        "enrichment": {
          "ambiguity": {
            "ambiguity_flag": false,
            "confused_with": [],
            "reasoning": "Aerospike is a specific distributed NoSQL database name; unlikely to be confused with other catalog datastore skills."
          },
          "context_keywords": {
            "context_keywords": [
              "NoSQL",
              "distributed database",
              "high availability",
              "scalability",
              "data model",
              "key-value store",
              "latency",
              "replication",
              "cluster management",
              "data partitioning",
              "Aerospike client",
              "TTL",
              "secondary indexes",
              "real-time analytics",
              "data persistence"
            ]
          },
          "maturity": {
            "confidence": 0.86,
            "maturity": "niche",
            "reasoning": "Aerospike appears in a limited set of high-scale datastore JDs and vendor case studies, but it is far less common than PostgreSQL, Redis, or MongoDB in general hiring pipelines."
          },
          "skill_id": "aerospike",
          "vendor_license": {
            "confidence": 0.95,
            "license": "apache_2",
            "vendor": "Aerospike Inc.",
            "year_introduced": 2012
          },
          "versioning": {
            "current_version": null,
            "version_aliases": {},
            "versioned": false
          }
        },
        "keep_log": [],
        "locked_dimensions": [
          {
            "description": "Distributed NoSQL databases used for low-latency key-value access, horizontal scaling, and high availability. Aerospike belongs here because it is a distributed database platform rather than a general storage or cloud service.",
            "exemplar_skills": [
              "Aerospike",
              "NoSQL databases",
              "key-value stores",
              "distributed database clustering",
              "secondary indexing",
              "data replication",
              "partitioning and sharding"
            ],
            "in_scope": "Aerospike, key-value data modeling, secondary indexes, TTL and record expiration, replication and partitioning, cluster management, strong consistency options, low-latency reads and writes",
            "name": "Distributed NoSQL Databases",
            "out_of_scope": "Vector search and embedding stores, relational SQL databases, object storage, cache-only systems like Redis when used purely as cache, application-level data serialization",
            "overlap_flags": [
              {
                "reason": "Some teams deploy Aerospike as a managed data service, but the core skill is database operation and modeling rather than cloud storage selection.",
                "with_dim_id": "cloud-storage-and-data-services",
                "with_dim_name": null,
                "with_role": "Cloud Architect"
              },
              {
                "reason": "Aerospike work often involves latency and throughput tuning, but that dimension is broader and not database-specific.",
                "with_dim_id": "performance-and-scalability-tuning",
                "with_dim_name": null,
                "with_role": "Backend Engineer"
              }
            ],
            "tentative_id": "d_init_01"
          }
        ],
        "merge_log": [],
        "placed": {
          "name": "Aerospike",
          "placement_confidence": 0.92,
          "primary_dimension": "d_init_01",
          "reasoning": "Deterministic JD placement: locked_dimensions has 1 dimension(s) from skill-driven dimension generation after reconciliation; primary_dimension is the first locked dim.",
          "secondary_dimensions": [],
          "skill_id": "aerospike"
        },
        "relationships": {
          "child_skills": [],
          "parent_skills": [],
          "related_to": [
            "nosql",
            "sqlite",
            "rds",
            "spark",
            "kubernetes",
            "databricks",
            "aws",
            "aks"
          ],
          "requires": [],
          "skill_id": "aerospike",
          "suppress_on_match": []
        },
        "skill_id": "aerospike",
        "split_log": [],
        "typed": {
          "alternatives_considered": [],
          "confidence": 0.98,
          "name": "Aerospike",
          "reasoning": "Aerospike is fundamentally a system that persists and serves data, so by the Datastore vs Format rule it is a Datastore rather than a tool or platform.",
          "skill_id": "aerospike",
          "subtype": "distributed_nosql_datastore",
          "type": "Datastore"
        },
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Storage and Data Services",
            "id": 144,
            "rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
            "slug": "cloud-storage-and-data-services",
            "source": "db"
          },
          "input_skill": "Cassandra",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Cassandra",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Datastore",
          "skill_nature": "TOOL",
          "sub_category": "wide_column_store",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "NOT_APPLICABLE",
          "volatility": "STABLE"
        },
        "enrichment": {
          "ambiguity": {
            "ambiguity_flag": false,
            "confused_with": [],
            "reasoning": "\u201cCassandra\u201d is a specific wide-column NoSQL database name; unlikely to be confused with other catalog datastore skills."
          },
          "context_keywords": {
            "context_keywords": [
              "CQL",
              "DataStax",
              "TinkerPop",
              "Spark",
              "ScyllaDB",
              "Replication",
              "Partitioning",
              "Cluster",
              "NoSQL",
              "Wide Column",
              "Consistency",
              "Data Modeling",
              "DSE",
              "Thrift",
              "Eventual Consistency"
            ]
          },
          "maturity": {
            "confidence": 0.84,
            "maturity": "well_known",
            "reasoning": "Apache Cassandra appears in many production data-platform JDs and is a common choice for high-write, distributed workloads; GitHub and vendor docs show sustained activity rather than sunset signals."
          },
          "skill_id": "cassandra",
          "vendor_license": {
            "confidence": 0.95,
            "license": "apache_2",
            "vendor": "Apache Software Foundation",
            "year_introduced": 2008
          },
          "versioning": {
            "current_version": null,
            "version_aliases": {},
            "versioned": false
          }
        },
        "keep_log": [],
        "locked_dimensions": [
          {
            "description": "Managed and self-hosted data stores used to persist application data with high availability and horizontal scale. Cassandra belongs here because it is a distributed wide-column database chosen for partitioning, replication, and fault-tolerant storage.",
            "exemplar_skills": [
              "Cassandra",
              "Apache Cassandra",
              "CQL",
              "partition keys",
              "clustering keys",
              "replication",
              "consistency levels",
              "compaction",
              "tombstones"
            ],
            "in_scope": "Cassandra, Apache Cassandra, wide-column data modeling, partition keys, clustering keys, replication, consistency levels, compaction, tombstones, secondary indexes, CQL, data distribution, sharding, multi-datacenter replication",
            "name": "Distributed Data Storage Systems",
            "out_of_scope": "Relational schema design and SQL tuning, which belong to database/relational data modeling; vector similarity search, which belongs to vector-databases; cache-only systems like Redis, which are a separate in-memory data store cluster",
            "overlap_flags": [
              {
                "reason": "Cassandra is often selected and tuned for latency, throughput, and scale, so operational performance concerns can overlap.",
                "with_dim_id": "performance-and-scalability-tuning",
                "with_dim_name": null,
                "with_role": "Backend Engineer"
              },
              {
                "reason": "Cassandra is frequently deployed on cloud infrastructure or as a managed service, but the core skill is the database system itself.",
                "with_dim_id": "cloud-platforms",
                "with_dim_name": null,
                "with_role": "Backend Engineer, Cybersecurity Engineer, Data Engineer, DevOps Engineer, ML Engineer"
              }
            ],
            "tentative_id": "cloud-storage-and-data-services"
          }
        ],
        "merge_log": [],
        "placed": {
          "name": "Cassandra",
          "placement_confidence": 0.92,
          "primary_dimension": "cloud-storage-and-data-services",
          "reasoning": "Deterministic JD placement: locked_dimensions has 1 dimension from skill-driven dimension generation after reconciliation; primary_dimension is the first locked dimension.",
          "secondary_dimensions": [],
          "skill_id": "cassandra"
        },
        "relationships": {
          "child_skills": [],
          "parent_skills": [
            "nosql",
            "aws"
          ],
          "related_to": [
            "rds",
            "sqlite",
            "relational-databases",
            "firebase-firestore",
            "databricks",
            "chromadb",
            "kubernetes",
            "ibm-cloud"
          ],
          "requires": [],
          "skill_id": "cassandra",
          "suppress_on_match": []
        },
        "skill_id": "cassandra",
        "split_log": [],
        "typed": {
          "alternatives_considered": [],
          "confidence": 0.99,
          "name": "Cassandra",
          "reasoning": "Cassandra is fundamentally a distributed database that persists data, so by the Datastore vs Format rule it is a Datastore.",
          "skill_id": "cassandra",
          "subtype": "wide_column_store",
          "type": "Datastore"
        },
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Java",
          "alias_type": "CANONICAL",
          "id": 1,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "JDK 11",
          "alias_type": "VERSION",
          "id": 4,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "JDK 17",
          "alias_type": "VERSION",
          "id": 5,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "JDK 21",
          "alias_type": "VERSION",
          "id": 6,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "JDK 8",
          "alias_type": "VERSION",
          "id": 3,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 1.0",
          "alias_type": "VERSION",
          "id": 11,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 1.1",
          "alias_type": "VERSION",
          "id": 12,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 1.2",
          "alias_type": "VERSION",
          "id": 13,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 1.3",
          "alias_type": "VERSION",
          "id": 14,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 1.4",
          "alias_type": "VERSION",
          "id": 15,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 1.5",
          "alias_type": "VERSION",
          "id": 16,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 1.6",
          "alias_type": "VERSION",
          "id": 17,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 1.7",
          "alias_type": "VERSION",
          "id": 18,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 1.8",
          "alias_type": "VERSION",
          "id": 19,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 11",
          "alias_type": "VERSION",
          "id": 8,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 17",
          "alias_type": "VERSION",
          "id": 9,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 21",
          "alias_type": "VERSION",
          "id": 10,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 5",
          "alias_type": "VERSION",
          "id": 288,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 6",
          "alias_type": "VERSION",
          "id": 289,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 7",
          "alias_type": "VERSION",
          "id": 290,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Java 8",
          "alias_type": "VERSION",
          "id": 7,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "OpenJDK 11",
          "alias_type": "VERSION",
          "id": 21,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "OpenJDK 17",
          "alias_type": "VERSION",
          "id": 22,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "OpenJDK 21",
          "alias_type": "VERSION",
          "id": 23,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "OpenJDK 8",
          "alias_type": "VERSION",
          "id": 20,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java 11",
          "alias_type": "VERSION",
          "id": 1512,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java 17",
          "alias_type": "VERSION",
          "id": 1513,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java 21",
          "alias_type": "VERSION",
          "id": 1514,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java 4",
          "alias_type": "VERSION",
          "id": 1496,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java 5",
          "alias_type": "VERSION",
          "id": 1497,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java 6",
          "alias_type": "VERSION",
          "id": 1498,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java 7",
          "alias_type": "VERSION",
          "id": 1499,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java 8",
          "alias_type": "VERSION",
          "id": 1500,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java-11",
          "alias_type": "VERSION",
          "id": 1515,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java-17",
          "alias_type": "VERSION",
          "id": 1516,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java-21",
          "alias_type": "VERSION",
          "id": 1517,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java-4",
          "alias_type": "VERSION",
          "id": 1501,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java-5",
          "alias_type": "VERSION",
          "id": 1502,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java-6",
          "alias_type": "VERSION",
          "id": 1503,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java-7",
          "alias_type": "VERSION",
          "id": 1504,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java-8",
          "alias_type": "VERSION",
          "id": 1505,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java11",
          "alias_type": "VERSION",
          "id": 1506,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java17",
          "alias_type": "VERSION",
          "id": 1507,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java21",
          "alias_type": "VERSION",
          "id": 1508,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java4",
          "alias_type": "VERSION",
          "id": 1482,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java5",
          "alias_type": "VERSION",
          "id": 1483,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java6",
          "alias_type": "VERSION",
          "id": 1484,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java7",
          "alias_type": "VERSION",
          "id": 1485,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "java8",
          "alias_type": "VERSION",
          "id": 1486,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk 11",
          "alias_type": "VERSION",
          "id": 1509,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk 17",
          "alias_type": "VERSION",
          "id": 1510,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk 21",
          "alias_type": "VERSION",
          "id": 1511,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk 4",
          "alias_type": "VERSION",
          "id": 1487,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk 5",
          "alias_type": "VERSION",
          "id": 1488,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk 6",
          "alias_type": "VERSION",
          "id": 1489,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk 7",
          "alias_type": "VERSION",
          "id": 1490,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk 8",
          "alias_type": "VERSION",
          "id": 1491,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk11",
          "alias_type": "VERSION",
          "id": 1492,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk17",
          "alias_type": "VERSION",
          "id": 1493,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk21",
          "alias_type": "VERSION",
          "id": 1494,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk4",
          "alias_type": "VERSION",
          "id": 1477,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk5",
          "alias_type": "VERSION",
          "id": 1478,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk6",
          "alias_type": "VERSION",
          "id": 1479,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk7",
          "alias_type": "VERSION",
          "id": 1480,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jdk8",
          "alias_type": "VERSION",
          "id": 1481,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "jvm21",
          "alias_type": "VERSION",
          "id": 1495,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 6,
        "display_name": "Java",
        "id": 1,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "java",
        "sub_category_id": 96,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Kotlin and Java",
            "id": 161,
            "rationale": "Primary implementation languages for Android app features, platform integration, and client-side business logic. Android engineers use these languages to build screens, state flows, service adapters, and device-aware behavior.",
            "slug": "kotlin-and-java",
            "source": "db"
          },
          "input_skill": "Java",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Android Engineer",
              "id": 4,
              "rationale": null,
              "role_archetype": null,
              "slug": "android-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages",
            "id": 1,
            "rationale": "Core server-side languages used to implement backend business logic, integrations, and service internals. This is the primary coding surface for the role across application layers.",
            "slug": "programming-languages",
            "source": "db"
          },
          "input_skill": "Java",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Engineer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for Data Work",
            "id": 21,
            "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
            "slug": "programming-languages-for-data-work",
            "source": "db"
          },
          "input_skill": "Java",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Java",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Scala",
          "alias_type": "CANONICAL",
          "id": 272,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 6,
        "display_name": "Scala",
        "id": 102,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "scala",
        "sub_category_id": 96,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for Data Work",
            "id": 21,
            "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
            "slug": "programming-languages-for-data-work",
            "source": "db"
          },
          "input_skill": "Scala",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for ML Systems",
            "id": 39,
            "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
            "slug": "programming-languages-for-ml-systems",
            "source": "db"
          },
          "input_skill": "Scala",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Scala",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "AWS",
          "alias_type": "CANONICAL",
          "id": 406,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "AWS",
        "id": 187,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "aws",
        "sub_category_id": 46,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Platforms",
            "id": 20,
            "rationale": "Proficiency in major cloud service provider platforms and their core services.",
            "slug": "cloud-platforms",
            "source": "db"
          },
          "input_skill": "AWS",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Engineer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Cybersecurity Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            },
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            },
            {
              "display_name": "DevOps Engineer",
              "id": 10,
              "rationale": null,
              "role_archetype": null,
              "slug": "devops-engineer",
              "source": "db"
            },
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Platforms for AI Deployment",
            "id": 211,
            "rationale": "Major cloud services that provide infrastructure and managed services for AI workloads.",
            "slug": "cloud-platforms-for-ai-deployment",
            "source": "db"
          },
          "input_skill": "AWS",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "AI Engineer",
              "id": 13,
              "rationale": null,
              "role_archetype": null,
              "slug": "ai-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Provider Platforms",
            "id": 131,
            "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
            "slug": "cloud-provider-platforms",
            "source": "db"
          },
          "input_skill": "AWS",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Security Posture Tools",
            "id": 64,
            "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
            "slug": "cloud-security-posture-tools",
            "source": "db"
          },
          "input_skill": "AWS",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cybersecurity Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "AWS",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "GCP",
          "alias_type": "CANONICAL",
          "id": 405,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "GCP",
        "id": 186,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "gcp",
        "sub_category_id": 46,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Platforms",
            "id": 20,
            "rationale": "Proficiency in major cloud service provider platforms and their core services.",
            "slug": "cloud-platforms",
            "source": "db"
          },
          "input_skill": "GCP",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Engineer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Cybersecurity Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            },
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            },
            {
              "display_name": "DevOps Engineer",
              "id": 10,
              "rationale": null,
              "role_archetype": null,
              "slug": "devops-engineer",
              "source": "db"
            },
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Platforms for AI Deployment",
            "id": 211,
            "rationale": "Major cloud services that provide infrastructure and managed services for AI workloads.",
            "slug": "cloud-platforms-for-ai-deployment",
            "source": "db"
          },
          "input_skill": "GCP",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "AI Engineer",
              "id": 13,
              "rationale": null,
              "role_archetype": null,
              "slug": "ai-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Security Posture Tools",
            "id": 64,
            "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
            "slug": "cloud-security-posture-tools",
            "source": "db"
          },
          "input_skill": "GCP",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cybersecurity Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "GCP",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Azure",
          "alias_type": "CANONICAL",
          "id": 407,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "Azure",
        "id": 188,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "azure",
        "sub_category_id": 46,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Platforms",
            "id": 20,
            "rationale": "Proficiency in major cloud service provider platforms and their core services.",
            "slug": "cloud-platforms",
            "source": "db"
          },
          "input_skill": "Azure",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Engineer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Cybersecurity Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            },
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            },
            {
              "display_name": "DevOps Engineer",
              "id": 10,
              "rationale": null,
              "role_archetype": null,
              "slug": "devops-engineer",
              "source": "db"
            },
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Platforms for AI Deployment",
            "id": 211,
            "rationale": "Major cloud services that provide infrastructure and managed services for AI workloads.",
            "slug": "cloud-platforms-for-ai-deployment",
            "source": "db"
          },
          "input_skill": "Azure",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "AI Engineer",
              "id": 13,
              "rationale": null,
              "role_archetype": null,
              "slug": "ai-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Provider Platforms",
            "id": 131,
            "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
            "slug": "cloud-provider-platforms",
            "source": "db"
          },
          "input_skill": "Azure",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Security Posture Tools",
            "id": 64,
            "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
            "slug": "cloud-security-posture-tools",
            "source": "db"
          },
          "input_skill": "Azure",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cybersecurity Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Azure",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "RDBMS",
          "llm_role": null,
          "roles_from_db": []
        }
      ],
      "input_skill": "RDBMS",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Datastore",
          "skill_nature": "TOOL",
          "sub_category": "relational_database_management_system",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "NOT_APPLICABLE",
          "volatility": "STABLE"
        },
        "enrichment": {
          "ambiguity": {
            "ambiguity_flag": false,
            "confused_with": [],
            "reasoning": "RDBMS is a standard, specific datastore category (relational DBMS) with little overlap with other distinct skills in typical JDs."
          },
          "context_keywords": {
            "context_keywords": [
              "SQL",
              "ACID",
              "normalization",
              "indexes",
              "transactions",
              "joins",
              "stored procedures",
              "views",
              "foreign keys",
              "data integrity",
              "schema design",
              "ER diagrams",
              "database tuning",
              "backup and recovery",
              "query optimization",
              "data modeling"
            ]
          },
          "maturity": {
            "confidence": 0.98,
            "maturity": "well_known",
            "reasoning": "RDBMS is a core requirement in many job descriptions across backend, data, and DBA roles; PostgreSQL, MySQL, and SQL Server remain standard enterprise stacks."
          },
          "skill_id": "rdbms",
          "vendor_license": {
            "confidence": 0.9,
            "license": null,
            "vendor": null,
            "year_introduced": null
          },
          "versioning": {
            "current_version": null,
            "version_aliases": {},
            "versioned": false
          }
        },
        "keep_log": [],
        "locked_dimensions": [
          {
            "description": "Relational database management systems used to store, query, and maintain structured data with tables, keys, constraints, and SQL. RDBMS fits here because it names the core database engine category rather than a specific vendor or data workflow tool.",
            "exemplar_skills": [
              "RDBMS",
              "SQL",
              "PostgreSQL",
              "MySQL",
              "Oracle Database",
              "Microsoft SQL Server",
              "schema design",
              "indexing",
              "transactions"
            ],
            "in_scope": "RDBMS, relational database management systems, SQL query execution, tables and schemas, primary and foreign keys, indexes, transactions, ACID properties, normalization, joins, stored procedures",
            "name": "Relational Database Systems",
            "out_of_scope": "NoSQL document or key-value stores, data warehouse modeling, ETL orchestration, vector databases, cloud storage services, application ORM usage",
            "overlap_flags": [
              {
                "reason": "Managed database services may be discussed alongside storage platforms, but this skill is specifically about relational database engines and SQL data modeling.",
                "with_dim_id": "cloud-storage-and-data-services",
                "with_dim_name": null,
                "with_role": "Cloud Architect"
              },
              {
                "reason": "Database tuning overlaps when discussing query plans and indexing, but that dimension is broader and centered on system performance rather than database technology itself.",
                "with_dim_id": "performance-and-scalability-tuning",
                "with_dim_name": null,
                "with_role": "Backend Engineer"
              }
            ],
            "tentative_id": "d_init_01"
          }
        ],
        "merge_log": [],
        "placed": {
          "name": "RDBMS",
          "placement_confidence": 0.92,
          "primary_dimension": "d_init_01",
          "reasoning": "Deterministic JD placement: locked_dimensions has 1 dimension(s) from skill-driven dimension generation after reconciliation; primary_dimension is the first locked dim.",
          "secondary_dimensions": [],
          "skill_id": "rdbms"
        },
        "relationships": {
          "child_skills": [],
          "parent_skills": [],
          "related_to": [
            "relational-databases",
            "rds",
            "sqlite",
            "nosql",
            "vector-db",
            "rag",
            "rollback-procedures",
            "data-structures"
          ],
          "requires": [],
          "skill_id": "rdbms",
          "suppress_on_match": []
        },
        "skill_id": "rdbms",
        "split_log": [],
        "typed": {
          "alternatives_considered": [],
          "confidence": 0.98,
          "name": "RDBMS",
          "reasoning": "RDBMS is fundamentally a system that persists and manages data, so under the Datastore vs Format rule it is a Datastore rather than a tool or concept.",
          "skill_id": "rdbms",
          "subtype": "relational_database_management_system",
          "type": "Datastore"
        },
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "NoSQL",
          "alias_type": "CANONICAL",
          "id": 1989,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 2,
        "display_name": "NoSQL",
        "id": 1346,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "nosql",
        "sub_category_id": 1019,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "NoSQL Databases",
            "id": 19,
            "rationale": "Models and manages data using non-relational database systems.",
            "slug": "nosql-databases",
            "source": "db"
          },
          "input_skill": "NoSQL",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Engineer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "NoSQL",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "Machine Learning",
          "llm_role": null,
          "roles_from_db": []
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "AI Governance and Model Security",
            "id": 50,
            "rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
            "slug": "ai-governance-and-model-security",
            "source": "db"
          },
          "input_skill": "Machine Learning",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "AI Engineer",
              "id": 13,
              "rationale": null,
              "role_archetype": null,
              "slug": "ai-engineer",
              "source": "db"
            },
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "AI Governance and Model Security",
            "id": 50,
            "rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
            "slug": "ai-governance-and-model-security",
            "source": "db"
          },
          "input_skill": "Machine Learning",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "AI Engineer",
              "id": 13,
              "rationale": null,
              "role_archetype": null,
              "slug": "ai-engineer",
              "source": "db"
            },
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Machine Learning",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Concept",
          "skill_nature": "CONCEPT",
          "sub_category": "machine_learning",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "NOT_APPLICABLE",
          "volatility": "STABLE"
        },
        "enrichment": {
          "ambiguity": {
            "ambiguity_flag": false,
            "confused_with": [],
            "reasoning": "“Machine Learning” is a standard, specific concept and is unlikely to be confused with other distinct catalog skills in typical job descriptions."
          },
          "context_keywords": {
            "context_keywords": [
              "TensorFlow",
              "scikit-learn",
              "Keras",
              "PyTorch",
              "neural networks",
              "supervised learning",
              "unsupervised learning",
              "reinforcement learning",
              "feature engineering",
              "model evaluation",
              "hyperparameter tuning",
              "data preprocessing",
              "cross-validation",
              "ensemble methods",
              "natural language processing"
            ]
          },
          "maturity": {
            "confidence": 0.97,
            "maturity": "well_known",
            "reasoning": "Machine Learning appears in large volumes of job descriptions across data, product, and platform roles, and major cloud vendors (AWS, Google Cloud, Azure) offer dedicated ML services and certifications, indicating broad adoption."
          },
          "skill_id": "machine-learning",
          "vendor_license": {
            "confidence": 0.95,
            "license": null,
            "vendor": null,
            "year_introduced": null
          },
          "versioning": {
            "current_version": null,
            "version_aliases": {},
            "versioned": false
          }
        },
        "keep_log": [
          {
            "a_dim_id": "ai-governance-and-model-security",
            "a_name": "AI Governance and Model Security",
            "a_role": "__skill_focal__",
            "b_dim_id": "ai-training-and-deployment-controls",
            "b_name": "AI Training and Deployment Controls",
            "b_role": "AI Compliance Officer",
            "pair_kind": "cross_role",
            "reasoning": "Dim A covers model governance/security work for ML practitioners: training data provenance, model approvals, safety controls, auditability, policy compliance, and model supply chain integrity. Dim B is about an AI Compliance Officer reviewing lifecycle checkpoints around training, release, and deployment to ensure approvals and safeguards exist. A senior practitioner in A would not naturally be a senior practitioner in B: career-track: no, because governance/security engineering and compliance-officer checkpoint review are adjacent but distinct.",
            "similarity": 0.6770906625953983
          }
        ],
        "locked_dimensions": [
          {
            "description": "Core concepts, methods, and workflows for building predictive models from data. This fits the target skill because machine learning is the umbrella discipline covering model selection, training, validation, and deployment-oriented thinking.",
            "exemplar_skills": [
              "Machine Learning",
              "supervised learning",
              "unsupervised learning",
              "feature engineering",
              "cross-validation",
              "hyperparameter tuning"
            ],
            "in_scope": "Machine Learning, supervised learning, unsupervised learning, feature engineering, model training, classification, regression, clustering, cross-validation, bias-variance tradeoff, hyperparameter tuning",
            "name": "Machine Learning Fundamentals",
            "out_of_scope": "Deep learning architecture design, transformer fine-tuning, and neural network implementation, which belong to specialized model architecture dimensions; experiment logging and run comparison, which belong to experiment tracking and evaluation",
            "overlap_flags": [
              {
                "reason": "ML work often uses experiment tracking, but that dimension covers the tooling and evaluation workflow rather than the core modeling concepts.",
                "with_dim_id": "experiment-tracking-and-evaluation",
                "with_dim_name": null,
                "with_role": "ML Engineer"
              },
              {
                "reason": "Some ML roles optimize models, but that dimension is specifically about latency, throughput, and efficiency tuning.",
                "with_dim_id": "model-optimization-and-acceleration",
                "with_dim_name": null,
                "with_role": "ML Engineer"
              }
            ],
            "tentative_id": "d_init_01"
          },
          {
            "description": "Controls and documentation used to make models safer, auditable, and compliant. Machine learning practitioners may need this when training or deploying models in regulated or risk-sensitive environments.",
            "exemplar_skills": [
              "Machine Learning",
              "model risk review",
              "training data provenance",
              "model approvals",
              "safety controls",
              "auditability"
            ],
            "in_scope": "Machine Learning, model risk review, training data provenance, model approvals, safety controls, auditability, policy compliance, model supply chain integrity",
            "name": "AI Governance and Model Security",
            "out_of_scope": "General model building, feature engineering, and algorithm selection, which belong to core machine learning practice; release pipeline mechanics, which belong to deployment and CI/CD dimensions",
            "overlap_flags": [
              {
                "reason": "Both can touch model release stages, but this dimension is about governance and safeguards while that one focuses on training/deployment gates.",
                "with_dim_id": "ai-training-and-deployment-controls",
                "with_dim_name": null,
                "with_role": "AI Compliance Officer"
              }
            ],
            "tentative_id": "ai-governance-and-model-security"
          },
          {
            "description": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
            "exemplar_skills": [
              "AI Governance and Model Security"
            ],
            "in_scope": "Skills, tools, and practices that belong under AI Governance and Model Security for the target role, including items implied by the dimension rationale.",
            "name": "AI Governance and Model Security",
            "out_of_scope": "Adjacent clusters explicitly not owned by AI Governance and Model Security, including unrelated platforms, roles, and skill families per library policy.",
            "overlap_flags": [],
            "tentative_id": "ai-governance-and-model-security"
          }
        ],
        "merge_log": [],
        "placed": {
          "name": "Machine Learning",
          "placement_confidence": 0.92,
          "primary_dimension": "d_init_01",
          "reasoning": "Deterministic JD placement: locked_dimensions has 3 dimension(s) from skill-driven dimension generation after reconciliation; primary_dimension is the first locked dim.",
          "secondary_dimensions": [
            "ai-governance-and-model-security"
          ],
          "skill_id": "machine-learning"
        },
        "relationships": {
          "child_skills": [],
          "parent_skills": [
            "ai"
          ],
          "related_to": [
            "mlops",
            "intelligent-automation",
            "embeddings",
            "chatbots",
            "pytorch",
            "openai"
          ],
          "requires": [
            "algorithms",
            "data-structures"
          ],
          "skill_id": "machine-learning",
          "suppress_on_match": []
        },
        "skill_id": "machine-learning",
        "split_log": [],
        "typed": {
          "alternatives_considered": [],
          "confidence": 0.98,
          "name": "Machine Learning",
          "reasoning": "Machine Learning is a named knowledge unit about building models that learn from data, so by the Concept vs Methodology rule it is a Concept rather than an Architecture or Methodology.",
          "skill_id": "machine-learning",
          "subtype": "machine_learning",
          "type": "Concept"
        },
        "warnings": [
          "stage3_post_filter_dropped_catalog_only_locked_dims:42->3"
        ]
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "Artificial Intelligence",
          "llm_role": null,
          "roles_from_db": []
        }
      ],
      "input_skill": "Artificial Intelligence",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Concept",
          "skill_nature": "CONCEPT",
          "sub_category": "artificial_intelligence",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "NOT_APPLICABLE",
          "volatility": "STABLE"
        },
        "enrichment": {
          "ambiguity": {
            "ambiguity_flag": false,
            "confused_with": [],
            "reasoning": "“Artificial Intelligence” is a broad, standard concept and is unlikely to be confused with a different catalog skill in typical job descriptions."
          },
          "context_keywords": {
            "context_keywords": [
              "machine learning",
              "neural networks",
              "deep learning",
              "natural language processing",
              "computer vision",
              "reinforcement learning",
              "TensorFlow",
              "PyTorch",
              "data mining",
              "predictive analytics",
              "algorithm optimization",
              "AI ethics",
              "supervised learning",
              "unsupervised learning",
              "model training"
            ]
          },
          "maturity": {
            "confidence": 0.96,
            "maturity": "well_known",
            "reasoning": "AI appears in a large and growing share of job descriptions across software, data, and product roles, and major vendors (Microsoft, Google, AWS) have standardized AI offerings, signaling broad market adoption."
          },
          "skill_id": "artificial-intelligence",
          "vendor_license": {
            "confidence": 1.0,
            "license": null,
            "vendor": null,
            "year_introduced": null
          },
          "versioning": {
            "current_version": null,
            "version_aliases": {},
            "versioned": false
          }
        },
        "keep_log": [],
        "locked_dimensions": [
          {
            "description": "Core concepts, methods, and terminology for building AI systems across symbolic, statistical, and machine-learning approaches. This skill is broad enough to stand as a top-level conceptual dimension when the intent is general AI literacy rather than a specific subdomain.",
            "exemplar_skills": [
              "Artificial Intelligence",
              "machine learning",
              "neural networks",
              "generative AI",
              "reinforcement learning"
            ],
            "in_scope": "Artificial Intelligence, machine learning basics, neural networks, supervised learning, unsupervised learning, reinforcement learning, generative AI concepts, model evaluation fundamentals",
            "name": "Artificial Intelligence Concepts",
            "out_of_scope": "AI governance and compliance controls, prompt management, vector databases, and model deployment operations, which belong to more specialized AI or platform dimensions",
            "overlap_flags": [
              {
                "reason": "AI systems often require governance and security controls, but this dimension is about the core AI concept itself rather than risk management.",
                "with_dim_id": "ai-governance-and-model-security",
                "with_dim_name": null,
                "with_role": "AI Engineer, ML Engineer"
              },
              {
                "reason": "Optimization is a downstream specialization for AI models, not the general AI concept.",
                "with_dim_id": "model-optimization-and-acceleration",
                "with_dim_name": null,
                "with_role": "ML Engineer"
              },
              {
                "reason": "Evaluation is commonly part of AI work, but this dimension focuses on the broader AI domain rather than experiment tooling and measurement.",
                "with_dim_id": "experiment-tracking-and-evaluation",
                "with_dim_name": null,
                "with_role": "ML Engineer"
              }
            ],
            "tentative_id": "d_init_01"
          }
        ],
        "merge_log": [],
        "placed": {
          "name": "Artificial Intelligence",
          "placement_confidence": 0.92,
          "primary_dimension": "d_init_01",
          "reasoning": "Deterministic JD placement: locked_dimensions has 1 dimension(s) from skill-driven dimension generation after reconciliation; primary_dimension is the first locked dim.",
          "secondary_dimensions": [],
          "skill_id": "artificial-intelligence"
        },
        "relationships": {
          "child_skills": [],
          "parent_skills": [],
          "related_to": [
            "ai",
            "intelligent-automation",
            "algorithms",
            "chatbots",
            "virtual-assistants",
            "agentic-workflows",
            "apis",
            "openai",
            "anthropic",
            "openai-embeddings"
          ],
          "requires": [],
          "skill_id": "artificial-intelligence",
          "suppress_on_match": []
        },
        "skill_id": "artificial-intelligence",
        "split_log": [],
        "typed": {
          "alternatives_considered": [],
          "confidence": 0.98,
          "name": "Artificial Intelligence",
          "reasoning": "Artificial Intelligence is a named knowledge unit about a field of techniques and theory, so by the Concept vs Methodology rule it is a Concept rather than a tool, platform, or methodology.",
          "skill_id": "artificial-intelligence",
          "subtype": "artificial_intelligence",
          "type": "Concept"
        },
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Storage and Data Services",
            "id": 144,
            "rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
            "slug": "cloud-storage-and-data-services",
            "source": "db"
          },
          "input_skill": "Data Lakes",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "Data Lakes",
          "llm_role": null,
          "roles_from_db": []
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Storage and Data Services",
            "id": 144,
            "rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
            "slug": "cloud-storage-and-data-services",
            "source": "db"
          },
          "input_skill": "Data Lakes",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Data Lakes",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Architecture",
          "skill_nature": "PATTERN",
          "sub_category": "data_lake_architecture",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "NOT_APPLICABLE",
          "volatility": "STABLE"
        },
        "enrichment": {
          "ambiguity": {
            "ambiguity_flag": false,
            "confused_with": [],
            "reasoning": "\u201cData Lakes\u201d is a specific architecture pattern (data lake storage/processing) and is unlikely to be confused with other distinct catalog skills."
          },
          "context_keywords": {
            "context_keywords": [
              "AWS Lake Formation",
              "Azure Data Lake",
              "data ingestion",
              "ETL",
              "data governance",
              "schema evolution",
              "data catalog",
              "big data",
              "data warehousing",
              "real-time analytics",
              "data pipelines",
              "data modeling",
              "partitioning",
              "data lakes vs data warehouses",
              "serverless architecture"
            ]
          },
          "maturity": {
            "confidence": 0.93,
            "maturity": "well_known",
            "reasoning": "Data lakes are widely listed in cloud/data platform job descriptions and are a standard architecture in AWS, Azure, and GCP ecosystems; they\u2019re a common hiring-pipeline staple rather than a niche pattern."
          },
          "skill_id": "data-lakes",
          "vendor_license": {
            "confidence": 0.8,
            "license": null,
            "vendor": null,
            "year_introduced": null
          },
          "versioning": {
            "current_version": null,
            "version_aliases": {},
            "versioned": false
          }
        },
        "keep_log": [
          {
            "a_dim_id": "cloud-storage-and-data-services",
            "a_name": "Cloud Storage and Data Services",
            "a_role": "__skill_focal__",
            "b_dim_id": "cloud-storage-and-data-services",
            "b_name": "Cloud Storage and Data Services",
            "b_role": "Cloud Architect",
            "pair_kind": "cross_role",
            "reasoning": "Dim A is analytical data-lake storage: object storage, bucket/prefix design, lifecycle policies, raw/bronze and curated/silver/gold zones, and lakehouse storage design. Dim B is cloud-architecture storage placement: choosing durability tiers, placing workloads, and evaluating managed service tradeoffs. A senior in A (e.g., S3 data lake architecture, Azure Data Lake Storage) is not automatically a senior in B\u2019s broader platform-boundary work. career-track: no, because these are related but distinct specialties.",
            "similarity": 0.853807177782391
          }
        ],
        "locked_dimensions": [
          {
            "description": "Cloud-native storage and managed data services used to store large analytical datasets, define retention, and support lake-style architectures. Data Lakes fit here because they are typically built on object storage and adjacent managed services for durable, scalable data storage.",
            "exemplar_skills": [
              "Data Lakes",
              "object storage",
              "lakehouse storage design",
              "S3 data lake architecture",
              "Azure Data Lake Storage",
              "Google Cloud Storage",
              "data retention policies"
            ],
            "in_scope": "Data Lakes, object storage, data lake storage layouts, bucket and prefix design, lifecycle policies, retention tiers, cloud-native analytical storage, raw/bronze data zones, curated/silver/gold zones",
            "name": "Cloud Storage and Data Services",
            "out_of_scope": "Data warehouse modeling, BI dashboards, ETL job orchestration, streaming ingestion pipelines, which belong to analytics modeling, BI, or ETL dimensions",
            "overlap_flags": [
              {
                "reason": "Data lakes often receive data from ETL/ELT pipelines, but this dimension is about the storage layer rather than ingestion/transformation workflows.",
                "with_dim_id": "etl-and-elt-tooling",
                "with_dim_name": null,
                "with_role": "Data Engineer"
              },
              {
                "reason": "Curated lake data may feed BI tools, but dashboarding and semantic reporting are separate concerns.",
                "with_dim_id": "bi-and-visualization-tools",
                "with_dim_name": null,
                "with_role": "Data Engineer"
              }
            ],
            "tentative_id": "cloud-storage-and-data-services"
          },
          {
            "description": "Architectural patterns for organizing analytical data across raw, curated, and consumption-ready layers in a lake or lakehouse. This fits Data Lakes when the skill is used to design how data is structured, governed, and accessed for analytics.",
            "exemplar_skills": [
              "Data Lakes",
              "lakehouse architecture",
              "medallion architecture",
              "schema-on-read",
              "partitioning strategy",
              "bronze-silver-gold layers",
              "analytical data platform design"
            ],
            "in_scope": "Data Lakes, lakehouse architecture, medallion architecture, bronze-silver-gold layering, schema-on-read, partitioning strategy, table formats for lakes, analytical data organization",
            "name": "Lakehouse Data Architecture",
            "out_of_scope": "Physical cloud storage primitives, ETL connector setup, warehouse SQL modeling, and BI consumption layers, which are owned by storage, ETL, or analytics dimensions",
            "overlap_flags": [
              {
                "reason": "Lakehouse design depends on underlying object storage, but the architectural pattern is broader than storage configuration alone.",
                "with_dim_id": "cloud-storage-and-data-services",
                "with_dim_name": null,
                "with_role": "Cloud Architect"
              },
              {
                "reason": "Lakehouse architectures are commonly populated by ETL/ELT pipelines, though pipeline implementation is not the core of this dimension.",
                "with_dim_id": "etl-and-elt-tooling",
                "with_dim_name": null,
                "with_role": "Data Engineer"
              }
            ],
            "tentative_id": "d_init_01"
          },
          {
            "description": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
            "exemplar_skills": [
              "Cloud Storage and Data Services"
            ],
            "in_scope": "Skills, tools, and practices that belong under Cloud Storage and Data Services for the target role, including items implied by the dimension rationale.",
            "name": "Cloud Storage and Data Services",
            "out_of_scope": "Adjacent clusters explicitly not owned by Cloud Storage and Data Services, including unrelated platforms, roles, and skill families per library policy.",
            "overlap_flags": [],
            "tentative_id": "cloud-storage-and-data-services"
          }
        ],
        "merge_log": [],
        "placed": {
          "name": "Data Lakes",
          "placement_confidence": 0.92,
          "primary_dimension": "cloud-storage-and-data-services",
          "reasoning": "Deterministic JD placement: locked_dimensions has 3 dimension(s) from skill-driven dimension generation after reconciliation; primary_dimension is the first locked dim.",
          "secondary_dimensions": [
            "d_init_01"
          ],
          "skill_id": "data-lakes"
        },
        "relationships": {
          "child_skills": [],
          "parent_skills": [],
          "related_to": [
            "nosql",
            "relational-databases",
            "rds",
            "databricks",
            "splunk",
            "kubernetes",
            "mlops",
            "devops"
          ],
          "requires": [],
          "skill_id": "data-lakes",
          "suppress_on_match": []
        },
        "skill_id": "data-lakes",
        "split_log": [],
        "typed": {
          "alternatives_considered": [
            "Concept: ruled out \u2014 although it is a known data-management idea, the term primarily denotes an architectural pattern.",
            "Datastore: ruled out \u2014 a data lake is typically an architectural approach built on one or more storage systems, not a single datastore product."
          ],
          "confidence": 0.9,
          "name": "Data Lakes",
          "reasoning": "By the Architecture vs Concept rule, data lakes describe a system-shape for organizing and storing data rather than a specific knowledge unit or product.",
          "skill_id": "data-lakes",
          "subtype": "data_lake_architecture",
          "type": "Architecture"
        },
        "warnings": [
          "stage3_post_filter_dropped_catalog_only_locked_dims:42-\u003e3"
        ]
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "Lakehouse",
          "llm_role": null,
          "roles_from_db": []
        }
      ],
      "input_skill": "Lakehouse",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Architecture",
          "skill_nature": "PATTERN",
          "sub_category": "data_platform_architecture",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "NOT_APPLICABLE",
          "volatility": "EMERGING"
        },
        "enrichment": {
          "ambiguity": {
            "ambiguity_flag": false,
            "confused_with": [],
            "reasoning": "\u201cLakehouse\u201d is a specific data platform architecture term and is unlikely to be confused with other catalog skills."
          },
          "context_keywords": {
            "context_keywords": [
              "Delta Lake",
              "Apache Spark",
              "data warehouse",
              "data lake",
              "ETL",
              "streaming analytics",
              "data governance",
              "cloud storage",
              "SQL",
              "data modeling",
              "real-time processing",
              "data integration",
              "analytics",
              "data pipeline",
              "metadata management"
            ]
          },
          "maturity": {
            "confidence": 0.86,
            "maturity": "emerging",
            "reasoning": "Lakehouse is increasingly listed in data-platform JDs and vendor docs (Databricks, Snowflake, Microsoft Fabric), but it is not yet as universal as core warehouse or lake skills."
          },
          "skill_id": "lakehouse",
          "vendor_license": {
            "confidence": 0.8,
            "license": null,
            "vendor": null,
            "year_introduced": null
          },
          "versioning": {
            "current_version": null,
            "version_aliases": {},
            "versioned": false
          }
        },
        "keep_log": [],
        "locked_dimensions": [
          {
            "description": "Unified data platform patterns that combine data lake storage with warehouse-style management, governance, and analytics. Lakehouse belongs here because it refers to the architectural approach and platform capabilities used to store, process, and serve analytical data.",
            "exemplar_skills": [
              "Lakehouse",
              "Delta Lake",
              "Apache Iceberg",
              "Apache Hudi",
              "medallion architecture",
              "ACID table formats",
              "schema evolution",
              "time travel queries"
            ],
            "in_scope": "Lakehouse, Delta Lake, Apache Iceberg, Apache Hudi, table formats, ACID tables, schema evolution, time travel, medallion architecture, unified batch and streaming analytics",
            "name": "Lakehouse Architecture",
            "out_of_scope": "Traditional data warehouse modeling and BI semantic layers, covered by BI and Visualization Tools; generic cloud storage primitives without table management, covered by Cloud Storage and Data Services; ETL job orchestration, covered by ETL and ELT Tooling",
            "overlap_flags": [
              {
                "reason": "Lakehouse platforms rely on object storage, but this dimension is about the table and governance layer rather than raw storage services.",
                "with_dim_id": "cloud-storage-and-data-services",
                "with_dim_name": null,
                "with_role": "Cloud Architect"
              },
              {
                "reason": "Lakehouse implementations often use ETL/ELT pipelines, but pipeline tooling is a separate concern from the lakehouse architecture itself.",
                "with_dim_id": "etl-and-elt-tooling",
                "with_dim_name": null,
                "with_role": "Data Engineer"
              },
              {
                "reason": "Lakehouse data is frequently consumed by BI tools, but BI/semantic consumption is downstream of the storage and processing architecture.",
                "with_dim_id": "bi-and-visualization-tools",
                "with_dim_name": null,
                "with_role": "Data Engineer"
              }
            ],
            "tentative_id": "d_init_01"
          }
        ],
        "merge_log": [],
        "placed": {
          "name": "Lakehouse",
          "placement_confidence": 0.92,
          "primary_dimension": "d_init_01",
          "reasoning": "Deterministic JD placement: locked_dimensions has 1 dimension(s) from skill-driven dimension generation after reconciliation; primary_dimension is the first locked dim.",
          "secondary_dimensions": [],
          "skill_id": "lakehouse"
        },
        "relationships": {
          "child_skills": [],
          "parent_skills": [],
          "related_to": [
            "databricks",
            "spark",
            "flink",
            "rds",
            "sqlite",
            "room",
            "jenkins",
            "gradle"
          ],
          "requires": [],
          "skill_id": "lakehouse",
          "suppress_on_match": []
        },
        "skill_id": "lakehouse",
        "split_log": [],
        "typed": {
          "alternatives_considered": [],
          "confidence": 0.9,
          "name": "Lakehouse",
          "reasoning": "Lakehouse is fundamentally a system-shape pattern that combines data lake and warehouse characteristics, so by the Architecture vs Concept rule it fits Architecture rather than a tool or datastore.",
          "skill_id": "lakehouse",
          "subtype": "data_platform_architecture",
          "type": "Architecture"
        },
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "Event-Driven Architecture",
          "llm_role": null,
          "roles_from_db": []
        }
      ],
      "input_skill": "Event-Driven Architecture",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Architecture",
          "skill_nature": "PATTERN",
          "sub_category": "event_driven_architecture",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "NOT_APPLICABLE",
          "volatility": "STABLE"
        },
        "enrichment": {
          "ambiguity": {
            "ambiguity_flag": false,
            "confused_with": [],
            "reasoning": "Event-Driven Architecture is a specific architecture pattern; typical JDs won\u2019t confuse it with other distinct architecture skills."
          },
          "context_keywords": {
            "context_keywords": [
              "microservices",
              "Kafka",
              "RabbitMQ",
              "event sourcing",
              "CQRS",
              "asynchronous messaging",
              "publish-subscribe",
              "stream processing",
              "event bus",
              "serverless",
              "event-driven programming",
              "message broker",
              "real-time data",
              "data pipeline",
              "event schema"
            ]
          },
          "maturity": {
            "confidence": 0.92,
            "maturity": "well_known",
            "reasoning": "Common in cloud-native JDs and vendor docs; AWS, Azure, and Confluent all market event-driven patterns with Kafka/PubSub, showing broad hiring demand."
          },
          "skill_id": "event-driven-architecture",
          "vendor_license": {
            "confidence": 0.9,
            "license": null,
            "vendor": null,
            "year_introduced": null
          },
          "versioning": {
            "current_version": null,
            "version_aliases": {},
            "versioned": false
          }
        },
        "keep_log": [],
        "locked_dimensions": [
          {
            "description": "Architectural patterns for building systems around events, asynchronous messaging, and decoupled producers and consumers. This fits the target skill because it covers how services publish, route, process, and react to domain and integration events.",
            "exemplar_skills": [
              "Event-Driven Architecture",
              "Event Sourcing",
              "Pub/Sub Design",
              "Kafka",
              "RabbitMQ",
              "Asynchronous Messaging",
              "Domain Events",
              "Idempotent Consumer Design"
            ],
            "in_scope": "Event-Driven Architecture, event sourcing, pub/sub, message-driven workflows, domain events, integration events, asynchronous processing, event contracts, idempotent consumers, eventual consistency, Kafka, RabbitMQ, SNS/SQS, Google Pub/Sub",
            "name": "Event-Driven Architecture",
            "out_of_scope": "Synchronous REST API design, client-side HTTP calls, database schema design, low-level serialization formats, which belong to networking, data modeling, or serialization dimensions",
            "overlap_flags": [
              {
                "reason": "Event payloads often rely on schemas and wire formats, but this dimension is about the architectural pattern rather than serialization mechanics.",
                "with_dim_id": "data-serialization-standards-protocols",
                "with_dim_name": null,
                "with_role": "Data Engineer"
              },
              {
                "reason": "Event-driven systems frequently use async execution and coordination, but this dimension focuses on system architecture and messaging topology.",
                "with_dim_id": "concurrency-and-parallel-processing",
                "with_dim_name": null,
                "with_role": "Backend Engineer"
              },
              {
                "reason": "EDA is often chosen for scale and throughput, but performance tuning is a separate concern from the architectural style itself.",
                "with_dim_id": "performance-and-scalability-tuning",
                "with_dim_name": null,
                "with_role": "Backend Engineer"
              }
            ],
            "tentative_id": "d_init_01"
          }
        ],
        "merge_log": [],
        "placed": {
          "name": "Event-Driven Architecture",
          "placement_confidence": 0.92,
          "primary_dimension": "d_init_01",
          "reasoning": "Deterministic JD placement: locked_dimensions has 1 dimension(s) from skill-driven dimension generation after reconciliation; primary_dimension is the first locked dim.",
          "secondary_dimensions": [],
          "skill_id": "event-driven-architecture"
        },
        "relationships": {
          "child_skills": [],
          "parent_skills": [],
          "related_to": [
            "redux",
            "repository-pattern",
            "mvvm",
            "apis",
            "ci-cd",
            "devops",
            "scrum",
            "nosql"
          ],
          "requires": [],
          "skill_id": "event-driven-architecture",
          "suppress_on_match": []
        },
        "skill_id": "event-driven-architecture",
        "split_log": [],
        "typed": {
          "alternatives_considered": [],
          "confidence": 0.99,
          "name": "Event-Driven Architecture",
          "reasoning": "By the Architecture vs Concept rule, Event-Driven Architecture is a system-shape pattern that influences how systems are built, not just a knowledge unit.",
          "skill_id": "event-driven-architecture",
          "subtype": "event_driven_architecture",
          "type": "Architecture"
        },
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    }
  ],
  "unmatched_skills": [
    "Apache Spark",
    "Hadoop",
    "HBase",
    "Aerospike",
    "Cassandra",
    "RDBMS",
    "Machine Learning",
    "Artificial Intelligence",
    "Data Lakes",
    "Lakehouse",
    "Event-Driven Architecture"
  ]
}
API 3 — final-role-output
{
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "The primary skills indicate a strong focus on data processing technologies and cloud platforms, aligning well with a Data Engineer role.",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "chosen_role_resolution": "in_db",
  "final_input_skills": [
    {
      "skill": "Apache Spark",
      "tag": "new"
    },
    {
      "skill": "Kafka",
      "tag": "in_db"
    },
    {
      "skill": "Hadoop",
      "tag": "new"
    },
    {
      "skill": "HBase",
      "tag": "new"
    },
    {
      "skill": "Aerospike",
      "tag": "new"
    },
    {
      "skill": "Cassandra",
      "tag": "new"
    },
    {
      "skill": "Java",
      "tag": "in_db"
    },
    {
      "skill": "Scala",
      "tag": "in_db"
    },
    {
      "skill": "AWS",
      "tag": "in_db"
    },
    {
      "skill": "GCP",
      "tag": "in_db"
    },
    {
      "skill": "Azure",
      "tag": "in_db"
    },
    {
      "skill": "RDBMS",
      "tag": "new"
    },
    {
      "skill": "NoSQL",
      "tag": "in_db"
    },
    {
      "skill": "Machine Learning",
      "tag": "new"
    },
    {
      "skill": "Artificial Intelligence",
      "tag": "new"
    },
    {
      "skill": "Data Lakes",
      "tag": "new"
    },
    {
      "skill": "Lakehouse",
      "tag": "new"
    },
    {
      "skill": "Event-Driven Architecture",
      "tag": "new"
    }
  ],
  "persistence": {
    "items": [
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Messaging and Event Streaming",
          "id": 8,
          "rationale": "Transport-layer systems used to move events and decouple producers from consumers. Data engineers use these systems to ingest, buffer, and distribute event data before downstream processing.",
          "slug": "messaging-and-event-streaming",
          "source": "db"
        },
        "dimension_id": 8,
        "input_skill": "Kafka",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Backend Engineer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 36,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Kotlin and Java",
          "id": 161,
          "rationale": "Primary implementation languages for Android app features, platform integration, and client-side business logic. Android engineers use these languages to build screens, state flows, service adapters, and device-aware behavior.",
          "slug": "kotlin-and-java",
          "source": "db"
        },
        "dimension_id": 161,
        "input_skill": "Java",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Android Engineer",
            "id": 4,
            "rationale": null,
            "role_archetype": null,
            "slug": "android-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages",
          "id": 1,
          "rationale": "Core server-side languages used to implement backend business logic, integrations, and service internals. This is the primary coding surface for the role across application layers.",
          "slug": "programming-languages",
          "source": "db"
        },
        "dimension_id": 1,
        "input_skill": "Java",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Backend Engineer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for Data Work",
          "id": 21,
          "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
          "slug": "programming-languages-for-data-work",
          "source": "db"
        },
        "dimension_id": 21,
        "input_skill": "Java",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for Data Work",
          "id": 21,
          "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
          "slug": "programming-languages-for-data-work",
          "source": "db"
        },
        "dimension_id": 21,
        "input_skill": "Scala",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 102,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for ML Systems",
          "id": 39,
          "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
          "slug": "programming-languages-for-ml-systems",
          "source": "db"
        },
        "dimension_id": 39,
        "input_skill": "Scala",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 102,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Platforms",
          "id": 20,
          "rationale": "Proficiency in major cloud service provider platforms and their core services.",
          "slug": "cloud-platforms",
          "source": "db"
        },
        "dimension_id": 20,
        "input_skill": "AWS",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Backend Engineer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Cybersecurity Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          },
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          },
          {
            "display_name": "DevOps Engineer",
            "id": 10,
            "rationale": null,
            "role_archetype": null,
            "slug": "devops-engineer",
            "source": "db"
          },
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 187,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Platforms for AI Deployment",
          "id": 211,
          "rationale": "Major cloud services that provide infrastructure and managed services for AI workloads.",
          "slug": "cloud-platforms-for-ai-deployment",
          "source": "db"
        },
        "dimension_id": 211,
        "input_skill": "AWS",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "AI Engineer",
            "id": 13,
            "rationale": null,
            "role_archetype": null,
            "slug": "ai-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 187,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Provider Platforms",
          "id": 131,
          "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
          "slug": "cloud-provider-platforms",
          "source": "db"
        },
        "dimension_id": 131,
        "input_skill": "AWS",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 187,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Security Posture Tools",
          "id": 64,
          "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
          "slug": "cloud-security-posture-tools",
          "source": "db"
        },
        "dimension_id": 64,
        "input_skill": "AWS",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cybersecurity Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 187,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Platforms",
          "id": 20,
          "rationale": "Proficiency in major cloud service provider platforms and their core services.",
          "slug": "cloud-platforms",
          "source": "db"
        },
        "dimension_id": 20,
        "input_skill": "GCP",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Backend Engineer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Cybersecurity Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          },
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          },
          {
            "display_name": "DevOps Engineer",
            "id": 10,
            "rationale": null,
            "role_archetype": null,
            "slug": "devops-engineer",
            "source": "db"
          },
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 186,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Platforms for AI Deployment",
          "id": 211,
          "rationale": "Major cloud services that provide infrastructure and managed services for AI workloads.",
          "slug": "cloud-platforms-for-ai-deployment",
          "source": "db"
        },
        "dimension_id": 211,
        "input_skill": "GCP",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "AI Engineer",
            "id": 13,
            "rationale": null,
            "role_archetype": null,
            "slug": "ai-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 186,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Security Posture Tools",
          "id": 64,
          "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
          "slug": "cloud-security-posture-tools",
          "source": "db"
        },
        "dimension_id": 64,
        "input_skill": "GCP",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cybersecurity Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 186,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Platforms",
          "id": 20,
          "rationale": "Proficiency in major cloud service provider platforms and their core services.",
          "slug": "cloud-platforms",
          "source": "db"
        },
        "dimension_id": 20,
        "input_skill": "Azure",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Backend Engineer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Cybersecurity Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          },
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          },
          {
            "display_name": "DevOps Engineer",
            "id": 10,
            "rationale": null,
            "role_archetype": null,
            "slug": "devops-engineer",
            "source": "db"
          },
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 188,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Platforms for AI Deployment",
          "id": 211,
          "rationale": "Major cloud services that provide infrastructure and managed services for AI workloads.",
          "slug": "cloud-platforms-for-ai-deployment",
          "source": "db"
        },
        "dimension_id": 211,
        "input_skill": "Azure",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "AI Engineer",
            "id": 13,
            "rationale": null,
            "role_archetype": null,
            "slug": "ai-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 188,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Provider Platforms",
          "id": 131,
          "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
          "slug": "cloud-provider-platforms",
          "source": "db"
        },
        "dimension_id": 131,
        "input_skill": "Azure",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 188,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Security Posture Tools",
          "id": 64,
          "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
          "slug": "cloud-security-posture-tools",
          "source": "db"
        },
        "dimension_id": 64,
        "input_skill": "Azure",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cybersecurity Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 188,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "NoSQL Databases",
          "id": 19,
          "rationale": "Models and manages data using non-relational database systems.",
          "slug": "nosql-databases",
          "source": "db"
        },
        "dimension_id": 19,
        "input_skill": "NoSQL",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Backend Engineer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1346,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "ETL and ELT Tooling",
          "id": 24,
          "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
          "slug": "etl-and-elt-tooling",
          "source": "db"
        },
        "dimension_id": 24,
        "input_skill": "Apache Spark",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "New skill saved \u00b7 Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1350,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "ETL and ELT Tooling",
          "id": 24,
          "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
          "slug": "etl-and-elt-tooling",
          "source": "db"
        },
        "dimension_id": 24,
        "input_skill": "Hadoop",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "New skill saved \u00b7 Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1351,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Storage and Data Services",
          "id": 144,
          "rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
          "slug": "cloud-storage-and-data-services",
          "source": "db"
        },
        "dimension_id": 144,
        "input_skill": "HBase",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "New skill saved \u00b7 Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1352,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "Aerospike",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "New skill saved \u00b7 Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1353,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Storage and Data Services",
          "id": 144,
          "rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
          "slug": "cloud-storage-and-data-services",
          "source": "db"
        },
        "dimension_id": 144,
        "input_skill": "Cassandra",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "New skill saved \u00b7 Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1354,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "RDBMS",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "New skill saved \u00b7 Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1355,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "Machine Learning",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "New skill saved \u00b7 Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1356,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "AI Governance and Model Security",
          "id": 50,
          "rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
          "slug": "ai-governance-and-model-security",
          "source": "db"
        },
        "dimension_id": 50,
        "input_skill": "Machine Learning",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "New skill saved \u00b7 Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "AI Engineer",
            "id": 13,
            "rationale": null,
            "role_archetype": null,
            "slug": "ai-engineer",
            "source": "db"
          },
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1356,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "Artificial Intelligence",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "New skill saved \u00b7 Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1357,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Storage and Data Services",
          "id": 144,
          "rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
          "slug": "cloud-storage-and-data-services",
          "source": "db"
        },
        "dimension_id": 144,
        "input_skill": "Data Lakes",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "New skill saved \u00b7 Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1358,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "Data Lakes",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "New skill saved \u00b7 Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1358,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "Lakehouse",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "New skill saved \u00b7 Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1359,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "Event-Driven Architecture",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "New skill saved \u00b7 Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1360,
        "skill_tag": "in_db",
        "skipped_reason": null
      }
    ],
    "new_skills_created": 11,
    "role_dimension_saved": 0,
    "skill_dimension_saved": 13,
    "skipped": 0
  },
  "planner_output": null,
  "run_id": "1f106d71-338e-40ee-a69a-09957abcd98f"
}

LLM Calls

Every model call made for this run, in pipeline order.
