Pipeline run

bae69e34-7cea-4d40-93b7-a7eea576fb78

Pipeline LLM cost (USD)
API 1: $0.0071 API 2: $0.0002 API 3: $0.0000 Total: $0.0073

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description
SPARSE JD role baseline loaded sources · ai_index: role_baseline · nature_of_work: no_kras · tech_stack_maturity: jd
Nature of work no kras
Vague JD — no KRAs present to derive a specific nature of work.
Tech stack maturity
Mainstream Modern
The stack centers on widely used distributed data engineering tools like Airflow, Spark, Kafka, and Python, with some legacy data stores like HBase and MongoDB, which fits a mainstream modern data platform rather than bleeding-edge or pre-cloud.
AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)
1.20 / 5
· Title match
· Has AI skill
· AI skill (primary)
· AI skill (secondary)
· On AI team
· Builds AI products
vocab breakdown (legacy)
Assistants (×1):
Frameworks (×2):
Models / concepts (×3):
Evidence — skills matched in JD (12)
Python Kafka Apache Spark Google Dataflow Airflow Bigtable MongoDB DynamoDB HBase Scrapy Selenium Beautiful Soup
Skill cluster (4 dimension groups, role-scoped)
ETL and ELT Tooling
Apache Spark
Messaging and Event Streaming
Kafka
Programming Languages for Data Work
Python
Cross-cutting / unaligned
Google Dataflow Airflow Bigtable MongoDB DynamoDB HBase Scrapy Selenium Beautiful Soup
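
The grouping above looks consistent with a simple rule: a skill lands under a dimension only when that dimension is linked to the chosen role, and everything else falls into "Cross-cutting / unaligned". A minimal sketch of that rule follows. The function name, the input shape, and the role's dimension set are assumptions inferred from this page, not the pipeline's actual code.

```python
from collections import defaultdict

UNALIGNED = "Cross-cutting / unaligned"

def cluster_skills(skills, role_dimensions):
    """Group skills by the first worklist dimension that belongs to the chosen role.

    skills: list of (skill_name, [dimension names from the worklist])
    role_dimensions: set of dimension names linked to the chosen role
    """
    clusters = defaultdict(list)
    for name, dims in skills:
        # Pick the first dimension that sits under the chosen role, if any.
        aligned = next((d for d in dims if d in role_dimensions), None)
        clusters[aligned or UNALIGNED].append(name)
    return dict(clusters)

# Dimension names copied from this run; the set itself is inferred (assumption).
role_dims = {
    "ETL and ELT Tooling",
    "Messaging and Event Streaming",
    "Programming Languages for Data Work",
}
skills = [
    ("Python", ["Programming Languages", "Programming Languages for Data Work"]),
    ("Kafka", ["Messaging and Background Jobs", "Messaging and Event Streaming"]),
    ("Apache Spark", ["ETL and ELT Tooling"]),
    ("Airflow", ["Workflow Orchestration for ML Pipelines"]),
    ("Scrapy", []),
]
clusters = cluster_skills(skills, role_dims)
```

Under this rule Airflow ends up unaligned because its only worklist dimension is linked to ML roles, which matches the cluster shown on the page.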

Signals

Skill data-engineer
0.25
Alias
KRA data-engineer
0.46
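
Both signal values can be reproduced from the raw API 1 output further down the page. The skill signal matches matched_count / total_count (3 of 12), and the KRA signal matches the mean of the two per-sentence similarities (0.5721 and 0.3462). The sketch below assumes those two formulas; the pipeline's real scoring code is not shown, so treat the function names and the formulas as inferred.

```python
def skill_score(matched_count: int, total_count: int) -> float:
    """Share of extracted JD skills found in the role's library skills (assumed formula)."""
    return matched_count / total_count if total_count else 0.0

def kra_score(similarities: list[float]) -> float:
    """Mean of the per-sentence KRA similarities (assumed formula)."""
    return sum(similarities) / len(similarities) if similarities else 0.0

# Inputs copied from stage3_signals for the data-engineer role.
data_engineer_skill = skill_score(3, 12)
data_engineer_kra = kra_score([0.5721, 0.3462])
print(f"skill={data_engineer_skill:.2f} kra={data_engineer_kra:.2f}")
```

The same mean reproduces the other roles' KRA scores in the raw output, for example (0.4177 + 0.2909) / 2 = 0.3543 for ML Engineer.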

Post-classification

Centroid updated · n=81
Alias collision log
New-role queue
New skills captured: 6
New KRA captured

Captured for admin review

Google Dataflow primary Data Engineer pending
Bigtable primary Data Engineer pending
DynamoDB primary Data Engineer pending
Scrapy primary Data Engineer pending
Selenium primary Data Engineer pending
Beautiful Soup primary Data Engineer pending
Status: completed Created: 2026-05-27T13:51:31.613420Z Updated: 2026-05-27T13:53:18.618470Z API 3 duration: 37234 ms
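
To put the API 3 duration in context, the two timestamps can be subtracted to get the wall-clock span of the run. This is a small sketch that assumes Created and Updated bracket the whole run, which the page does not state explicitly.

```python
from datetime import datetime

def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' into an aware datetime."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

created = parse_utc("2026-05-27T13:51:31.613420Z")
updated = parse_utc("2026-05-27T13:53:18.618470Z")

wall_clock_s = (updated - created).total_seconds()
api3_s = 37234 / 1000  # API 3 duration is reported in milliseconds
api3_share = api3_s / wall_clock_s

print(f"run: {wall_clock_s:.3f} s, API 3: {api3_s:.3f} s ({api3_share:.0%} of the run)")
```

On these values the run spans about 107 seconds, of which API 3 accounts for roughly a third.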
Flow Current 3-step pipeline

1 POST /skills/extract-from-jd

2 POST /skills/extract-details

3 POST /skills/final-role-output
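
The three calls run in sequence, each consuming the previous step's output. The sketch below shows only that chaining. The endpoint paths and their order come from the page; the payload keys, the function names, and the injected `post` transport are hypothetical, and no base URL or HTTP client is assumed.

```python
from typing import Callable

# A transport takes (path, request body) and returns the response body.
Post = Callable[[str, dict], dict]

STEPS = [
    "/skills/extract-from-jd",
    "/skills/extract-details",
    "/skills/final-role-output",
]

def run_pipeline(jd_text: str, post: Post) -> dict:
    """Call the three endpoints in order, feeding each response into the next step.

    The request and response shapes are hypothetical placeholders.
    """
    payload = {"jd_text": jd_text}
    for path in STEPS:
        payload = post(path, payload)
    return payload

# Stand-in transport that records the call order instead of doing HTTP.
calls = []

def fake_post(path: str, body: dict) -> dict:
    calls.append(path)
    return {**body, "last_step": path}

result = run_pipeline("Sr. Engineer, Data Onboarding", fake_post)
```

Injecting the transport keeps the ordering logic testable without a running service.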

Role Chosen role & resolution

Data Engineer

domain · Data Engineering & Analytics CASE DOMAIN

slug: data-engineer · id: 2 · source: db

Domain=Data Engineering & Analytics; The JD centers on Python-based data onboarding and collection with Kafka, Spark, Dataflow, Airflow, NoSQL databases, and data cleansing/normalization, which best fits a Data Engineer role.

Matched skills

Python · Kafka · Spark · Google Dataflow · Airflow · BigTable · MongoDB · DynamoDB · HBase · Scrapy · Selenium · beautiful soup · web scraping · web crawling · data cleansing

Matched dimensions

Data Onboarding · Data Streaming and Processing · NoSQL Data Engineering · Data Collection and Web Scraping · Data Cleansing and Normalization

Matched KRAs

  • Develop data onboarding pipelines
  • Build data streaming / data processing technologies
  • Work with multiple NoSQL databases
  • Perform data cleansing, normalization, translation
  • Develop web crawling / web scraping tools

Resolution: in_db — role exists in library; skill↔dim and role↔dim links saved when applicable.

New skills: 0
Skill↔dim saved: 0
Role↔dim saved: 0
Skipped: 1

Job description

Requriment 1
Sr. Engineer – Data Onboarding: (Priority 1 – Multiple positions)

  • 6+ years of development experience in development Python.
  • Data streaming / Data processing technologies like Kafka, Spark, Google Dataflow, Airflow
  • Experience with multiple NoSQL databases (BigTable, MongoDB, DynamoDB, HBase
  • Experience working with data – data cleansing, normalization, translation or similar concepts.

Requriment 2
Software Engineer – Data Collection: (Priority 2 – Multiple positions)

  • 3+ years of experience in programming Language: Python
  • Web Crawling or Web Scraping Tools: Scrapy, Selenium, and Beautiful soup.
  • Recent work experience in web scrapping / crawling.

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

Python Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Python id=5 · python

Aliases — catalog

  • Python (CANONICAL) primary
  • Python 2 (VERSION)
  • Python 2.x (VERSION)
  • Python 3 (VERSION)
  • Python 3.10 (VERSION)
  • Python 3.11 (VERSION)
  • Python 3.12 (VERSION)
  • Python 3.x (VERSION)
  • py (VERSION)
  • py2 (VERSION)
  • py3 (VERSION)
  • python 3 (VERSION)
  • python 3.x (VERSION)
  • python2 (VERSION)
  • python3 (VERSION)
  • python3.x (VERSION)
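
An alias catalog like this is typically used to fold the surface forms found in a JD onto one canonical skill. The following is a minimal sketch of such a lookup, assuming case-insensitive matching with collapsed whitespace. The helper names are hypothetical, the alias list is a subset of the catalog above, and the library's real matching logic may differ.

```python
from typing import Optional

# Subset of the catalog aliases listed above.
PYTHON_ALIASES = ["Python", "Python 3", "Python 3.x", "py", "py3", "python3"]

def normalize(token: str) -> str:
    """Lower-case and collapse whitespace so 'Python  3' and 'python 3' compare equal."""
    return " ".join(token.lower().split())

def build_alias_index(canonical: str, aliases: list[str]) -> dict[str, str]:
    """Map each normalized alias to its canonical skill name."""
    return {normalize(alias): canonical for alias in aliases}

def resolve(raw: str, index: dict[str, str]) -> Optional[str]:
    """Return the canonical skill for a raw JD token, or None when it is unknown."""
    return index.get(normalize(raw))

alias_index = build_alias_index("Python", PYTHON_ALIASES)
```

With this normalization, `resolve("  PY3 ", alias_index)` yields the canonical name, while an uncatalogued token such as "Scrapy" yields None and would need a separate new-skill path.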

Context tags (catalog)

API Django FastAPI Flask Jupyter NumPy PEP 8 Pandas REST SQLAlchemy asyncio pandas pip pytest type hints venv virtualenv

Stored enrichment (catalog DB)

Category: Language
Sub-category: Programming Language
Vendor: PSF
License: mit
Year introduced: 1991
Confidence: 0.99
Version strategy: SEPARATE_ENTITY
Version tag: 3

Maturity reasoning: Python appears in a very high volume of job descriptions across data, backend, automation, and ML roles, and remains a default hiring-pipeline language on major job boards and tech stacks.

Skill profile (library / DB)

Skill nature: LANGUAGE
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 6
Sub-category id: 96
Extractable: True
Also category: False

Dimensions (API 2 worklist)

  • Cloud Security Scripting & DSL Languages Catalog dimension db id 248

    Library dimension (catalog)

    Roles linked in library: Cloud Security Engineer

  • Programming Languages Catalog dimension db id 1

    Library dimension (catalog)

    Roles linked in library: Backend Developer, Fullstack Developer

  • Programming Languages and Scripting Catalog dimension db id 59

    Library dimension (catalog)

    Roles linked in library: Cyber Security Engineer

  • Programming Languages for Data Work Catalog dimension db id 21

    Library dimension (catalog)

    Roles linked in library: Data Engineer

  • Programming Languages for ML Systems Catalog dimension db id 39

    Library dimension (catalog)

    Roles linked in library: ML Engineer, MLOps Engineer

  • Programming Languages for XR Catalog dimension db id 97

    Library dimension (catalog)

    Roles linked in library: AR/VR Engineer

  • Python Programming Catalog dimension db id 290

    Library dimension (catalog)

    Roles linked in library: Python Backend Developer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Cloud Security Scripting & DSL Languages
cloud-security-scripting-dsl-languages
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages
programming-languages
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages and Scripting
programming-languages-and-scripting
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages for Data Work
programming-languages-for-data-work
Existing dimension (library) · Role↔dimension saved
Programming Languages for ML Systems
programming-languages-for-ml-systems
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages for XR
programming-languages-for-xr
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python Programming
python-programming
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Kafka Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Kafka id=36 · kafka

Aliases — catalog

  • Kafka (CANONICAL) primary

Context tags (catalog)

Apache Flink Apache Kafka Apache Pulsar Apache Spark Avro KSQL Kafka API Kafka Connect Kafka Streams ZooKeeper Zookeeper backpressure brokers consumer consumer group consumer groups event sourcing event-driven architecture exactly-once semantics fault tolerance high throughput log compaction message broker message queue microservices offsets partition partitioning partitions producer producer API real-time analytics real-time data replication schema registry stream processing topic topic partitioning topics

Stored enrichment (catalog DB)

Category: Datastore
Sub-category: Event Stream Store
Vendor: Confluent
License: apache_2
Year introduced: 2011
Confidence: 0.90
Version strategy: NOT_APPLICABLE

Maturity reasoning: Kafka appears in many production JDs for event streaming and data pipelines, and remains a standard platform in cloud/vendor offerings (e.g., Confluent, AWS MSK), indicating broad hiring demand.

Skill profile (library / DB)

Skill nature: TOOL
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 3
Sub-category id: 3533
Extractable: True
Also category: False

Dimensions (API 2 worklist)

  • Asynchronous Messaging and Event Streaming Catalog dimension db id 297

    Library dimension (catalog)

    Roles linked in library: .NET Backend Developer, Go Backend Developer, Kotlin Backend Developer, Node.js Backend Developer, Scala Backend Developer

  • Messaging and Background Jobs Catalog dimension db id 291

    Library dimension (catalog)

    Roles linked in library: PHP Backend Developer, Python Backend Developer, Ruby Backend Developer

  • Messaging and Event Streaming Catalog dimension db id 8

    Library dimension (catalog)

    Roles linked in library: Backend Developer, Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Asynchronous Messaging and Event Streaming
asynchronous-messaging-and-event-streaming
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Messaging and Background Jobs
messaging-and-background-jobs
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Messaging and Event Streaming
messaging-and-event-streaming
Existing dimension (library) · Role↔dimension saved
Apache Spark Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Apache Spark id=1350 · apache-spark

Aliases — catalog

  • Apache Spark (CANONICAL)
  • apache spark 3 (VERSION)
  • spark (VERSION)
  • spark 3 (VERSION)
  • spark 3.x (VERSION)
  • spark3 (VERSION)

Context tags (catalog)

Apache Kafka Cluster Manager DAGScheduler Data Lake DataFrame ETL Hadoop MLlib Machine Learning PySpark RDD Scala Spark SQL Spark Streaming SparkSession

Stored enrichment (catalog DB)

Category: Framework
Sub-category: Distributed Data Processing Framework
Vendor: Apache Software Foundation
License: apache_2
Year introduced: 2010
Confidence: 0.94
Version strategy: SEPARATE_ENTITY
Version tag: 3.x

Maturity reasoning: Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it.

Skill profile (library / DB)

Skill nature: FRAMEWORK
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 5
Sub-category id: 1021
Extractable: True
Also category: False

Dimensions (API 2 worklist)

  • ETL and ELT Tooling Catalog dimension db id 24

    Library dimension (catalog)

    Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
ETL and ELT Tooling
etl-and-elt-tooling
Existing dimension (library) · Role↔dimension saved
Google Dataflow Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category: Data Engineering Tools
Sub-category: general
Skill nature: PLATFORM
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED
Airflow Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Airflow id=265 · airflow

Aliases — catalog

  • Airflow (CANONICAL) primary
  • airflow 2 (VERSION)
  • airflow-2 (VERSION)
  • airflow2 (VERSION)
  • airflow2.x (VERSION)
  • apache airflow 2 (VERSION)

Context tags (catalog)

Apache Celery CeleryExecutor DAG ETL Executor Jinja templating Python SLA Sensors UI XCom backfill connections data pipeline executor hooks logging monitoring operators plugins scheduler task dependencies task instance variables

Stored enrichment (catalog DB)

Category: Tool
Sub-category: Workflow Orchestration Tool
Vendor: Apache Software Foundation
License: apache_2
Year introduced: 2014
Confidence: 0.95
Version strategy: SEPARATE_ENTITY
Version tag: 2.x

Maturity reasoning: Apache Airflow appears in many data engineering job postings and is a common orchestration choice in production stacks; its GitHub activity and ecosystem remain strong, with no vendor sunset or clear replacement dominating JDs.

Skill profile (library / DB)

Skill nature: TOOL
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 13
Sub-category id: 130
Extractable: True
Also category: False

Dimensions (API 2 worklist)

  • Workflow Orchestration for ML Pipelines Catalog dimension db id 54

    Library dimension (catalog)

    Roles linked in library: ML Engineer, MLOps Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Workflow Orchestration for ML Pipelines
workflow-orchestration-for-ml-pipelines
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Bigtable Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category: Databases
Sub-category: general
Skill nature: TOOL
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED
MongoDB Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: MongoDB id=91 · mongodb

Aliases — catalog

  • MongoDB (CANONICAL) primary
  • MongoDB 2.0 (VERSION)
  • MongoDB 2.2 (VERSION)
  • MongoDB 2.4 (VERSION)
  • MongoDB 2.6 (VERSION)
  • MongoDB 3.0 (VERSION)
  • MongoDB 3.2 (VERSION)
  • MongoDB 3.4 (VERSION)
  • MongoDB 3.6 (VERSION)
  • MongoDB 4 (VERSION)
  • MongoDB 4.0 (VERSION)
  • MongoDB 4.2 (VERSION)
  • MongoDB 4.4 (VERSION)
  • MongoDB 5 (VERSION)
  • MongoDB 5.0 (VERSION)
  • MongoDB 6 (VERSION)
  • MongoDB 6.0 (VERSION)
  • MongoDB 7 (VERSION)
  • MongoDB 7.0 (VERSION)
  • MongoDB 8 (VERSION)
  • MongoDB 8.0 (VERSION)

Context tags (catalog)

BSON CRUD GridFS MongoDB Atlas Mongoose NoSQL TTL index aggregation pipeline change streams collections documents indexes replica set sharding

Stored enrichment (catalog DB)

Category: Datastore
Sub-category: Document Database
Vendor: MongoDB, Inc.
License: other_open
Year introduced: 2009
Confidence: 0.99
Version strategy: SEPARATE_ENTITY
Version tag: 8.0

Maturity reasoning: MongoDB appears in many job descriptions across backend/data roles and is a standard document database in modern stacks; strong GitHub/community activity and broad cloud vendor support indicate mainstream adoption.

Skill profile (library / DB)

Skill nature: TOOL
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 3
Sub-category id: 27
Extractable: True
Also category: False

Dimensions (API 2 worklist)

  • NoSQL Databases Catalog dimension db id 19

    Library dimension (catalog)

    Roles linked in library: Backend Developer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
NoSQL Databases
nosql-databases
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
DynamoDB Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Amazon DynamoDB id=93 · amazon-dynamodb

Aliases — catalog

  • Amazon DynamoDB (CANONICAL) primary

Context tags (catalog)

AWS SDK DAX GSI LSI NoSQL Streams TTL boto3 conditional writes on-demand capacity partition key provisioned throughput secondary index sort key transactions

Stored enrichment (catalog DB)

Category: Service
Sub-category: Managed Nosql Database Service
Vendor: Amazon Web Services
License: proprietary
Year introduced: 2012
Confidence: 0.98
Version strategy: NOT_APPLICABLE

Maturity reasoning: Commonly listed in cloud/backend job descriptions and widely used on AWS; strong vendor adoption and active ecosystem signal broad market demand.

Skill profile (library / DB)

Skill nature: CLOUD_SERVICE
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 11
Sub-category id: 55
Extractable: True
Also category: False

Dimensions (API 2 worklist)

  • NoSQL Databases Catalog dimension db id 19

    Library dimension (catalog)

    Roles linked in library: Backend Developer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
NoSQL Databases
nosql-databases
Skipped — no persistable v3 meta for new skill
skill_not_in_db_v3_proposed
HBase Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: HBase id=1352 · hbase

Aliases — catalog

  • HBase (CANONICAL)

Context tags (catalog)

Apache Bigtable Hadoop MapReduce NoSQL REST API Thrift column family data model data replication distributed real-time region server scalability table design

Stored enrichment (catalog DB)

Category: Datastore
Sub-category: Wide Column Store
Vendor: Apache Software Foundation
License: apache_2
Year introduced: 2010
Confidence: 0.98
Version strategy: NOT_APPLICABLE

Maturity reasoning: HBase appears in a limited set of big-data/legacy Hadoop job postings, while newer JDs more often specify DynamoDB, Bigtable, or Cassandra; its market demand is specialized rather than broad.

Skill profile (library / DB)

Skill nature: TOOL
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 3
Sub-category id: 31
Extractable: True
Also category: False

Dimensions (API 2 worklist)

  • Cloud Storage and Data Services Catalog dimension db id 144

    Library dimension (catalog)

    Roles linked in library: Cloud Architect

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Cloud Storage and Data Services
cloud-storage-and-data-services
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Scrapy Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category: Web Frameworks
Sub-category: general
Skill nature: TOOL
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED
Selenium Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category: Testing Tools
Sub-category: general
Skill nature: TOOL
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED
Beautiful Soup Primary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category: Web Frameworks
Sub-category: general
Skill nature: TOOL
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

All API 3 persistence rows

Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.

Skill Tag Dimension Skill↔dim Role↔dim Outcome Notes
Python in_db
Cloud Security Scripting & DSL Languages
cloud-security-scripting-dsl-languages
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python in_db
Programming Languages
programming-languages
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python in_db
Programming Languages and Scripting
programming-languages-and-scripting
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python in_db
Programming Languages for Data Work
programming-languages-for-data-work
Existing dimension (library) · Role↔dimension saved
Python in_db
Programming Languages for ML Systems
programming-languages-for-ml-systems
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python in_db
Programming Languages for XR
programming-languages-for-xr
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python in_db
Python Programming
python-programming
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Kafka in_db
Asynchronous Messaging and Event Streaming
asynchronous-messaging-and-event-streaming
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Kafka in_db
Messaging and Background Jobs
messaging-and-background-jobs
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Kafka in_db
Messaging and Event Streaming
messaging-and-event-streaming
Existing dimension (library) · Role↔dimension saved
Apache Spark in_db
ETL and ELT Tooling
etl-and-elt-tooling
Existing dimension (library) · Role↔dimension saved
Airflow in_db
Workflow Orchestration for ML Pipelines
workflow-orchestration-for-ml-pipelines
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
MongoDB in_db
NoSQL Databases
nosql-databases
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
DynamoDB new
NoSQL Databases
nosql-databases
Skipped — no persistable v3 meta for new skill skill_not_in_db_v3_proposed
HBase in_db
Cloud Storage and Data Services
cloud-storage-and-data-services
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
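
Every in_db row in this grid is consistent with one check: the role↔dimension link is saved when the chosen role appears among the roles linked to that dimension in the library, and skipped otherwise. The sketch below encodes that check. It is inferred from the rows on this page, the function name is hypothetical, and it does not cover the separate "no persistable v3 meta" skip seen on the DynamoDB row.

```python
def role_dim_outcome(dimension_roles: list[str], chosen_role: str) -> str:
    """Return the outcome text for one (skill x dimension) work item."""
    if chosen_role in dimension_roles:
        return "Role↔dimension saved"
    return "Role↔dimension skipped (dimension not under chosen role)"

# Kafka worklist from this run: dimension -> roles linked in library
# (the longer role lists are shortened here).
kafka_worklist = {
    "Asynchronous Messaging and Event Streaming": [".NET Backend Developer", "Go Backend Developer"],
    "Messaging and Background Jobs": ["PHP Backend Developer", "Python Backend Developer"],
    "Messaging and Event Streaming": ["Backend Developer", "Data Engineer"],
}
outcomes = {
    dim: role_dim_outcome(roles, "Data Engineer")
    for dim, roles in kafka_worklist.items()
}
```

Only "Messaging and Event Streaming" lists Data Engineer, so it is the only Kafka row marked as saved, as in the grid.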

Library artifacts (this run)

Kind Detail DB id
canonical_skill_proposed Google Dataflow | type=Data Engineering Tools subtype=general nature=PLATFORM lifespan=MULTI_YEAR
canonical_skill_proposed Bigtable | type=Databases subtype=general nature=TOOL lifespan=MULTI_YEAR
canonical_skill_proposed Scrapy | type=Web Frameworks subtype=general nature=TOOL lifespan=MULTI_YEAR
canonical_skill_proposed Selenium | type=Testing Tools subtype=general nature=TOOL lifespan=MULTI_YEAR
canonical_skill_proposed Beautiful Soup | type=Web Frameworks subtype=general nature=TOOL lifespan=MULTI_YEAR
dimension_skill_link_proposed DynamoDB ↔ NoSQL Databases
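
The canonical_skill_proposed details follow a "Name | key=value ..." layout in which a value can contain spaces (for example type=Data Engineering Tools). A small parser for that layout might look like the sketch below. It is an illustration based only on the rows above, and the function name is hypothetical.

```python
def parse_artifact_detail(detail: str) -> dict[str, str]:
    """Split 'Name | k=v k=v ...' into a dict; values may contain spaces."""
    name, _, attrs = detail.partition(" | ")
    parsed = {"name": name.strip()}
    key = None
    for token in attrs.split():
        if "=" in token:
            key, value = token.split("=", 1)
            parsed[key] = value
        elif key is not None:
            # A token without '=' continues the previous value,
            # e.g. "Engineering" and "Tools" after "type=Data".
            parsed[key] += " " + token
    return parsed

row = parse_artifact_detail(
    "Google Dataflow | type=Data Engineering Tools subtype=general "
    "nature=PLATFORM lifespan=MULTI_YEAR"
)
```

The dimension_skill_link_proposed row uses a different "Skill ↔ Dimension" layout and would need its own handling.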
nano JD Parser · gpt-4.1-nano
JD type fail
{
  "JD_type": "fail",
  "archetype_override_applied": true,
  "archetype_override_matched_skills": [
    "Python",
    "MongoDB",
    "NoSQL",
    "normalization",
    "Kafka"
  ],
  "role_archetype": "Engineering"
}
API 1 · extract-from-jd
{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Python"
    },
    {
      "is_primary": true,
      "skill_name": "Kafka"
    },
    {
      "is_primary": true,
      "skill_name": "Apache Spark"
    },
    {
      "is_primary": true,
      "skill_name": "Google Dataflow"
    },
    {
      "is_primary": true,
      "skill_name": "Airflow"
    },
    {
      "is_primary": true,
      "skill_name": "Bigtable"
    },
    {
      "is_primary": true,
      "skill_name": "MongoDB"
    },
    {
      "is_primary": true,
      "skill_name": "DynamoDB"
    },
    {
      "is_primary": true,
      "skill_name": "HBase"
    },
    {
      "is_primary": true,
      "skill_name": "Scrapy"
    },
    {
      "is_primary": true,
      "skill_name": "Selenium"
    },
    {
      "is_primary": true,
      "skill_name": "Beautiful Soup"
    }
  ],
  "jd_role": null,
  "nano_parsed": {
    "JD_type": "fail",
    "archetype_override_applied": true,
    "archetype_override_matched_skills": [
      "Python",
      "MongoDB",
      "NoSQL",
      "normalization",
      "Kafka"
    ],
    "role_archetype": "Engineering"
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "bae69e34-7cea-4d40-93b7-a7eea576fb78",
  "stage3_signals": {
    "alias_found": false,
    "alias_match_roles": [],
    "kra_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": [
          {
            "kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
            "sentence": "Engineer \u2013 Data Onboarding:\u00a0(Priority 1\u00a0\u2013 Multiple positions)6+ years of development experience in\u00a0development\u00a0Python.Data streaming / Data processing\u00a0technologies like\u00a0Kafka, Spark, Google Dataflow, AirflowExperience with multiple\u00a0NoSQL databases (BigTable, MongoDB, DynamoDB, HBaseExperience working with data \u2013\u00a0data cleansing, normalization, translation\u00a0or similar concepts.Requriment 2",
            "similarity": 0.5721
          },
          {
            "kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
            "sentence": "Software Engineer \u2013 Data Collection:\u00a0(Priority 2\u00a0\u2013 Multiple positions)3+ years of experience in programming Language:\u00a0PythonWeb Crawling or Web\u00a0Scraping Tools: Scrapy, Selenium, and Beautiful soup.Recent work experience in\u00a0web scrapping / crawling.",
            "similarity": 0.3462
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 0.4592,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "ML Engineer",
        "kra_matches": [
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "Engineer \u2013 Data Onboarding:\u00a0(Priority 1\u00a0\u2013 Multiple positions)6+ years of development experience in\u00a0development\u00a0Python.Data streaming / Data processing\u00a0technologies like\u00a0Kafka, Spark, Google Dataflow, AirflowExperience with multiple\u00a0NoSQL databases (BigTable, MongoDB, DynamoDB, HBaseExperience working with data \u2013\u00a0data cleansing, normalization, translation\u00a0or similar concepts.Requriment 2",
            "similarity": 0.4177
          },
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "Software Engineer \u2013 Data Collection:\u00a0(Priority 2\u00a0\u2013 Multiple positions)3+ years of experience in programming Language:\u00a0PythonWeb Crawling or Web\u00a0Scraping Tools: Scrapy, Selenium, and Beautiful soup.Recent work experience in\u00a0web scrapping / crawling.",
            "similarity": 0.2909
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 3,
        "score": 0.3543,
        "slug": "ml-engineer",
        "total_count": null
      },
      {
        "display_name": "Svelte Frontend Developer",
        "kra_matches": [
          {
            "kra_text": "backend data integration",
            "sentence": "Engineer \u2013 Data Onboarding:\u00a0(Priority 1\u00a0\u2013 Multiple positions)6+ years of development experience in\u00a0development\u00a0Python.Data streaming / Data processing\u00a0technologies like\u00a0Kafka, Spark, Google Dataflow, AirflowExperience with multiple\u00a0NoSQL databases (BigTable, MongoDB, DynamoDB, HBaseExperience working with data \u2013\u00a0data cleansing, normalization, translation\u00a0or similar concepts.Requriment 2",
            "similarity": 0.4174
          },
          {
            "kra_text": "backend data integration",
            "sentence": "Software Engineer \u2013 Data Collection:\u00a0(Priority 2\u00a0\u2013 Multiple positions)3+ years of experience in programming Language:\u00a0PythonWeb Crawling or Web\u00a0Scraping Tools: Scrapy, Selenium, and Beautiful soup.Recent work experience in\u00a0web scrapping / crawling.",
            "similarity": 0.2793
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 92,
        "score": 0.3483,
        "slug": "svelte-frontend-developer",
        "total_count": null
      },
      {
        "display_name": "Fullstack Developer",
        "kra_matches": [
          {
            "kra_text": "Designs and queries relational databases like PostgreSQL and document stores like MongoDB, writing migrations, indexes, and optimized queries.",
            "sentence": "Engineer \u2013 Data Onboarding:\u00a0(Priority 1\u00a0\u2013 Multiple positions)6+ years of development experience in\u00a0development\u00a0Python.Data streaming / Data processing\u00a0technologies like\u00a0Kafka, Spark, Google Dataflow, AirflowExperience with multiple\u00a0NoSQL databases (BigTable, MongoDB, DynamoDB, HBaseExperience working with data \u2013\u00a0data cleansing, normalization, translation\u00a0or similar concepts.Requriment 2",
            "similarity": 0.3934
          },
          {
            "kra_text": "Implements complete product features end-to-end from database schema design through backend API to frontend UI using JavaScript, TypeScript, Python, or Ruby on Rails.",
            "sentence": "Software Engineer \u2013 Data Collection:\u00a0(Priority 2\u00a0\u2013 Multiple positions)3+ years of experience in programming Language:\u00a0PythonWeb Crawling or Web\u00a0Scraping Tools: Scrapy, Selenium, and Beautiful soup.Recent work experience in\u00a0web scrapping / crawling.",
            "similarity": 0.295
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 15,
        "score": 0.3442,
        "slug": "full-stack-engineer",
        "total_count": null
      },
      {
        "display_name": "AI Engineer",
        "kra_matches": [
          {
            "kra_text": "Designs and implements prompt engineering workflows, few-shot examples, chain-of-thought patterns, and structured output parsing for AI feature pipelines.",
            "sentence": "Engineer \u2013 Data Onboarding:\u00a0(Priority 1\u00a0\u2013 Multiple positions)6+ years of development experience in\u00a0development\u00a0Python.Data streaming / Data processing\u00a0technologies like\u00a0Kafka, Spark, Google Dataflow, AirflowExperience with multiple\u00a0NoSQL databases (BigTable, MongoDB, DynamoDB, HBaseExperience working with data \u2013\u00a0data cleansing, normalization, translation\u00a0or similar concepts.Requriment 2",
            "similarity": 0.3705
          },
          {
            "kra_text": "Designs and implements prompt engineering workflows, few-shot examples, chain-of-thought patterns, and structured output parsing for AI feature pipelines.",
            "sentence": "Software Engineer \u2013 Data Collection:\u00a0(Priority 2\u00a0\u2013 Multiple positions)3+ years of experience in programming Language:\u00a0PythonWeb Crawling or Web\u00a0Scraping Tools: Scrapy, Selenium, and Beautiful soup.Recent work experience in\u00a0web scrapping / crawling.",
            "similarity": 0.3146
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 13,
        "score": 0.3425,
        "slug": "ai-engineer",
        "total_count": null
      }
    ],
    "skill_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": 3,
        "matched_skills": [
          "Apache Spark",
          "Kafka",
          "Python"
        ],
        "role_id": 2,
        "score": 0.25,
        "slug": "data-engineer",
        "total_count": 12
      },
      {
        "display_name": "Backend Developer",
        "kra_matches": null,
        "matched_count": 3,
        "matched_skills": [
          "Kafka",
          "MongoDB",
          "Python"
        ],
        "role_id": 1,
        "score": 0.25,
        "slug": "backend-engineer",
        "total_count": 12
      },
      {
        "display_name": "MLOps Engineer",
        "kra_matches": null,
        "matched_count": 2,
        "matched_skills": [
          "Airflow",
          "Python"
        ],
        "role_id": 16,
        "score": 0.1667,
        "slug": "ml-ops-engineer",
        "total_count": 12
      },
      {
        "display_name": "ML Engineer",
        "kra_matches": null,
        "matched_count": 2,
        "matched_skills": [
          "Airflow",
          "Python"
        ],
        "role_id": 3,
        "score": 0.1667,
        "slug": "ml-engineer",
        "total_count": 12
      },
      {
        "display_name": "Python Backend Developer",
        "kra_matches": null,
        "matched_count": 2,
        "matched_skills": [
          "Kafka",
          "Python"
        ],
        "role_id": 80,
        "score": 0.1667,
        "slug": "python-backend-developer",
        "total_count": 12
      }
    ]
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "DOMAIN",
    "chosen_role": {
      "display_name": "Data Engineer",
      "kra_matches": null,
      "matched_count": null,
      "matched_skills": null,
      "role_id": 2,
      "score": 0.92,
      "slug": "data-engineer",
      "total_count": null
    },
    "confidence": 0.92,
    "is_new_role": false,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "matched_dimensions": [
      "Data Onboarding",
      "Data Streaming and Processing",
      "NoSQL Data Engineering",
      "Data Collection and Web Scraping",
      "Data Cleansing and Normalization"
    ],
    "matched_kras": [
      "Develop data onboarding pipelines",
      "Build data streaming / data processing technologies",
      "Work with multiple NoSQL databases",
      "Perform data cleansing, normalization, translation",
      "Develop web crawling / web scraping tools"
    ],
    "matched_skills": [
      "Python",
      "Kafka",
      "Spark",
      "Google Dataflow",
      "Airflow",
      "BigTable",
      "MongoDB",
      "DynamoDB",
      "HBase",
      "Scrapy",
      "Selenium",
      "beautiful soup",
      "web scraping",
      "web crawling",
      "data cleansing"
    ],
    "new_role_display_name": null,
    "new_role_slug": null,
    "queued": false,
    "reasoning": "Domain=Data Engineering \u0026 Analytics; The JD centers on Python-based data onboarding and collection with Kafka, Spark, Dataflow, Airflow, NoSQL databases, and data cleansing/normalization, which best fits a Data Engineer role.",
    "sub_role": null
  },
  "stage5_updates": {
    "centroid_n_after": 81,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": null,
    "new_skills_attached": [
      {
        "is_primary": true,
        "queue_id": 5240,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Google Dataflow",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 5241,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Bigtable",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 5242,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "DynamoDB",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 5243,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Scrapy",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 5244,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Selenium",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 5245,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Beautiful Soup",
        "status": "pending"
      }
    ],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}
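Note on the scores above: the pipeline's scoring code is not part of this dump, so the following is only a sketch under stated assumptions. Each `skill_match_roles` score is consistent with `matched_count / total_count` rounded to four places (3/12 = 0.25, 2/12 = 0.1667), and the AI Engineer KRA score of 0.3425 is close to the mean of its two `similarity` values. The function names below are hypothetical and used only for illustration.

```python
from statistics import mean


def skill_overlap_score(matched_count: int, total_count: int) -> float:
    """Assumed formula: share of the JD's skills that a role matched."""
    if total_count <= 0:
        return 0.0
    return round(matched_count / total_count, 4)


def kra_score(similarities: list[float]) -> float:
    """Assumed formula: plain mean of the per-sentence KRA similarities."""
    return mean(similarities) if similarities else 0.0


# Values copied from the run above.
data_engineer = skill_overlap_score(3, 12)   # Apache Spark, Kafka, Python
mlops_engineer = skill_overlap_score(2, 12)  # Airflow, Python
ai_engineer_kra = kra_score([0.3705, 0.3146])

print(data_engineer, mlops_engineer, ai_engineer_kra)
```

If these assumptions hold, Data Engineer and Backend Developer tie at 0.25 on skill overlap alone, so overlap does not separate them. The `stage4_decision` confidence of 0.92 is reported separately, and nothing in this dump shows how it is derived.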
API 2 — extract-details
{
  "alias_matches": [
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 67,
      "existing_alias_text": "Python",
      "input_term": "Python",
      "matched_canonical": {
        "category_id": 6,
        "display_name": "Python",
        "id": 5,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "python",
        "sub_category_id": 96,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 173,
      "existing_alias_text": "Kafka",
      "input_term": "Kafka",
      "matched_canonical": {
        "category_id": 3,
        "display_name": "Kafka",
        "id": 36,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "kafka",
        "sub_category_id": 3533,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 2004,
      "existing_alias_text": "Apache Spark",
      "input_term": "Apache Spark",
      "matched_canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 526,
      "existing_alias_text": "Airflow",
      "input_term": "Airflow",
      "matched_canonical": {
        "category_id": 13,
        "display_name": "Airflow",
        "id": 265,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "airflow",
        "sub_category_id": 130,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 232,
      "existing_alias_text": "MongoDB",
      "input_term": "MongoDB",
      "matched_canonical": {
        "category_id": 3,
        "display_name": "MongoDB",
        "id": 91,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "mongodb",
        "sub_category_id": 27,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
      "alias_persisted": false,
      "existing_alias_id": null,
      "existing_alias_text": null,
      "input_term": "DynamoDB",
      "matched_canonical": {
        "category_id": 11,
        "display_name": "Amazon DynamoDB",
        "id": 93,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CLOUD_SERVICE",
        "slug": "amazon-dynamodb",
        "sub_category_id": 55,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "embedding_display_name"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 2011,
      "existing_alias_text": "HBase",
      "input_term": "HBase",
      "matched_canonical": {
        "category_id": 3,
        "display_name": "HBase",
        "id": 1352,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "hbase",
        "sub_category_id": 31,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    }
  ],
  "candidate_roles": [
    {
      "display_name": "Cloud Security Engineer",
      "id": 23,
      "rationale": null,
      "role_archetype": null,
      "slug": "cloud-security-engineer",
      "source": "db"
    },
    {
      "display_name": "Backend Developer",
      "id": 1,
      "rationale": null,
      "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
      "slug": "backend-engineer",
      "source": "db"
    },
    {
      "display_name": "Fullstack Developer",
      "id": 15,
      "rationale": null,
      "role_archetype": null,
      "slug": "full-stack-engineer",
      "source": "db"
    },
    {
      "display_name": "Cyber Security Engineer",
      "id": 5,
      "rationale": null,
      "role_archetype": null,
      "slug": "cybersecurity-engineer",
      "source": "db"
    },
    {
      "display_name": "Data Engineer",
      "id": 2,
      "rationale": null,
      "role_archetype": null,
      "slug": "data-engineer",
      "source": "db"
    },
    {
      "display_name": "ML Engineer",
      "id": 3,
      "rationale": null,
      "role_archetype": null,
      "slug": "ml-engineer",
      "source": "db"
    },
    {
      "display_name": "MLOps Engineer",
      "id": 16,
      "rationale": null,
      "role_archetype": null,
      "slug": "ml-ops-engineer",
      "source": "db"
    },
    {
      "display_name": "AR/VR Engineer",
      "id": 8,
      "rationale": null,
      "role_archetype": null,
      "slug": "ar-vr-engineer",
      "source": "db"
    },
    {
      "display_name": "Python Backend Developer",
      "id": 80,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "python-backend-developer",
      "source": "db"
    },
    {
      "display_name": ".NET Backend Developer",
      "id": 83,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "dotnet-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Go Backend Developer",
      "id": 81,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "go-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Kotlin Backend Developer",
      "id": 84,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "kotlin-server-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Node.js Backend Developer",
      "id": 82,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "node-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Scala Backend Developer",
      "id": 87,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "scala-backend-developer",
      "source": "db"
    },
    {
      "display_name": "PHP Backend Developer",
      "id": 86,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "php-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Ruby Backend Developer",
      "id": 85,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "ruby-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Cloud Architect",
      "id": 9,
      "rationale": null,
      "role_archetype": null,
      "slug": "cloud-architect",
      "source": "db"
    }
  ],
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Domain=Data Engineering \u0026 Analytics; The JD centers on Python-based data onboarding and collection with Kafka, Spark, Dataflow, Airflow, NoSQL databases, and data cleansing/normalization, which best fits a Data Engineer role.",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "dimensions": [
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Security Scripting \u0026 DSL Languages",
        "id": 248,
        "rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
        "slug": "cloud-security-scripting-dsl-languages",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Security Engineer",
          "id": 23,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-security-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages",
        "id": 1,
        "rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
        "slug": "programming-languages",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Fullstack Developer",
          "id": 15,
          "rationale": null,
          "role_archetype": null,
          "slug": "full-stack-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages and Scripting",
        "id": 59,
        "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
        "slug": "programming-languages-and-scripting",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cyber Security Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for Data Work",
        "id": 21,
        "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
        "slug": "programming-languages-for-data-work",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for ML Systems",
        "id": 39,
        "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
        "slug": "programming-languages-for-ml-systems",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for XR",
        "id": 97,
        "rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
        "slug": "programming-languages-for-xr",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "AR/VR Engineer",
          "id": 8,
          "rationale": null,
          "role_archetype": null,
          "slug": "ar-vr-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Python Programming",
        "id": 290,
        "rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
        "slug": "python-programming",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Python Backend Developer",
          "id": 80,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "python-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Asynchronous Messaging and Event Streaming",
        "id": 297,
        "rationale": "Asynchronous communication patterns and broker technologies used to decouple backend services and move work off the request path. Includes queues, pub/sub, event streams, consumer groups, dead-letter queues, and delivery semantics across systems such as Kafka, RabbitMQ, NATS, SQS/SNS, Pulsar, and ActiveMQ.",
        "slug": "asynchronous-messaging-and-event-streaming",
        "source": "db"
      },
      "input_skill": "Kafka",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": ".NET Backend Developer",
          "id": 83,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "dotnet-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Go Backend Developer",
          "id": 81,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "go-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Kotlin Backend Developer",
          "id": 84,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "kotlin-server-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Node.js Backend Developer",
          "id": 82,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "node-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Scala Backend Developer",
          "id": 87,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "scala-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Messaging and Background Jobs",
        "id": 291,
        "rationale": "Asynchronous processing patterns and worker systems used to decouple backend work from request handling. This is a coherent cluster because the role supports background jobs, retries, and deferred processing.",
        "slug": "messaging-and-background-jobs",
        "source": "db"
      },
      "input_skill": "Kafka",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "PHP Backend Developer",
          "id": 86,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "php-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Python Backend Developer",
          "id": 80,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "python-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Ruby Backend Developer",
          "id": 85,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "ruby-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Messaging and Event Streaming",
        "id": 8,
        "rationale": "Transport-layer systems used to move events and decouple producers from consumers. Data engineers use these systems to ingest, buffer, and distribute event data before downstream processing.",
        "slug": "messaging-and-event-streaming",
        "source": "db"
      },
      "input_skill": "Kafka",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "ETL and ELT Tooling",
        "id": 24,
        "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
        "slug": "etl-and-elt-tooling",
        "source": "db"
      },
      "input_skill": "Apache Spark",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Workflow Orchestration for ML Pipelines",
        "id": 54,
        "rationale": "Workflow engines used to coordinate training, evaluation, deployment, and retraining jobs. This cluster covers dependencies, retries, scheduling, and pipeline composition for ML lifecycle automation.",
        "slug": "workflow-orchestration-for-ml-pipelines",
        "source": "db"
      },
      "input_skill": "Airflow",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "NoSQL Databases",
        "id": 19,
        "rationale": "Models and manages data using non-relational database systems.",
        "slug": "nosql-databases",
        "source": "db"
      },
      "input_skill": "MongoDB",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "NoSQL Databases",
        "id": 19,
        "rationale": "Models and manages data using non-relational database systems.",
        "slug": "nosql-databases",
        "source": "db"
      },
      "input_skill": "DynamoDB",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Storage and Data Services",
        "id": 144,
        "rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
        "slug": "cloud-storage-and-data-services",
        "source": "db"
      },
      "input_skill": "HBase",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        }
      ]
    }
  ],
  "input_final_skills": [
    "Python",
    "Kafka",
    "Apache Spark",
    "Google Dataflow",
    "Airflow",
    "Bigtable",
    "MongoDB",
    "DynamoDB",
    "HBase",
    "Scrapy",
    "Selenium",
    "Beautiful Soup"
  ],
  "input_llm_skills": [
    "Python",
    "Kafka",
    "Apache Spark",
    "Google Dataflow",
    "Airflow",
    "Bigtable",
    "MongoDB",
    "DynamoDB",
    "HBase",
    "Scrapy",
    "Selenium",
    "Beautiful Soup"
  ],
  "new_aliases_persisted": 0,
  "run_id": "bae69e34-7cea-4d40-93b7-a7eea576fb78",
  "skills_detail": [
    {
      "aliases_in_db": [
        {
          "alias_text": "Python",
          "alias_type": "CANONICAL",
          "id": 67,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 2",
          "alias_type": "VERSION",
          "id": 72,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 2.x",
          "alias_type": "VERSION",
          "id": 74,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3",
          "alias_type": "VERSION",
          "id": 73,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.10",
          "alias_type": "VERSION",
          "id": 76,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.11",
          "alias_type": "VERSION",
          "id": 77,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.12",
          "alias_type": "VERSION",
          "id": 78,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.x",
          "alias_type": "VERSION",
          "id": 75,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "py",
          "alias_type": "VERSION",
          "id": 2183,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "py2",
          "alias_type": "VERSION",
          "id": 68,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "py3",
          "alias_type": "VERSION",
          "id": 69,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python 3",
          "alias_type": "VERSION",
          "id": 2186,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python 3.x",
          "alias_type": "VERSION",
          "id": 2849,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python2",
          "alias_type": "VERSION",
          "id": 70,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python3",
          "alias_type": "VERSION",
          "id": 71,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python3.x",
          "alias_type": "VERSION",
          "id": 2848,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 6,
        "display_name": "Python",
        "id": 5,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "python",
        "sub_category_id": 96,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Security Scripting \u0026 DSL Languages",
            "id": 248,
            "rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
            "slug": "cloud-security-scripting-dsl-languages",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Security Engineer",
              "id": 23,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-security-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages",
            "id": 1,
            "rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
            "slug": "programming-languages",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Fullstack Developer",
              "id": 15,
              "rationale": null,
              "role_archetype": null,
              "slug": "full-stack-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages and Scripting",
            "id": 59,
            "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
            "slug": "programming-languages-and-scripting",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cyber Security Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for Data Work",
            "id": 21,
            "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
            "slug": "programming-languages-for-data-work",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for ML Systems",
            "id": 39,
            "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
            "slug": "programming-languages-for-ml-systems",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for XR",
            "id": 97,
            "rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
            "slug": "programming-languages-for-xr",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "AR/VR Engineer",
              "id": 8,
              "rationale": null,
              "role_archetype": null,
              "slug": "ar-vr-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Python Programming",
            "id": 290,
            "rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
            "slug": "python-programming",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Python Backend Developer",
              "id": 80,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "python-backend-developer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Python",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Kafka",
          "alias_type": "CANONICAL",
          "id": 173,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 3,
        "display_name": "Kafka",
        "id": 36,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "kafka",
        "sub_category_id": 3533,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Asynchronous Messaging and Event Streaming",
            "id": 297,
            "rationale": "Asynchronous communication patterns and broker technologies used to decouple backend services and move work off the request path. Includes queues, pub/sub, event streams, consumer groups, dead-letter queues, and delivery semantics across systems such as Kafka, RabbitMQ, NATS, SQS/SNS, Pulsar, and ActiveMQ.",
            "slug": "asynchronous-messaging-and-event-streaming",
            "source": "db"
          },
          "input_skill": "Kafka",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": ".NET Backend Developer",
              "id": 83,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "dotnet-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Go Backend Developer",
              "id": 81,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "go-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Kotlin Backend Developer",
              "id": 84,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "kotlin-server-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Node.js Backend Developer",
              "id": 82,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "node-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Scala Backend Developer",
              "id": 87,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "scala-backend-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Messaging and Background Jobs",
            "id": 291,
            "rationale": "Asynchronous processing patterns and worker systems used to decouple backend work from request handling. This is a coherent cluster because the role supports background jobs, retries, and deferred processing.",
            "slug": "messaging-and-background-jobs",
            "source": "db"
          },
          "input_skill": "Kafka",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "PHP Backend Developer",
              "id": 86,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "php-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Python Backend Developer",
              "id": 80,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "python-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Ruby Backend Developer",
              "id": 85,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "ruby-backend-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Messaging and Event Streaming",
            "id": 8,
            "rationale": "Transport-layer systems used to move events and decouple producers from consumers. Data engineers use these systems to ingest, buffer, and distribute event data before downstream processing.",
            "slug": "messaging-and-event-streaming",
            "source": "db"
          },
          "input_skill": "Kafka",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Kafka",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Apache Spark",
          "alias_type": "CANONICAL",
          "id": 2004,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "apache spark 3",
          "alias_type": "VERSION",
          "id": 2006,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark",
          "alias_type": "VERSION",
          "id": 2510,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3",
          "alias_type": "VERSION",
          "id": 2007,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3.x",
          "alias_type": "VERSION",
          "id": 2009,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark3",
          "alias_type": "VERSION",
          "id": 2008,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "ETL and ELT Tooling",
            "id": 24,
            "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
            "slug": "etl-and-elt-tooling",
            "source": "db"
          },
          "input_skill": "Apache Spark",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Apache Spark",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Google Dataflow",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PLATFORM",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "google-dataflow",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Airflow",
          "alias_type": "CANONICAL",
          "id": 526,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "airflow 2",
          "alias_type": "VERSION",
          "id": 2477,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "airflow-2",
          "alias_type": "VERSION",
          "id": 2478,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "airflow2",
          "alias_type": "VERSION",
          "id": 2476,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "airflow2.x",
          "alias_type": "VERSION",
          "id": 2479,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "apache airflow 2",
          "alias_type": "VERSION",
          "id": 2480,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 13,
        "display_name": "Airflow",
        "id": 265,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "airflow",
        "sub_category_id": 130,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Workflow Orchestration for ML Pipelines",
            "id": 54,
            "rationale": "Workflow engines used to coordinate training, evaluation, deployment, and retraining jobs. This cluster covers dependencies, retries, scheduling, and pipeline composition for ML lifecycle automation.",
            "slug": "workflow-orchestration-for-ml-pipelines",
            "source": "db"
          },
          "input_skill": "Airflow",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Airflow",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Bigtable",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Databases",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "bigtable",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "MongoDB",
          "alias_type": "CANONICAL",
          "id": 232,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 2.0",
          "alias_type": "VERSION",
          "id": 238,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 2.2",
          "alias_type": "VERSION",
          "id": 239,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 2.4",
          "alias_type": "VERSION",
          "id": 240,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 2.6",
          "alias_type": "VERSION",
          "id": 241,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 3.0",
          "alias_type": "VERSION",
          "id": 242,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 3.2",
          "alias_type": "VERSION",
          "id": 243,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 3.4",
          "alias_type": "VERSION",
          "id": 244,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 3.6",
          "alias_type": "VERSION",
          "id": 245,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 4",
          "alias_type": "VERSION",
          "id": 233,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 4.0",
          "alias_type": "VERSION",
          "id": 246,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 4.2",
          "alias_type": "VERSION",
          "id": 247,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 4.4",
          "alias_type": "VERSION",
          "id": 248,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 5",
          "alias_type": "VERSION",
          "id": 234,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 5.0",
          "alias_type": "VERSION",
          "id": 249,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 6",
          "alias_type": "VERSION",
          "id": 235,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 6.0",
          "alias_type": "VERSION",
          "id": 250,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 7",
          "alias_type": "VERSION",
          "id": 236,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 7.0",
          "alias_type": "VERSION",
          "id": 251,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 8",
          "alias_type": "VERSION",
          "id": 237,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "MongoDB 8.0",
          "alias_type": "VERSION",
          "id": 252,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 3,
        "display_name": "MongoDB",
        "id": 91,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "mongodb",
        "sub_category_id": 27,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "NoSQL Databases",
            "id": 19,
            "rationale": "Models and manages data using non-relational database systems.",
            "slug": "nosql-databases",
            "source": "db"
          },
          "input_skill": "MongoDB",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "MongoDB",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Amazon DynamoDB",
          "alias_type": "CANONICAL",
          "id": 254,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 11,
        "display_name": "Amazon DynamoDB",
        "id": 93,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CLOUD_SERVICE",
        "slug": "amazon-dynamodb",
        "sub_category_id": 55,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "NoSQL Databases",
            "id": 19,
            "rationale": "Models and manages data using non-relational database systems.",
            "slug": "nosql-databases",
            "source": "db"
          },
          "input_skill": "DynamoDB",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "DynamoDB",
      "matched_via": "embedding_display_name",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "HBase",
          "alias_type": "CANONICAL",
          "id": 2011,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 3,
        "display_name": "HBase",
        "id": 1352,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "hbase",
        "sub_category_id": 31,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Storage and Data Services",
            "id": 144,
            "rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
            "slug": "cloud-storage-and-data-services",
            "source": "db"
          },
          "input_skill": "HBase",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "HBase",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Scrapy",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Web Frameworks",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "scrapy",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Selenium",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Testing Tools",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "selenium",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Beautiful Soup",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Web Frameworks",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "beautiful-soup",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    }
  ],
  "unmatched_skills": [
    "Google Dataflow",
    "Bigtable",
    "Scrapy",
    "Selenium",
    "Beautiful Soup"
  ]
}
API 3 — final-role-output
{
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Domain=Data Engineering \u0026 Analytics; The JD centers on Python-based data onboarding and collection with Kafka, Spark, Dataflow, Airflow, NoSQL databases, and data cleansing/normalization, which best fits a Data Engineer role.",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "chosen_role_resolution": "in_db",
  "final_input_skills": [
    {
      "skill": "Python",
      "tag": "in_db"
    },
    {
      "skill": "Kafka",
      "tag": "in_db"
    },
    {
      "skill": "Apache Spark",
      "tag": "in_db"
    },
    {
      "skill": "Google Dataflow",
      "tag": "new"
    },
    {
      "skill": "Airflow",
      "tag": "in_db"
    },
    {
      "skill": "Bigtable",
      "tag": "new"
    },
    {
      "skill": "MongoDB",
      "tag": "in_db"
    },
    {
      "skill": "DynamoDB",
      "tag": "in_db"
    },
    {
      "skill": "HBase",
      "tag": "in_db"
    },
    {
      "skill": "Scrapy",
      "tag": "new"
    },
    {
      "skill": "Selenium",
      "tag": "new"
    },
    {
      "skill": "Beautiful Soup",
      "tag": "new"
    }
  ],
  "llm_cost_api1_usd": null,
  "llm_cost_api2_usd": null,
  "llm_cost_api3_usd": null,
  "llm_cost_total_usd": null,
  "persistence": {
    "items": [
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Security Scripting \u0026 DSL Languages",
          "id": 248,
          "rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
          "slug": "cloud-security-scripting-dsl-languages",
          "source": "db"
        },
        "dimension_id": 248,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Security Engineer",
            "id": 23,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-security-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages",
          "id": 1,
          "rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
          "slug": "programming-languages",
          "source": "db"
        },
        "dimension_id": 1,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Fullstack Developer",
            "id": 15,
            "rationale": null,
            "role_archetype": null,
            "slug": "full-stack-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages and Scripting",
          "id": 59,
          "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
          "slug": "programming-languages-and-scripting",
          "source": "db"
        },
        "dimension_id": 59,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cyber Security Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for Data Work",
          "id": 21,
          "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
          "slug": "programming-languages-for-data-work",
          "source": "db"
        },
        "dimension_id": 21,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for ML Systems",
          "id": 39,
          "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
          "slug": "programming-languages-for-ml-systems",
          "source": "db"
        },
        "dimension_id": 39,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for XR",
          "id": 97,
          "rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
          "slug": "programming-languages-for-xr",
          "source": "db"
        },
        "dimension_id": 97,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "AR/VR Engineer",
            "id": 8,
            "rationale": null,
            "role_archetype": null,
            "slug": "ar-vr-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Python Programming",
          "id": 290,
          "rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
          "slug": "python-programming",
          "source": "db"
        },
        "dimension_id": 290,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Python Backend Developer",
            "id": 80,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "python-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Asynchronous Messaging and Event Streaming",
          "id": 297,
          "rationale": "Asynchronous communication patterns and broker technologies used to decouple backend services and move work off the request path. Includes queues, pub/sub, event streams, consumer groups, dead-letter queues, and delivery semantics across systems such as Kafka, RabbitMQ, NATS, SQS/SNS, Pulsar, and ActiveMQ.",
          "slug": "asynchronous-messaging-and-event-streaming",
          "source": "db"
        },
        "dimension_id": 297,
        "input_skill": "Kafka",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": ".NET Backend Developer",
            "id": 83,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "dotnet-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Go Backend Developer",
            "id": 81,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "go-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Kotlin Backend Developer",
            "id": 84,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "kotlin-server-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Node.js Backend Developer",
            "id": 82,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "node-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Scala Backend Developer",
            "id": 87,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "scala-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 36,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Messaging and Background Jobs",
          "id": 291,
          "rationale": "Asynchronous processing patterns and worker systems used to decouple backend work from request handling. This is a coherent cluster because the role supports background jobs, retries, and deferred processing.",
          "slug": "messaging-and-background-jobs",
          "source": "db"
        },
        "dimension_id": 291,
        "input_skill": "Kafka",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "PHP Backend Developer",
            "id": 86,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "php-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Python Backend Developer",
            "id": 80,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "python-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Ruby Backend Developer",
            "id": 85,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "ruby-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 36,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Messaging and Event Streaming",
          "id": 8,
          "rationale": "Transport-layer systems used to move events and decouple producers from consumers. Data engineers use these systems to ingest, buffer, and distribute event data before downstream processing.",
          "slug": "messaging-and-event-streaming",
          "source": "db"
        },
        "dimension_id": 8,
        "input_skill": "Kafka",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 36,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "ETL and ELT Tooling",
          "id": 24,
          "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
          "slug": "etl-and-elt-tooling",
          "source": "db"
        },
        "dimension_id": 24,
        "input_skill": "Apache Spark",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1350,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Workflow Orchestration for ML Pipelines",
          "id": 54,
          "rationale": "Workflow engines used to coordinate training, evaluation, deployment, and retraining jobs. This cluster covers dependencies, retries, scheduling, and pipeline composition for ML lifecycle automation.",
          "slug": "workflow-orchestration-for-ml-pipelines",
          "source": "db"
        },
        "dimension_id": 54,
        "input_skill": "Airflow",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 265,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "NoSQL Databases",
          "id": 19,
          "rationale": "Models and manages data using non-relational database systems.",
          "slug": "nosql-databases",
          "source": "db"
        },
        "dimension_id": 19,
        "input_skill": "MongoDB",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 91,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "NoSQL Databases",
          "id": 19,
          "rationale": "Models and manages data using non-relational database systems.",
          "slug": "nosql-databases",
          "source": "db"
        },
        "dimension_id": 19,
        "input_skill": "DynamoDB",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": false,
        "skill_id": null,
        "skill_tag": "new",
        "skipped_reason": "skill_not_in_db_v3_proposed"
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Storage and Data Services",
          "id": 144,
          "rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
          "slug": "cloud-storage-and-data-services",
          "source": "db"
        },
        "dimension_id": 144,
        "input_skill": "HBase",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1352,
        "skill_tag": "in_db",
        "skipped_reason": null
      }
    ],
    "new_skills_created": 0,
    "role_dimension_saved": 0,
    "skill_dimension_saved": 0,
    "skipped": 1
  },
  "planner_output": null,
  "run_id": "bae69e34-7cea-4d40-93b7-a7eea576fb78"
}

LLM Calls

Every model call made for this run, in pipeline order. Click a card to see the model's response.
