← Back to history

Pipeline run

4dab0597-abaf-4f4b-a68c-d1493be026db

Pipeline LLM cost (USD)
API 1: $0.0037 API 2: $0.0003 API 3: $0.0000 Total: $0.0040

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description
role baseline loaded sources · ai_index: jd · nature_of_work: jd · tech_stack_maturity: jd
Nature of work · Data pipeline development
Designs and maintains Python/Databricks/PySpark data pipelines and ETL flows, cleans and validates data for quality/integrity, and troubleshoots issues while working with cross-functional teams.
"Create and maintain data pipelines to ensure efficient data flow."
Tech stack maturity
Modern Cloud Native
Databricks and Python are strongly associated with contemporary cloud-based data engineering platforms and modern distributed processing workflows.
AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)
0.50 / 5
· Title match
Has AI skill
· AI skill (primary)
AI skill (secondary)
· On AI team
· Builds AI products
vocab breakdown (legacy)
Assistants (×1):
Frameworks (×2):
Models / concepts (×3): Machine Learning
Evidence — skills matched in JD (14)
Python Databricks PySpark Tableau Power BI Machine Learning Linear Regression Logistic Regression Decision Trees Clustering Data Visualization ETL Data Pipelines Data Quality
Skill cluster (5 dimension groups, role-scoped)
BI and Visualization Tools
Tableau Power BI
AI Governance and Model Security
Machine Learning
Performance and Cost Optimization
Clustering
Programming Languages for Data Work
Python
Cross-cutting / unaligned
Databricks PySpark Linear Regression Logistic Regression Decision Trees Data Visualization ETL Data Pipelines Data Quality
Show KRA description ↓
As a Data Engineer, you will design, develop, and maintain data solutions for data generation, collection, and processing. You will create data pipelines, ensure data quality, and implement ETL processes to migrate and deploy data across systems. Your typical day will involve designing and developing data solutions, collaborating with cross-functional teams, and ensuring data integrity and quality. - Expected to perform independently and become an SME. - Required active participation/contribution in team discussions. - Contribute in providing solutions to work-related problems. - Design and develop data solutions for data generation, collection, and processing. - Create and maintain data pipelines to ensure efficient data flow. - Implement ETL processes to migrate and deploy data across systems. - Collaborate with cross-functional teams to understand data requirements and provide technical expertise. - Ensure data quality and integrity by implementing data validation and cleansing techniques. - Optimize data solutions for performance and scalability. - Troubleshoot and resolve data-related issues and provide technical support. - Stay updated with industry trends and best practices in data engineering. - Train and mentor junior data engineers to enhance their skills and knowledge. - Must To Have Skills: Proficiency in Python (Programming Language). - Experience with Databricks Unified Data Analytics Platform. - Experience with PySpark. - Strong understanding of statistical analysis and machine learning algorithms. - Experience with data visualization tools such as Tableau or Power BI. - Hands-on implementing various machine learning algorithms such as linear regression, logistic regression, decision trees, and clustering algorithms. - Solid grasp of data munging techniques, including data cleaning, transformation, and normalization to ensure data quality and integrity.

Signals

Skill data-engineer
0.33
Alias data-engineer
1.00
KRA data-engineer
0.66

Post-classification

Centroidupdated · n=369
Alias collision log
New-role queue
New skills captured8
New KRA captured

Captured for admin review

PySpark primary Data Engineer pending
Linear Regression Data Engineer pending
Logistic Regression Data Engineer pending
Decision Trees Data Engineer pending
Data Visualization Data Engineer pending
ETL Data Engineer pending
Data Pipelines Data Engineer pending
Data Quality Data Engineer pending
Status: completed Created: 2026-05-27T15:57:04.888589Z Updated: 2026-06-12T15:42:55.485556Z API 3 duration: 38797 ms
Flow Current 3-step pipeline

1 POST /skills/extract-from-jd

2 POST /skills/extract-details

3 POST /skills/final-role-output

Role Chosen role & resolution

Data Engineer

CASE A

slug: data-engineer · id: 2 · source: db

Exact alias hit on data-engineer (1.0) — no other alias at this confidence; skill_top data-engineer 0.33 does not contradict

Resolution: in_db — role exists in library; skill↔dim and role↔dim links saved when applicable.

0
New skills
0
Skill↔dim saved
0
Role↔dim saved
2
Skipped

Job description

Project Role : Data Engineer

Project Role Description : Design, develop and maintain data solutions for data generation, collection, and processing. Create data pipelines, ensure data quality, and implement ETL (extract, transform and load) processes to migrate and deploy data across systems.

Must have skills : Python (Programming Language)

Good to have skills : Databricks Unified Data Analytics Platform, PySpark

Minimum 3 Year(s) Of Experience Is Required

Educational Qualification : 15 years full time education

Summary: As a Data Engineer, you will design, develop, and maintain data solutions for data generation, collection, and processing. You will create data pipelines, ensure data quality, and implement ETL processes to migrate and deploy data across systems. Your typical day will involve designing and developing data solutions, collaborating with cross-functional teams, and ensuring data integrity and quality. Roles & Responsibilities: - Expected to perform independently and become an SME. - Required active participation/contribution in team discussions. - Contribute in providing solutions to work-related problems. - Design and develop data solutions for data generation, collection, and processing. - Create and maintain data pipelines to ensure efficient data flow. - Implement ETL processes to migrate and deploy data across systems. - Collaborate with cross-functional teams to understand data requirements and provide technical expertise. - Ensure data quality and integrity by implementing data validation and cleansing techniques. - Optimize data solutions for performance and scalability. - Troubleshoot and resolve data-related issues and provide technical support. - Stay updated with industry trends and best practices in data engineering. - Train and mentor junior data engineers to enhance their skills and knowledge. Professional & Technical Skills: - Must To Have Skills: Proficiency in Python (Programming Language). - Experience with Databricks Unified Data Analytics Platform. - Experience with PySpark. - Strong understanding of statistical analysis and machine learning algorithms. - Experience with data visualization tools such as Tableau or Power BI. - Hands-on implementing various machine learning algorithms such as linear regression, logistic regression, decision trees, and clustering algorithms. - Solid grasp of data munging techniques, including data cleaning, transformation, and normalization to ensure data quality and integrity. Additional Information: - The candidate should have a minimum of 3 years of experience in Python (Programming Language). - This position is based at our Bengaluru office. - A 15 years full-time education is required.

15 years full time education

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

Python Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Python id=5 · python

Aliases — catalog

  • Python (CANONICAL) primary
  • Python 2 (VERSION)
  • Python 2.x (VERSION)
  • Python 3 (VERSION)
  • Python 3.10 (VERSION)
  • Python 3.11 (VERSION)
  • Python 3.12 (VERSION)
  • Python 3.x (VERSION)
  • py (VERSION)
  • py2 (VERSION)
  • py3 (VERSION)
  • python 3 (VERSION)
  • python 3.x (VERSION)
  • python2 (VERSION)
  • python3 (VERSION)
  • python3.x (VERSION)

Context tags (catalog)

API Django FastAPI Flask Jupyter NumPy PEP 8 Pandas REST SQLAlchemy asyncio pandas pip pytest type hints venv virtualenv

Stored enrichment (catalog DB)

Category
Language
Sub-category
Programming Language
Vendor
PSF
License
mit
Year introduced
1991
Confidence
0.99
Version strategy
SEPARATE_ENTITY
Version tag
3

Maturity reasoning: Python appears in a very high volume of job descriptions across data, backend, automation, and ML roles, and remains a default hiring-pipeline language on major job boards and tech stacks.

Skill profile (library / DB)

Skill nature
LANGUAGE
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
6
Sub-category id
96
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Cloud Security Scripting & DSL Languages Catalog dimension db id 248

    Library dimension (catalog)

    Roles linked in library: Cloud Security Engineer

  • Programming Languages Catalog dimension db id 1

    Library dimension (catalog)

    Roles linked in library: Backend Developer, Fullstack Developer, Fullstack Developer

  • Programming Languages & DSLs Catalog dimension db id 475

    Library dimension (catalog)

    Roles linked in library: Engineering Manager

  • Programming Languages and Scripting Catalog dimension db id 59

    Library dimension (catalog)

    Roles linked in library: Cyber Security Engineer

  • Programming Languages for Data Work Catalog dimension db id 21

    Library dimension (catalog)

    Roles linked in library: Data Engineer

  • Programming Languages for ML Systems Catalog dimension db id 39

    Library dimension (catalog)

    Roles linked in library: ML Engineer, MLOps Engineer

  • Programming Languages for XR Catalog dimension db id 97

    Library dimension (catalog)

    Roles linked in library: AR/VR Engineer

  • Python Programming Catalog dimension db id 290

    Library dimension (catalog)

    Roles linked in library: Python Backend Developer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Cloud Security Scripting & DSL Languages
cloud-security-scripting-dsl-languages
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages
programming-languages
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages & DSLs
programming-languages-dsls
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages and Scripting
programming-languages-and-scripting
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages for Data Work
programming-languages-for-data-work
Existing dimension (library) · Role↔dimension saved
Programming Languages for ML Systems
programming-languages-for-ml-systems
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages for XR
programming-languages-for-xr
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python Programming
python-programming
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Databricks Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Databricks id=1202 · databricks

Aliases — catalog

  • Databricks (CANONICAL)

Context tags (catalog)

Apache Spark Databricks Runtime Delta Lake MLflow SQL Analytics Spark cloud integration collaborative workspace data engineering data lakes data pipelines data visualization job scheduling machine learning notebooks real-time analytics

Stored enrichment (catalog DB)

Category
Platform
Sub-category
Data Analytics Platform
Vendor
Databricks, Inc.
License
other_open
Year introduced
2013
Confidence
0.97
Version strategy
NOT_APPLICABLE

Maturity reasoning: Databricks appears frequently in data engineering and analytics job postings, especially alongside Spark, Delta Lake, and lakehouse stacks; strong vendor adoption and broad enterprise usage signal mainstream demand.

Skill profile (library / DB)

Skill nature
PLATFORM
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
9
Sub-category id
911
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • React Frontend Development Catalog dimension db id 96

    Library dimension (catalog)

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
React Frontend Development
d_init_01
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
PySpark Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Apache Spark id=1350 · apache-spark

Aliases — catalog

  • Apache Spark (CANONICAL)
  • apache spark 3 (VERSION)
  • spark (VERSION)
  • spark 3 (VERSION)
  • spark 3.x (VERSION)
  • spark3 (VERSION)

Context tags (catalog)

Apache Kafka Cluster Manager DAGScheduler Data Lake DataFrame ETL Hadoop MLlib Machine Learning PySpark RDD Scala Spark SQL Spark Streaming SparkSession

Stored enrichment (catalog DB)

Category
Framework
Sub-category
Distributed Data Processing Framework
Vendor
Apache Software Foundation
License
apache_2
Year introduced
2010
Confidence
0.94
Version strategy
SEPARATE_ENTITY
Version tag
3.x

Maturity reasoning: Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it.

Skill profile (library / DB)

Skill nature
FRAMEWORK
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
5
Sub-category id
1021
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • ETL and ELT Tooling Catalog dimension db id 24

    Library dimension (catalog)

    Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
ETL and ELT Tooling
etl-and-elt-tooling
Skipped — no persistable v3 meta for new skill
skill_not_in_db_v3_proposed
Tableau Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Tableau id=150 · tableau

Aliases — catalog

  • Tableau (CANONICAL) primary

Context tags (catalog)

LOD expressions Tableau Cloud Tableau Desktop Tableau Prep Tableau Server actions calculated fields dashboards data blending data visualization extracts filters parameters published data sources workbooks

Stored enrichment (catalog DB)

Category
Platform
Sub-category
Bi Analytics Platform
Vendor
Tableau Software
License
proprietary
Year introduced
2003
Confidence
0.96
Version strategy
NOT_APPLICABLE

Maturity reasoning: Tableau appears frequently in BI/data analyst job descriptions and remains a standard enterprise analytics platform with strong vendor support and broad adoption.

Skill profile (library / DB)

Skill nature
PLATFORM
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
9
Sub-category id
111
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • BI and Visualization Tools Catalog dimension db id 31

    Library dimension (catalog)

    Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
BI and Visualization Tools
bi-and-visualization-tools
Existing dimension (library) · Role↔dimension saved
Power BI Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Power BI id=151 · power-bi

Aliases — catalog

  • Power BI (CANONICAL) primary

Context tags (catalog)

Azure Synapse DAX DirectQuery Import mode M language Power Query RLS SQL Server SSAS dashboard data modeling data warehouse gateway reporting star schema

Stored enrichment (catalog DB)

Category
Platform
Sub-category
Bi Analytics Platform
Vendor
Microsoft
License
proprietary
Year introduced
2015
Confidence
0.96
Version strategy
NOT_APPLICABLE

Maturity reasoning: Power BI appears frequently in BI/data analyst job descriptions and is a standard Microsoft analytics platform in enterprise stacks, with strong vendor support and broad adoption.

Skill profile (library / DB)

Skill nature
PLATFORM
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
9
Sub-category id
111
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • BI and Visualization Tools Catalog dimension db id 31

    Library dimension (catalog)

    Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
BI and Visualization Tools
bi-and-visualization-tools
Existing dimension (library) · Role↔dimension saved
Machine Learning Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Machine Learning id=1356 · machine-learning

Aliases — catalog

  • Machine Learning (CANONICAL)

Context tags (catalog)

Keras PyTorch TensorFlow cross-validation data preprocessing ensemble methods feature engineering hyperparameter tuning model evaluation natural language processing neural networks reinforcement learning scikit-learn supervised learning unsupervised learning

Stored enrichment (catalog DB)

Category
Concept
Sub-category
Machine Learning
Confidence
0.98
Version strategy
NOT_APPLICABLE

Maturity reasoning: Machine Learning appears in large volumes of job descriptions across data, product, and platform roles, and major cloud vendors (AWS, Google Cloud, Azure) offer dedicated ML services and certifications, indicating broad adoption.

Skill profile (library / DB)

Skill nature
CONCEPT
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
2
Sub-category id
1024
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • AI Governance and Model Security Catalog dimension db id 50

    Library dimension (catalog)

    Roles linked in library: AI Engineer, ML Engineer, MLOps Engineer

  • React Frontend Development Catalog dimension db id 96

    Library dimension (catalog)

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
AI Governance and Model Security
ai-governance-and-model-security
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
React Frontend Development
d_init_01
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Linear Regression Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Machine Learning Frameworks
Sub-category
general
Skill nature
CONCEPT
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
UNVERSIONED
Logistic Regression Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Machine Learning Frameworks
Sub-category
general
Skill nature
CONCEPT
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
UNVERSIONED
Decision Trees Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Decision Tree id=1878 · decision-tree

Aliases — catalog

  • Decision Tree (CANONICAL) primary

Context tags (catalog)

CART Gini impurity classification cross-validation ensemble methods entropy feature importance leaf nodes max depth overfitting pruning random forest regression scikit-learn split criteria

Stored enrichment (catalog DB)

Category
Concept
Sub-category
Decision Tree Concept
Confidence
0.96
Version strategy
NOT_APPLICABLE

Maturity reasoning: Decision trees are a standard ML concept in many job descriptions and curricula; they’re a common baseline model in scikit-learn/XGBoost workflows and widely taught across data science roles.

Skill profile (library / DB)

Skill nature
CONCEPT
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
2
Sub-category id
1423
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Business Rules and Decision Logic Catalog dimension db id 252

    Library dimension (catalog)

    Roles linked in library: Pega Developer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Business Rules and Decision Logic
business-rules-and-decision-logic
Skipped — no persistable v3 meta for new skill
skill_not_in_db_v3_proposed
Clustering Secondary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: clustering id=162 · clustering

Aliases — catalog

  • clustering (CANONICAL) primary
  • Clustering (CANONICAL)

Context tags (catalog)

DBSCAN Gaussian mixture Gaussian mixture model PCA agglomerative centroid cluster analysis clustering algorithms data partitioning dendrogram density-based dimensionality reduction distance metric elbow method feature extraction feature scaling hierarchical hierarchical clustering k-means outlier detection silhouette score spectral clustering unsupervised learning

Stored enrichment (catalog DB)

Category
Concept
Sub-category
Distributed Systems Concept
Confidence
0.72
Version strategy
NOT_APPLICABLE

Maturity reasoning: Clustering is a standard distributed-systems concept and appears broadly in JDs for databases, Kubernetes, and load-balanced services; vendor docs for AWS, Kubernetes, and PostgreSQL all treat clustering as a common production pattern.

Skill profile (library / DB)

Skill nature
CONCEPT
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
2
Sub-category id
1053
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Concurrency and Parallel Processing Catalog dimension db id 17

    Library dimension (catalog)

    Roles linked in library: Backend Developer, Java Backend Developer, Node.js Backend Developer, Ruby Backend Developer, Scala Backend Developer

  • Performance and Cost Optimization Catalog dimension db id 33

    Library dimension (catalog)

    Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Concurrency and Parallel Processing
concurrency-and-parallel-processing
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Performance and Cost Optimization
performance-and-cost-optimization
Existing dimension (library) · Role↔dimension saved
Data Visualization Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
ETL Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Data Pipelines Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Data Quality Secondary New / orchestrated API 3: new canonical path (new) New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED

All API 3 persistence rows

Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.

Skill Tag Dimension Skill↔dim Role↔dim Outcome Notes
Python in_db
Cloud Security Scripting & DSL Languages
cloud-security-scripting-dsl-languages
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python in_db
Programming Languages
programming-languages
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python in_db
Programming Languages & DSLs
programming-languages-dsls
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python in_db
Programming Languages and Scripting
programming-languages-and-scripting
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python in_db
Programming Languages for Data Work
programming-languages-for-data-work
Existing dimension (library) · Role↔dimension saved
Python in_db
Programming Languages for ML Systems
programming-languages-for-ml-systems
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python in_db
Programming Languages for XR
programming-languages-for-xr
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python in_db
Python Programming
python-programming
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Databricks in_db
React Frontend Development
d_init_01
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
PySpark new
ETL and ELT Tooling
etl-and-elt-tooling
Skipped — no persistable v3 meta for new skill skill_not_in_db_v3_proposed
Tableau in_db
BI and Visualization Tools
bi-and-visualization-tools
Existing dimension (library) · Role↔dimension saved
Power BI in_db
BI and Visualization Tools
bi-and-visualization-tools
Existing dimension (library) · Role↔dimension saved
Machine Learning in_db
AI Governance and Model Security
ai-governance-and-model-security
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Machine Learning in_db
React Frontend Development
d_init_01
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Decision Trees new
Business Rules and Decision Logic
business-rules-and-decision-logic
Skipped — no persistable v3 meta for new skill skill_not_in_db_v3_proposed
Clustering in_db
Concurrency and Parallel Processing
concurrency-and-parallel-processing
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Clustering in_db
Performance and Cost Optimization
performance-and-cost-optimization
Existing dimension (library) · Role↔dimension saved

Library artifacts (this run)

Kind Detail DB id
canonical_skill_proposed Linear Regression | type=Machine Learning Frameworks subtype=general nature=CONCEPT lifespan=EVERGREEN
canonical_skill_proposed Logistic Regression | type=Machine Learning Frameworks subtype=general nature=CONCEPT lifespan=EVERGREEN
canonical_skill_proposed Data Visualization | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed ETL | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed Data Pipelines | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed Data Quality | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
dimension_skill_link_proposed PySpark ↔ ETL and ELT Tooling
role_dimension_link_proposed Data Engineer ↔ ETL and ELT Tooling
dimension_skill_link_proposed Decision Trees ↔ Business Rules and Decision Logic
nano JD Parser — gpt-4.1-nano click to toggle
RoleData Engineer
ExperienceMinimum 3 Year(s) Of Experience Is Required
DomainOther
Location Bengaluru, India
JD type pass
Show raw JSON
{
  "JD_type": "pass",
  "about_company": null,
  "certifications": [],
  "company_name": null,
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [],
      "domain": "Other"
    },
    "secondary": null
  },
  "education": [
    {
      "level": "Bachelor\u0027s",
      "qualification": "Bachelor\u0027s - Any Discipline",
      "raw": "15 years full time education",
      "requirement": "required"
    }
  ],
  "experience": {
    "max": null,
    "min": 3,
    "raw": "Minimum 3 Year(s) Of Experience Is Required"
  },
  "job_locations": [
    {
      "aliases": [
        "Bangalore"
      ],
      "city": "Bengaluru",
      "country": "India",
      "state": null,
      "work_mode": null
    }
  ],
  "role": "Data Engineer",
  "role_aliases": [
    "Data Engineer",
    "Data Developer",
    "ETL Developer"
  ],
  "role_archetype": "Data",
  "roles_and_responsibilities": [
    {
      "bullet_count": 0,
      "heading": "Summary",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "As a Data Engineer, you",
        "last_5_words": "data integrity and quality."
      },
      "text": "As a Data Engineer, you will design, develop, and maintain data solutions for data generation, collection, and processing. You will create data pipelines, ensure data quality, and implement ETL processes to migrate and deploy data across systems. Your typical day will involve designing and developing data solutions, collaborating with cross-functional teams, and ensuring data integrity and quality.",
      "word_count": 54
    },
    {
      "bullet_count": 12,
      "heading": "Roles \u0026 Responsibilities",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "Expected to perform independently and",
        "last_5_words": "enhance their skills and knowledge."
      },
      "text": "- Expected to perform independently and become an SME.\n- Required active participation/contribution in team discussions.\n- Contribute in providing solutions to work-related problems.\n- Design and develop data solutions for data generation, collection, and processing.\n- Create and maintain data pipelines to ensure efficient data flow.\n- Implement ETL processes to migrate and deploy data across systems.\n- Collaborate with cross-functional teams to understand data requirements and provide technical expertise.\n- Ensure data quality and integrity by implementing data validation and cleansing techniques.\n- Optimize data solutions for performance and scalability.\n- Troubleshoot and resolve data-related issues and provide technical support.\n- Stay updated with industry trends and best practices in data engineering.\n- Train and mentor junior data engineers to enhance their skills and knowledge.",
      "word_count": 139
    },
    {
      "bullet_count": 7,
      "heading": "Professional \u0026 Technical Skills",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "Must To Have Skills: Proficiency",
        "last_5_words": "quality and integrity."
      },
      "text": "- Must To Have Skills: Proficiency in Python (Programming Language).\n- Experience with Databricks Unified Data Analytics Platform.\n- Experience with PySpark.\n- Strong understanding of statistical analysis and machine learning algorithms.\n- Experience with data visualization tools such as Tableau or Power BI.\n- Hands-on implementing various machine learning algorithms such as linear regression, logistic regression, decision trees, and clustering algorithms.\n- Solid grasp of data munging techniques, including data cleaning, transformation, and normalization to ensure data quality and integrity.",
      "word_count": 104
    }
  ],
  "urls": []
}
API 1 — extract-from-jd click to toggle
{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Python"
    },
    {
      "is_primary": true,
      "skill_name": "Databricks"
    },
    {
      "is_primary": true,
      "skill_name": "PySpark"
    },
    {
      "is_primary": false,
      "skill_name": "Tableau"
    },
    {
      "is_primary": false,
      "skill_name": "Power BI"
    },
    {
      "is_primary": false,
      "skill_name": "Machine Learning"
    },
    {
      "is_primary": false,
      "skill_name": "Linear Regression"
    },
    {
      "is_primary": false,
      "skill_name": "Logistic Regression"
    },
    {
      "is_primary": false,
      "skill_name": "Decision Trees"
    },
    {
      "is_primary": false,
      "skill_name": "Clustering"
    },
    {
      "is_primary": false,
      "skill_name": "Data Visualization"
    },
    {
      "is_primary": false,
      "skill_name": "ETL"
    },
    {
      "is_primary": false,
      "skill_name": "Data Pipelines"
    },
    {
      "is_primary": false,
      "skill_name": "Data Quality"
    }
  ],
  "jd_role": {
    "display_name": "Data Engineer",
    "rationale": null,
    "role_aliases": [
      "Data Engineer",
      "Data Developer",
      "ETL Developer"
    ],
    "role_archetype": "Data",
    "slug": ""
  },
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": null,
    "certifications": [],
    "company_name": null,
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [],
        "domain": "Other"
      },
      "secondary": null
    },
    "education": [
      {
        "level": "Bachelor\u0027s",
        "qualification": "Bachelor\u0027s - Any Discipline",
        "raw": "15 years full time education",
        "requirement": "required"
      }
    ],
    "experience": {
      "max": null,
      "min": 3,
      "raw": "Minimum 3 Year(s) Of Experience Is Required"
    },
    "job_locations": [
      {
        "aliases": [
          "Bangalore"
        ],
        "city": "Bengaluru",
        "country": "India",
        "state": null,
        "work_mode": null
      }
    ],
    "role": "Data Engineer",
    "role_aliases": [
      "Data Engineer",
      "Data Developer",
      "ETL Developer"
    ],
    "role_archetype": "Data",
    "roles_and_responsibilities": [
      {
        "bullet_count": 0,
        "heading": "Summary",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "As a Data Engineer, you",
          "last_5_words": "data integrity and quality."
        },
        "text": "As a Data Engineer, you will design, develop, and maintain data solutions for data generation, collection, and processing. You will create data pipelines, ensure data quality, and implement ETL processes to migrate and deploy data across systems. Your typical day will involve designing and developing data solutions, collaborating with cross-functional teams, and ensuring data integrity and quality.",
        "word_count": 54
      },
      {
        "bullet_count": 12,
        "heading": "Roles \u0026 Responsibilities",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "Expected to perform independently and",
          "last_5_words": "enhance their skills and knowledge."
        },
        "text": "- Expected to perform independently and become an SME.\n- Required active participation/contribution in team discussions.\n- Contribute in providing solutions to work-related problems.\n- Design and develop data solutions for data generation, collection, and processing.\n- Create and maintain data pipelines to ensure efficient data flow.\n- Implement ETL processes to migrate and deploy data across systems.\n- Collaborate with cross-functional teams to understand data requirements and provide technical expertise.\n- Ensure data quality and integrity by implementing data validation and cleansing techniques.\n- Optimize data solutions for performance and scalability.\n- Troubleshoot and resolve data-related issues and provide technical support.\n- Stay updated with industry trends and best practices in data engineering.\n- Train and mentor junior data engineers to enhance their skills and knowledge.",
        "word_count": 139
      },
      {
        "bullet_count": 7,
        "heading": "Professional \u0026 Technical Skills",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "Must To Have Skills: Proficiency",
          "last_5_words": "quality and integrity."
        },
        "text": "- Must To Have Skills: Proficiency in Python (Programming Language).\n- Experience with Databricks Unified Data Analytics Platform.\n- Experience with PySpark.\n- Strong understanding of statistical analysis and machine learning algorithms.\n- Experience with data visualization tools such as Tableau or Power BI.\n- Hands-on implementing various machine learning algorithms such as linear regression, logistic regression, decision trees, and clustering algorithms.\n- Solid grasp of data munging techniques, including data cleaning, transformation, and normalization to ensure data quality and integrity.",
        "word_count": 104
      }
    ],
    "urls": []
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "4dab0597-abaf-4f4b-a68c-d1493be026db",
  "stage3_signals": {
    "alias_found": true,
    "alias_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 1.0,
        "slug": "data-engineer",
        "total_count": null
      }
    ],
    "kra_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": [
          {
            "kra_text": "Implements data quality validation rules, reconciliation checks, and anomaly detection to ensure data completeness, accuracy, and consistency.",
            "sentence": "Ensure data quality and integrity by implementing data validation and cleansing techniques.",
            "similarity": 0.6956
          },
          {
            "kra_text": "Builds data ingestion pipelines to collect data from transactional databases, third-party APIs, event streams, and file sources into centralized data platforms.",
            "sentence": "Create and maintain data pipelines to ensure efficient data flow.",
            "similarity": 0.6413
          },
          {
            "kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
            "sentence": "Collaborate with cross-functional teams to understand data requirements and provide technical expertise.",
            "similarity": 0.633
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 0.6566,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "Flutter Developer",
        "kra_matches": [
          {
            "kra_text": "optimize responsiveness and performance",
            "sentence": "Optimize data solutions for performance and scalability.",
            "similarity": 0.5978
          },
          {
            "kra_text": "collaborate with design, product, and backend teams",
            "sentence": "Collaborate with cross-functional teams to understand data requirements and provide technical expertise.",
            "similarity": 0.5705
          },
          {
            "kra_text": "collaborate with design, product, and backend teams",
            "sentence": "Your typical day will involve designing and developing data solutions, collaborating with cross-functional teams, and ensuring data integrity and quality.",
            "similarity": 0.4843
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 74,
        "score": 0.5509,
        "slug": "flutter-developer",
        "total_count": null
      },
      {
        "display_name": "Java Backend Developer",
        "kra_matches": [
          {
            "kra_text": "backend performance tuning",
            "sentence": "Optimize data solutions for performance and scalability.",
            "similarity": 0.5766
          },
          {
            "kra_text": "persistence and data modeling",
            "sentence": "Solid grasp of data munging techniques, including data cleaning, transformation, and normalization to ensure data quality and integrity.",
            "similarity": 0.5155
          },
          {
            "kra_text": "persistence and data modeling",
            "sentence": "Create and maintain data pipelines to ensure efficient data flow.",
            "similarity": 0.4623
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 79,
        "score": 0.5182,
        "slug": "java-backend-developer",
        "total_count": null
      },
      {
        "display_name": "Backend Developer",
        "kra_matches": [
          {
            "kra_text": "Identifies and resolves backend performance bottlenecks through query optimization, indexing strategies, connection pooling, and distributed caching with Redis.",
            "sentence": "Optimize data solutions for performance and scalability.",
            "similarity": 0.577
          },
          {
            "kra_text": "Investigates and resolves production incidents, API bugs, and service degradation through root cause analysis, hotfixes, and post-mortems.",
            "sentence": "Troubleshoot and resolve data-related issues and provide technical support.",
            "similarity": 0.5215
          },
          {
            "kra_text": "Implements request validation, structured error handling, and input sanitization across backend services to ensure predictable and secure API behavior.",
            "sentence": "Ensure data quality and integrity by implementing data validation and cleansing techniques.",
            "similarity": 0.4493
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 1,
        "score": 0.5159,
        "slug": "backend-engineer",
        "total_count": null
      },
      {
        "display_name": "ML Engineer",
        "kra_matches": [
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "Create and maintain data pipelines to ensure efficient data flow.",
            "similarity": 0.5173
          },
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "You will create data pipelines, ensure data quality, and implement ETL processes to migrate and deploy data across systems.",
            "similarity": 0.4999
          },
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "As a Data Engineer, you will design, develop, and maintain data solutions for data generation, collection, and processing.",
            "similarity": 0.4896
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 3,
        "score": 0.5023,
        "slug": "ml-engineer",
        "total_count": null
      }
    ],
    "skill_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "Python"
        ],
        "role_id": 2,
        "score": 0.3333,
        "slug": "data-engineer",
        "total_count": 3
      },
      {
        "display_name": "ML Engineer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "Python"
        ],
        "role_id": 3,
        "score": 0.3333,
        "slug": "ml-engineer",
        "total_count": 3
      },
      {
        "display_name": "Cyber Security Engineer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "Python"
        ],
        "role_id": 5,
        "score": 0.3333,
        "slug": "cybersecurity-engineer",
        "total_count": 3
      },
      {
        "display_name": "AR/VR Engineer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "Python"
        ],
        "role_id": 8,
        "score": 0.3333,
        "slug": "ar-vr-engineer",
        "total_count": 3
      },
      {
        "display_name": "Backend Developer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "Python"
        ],
        "role_id": 1,
        "score": 0.3333,
        "slug": "backend-engineer",
        "total_count": 3
      }
    ]
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "A",
    "chosen_role": {
      "display_name": "Data Engineer",
      "kra_matches": null,
      "matched_count": null,
      "matched_skills": null,
      "role_id": 2,
      "score": 1.0,
      "slug": "data-engineer",
      "total_count": null
    },
    "confidence": 1.0,
    "is_new_role": false,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "matched_dimensions": [],
    "matched_kras": [],
    "matched_skills": [],
    "new_role_display_name": null,
    "new_role_slug": null,
    "queued": false,
    "reasoning": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.33 does not contradict",
    "sub_role": null
  },
  "stage5_updates": {
    "centroid_n_after": 369,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": null,
    "new_skills_attached": [
      {
        "is_primary": true,
        "queue_id": 17441,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "PySpark",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 17442,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Linear Regression",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 17443,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Logistic Regression",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 17444,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Decision Trees",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 17445,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Visualization",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 17446,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "ETL",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 17447,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Pipelines",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 17448,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Quality",
        "status": "pending"
      }
    ],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}
API 2 — extract-details
{
  "alias_matches": [
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 67,
      "existing_alias_text": "Python",
      "input_term": "Python",
      "matched_canonical": {
        "category_id": 6,
        "display_name": "Python",
        "id": 5,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "python",
        "sub_category_id": 96,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1838,
      "existing_alias_text": "Databricks",
      "input_term": "Databricks",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "Databricks",
        "id": 1202,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "databricks",
        "sub_category_id": 911,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
      "alias_persisted": false,
      "existing_alias_id": 2004,
      "existing_alias_text": "Apache Spark",
      "input_term": "PySpark",
      "matched_canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "embedding_alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 359,
      "existing_alias_text": "Tableau",
      "input_term": "Tableau",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "Tableau",
        "id": 150,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "tableau",
        "sub_category_id": 111,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 360,
      "existing_alias_text": "Power BI",
      "input_term": "Power BI",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "Power BI",
        "id": 151,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "power-bi",
        "sub_category_id": 111,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 2015,
      "existing_alias_text": "Machine Learning",
      "input_term": "Machine Learning",
      "matched_canonical": {
        "category_id": 2,
        "display_name": "Machine Learning",
        "id": 1356,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "machine-learning",
        "sub_category_id": 1024,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
      "alias_persisted": false,
      "existing_alias_id": 2882,
      "existing_alias_text": "Decision Tree",
      "input_term": "Decision Trees",
      "matched_canonical": {
        "category_id": 2,
        "display_name": "Decision Tree",
        "id": 1878,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "decision-tree",
        "sub_category_id": 1423,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "embedding_alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 371,
      "existing_alias_text": "Clustering",
      "input_term": "Clustering",
      "matched_canonical": {
        "category_id": 2,
        "display_name": "clustering",
        "id": 162,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "clustering",
        "sub_category_id": 1053,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    }
  ],
  "candidate_roles": [
    {
      "display_name": "Cloud Security Engineer",
      "id": 23,
      "rationale": null,
      "role_archetype": null,
      "slug": "cloud-security-engineer",
      "source": "db"
    },
    {
      "display_name": "Backend Developer",
      "id": 1,
      "rationale": null,
      "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
      "slug": "backend-engineer",
      "source": "db"
    },
    {
      "display_name": "Fullstack Developer",
      "id": 15,
      "rationale": null,
      "role_archetype": null,
      "slug": "full-stack-engineer",
      "source": "db"
    },
    {
      "display_name": "Fullstack Developer",
      "id": 435,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "fullstack-developer",
      "source": "db"
    },
    {
      "display_name": "Engineering Manager",
      "id": 121,
      "rationale": null,
      "role_archetype": null,
      "slug": "engineering-manager",
      "source": "db"
    },
    {
      "display_name": "Cyber Security Engineer",
      "id": 5,
      "rationale": null,
      "role_archetype": null,
      "slug": "cybersecurity-engineer",
      "source": "db"
    },
    {
      "display_name": "Data Engineer",
      "id": 2,
      "rationale": null,
      "role_archetype": null,
      "slug": "data-engineer",
      "source": "db"
    },
    {
      "display_name": "ML Engineer",
      "id": 3,
      "rationale": null,
      "role_archetype": null,
      "slug": "ml-engineer",
      "source": "db"
    },
    {
      "display_name": "MLOps Engineer",
      "id": 16,
      "rationale": null,
      "role_archetype": null,
      "slug": "ml-ops-engineer",
      "source": "db"
    },
    {
      "display_name": "AR/VR Engineer",
      "id": 8,
      "rationale": null,
      "role_archetype": null,
      "slug": "ar-vr-engineer",
      "source": "db"
    },
    {
      "display_name": "Python Backend Developer",
      "id": 80,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "python-backend-developer",
      "source": "db"
    },
    {
      "display_name": "AI Engineer",
      "id": 13,
      "rationale": null,
      "role_archetype": null,
      "slug": "ai-engineer",
      "source": "db"
    },
    {
      "display_name": "Pega Developer",
      "id": 24,
      "rationale": null,
      "role_archetype": null,
      "slug": "pega-developer",
      "source": "db"
    },
    {
      "display_name": "Java Backend Developer",
      "id": 79,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "java-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Node.js Backend Developer",
      "id": 82,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "node-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Ruby Backend Developer",
      "id": 85,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "ruby-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Scala Backend Developer",
      "id": 87,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "scala-backend-developer",
      "source": "db"
    }
  ],
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.33 does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "dimensions": [
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Security Scripting \u0026 DSL Languages",
        "id": 248,
        "rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
        "slug": "cloud-security-scripting-dsl-languages",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Security Engineer",
          "id": 23,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-security-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages",
        "id": 1,
        "rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
        "slug": "programming-languages",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Fullstack Developer",
          "id": 15,
          "rationale": null,
          "role_archetype": null,
          "slug": "full-stack-engineer",
          "source": "db"
        },
        {
          "display_name": "Fullstack Developer",
          "id": 435,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "fullstack-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages \u0026 DSLs",
        "id": 475,
        "rationale": "Oversee and guide the selection and effective use of programming and domain\u2010specific languages in software projects.",
        "slug": "programming-languages-dsls",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Engineering Manager",
          "id": 121,
          "rationale": null,
          "role_archetype": null,
          "slug": "engineering-manager",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages and Scripting",
        "id": 59,
        "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
        "slug": "programming-languages-and-scripting",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cyber Security Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for Data Work",
        "id": 21,
        "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
        "slug": "programming-languages-for-data-work",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for ML Systems",
        "id": 39,
        "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
        "slug": "programming-languages-for-ml-systems",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for XR",
        "id": 97,
        "rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
        "slug": "programming-languages-for-xr",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "AR/VR Engineer",
          "id": 8,
          "rationale": null,
          "role_archetype": null,
          "slug": "ar-vr-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Python Programming",
        "id": 290,
        "rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
        "slug": "python-programming",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Python Backend Developer",
          "id": 80,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "python-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "Databricks",
      "llm_role": null,
      "roles_from_db": []
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "ETL and ELT Tooling",
        "id": 24,
        "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
        "slug": "etl-and-elt-tooling",
        "source": "db"
      },
      "input_skill": "PySpark",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "BI and Visualization Tools",
        "id": 31,
        "rationale": "Tools used to expose curated data to analysts and business users through dashboards, reports, and semantic exploration. Data engineers support these tools by shaping reliable datasets and performant models.",
        "slug": "bi-and-visualization-tools",
        "source": "db"
      },
      "input_skill": "Tableau",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "BI and Visualization Tools",
        "id": 31,
        "rationale": "Tools used to expose curated data to analysts and business users through dashboards, reports, and semantic exploration. Data engineers support these tools by shaping reliable datasets and performant models.",
        "slug": "bi-and-visualization-tools",
        "source": "db"
      },
      "input_skill": "Power BI",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "AI Governance and Model Security",
        "id": 50,
        "rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
        "slug": "ai-governance-and-model-security",
        "source": "db"
      },
      "input_skill": "Machine Learning",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "AI Engineer",
          "id": 13,
          "rationale": null,
          "role_archetype": null,
          "slug": "ai-engineer",
          "source": "db"
        },
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "Machine Learning",
      "llm_role": null,
      "roles_from_db": []
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Business Rules and Decision Logic",
        "id": 252,
        "rationale": "Configures the rule layer that drives eligibility, branching, calculations, and policy enforcement in Pega applications. These rules are a coherent cluster because they determine how cases behave without changing the broader application structure.",
        "slug": "business-rules-and-decision-logic",
        "source": "db"
      },
      "input_skill": "Decision Trees",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Pega Developer",
          "id": 24,
          "rationale": null,
          "role_archetype": null,
          "slug": "pega-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Concurrency and Parallel Processing",
        "id": 17,
        "rationale": "Programming techniques for handling multiple requests and background work safely and efficiently. Includes synchronization, async execution, and coordination of concurrent tasks.",
        "slug": "concurrency-and-parallel-processing",
        "source": "db"
      },
      "input_skill": "Clustering",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Java Backend Developer",
          "id": 79,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "java-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Node.js Backend Developer",
          "id": 82,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "node-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Ruby Backend Developer",
          "id": 85,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "ruby-backend-developer",
          "source": "db"
        },
        {
          "display_name": "Scala Backend Developer",
          "id": 87,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "scala-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Performance and Cost Optimization",
        "id": 33,
        "rationale": "Techniques for improving the speed, reliability, and cost efficiency of data workloads. This includes query tuning, partitioning, file sizing, compute right-sizing, and workload management.",
        "slug": "performance-and-cost-optimization",
        "source": "db"
      },
      "input_skill": "Clustering",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    }
  ],
  "input_final_skills": [
    "Python",
    "Databricks",
    "PySpark",
    "Tableau",
    "Power BI",
    "Machine Learning",
    "Linear Regression",
    "Logistic Regression",
    "Decision Trees",
    "Clustering",
    "Data Visualization",
    "ETL",
    "Data Pipelines",
    "Data Quality"
  ],
  "input_llm_skills": [
    "Python",
    "Databricks",
    "PySpark",
    "Tableau",
    "Power BI",
    "Machine Learning",
    "Linear Regression",
    "Logistic Regression",
    "Decision Trees",
    "Clustering",
    "Data Visualization",
    "ETL",
    "Data Pipelines",
    "Data Quality"
  ],
  "new_aliases_persisted": 0,
  "run_id": "4dab0597-abaf-4f4b-a68c-d1493be026db",
  "skills_detail": [
    {
      "aliases_in_db": [
        {
          "alias_text": "Python",
          "alias_type": "CANONICAL",
          "id": 67,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 2",
          "alias_type": "VERSION",
          "id": 72,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 2.x",
          "alias_type": "VERSION",
          "id": 74,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3",
          "alias_type": "VERSION",
          "id": 73,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.10",
          "alias_type": "VERSION",
          "id": 76,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.11",
          "alias_type": "VERSION",
          "id": 77,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.12",
          "alias_type": "VERSION",
          "id": 78,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.x",
          "alias_type": "VERSION",
          "id": 75,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "py",
          "alias_type": "VERSION",
          "id": 2183,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "py2",
          "alias_type": "VERSION",
          "id": 68,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "py3",
          "alias_type": "VERSION",
          "id": 69,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python 3",
          "alias_type": "VERSION",
          "id": 2186,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python 3.x",
          "alias_type": "VERSION",
          "id": 2849,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python2",
          "alias_type": "VERSION",
          "id": 70,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python3",
          "alias_type": "VERSION",
          "id": 71,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python3.x",
          "alias_type": "VERSION",
          "id": 2848,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 6,
        "display_name": "Python",
        "id": 5,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "python",
        "sub_category_id": 96,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Security Scripting \u0026 DSL Languages",
            "id": 248,
            "rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
            "slug": "cloud-security-scripting-dsl-languages",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Security Engineer",
              "id": 23,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-security-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages",
            "id": 1,
            "rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
            "slug": "programming-languages",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Fullstack Developer",
              "id": 15,
              "rationale": null,
              "role_archetype": null,
              "slug": "full-stack-engineer",
              "source": "db"
            },
            {
              "display_name": "Fullstack Developer",
              "id": 435,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "fullstack-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages \u0026 DSLs",
            "id": 475,
            "rationale": "Oversee and guide the selection and effective use of programming and domain\u2010specific languages in software projects.",
            "slug": "programming-languages-dsls",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Engineering Manager",
              "id": 121,
              "rationale": null,
              "role_archetype": null,
              "slug": "engineering-manager",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages and Scripting",
            "id": 59,
            "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
            "slug": "programming-languages-and-scripting",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cyber Security Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for Data Work",
            "id": 21,
            "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
            "slug": "programming-languages-for-data-work",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for ML Systems",
            "id": 39,
            "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
            "slug": "programming-languages-for-ml-systems",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for XR",
            "id": 97,
            "rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
            "slug": "programming-languages-for-xr",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "AR/VR Engineer",
              "id": 8,
              "rationale": null,
              "role_archetype": null,
              "slug": "ar-vr-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Python Programming",
            "id": 290,
            "rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
            "slug": "python-programming",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Python Backend Developer",
              "id": 80,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "python-backend-developer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Python",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Databricks",
          "alias_type": "CANONICAL",
          "id": 1838,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "Databricks",
        "id": 1202,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "databricks",
        "sub_category_id": 911,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "Databricks",
          "llm_role": null,
          "roles_from_db": []
        }
      ],
      "input_skill": "Databricks",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Apache Spark",
          "alias_type": "CANONICAL",
          "id": 2004,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "apache spark 3",
          "alias_type": "VERSION",
          "id": 2006,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark",
          "alias_type": "VERSION",
          "id": 2510,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3",
          "alias_type": "VERSION",
          "id": 2007,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3.x",
          "alias_type": "VERSION",
          "id": 2009,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark3",
          "alias_type": "VERSION",
          "id": 2008,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "ETL and ELT Tooling",
            "id": 24,
            "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
            "slug": "etl-and-elt-tooling",
            "source": "db"
          },
          "input_skill": "PySpark",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "PySpark",
      "matched_via": "embedding_alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Tableau",
          "alias_type": "CANONICAL",
          "id": 359,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "Tableau",
        "id": 150,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "tableau",
        "sub_category_id": 111,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "BI and Visualization Tools",
            "id": 31,
            "rationale": "Tools used to expose curated data to analysts and business users through dashboards, reports, and semantic exploration. Data engineers support these tools by shaping reliable datasets and performant models.",
            "slug": "bi-and-visualization-tools",
            "source": "db"
          },
          "input_skill": "Tableau",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Tableau",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Power BI",
          "alias_type": "CANONICAL",
          "id": 360,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "Power BI",
        "id": 151,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "power-bi",
        "sub_category_id": 111,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "BI and Visualization Tools",
            "id": 31,
            "rationale": "Tools used to expose curated data to analysts and business users through dashboards, reports, and semantic exploration. Data engineers support these tools by shaping reliable datasets and performant models.",
            "slug": "bi-and-visualization-tools",
            "source": "db"
          },
          "input_skill": "Power BI",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Power BI",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Machine Learning",
          "alias_type": "CANONICAL",
          "id": 2015,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 2,
        "display_name": "Machine Learning",
        "id": 1356,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "machine-learning",
        "sub_category_id": 1024,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "AI Governance and Model Security",
            "id": 50,
            "rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
            "slug": "ai-governance-and-model-security",
            "source": "db"
          },
          "input_skill": "Machine Learning",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "AI Engineer",
              "id": 13,
              "rationale": null,
              "role_archetype": null,
              "slug": "ai-engineer",
              "source": "db"
            },
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "Machine Learning",
          "llm_role": null,
          "roles_from_db": []
        }
      ],
      "input_skill": "Machine Learning",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Linear Regression",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Machine Learning Frameworks",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "UNVERSIONED",
          "volatility": "STABLE"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "linear-regression",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Logistic Regression",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Machine Learning Frameworks",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "UNVERSIONED",
          "volatility": "STABLE"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "logistic-regression",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Decision Tree",
          "alias_type": "CANONICAL",
          "id": 2882,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 2,
        "display_name": "Decision Tree",
        "id": 1878,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "decision-tree",
        "sub_category_id": 1423,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Business Rules and Decision Logic",
            "id": 252,
            "rationale": "Configures the rule layer that drives eligibility, branching, calculations, and policy enforcement in Pega applications. These rules are a coherent cluster because they determine how cases behave without changing the broader application structure.",
            "slug": "business-rules-and-decision-logic",
            "source": "db"
          },
          "input_skill": "Decision Trees",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Pega Developer",
              "id": 24,
              "rationale": null,
              "role_archetype": null,
              "slug": "pega-developer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Decision Trees",
      "matched_via": "embedding_alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "clustering",
          "alias_type": "CANONICAL",
          "id": 3841,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Clustering",
          "alias_type": "CANONICAL",
          "id": 371,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 2,
        "display_name": "clustering",
        "id": 162,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "clustering",
        "sub_category_id": 1053,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Concurrency and Parallel Processing",
            "id": 17,
            "rationale": "Programming techniques for handling multiple requests and background work safely and efficiently. Includes synchronization, async execution, and coordination of concurrent tasks.",
            "slug": "concurrency-and-parallel-processing",
            "source": "db"
          },
          "input_skill": "Clustering",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Java Backend Developer",
              "id": 79,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "java-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Node.js Backend Developer",
              "id": 82,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "node-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Ruby Backend Developer",
              "id": 85,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "ruby-backend-developer",
              "source": "db"
            },
            {
              "display_name": "Scala Backend Developer",
              "id": 87,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "scala-backend-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Performance and Cost Optimization",
            "id": 33,
            "rationale": "Techniques for improving the speed, reliability, and cost efficiency of data workloads. This includes query tuning, partitioning, file sizing, compute right-sizing, and workload management.",
            "slug": "performance-and-cost-optimization",
            "source": "db"
          },
          "input_skill": "Clustering",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Clustering",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Visualization",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-visualization",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "ETL",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "etl",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Pipelines",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-pipelines",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Data Quality",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "data-quality",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    }
  ],
  "unmatched_skills": [
    "Linear Regression",
    "Logistic Regression",
    "Data Visualization",
    "ETL",
    "Data Pipelines",
    "Data Quality"
  ]
}
API 3 — final-role-output
{
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.33 does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "chosen_role_resolution": "in_db",
  "final_input_skills": [
    {
      "skill": "Python",
      "tag": "in_db"
    },
    {
      "skill": "Databricks",
      "tag": "in_db"
    },
    {
      "skill": "PySpark",
      "tag": "in_db"
    },
    {
      "skill": "Tableau",
      "tag": "in_db"
    },
    {
      "skill": "Power BI",
      "tag": "in_db"
    },
    {
      "skill": "Machine Learning",
      "tag": "in_db"
    },
    {
      "skill": "Linear Regression",
      "tag": "new"
    },
    {
      "skill": "Logistic Regression",
      "tag": "new"
    },
    {
      "skill": "Decision Trees",
      "tag": "in_db"
    },
    {
      "skill": "Clustering",
      "tag": "in_db"
    },
    {
      "skill": "Data Visualization",
      "tag": "new"
    },
    {
      "skill": "ETL",
      "tag": "new"
    },
    {
      "skill": "Data Pipelines",
      "tag": "new"
    },
    {
      "skill": "Data Quality",
      "tag": "new"
    }
  ],
  "llm_cost_api1_usd": null,
  "llm_cost_api2_usd": null,
  "llm_cost_api3_usd": null,
  "llm_cost_total_usd": null,
  "persistence": {
    "items": [
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Security Scripting \u0026 DSL Languages",
          "id": 248,
          "rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
          "slug": "cloud-security-scripting-dsl-languages",
          "source": "db"
        },
        "dimension_id": 248,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Security Engineer",
            "id": 23,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-security-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages",
          "id": 1,
          "rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
          "slug": "programming-languages",
          "source": "db"
        },
        "dimension_id": 1,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Fullstack Developer",
            "id": 15,
            "rationale": null,
            "role_archetype": null,
            "slug": "full-stack-engineer",
            "source": "db"
          },
          {
            "display_name": "Fullstack Developer",
            "id": 435,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "fullstack-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages \u0026 DSLs",
          "id": 475,
          "rationale": "Oversee and guide the selection and effective use of programming and domain\u2010specific languages in software projects.",
          "slug": "programming-languages-dsls",
          "source": "db"
        },
        "dimension_id": 475,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Engineering Manager",
            "id": 121,
            "rationale": null,
            "role_archetype": null,
            "slug": "engineering-manager",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages and Scripting",
          "id": 59,
          "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
          "slug": "programming-languages-and-scripting",
          "source": "db"
        },
        "dimension_id": 59,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cyber Security Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for Data Work",
          "id": 21,
          "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
          "slug": "programming-languages-for-data-work",
          "source": "db"
        },
        "dimension_id": 21,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for ML Systems",
          "id": 39,
          "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
          "slug": "programming-languages-for-ml-systems",
          "source": "db"
        },
        "dimension_id": 39,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for XR",
          "id": 97,
          "rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
          "slug": "programming-languages-for-xr",
          "source": "db"
        },
        "dimension_id": 97,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "AR/VR Engineer",
            "id": 8,
            "rationale": null,
            "role_archetype": null,
            "slug": "ar-vr-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Python Programming",
          "id": 290,
          "rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
          "slug": "python-programming",
          "source": "db"
        },
        "dimension_id": 290,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Python Backend Developer",
            "id": 80,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "python-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "Databricks",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1202,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "ETL and ELT Tooling",
          "id": 24,
          "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
          "slug": "etl-and-elt-tooling",
          "source": "db"
        },
        "dimension_id": 24,
        "input_skill": "PySpark",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": false,
        "skill_id": null,
        "skill_tag": "new",
        "skipped_reason": "skill_not_in_db_v3_proposed"
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "BI and Visualization Tools",
          "id": 31,
          "rationale": "Tools used to expose curated data to analysts and business users through dashboards, reports, and semantic exploration. Data engineers support these tools by shaping reliable datasets and performant models.",
          "slug": "bi-and-visualization-tools",
          "source": "db"
        },
        "dimension_id": 31,
        "input_skill": "Tableau",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 150,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "BI and Visualization Tools",
          "id": 31,
          "rationale": "Tools used to expose curated data to analysts and business users through dashboards, reports, and semantic exploration. Data engineers support these tools by shaping reliable datasets and performant models.",
          "slug": "bi-and-visualization-tools",
          "source": "db"
        },
        "dimension_id": 31,
        "input_skill": "Power BI",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 151,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "AI Governance and Model Security",
          "id": 50,
          "rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
          "slug": "ai-governance-and-model-security",
          "source": "db"
        },
        "dimension_id": 50,
        "input_skill": "Machine Learning",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "AI Engineer",
            "id": 13,
            "rationale": null,
            "role_archetype": null,
            "slug": "ai-engineer",
            "source": "db"
          },
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1356,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "Machine Learning",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1356,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Business Rules and Decision Logic",
          "id": 252,
          "rationale": "Configures the rule layer that drives eligibility, branching, calculations, and policy enforcement in Pega applications. These rules are a coherent cluster because they determine how cases behave without changing the broader application structure.",
          "slug": "business-rules-and-decision-logic",
          "source": "db"
        },
        "dimension_id": 252,
        "input_skill": "Decision Trees",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Pega Developer",
            "id": 24,
            "rationale": null,
            "role_archetype": null,
            "slug": "pega-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": false,
        "skill_id": null,
        "skill_tag": "new",
        "skipped_reason": "skill_not_in_db_v3_proposed"
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Concurrency and Parallel Processing",
          "id": 17,
          "rationale": "Programming techniques for handling multiple requests and background work safely and efficiently. Includes synchronization, async execution, and coordination of concurrent tasks.",
          "slug": "concurrency-and-parallel-processing",
          "source": "db"
        },
        "dimension_id": 17,
        "input_skill": "Clustering",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Java Backend Developer",
            "id": 79,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "java-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Node.js Backend Developer",
            "id": 82,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "node-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Ruby Backend Developer",
            "id": 85,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "ruby-backend-developer",
            "source": "db"
          },
          {
            "display_name": "Scala Backend Developer",
            "id": 87,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "scala-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 162,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Performance and Cost Optimization",
          "id": 33,
          "rationale": "Techniques for improving the speed, reliability, and cost efficiency of data workloads. This includes query tuning, partitioning, file sizing, compute right-sizing, and workload management.",
          "slug": "performance-and-cost-optimization",
          "source": "db"
        },
        "dimension_id": 33,
        "input_skill": "Clustering",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 162,
        "skill_tag": "in_db",
        "skipped_reason": null
      }
    ],
    "new_skills_created": 0,
    "role_dimension_saved": 0,
    "skill_dimension_saved": 0,
    "skipped": 2
  },
  "planner_output": null,
  "run_id": "4dab0597-abaf-4f4b-a68c-d1493be026db"
}

LLM Calls

Every model call made for this run, in pipeline order. Click a card to see the model's response.

Loading…