Pipeline run

f44ceab1-c4b3-4c44-93df-420af9b73fce

Pipeline LLM cost (USD)

API 1: $0.0088 · API 2: $0.0002 · API 3: $0.0000 · Total: $0.0090
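The total can be re-derived from the per-API figures. The following is a minimal sketch that uses `Decimal` to avoid float rounding on four-decimal USD amounts; the dictionary keys are illustrative labels, not field names taken from the pipeline.

```python
from decimal import Decimal

# Per-API LLM costs in USD, copied from the cost line above.
costs = {
    "API 1": Decimal("0.0088"),
    "API 2": Decimal("0.0002"),
    "API 3": Decimal("0.0000"),
}

# Sum with a Decimal start value so the result stays a Decimal.
total = sum(costs.values(), Decimal("0"))
print(f"Total: ${total}")
```

The sum matches the reported total, so nearly all of the cost of this run is attributable to API 1.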

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description

Nature of work · Data Engineering / ML Platform

Build and run GCP-based ML/data pipelines in Python, SQL, Airflow and PySpark; engineer features from large datasets, deploy scalable ML infrastructure/models, and keep CI/CD, Kubernetes, and orchestration reliable.

"develop ML pipeline solutions using airflow, Python and SQL"

Tech stack maturity

Modern Cloud Native

The stack centers on Kubernetes, Google Cloud Platform, CI/CD, Airflow, and modern ML tooling, which is characteristic of a cloud-native MLOps environment.

AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)

1.70 / 5

· Title match

✓ Has AI skill

✓ AI skill (primary)

· AI skill (secondary)

· On AI team

· Builds AI products

vocab breakdown (legacy)

Assistants (×1): —

Frameworks (×2): —

Models / concepts (×3): ML, Machine Learning, Deep Learning

Evidence — skills matched in JD (14)

Python · SQL · Google Cloud Platform · Apache Airflow · PySpark · ETL · Kubernetes · CI/CD · TensorFlow · Kubeflow Pipelines · TFX · Machine Learning · OOP · Databases

Skill cluster (6 dimension groups, role-scoped)

AI Governance and Model Security

Machine Learning

Cloud Provider Platforms

Google Cloud Platform

Kubernetes for ML Workloads

Kubernetes

ML Frameworks and Libraries

TensorFlow

Programming Languages for ML Systems

Python

Cross-cutting / unaligned

SQL · Apache Airflow · PySpark · ETL · CI/CD · Kubeflow Pipelines · TFX · OOP · Databases

KRA description

EXL provides consulting and analytics support to Fortune 500 companies across multiple industry domains. For this role, you will be supporting the data science team of a leading US Media firm. While working on Google Cloud Platform (GCP), you are expected to develop ML pipeline solutions using airflow, Python and SQL.
• Advanced Python (building applications in Python using OOP concepts), advanced SQL skills, and experience with cloud platforms (GCP preferred)
• Experience in developing scalable and robust data pipelines using Python, SQL, PySpark, ETL orchestration tools like Airflow.
• A good understanding of system design, Databases and OOPs concepts
• Work with large, big data sources, focusing on efficient data creation and feature engineering and create pipelines that feed the data science models
• Develop and maintain Build, Continuous Deployment, and Continuous Integration systems.
• Hands on experience in Kubernetes, CI/CD pipelines, continually improve CI/CD tools, processes, and procedures.
• A good understanding of system architecture (pertaining to data pipelines) and the ability to solve complex problems
• Contribute to the team through mentorship, technical methods, improvements in how we work
• Design & build data pipelines and production level ML infrastructure, using tools such as TFX, Kubeflow Pipelines, TensorFlow
• Experience in deploying scalable ML Models in cloud platforms, setting up alerting, restartability etc.
• Deploy ML models under the constraints of scalability, correctness, and maintainability.
• Drive work on creating a state-of-the-art codebase and machine learning lifecycle infrastructure

Signals

Skill · ml-engineer · 0.42

Alias · ml-engineer · 1.00

KRA · ml-engineer · 0.61

Post-classification

Centroid: updated · n=6

Alias collision log: —

New-role queue: —

New skills captured: 6

New KRA captured: —
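The centroid entry above (n=6) suggests that a per-role centroid is refreshed after each classified JD. How the pipeline actually computes it is not shown in this run. One common approach, assumed here purely for illustration, is an incremental mean over embedding vectors; the function name and the toy vectors are hypothetical.

```python
from typing import List

def update_centroid(centroid: List[float], n: int, vec: List[float]) -> List[float]:
    """Fold one new vector into the mean of n previous vectors.

    The returned centroid is the mean over n + 1 vectors.
    """
    if len(centroid) != len(vec):
        raise ValueError("dimension mismatch")
    return [c + (v - c) / (n + 1) for c, v in zip(centroid, vec)]

# Toy 2-d example (not real embeddings): a centroid over 5 vectors absorbs a 6th.
old = [1.0, 2.0]
new = update_centroid(old, 5, [7.0, 2.0])
print(new)
```

The incremental form avoids storing all previous vectors: only the current centroid and the count are needed.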

Captured for admin review

PySpark primary ↔ MLOps Engineer pending

ETL primary ↔ MLOps Engineer pending

Kubeflow Pipelines primary ↔ MLOps Engineer pending

TFX primary ↔ MLOps Engineer pending

OOP ↔ MLOps Engineer pending

Databases ↔ MLOps Engineer pending

Status: completed · Created: 2026-05-27T14:21:16.049754Z · Updated: 2026-05-27T14:23:00.846316Z · API 3 duration: 65406 ms
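The timestamps above allow a rough check of how much of the run API 3 accounts for. This sketch assumes that the Created-to-Updated interval approximates the wall-clock duration of the whole run, which the status line does not state explicitly.

```python
from datetime import datetime

def parse_utc(ts: str) -> datetime:
    # fromisoformat() accepts a trailing "Z" only on Python 3.11+, so normalize it.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

created = parse_utc("2026-05-27T14:21:16.049754Z")
updated = parse_utc("2026-05-27T14:23:00.846316Z")
api3_ms = 65406  # "API 3 duration" from the status line

wall_s = (updated - created).total_seconds()
api3_share = (api3_ms / 1000) / wall_s
print(f"created -> updated: {wall_s:.3f} s; API 3 share: {api3_share:.1%}")
```

Under that assumption, API 3 takes well over half of the elapsed time while contributing $0.0000 to the LLM cost.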

Flow: current 3-step pipeline

1. POST /skills/extract-from-jd

2. POST /skills/extract-details

3. POST /skills/final-role-output
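The three calls above can be chained as in the sketch below. Only the endpoint paths come from this page; the request body key (`jd_text`), the idea that each response is passed on as the next request, and the `post` callable are assumptions made for illustration. A fake transport stands in for a real HTTP client so the example runs offline.

```python
from typing import Any, Callable, Dict, List

# Endpoint paths from the flow above; payload and response shapes are hypothetical.
STEPS: List[str] = [
    "/skills/extract-from-jd",
    "/skills/extract-details",
    "/skills/final-role-output",
]

Post = Callable[[str, Dict[str, Any]], Dict[str, Any]]

def run_pipeline(jd_text: str, post: Post) -> Dict[str, Any]:
    """Call the three steps in order, feeding each response into the next call."""
    payload: Dict[str, Any] = {"jd_text": jd_text}  # hypothetical request body
    for path in STEPS:
        payload = post(path, payload)
    return payload

def fake_post(path: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Offline stand-in for an HTTP client: records which path was called."""
    return {**payload, "trace": payload.get("trace", []) + [path]}

result = run_pipeline("example job description text", fake_post)
print(result["trace"])
```

Injecting the transport keeps the ordering logic testable without network access; a real client would replace `fake_post` with a function that issues the POST and returns the decoded JSON.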

Role Chosen role & resolution

MLOps Engineer

domain · AI / ML CASE DOMAIN

slug: ml-ops-engineer · id: 16 · source: db

Domain=AI / ML; The JD is centered on building and operating ML pipelines and infrastructure with Airflow, Kubernetes, CI/CD, and cloud deployment, which best matches MLOps engineering.

Matched skills

Google Cloud Platform (GCP) · Python · SQL · PySpark · Airflow · Kubernetes · CI/CD pipelines · TFX · Kubeflow Pipelines · TensorFlow · Databases · OOPs concepts

Matched dimensions

ML pipeline engineering · Data pipeline orchestration · Cloud ML infrastructure · CI/CD automation · Scalable model deployment · System design and architecture · Feature engineering for big data · Operational reliability and maintainability

Matched KRAs

• develop ML pipeline solutions using airflow, Python and SQL
• developing scalable and robust data pipelines
• create pipelines that feed the data science models
• Develop and maintain Build, Continuous Deployment, and Continuous Integration systems
• Hands on experience in Kubernetes, CI/CD pipelines
• Design & build data pipelines and production level ML infrastructure
• Experience in deploying scalable ML Models in cloud platforms
• setting up alerting, restartability etc.
• Drive work on creating a state-of-the-art codebase
• machine learning lifecycle infrastructure

Resolution: in_db — role exists in library; skill↔dim and role↔dim links saved when applicable.

New skills

Skill↔dim saved

Role↔dim saved

Skipped

Job description

DevOps/ML Engineer

Role Overview

EXL provides consulting and analytics support to Fortune 500 companies across multiple industry domains. For this role, you will be supporting the data science team of a leading US Media firm. While working on Google Cloud Platform (GCP), you are expected to develop ML pipeline solutions using airflow, Python and SQL.

Required Skills –
• Advanced Python (building applications in Python using OOP concepts), advanced SQL skills, and experience with cloud platforms (GCP preferred)
• Experience in developing scalable and robust data pipelines using Python, SQL, PySpark, ETL orchestration tools like Airflow.
• A good understanding of system design, Databases and OOPs concepts
• Work with large, big data sources, focusing on efficient data creation and feature engineering and create pipelines that feed the data science models
• Develop and maintain Build, Continuous Deployment, and Continuous Integration systems.
• Hands on experience in Kubernetes, CI/CD pipelines, continually improve CI/CD tools, processes, and procedures.
• A good understanding of system architecture (pertaining to data pipelines) and the ability to solve complex problems
• Contribute to the team through mentorship, technical methods, improvements in how we work

Good To Have Skills
• Design & build data pipelines and production level ML infrastructure, using tools such as TFX, Kubeflow Pipelines, TensorFlow
• Experience in deploying scalable ML Models in cloud platforms, setting up alerting, restartability etc.
• Deploy ML models under the constraints of scalability, correctness, and maintainability.
• Drive work on creating a state-of-the-art codebase and machine learning lifecycle infrastructure

Qualifications
• Master's or Bachelor's degree in Computer Science (or) math-heavy degrees from top-tier universities with a strong record of achievement
• Deep knowledge of statistical methods and machine learning, with special emphasis on deep learning algorithms.
• 3+ years of experience in algorithms, machine learning, data science (or) statistics
• Experience solving problems using machine learning frameworks (e.g. PyTorch, TensorFlow)
• Experience with BigQuery; comfortable writing complex SQL queries for data retrieval & transformation
• Proficient in Python / PySpark.
• Prior experience in management consulting and/or analytics-based consulting is a plus
• Experience in building and maintaining data pipelines using Airflow/Jenkins is preferred.

EXL Company Overview

EXL (NASDAQ: EXLS) is a leading operations management and analytics company that designs and enables agile, customer-centric operating models to help clients improve their revenue growth and profitability. Our delivery model provides market-leading business outcomes using EXL’s proprietary Business EXLerator Framework™, cutting-edge analytics, digital transformation and domain expertise. At EXL, we look deeper to help companies improve global operations, enhance data-driven insights, increase customer satisfaction, and manage risk and compliance. EXL serves the insurance, healthcare, banking and financial services, utilities, travel, transportation and logistics industries. Headquartered in New York, New York, EXL has more than 32,000 professionals in locations throughout the United States, Europe, Asia (primarily India and Philippines), South America, Australia and South Africa.

EXL Analytics provides data-driven, action-oriented solutions to business problems through statistical data mining, cutting edge analytics techniques and a consultative approach. Leveraging proprietary methodology and best-of-breed technology, EXL Analytics takes an industry-specific approach to transform our clients’ decision making and embed analytics more deeply into their business processes. Our global footprint of nearly 2,000 data scientists and analysts assist client organizations with complex risk minimization methods, advanced marketing, pricing and CRM strategies, internal cost analysis, and cost and resource optimization within the organization. EXL Analytics serves the insurance, healthcare, banking, capital markets, utilities, retail and e-commerce, travel, transportation and logistics industries.

Please visit www.exlservice.com for more information about EXL Analytics.

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

Python · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)

Canonical: Python id=5 · python

Aliases — catalog

Python (CANONICAL) primary
Python 2 (VERSION)
Python 2.x (VERSION)
Python 3 (VERSION)
Python 3.10 (VERSION)
Python 3.11 (VERSION)
Python 3.12 (VERSION)
Python 3.x (VERSION)
py (VERSION)
py2 (VERSION)
py3 (VERSION)
python 3 (VERSION)
python 3.x (VERSION)
python2 (VERSION)
python3 (VERSION)
python3.x (VERSION)
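An alias catalog like the one above is typically consulted to map a raw mention in a JD to its canonical skill. The sketch below shows one way to do that. The alias pairs are taken from the catalogs listed in this run, while the normalization rules (case-insensitive matching, whitespace collapsing) and the function name are assumptions, not the pipeline's documented behavior.

```python
from typing import Dict, Optional

# A few alias -> canonical pairs taken from the alias catalogs in this run.
ALIASES: Dict[str, str] = {
    "python": "Python",
    "py3": "Python",
    "python 3.x": "Python",
    "k8s": "Kubernetes",
    "tf2": "TensorFlow",
    "spark 3.x": "Apache Spark",
}

def resolve(raw: str) -> Optional[str]:
    """Return the canonical skill name for a raw mention, or None if unknown."""
    key = " ".join(raw.strip().lower().split())  # trim, lowercase, collapse spaces
    return ALIASES.get(key)

print(resolve("  Python  3.X "))
print(resolve("K8s"))
print(resolve("TFX"))
```

A mention with no catalog entry resolves to `None`, which corresponds to the "new / unmatched skill" path seen for ETL, TFX, OOP and Databases in this run.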

Context tags (catalog)

API · Django · FastAPI · Flask · Jupyter · NumPy · PEP 8 · Pandas · REST · SQLAlchemy · asyncio · pandas · pip · pytest · type hints · venv · virtualenv

Stored enrichment (catalog DB)

Category: Language
Sub-category: Programming Language
Vendor: PSF
License: mit
Year introduced: 1991
Confidence: 0.99
Version strategy: SEPARATE_ENTITY
Version tag: 3

Maturity reasoning: Python appears in a very high volume of job descriptions across data, backend, automation, and ML roles, and remains a default hiring-pipeline language on major job boards and tech stacks.

Skill profile (library / DB)

Skill nature: LANGUAGE
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 6
Sub-category id: 96
Extractable: True
Also category: False

Dimensions (API 2 worklist)

Cloud Security Scripting & DSL Languages Catalog dimension db id 248

Library dimension (catalog)

Roles linked in library: Cloud Security Engineer
Programming Languages Catalog dimension db id 1

Library dimension (catalog)

Roles linked in library: Backend Developer, Fullstack Developer, Fullstack Developer
Programming Languages and Scripting Catalog dimension db id 59

Library dimension (catalog)

Roles linked in library: Cyber Security Engineer
Programming Languages for Data Work Catalog dimension db id 21

Library dimension (catalog)

Roles linked in library: Data Engineer
Programming Languages for ML Systems Catalog dimension db id 39

Library dimension (catalog)

Roles linked in library: ML Engineer, MLOps Engineer
Programming Languages for XR Catalog dimension db id 97

Library dimension (catalog)

Roles linked in library: AR/VR Engineer
Python Programming Catalog dimension db id 290

Library dimension (catalog)

Roles linked in library: Python Backend Developer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
Cloud Security Scripting & DSL Languages cloud-security-scripting-dsl-languages	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages programming-languages	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages and Scripting programming-languages-and-scripting	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages for Data Work programming-languages-for-data-work	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages for ML Systems programming-languages-for-ml-systems	✓	✓	Existing dimension (library) · Role↔dimension saved
Programming Languages for XR programming-languages-for-xr	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python Programming python-programming	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

SQL · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)

Canonical: SQL id=101 · sql

Aliases — catalog

SQL (CANONICAL) primary

Context tags (catalog)

ACID · CTE · DDL · DML · ETL · JOIN · MySQL · NoSQL · OLAP · ORM · PostgreSQL · SQL injection · SQLite · T-SQL · data modeling · data warehousing · database normalization · execution plan · indexing · joins · normalization · query optimization · stored procedures · subquery · transaction isolation · transaction management · window functions

Stored enrichment (catalog DB)

Category: Language
Sub-category: Query Language
Vendor: ANSI
License: unknown
Year introduced: 1974
Confidence: 0.99
Version strategy: NOT_APPLICABLE

Maturity reasoning: SQL appears in a large share of data, backend, and analytics job descriptions and remains the default query language for PostgreSQL, MySQL, and cloud warehouses like Snowflake/BigQuery.

Skill profile (library / DB)

Skill nature: LANGUAGE
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 6
Sub-category id: 97
Extractable: True
Also category: False

Dimensions (API 2 worklist)

Pega Programming Languages & DSLs Catalog dimension db id 267

Library dimension (catalog)

Roles linked in library: Pega Developer
Programming Languages for Data Work Catalog dimension db id 21

Library dimension (catalog)

Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
Pega Programming Languages & DSLs pega-programming-languages-dsls	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages for Data Work programming-languages-for-data-work	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

Google Cloud Platform · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)

Canonical: Google Cloud Platform id=425 · google-cloud-platform

Aliases — catalog

Google Cloud Platform (CANONICAL) primary

Context tags (catalog)

Anthos · App Engine · Artifact Registry · BigQuery · Cloud Build · Cloud Functions · Cloud Monitoring · Cloud Pub/Sub · Cloud Run · Cloud SQL · Cloud Spanner · Cloud Storage · Compute Engine · Dataflow · Dataproc · GCP · GKE · IAM · Kubernetes · Kubernetes Engine · Pub/Sub · Serverless · Stackdriver · Terraform · VPC

Stored enrichment (catalog DB)

Category: Platform
Sub-category: Cloud Platform
Vendor: Google
License: other_open
Year introduced: 2008
Confidence: 0.99
Version strategy: NOT_APPLICABLE

Maturity reasoning: GCP appears in many cloud-engineering job descriptions alongside AWS/Azure, and Google continues expanding managed services and certifications, indicating broad hiring demand rather than niche use.

Skill profile (library / DB)

Skill nature: PLATFORM
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 9
Sub-category id: 46
Extractable: True
Also category: False

Dimensions (API 2 worklist)

Cloud & Hosting Providers Catalog dimension db id 414

Library dimension (catalog)

Roles linked in library: PHP Backend Developer
Cloud Provider Platforms Catalog dimension db id 131

Library dimension (catalog)

Roles linked in library: Cloud Architect, Cloud Security Engineer
Cloud Security Posture Tools Catalog dimension db id 64

Library dimension (catalog)

Roles linked in library: Cloud Security Engineer, Cyber Security Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
Cloud & Hosting Providers cloud-hosting-providers	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Cloud Provider Platforms cloud-provider-platforms	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Cloud Security Posture Tools cloud-security-posture-tools	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

Apache Airflow · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)

Canonical: Apache Airflow id=110 · apache-airflow

Aliases — catalog

Apache Airflow (CANONICAL) primary

Context tags (catalog)

CeleryExecutor · DAG · ETL · KubernetesExecutor · Sensors · XCom · backfill · catchup · cron · data pipelines · executor · hooks · operators · scheduler · task dependencies

Stored enrichment (catalog DB)

Category: Tool
Sub-category: Workflow Orchestration Tool
Vendor: Apache Software Foundation
License: apache_2
Year introduced: 2015
Confidence: 0.98
Version strategy: NOT_APPLICABLE

Maturity reasoning: Frequently listed in data engineering JDs and widely adopted for workflow orchestration; strong GitHub activity and managed offerings from AWS/GCP/Azure signal broad market demand.

Skill profile (library / DB)

Skill nature: TOOL
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 13
Sub-category id: 130
Extractable: True
Also category: False

Dimensions (API 2 worklist)

Data Pipeline Orchestration Catalog dimension db id 23

Library dimension (catalog)

Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
Data Pipeline Orchestration data-pipeline-orchestration	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

PySpark · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)

Canonical: Apache Spark id=1350 · apache-spark

Aliases — catalog

Apache Spark (CANONICAL)
apache spark 3 (VERSION)
spark (VERSION)
spark 3 (VERSION)
spark 3.x (VERSION)
spark3 (VERSION)

Context tags (catalog)

Apache Kafka · Cluster Manager · DAGScheduler · Data Lake · DataFrame · ETL · Hadoop · MLlib · Machine Learning · PySpark · RDD · Scala · Spark SQL · Spark Streaming · SparkSession

Stored enrichment (catalog DB)

Category: Framework
Sub-category: Distributed Data Processing Framework
Vendor: Apache Software Foundation
License: apache_2
Year introduced: 2010
Confidence: 0.94
Version strategy: SEPARATE_ENTITY
Version tag: 3.x

Maturity reasoning: Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it.

Skill profile (library / DB)

Skill nature: FRAMEWORK
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 5
Sub-category id: 1021
Extractable: True
Also category: False

Dimensions (API 2 worklist)

ETL and ELT Tooling Catalog dimension db id 24

Library dimension (catalog)

Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
ETL and ELT Tooling etl-and-elt-tooling	—	—	Skipped — no persistable v3 meta for new skill skill_not_in_db_v3_proposed

ETL · Primary · New / orchestrated · API 3: new canonical path (new) · New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Data Engineering Tools
Sub-category: general
Skill nature: PRACTICE
Volatility: MEDIUM
Typical lifespan: MULTI_YEAR
Version strategy: UNVERSIONED

Kubernetes · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)

Canonical: Kubernetes id=726 · kubernetes

Aliases — catalog

Kubernetes (CANONICAL) primary
Kubernetes 1.0+ (VERSION)
Kubernetes 1.x (VERSION)
Kubernetes v1 (VERSION)
k8s (VERSION)
kubernetes 1.x (VERSION)
kubernetes latest (VERSION)

Context tags (catalog)

CI/CD · Cluster Autoscaler · ConfigMap · DaemonSet · Deployment · Docker · Grafana · Helm · Ingress · Istio · K8s · Kubelet · Namespace · Pod · Prometheus · RBAC · Secret · Service · StatefulSet · containerization · deployment · etcd · kubectl · load balancing · microservices · namespace · orchestration · persistent storage · scalability · service mesh

Stored enrichment (catalog DB)

Category: Platform
Sub-category: Container Orchestration Platform
Vendor: Cloud Native Computing Foundation
License: apache_2
Year introduced: 2014
Confidence: 0.90
Version strategy: SEPARATE_ENTITY
Version tag: 1.30

Maturity reasoning: Broadly adopted in cloud-native stacks; Kubernetes appears in a large share of DevOps/SRE job descriptions and is the default orchestration platform across major cloud vendors.

Skill profile (library / DB)

Skill nature: PLATFORM
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 9
Sub-category id: 557
Extractable: True
Also category: False

Dimensions (API 2 worklist)

Container Orchestration Platforms Catalog dimension db id 134

Library dimension (catalog)

Roles linked in library: Cloud Architect, DevOps Engineer
Kubernetes for ML Workloads Catalog dimension db id 47

Library dimension (catalog)

Roles linked in library: ML Engineer, MLOps Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
Container Orchestration Platforms container-orchestration-platforms	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Kubernetes for ML Workloads kubernetes-for-ml-workloads	✓	✓	Existing dimension (library) · Role↔dimension saved

CI/CD · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)

Canonical: CI/CD id=1190 · ci-cd

Aliases — catalog

CI/CD (CANONICAL)

Context tags (catalog)

Ansible · CircleCI · Docker · GitLab CI · Jenkins · Kubernetes · Terraform · Travis CI · automated testing · build automation · continuous deployment · continuous integration · deployment pipelines · monitoring · version control

Stored enrichment (catalog DB)

Category: Methodology
Sub-category: Ci Cd Process
Confidence: 0.93
Version strategy: NOT_APPLICABLE

Maturity reasoning: CI/CD appears in a large share of software engineering JDs and is a standard requirement across DevOps, platform, and backend roles; major vendors like GitHub, GitLab, and AWS all center product roadmaps on CI/CD pipelines.

Skill profile (library / DB)

Skill nature: METHODOLOGY
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 8
Sub-category id: 900
Extractable: True
Also category: False

Dimensions (API 2 worklist)

CI/CD Pipeline Platforms Catalog dimension db id 150

Library dimension (catalog)

Roles linked in library: DevOps Engineer
CI/CD for Machine Learning Catalog dimension db id 56

Library dimension (catalog)

Roles linked in library: ML Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
CI/CD Pipeline Platforms ci-cd-pipeline-platforms	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
CI/CD for Machine Learning ci-cd-for-machine-learning	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

TensorFlow · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)

Canonical: TensorFlow id=196 · tensorflow

Aliases — catalog

TensorFlow (CANONICAL) primary
TF1 (VERSION)
TF2 (VERSION)
TensorFlow 1 (VERSION)
TensorFlow 1.x (VERSION)
TensorFlow 2 (VERSION)
TensorFlow 2.x (VERSION)
tensorflow 1 (VERSION)
tensorflow 1.x (VERSION)
tensorflow 2 (VERSION)
tensorflow 2.x (VERSION)
tensorflow v1 (VERSION)
tensorflow v2 (VERSION)
tf (VERSION)
tf1 (VERSION)
tf2 (VERSION)

Context tags (catalog)

AutoGraph · Distributed Training · Eager Execution · Estimator · GPU · Gradient Descent · Hyperparameter Tuning · Keras · ModelCheckpoint · Neural Networks · ONNX · SavedModel · TF Lite · TF Serving · TF.js · TFX · TPU · TensorBoard · TensorFlow Hub · TensorFlow Lite · TensorFlow Serving · Transfer Learning · XLA · tf.data · tf.keras

Stored enrichment (catalog DB)

Category: Library
Sub-category: Machine Learning Library
Vendor: Google
License: apache_2
Year introduced: 2015
Confidence: 0.90
Version strategy: SEPARATE_ENTITY
Version tag: 2.x

Maturity reasoning: TensorFlow appears in many ML/AI job descriptions and remains a standard production framework, with strong GitHub activity and broad vendor support from Google and cloud platforms.

Skill profile (library / DB)

Skill nature: LIBRARY
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 7
Sub-category id: 156
Extractable: True
Also category: False

Dimensions (API 2 worklist)

ML Frameworks and Libraries Catalog dimension db id 40

Library dimension (catalog)

Roles linked in library: ML Engineer, MLOps Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
ML Frameworks and Libraries ml-frameworks-and-libraries	✓	✓	Existing dimension (library) · Role↔dimension saved

Kubeflow Pipelines · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)

Canonical: Kubeflow id=213 · kubeflow

Aliases — catalog

Kubeflow (CANONICAL) primary
Kubeflow 1.x (VERSION)
Kubeflow 2.x (VERSION)
Kubeflow v1 (VERSION)
Kubeflow v2 (VERSION)

Context tags (catalog)

Argo · Argo Workflows · CI/CD · Data preprocessing · GPU scheduling · Hyperparameter tuning · Istio · Jupyter notebooks · KFServing · Katib · Kubeflow Pipelines · Kubeflow Training · Kubernetes · ML pipelines · MLOps · MLflow · MinIO · Model serving · Pipeline components · PyTorch · Seldon · Seldon Core · TensorFlow · model serving

Stored enrichment (catalog DB)

Category: Framework
Sub-category: Mlops Framework
Vendor: Google
License: apache_2
Year introduced: 2017
Confidence: 0.90
Version strategy: NOT_APPLICABLE

Maturity reasoning: Kubeflow appears in some MLOps/ML platform JDs, but far less often than Kubernetes or managed ML platforms; GitHub activity is steady yet adoption remains specialized to ML infrastructure teams.

Skill profile (library / DB)

Skill nature: FRAMEWORK
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 5
Sub-category id: 1127
Extractable: True
Also category: False

Dimensions (API 2 worklist)

MLOps Platforms and Lifecycle Catalog dimension db id 43

Library dimension (catalog)

Roles linked in library: ML Engineer, MLOps Engineer

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
MLOps Platforms and Lifecycle mlops-platforms-and-lifecycle	—	—	Skipped — no persistable v3 meta for new skill skill_not_in_db_v3_proposed

TFX · Primary · New / orchestrated · API 3: new canonical path (new) · New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Machine Learning Frameworks
Sub-category: general
Skill nature: TOOL
Volatility: FAST
Typical lifespan: SHORT_LIVED
Version strategy: VERSIONED

Machine Learning · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)

Canonical: Machine Learning id=1356 · machine-learning

Aliases — catalog

Machine Learning (CANONICAL)

Context tags (catalog)

Keras · PyTorch · TensorFlow · cross-validation · data preprocessing · ensemble methods · feature engineering · hyperparameter tuning · model evaluation · natural language processing · neural networks · reinforcement learning · scikit-learn · supervised learning · unsupervised learning

Stored enrichment (catalog DB)

Category: Concept
Sub-category: Machine Learning
Confidence: 0.98
Version strategy: NOT_APPLICABLE

Maturity reasoning: Machine Learning appears in large volumes of job descriptions across data, product, and platform roles, and major cloud vendors (AWS, Google Cloud, Azure) offer dedicated ML services and certifications, indicating broad adoption.

Skill profile (library / DB)

Skill nature: CONCEPT
Volatility: STABLE
Typical lifespan: EVERGREEN
Category id: 2
Sub-category id: 1024
Extractable: True
Also category: False

Dimensions (API 2 worklist)

AI Governance and Model Security Catalog dimension db id 50

Library dimension (catalog)

Roles linked in library: AI Engineer, ML Engineer, MLOps Engineer
React Frontend Development Catalog dimension db id 96

Library dimension (catalog)

API 3 link attempts (this skill)

Dimension	Skill↔dim	Role↔dim	Outcome
AI Governance and Model Security ai-governance-and-model-security	✓	✓	Existing dimension (library) · Role↔dimension saved
React Frontend Development d_init_01	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

OOP · Secondary · New / orchestrated · API 3: new canonical path (new) · New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Concepts
Sub-category: general
Skill nature: CONCEPT
Volatility: STABLE
Typical lifespan: EVERGREEN
Version strategy: UNVERSIONED

Databases · Secondary · New / orchestrated · API 3: new canonical path (new) · New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields

Category: Databases
Sub-category: general
Skill nature: CONCEPT
Volatility: STABLE
Typical lifespan: EVERGREEN
Version strategy: UNVERSIONED

All API 3 persistence rows

Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.

Skill	Tag	Dimension	Skill↔dim	Role↔dim	Outcome	Notes
Python	in_db	Cloud Security Scripting & DSL Languages cloud-security-scripting-dsl-languages	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python	in_db	Programming Languages programming-languages	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python	in_db	Programming Languages and Scripting programming-languages-and-scripting	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python	in_db	Programming Languages for Data Work programming-languages-for-data-work	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python	in_db	Programming Languages for ML Systems programming-languages-for-ml-systems	✓	✓	Existing dimension (library) · Role↔dimension saved
Python	in_db	Programming Languages for XR programming-languages-for-xr	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python	in_db	Python Programming python-programming	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
SQL	in_db	Pega Programming Languages & DSLs pega-programming-languages-dsls	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
SQL	in_db	Programming Languages for Data Work programming-languages-for-data-work	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Google Cloud Platform	in_db	Cloud & Hosting Providers cloud-hosting-providers	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Google Cloud Platform	in_db	Cloud Provider Platforms cloud-provider-platforms	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Google Cloud Platform	in_db	Cloud Security Posture Tools cloud-security-posture-tools	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Apache Airflow	in_db	Data Pipeline Orchestration data-pipeline-orchestration	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
PySpark	new	ETL and ELT Tooling etl-and-elt-tooling	—	—	Skipped — no persistable v3 meta for new skill	skill_not_in_db_v3_proposed
Kubernetes	in_db	Container Orchestration Platforms container-orchestration-platforms	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Kubernetes	in_db	Kubernetes for ML Workloads kubernetes-for-ml-workloads	✓	✓	Existing dimension (library) · Role↔dimension saved
CI/CD	in_db	CI/CD Pipeline Platforms ci-cd-pipeline-platforms	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
CI/CD	in_db	CI/CD for Machine Learning ci-cd-for-machine-learning	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
TensorFlow	in_db	ML Frameworks and Libraries ml-frameworks-and-libraries	✓	✓	Existing dimension (library) · Role↔dimension saved
Kubeflow Pipelines	new	MLOps Platforms and Lifecycle mlops-platforms-and-lifecycle	—	—	Skipped — no persistable v3 meta for new skill	skill_not_in_db_v3_proposed
Machine Learning	in_db	AI Governance and Model Security ai-governance-and-model-security	✓	✓	Existing dimension (library) · Role↔dimension saved
Machine Learning	in_db	React Frontend Development d_init_01	✓	—	Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

Library artifacts (this run)

Kind	Detail	DB id
canonical_skill_proposed	ETL | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed	TFX | type=Machine Learning Frameworks subtype=general nature=TOOL lifespan=SHORT_LIVED
canonical_skill_proposed	OOP | type=Concepts subtype=general nature=CONCEPT lifespan=EVERGREEN
canonical_skill_proposed	Databases | type=Databases subtype=general nature=CONCEPT lifespan=EVERGREEN
dimension_skill_link_proposed	PySpark ↔ ETL and ELT Tooling
dimension_skill_link_proposed	Kubeflow Pipelines ↔ MLOps Platforms and Lifecycle
role_dimension_link_proposed	MLOps Engineer ↔ MLOps Platforms and Lifecycle

nano JD Parser — gpt-4.1-nano

Role: DevOps/ML Engineer

Company: EXL

Experience: 3+ Years of experience in algorithms, machine learning, data science (or) statistics

Domain: IT Services & Consulting

JD type: pass

{
  "JD_type": "pass",
  "about_company": {
    "source_marker": {
      "first_5_words": "EXL (NASDAQ: EXLS) is a",
      "last_5_words": "in South America, Australia and South Africa."
    },
    "text": "EXL (NASDAQ: EXLS) is a leading operations management and analytics company that designs and enables agile, customer-centric operating models to help clients improve their revenue growth and profitability. Our delivery model provides market-leading business outcomes using EXL\u2019s proprietary Business EXLerator Framework\u2122, cutting-edge analytics, digital transformation and domain expertise. At EXL, we look deeper to help companies improve global operations, enhance data-driven insights, increase customer satisfaction, and manage risk and compliance. EXL serves the insurance, healthcare, banking and financial services, utilities, travel, transportation and logistics industries. Headquartered in New York, New York, EXL has more than 32,000 professionals in locations throughout the United States, Europe, Asia (primarily India and Philippines), South America, Australia and South Africa.",
    "word_count": 100
  },
  "certifications": [],
  "company_name": "EXL",
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [
        "Analytics",
        "Consulting"
      ],
      "domain": "IT Services \u0026 Consulting"
    },
    "secondary": null
  },
  "education": [
    {
      "level": "Bachelor\u0027s",
      "qualification": "Master\u0027s/Bachelor\u0027s - Computer Science (or) Math Heavy Degrees",
      "raw": "Masters or Bachelor\u0027s degree in Computer Science (or) math heavy degrees from top-tier universities with strong record of achievement",
      "requirement": "required"
    }
  ],
  "experience": {
    "max": null,
    "min": 3,
    "raw": "3+ Years of experience in algorithms, machine learning, data science (or) statistics"
  },
  "job_locations": [],
  "role": "DevOps/ML Engineer",
  "role_aliases": [
    "ML Engineer",
    "DevOps Engineer",
    "Machine Learning Engineer"
  ],
  "role_archetype": "Engineering",
  "roles_and_responsibilities": [
    {
      "bullet_count": 0,
      "heading": "Role Overview",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "EXL provides consulting and analytics",
        "last_5_words": "using airflow, Python and SQL."
      },
      "text": "EXL provides consulting and analytics support to fortune 500 companies across multiple industry domains. For this role, you will be supporting the data science team of a leading US Media firm. While working on Google Cloud Platform (GCP), you are expected to develop ML pipeline solutions using airflow, Python and SQL.",
      "word_count": 51
    },
    {
      "bullet_count": 8,
      "heading": "Required Skills",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 Advanced Python (building applications",
        "last_5_words": "improvements in how we work"
      },
      "text": "\u2022 Advanced Python (building applications using python code OOPs concepts), advanced SQL skills and experience with Cloud Platforms (GCP preferred)\n\u2022 Experience in developing scalable and robust data pipelines using Python, SQL, PySpark, ETL orchestration tools like Airflow.\n\u2022 A good understanding of system design, Databases and OOPs concepts\n\u2022 Work with large, big data sources, focusing on efficient data creation and feature engineering and create pipelines that feed the data science models\n\u2022 Develop and maintain Build, Continuous Deployment, and Continuous Integration systems.\n\u2022 Hands on experience in Kubernetes, CI/CD pipelines, continually improve CI/CD tools, processes, and procedures.\n\u2022 A good understanding of system architecture (pertaining to data pipelines) and the availability to solve complex problems\n\u2022 Contribute to the team through mentorship, technical methods, improvements in how we work",
      "word_count": 116
    },
    {
      "bullet_count": 4,
      "heading": "Good To Have Skills",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 Design \u0026 build data pipelines",
        "last_5_words": "machine learning lifecycle infrastructure"
      },
      "text": "\u2022 Design \u0026 build data pipelines and production level ML infrastructure, using tools such as TFX, , Kubeflow Pipelines, TensorFlow\n\u2022 Experience in deploying scalable ML Models in cloud platforms, setting up alerting, restartability etc.\n\u2022 Deploy ML models under the constraints of scalability, correctness, and maintainability.\n\u2022 Drive work on creating a state-of-the-art codebase and machine learning lifecycle infrastructure",
      "word_count": 66
    }
  ],
  "urls": [
    {
      "type": "website",
      "url": "http://www.exlservice.com"
    }
  ]
}

API 1 — extract-from-jd

{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Python"
    },
    {
      "is_primary": true,
      "skill_name": "SQL"
    },
    {
      "is_primary": true,
      "skill_name": "Google Cloud Platform"
    },
    {
      "is_primary": true,
      "skill_name": "Apache Airflow"
    },
    {
      "is_primary": true,
      "skill_name": "PySpark"
    },
    {
      "is_primary": true,
      "skill_name": "ETL"
    },
    {
      "is_primary": true,
      "skill_name": "Kubernetes"
    },
    {
      "is_primary": true,
      "skill_name": "CI/CD"
    },
    {
      "is_primary": true,
      "skill_name": "TensorFlow"
    },
    {
      "is_primary": true,
      "skill_name": "Kubeflow Pipelines"
    },
    {
      "is_primary": true,
      "skill_name": "TFX"
    },
    {
      "is_primary": true,
      "skill_name": "Machine Learning"
    },
    {
      "is_primary": false,
      "skill_name": "OOP"
    },
    {
      "is_primary": false,
      "skill_name": "Databases"
    }
  ],
  "jd_role": {
    "display_name": "DevOps/ML Engineer",
    "rationale": null,
    "role_aliases": [
      "ML Engineer",
      "DevOps Engineer",
      "Machine Learning Engineer"
    ],
    "role_archetype": "Engineering",
    "slug": ""
  },
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": {
      "source_marker": {
        "first_5_words": "EXL (NASDAQ: EXLS) is a",
        "last_5_words": "in South America, Australia and South Africa."
      },
      "text": "EXL (NASDAQ: EXLS) is a leading operations management and analytics company that designs and enables agile, customer-centric operating models to help clients improve their revenue growth and profitability. Our delivery model provides market-leading business outcomes using EXL\u2019s proprietary Business EXLerator Framework\u2122, cutting-edge analytics, digital transformation and domain expertise. At EXL, we look deeper to help companies improve global operations, enhance data-driven insights, increase customer satisfaction, and manage risk and compliance. EXL serves the insurance, healthcare, banking and financial services, utilities, travel, transportation and logistics industries. Headquartered in New York, New York, EXL has more than 32,000 professionals in locations throughout the United States, Europe, Asia (primarily India and Philippines), South America, Australia and South Africa.",
      "word_count": 100
    },
    "certifications": [],
    "company_name": "EXL",
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [
          "Analytics",
          "Consulting"
        ],
        "domain": "IT Services \u0026 Consulting"
      },
      "secondary": null
    },
    "education": [
      {
        "level": "Bachelor\u0027s",
        "qualification": "Master\u0027s/Bachelor\u0027s - Computer Science (or) Math Heavy Degrees",
        "raw": "Masters or Bachelor\u0027s degree in Computer Science (or) math heavy degrees from top-tier universities with strong record of achievement",
        "requirement": "required"
      }
    ],
    "experience": {
      "max": null,
      "min": 3,
      "raw": "3+ Years of experience in algorithms, machine learning, data science (or) statistics"
    },
    "job_locations": [],
    "role": "DevOps/ML Engineer",
    "role_aliases": [
      "ML Engineer",
      "DevOps Engineer",
      "Machine Learning Engineer"
    ],
    "role_archetype": "Engineering",
    "roles_and_responsibilities": [
      {
        "bullet_count": 0,
        "heading": "Role Overview",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "EXL provides consulting and analytics",
          "last_5_words": "using airflow, Python and SQL."
        },
        "text": "EXL provides consulting and analytics support to fortune 500 companies across multiple industry domains. For this role, you will be supporting the data science team of a leading US Media firm. While working on Google Cloud Platform (GCP), you are expected to develop ML pipeline solutions using airflow, Python and SQL.",
        "word_count": 51
      },
      {
        "bullet_count": 8,
        "heading": "Required Skills",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 Advanced Python (building applications",
          "last_5_words": "improvements in how we work"
        },
        "text": "\u2022 Advanced Python (building applications using python code OOPs concepts), advanced SQL skills and experience with Cloud Platforms (GCP preferred)\n\u2022 Experience in developing scalable and robust data pipelines using Python, SQL, PySpark, ETL orchestration tools like Airflow.\n\u2022 A good understanding of system design, Databases and OOPs concepts\n\u2022 Work with large, big data sources, focusing on efficient data creation and feature engineering and create pipelines that feed the data science models\n\u2022 Develop and maintain Build, Continuous Deployment, and Continuous Integration systems.\n\u2022 Hands on experience in Kubernetes, CI/CD pipelines, continually improve CI/CD tools, processes, and procedures.\n\u2022 A good understanding of system architecture (pertaining to data pipelines) and the availability to solve complex problems\n\u2022 Contribute to the team through mentorship, technical methods, improvements in how we work",
        "word_count": 116
      },
      {
        "bullet_count": 4,
        "heading": "Good To Have Skills",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 Design \u0026 build data pipelines",
          "last_5_words": "machine learning lifecycle infrastructure"
        },
        "text": "\u2022 Design \u0026 build data pipelines and production level ML infrastructure, using tools such as TFX, , Kubeflow Pipelines, TensorFlow\n\u2022 Experience in deploying scalable ML Models in cloud platforms, setting up alerting, restartability etc.\n\u2022 Deploy ML models under the constraints of scalability, correctness, and maintainability.\n\u2022 Drive work on creating a state-of-the-art codebase and machine learning lifecycle infrastructure",
        "word_count": 66
      }
    ],
    "urls": [
      {
        "type": "website",
        "url": "http://www.exlservice.com"
      }
    ]
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "f44ceab1-c4b3-4c44-93df-420af9b73fce",
  "stage3_signals": {
    "alias_found": true,
    "alias_match_roles": [
      {
        "display_name": "ML Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 3,
        "score": 1.0,
        "slug": "ml-engineer",
        "total_count": null
      },
      {
        "display_name": "DevOps Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 10,
        "score": 1.0,
        "slug": "devops-engineer",
        "total_count": null
      }
    ],
    "kra_match_roles": [
      {
        "display_name": "ML Engineer",
        "kra_matches": [
          {
            "kra_text": "Designs end-to-end ML training pipelines and model inference workflows using TensorFlow, PyTorch, or scikit-learn on cloud ML platforms.",
            "sentence": "Design \u0026 build data pipelines and production level ML infrastructure, using tools such as TFX, , Kubeflow Pipelines, TensorFlow",
            "similarity": 0.6663
          },
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "Work with large, big data sources, focusing on efficient data creation and feature engineering and create pipelines that feed the data science models",
            "similarity": 0.6255
          },
          {
            "kra_text": "Designs end-to-end ML training pipelines and model inference workflows using TensorFlow, PyTorch, or scikit-learn on cloud ML platforms.",
            "sentence": "While working on Google Cloud Platform (GCP), you are expected to develop ML pipeline solutions using airflow, Python and SQL.",
            "similarity": 0.536
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 3,
        "score": 0.6093,
        "slug": "ml-engineer",
        "total_count": null
      },
      {
        "display_name": "Data Engineer",
        "kra_matches": [
          {
            "kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
            "sentence": "Design \u0026 build data pipelines and production level ML infrastructure, using tools such as TFX, , Kubeflow Pipelines, TensorFlow",
            "similarity": 0.6139
          },
          {
            "kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
            "sentence": "Work with large, big data sources, focusing on efficient data creation and feature engineering and create pipelines that feed the data science models",
            "similarity": 0.6114
          },
          {
            "kra_text": "Optimizes pipeline throughput, partitioning strategies, and query performance across cloud data warehouses like Snowflake, BigQuery, or Redshift.",
            "sentence": "While working on Google Cloud Platform (GCP), you are expected to develop ML pipeline solutions using airflow, Python and SQL.",
            "similarity": 0.5164
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 0.5806,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "DevOps Engineer",
        "kra_matches": [
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Develop and maintain Build, Continuous Deployment, and Continuous Integration systems.",
            "similarity": 0.6668
          },
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Hands on experience in Kubernetes, CI/CD pipelines, continually improve CI/CD tools, processes, and procedures.",
            "similarity": 0.603
          },
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Contribute to the team through mentorship, technical methods, improvements in how we work",
            "similarity": 0.4644
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 10,
        "score": 0.5781,
        "slug": "devops-engineer",
        "total_count": null
      },
      {
        "display_name": "Fullstack Developer",
        "kra_matches": [
          {
            "kra_text": "Delivers features through CI/CD pipelines using automated tests, staged rollouts, feature flags, and incremental deployments.",
            "sentence": "Develop and maintain Build, Continuous Deployment, and Continuous Integration systems.",
            "similarity": 0.5858
          },
          {
            "kra_text": "Delivers features through CI/CD pipelines using automated tests, staged rollouts, feature flags, and incremental deployments.",
            "sentence": "Hands on experience in Kubernetes, CI/CD pipelines, continually improve CI/CD tools, processes, and procedures.",
            "similarity": 0.5646
          },
          {
            "kra_text": "Designs and queries relational databases like PostgreSQL and document stores like MongoDB, writing migrations, indexes, and optimized queries.",
            "sentence": "A good understanding of system design, Databases and OOPs concepts",
            "similarity": 0.4875
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 15,
        "score": 0.546,
        "slug": "full-stack-engineer",
        "total_count": null
      },
      {
        "display_name": "MLOps Engineer",
        "kra_matches": [
          {
            "kra_text": "Orchestrates model serving deployments to production using Kubernetes, MLflow Model Registry, SageMaker, or Kubeflow Serving infrastructure.",
            "sentence": "Design \u0026 build data pipelines and production level ML infrastructure, using tools such as TFX, , Kubeflow Pipelines, TensorFlow",
            "similarity": 0.5843
          },
          {
            "kra_text": "Automates ML platform operations including scheduled retraining triggers, pipeline orchestration, evaluation workflows, and alerting configuration.",
            "sentence": "While working on Google Cloud Platform (GCP), you are expected to develop ML pipeline solutions using airflow, Python and SQL.",
            "similarity": 0.5022
          },
          {
            "kra_text": "Automates ML platform operations including scheduled retraining triggers, pipeline orchestration, evaluation workflows, and alerting configuration.",
            "sentence": "Drive work on creating a state-of-the-art codebase and machine learning lifecycle infrastructure",
            "similarity": 0.4862
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 16,
        "score": 0.5243,
        "slug": "ml-ops-engineer",
        "total_count": null
      }
    ],
    "skill_match_roles": [
      {
        "display_name": "ML Engineer",
        "kra_matches": null,
        "matched_count": 5,
        "matched_skills": [
          "CI/CD",
          "Kubernetes",
          "Machine Learning",
          "Python",
          "TensorFlow"
        ],
        "role_id": 3,
        "score": 0.4167,
        "slug": "ml-engineer",
        "total_count": 12
      },
      {
        "display_name": "MLOps Engineer",
        "kra_matches": null,
        "matched_count": 4,
        "matched_skills": [
          "Kubernetes",
          "Machine Learning",
          "Python",
          "TensorFlow"
        ],
        "role_id": 16,
        "score": 0.3333,
        "slug": "ml-ops-engineer",
        "total_count": 12
      },
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": 3,
        "matched_skills": [
          "Apache Airflow",
          "Python",
          "SQL"
        ],
        "role_id": 2,
        "score": 0.25,
        "slug": "data-engineer",
        "total_count": 12
      },
      {
        "display_name": "Cyber Security Engineer",
        "kra_matches": null,
        "matched_count": 2,
        "matched_skills": [
          "Google Cloud Platform",
          "Python"
        ],
        "role_id": 5,
        "score": 0.1667,
        "slug": "cybersecurity-engineer",
        "total_count": 12
      },
      {
        "display_name": "Cloud Architect",
        "kra_matches": null,
        "matched_count": 2,
        "matched_skills": [
          "Google Cloud Platform",
          "Kubernetes"
        ],
        "role_id": 9,
        "score": 0.1667,
        "slug": "cloud-architect",
        "total_count": 12
      }
    ]
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "DOMAIN",
    "chosen_role": {
      "display_name": "MLOps Engineer",
      "kra_matches": null,
      "matched_count": null,
      "matched_skills": null,
      "role_id": 16,
      "score": 0.94,
      "slug": "ml-ops-engineer",
      "total_count": null
    },
    "confidence": 0.94,
    "is_new_role": false,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "matched_dimensions": [
      "ML pipeline engineering",
      "Data pipeline orchestration",
      "Cloud ML infrastructure",
      "CI/CD automation",
      "Scalable model deployment",
      "System design and architecture",
      "Feature engineering for big data",
      "Operational reliability and maintainability"
    ],
    "matched_kras": [
      "develop ML pipeline solutions using airflow, Python and SQL",
      "developing scalable and robust data pipelines",
      "create pipelines that feed the data science models",
      "Develop and maintain Build, Continuous Deployment, and Continuous Integration systems",
      "Hands on experience in Kubernetes, CI/CD pipelines",
      "Design \u0026 build data pipelines and production level ML infrastructure",
      "Experience in deploying scalable ML Models in cloud platforms",
      "setting up alerting, restartability etc.",
      "Drive work on creating a state-of-the-art codebase",
      "machine learning lifecycle infrastructure"
    ],
    "matched_skills": [
      "Google Cloud Platform (GCP)",
      "Python",
      "SQL",
      "PySpark",
      "Airflow",
      "Kubernetes",
      "CI/CD pipelines",
      "TFX",
      "Kubeflow Pipelines",
      "TensorFlow",
      "Databases",
      "OOPs concepts"
    ],
    "new_role_display_name": null,
    "new_role_slug": null,
    "queued": false,
    "reasoning": "Domain=AI / ML; The JD is centered on building and operating ML pipelines and infrastructure with Airflow, Kubernetes, CI/CD, and cloud deployment, which best matches MLOps engineering.",
    "sub_role": null
  },
  "stage5_updates": {
    "centroid_n_after": 6,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": null,
    "new_skills_attached": [
      {
        "is_primary": true,
        "queue_id": 8496,
        "role_display_name": "MLOps Engineer",
        "role_slug": "ml-ops-engineer",
        "skill_name": "PySpark",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 8497,
        "role_display_name": "MLOps Engineer",
        "role_slug": "ml-ops-engineer",
        "skill_name": "ETL",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 8498,
        "role_display_name": "MLOps Engineer",
        "role_slug": "ml-ops-engineer",
        "skill_name": "Kubeflow Pipelines",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 8499,
        "role_display_name": "MLOps Engineer",
        "role_slug": "ml-ops-engineer",
        "skill_name": "TFX",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 8500,
        "role_display_name": "MLOps Engineer",
        "role_slug": "ml-ops-engineer",
        "skill_name": "OOP",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 8501,
        "role_display_name": "MLOps Engineer",
        "role_slug": "ml-ops-engineer",
        "skill_name": "Databases",
        "status": "pending"
      }
    ],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}
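
The `score` values under `stage3_signals` in the response above appear to follow two simple rules. This is an observation from the numbers in this run, not a confirmed description of the pipeline: each `skill_match_roles` score equals `matched_count / total_count` rounded to four places, and each `kra_match_roles` score agrees with the mean of its three listed `similarity` values to within about 0.0001 (the similarities are themselves rounded for display). The function names below are hypothetical and exist only for this check.

```python
# Sketch only: reproduces the stage3 scores shown in this run.
# Function names are hypothetical; the real pipeline logic is not shown in the source.

def skill_match_score(matched_count: int, total_count: int) -> float:
    """Fraction of JD skills that matched a role, rounded to 4 places."""
    return round(matched_count / total_count, 4)


def kra_match_score(similarities: list[float]) -> float:
    """Mean of the listed KRA sentence similarities, rounded to 4 places."""
    return round(sum(similarities) / len(similarities), 4)


# Values copied from the API 1 response above.
skill_rows = {
    "ml-engineer": (5, 12, 0.4167),
    "ml-ops-engineer": (4, 12, 0.3333),
    "data-engineer": (3, 12, 0.25),
    "cybersecurity-engineer": (2, 12, 0.1667),
    "cloud-architect": (2, 12, 0.1667),
}
kra_rows = {
    "ml-engineer": ([0.6663, 0.6255, 0.536], 0.6093),
    "data-engineer": ([0.6139, 0.6114, 0.5164], 0.5806),
    "devops-engineer": ([0.6668, 0.603, 0.4644], 0.5781),
    "full-stack-engineer": ([0.5858, 0.5646, 0.4875], 0.546),
    "ml-ops-engineer": ([0.5843, 0.5022, 0.4862], 0.5243),
}

for slug, (matched, total, reported) in skill_rows.items():
    print(slug, skill_match_score(matched, total), reported)

for slug, (sims, reported) in kra_rows.items():
    # Allow a small tolerance because the displayed similarities are rounded.
    print(slug, kra_match_score(sims), reported)
```

The `total_count` of 12 matches the number of skills flagged `is_primary: true` in `final_skills`, which suggests, but does not prove, that only primary skills enter the denominator. The `stage4_decision` score of 0.94 equals the reported `confidence` and does not appear to be derived from either rule.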

API 2 — extract-details

{
  "alias_matches": [
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 67,
      "existing_alias_text": "Python",
      "input_term": "Python",
      "matched_canonical": {
        "category_id": 6,
        "display_name": "Python",
        "id": 5,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "python",
        "sub_category_id": 96,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 271,
      "existing_alias_text": "SQL",
      "input_term": "SQL",
      "matched_canonical": {
        "category_id": 6,
        "display_name": "SQL",
        "id": 101,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "sql",
        "sub_category_id": 97,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 741,
      "existing_alias_text": "Google Cloud Platform",
      "input_term": "Google Cloud Platform",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "Google Cloud Platform",
        "id": 425,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "google-cloud-platform",
        "sub_category_id": 46,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 304,
      "existing_alias_text": "Apache Airflow",
      "input_term": "Apache Airflow",
      "matched_canonical": {
        "category_id": 13,
        "display_name": "Apache Airflow",
        "id": 110,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "apache-airflow",
        "sub_category_id": 130,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
      "alias_persisted": false,
      "existing_alias_id": 2004,
      "existing_alias_text": "Apache Spark",
      "input_term": "PySpark",
      "matched_canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "embedding_alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1267,
      "existing_alias_text": "Kubernetes",
      "input_term": "Kubernetes",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "Kubernetes",
        "id": 726,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "kubernetes",
        "sub_category_id": 557,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1826,
      "existing_alias_text": "CI/CD",
      "input_term": "CI/CD",
      "matched_canonical": {
        "category_id": 8,
        "display_name": "CI/CD",
        "id": 1190,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "METHODOLOGY",
        "slug": "ci-cd",
        "sub_category_id": 900,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 442,
      "existing_alias_text": "TensorFlow",
      "input_term": "TensorFlow",
      "matched_canonical": {
        "category_id": 7,
        "display_name": "TensorFlow",
        "id": 196,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LIBRARY",
        "slug": "tensorflow",
        "sub_category_id": 156,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
      "alias_persisted": false,
      "existing_alias_id": 465,
      "existing_alias_text": "Kubeflow",
      "input_term": "Kubeflow Pipelines",
      "matched_canonical": {
        "category_id": 5,
        "display_name": "Kubeflow",
        "id": 213,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "kubeflow",
        "sub_category_id": 1127,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "embedding_alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 2015,
      "existing_alias_text": "Machine Learning",
      "input_term": "Machine Learning",
      "matched_canonical": {
        "category_id": 2,
        "display_name": "Machine Learning",
        "id": 1356,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "machine-learning",
        "sub_category_id": 1024,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    }
  ],
  "candidate_roles": [
    {
      "display_name": "Cloud Security Engineer",
      "id": 23,
      "rationale": null,
      "role_archetype": null,
      "slug": "cloud-security-engineer",
      "source": "db"
    },
    {
      "display_name": "Backend Developer",
      "id": 1,
      "rationale": null,
      "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
      "slug": "backend-engineer",
      "source": "db"
    },
    {
      "display_name": "Fullstack Developer",
      "id": 435,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "fullstack-developer",
      "source": "db"
    },
    {
      "display_name": "Fullstack Developer",
      "id": 15,
      "rationale": null,
      "role_archetype": null,
      "slug": "full-stack-engineer",
      "source": "db"
    },
    {
      "display_name": "Cyber Security Engineer",
      "id": 5,
      "rationale": null,
      "role_archetype": null,
      "slug": "cybersecurity-engineer",
      "source": "db"
    },
    {
      "display_name": "Data Engineer",
      "id": 2,
      "rationale": null,
      "role_archetype": null,
      "slug": "data-engineer",
      "source": "db"
    },
    {
      "display_name": "ML Engineer",
      "id": 3,
      "rationale": null,
      "role_archetype": null,
      "slug": "ml-engineer",
      "source": "db"
    },
    {
      "display_name": "MLOps Engineer",
      "id": 16,
      "rationale": null,
      "role_archetype": null,
      "slug": "ml-ops-engineer",
      "source": "db"
    },
    {
      "display_name": "AR/VR Engineer",
      "id": 8,
      "rationale": null,
      "role_archetype": null,
      "slug": "ar-vr-engineer",
      "source": "db"
    },
    {
      "display_name": "Python Backend Developer",
      "id": 80,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "python-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Pega Developer",
      "id": 24,
      "rationale": null,
      "role_archetype": null,
      "slug": "pega-developer",
      "source": "db"
    },
    {
      "display_name": "PHP Backend Developer",
      "id": 86,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "php-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Cloud Architect",
      "id": 9,
      "rationale": null,
      "role_archetype": null,
      "slug": "cloud-architect",
      "source": "db"
    },
    {
      "display_name": "DevOps Engineer",
      "id": 10,
      "rationale": null,
      "role_archetype": null,
      "slug": "devops-engineer",
      "source": "db"
    },
    {
      "display_name": "AI Engineer",
      "id": 13,
      "rationale": null,
      "role_archetype": null,
      "slug": "ai-engineer",
      "source": "db"
    }
  ],
  "chosen_role": {
    "display_name": "MLOps Engineer",
    "id": 16,
    "rationale": "Domain=AI / ML; The JD is centered on building and operating ML pipelines and infrastructure with Airflow, Kubernetes, CI/CD, and cloud deployment, which best matches MLOps engineering.",
    "role_archetype": null,
    "slug": "ml-ops-engineer",
    "source": "db"
  },
  "dimensions": [
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Security Scripting \u0026 DSL Languages",
        "id": 248,
        "rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
        "slug": "cloud-security-scripting-dsl-languages",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Security Engineer",
          "id": 23,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-security-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages",
        "id": 1,
        "rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
        "slug": "programming-languages",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Fullstack Developer",
          "id": 435,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "fullstack-developer",
          "source": "db"
        },
        {
          "display_name": "Fullstack Developer",
          "id": 15,
          "rationale": null,
          "role_archetype": null,
          "slug": "full-stack-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages and Scripting",
        "id": 59,
        "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
        "slug": "programming-languages-and-scripting",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cyber Security Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for Data Work",
        "id": 21,
        "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
        "slug": "programming-languages-for-data-work",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for ML Systems",
        "id": 39,
        "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
        "slug": "programming-languages-for-ml-systems",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for XR",
        "id": 97,
        "rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
        "slug": "programming-languages-for-xr",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "AR/VR Engineer",
          "id": 8,
          "rationale": null,
          "role_archetype": null,
          "slug": "ar-vr-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Python Programming",
        "id": 290,
        "rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
        "slug": "python-programming",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Python Backend Developer",
          "id": 80,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "python-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Pega Programming Languages \u0026 DSLs",
        "id": 267,
        "rationale": "Programming languages and domain-specific languages used in Pega development.",
        "slug": "pega-programming-languages-dsls",
        "source": "db"
      },
      "input_skill": "SQL",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Pega Developer",
          "id": 24,
          "rationale": null,
          "role_archetype": null,
          "slug": "pega-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for Data Work",
        "id": 21,
        "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
        "slug": "programming-languages-for-data-work",
        "source": "db"
      },
      "input_skill": "SQL",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud \u0026 Hosting Providers",
        "id": 414,
        "rationale": "Knowledge of major cloud and hosting vendor platforms for deploying and managing PHP applications.",
        "slug": "cloud-hosting-providers",
        "source": "db"
      },
      "input_skill": "Google Cloud Platform",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "PHP Backend Developer",
          "id": 86,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "php-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Provider Platforms",
        "id": 131,
        "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
        "slug": "cloud-provider-platforms",
        "source": "db"
      },
      "input_skill": "Google Cloud Platform",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        },
        {
          "display_name": "Cloud Security Engineer",
          "id": 23,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-security-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Security Posture Tools",
        "id": 64,
        "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
        "slug": "cloud-security-posture-tools",
        "source": "db"
      },
      "input_skill": "Google Cloud Platform",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Security Engineer",
          "id": 23,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-security-engineer",
          "source": "db"
        },
        {
          "display_name": "Cyber Security Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Data Pipeline Orchestration",
        "id": 23,
        "rationale": "Workflow engines that schedule, coordinate, and recover batch data jobs. This cluster covers dependency management, retries, backfills, sensors, and operational control of pipeline DAGs.",
        "slug": "data-pipeline-orchestration",
        "source": "db"
      },
      "input_skill": "Apache Airflow",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "ETL and ELT Tooling",
        "id": 24,
        "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
        "slug": "etl-and-elt-tooling",
        "source": "db"
      },
      "input_skill": "PySpark",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Container Orchestration Platforms",
        "id": 134,
        "rationale": "Platforms that schedule and manage containerized workloads across clusters and environments. Cloud Architects need these to define workload placement standards, cluster boundaries, and platform capabilities.",
        "slug": "container-orchestration-platforms",
        "source": "db"
      },
      "input_skill": "Kubernetes",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        },
        {
          "display_name": "DevOps Engineer",
          "id": 10,
          "rationale": null,
          "role_archetype": null,
          "slug": "devops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Kubernetes for ML Workloads",
        "id": 47,
        "rationale": "Kubernetes-native components used to schedule, accelerate, and isolate ML training and serving workloads. This includes GPU enablement and ML-specific controllers rather than generic cluster administration.",
        "slug": "kubernetes-for-ml-workloads",
        "source": "db"
      },
      "input_skill": "Kubernetes",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "CI/CD Pipeline Platforms",
        "id": 150,
        "rationale": "Systems used to define, run, and maintain automated build and deployment workflows. This cluster is coherent because the role owns delivery automation end to end, including pipeline reliability and promotion logic.",
        "slug": "ci-cd-pipeline-platforms",
        "source": "db"
      },
      "input_skill": "CI/CD",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "DevOps Engineer",
          "id": 10,
          "rationale": null,
          "role_archetype": null,
          "slug": "devops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "CI/CD for Machine Learning",
        "id": 56,
        "rationale": "Tools and platforms for automating ML model integration, testing, and deployment pipelines.",
        "slug": "ci-cd-for-machine-learning",
        "source": "db"
      },
      "input_skill": "CI/CD",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "ML Frameworks and Libraries",
        "id": 40,
        "rationale": "Core libraries used to define models, train them, run inference, and evaluate predictive performance. These frameworks shape how ML engineers express model architectures and training loops.",
        "slug": "ml-frameworks-and-libraries",
        "source": "db"
      },
      "input_skill": "TensorFlow",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "MLOps Platforms and Lifecycle",
        "id": 43,
        "rationale": "End-to-end managed platforms used to train, deploy, register, and govern models across their lifecycle. This is the operational control plane for production ML workflows.",
        "slug": "mlops-platforms-and-lifecycle",
        "source": "db"
      },
      "input_skill": "Kubeflow Pipelines",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "AI Governance and Model Security",
        "id": 50,
        "rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
        "slug": "ai-governance-and-model-security",
        "source": "db"
      },
      "input_skill": "Machine Learning",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "AI Engineer",
          "id": 13,
          "rationale": null,
          "role_archetype": null,
          "slug": "ai-engineer",
          "source": "db"
        },
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "Machine Learning",
      "llm_role": null,
      "roles_from_db": []
    }
  ],
  "input_final_skills": [
    "Python",
    "SQL",
    "Google Cloud Platform",
    "Apache Airflow",
    "PySpark",
    "ETL",
    "Kubernetes",
    "CI/CD",
    "TensorFlow",
    "Kubeflow Pipelines",
    "TFX",
    "Machine Learning",
    "OOP",
    "Databases"
  ],
  "input_llm_skills": [
    "Python",
    "SQL",
    "Google Cloud Platform",
    "Apache Airflow",
    "PySpark",
    "ETL",
    "Kubernetes",
    "CI/CD",
    "TensorFlow",
    "Kubeflow Pipelines",
    "TFX",
    "Machine Learning",
    "OOP",
    "Databases"
  ],
  "new_aliases_persisted": 0,
  "run_id": "f44ceab1-c4b3-4c44-93df-420af9b73fce",
  "skills_detail": [
    {
      "aliases_in_db": [
        {
          "alias_text": "Python",
          "alias_type": "CANONICAL",
          "id": 67,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 2",
          "alias_type": "VERSION",
          "id": 72,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 2.x",
          "alias_type": "VERSION",
          "id": 74,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3",
          "alias_type": "VERSION",
          "id": 73,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.10",
          "alias_type": "VERSION",
          "id": 76,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.11",
          "alias_type": "VERSION",
          "id": 77,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.12",
          "alias_type": "VERSION",
          "id": 78,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.x",
          "alias_type": "VERSION",
          "id": 75,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "py",
          "alias_type": "VERSION",
          "id": 2183,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "py2",
          "alias_type": "VERSION",
          "id": 68,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "py3",
          "alias_type": "VERSION",
          "id": 69,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python 3",
          "alias_type": "VERSION",
          "id": 2186,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python 3.x",
          "alias_type": "VERSION",
          "id": 2849,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python2",
          "alias_type": "VERSION",
          "id": 70,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python3",
          "alias_type": "VERSION",
          "id": 71,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python3.x",
          "alias_type": "VERSION",
          "id": 2848,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 6,
        "display_name": "Python",
        "id": 5,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "python",
        "sub_category_id": 96,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Security Scripting \u0026 DSL Languages",
            "id": 248,
            "rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
            "slug": "cloud-security-scripting-dsl-languages",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Security Engineer",
              "id": 23,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-security-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages",
            "id": 1,
            "rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
            "slug": "programming-languages",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Fullstack Developer",
              "id": 435,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "fullstack-developer",
              "source": "db"
            },
            {
              "display_name": "Fullstack Developer",
              "id": 15,
              "rationale": null,
              "role_archetype": null,
              "slug": "full-stack-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages and Scripting",
            "id": 59,
            "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
            "slug": "programming-languages-and-scripting",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cyber Security Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for Data Work",
            "id": 21,
            "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
            "slug": "programming-languages-for-data-work",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for ML Systems",
            "id": 39,
            "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
            "slug": "programming-languages-for-ml-systems",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for XR",
            "id": 97,
            "rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
            "slug": "programming-languages-for-xr",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "AR/VR Engineer",
              "id": 8,
              "rationale": null,
              "role_archetype": null,
              "slug": "ar-vr-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Python Programming",
            "id": 290,
            "rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
            "slug": "python-programming",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Python Backend Developer",
              "id": 80,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "python-backend-developer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Python",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "SQL",
          "alias_type": "CANONICAL",
          "id": 271,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 6,
        "display_name": "SQL",
        "id": 101,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "sql",
        "sub_category_id": 97,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Pega Programming Languages \u0026 DSLs",
            "id": 267,
            "rationale": "Programming languages and domain-specific languages used in Pega development.",
            "slug": "pega-programming-languages-dsls",
            "source": "db"
          },
          "input_skill": "SQL",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Pega Developer",
              "id": 24,
              "rationale": null,
              "role_archetype": null,
              "slug": "pega-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for Data Work",
            "id": 21,
            "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
            "slug": "programming-languages-for-data-work",
            "source": "db"
          },
          "input_skill": "SQL",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "SQL",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Google Cloud Platform",
          "alias_type": "CANONICAL",
          "id": 741,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "Google Cloud Platform",
        "id": 425,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "google-cloud-platform",
        "sub_category_id": 46,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud \u0026 Hosting Providers",
            "id": 414,
            "rationale": "Knowledge of major cloud and hosting vendor platforms for deploying and managing PHP applications.",
            "slug": "cloud-hosting-providers",
            "source": "db"
          },
          "input_skill": "Google Cloud Platform",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "PHP Backend Developer",
              "id": 86,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "php-backend-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Provider Platforms",
            "id": 131,
            "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
            "slug": "cloud-provider-platforms",
            "source": "db"
          },
          "input_skill": "Google Cloud Platform",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            },
            {
              "display_name": "Cloud Security Engineer",
              "id": 23,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-security-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Security Posture Tools",
            "id": 64,
            "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
            "slug": "cloud-security-posture-tools",
            "source": "db"
          },
          "input_skill": "Google Cloud Platform",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Security Engineer",
              "id": 23,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-security-engineer",
              "source": "db"
            },
            {
              "display_name": "Cyber Security Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Google Cloud Platform",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Apache Airflow",
          "alias_type": "CANONICAL",
          "id": 304,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 13,
        "display_name": "Apache Airflow",
        "id": 110,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "apache-airflow",
        "sub_category_id": 130,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Data Pipeline Orchestration",
            "id": 23,
            "rationale": "Workflow engines that schedule, coordinate, and recover batch data jobs. This cluster covers dependency management, retries, backfills, sensors, and operational control of pipeline DAGs.",
            "slug": "data-pipeline-orchestration",
            "source": "db"
          },
          "input_skill": "Apache Airflow",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Apache Airflow",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Apache Spark",
          "alias_type": "CANONICAL",
          "id": 2004,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "apache spark 3",
          "alias_type": "VERSION",
          "id": 2006,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark",
          "alias_type": "VERSION",
          "id": 2510,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3",
          "alias_type": "VERSION",
          "id": 2007,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3.x",
          "alias_type": "VERSION",
          "id": 2009,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark3",
          "alias_type": "VERSION",
          "id": 2008,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "ETL and ELT Tooling",
            "id": 24,
            "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
            "slug": "etl-and-elt-tooling",
            "source": "db"
          },
          "input_skill": "PySpark",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "PySpark",
      "matched_via": "embedding_alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "ETL",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "etl",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Kubernetes",
          "alias_type": "CANONICAL",
          "id": 1267,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Kubernetes 1.0+",
          "alias_type": "VERSION",
          "id": 1271,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Kubernetes 1.x",
          "alias_type": "VERSION",
          "id": 1270,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Kubernetes v1",
          "alias_type": "VERSION",
          "id": 1269,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "k8s",
          "alias_type": "VERSION",
          "id": 1268,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "kubernetes 1.x",
          "alias_type": "VERSION",
          "id": 1400,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "kubernetes latest",
          "alias_type": "VERSION",
          "id": 1401,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "Kubernetes",
        "id": 726,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "kubernetes",
        "sub_category_id": 557,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Container Orchestration Platforms",
            "id": 134,
            "rationale": "Platforms that schedule and manage containerized workloads across clusters and environments. Cloud Architects need these to define workload placement standards, cluster boundaries, and platform capabilities.",
            "slug": "container-orchestration-platforms",
            "source": "db"
          },
          "input_skill": "Kubernetes",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            },
            {
              "display_name": "DevOps Engineer",
              "id": 10,
              "rationale": null,
              "role_archetype": null,
              "slug": "devops-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Kubernetes for ML Workloads",
            "id": 47,
            "rationale": "Kubernetes-native components used to schedule, accelerate, and isolate ML training and serving workloads. This includes GPU enablement and ML-specific controllers rather than generic cluster administration.",
            "slug": "kubernetes-for-ml-workloads",
            "source": "db"
          },
          "input_skill": "Kubernetes",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Kubernetes",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "CI/CD",
          "alias_type": "CANONICAL",
          "id": 1826,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 8,
        "display_name": "CI/CD",
        "id": 1190,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "METHODOLOGY",
        "slug": "ci-cd",
        "sub_category_id": 900,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "CI/CD Pipeline Platforms",
            "id": 150,
            "rationale": "Systems used to define, run, and maintain automated build and deployment workflows. This cluster is coherent because the role owns delivery automation end to end, including pipeline reliability and promotion logic.",
            "slug": "ci-cd-pipeline-platforms",
            "source": "db"
          },
          "input_skill": "CI/CD",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "DevOps Engineer",
              "id": 10,
              "rationale": null,
              "role_archetype": null,
              "slug": "devops-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "CI/CD for Machine Learning",
            "id": 56,
            "rationale": "Tools and platforms for automating ML model integration, testing, and deployment pipelines.",
            "slug": "ci-cd-for-machine-learning",
            "source": "db"
          },
          "input_skill": "CI/CD",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "CI/CD",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "TensorFlow",
          "alias_type": "CANONICAL",
          "id": 442,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "TF1",
          "alias_type": "VERSION",
          "id": 443,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "TF2",
          "alias_type": "VERSION",
          "id": 444,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "TensorFlow 1",
          "alias_type": "VERSION",
          "id": 445,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "TensorFlow 1.x",
          "alias_type": "VERSION",
          "id": 447,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "TensorFlow 2",
          "alias_type": "VERSION",
          "id": 446,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "TensorFlow 2.x",
          "alias_type": "VERSION",
          "id": 448,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tensorflow 1",
          "alias_type": "VERSION",
          "id": 2490,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tensorflow 1.x",
          "alias_type": "VERSION",
          "id": 2494,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tensorflow 2",
          "alias_type": "VERSION",
          "id": 2491,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tensorflow 2.x",
          "alias_type": "VERSION",
          "id": 2495,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tensorflow v1",
          "alias_type": "VERSION",
          "id": 2492,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tensorflow v2",
          "alias_type": "VERSION",
          "id": 2493,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tf",
          "alias_type": "VERSION",
          "id": 2487,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tf1",
          "alias_type": "VERSION",
          "id": 2488,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tf2",
          "alias_type": "VERSION",
          "id": 2489,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 7,
        "display_name": "TensorFlow",
        "id": 196,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LIBRARY",
        "slug": "tensorflow",
        "sub_category_id": 156,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "ML Frameworks and Libraries",
            "id": 40,
            "rationale": "Core libraries used to define models, train them, run inference, and evaluate predictive performance. These frameworks shape how ML engineers express model architectures and training loops.",
            "slug": "ml-frameworks-and-libraries",
            "source": "db"
          },
          "input_skill": "TensorFlow",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "TensorFlow",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Kubeflow",
          "alias_type": "CANONICAL",
          "id": 465,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Kubeflow 1.x",
          "alias_type": "VERSION",
          "id": 468,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Kubeflow 2.x",
          "alias_type": "VERSION",
          "id": 469,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Kubeflow v1",
          "alias_type": "VERSION",
          "id": 466,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Kubeflow v2",
          "alias_type": "VERSION",
          "id": 467,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 5,
        "display_name": "Kubeflow",
        "id": 213,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "kubeflow",
        "sub_category_id": 1127,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "MLOps Platforms and Lifecycle",
            "id": 43,
            "rationale": "End-to-end managed platforms used to train, deploy, register, and govern models across their lifecycle. This is the operational control plane for production ML workflows.",
            "slug": "mlops-platforms-and-lifecycle",
            "source": "db"
          },
          "input_skill": "Kubeflow Pipelines",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Kubeflow Pipelines",
      "matched_via": "embedding_alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "TFX",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Machine Learning Frameworks",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "SHORT_LIVED",
          "version_strategy": "VERSIONED",
          "volatility": "FAST"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "tfx",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Machine Learning",
          "alias_type": "CANONICAL",
          "id": 2015,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 2,
        "display_name": "Machine Learning",
        "id": 1356,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "machine-learning",
        "sub_category_id": 1024,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "AI Governance and Model Security",
            "id": 50,
            "rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
            "slug": "ai-governance-and-model-security",
            "source": "db"
          },
          "input_skill": "Machine Learning",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "AI Engineer",
              "id": 13,
              "rationale": null,
              "role_archetype": null,
              "slug": "ai-engineer",
              "source": "db"
            },
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "Machine Learning",
          "llm_role": null,
          "roles_from_db": []
        }
      ],
      "input_skill": "Machine Learning",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "OOP",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Concepts",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "UNVERSIONED",
          "volatility": "STABLE"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "oop",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Databases",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Databases",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "UNVERSIONED",
          "volatility": "STABLE"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "databases",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    }
  ],
  "unmatched_skills": [
    "ETL",
    "TFX",
    "OOP",
    "Databases"
  ]
}

API 3 — final-role-output

{
  "chosen_role": {
    "display_name": "MLOps Engineer",
    "id": 16,
    "rationale": "Domain=AI / ML; The JD is centered on building and operating ML pipelines and infrastructure with Airflow, Kubernetes, CI/CD, and cloud deployment, which best matches MLOps engineering.",
    "role_archetype": null,
    "slug": "ml-ops-engineer",
    "source": "db"
  },
  "chosen_role_resolution": "in_db",
  "final_input_skills": [
    {
      "skill": "Python",
      "tag": "in_db"
    },
    {
      "skill": "SQL",
      "tag": "in_db"
    },
    {
      "skill": "Google Cloud Platform",
      "tag": "in_db"
    },
    {
      "skill": "Apache Airflow",
      "tag": "in_db"
    },
    {
      "skill": "PySpark",
      "tag": "in_db"
    },
    {
      "skill": "ETL",
      "tag": "new"
    },
    {
      "skill": "Kubernetes",
      "tag": "in_db"
    },
    {
      "skill": "CI/CD",
      "tag": "in_db"
    },
    {
      "skill": "TensorFlow",
      "tag": "in_db"
    },
    {
      "skill": "Kubeflow Pipelines",
      "tag": "in_db"
    },
    {
      "skill": "TFX",
      "tag": "new"
    },
    {
      "skill": "Machine Learning",
      "tag": "in_db"
    },
    {
      "skill": "OOP",
      "tag": "new"
    },
    {
      "skill": "Databases",
      "tag": "new"
    }
  ],
  "llm_cost_api1_usd": null,
  "llm_cost_api2_usd": null,
  "llm_cost_api3_usd": null,
  "llm_cost_total_usd": null,
  "persistence": {
    "items": [
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Security Scripting \u0026 DSL Languages",
          "id": 248,
          "rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
          "slug": "cloud-security-scripting-dsl-languages",
          "source": "db"
        },
        "dimension_id": 248,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Security Engineer",
            "id": 23,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-security-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages",
          "id": 1,
          "rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
          "slug": "programming-languages",
          "source": "db"
        },
        "dimension_id": 1,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Fullstack Developer",
            "id": 435,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "fullstack-developer",
            "source": "db"
          },
          {
            "display_name": "Fullstack Developer",
            "id": 15,
            "rationale": null,
            "role_archetype": null,
            "slug": "full-stack-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages and Scripting",
          "id": 59,
          "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
          "slug": "programming-languages-and-scripting",
          "source": "db"
        },
        "dimension_id": 59,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cyber Security Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for Data Work",
          "id": 21,
          "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
          "slug": "programming-languages-for-data-work",
          "source": "db"
        },
        "dimension_id": 21,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for ML Systems",
          "id": 39,
          "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
          "slug": "programming-languages-for-ml-systems",
          "source": "db"
        },
        "dimension_id": 39,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for XR",
          "id": 97,
          "rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
          "slug": "programming-languages-for-xr",
          "source": "db"
        },
        "dimension_id": 97,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "AR/VR Engineer",
            "id": 8,
            "rationale": null,
            "role_archetype": null,
            "slug": "ar-vr-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Python Programming",
          "id": 290,
          "rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
          "slug": "python-programming",
          "source": "db"
        },
        "dimension_id": 290,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Python Backend Developer",
            "id": 80,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "python-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Pega Programming Languages \u0026 DSLs",
          "id": 267,
          "rationale": "Programming languages and domain-specific languages used in Pega development.",
          "slug": "pega-programming-languages-dsls",
          "source": "db"
        },
        "dimension_id": 267,
        "input_skill": "SQL",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Pega Developer",
            "id": 24,
            "rationale": null,
            "role_archetype": null,
            "slug": "pega-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 101,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for Data Work",
          "id": 21,
          "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
          "slug": "programming-languages-for-data-work",
          "source": "db"
        },
        "dimension_id": 21,
        "input_skill": "SQL",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 101,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud \u0026 Hosting Providers",
          "id": 414,
          "rationale": "Knowledge of major cloud and hosting vendor platforms for deploying and managing PHP applications.",
          "slug": "cloud-hosting-providers",
          "source": "db"
        },
        "dimension_id": 414,
        "input_skill": "Google Cloud Platform",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "PHP Backend Developer",
            "id": 86,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "php-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 425,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Provider Platforms",
          "id": 131,
          "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
          "slug": "cloud-provider-platforms",
          "source": "db"
        },
        "dimension_id": 131,
        "input_skill": "Google Cloud Platform",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          },
          {
            "display_name": "Cloud Security Engineer",
            "id": 23,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-security-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 425,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Security Posture Tools",
          "id": 64,
          "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
          "slug": "cloud-security-posture-tools",
          "source": "db"
        },
        "dimension_id": 64,
        "input_skill": "Google Cloud Platform",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Security Engineer",
            "id": 23,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-security-engineer",
            "source": "db"
          },
          {
            "display_name": "Cyber Security Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 425,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Data Pipeline Orchestration",
          "id": 23,
          "rationale": "Workflow engines that schedule, coordinate, and recover batch data jobs. This cluster covers dependency management, retries, backfills, sensors, and operational control of pipeline DAGs.",
          "slug": "data-pipeline-orchestration",
          "source": "db"
        },
        "dimension_id": 23,
        "input_skill": "Apache Airflow",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 110,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "ETL and ELT Tooling",
          "id": 24,
          "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
          "slug": "etl-and-elt-tooling",
          "source": "db"
        },
        "dimension_id": 24,
        "input_skill": "PySpark",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": false,
        "skill_id": null,
        "skill_tag": "new",
        "skipped_reason": "skill_not_in_db_v3_proposed"
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Container Orchestration Platforms",
          "id": 134,
          "rationale": "Platforms that schedule and manage containerized workloads across clusters and environments. Cloud Architects need these to define workload placement standards, cluster boundaries, and platform capabilities.",
          "slug": "container-orchestration-platforms",
          "source": "db"
        },
        "dimension_id": 134,
        "input_skill": "Kubernetes",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          },
          {
            "display_name": "DevOps Engineer",
            "id": 10,
            "rationale": null,
            "role_archetype": null,
            "slug": "devops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 726,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Kubernetes for ML Workloads",
          "id": 47,
          "rationale": "Kubernetes-native components used to schedule, accelerate, and isolate ML training and serving workloads. This includes GPU enablement and ML-specific controllers rather than generic cluster administration.",
          "slug": "kubernetes-for-ml-workloads",
          "source": "db"
        },
        "dimension_id": 47,
        "input_skill": "Kubernetes",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 726,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "CI/CD Pipeline Platforms",
          "id": 150,
          "rationale": "Systems used to define, run, and maintain automated build and deployment workflows. This cluster is coherent because the role owns delivery automation end to end, including pipeline reliability and promotion logic.",
          "slug": "ci-cd-pipeline-platforms",
          "source": "db"
        },
        "dimension_id": 150,
        "input_skill": "CI/CD",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "DevOps Engineer",
            "id": 10,
            "rationale": null,
            "role_archetype": null,
            "slug": "devops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1190,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "CI/CD for Machine Learning",
          "id": 56,
          "rationale": "Tools and platforms for automating ML model integration, testing, and deployment pipelines.",
          "slug": "ci-cd-for-machine-learning",
          "source": "db"
        },
        "dimension_id": 56,
        "input_skill": "CI/CD",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1190,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "ML Frameworks and Libraries",
          "id": 40,
          "rationale": "Core libraries used to define models, train them, run inference, and evaluate predictive performance. These frameworks shape how ML engineers express model architectures and training loops.",
          "slug": "ml-frameworks-and-libraries",
          "source": "db"
        },
        "dimension_id": 40,
        "input_skill": "TensorFlow",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 196,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "MLOps Platforms and Lifecycle",
          "id": 43,
          "rationale": "End-to-end managed platforms used to train, deploy, register, and govern models across their lifecycle. This is the operational control plane for production ML workflows.",
          "slug": "mlops-platforms-and-lifecycle",
          "source": "db"
        },
        "dimension_id": 43,
        "input_skill": "Kubeflow Pipelines",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": false,
        "skill_id": null,
        "skill_tag": "new",
        "skipped_reason": "skill_not_in_db_v3_proposed"
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "AI Governance and Model Security",
          "id": 50,
          "rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
          "slug": "ai-governance-and-model-security",
          "source": "db"
        },
        "dimension_id": 50,
        "input_skill": "Machine Learning",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "AI Engineer",
            "id": 13,
            "rationale": null,
            "role_archetype": null,
            "slug": "ai-engineer",
            "source": "db"
          },
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1356,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "Machine Learning",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1356,
        "skill_tag": "in_db",
        "skipped_reason": null
      }
    ],
    "new_skills_created": 0,
    "role_dimension_saved": 0,
    "skill_dimension_saved": 0,
    "skipped": 2
  },
  "planner_output": null,
  "run_id": "f44ceab1-c4b3-4c44-93df-420af9b73fce"
}
