
Pipeline run

f44ceab1-c4b3-4c44-93df-420af9b73fce

Pipeline LLM cost (USD)
API 1: $0.0088 · API 2: $0.0002 · API 3: $0.0000 · Total: $0.0090
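The total can be re-derived from the per-API figures. The sketch below is only an illustration: it uses `Decimal` so the four-decimal USD amounts add up exactly, and the `shares` breakdown is an added convenience that the run page itself does not show.

```python
from decimal import Decimal

# Per-API LLM costs in USD, copied from the run summary above.
costs = {
    "API 1": Decimal("0.0088"),
    "API 2": Decimal("0.0002"),
    "API 3": Decimal("0.0000"),
}

# Decimal keeps the four-decimal precision exact; float addition may not.
total = sum(costs.values(), Decimal("0"))

# Fraction of the total spent in each API call (illustrative, not on the page).
shares = {
    name: (cost / total if total else Decimal("0"))
    for name, cost in costs.items()
}

print(f"Total: ${total}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```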

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description
Nature of work · Data Engineering / ML Platform
Build and run GCP-based ML/data pipelines in Python, SQL, Airflow and PySpark; engineer features from large datasets, deploy scalable ML infrastructure/models, and keep CI/CD, Kubernetes, and orchestration reliable.
"develop ML pipeline solutions using airflow, Python and SQL"
Tech stack maturity
Modern Cloud Native
The stack centers on Kubernetes, Google Cloud Platform, CI/CD, Airflow, and modern ML tooling, which is characteristic of a cloud-native MLOps environment.
AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)
1.70 / 5
· Title match
Has AI skill
AI skill (primary)
· AI skill (secondary)
· On AI team
· Builds AI products
vocab breakdown (legacy)
Assistants (×1):
Frameworks (×2):
Models / concepts (×3): ML, Machine Learning, Deep Learning
Evidence — skills matched in JD (14)
Python · SQL · Google Cloud Platform · Apache Airflow · PySpark · ETL · Kubernetes · CI/CD · TensorFlow · Kubeflow Pipelines · TFX · Machine Learning · OOP · Databases
Skill cluster (6 dimension groups, role-scoped)
  • AI Governance and Model Security: Machine Learning
  • Cloud Provider Platforms: Google Cloud Platform
  • Kubernetes for ML Workloads: Kubernetes
  • ML Frameworks and Libraries: TensorFlow
  • Programming Languages for ML Systems: Python
  • Cross-cutting / unaligned: SQL · Apache Airflow · PySpark · ETL · CI/CD · Kubeflow Pipelines · TFX · OOP · Databases
KRA description

EXL provides consulting and analytics support to Fortune 500 companies across multiple industry domains. For this role, you will be supporting the data science team of a leading US Media firm. While working on Google Cloud Platform (GCP), you are expected to develop ML pipeline solutions using airflow, Python and SQL.
• Advanced Python (building applications using python code OOPs concepts), advanced SQL skills and experience with Cloud Platforms (GCP preferred)
• Experience in developing scalable and robust data pipelines using Python, SQL, PySpark, ETL orchestration tools like Airflow.
• A good understanding of system design, Databases and OOPs concepts
• Work with large, big data sources, focusing on efficient data creation and feature engineering and create pipelines that feed the data science models
• Develop and maintain Build, Continuous Deployment, and Continuous Integration systems.
• Hands on experience in Kubernetes, CI/CD pipelines, continually improve CI/CD tools, processes, and procedures.
• A good understanding of system architecture (pertaining to data pipelines) and the ability to solve complex problems
• Contribute to the team through mentorship, technical methods, improvements in how we work
• Design & build data pipelines and production level ML infrastructure, using tools such as TFX, Kubeflow Pipelines, TensorFlow
• Experience in deploying scalable ML Models in cloud platforms, setting up alerting, restartability etc.
• Deploy ML models under the constraints of scalability, correctness, and maintainability.
• Drive work on creating a state-of-the-art codebase and machine learning lifecycle infrastructure

Signals

Skill ml-engineer
0.42
Alias ml-engineer
1.00
KRA ml-engineer
0.61

Post-classification

Centroid updated · n=6
Alias collision log
New-role queue
New skills captured: 6
New KRA captured

Captured for admin review

PySpark primary MLOps Engineer pending
ETL primary MLOps Engineer pending
Kubeflow Pipelines primary MLOps Engineer pending
TFX primary MLOps Engineer pending
OOP MLOps Engineer pending
Databases MLOps Engineer pending
Status: completed · Created: 2026-05-27T14:21:16.049754Z · Updated: 2026-05-27T14:23:00.846316Z · API 3 duration: 65406 ms
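The timestamps and the API 3 duration can be put on one scale with a short calculation. This is a sketch under one assumption that the page does not confirm: that Created and Updated bracket the whole run, so their difference is the run's wall-clock time.

```python
from datetime import datetime


def parse_utc(stamp: str) -> datetime:
    """Parse an ISO-8601 UTC timestamp with a trailing 'Z'."""
    # Older Python versions reject the "Z" suffix in fromisoformat().
    return datetime.fromisoformat(stamp.replace("Z", "+00:00"))


created = parse_utc("2026-05-27T14:21:16.049754Z")
updated = parse_utc("2026-05-27T14:23:00.846316Z")

# Wall-clock time between the two timestamps, in seconds.
elapsed_s = (updated - created).total_seconds()

# API 3 duration is reported in milliseconds.
api3_s = 65406 / 1000

# Share of the Created-to-Updated window spent in API 3 (assumption: the
# window covers the full run).
api3_share = api3_s / elapsed_s

print(f"elapsed: {elapsed_s:.3f} s, API 3: {api3_s:.3f} s ({api3_share:.0%})")
```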
Flow Current 3-step pipeline

1 POST /skills/extract-from-jd

2 POST /skills/extract-details

3 POST /skills/final-role-output
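The three calls above can be chained as in the sketch below. Only the endpoint paths come from this page; the request and response fields (`jd_text` and whatever each step returns) and the `post` transport are hypothetical stand-ins, since the real payloads are not shown here.

```python
from typing import Any, Callable, Dict, List

# A transport takes (path, json_body) and returns the decoded JSON response.
Transport = Callable[[str, Dict[str, Any]], Dict[str, Any]]

# Endpoint paths as listed in the flow above, in call order.
STEPS = (
    "/skills/extract-from-jd",
    "/skills/extract-details",
    "/skills/final-role-output",
)


def run_pipeline(jd_text: str, post: Transport) -> Dict[str, Any]:
    """Call the three endpoints in order, feeding each response forward."""
    payload: Dict[str, Any] = {"jd_text": jd_text}  # hypothetical field name
    for path in STEPS:
        response = post(path, payload)
        # Assumption: each step accepts the accumulated state of earlier steps.
        payload = {**payload, **response}
    return payload


# Stand-in transport so the sketch runs without a server.
calls: List[str] = []


def fake_post(path: str, body: Dict[str, Any]) -> Dict[str, Any]:
    calls.append(path)
    return {path.rsplit("/", 1)[-1]: "ok"}


result = run_pipeline("DevOps/ML Engineer", fake_post)
print(calls)
print(result)
```

In a real client, `fake_post` would be replaced by an HTTP call; injecting the transport keeps the ordering logic testable on its own.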

Role Chosen role & resolution

MLOps Engineer

domain · AI / ML CASE DOMAIN

slug: ml-ops-engineer · id: 16 · source: db

Domain=AI / ML; The JD is centered on building and operating ML pipelines and infrastructure with Airflow, Kubernetes, CI/CD, and cloud deployment, which best matches MLOps engineering.

Matched skills

Google Cloud Platform (GCP) · Python · SQL · PySpark · Airflow · Kubernetes · CI/CD pipelines · TFX · Kubeflow Pipelines · TensorFlow · Databases · OOPs concepts

Matched dimensions

ML pipeline engineering · Data pipeline orchestration · Cloud ML infrastructure · CI/CD automation · Scalable model deployment · System design and architecture · Feature engineering for big data · Operational reliability and maintainability

Matched KRAs

  • develop ML pipeline solutions using airflow, Python and SQL
  • developing scalable and robust data pipelines
  • create pipelines that feed the data science models
  • Develop and maintain Build, Continuous Deployment, and Continuous Integration systems
  • Hands on experience in Kubernetes, CI/CD pipelines
  • Design & build data pipelines and production level ML infrastructure
  • Experience in deploying scalable ML Models in cloud platforms
  • setting up alerting, restartability etc.
  • Drive work on creating a state-of-the-art codebase
  • machine learning lifecycle infrastructure

Resolution: in_db — role exists in library; skill↔dim and role↔dim links saved when applicable.

New skills: 0 · Skill↔dim saved: 0 · Role↔dim saved: 0 · Skipped: 2

Job description

DevOps/ML Engineer

Role Overview

EXL provides consulting and analytics support to Fortune 500 companies across multiple industry domains. For this role, you will be supporting the data science team of a leading US Media firm. While working on Google Cloud Platform (GCP), you are expected to develop ML pipeline solutions using airflow, Python and SQL.

Required Skills –
• Advanced Python (building applications using python code OOPs concepts), advanced SQL skills and experience with Cloud Platforms (GCP preferred)
• Experience in developing scalable and robust data pipelines using Python, SQL, PySpark, ETL orchestration tools like Airflow.
• A good understanding of system design, Databases and OOPs concepts
• Work with large, big data sources, focusing on efficient data creation and feature engineering and create pipelines that feed the data science models
• Develop and maintain Build, Continuous Deployment, and Continuous Integration systems.
• Hands on experience in Kubernetes, CI/CD pipelines, continually improve CI/CD tools, processes, and procedures.
• A good understanding of system architecture (pertaining to data pipelines) and the ability to solve complex problems
• Contribute to the team through mentorship, technical methods, improvements in how we work

Good To Have Skills
• Design & build data pipelines and production level ML infrastructure, using tools such as TFX, Kubeflow Pipelines, TensorFlow
• Experience in deploying scalable ML Models in cloud platforms, setting up alerting, restartability etc.
• Deploy ML models under the constraints of scalability, correctness, and maintainability.
• Drive work on creating a state-of-the-art codebase and machine learning lifecycle infrastructure

Qualifications
• Master's or Bachelor's degree in Computer Science (or) math heavy degrees from top-tier universities with strong record of achievement
• Deep knowledge of statistical methods and machine learning with special emphasis on deep learning algorithms.
• 3+ Years of experience in algorithms, machine learning, data science (or) statistics
• Experience solving problems using Machine Learning Frameworks (e.g. PyTorch, TensorFlow)
• Experience with BigQuery; comfortable with writing complex SQL queries for data retrieval & transformation
• Proficient in Python / PySpark.
• Prior experience in management consulting and/or analytics based consulting is a plus
• Experience in building and maintaining data pipeline using Airflow/ Jenkins will be preferred.

EXL Company Overview

EXL (NASDAQ: EXLS) is a leading operations management and analytics company that designs and enables agile, customer-centric operating models to help clients improve their revenue growth and profitability. Our delivery model provides market-leading business outcomes using EXL’s proprietary Business EXLerator Framework™, cutting-edge analytics, digital transformation and domain expertise. At EXL, we look deeper to help companies improve global operations, enhance data-driven insights, increase customer satisfaction, and manage risk and compliance. EXL serves the insurance, healthcare, banking and financial services, utilities, travel, transportation and logistics industries. Headquartered in New York, New York, EXL has more than 32,000 professionals in locations throughout the United States, Europe, Asia (primarily India and Philippines), South America, Australia and South Africa.

EXL Analytics provides data-driven, action-oriented solutions to business problems through statistical data mining, cutting edge analytics techniques and a consultative approach. Leveraging proprietary methodology and best-of-breed technology, EXL Analytics takes an industry-specific approach to transform our clients’ decision making and embed analytics more deeply into their business processes. Our global footprint of nearly 2,000 data scientists and analysts assist client organizations with complex risk minimization methods, advanced marketing, pricing and CRM strategies, internal cost analysis, and cost and resource optimization within the organization. EXL Analytics serves the insurance, healthcare, banking, capital markets, utilities, retail and e-commerce, travel, transportation and logistics industries.

Please visit www.exlservice.com for more information about EXL Analytics.

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

Python · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)
Canonical: Python id=5 · python

Aliases — catalog

  • Python (CANONICAL) primary
  • Python 2 (VERSION)
  • Python 2.x (VERSION)
  • Python 3 (VERSION)
  • Python 3.10 (VERSION)
  • Python 3.11 (VERSION)
  • Python 3.12 (VERSION)
  • Python 3.x (VERSION)
  • py (VERSION)
  • py2 (VERSION)
  • py3 (VERSION)
  • python 3 (VERSION)
  • python 3.x (VERSION)
  • python2 (VERSION)
  • python3 (VERSION)
  • python3.x (VERSION)
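An alias catalog like the one above is typically consulted through a reverse index from alias to canonical name. The sketch below shows one way to do that with the Python and Kubernetes aliases listed in this report; the case-insensitive matching and the function names `build_index` and `resolve` are assumptions, not the pipeline's documented behavior.

```python
from typing import Dict, List, Optional

# Canonical skill -> aliases, copied from the alias lists in this report.
CATALOG: Dict[str, List[str]] = {
    "Python": [
        "Python 2", "Python 2.x", "Python 3", "Python 3.10", "Python 3.11",
        "Python 3.12", "Python 3.x", "py", "py2", "py3", "python 3",
        "python 3.x", "python2", "python3", "python3.x",
    ],
    "Kubernetes": [
        "Kubernetes 1.0+", "Kubernetes 1.x", "Kubernetes v1", "k8s",
        "kubernetes 1.x", "kubernetes latest",
    ],
}


def build_index(catalog: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every canonical name and alias (case-folded) to its canonical."""
    index: Dict[str, str] = {}
    for canonical, aliases in catalog.items():
        for name in (canonical, *aliases):
            index[name.casefold()] = canonical
    return index


def resolve(name: str, index: Dict[str, str]) -> Optional[str]:
    """Return the canonical skill for a raw mention, or None if unknown."""
    return index.get(name.strip().casefold())


INDEX = build_index(CATALOG)
print(resolve("PY3", INDEX))
print(resolve("k8s", INDEX))
print(resolve("PySpark", INDEX))
```

A mention with no entry returns `None`, which is where a real system would fall back to other matching signals or queue the skill for review.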

Context tags (catalog)

API Django FastAPI Flask Jupyter NumPy PEP 8 Pandas REST SQLAlchemy asyncio pandas pip pytest type hints venv virtualenv

Stored enrichment (catalog DB)

Category
Language
Sub-category
Programming Language
Vendor
PSF
License
mit
Year introduced
1991
Confidence
0.99
Version strategy
SEPARATE_ENTITY
Version tag
3

Maturity reasoning: Python appears in a very high volume of job descriptions across data, backend, automation, and ML roles, and remains a default hiring-pipeline language on major job boards and tech stacks.

Skill profile (library / DB)

Skill nature
LANGUAGE
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
6
Sub-category id
96
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Cloud Security Scripting & DSL Languages Catalog dimension db id 248

    Library dimension (catalog)

    Roles linked in library: Cloud Security Engineer

  • Programming Languages Catalog dimension db id 1

    Library dimension (catalog)

    Roles linked in library: Backend Developer, Fullstack Developer, Fullstack Developer

  • Programming Languages and Scripting Catalog dimension db id 59

    Library dimension (catalog)

    Roles linked in library: Cyber Security Engineer

  • Programming Languages for Data Work Catalog dimension db id 21

    Library dimension (catalog)

    Roles linked in library: Data Engineer

  • Programming Languages for ML Systems Catalog dimension db id 39

    Library dimension (catalog)

    Roles linked in library: ML Engineer, MLOps Engineer

  • Programming Languages for XR Catalog dimension db id 97

    Library dimension (catalog)

    Roles linked in library: AR/VR Engineer

  • Python Programming Catalog dimension db id 290

    Library dimension (catalog)

    Roles linked in library: Python Backend Developer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Cloud Security Scripting & DSL Languages
cloud-security-scripting-dsl-languages
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages
programming-languages
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages and Scripting
programming-languages-and-scripting
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages for Data Work
programming-languages-for-data-work
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages for ML Systems
programming-languages-for-ml-systems
Existing dimension (library) · Role↔dimension saved
Programming Languages for XR
programming-languages-for-xr
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python Programming
python-programming
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
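The outcomes in this report are consistent with a simple rule: a role↔dimension link is saved only when the chosen role is among the roles the library already lists for that dimension, and rows for skills tagged as new are skipped outright. The sketch below encodes that reading using the Python dimensions listed above; it is inferred from the rows shown here, not taken from the service's code, and `link_outcome` is a hypothetical name.

```python
from typing import Dict, Iterable, List

CHOSEN_ROLE = "MLOps Engineer"

# Dimension -> roles linked in the library, copied from the Python worklist.
PYTHON_DIMENSIONS: Dict[str, List[str]] = {
    "Cloud Security Scripting & DSL Languages": ["Cloud Security Engineer"],
    "Programming Languages": ["Backend Developer", "Fullstack Developer"],
    "Programming Languages and Scripting": ["Cyber Security Engineer"],
    "Programming Languages for Data Work": ["Data Engineer"],
    "Programming Languages for ML Systems": ["ML Engineer", "MLOps Engineer"],
    "Programming Languages for XR": ["AR/VR Engineer"],
    "Python Programming": ["Python Backend Developer"],
}


def link_outcome(skill_tag: str, dimension_roles: Iterable[str],
                 chosen_role: str) -> str:
    """Inferred rule for one (skill x dimension) work item."""
    if skill_tag == "new":
        # Rows for new skills are skipped before any link is attempted.
        return "skipped (no persistable v3 meta for new skill)"
    if chosen_role in set(dimension_roles):
        return "role-dimension saved"
    return "role-dimension skipped (dimension not under chosen role)"


outcomes = {
    dim: link_outcome("in_db", roles, CHOSEN_ROLE)
    for dim, roles in PYTHON_DIMENSIONS.items()
}
for dim, outcome in outcomes.items():
    print(f"{dim}: {outcome}")
```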
SQL · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)
Canonical: SQL id=101 · sql

Aliases — catalog

  • SQL (CANONICAL) primary

Context tags (catalog)

ACID CTE DDL DML ETL JOIN MySQL NoSQL OLAP ORM PostgreSQL SQL injection SQLite T-SQL data modeling data warehousing database normalization execution plan indexing joins normalization query optimization stored procedures subquery transaction isolation transaction management window functions

Stored enrichment (catalog DB)

Category
Language
Sub-category
Query Language
Vendor
ANSI
License
unknown
Year introduced
1974
Confidence
0.99
Version strategy
NOT_APPLICABLE

Maturity reasoning: SQL appears in a large share of data, backend, and analytics job descriptions and remains the default query language for PostgreSQL, MySQL, and cloud warehouses like Snowflake/BigQuery.

Skill profile (library / DB)

Skill nature
LANGUAGE
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
6
Sub-category id
97
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Pega Programming Languages & DSLs Catalog dimension db id 267

    Library dimension (catalog)

    Roles linked in library: Pega Developer

  • Programming Languages for Data Work Catalog dimension db id 21

    Library dimension (catalog)

    Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Pega Programming Languages & DSLs
pega-programming-languages-dsls
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Programming Languages for Data Work
programming-languages-for-data-work
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
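Most dimension slugs in this report look like a lowercase, hyphen-joined form of the display name, with punctuation such as "&" and "/" dropped. The sketch below reproduces that pattern; it is an inference from the visible pairs, and it does not cover every stored slug (the role slug `ml-ops-engineer` and the dimension slug `d_init_01` do not follow it).

```python
import re


def slugify(name: str) -> str:
    """Lowercase the name and collapse runs of non-alphanumerics to '-'."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


# Display name -> slug pairs as they appear in this report.
EXAMPLES = {
    "Pega Programming Languages & DSLs": "pega-programming-languages-dsls",
    "Programming Languages for Data Work": "programming-languages-for-data-work",
    "CI/CD for Machine Learning": "ci-cd-for-machine-learning",
    "Cloud & Hosting Providers": "cloud-hosting-providers",
}

for name, stored in EXAMPLES.items():
    print(f"{name} -> {slugify(name)} (stored: {stored})")
```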
Google Cloud Platform · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)
Canonical: Google Cloud Platform id=425 · google-cloud-platform

Aliases — catalog

  • Google Cloud Platform (CANONICAL) primary

Context tags (catalog)

Anthos App Engine Artifact Registry BigQuery Cloud Build Cloud Functions Cloud Monitoring Cloud Pub/Sub Cloud Run Cloud SQL Cloud Spanner Cloud Storage Compute Engine Dataflow Dataproc GCP GKE IAM Kubernetes Kubernetes Engine Pub/Sub Serverless Stackdriver Terraform VPC

Stored enrichment (catalog DB)

Category
Platform
Sub-category
Cloud Platform
Vendor
Google
License
other_open
Year introduced
2008
Confidence
0.99
Version strategy
NOT_APPLICABLE

Maturity reasoning: GCP appears in many cloud-engineering job descriptions alongside AWS/Azure, and Google continues expanding managed services and certifications, indicating broad hiring demand rather than niche use.

Skill profile (library / DB)

Skill nature
PLATFORM
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
9
Sub-category id
46
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Cloud & Hosting Providers Catalog dimension db id 414

    Library dimension (catalog)

    Roles linked in library: PHP Backend Developer

  • Cloud Provider Platforms Catalog dimension db id 131

    Library dimension (catalog)

    Roles linked in library: Cloud Architect, Cloud Security Engineer

  • Cloud Security Posture Tools Catalog dimension db id 64

    Library dimension (catalog)

    Roles linked in library: Cloud Security Engineer, Cyber Security Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Cloud & Hosting Providers
cloud-hosting-providers
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Cloud Provider Platforms
cloud-provider-platforms
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Cloud Security Posture Tools
cloud-security-posture-tools
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Apache Airflow · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)
Canonical: Apache Airflow id=110 · apache-airflow

Aliases — catalog

  • Apache Airflow (CANONICAL) primary

Context tags (catalog)

CeleryExecutor DAG ETL KubernetesExecutor Sensors XCom backfill catchup cron data pipelines executor hooks operators scheduler task dependencies

Stored enrichment (catalog DB)

Category
Tool
Sub-category
Workflow Orchestration Tool
Vendor
Apache Software Foundation
License
apache_2
Year introduced
2015
Confidence
0.98
Version strategy
NOT_APPLICABLE

Maturity reasoning: Frequently listed in data engineering JDs and widely adopted for workflow orchestration; strong GitHub activity and managed offerings from AWS/GCP/Azure signal broad market demand.

Skill profile (library / DB)

Skill nature
TOOL
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
13
Sub-category id
130
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Data Pipeline Orchestration Catalog dimension db id 23

    Library dimension (catalog)

    Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Data Pipeline Orchestration
data-pipeline-orchestration
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
PySpark · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)
Canonical: Apache Spark id=1350 · apache-spark

Aliases — catalog

  • Apache Spark (CANONICAL)
  • apache spark 3 (VERSION)
  • spark (VERSION)
  • spark 3 (VERSION)
  • spark 3.x (VERSION)
  • spark3 (VERSION)

Context tags (catalog)

Apache Kafka Cluster Manager DAGScheduler Data Lake DataFrame ETL Hadoop MLlib Machine Learning PySpark RDD Scala Spark SQL Spark Streaming SparkSession

Stored enrichment (catalog DB)

Category
Framework
Sub-category
Distributed Data Processing Framework
Vendor
Apache Software Foundation
License
apache_2
Year introduced
2010
Confidence
0.94
Version strategy
SEPARATE_ENTITY
Version tag
3.x

Maturity reasoning: Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it.

Skill profile (library / DB)

Skill nature
FRAMEWORK
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
5
Sub-category id
1021
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • ETL and ELT Tooling Catalog dimension db id 24

    Library dimension (catalog)

    Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
ETL and ELT Tooling
etl-and-elt-tooling
Skipped — no persistable v3 meta for new skill
skill_not_in_db_v3_proposed
ETL · Primary · New / orchestrated · API 3: new canonical path (new) · New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Data Engineering Tools
Sub-category
general
Skill nature
PRACTICE
Volatility
MEDIUM
Typical lifespan
MULTI_YEAR
Version strategy
UNVERSIONED
Kubernetes · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)
Canonical: Kubernetes id=726 · kubernetes

Aliases — catalog

  • Kubernetes (CANONICAL) primary
  • Kubernetes 1.0+ (VERSION)
  • Kubernetes 1.x (VERSION)
  • Kubernetes v1 (VERSION)
  • k8s (VERSION)
  • kubernetes 1.x (VERSION)
  • kubernetes latest (VERSION)

Context tags (catalog)

CI/CD Cluster Autoscaler ConfigMap DaemonSet Deployment Docker Grafana Helm Ingress Istio K8s Kubelet Namespace Pod Prometheus RBAC Secret Service StatefulSet containerization deployment etcd kubectl load balancing microservices namespace orchestration persistent storage scalability service mesh

Stored enrichment (catalog DB)

Category
Platform
Sub-category
Container Orchestration Platform
Vendor
Cloud Native Computing Foundation
License
apache_2
Year introduced
2014
Confidence
0.90
Version strategy
SEPARATE_ENTITY
Version tag
1.30

Maturity reasoning: Broadly adopted in cloud-native stacks; Kubernetes appears in a large share of DevOps/SRE job descriptions and is the default orchestration platform across major cloud vendors.

Skill profile (library / DB)

Skill nature
PLATFORM
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
9
Sub-category id
557
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • Container Orchestration Platforms Catalog dimension db id 134

    Library dimension (catalog)

    Roles linked in library: Cloud Architect, DevOps Engineer

  • Kubernetes for ML Workloads Catalog dimension db id 47

    Library dimension (catalog)

    Roles linked in library: ML Engineer, MLOps Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
Container Orchestration Platforms
container-orchestration-platforms
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Kubernetes for ML Workloads
kubernetes-for-ml-workloads
Existing dimension (library) · Role↔dimension saved
CI/CD · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)
Canonical: CI/CD id=1190 · ci-cd

Aliases — catalog

  • CI/CD (CANONICAL)

Context tags (catalog)

Ansible CircleCI Docker GitLab CI Jenkins Kubernetes Terraform Travis CI automated testing build automation continuous deployment continuous integration deployment pipelines monitoring version control

Stored enrichment (catalog DB)

Category
Methodology
Sub-category
Ci Cd Process
Confidence
0.93
Version strategy
NOT_APPLICABLE

Maturity reasoning: CI/CD appears in a large share of software engineering JDs and is a standard requirement across DevOps, platform, and backend roles; major vendors like GitHub, GitLab, and AWS all center product roadmaps on CI/CD pipelines.

Skill profile (library / DB)

Skill nature
METHODOLOGY
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
8
Sub-category id
900
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • CI/CD Pipeline Platforms Catalog dimension db id 150

    Library dimension (catalog)

    Roles linked in library: DevOps Engineer

  • CI/CD for Machine Learning Catalog dimension db id 56

    Library dimension (catalog)

    Roles linked in library: ML Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
CI/CD Pipeline Platforms
ci-cd-pipeline-platforms
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
CI/CD for Machine Learning
ci-cd-for-machine-learning
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
TensorFlow · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)
Canonical: TensorFlow id=196 · tensorflow

Aliases — catalog

  • TensorFlow (CANONICAL) primary
  • TF1 (VERSION)
  • TF2 (VERSION)
  • TensorFlow 1 (VERSION)
  • TensorFlow 1.x (VERSION)
  • TensorFlow 2 (VERSION)
  • TensorFlow 2.x (VERSION)
  • tensorflow 1 (VERSION)
  • tensorflow 1.x (VERSION)
  • tensorflow 2 (VERSION)
  • tensorflow 2.x (VERSION)
  • tensorflow v1 (VERSION)
  • tensorflow v2 (VERSION)
  • tf (VERSION)
  • tf1 (VERSION)
  • tf2 (VERSION)

Context tags (catalog)

AutoGraph Distributed Training Eager Execution Estimator GPU Gradient Descent Hyperparameter Tuning Keras ModelCheckpoint Neural Networks ONNX SavedModel TF Lite TF Serving TF.js TFX TPU TensorBoard TensorFlow Hub TensorFlow Lite TensorFlow Serving Transfer Learning XLA tf.data tf.keras

Stored enrichment (catalog DB)

Category
Library
Sub-category
Machine Learning Library
Vendor
Google
License
apache_2
Year introduced
2015
Confidence
0.90
Version strategy
SEPARATE_ENTITY
Version tag
2.x

Maturity reasoning: TensorFlow appears in many ML/AI job descriptions and remains a standard production framework, with strong GitHub activity and broad vendor support from Google and cloud platforms.

Skill profile (library / DB)

Skill nature
LIBRARY
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
7
Sub-category id
156
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • ML Frameworks and Libraries Catalog dimension db id 40

    Library dimension (catalog)

    Roles linked in library: ML Engineer, MLOps Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
ML Frameworks and Libraries
ml-frameworks-and-libraries
Existing dimension (library) · Role↔dimension saved
Kubeflow Pipelines · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)
Canonical: Kubeflow id=213 · kubeflow

Aliases — catalog

  • Kubeflow (CANONICAL) primary
  • Kubeflow 1.x (VERSION)
  • Kubeflow 2.x (VERSION)
  • Kubeflow v1 (VERSION)
  • Kubeflow v2 (VERSION)

Context tags (catalog)

Argo Argo Workflows CI/CD Data preprocessing GPU scheduling Hyperparameter tuning Istio Jupyter notebooks KFServing Katib Kubeflow Pipelines Kubeflow Training Kubernetes ML pipelines MLOps MLflow MinIO Model serving Pipeline components PyTorch Seldon Seldon Core TensorFlow model serving

Stored enrichment (catalog DB)

Category
Framework
Sub-category
Mlops Framework
Vendor
Google
License
apache_2
Year introduced
2017
Confidence
0.90
Version strategy
NOT_APPLICABLE

Maturity reasoning: Kubeflow appears in some MLOps/ML platform JDs, but far less often than Kubernetes or managed ML platforms; GitHub activity is steady yet adoption remains specialized to ML infrastructure teams.

Skill profile (library / DB)

Skill nature
FRAMEWORK
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
5
Sub-category id
1127
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • MLOps Platforms and Lifecycle Catalog dimension db id 43

    Library dimension (catalog)

    Roles linked in library: ML Engineer, MLOps Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
MLOps Platforms and Lifecycle
mlops-platforms-and-lifecycle
Skipped — no persistable v3 meta for new skill
skill_not_in_db_v3_proposed
TFX · Primary · New / orchestrated · API 3: new canonical path (new) · New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Machine Learning Frameworks
Sub-category
general
Skill nature
TOOL
Volatility
FAST
Typical lifespan
SHORT_LIVED
Version strategy
VERSIONED
Machine Learning · Primary · Library skill · API 3: existing canonical (in_db) · Existing skill (matched library)
Canonical: Machine Learning id=1356 · machine-learning

Aliases — catalog

  • Machine Learning (CANONICAL)

Context tags (catalog)

Keras PyTorch TensorFlow cross-validation data preprocessing ensemble methods feature engineering hyperparameter tuning model evaluation natural language processing neural networks reinforcement learning scikit-learn supervised learning unsupervised learning

Stored enrichment (catalog DB)

Category
Concept
Sub-category
Machine Learning
Confidence
0.98
Version strategy
NOT_APPLICABLE

Maturity reasoning: Machine Learning appears in large volumes of job descriptions across data, product, and platform roles, and major cloud vendors (AWS, Google Cloud, Azure) offer dedicated ML services and certifications, indicating broad adoption.

Skill profile (library / DB)

Skill nature
CONCEPT
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
2
Sub-category id
1024
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • AI Governance and Model Security Catalog dimension db id 50

    Library dimension (catalog)

    Roles linked in library: AI Engineer, ML Engineer, MLOps Engineer

  • React Frontend Development Catalog dimension db id 96

    Library dimension (catalog)

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
AI Governance and Model Security
ai-governance-and-model-security
Existing dimension (library) · Role↔dimension saved
React Frontend Development
d_init_01
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
OOP · Secondary · New / orchestrated · API 3: new canonical path (new) · New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Concepts
Sub-category
general
Skill nature
CONCEPT
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
UNVERSIONED
Databases · Secondary · New / orchestrated · API 3: new canonical path (new) · New / unmatched skill (orchestrated in API 2)

Skill enrichment (orchestrator / LLM)

No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).

Derived legacy fields
Category
Databases
Sub-category
general
Skill nature
CONCEPT
Volatility
STABLE
Typical lifespan
EVERGREEN
Version strategy
UNVERSIONED

All API 3 persistence rows

Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.

Skill Tag Dimension Skill↔dim Role↔dim Outcome Notes
Python | in_db | Cloud Security Scripting & DSL Languages | cloud-security-scripting-dsl-languages | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python | in_db | Programming Languages | programming-languages | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python | in_db | Programming Languages and Scripting | programming-languages-and-scripting | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python | in_db | Programming Languages for Data Work | programming-languages-for-data-work | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python | in_db | Programming Languages for ML Systems | programming-languages-for-ml-systems | Existing dimension (library) · Role↔dimension saved
Python | in_db | Programming Languages for XR | programming-languages-for-xr | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Python | in_db | Python Programming | python-programming | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
SQL | in_db | Pega Programming Languages & DSLs | pega-programming-languages-dsls | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
SQL | in_db | Programming Languages for Data Work | programming-languages-for-data-work | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Google Cloud Platform | in_db | Cloud & Hosting Providers | cloud-hosting-providers | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Google Cloud Platform | in_db | Cloud Provider Platforms | cloud-provider-platforms | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Google Cloud Platform | in_db | Cloud Security Posture Tools | cloud-security-posture-tools | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Apache Airflow | in_db | Data Pipeline Orchestration | data-pipeline-orchestration | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
PySpark | new | ETL and ELT Tooling | etl-and-elt-tooling | Skipped — no persistable v3 meta for new skill skill_not_in_db_v3_proposed
Kubernetes | in_db | Container Orchestration Platforms | container-orchestration-platforms | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Kubernetes | in_db | Kubernetes for ML Workloads | kubernetes-for-ml-workloads | Existing dimension (library) · Role↔dimension saved
CI/CD | in_db | CI/CD Pipeline Platforms | ci-cd-pipeline-platforms | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
CI/CD | in_db | CI/CD for Machine Learning | ci-cd-for-machine-learning | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
TensorFlow | in_db | ML Frameworks and Libraries | ml-frameworks-and-libraries | Existing dimension (library) · Role↔dimension saved
Kubeflow Pipelines | new | MLOps Platforms and Lifecycle | mlops-platforms-and-lifecycle | Skipped — no persistable v3 meta for new skill skill_not_in_db_v3_proposed
Machine Learning | in_db | AI Governance and Model Security | ai-governance-and-model-security | Existing dimension (library) · Role↔dimension saved
Machine Learning | in_db | React Frontend Development | d_init_01 | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)

Library artifacts (this run)

Kind Detail DB id
canonical_skill_proposed ETL | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR
canonical_skill_proposed TFX | type=Machine Learning Frameworks subtype=general nature=TOOL lifespan=SHORT_LIVED
canonical_skill_proposed OOP | type=Concepts subtype=general nature=CONCEPT lifespan=EVERGREEN
canonical_skill_proposed Databases | type=Databases subtype=general nature=CONCEPT lifespan=EVERGREEN
dimension_skill_link_proposed PySpark ↔ ETL and ELT Tooling
dimension_skill_link_proposed Kubeflow Pipelines ↔ MLOps Platforms and Lifecycle
role_dimension_link_proposed MLOps Engineer ↔ MLOps Platforms and Lifecycle
nano JD Parser — gpt-4.1-nano
Role: DevOps/ML Engineer
Company: EXL
Experience: 3+ Years of experience in algorithms, machine learning, data science (or) statistics
Domain: IT Services & Consulting
JD type: pass
Raw JSON
{
  "JD_type": "pass",
  "about_company": {
    "source_marker": {
      "first_5_words": "EXL (NASDAQ: EXLS) is a",
      "last_5_words": "in South America, Australia and South Africa."
    },
    "text": "EXL (NASDAQ: EXLS) is a leading operations management and analytics company that designs and enables agile, customer-centric operating models to help clients improve their revenue growth and profitability. Our delivery model provides market-leading business outcomes using EXL\u2019s proprietary Business EXLerator Framework\u2122, cutting-edge analytics, digital transformation and domain expertise. At EXL, we look deeper to help companies improve global operations, enhance data-driven insights, increase customer satisfaction, and manage risk and compliance. EXL serves the insurance, healthcare, banking and financial services, utilities, travel, transportation and logistics industries. Headquartered in New York, New York, EXL has more than 32,000 professionals in locations throughout the United States, Europe, Asia (primarily India and Philippines), South America, Australia and South Africa.",
    "word_count": 100
  },
  "certifications": [],
  "company_name": "EXL",
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [
        "Analytics",
        "Consulting"
      ],
      "domain": "IT Services \u0026 Consulting"
    },
    "secondary": null
  },
  "education": [
    {
      "level": "Bachelor\u0027s",
      "qualification": "Master\u0027s/Bachelor\u0027s - Computer Science (or) Math Heavy Degrees",
      "raw": "Masters or Bachelor\u0027s degree in Computer Science (or) math heavy degrees from top-tier universities with strong record of achievement",
      "requirement": "required"
    }
  ],
  "experience": {
    "max": null,
    "min": 3,
    "raw": "3+ Years of experience in algorithms, machine learning, data science (or) statistics"
  },
  "job_locations": [],
  "role": "DevOps/ML Engineer",
  "role_aliases": [
    "ML Engineer",
    "DevOps Engineer",
    "Machine Learning Engineer"
  ],
  "role_archetype": "Engineering",
  "roles_and_responsibilities": [
    {
      "bullet_count": 0,
      "heading": "Role Overview",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "EXL provides consulting and analytics",
        "last_5_words": "using airflow, Python and SQL."
      },
      "text": "EXL provides consulting and analytics support to fortune 500 companies across multiple industry domains. For this role, you will be supporting the data science team of a leading US Media firm. While working on Google Cloud Platform (GCP), you are expected to develop ML pipeline solutions using airflow, Python and SQL.",
      "word_count": 51
    },
    {
      "bullet_count": 8,
      "heading": "Required Skills",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 Advanced Python (building applications",
        "last_5_words": "improvements in how we work"
      },
      "text": "\u2022 Advanced Python (building applications using python code OOPs concepts), advanced SQL skills and experience with Cloud Platforms (GCP preferred)\n\u2022 Experience in developing scalable and robust data pipelines using Python, SQL, PySpark, ETL orchestration tools like Airflow.\n\u2022 A good understanding of system design, Databases and OOPs concepts\n\u2022 Work with large, big data sources, focusing on efficient data creation and feature engineering and create pipelines that feed the data science models\n\u2022 Develop and maintain Build, Continuous Deployment, and Continuous Integration systems.\n\u2022 Hands on experience in Kubernetes, CI/CD pipelines, continually improve CI/CD tools, processes, and procedures.\n\u2022 A good understanding of system architecture (pertaining to data pipelines) and the availability to solve complex problems\n\u2022 Contribute to the team through mentorship, technical methods, improvements in how we work",
      "word_count": 116
    },
    {
      "bullet_count": 4,
      "heading": "Good To Have Skills",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 Design \u0026 build data pipelines",
        "last_5_words": "machine learning lifecycle infrastructure"
      },
      "text": "\u2022 Design \u0026 build data pipelines and production level ML infrastructure, using tools such as TFX, , Kubeflow Pipelines, TensorFlow\n\u2022 Experience in deploying scalable ML Models in cloud platforms, setting up alerting, restartability etc.\n\u2022 Deploy ML models under the constraints of scalability, correctness, and maintainability.\n\u2022 Drive work on creating a state-of-the-art codebase and machine learning lifecycle infrastructure",
      "word_count": 66
    }
  ],
  "urls": [
    {
      "type": "website",
      "url": "http://www.exlservice.com"
    }
  ]
}
API 1 — extract-from-jd
{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Python"
    },
    {
      "is_primary": true,
      "skill_name": "SQL"
    },
    {
      "is_primary": true,
      "skill_name": "Google Cloud Platform"
    },
    {
      "is_primary": true,
      "skill_name": "Apache Airflow"
    },
    {
      "is_primary": true,
      "skill_name": "PySpark"
    },
    {
      "is_primary": true,
      "skill_name": "ETL"
    },
    {
      "is_primary": true,
      "skill_name": "Kubernetes"
    },
    {
      "is_primary": true,
      "skill_name": "CI/CD"
    },
    {
      "is_primary": true,
      "skill_name": "TensorFlow"
    },
    {
      "is_primary": true,
      "skill_name": "Kubeflow Pipelines"
    },
    {
      "is_primary": true,
      "skill_name": "TFX"
    },
    {
      "is_primary": true,
      "skill_name": "Machine Learning"
    },
    {
      "is_primary": false,
      "skill_name": "OOP"
    },
    {
      "is_primary": false,
      "skill_name": "Databases"
    }
  ],
  "jd_role": {
    "display_name": "DevOps/ML Engineer",
    "rationale": null,
    "role_aliases": [
      "ML Engineer",
      "DevOps Engineer",
      "Machine Learning Engineer"
    ],
    "role_archetype": "Engineering",
    "slug": ""
  },
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": {
      "source_marker": {
        "first_5_words": "EXL (NASDAQ: EXLS) is a",
        "last_5_words": "in South America, Australia and South Africa."
      },
      "text": "EXL (NASDAQ: EXLS) is a leading operations management and analytics company that designs and enables agile, customer-centric operating models to help clients improve their revenue growth and profitability. Our delivery model provides market-leading business outcomes using EXL\u2019s proprietary Business EXLerator Framework\u2122, cutting-edge analytics, digital transformation and domain expertise. At EXL, we look deeper to help companies improve global operations, enhance data-driven insights, increase customer satisfaction, and manage risk and compliance. EXL serves the insurance, healthcare, banking and financial services, utilities, travel, transportation and logistics industries. Headquartered in New York, New York, EXL has more than 32,000 professionals in locations throughout the United States, Europe, Asia (primarily India and Philippines), South America, Australia and South Africa.",
      "word_count": 100
    },
    "certifications": [],
    "company_name": "EXL",
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [
          "Analytics",
          "Consulting"
        ],
        "domain": "IT Services \u0026 Consulting"
      },
      "secondary": null
    },
    "education": [
      {
        "level": "Bachelor\u0027s",
        "qualification": "Master\u0027s/Bachelor\u0027s - Computer Science (or) Math Heavy Degrees",
        "raw": "Masters or Bachelor\u0027s degree in Computer Science (or) math heavy degrees from top-tier universities with strong record of achievement",
        "requirement": "required"
      }
    ],
    "experience": {
      "max": null,
      "min": 3,
      "raw": "3+ Years of experience in algorithms, machine learning, data science (or) statistics"
    },
    "job_locations": [],
    "role": "DevOps/ML Engineer",
    "role_aliases": [
      "ML Engineer",
      "DevOps Engineer",
      "Machine Learning Engineer"
    ],
    "role_archetype": "Engineering",
    "roles_and_responsibilities": [
      {
        "bullet_count": 0,
        "heading": "Role Overview",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "EXL provides consulting and analytics",
          "last_5_words": "using airflow, Python and SQL."
        },
        "text": "EXL provides consulting and analytics support to fortune 500 companies across multiple industry domains. For this role, you will be supporting the data science team of a leading US Media firm. While working on Google Cloud Platform (GCP), you are expected to develop ML pipeline solutions using airflow, Python and SQL.",
        "word_count": 51
      },
      {
        "bullet_count": 8,
        "heading": "Required Skills",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 Advanced Python (building applications",
          "last_5_words": "improvements in how we work"
        },
        "text": "\u2022 Advanced Python (building applications using python code OOPs concepts), advanced SQL skills and experience with Cloud Platforms (GCP preferred)\n\u2022 Experience in developing scalable and robust data pipelines using Python, SQL, PySpark, ETL orchestration tools like Airflow.\n\u2022 A good understanding of system design, Databases and OOPs concepts\n\u2022 Work with large, big data sources, focusing on efficient data creation and feature engineering and create pipelines that feed the data science models\n\u2022 Develop and maintain Build, Continuous Deployment, and Continuous Integration systems.\n\u2022 Hands on experience in Kubernetes, CI/CD pipelines, continually improve CI/CD tools, processes, and procedures.\n\u2022 A good understanding of system architecture (pertaining to data pipelines) and the availability to solve complex problems\n\u2022 Contribute to the team through mentorship, technical methods, improvements in how we work",
        "word_count": 116
      },
      {
        "bullet_count": 4,
        "heading": "Good To Have Skills",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 Design \u0026 build data pipelines",
          "last_5_words": "machine learning lifecycle infrastructure"
        },
        "text": "\u2022 Design \u0026 build data pipelines and production level ML infrastructure, using tools such as TFX, , Kubeflow Pipelines, TensorFlow\n\u2022 Experience in deploying scalable ML Models in cloud platforms, setting up alerting, restartability etc.\n\u2022 Deploy ML models under the constraints of scalability, correctness, and maintainability.\n\u2022 Drive work on creating a state-of-the-art codebase and machine learning lifecycle infrastructure",
        "word_count": 66
      }
    ],
    "urls": [
      {
        "type": "website",
        "url": "http://www.exlservice.com"
      }
    ]
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "f44ceab1-c4b3-4c44-93df-420af9b73fce",
  "stage3_signals": {
    "alias_found": true,
    "alias_match_roles": [
      {
        "display_name": "ML Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 3,
        "score": 1.0,
        "slug": "ml-engineer",
        "total_count": null
      },
      {
        "display_name": "DevOps Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 10,
        "score": 1.0,
        "slug": "devops-engineer",
        "total_count": null
      }
    ],
    "kra_match_roles": [
      {
        "display_name": "ML Engineer",
        "kra_matches": [
          {
            "kra_text": "Designs end-to-end ML training pipelines and model inference workflows using TensorFlow, PyTorch, or scikit-learn on cloud ML platforms.",
            "sentence": "Design \u0026 build data pipelines and production level ML infrastructure, using tools such as TFX, , Kubeflow Pipelines, TensorFlow",
            "similarity": 0.6663
          },
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "Work with large, big data sources, focusing on efficient data creation and feature engineering and create pipelines that feed the data science models",
            "similarity": 0.6255
          },
          {
            "kra_text": "Designs end-to-end ML training pipelines and model inference workflows using TensorFlow, PyTorch, or scikit-learn on cloud ML platforms.",
            "sentence": "While working on Google Cloud Platform (GCP), you are expected to develop ML pipeline solutions using airflow, Python and SQL.",
            "similarity": 0.536
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 3,
        "score": 0.6093,
        "slug": "ml-engineer",
        "total_count": null
      },
      {
        "display_name": "Data Engineer",
        "kra_matches": [
          {
            "kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
            "sentence": "Design \u0026 build data pipelines and production level ML infrastructure, using tools such as TFX, , Kubeflow Pipelines, TensorFlow",
            "similarity": 0.6139
          },
          {
            "kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
            "sentence": "Work with large, big data sources, focusing on efficient data creation and feature engineering and create pipelines that feed the data science models",
            "similarity": 0.6114
          },
          {
            "kra_text": "Optimizes pipeline throughput, partitioning strategies, and query performance across cloud data warehouses like Snowflake, BigQuery, or Redshift.",
            "sentence": "While working on Google Cloud Platform (GCP), you are expected to develop ML pipeline solutions using airflow, Python and SQL.",
            "similarity": 0.5164
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 0.5806,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "DevOps Engineer",
        "kra_matches": [
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Develop and maintain Build, Continuous Deployment, and Continuous Integration systems.",
            "similarity": 0.6668
          },
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Hands on experience in Kubernetes, CI/CD pipelines, continually improve CI/CD tools, processes, and procedures.",
            "similarity": 0.603
          },
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Contribute to the team through mentorship, technical methods, improvements in how we work",
            "similarity": 0.4644
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 10,
        "score": 0.5781,
        "slug": "devops-engineer",
        "total_count": null
      },
      {
        "display_name": "Fullstack Developer",
        "kra_matches": [
          {
            "kra_text": "Delivers features through CI/CD pipelines using automated tests, staged rollouts, feature flags, and incremental deployments.",
            "sentence": "Develop and maintain Build, Continuous Deployment, and Continuous Integration systems.",
            "similarity": 0.5858
          },
          {
            "kra_text": "Delivers features through CI/CD pipelines using automated tests, staged rollouts, feature flags, and incremental deployments.",
            "sentence": "Hands on experience in Kubernetes, CI/CD pipelines, continually improve CI/CD tools, processes, and procedures.",
            "similarity": 0.5646
          },
          {
            "kra_text": "Designs and queries relational databases like PostgreSQL and document stores like MongoDB, writing migrations, indexes, and optimized queries.",
            "sentence": "A good understanding of system design, Databases and OOPs concepts",
            "similarity": 0.4875
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 15,
        "score": 0.546,
        "slug": "full-stack-engineer",
        "total_count": null
      },
      {
        "display_name": "MLOps Engineer",
        "kra_matches": [
          {
            "kra_text": "Orchestrates model serving deployments to production using Kubernetes, MLflow Model Registry, SageMaker, or Kubeflow Serving infrastructure.",
            "sentence": "Design \u0026 build data pipelines and production level ML infrastructure, using tools such as TFX, , Kubeflow Pipelines, TensorFlow",
            "similarity": 0.5843
          },
          {
            "kra_text": "Automates ML platform operations including scheduled retraining triggers, pipeline orchestration, evaluation workflows, and alerting configuration.",
            "sentence": "While working on Google Cloud Platform (GCP), you are expected to develop ML pipeline solutions using airflow, Python and SQL.",
            "similarity": 0.5022
          },
          {
            "kra_text": "Automates ML platform operations including scheduled retraining triggers, pipeline orchestration, evaluation workflows, and alerting configuration.",
            "sentence": "Drive work on creating a state-of-the-art codebase and machine learning lifecycle infrastructure",
            "similarity": 0.4862
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 16,
        "score": 0.5243,
        "slug": "ml-ops-engineer",
        "total_count": null
      }
    ],
    "skill_match_roles": [
      {
        "display_name": "ML Engineer",
        "kra_matches": null,
        "matched_count": 5,
        "matched_skills": [
          "CI/CD",
          "Kubernetes",
          "Machine Learning",
          "Python",
          "TensorFlow"
        ],
        "role_id": 3,
        "score": 0.4167,
        "slug": "ml-engineer",
        "total_count": 12
      },
      {
        "display_name": "MLOps Engineer",
        "kra_matches": null,
        "matched_count": 4,
        "matched_skills": [
          "Kubernetes",
          "Machine Learning",
          "Python",
          "TensorFlow"
        ],
        "role_id": 16,
        "score": 0.3333,
        "slug": "ml-ops-engineer",
        "total_count": 12
      },
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": 3,
        "matched_skills": [
          "Apache Airflow",
          "Python",
          "SQL"
        ],
        "role_id": 2,
        "score": 0.25,
        "slug": "data-engineer",
        "total_count": 12
      },
      {
        "display_name": "Cyber Security Engineer",
        "kra_matches": null,
        "matched_count": 2,
        "matched_skills": [
          "Google Cloud Platform",
          "Python"
        ],
        "role_id": 5,
        "score": 0.1667,
        "slug": "cybersecurity-engineer",
        "total_count": 12
      },
      {
        "display_name": "Cloud Architect",
        "kra_matches": null,
        "matched_count": 2,
        "matched_skills": [
          "Google Cloud Platform",
          "Kubernetes"
        ],
        "role_id": 9,
        "score": 0.1667,
        "slug": "cloud-architect",
        "total_count": 12
      }
    ]
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "DOMAIN",
    "chosen_role": {
      "display_name": "MLOps Engineer",
      "kra_matches": null,
      "matched_count": null,
      "matched_skills": null,
      "role_id": 16,
      "score": 0.94,
      "slug": "ml-ops-engineer",
      "total_count": null
    },
    "confidence": 0.94,
    "is_new_role": false,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "matched_dimensions": [
      "ML pipeline engineering",
      "Data pipeline orchestration",
      "Cloud ML infrastructure",
      "CI/CD automation",
      "Scalable model deployment",
      "System design and architecture",
      "Feature engineering for big data",
      "Operational reliability and maintainability"
    ],
    "matched_kras": [
      "develop ML pipeline solutions using airflow, Python and SQL",
      "developing scalable and robust data pipelines",
      "create pipelines that feed the data science models",
      "Develop and maintain Build, Continuous Deployment, and Continuous Integration systems",
      "Hands on experience in Kubernetes, CI/CD pipelines",
      "Design \u0026 build data pipelines and production level ML infrastructure",
      "Experience in deploying scalable ML Models in cloud platforms",
      "setting up alerting, restartability etc.",
      "Drive work on creating a state-of-the-art codebase",
      "machine learning lifecycle infrastructure"
    ],
    "matched_skills": [
      "Google Cloud Platform (GCP)",
      "Python",
      "SQL",
      "PySpark",
      "Airflow",
      "Kubernetes",
      "CI/CD pipelines",
      "TFX",
      "Kubeflow Pipelines",
      "TensorFlow",
      "Databases",
      "OOPs concepts"
    ],
    "new_role_display_name": null,
    "new_role_slug": null,
    "queued": false,
    "reasoning": "Domain=AI / ML; The JD is centered on building and operating ML pipelines and infrastructure with Airflow, Kubernetes, CI/CD, and cloud deployment, which best matches MLOps engineering.",
    "sub_role": null
  },
  "stage5_updates": {
    "centroid_n_after": 6,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": null,
    "new_skills_attached": [
      {
        "is_primary": true,
        "queue_id": 8496,
        "role_display_name": "MLOps Engineer",
        "role_slug": "ml-ops-engineer",
        "skill_name": "PySpark",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 8497,
        "role_display_name": "MLOps Engineer",
        "role_slug": "ml-ops-engineer",
        "skill_name": "ETL",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 8498,
        "role_display_name": "MLOps Engineer",
        "role_slug": "ml-ops-engineer",
        "skill_name": "Kubeflow Pipelines",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 8499,
        "role_display_name": "MLOps Engineer",
        "role_slug": "ml-ops-engineer",
        "skill_name": "TFX",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 8500,
        "role_display_name": "MLOps Engineer",
        "role_slug": "ml-ops-engineer",
        "skill_name": "OOP",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 8501,
        "role_display_name": "MLOps Engineer",
        "role_slug": "ml-ops-engineer",
        "skill_name": "Databases",
        "status": "pending"
      }
    ],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}
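The `skill_match_roles` scores in the API 1 output above are consistent with a plain overlap ratio, `matched_count / total_count`, rounded to four decimals (for example, 5 / 12 ≈ 0.4167 for ML Engineer). The sketch below reproduces the reported numbers under that assumption. The function name `skill_match_score` is hypothetical, and the pipeline's actual scoring code is not shown on this page.

```python
def skill_match_score(matched_count: int, total_count: int) -> float:
    """Assumed formula: overlap ratio rounded to 4 decimal places."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return round(matched_count / total_count, 4)


# (matched_count, total_count, reported score), copied from skill_match_roles
reported = {
    "ml-engineer": (5, 12, 0.4167),
    "ml-ops-engineer": (4, 12, 0.3333),
    "data-engineer": (3, 12, 0.25),
    "cybersecurity-engineer": (2, 12, 0.1667),
    "cloud-architect": (2, 12, 0.1667),
}

# Check that the assumed formula agrees with every reported score.
for slug, (matched, total, score) in reported.items():
    assert skill_match_score(matched, total) == score, slug
```

`total_count` is 12 for every role, which coincides with the twelve skills flagged `is_primary` in `final_skills`. The page does not state that this is where the denominator comes from, so treat that reading as a guess.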
API 2 — extract-details
{
  "alias_matches": [
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 67,
      "existing_alias_text": "Python",
      "input_term": "Python",
      "matched_canonical": {
        "category_id": 6,
        "display_name": "Python",
        "id": 5,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "python",
        "sub_category_id": 96,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 271,
      "existing_alias_text": "SQL",
      "input_term": "SQL",
      "matched_canonical": {
        "category_id": 6,
        "display_name": "SQL",
        "id": 101,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "sql",
        "sub_category_id": 97,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 741,
      "existing_alias_text": "Google Cloud Platform",
      "input_term": "Google Cloud Platform",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "Google Cloud Platform",
        "id": 425,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "google-cloud-platform",
        "sub_category_id": 46,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 304,
      "existing_alias_text": "Apache Airflow",
      "input_term": "Apache Airflow",
      "matched_canonical": {
        "category_id": 13,
        "display_name": "Apache Airflow",
        "id": 110,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "apache-airflow",
        "sub_category_id": 130,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
      "alias_persisted": false,
      "existing_alias_id": 2004,
      "existing_alias_text": "Apache Spark",
      "input_term": "PySpark",
      "matched_canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "embedding_alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1267,
      "existing_alias_text": "Kubernetes",
      "input_term": "Kubernetes",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "Kubernetes",
        "id": 726,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "kubernetes",
        "sub_category_id": 557,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1826,
      "existing_alias_text": "CI/CD",
      "input_term": "CI/CD",
      "matched_canonical": {
        "category_id": 8,
        "display_name": "CI/CD",
        "id": 1190,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "METHODOLOGY",
        "slug": "ci-cd",
        "sub_category_id": 900,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 442,
      "existing_alias_text": "TensorFlow",
      "input_term": "TensorFlow",
      "matched_canonical": {
        "category_id": 7,
        "display_name": "TensorFlow",
        "id": 196,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LIBRARY",
        "slug": "tensorflow",
        "sub_category_id": 156,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
      "alias_persisted": false,
      "existing_alias_id": 465,
      "existing_alias_text": "Kubeflow",
      "input_term": "Kubeflow Pipelines",
      "matched_canonical": {
        "category_id": 5,
        "display_name": "Kubeflow",
        "id": 213,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "kubeflow",
        "sub_category_id": 1127,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "embedding_alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 2015,
      "existing_alias_text": "Machine Learning",
      "input_term": "Machine Learning",
      "matched_canonical": {
        "category_id": 2,
        "display_name": "Machine Learning",
        "id": 1356,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "machine-learning",
        "sub_category_id": 1024,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    }
  ],
  "candidate_roles": [
    {
      "display_name": "Cloud Security Engineer",
      "id": 23,
      "rationale": null,
      "role_archetype": null,
      "slug": "cloud-security-engineer",
      "source": "db"
    },
    {
      "display_name": "Backend Developer",
      "id": 1,
      "rationale": null,
      "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
      "slug": "backend-engineer",
      "source": "db"
    },
    {
      "display_name": "Fullstack Developer",
      "id": 435,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "fullstack-developer",
      "source": "db"
    },
    {
      "display_name": "Fullstack Developer",
      "id": 15,
      "rationale": null,
      "role_archetype": null,
      "slug": "full-stack-engineer",
      "source": "db"
    },
    {
      "display_name": "Cyber Security Engineer",
      "id": 5,
      "rationale": null,
      "role_archetype": null,
      "slug": "cybersecurity-engineer",
      "source": "db"
    },
    {
      "display_name": "Data Engineer",
      "id": 2,
      "rationale": null,
      "role_archetype": null,
      "slug": "data-engineer",
      "source": "db"
    },
    {
      "display_name": "ML Engineer",
      "id": 3,
      "rationale": null,
      "role_archetype": null,
      "slug": "ml-engineer",
      "source": "db"
    },
    {
      "display_name": "MLOps Engineer",
      "id": 16,
      "rationale": null,
      "role_archetype": null,
      "slug": "ml-ops-engineer",
      "source": "db"
    },
    {
      "display_name": "AR/VR Engineer",
      "id": 8,
      "rationale": null,
      "role_archetype": null,
      "slug": "ar-vr-engineer",
      "source": "db"
    },
    {
      "display_name": "Python Backend Developer",
      "id": 80,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "python-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Pega Developer",
      "id": 24,
      "rationale": null,
      "role_archetype": null,
      "slug": "pega-developer",
      "source": "db"
    },
    {
      "display_name": "PHP Backend Developer",
      "id": 86,
      "rationale": null,
      "role_archetype": "Engineering",
      "slug": "php-backend-developer",
      "source": "db"
    },
    {
      "display_name": "Cloud Architect",
      "id": 9,
      "rationale": null,
      "role_archetype": null,
      "slug": "cloud-architect",
      "source": "db"
    },
    {
      "display_name": "DevOps Engineer",
      "id": 10,
      "rationale": null,
      "role_archetype": null,
      "slug": "devops-engineer",
      "source": "db"
    },
    {
      "display_name": "AI Engineer",
      "id": 13,
      "rationale": null,
      "role_archetype": null,
      "slug": "ai-engineer",
      "source": "db"
    }
  ],
  "chosen_role": {
    "display_name": "MLOps Engineer",
    "id": 16,
    "rationale": "Domain=AI / ML; The JD is centered on building and operating ML pipelines and infrastructure with Airflow, Kubernetes, CI/CD, and cloud deployment, which best matches MLOps engineering.",
    "role_archetype": null,
    "slug": "ml-ops-engineer",
    "source": "db"
  },
  "dimensions": [
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Security Scripting & DSL Languages",
        "id": 248,
        "rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
        "slug": "cloud-security-scripting-dsl-languages",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Security Engineer",
          "id": 23,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-security-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages",
        "id": 1,
        "rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
        "slug": "programming-languages",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Backend Developer",
          "id": 1,
          "rationale": null,
          "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
          "slug": "backend-engineer",
          "source": "db"
        },
        {
          "display_name": "Fullstack Developer",
          "id": 435,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "fullstack-developer",
          "source": "db"
        },
        {
          "display_name": "Fullstack Developer",
          "id": 15,
          "rationale": null,
          "role_archetype": null,
          "slug": "full-stack-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages and Scripting",
        "id": 59,
        "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
        "slug": "programming-languages-and-scripting",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cyber Security Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for Data Work",
        "id": 21,
        "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
        "slug": "programming-languages-for-data-work",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for ML Systems",
        "id": 39,
        "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
        "slug": "programming-languages-for-ml-systems",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for XR",
        "id": 97,
        "rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
        "slug": "programming-languages-for-xr",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "AR/VR Engineer",
          "id": 8,
          "rationale": null,
          "role_archetype": null,
          "slug": "ar-vr-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Python Programming",
        "id": 290,
        "rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
        "slug": "python-programming",
        "source": "db"
      },
      "input_skill": "Python",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Python Backend Developer",
          "id": 80,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "python-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Pega Programming Languages & DSLs",
        "id": 267,
        "rationale": "Programming languages and domain-specific languages used in Pega development.",
        "slug": "pega-programming-languages-dsls",
        "source": "db"
      },
      "input_skill": "SQL",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Pega Developer",
          "id": 24,
          "rationale": null,
          "role_archetype": null,
          "slug": "pega-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Programming Languages for Data Work",
        "id": 21,
        "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
        "slug": "programming-languages-for-data-work",
        "source": "db"
      },
      "input_skill": "SQL",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud & Hosting Providers",
        "id": 414,
        "rationale": "Knowledge of major cloud and hosting vendor platforms for deploying and managing PHP applications.",
        "slug": "cloud-hosting-providers",
        "source": "db"
      },
      "input_skill": "Google Cloud Platform",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "PHP Backend Developer",
          "id": 86,
          "rationale": null,
          "role_archetype": "Engineering",
          "slug": "php-backend-developer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Provider Platforms",
        "id": 131,
        "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
        "slug": "cloud-provider-platforms",
        "source": "db"
      },
      "input_skill": "Google Cloud Platform",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        },
        {
          "display_name": "Cloud Security Engineer",
          "id": 23,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-security-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Cloud Security Posture Tools",
        "id": 64,
        "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
        "slug": "cloud-security-posture-tools",
        "source": "db"
      },
      "input_skill": "Google Cloud Platform",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Security Engineer",
          "id": 23,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-security-engineer",
          "source": "db"
        },
        {
          "display_name": "Cyber Security Engineer",
          "id": 5,
          "rationale": null,
          "role_archetype": null,
          "slug": "cybersecurity-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Data Pipeline Orchestration",
        "id": 23,
        "rationale": "Workflow engines that schedule, coordinate, and recover batch data jobs. This cluster covers dependency management, retries, backfills, sensors, and operational control of pipeline DAGs.",
        "slug": "data-pipeline-orchestration",
        "source": "db"
      },
      "input_skill": "Apache Airflow",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "ETL and ELT Tooling",
        "id": 24,
        "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
        "slug": "etl-and-elt-tooling",
        "source": "db"
      },
      "input_skill": "PySpark",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Container Orchestration Platforms",
        "id": 134,
        "rationale": "Platforms that schedule and manage containerized workloads across clusters and environments. Cloud Architects need these to define workload placement standards, cluster boundaries, and platform capabilities.",
        "slug": "container-orchestration-platforms",
        "source": "db"
      },
      "input_skill": "Kubernetes",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Cloud Architect",
          "id": 9,
          "rationale": null,
          "role_archetype": null,
          "slug": "cloud-architect",
          "source": "db"
        },
        {
          "display_name": "DevOps Engineer",
          "id": 10,
          "rationale": null,
          "role_archetype": null,
          "slug": "devops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "Kubernetes for ML Workloads",
        "id": 47,
        "rationale": "Kubernetes-native components used to schedule, accelerate, and isolate ML training and serving workloads. This includes GPU enablement and ML-specific controllers rather than generic cluster administration.",
        "slug": "kubernetes-for-ml-workloads",
        "source": "db"
      },
      "input_skill": "Kubernetes",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "CI/CD Pipeline Platforms",
        "id": 150,
        "rationale": "Systems used to define, run, and maintain automated build and deployment workflows. This cluster is coherent because the role owns delivery automation end to end, including pipeline reliability and promotion logic.",
        "slug": "ci-cd-pipeline-platforms",
        "source": "db"
      },
      "input_skill": "CI/CD",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "DevOps Engineer",
          "id": 10,
          "rationale": null,
          "role_archetype": null,
          "slug": "devops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "CI/CD for Machine Learning",
        "id": 56,
        "rationale": "Tools and platforms for automating ML model integration, testing, and deployment pipelines.",
        "slug": "ci-cd-for-machine-learning",
        "source": "db"
      },
      "input_skill": "CI/CD",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "ML Frameworks and Libraries",
        "id": 40,
        "rationale": "Core libraries used to define models, train them, run inference, and evaluate predictive performance. These frameworks shape how ML engineers express model architectures and training loops.",
        "slug": "ml-frameworks-and-libraries",
        "source": "db"
      },
      "input_skill": "TensorFlow",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "MLOps Platforms and Lifecycle",
        "id": 43,
        "rationale": "End-to-end managed platforms used to train, deploy, register, and govern models across their lifecycle. This is the operational control plane for production ML workflows.",
        "slug": "mlops-platforms-and-lifecycle",
        "source": "db"
      },
      "input_skill": "Kubeflow Pipelines",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "AI Governance and Model Security",
        "id": 50,
        "rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
        "slug": "ai-governance-and-model-security",
        "source": "db"
      },
      "input_skill": "Machine Learning",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "AI Engineer",
          "id": 13,
          "rationale": null,
          "role_archetype": null,
          "slug": "ai-engineer",
          "source": "db"
        },
        {
          "display_name": "ML Engineer",
          "id": 3,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-engineer",
          "source": "db"
        },
        {
          "display_name": "MLOps Engineer",
          "id": 16,
          "rationale": null,
          "role_archetype": null,
          "slug": "ml-ops-engineer",
          "source": "db"
        }
      ]
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "Machine Learning",
      "llm_role": null,
      "roles_from_db": []
    }
  ],
  "input_final_skills": [
    "Python",
    "SQL",
    "Google Cloud Platform",
    "Apache Airflow",
    "PySpark",
    "ETL",
    "Kubernetes",
    "CI/CD",
    "TensorFlow",
    "Kubeflow Pipelines",
    "TFX",
    "Machine Learning",
    "OOP",
    "Databases"
  ],
  "input_llm_skills": [
    "Python",
    "SQL",
    "Google Cloud Platform",
    "Apache Airflow",
    "PySpark",
    "ETL",
    "Kubernetes",
    "CI/CD",
    "TensorFlow",
    "Kubeflow Pipelines",
    "TFX",
    "Machine Learning",
    "OOP",
    "Databases"
  ],
  "new_aliases_persisted": 0,
  "run_id": "f44ceab1-c4b3-4c44-93df-420af9b73fce",
  "skills_detail": [
    {
      "aliases_in_db": [
        {
          "alias_text": "Python",
          "alias_type": "CANONICAL",
          "id": 67,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 2",
          "alias_type": "VERSION",
          "id": 72,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 2.x",
          "alias_type": "VERSION",
          "id": 74,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3",
          "alias_type": "VERSION",
          "id": 73,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.10",
          "alias_type": "VERSION",
          "id": 76,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.11",
          "alias_type": "VERSION",
          "id": 77,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.12",
          "alias_type": "VERSION",
          "id": 78,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Python 3.x",
          "alias_type": "VERSION",
          "id": 75,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "py",
          "alias_type": "VERSION",
          "id": 2183,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "py2",
          "alias_type": "VERSION",
          "id": 68,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "py3",
          "alias_type": "VERSION",
          "id": 69,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python 3",
          "alias_type": "VERSION",
          "id": 2186,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python 3.x",
          "alias_type": "VERSION",
          "id": 2849,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python2",
          "alias_type": "VERSION",
          "id": 70,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python3",
          "alias_type": "VERSION",
          "id": 71,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "python3.x",
          "alias_type": "VERSION",
          "id": 2848,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 6,
        "display_name": "Python",
        "id": 5,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "python",
        "sub_category_id": 96,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Security Scripting & DSL Languages",
            "id": 248,
            "rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
            "slug": "cloud-security-scripting-dsl-languages",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Security Engineer",
              "id": 23,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-security-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages",
            "id": 1,
            "rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
            "slug": "programming-languages",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Backend Developer",
              "id": 1,
              "rationale": null,
              "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
              "slug": "backend-engineer",
              "source": "db"
            },
            {
              "display_name": "Fullstack Developer",
              "id": 435,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "fullstack-developer",
              "source": "db"
            },
            {
              "display_name": "Fullstack Developer",
              "id": 15,
              "rationale": null,
              "role_archetype": null,
              "slug": "full-stack-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages and Scripting",
            "id": 59,
            "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
            "slug": "programming-languages-and-scripting",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cyber Security Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for Data Work",
            "id": 21,
            "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
            "slug": "programming-languages-for-data-work",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for ML Systems",
            "id": 39,
            "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
            "slug": "programming-languages-for-ml-systems",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for XR",
            "id": 97,
            "rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
            "slug": "programming-languages-for-xr",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "AR/VR Engineer",
              "id": 8,
              "rationale": null,
              "role_archetype": null,
              "slug": "ar-vr-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Python Programming",
            "id": 290,
            "rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
            "slug": "python-programming",
            "source": "db"
          },
          "input_skill": "Python",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Python Backend Developer",
              "id": 80,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "python-backend-developer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Python",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "SQL",
          "alias_type": "CANONICAL",
          "id": 271,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 6,
        "display_name": "SQL",
        "id": 101,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LANGUAGE",
        "slug": "sql",
        "sub_category_id": 97,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Pega Programming Languages & DSLs",
            "id": 267,
            "rationale": "Programming languages and domain-specific languages used in Pega development.",
            "slug": "pega-programming-languages-dsls",
            "source": "db"
          },
          "input_skill": "SQL",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Pega Developer",
              "id": 24,
              "rationale": null,
              "role_archetype": null,
              "slug": "pega-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Programming Languages for Data Work",
            "id": 21,
            "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
            "slug": "programming-languages-for-data-work",
            "source": "db"
          },
          "input_skill": "SQL",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "SQL",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Google Cloud Platform",
          "alias_type": "CANONICAL",
          "id": 741,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "Google Cloud Platform",
        "id": 425,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "google-cloud-platform",
        "sub_category_id": 46,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud & Hosting Providers",
            "id": 414,
            "rationale": "Knowledge of major cloud and hosting vendor platforms for deploying and managing PHP applications.",
            "slug": "cloud-hosting-providers",
            "source": "db"
          },
          "input_skill": "Google Cloud Platform",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "PHP Backend Developer",
              "id": 86,
              "rationale": null,
              "role_archetype": "Engineering",
              "slug": "php-backend-developer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Provider Platforms",
            "id": 131,
            "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
            "slug": "cloud-provider-platforms",
            "source": "db"
          },
          "input_skill": "Google Cloud Platform",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            },
            {
              "display_name": "Cloud Security Engineer",
              "id": 23,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-security-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Cloud Security Posture Tools",
            "id": 64,
            "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
            "slug": "cloud-security-posture-tools",
            "source": "db"
          },
          "input_skill": "Google Cloud Platform",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Security Engineer",
              "id": 23,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-security-engineer",
              "source": "db"
            },
            {
              "display_name": "Cyber Security Engineer",
              "id": 5,
              "rationale": null,
              "role_archetype": null,
              "slug": "cybersecurity-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Google Cloud Platform",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Apache Airflow",
          "alias_type": "CANONICAL",
          "id": 304,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 13,
        "display_name": "Apache Airflow",
        "id": 110,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "TOOL",
        "slug": "apache-airflow",
        "sub_category_id": 130,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Data Pipeline Orchestration",
            "id": 23,
            "rationale": "Workflow engines that schedule, coordinate, and recover batch data jobs. This cluster covers dependency management, retries, backfills, sensors, and operational control of pipeline DAGs.",
            "slug": "data-pipeline-orchestration",
            "source": "db"
          },
          "input_skill": "Apache Airflow",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Apache Airflow",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Apache Spark",
          "alias_type": "CANONICAL",
          "id": 2004,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "apache spark 3",
          "alias_type": "VERSION",
          "id": 2006,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark",
          "alias_type": "VERSION",
          "id": 2510,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3",
          "alias_type": "VERSION",
          "id": 2007,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3.x",
          "alias_type": "VERSION",
          "id": 2009,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark3",
          "alias_type": "VERSION",
          "id": 2008,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "ETL and ELT Tooling",
            "id": 24,
            "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
            "slug": "etl-and-elt-tooling",
            "source": "db"
          },
          "input_skill": "PySpark",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "PySpark",
      "matched_via": "embedding_alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "ETL",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Data Engineering Tools",
          "skill_nature": "PRACTICE",
          "sub_category": "general",
          "typical_lifespan": "MULTI_YEAR",
          "version_strategy": "UNVERSIONED",
          "volatility": "MEDIUM"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "etl",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Kubernetes",
          "alias_type": "CANONICAL",
          "id": 1267,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Kubernetes 1.0+",
          "alias_type": "VERSION",
          "id": 1271,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Kubernetes 1.x",
          "alias_type": "VERSION",
          "id": 1270,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Kubernetes v1",
          "alias_type": "VERSION",
          "id": 1269,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "k8s",
          "alias_type": "VERSION",
          "id": 1268,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "kubernetes 1.x",
          "alias_type": "VERSION",
          "id": 1400,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "kubernetes latest",
          "alias_type": "VERSION",
          "id": 1401,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "Kubernetes",
        "id": 726,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "kubernetes",
        "sub_category_id": 557,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Container Orchestration Platforms",
            "id": 134,
            "rationale": "Platforms that schedule and manage containerized workloads across clusters and environments. Cloud Architects need these to define workload placement standards, cluster boundaries, and platform capabilities.",
            "slug": "container-orchestration-platforms",
            "source": "db"
          },
          "input_skill": "Kubernetes",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Cloud Architect",
              "id": 9,
              "rationale": null,
              "role_archetype": null,
              "slug": "cloud-architect",
              "source": "db"
            },
            {
              "display_name": "DevOps Engineer",
              "id": 10,
              "rationale": null,
              "role_archetype": null,
              "slug": "devops-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "Kubernetes for ML Workloads",
            "id": 47,
            "rationale": "Kubernetes-native components used to schedule, accelerate, and isolate ML training and serving workloads. This includes GPU enablement and ML-specific controllers rather than generic cluster administration.",
            "slug": "kubernetes-for-ml-workloads",
            "source": "db"
          },
          "input_skill": "Kubernetes",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Kubernetes",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "CI/CD",
          "alias_type": "CANONICAL",
          "id": 1826,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 8,
        "display_name": "CI/CD",
        "id": 1190,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "METHODOLOGY",
        "slug": "ci-cd",
        "sub_category_id": 900,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "CI/CD Pipeline Platforms",
            "id": 150,
            "rationale": "Systems used to define, run, and maintain automated build and deployment workflows. This cluster is coherent because the role owns delivery automation end to end, including pipeline reliability and promotion logic.",
            "slug": "ci-cd-pipeline-platforms",
            "source": "db"
          },
          "input_skill": "CI/CD",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "DevOps Engineer",
              "id": 10,
              "rationale": null,
              "role_archetype": null,
              "slug": "devops-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "CI/CD for Machine Learning",
            "id": 56,
            "rationale": "Tools and platforms for automating ML model integration, testing, and deployment pipelines.",
            "slug": "ci-cd-for-machine-learning",
            "source": "db"
          },
          "input_skill": "CI/CD",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "CI/CD",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "TensorFlow",
          "alias_type": "CANONICAL",
          "id": 442,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "TF1",
          "alias_type": "VERSION",
          "id": 443,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "TF2",
          "alias_type": "VERSION",
          "id": 444,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "TensorFlow 1",
          "alias_type": "VERSION",
          "id": 445,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "TensorFlow 1.x",
          "alias_type": "VERSION",
          "id": 447,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "TensorFlow 2",
          "alias_type": "VERSION",
          "id": 446,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "TensorFlow 2.x",
          "alias_type": "VERSION",
          "id": 448,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tensorflow 1",
          "alias_type": "VERSION",
          "id": 2490,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tensorflow 1.x",
          "alias_type": "VERSION",
          "id": 2494,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tensorflow 2",
          "alias_type": "VERSION",
          "id": 2491,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tensorflow 2.x",
          "alias_type": "VERSION",
          "id": 2495,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tensorflow v1",
          "alias_type": "VERSION",
          "id": 2492,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tensorflow v2",
          "alias_type": "VERSION",
          "id": 2493,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tf",
          "alias_type": "VERSION",
          "id": 2487,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tf1",
          "alias_type": "VERSION",
          "id": 2488,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "tf2",
          "alias_type": "VERSION",
          "id": 2489,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 7,
        "display_name": "TensorFlow",
        "id": 196,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "LIBRARY",
        "slug": "tensorflow",
        "sub_category_id": 156,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "ML Frameworks and Libraries",
            "id": 40,
            "rationale": "Core libraries used to define models, train them, run inference, and evaluate predictive performance. These frameworks shape how ML engineers express model architectures and training loops.",
            "slug": "ml-frameworks-and-libraries",
            "source": "db"
          },
          "input_skill": "TensorFlow",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "TensorFlow",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Kubeflow",
          "alias_type": "CANONICAL",
          "id": 465,
          "is_primary": true,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Kubeflow 1.x",
          "alias_type": "VERSION",
          "id": 468,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Kubeflow 2.x",
          "alias_type": "VERSION",
          "id": 469,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Kubeflow v1",
          "alias_type": "VERSION",
          "id": 466,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "Kubeflow v2",
          "alias_type": "VERSION",
          "id": 467,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 5,
        "display_name": "Kubeflow",
        "id": 213,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "kubeflow",
        "sub_category_id": 1127,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "MLOps Platforms and Lifecycle",
            "id": 43,
            "rationale": "End-to-end managed platforms used to train, deploy, register, and govern models across their lifecycle. This is the operational control plane for production ML workflows.",
            "slug": "mlops-platforms-and-lifecycle",
            "source": "db"
          },
          "input_skill": "Kubeflow Pipelines",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Kubeflow Pipelines",
      "matched_via": "embedding_alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "TFX",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Machine Learning Frameworks",
          "skill_nature": "TOOL",
          "sub_category": "general",
          "typical_lifespan": "SHORT_LIVED",
          "version_strategy": "VERSIONED",
          "volatility": "FAST"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "tfx",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Machine Learning",
          "alias_type": "CANONICAL",
          "id": 2015,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 2,
        "display_name": "Machine Learning",
        "id": 1356,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "CONCEPT",
        "slug": "machine-learning",
        "sub_category_id": 1024,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "AI Governance and Model Security",
            "id": 50,
            "rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
            "slug": "ai-governance-and-model-security",
            "source": "db"
          },
          "input_skill": "Machine Learning",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "AI Engineer",
              "id": 13,
              "rationale": null,
              "role_archetype": null,
              "slug": "ai-engineer",
              "source": "db"
            },
            {
              "display_name": "ML Engineer",
              "id": 3,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-engineer",
              "source": "db"
            },
            {
              "display_name": "MLOps Engineer",
              "id": 16,
              "rationale": null,
              "role_archetype": null,
              "slug": "ml-ops-engineer",
              "source": "db"
            }
          ]
        },
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "Machine Learning",
          "llm_role": null,
          "roles_from_db": []
        }
      ],
      "input_skill": "Machine Learning",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "OOP",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Concepts",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "UNVERSIONED",
          "volatility": "STABLE"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "oop",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [],
      "canonical": null,
      "dimensions": [],
      "input_skill": "Databases",
      "matched_via": null,
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": {
        "derived": {
          "category": "Databases",
          "skill_nature": "CONCEPT",
          "sub_category": "general",
          "typical_lifespan": "EVERGREEN",
          "version_strategy": "UNVERSIONED",
          "volatility": "STABLE"
        },
        "enrichment": null,
        "keep_log": [],
        "locked_dimensions": [],
        "merge_log": [],
        "placed": null,
        "relationships": null,
        "skill_id": "databases",
        "split_log": [],
        "typed": null,
        "warnings": []
      },
      "source_tag": "llm",
      "was_in_llm_skills": true
    }
  ],
  "unmatched_skills": [
    "ETL",
    "TFX",
    "OOP",
    "Databases"
  ]
}
API 3 — final-role-output
{
  "chosen_role": {
    "display_name": "MLOps Engineer",
    "id": 16,
    "rationale": "Domain=AI / ML; The JD is centered on building and operating ML pipelines and infrastructure with Airflow, Kubernetes, CI/CD, and cloud deployment, which best matches MLOps engineering.",
    "role_archetype": null,
    "slug": "ml-ops-engineer",
    "source": "db"
  },
  "chosen_role_resolution": "in_db",
  "final_input_skills": [
    {
      "skill": "Python",
      "tag": "in_db"
    },
    {
      "skill": "SQL",
      "tag": "in_db"
    },
    {
      "skill": "Google Cloud Platform",
      "tag": "in_db"
    },
    {
      "skill": "Apache Airflow",
      "tag": "in_db"
    },
    {
      "skill": "PySpark",
      "tag": "in_db"
    },
    {
      "skill": "ETL",
      "tag": "new"
    },
    {
      "skill": "Kubernetes",
      "tag": "in_db"
    },
    {
      "skill": "CI/CD",
      "tag": "in_db"
    },
    {
      "skill": "TensorFlow",
      "tag": "in_db"
    },
    {
      "skill": "Kubeflow Pipelines",
      "tag": "in_db"
    },
    {
      "skill": "TFX",
      "tag": "new"
    },
    {
      "skill": "Machine Learning",
      "tag": "in_db"
    },
    {
      "skill": "OOP",
      "tag": "new"
    },
    {
      "skill": "Databases",
      "tag": "new"
    }
  ],
  "llm_cost_api1_usd": null,
  "llm_cost_api2_usd": null,
  "llm_cost_api3_usd": null,
  "llm_cost_total_usd": null,
  "persistence": {
    "items": [
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Security Scripting \u0026 DSL Languages",
          "id": 248,
          "rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
          "slug": "cloud-security-scripting-dsl-languages",
          "source": "db"
        },
        "dimension_id": 248,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Security Engineer",
            "id": 23,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-security-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages",
          "id": 1,
          "rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
          "slug": "programming-languages",
          "source": "db"
        },
        "dimension_id": 1,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Backend Developer",
            "id": 1,
            "rationale": null,
            "role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
            "slug": "backend-engineer",
            "source": "db"
          },
          {
            "display_name": "Fullstack Developer",
            "id": 435,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "fullstack-developer",
            "source": "db"
          },
          {
            "display_name": "Fullstack Developer",
            "id": 15,
            "rationale": null,
            "role_archetype": null,
            "slug": "full-stack-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages and Scripting",
          "id": 59,
          "rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
          "slug": "programming-languages-and-scripting",
          "source": "db"
        },
        "dimension_id": 59,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cyber Security Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for Data Work",
          "id": 21,
          "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
          "slug": "programming-languages-for-data-work",
          "source": "db"
        },
        "dimension_id": 21,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for ML Systems",
          "id": 39,
          "rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
          "slug": "programming-languages-for-ml-systems",
          "source": "db"
        },
        "dimension_id": 39,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for XR",
          "id": 97,
          "rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
          "slug": "programming-languages-for-xr",
          "source": "db"
        },
        "dimension_id": 97,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "AR/VR Engineer",
            "id": 8,
            "rationale": null,
            "role_archetype": null,
            "slug": "ar-vr-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Python Programming",
          "id": 290,
          "rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
          "slug": "python-programming",
          "source": "db"
        },
        "dimension_id": 290,
        "input_skill": "Python",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Python Backend Developer",
            "id": 80,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "python-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 5,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Pega Programming Languages \u0026 DSLs",
          "id": 267,
          "rationale": "Programming languages and domain-specific languages used in Pega development.",
          "slug": "pega-programming-languages-dsls",
          "source": "db"
        },
        "dimension_id": 267,
        "input_skill": "SQL",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Pega Developer",
            "id": 24,
            "rationale": null,
            "role_archetype": null,
            "slug": "pega-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 101,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Programming Languages for Data Work",
          "id": 21,
          "rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
          "slug": "programming-languages-for-data-work",
          "source": "db"
        },
        "dimension_id": 21,
        "input_skill": "SQL",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 101,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud \u0026 Hosting Providers",
          "id": 414,
          "rationale": "Knowledge of major cloud and hosting vendor platforms for deploying and managing PHP applications.",
          "slug": "cloud-hosting-providers",
          "source": "db"
        },
        "dimension_id": 414,
        "input_skill": "Google Cloud Platform",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "PHP Backend Developer",
            "id": 86,
            "rationale": null,
            "role_archetype": "Engineering",
            "slug": "php-backend-developer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 425,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Provider Platforms",
          "id": 131,
          "rationale": "Major cloud platforms and their core service ecosystems used to design target-state architectures, choose deployment boundaries, and evaluate managed capabilities. This is the primary substrate for cloud architecture decisions.",
          "slug": "cloud-provider-platforms",
          "source": "db"
        },
        "dimension_id": 131,
        "input_skill": "Google Cloud Platform",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          },
          {
            "display_name": "Cloud Security Engineer",
            "id": 23,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-security-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 425,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Cloud Security Posture Tools",
          "id": 64,
          "rationale": "Cloud-native security platforms used to assess misconfiguration, workload exposure, and cloud control coverage. This dimension includes the major CNAPP/CSPM/CWPP vendors and cloud security services the role reviews and tunes.",
          "slug": "cloud-security-posture-tools",
          "source": "db"
        },
        "dimension_id": 64,
        "input_skill": "Google Cloud Platform",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Security Engineer",
            "id": 23,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-security-engineer",
            "source": "db"
          },
          {
            "display_name": "Cyber Security Engineer",
            "id": 5,
            "rationale": null,
            "role_archetype": null,
            "slug": "cybersecurity-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 425,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Data Pipeline Orchestration",
          "id": 23,
          "rationale": "Workflow engines that schedule, coordinate, and recover batch data jobs. This cluster covers dependency management, retries, backfills, sensors, and operational control of pipeline DAGs.",
          "slug": "data-pipeline-orchestration",
          "source": "db"
        },
        "dimension_id": 23,
        "input_skill": "Apache Airflow",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 110,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "ETL and ELT Tooling",
          "id": 24,
          "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
          "slug": "etl-and-elt-tooling",
          "source": "db"
        },
        "dimension_id": 24,
        "input_skill": "PySpark",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": false,
        "skill_id": null,
        "skill_tag": "new",
        "skipped_reason": "skill_not_in_db_v3_proposed"
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Container Orchestration Platforms",
          "id": 134,
          "rationale": "Platforms that schedule and manage containerized workloads across clusters and environments. Cloud Architects need these to define workload placement standards, cluster boundaries, and platform capabilities.",
          "slug": "container-orchestration-platforms",
          "source": "db"
        },
        "dimension_id": 134,
        "input_skill": "Kubernetes",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "Cloud Architect",
            "id": 9,
            "rationale": null,
            "role_archetype": null,
            "slug": "cloud-architect",
            "source": "db"
          },
          {
            "display_name": "DevOps Engineer",
            "id": 10,
            "rationale": null,
            "role_archetype": null,
            "slug": "devops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 726,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "Kubernetes for ML Workloads",
          "id": 47,
          "rationale": "Kubernetes-native components used to schedule, accelerate, and isolate ML training and serving workloads. This includes GPU enablement and ML-specific controllers rather than generic cluster administration.",
          "slug": "kubernetes-for-ml-workloads",
          "source": "db"
        },
        "dimension_id": 47,
        "input_skill": "Kubernetes",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 726,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "CI/CD Pipeline Platforms",
          "id": 150,
          "rationale": "Systems used to define, run, and maintain automated build and deployment workflows. This cluster is coherent because the role owns delivery automation end to end, including pipeline reliability and promotion logic.",
          "slug": "ci-cd-pipeline-platforms",
          "source": "db"
        },
        "dimension_id": 150,
        "input_skill": "CI/CD",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "DevOps Engineer",
            "id": 10,
            "rationale": null,
            "role_archetype": null,
            "slug": "devops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1190,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "CI/CD for Machine Learning",
          "id": 56,
          "rationale": "Tools and platforms for automating ML model integration, testing, and deployment pipelines.",
          "slug": "ci-cd-for-machine-learning",
          "source": "db"
        },
        "dimension_id": 56,
        "input_skill": "CI/CD",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1190,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "ML Frameworks and Libraries",
          "id": 40,
          "rationale": "Core libraries used to define models, train them, run inference, and evaluate predictive performance. These frameworks shape how ML engineers express model architectures and training loops.",
          "slug": "ml-frameworks-and-libraries",
          "source": "db"
        },
        "dimension_id": 40,
        "input_skill": "TensorFlow",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 196,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "MLOps Platforms and Lifecycle",
          "id": 43,
          "rationale": "End-to-end managed platforms used to train, deploy, register, and govern models across their lifecycle. This is the operational control plane for production ML workflows.",
          "slug": "mlops-platforms-and-lifecycle",
          "source": "db"
        },
        "dimension_id": 43,
        "input_skill": "Kubeflow Pipelines",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
        "role_dimension_saved": false,
        "roles_from_db": [
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": false,
        "skill_id": null,
        "skill_tag": "new",
        "skipped_reason": "skill_not_in_db_v3_proposed"
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "AI Governance and Model Security",
          "id": 50,
          "rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
          "slug": "ai-governance-and-model-security",
          "source": "db"
        },
        "dimension_id": 50,
        "input_skill": "Machine Learning",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "AI Engineer",
            "id": 13,
            "rationale": null,
            "role_archetype": null,
            "slug": "ai-engineer",
            "source": "db"
          },
          {
            "display_name": "ML Engineer",
            "id": 3,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-engineer",
            "source": "db"
          },
          {
            "display_name": "MLOps Engineer",
            "id": 16,
            "rationale": null,
            "role_archetype": null,
            "slug": "ml-ops-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1356,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 16,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "Machine Learning",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1356,
        "skill_tag": "in_db",
        "skipped_reason": null
      }
    ],
    "new_skills_created": 0,
    "role_dimension_saved": 0,
    "skill_dimension_saved": 0,
    "skipped": 2
  },
  "planner_output": null,
  "run_id": "f44ceab1-c4b3-4c44-93df-420af9b73fce"
}

