Pipeline run
aa174cd1-62e6-4b1e-8972-4d763e3333fc
Client output enrichment
v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description · vocab breakdown (legacy)
Signals
Post-classification
Captured for admin review
1 POST /skills/extract-from-jd
2 POST /skills/extract-details
3 POST /skills/final-role-output
Data Governance Engineer
domain · Data Engineering & Analytics · CASE DOMAIN · slug: data-governance-engineer · id: 146 · source: db
Domain=Data Engineering & Analytics; The JD centers on Azure Purview-based data governance alongside data extraction, ETL orchestration, and Azure data engineering tasks, which best matches Data Governance Engineer.
Matched skills
Matched dimensions
Matched KRAs
Resolution: in_db (role exists in library; skill↔dim and role↔dim links saved when applicable).
Job description
Should have total 5-8 yrs experience. Has working experience in Data Governance and Data Engineering. At least 1 year Working experience in Azure Purview or similar data governance tool along with good working knowledge on Azure Perview Worked experience required services (Azure Data Factory, Azure DataBricks, Azure Logic Apps, Azure Log AnalyticsWorkspace) . Working Experience Data extraction using Azure Data Factory and Azure DataBricks and storing the data in DataLake Gen2 Strong understanding of complex ETL logic in Azure DataBricks using Pyspark and SparkSQL. Orchestrating and Scheduling the pipeline using Azure Data Factory. Experience with PowerShell Strong communication skills and working experience with Microsoft Azure Purview is mandatory.
Skills from this JD
Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.
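The merge described above can be sketched as follows. This is a minimal illustration, not the page's actual implementation: the `skill_name`, `is_primary`, `input_term`, and `matched_via` keys appear in the API payloads of this run, while the `skill` and `tag` keys for API 3 rows and the function name `merge_skill_rows` are assumptions made for the example.

```python
# Trimmed-down stand-ins for the three API payloads of this run.
api1_skills = [
    {"skill_name": "PySpark", "is_primary": True},
    {"skill_name": "Spark SQL", "is_primary": True},
    {"skill_name": "PowerShell", "is_primary": True},
]
api2_alias_matches = [
    {"input_term": "PySpark", "matched_via": "embedding_alias"},
    {"input_term": "PowerShell", "matched_via": "alias"},
]
# Hypothetical row shape for API 3 persistence items (one row per skill x dimension).
api3_rows = [
    {"skill": "PySpark", "tag": "new"},
    {"skill": "PowerShell", "tag": "in_db"},
    {"skill": "PowerShell", "tag": "in_db"},
]


def merge_skill_rows(api1_skills, api2_alias_matches, api3_rows):
    """Return one merged record per extracted skill, keyed by skill name."""
    merged = {
        s["skill_name"]: {"is_primary": s["is_primary"], "matched_via": None, "tags": []}
        for s in api1_skills
    }
    # Attach the library match method, when API 2 found one.
    for match in api2_alias_matches:
        if match["input_term"] in merged:
            merged[match["input_term"]]["matched_via"] = match["matched_via"]
    # Collect distinct persistence tags per skill.
    for row in api3_rows:
        record = merged.get(row["skill"])
        if record is not None and row["tag"] not in record["tags"]:
            record["tags"].append(row["tag"])
    return merged


merged = merge_skill_rows(api1_skills, api2_alias_matches, api3_rows)
```

Skills with no library match (Spark SQL here) keep `matched_via` as `None` and an empty tag list, which mirrors the proposed-only skills in this run.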
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Data Engineering Tools
- Sub-category: general
- Skill nature: CONCEPT
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Data Engineering Tools
- Sub-category: general
- Skill nature: CONCEPT
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Cloud Platforms
- Sub-category: Data Governance
- Skill nature: PLATFORM
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Cloud Platforms
- Sub-category: Data Integration
- Skill nature: PLATFORM
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Cloud Platforms
- Sub-category: Data Science
- Skill nature: PLATFORM
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Cloud Platforms
- Sub-category: Integration Services
- Skill nature: PLATFORM
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Cloud Platforms
- Sub-category: Monitoring Tools
- Skill nature: PLATFORM
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Cloud Platforms
- Sub-category: Data Storage
- Skill nature: PLATFORM
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Aliases — catalog
- Apache Spark (CANONICAL)
- apache spark 3 (VERSION)
- spark (VERSION)
- spark 3 (VERSION)
- spark 3.x (VERSION)
- spark3 (VERSION)
Context tags (catalog)
Stored enrichment (catalog DB)
- Category: Framework
- Sub-category: Distributed Data Processing Framework
- Vendor: Apache Software Foundation
- License: apache_2
- Year introduced: 2010
- Confidence: 0.94
- Version strategy: SEPARATE_ENTITY
- Version tag: 3.x
Maturity reasoning: Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it.
Skill profile (library / DB)
- Skill nature: FRAMEWORK
- Volatility: STABLE
- Typical lifespan: EVERGREEN
- Category id: 5
- Sub-category id: 1021
- Extractable: True
- Also category: False
Dimensions (API 2 worklist)
- ETL and ELT Tooling · Catalog dimension db id 24 · Library dimension (catalog) · Roles linked in library: Data Engineer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
| ETL and ELT Tooling (etl-and-elt-tooling) | — | — | Skipped — no persistable v3 meta for new skill (skill_not_in_db_v3_proposed) |
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Programming Languages
- Sub-category: general
- Skill nature: LANGUAGE
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Aliases — catalog
- PowerShell (CANONICAL) primary
- PowerShell 5 (VERSION)
- PowerShell 5.1 (VERSION)
- PowerShell 6 (VERSION)
- PowerShell 7 (VERSION)
- PowerShell 7.x (VERSION)
- PowerShell Core (VERSION)
- Windows PowerShell (VERSION)
- powershell 7 (VERSION)
- powershell 7.x (VERSION)
- powershell core (VERSION)
- ps 7 (VERSION)
- pwsh (VERSION)
Context tags (catalog)
Stored enrichment (catalog DB)
- Category: Language
- Sub-category: Scripting Language
- Vendor: Microsoft
- License: mit
- Year introduced: 2006
- Confidence: 0.98
- Version strategy: SEPARATE_ENTITY
- Version tag: 7
Maturity reasoning: Common in Windows/admin and DevOps job descriptions; Microsoft continues active development and it remains a standard automation language alongside Bash in enterprise tooling.
Skill profile (library / DB)
- Skill nature: LANGUAGE
- Volatility: STABLE
- Typical lifespan: EVERGREEN
- Category id: 6
- Sub-category id: 38
- Extractable: True
- Also category: False
Dimensions (API 2 worklist)
- Programming Languages and Scripting · Catalog dimension db id 59 · Library dimension (catalog) · Roles linked in library: Cyber Security Engineer
- Programming Languages for ML Systems · Catalog dimension db id 39 · Library dimension (catalog) · Roles linked in library: ML Engineer, MLOps Engineer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
| Programming Languages and Scripting (programming-languages-and-scripting) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
| Programming Languages for ML Systems (programming-languages-for-ml-systems) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
Aliases — catalog
- Microsoft Azure (CANONICAL) primary
Context tags (catalog)
Stored enrichment (catalog DB)
- Category: Platform
- Sub-category: Cloud Platform
- Vendor: Microsoft
- License: other_open
- Year introduced: 2010
- Confidence: 0.99
- Version strategy: NOT_APPLICABLE
Maturity reasoning: Azure appears in large volumes of cloud/DevOps job descriptions and is a core hyperscaler alongside AWS/GCP; Microsoft’s continued product investment and broad enterprise adoption signal mainstream demand.
Skill profile (library / DB)
- Skill nature: PLATFORM
- Volatility: STABLE
- Typical lifespan: EVERGREEN
- Category id: 9
- Sub-category id: 46
- Extractable: True
- Also category: False
Dimensions (API 2 worklist)
- Cloud & Hosting Providers · Catalog dimension db id 414 · Library dimension (catalog) · Roles linked in library: PHP Backend Developer
- Cloud Platforms · Catalog dimension db id 20 · Library dimension (catalog) · Roles linked in library: .NET Backend Developer, Backend Developer, Cyber Security Engineer, Data Engineer, DevOps Engineer, Fullstack Developer, Go Backend Developer, Java Backend Developer, Kotlin Backend Developer, ML Engineer, MLOps Engineer, Node.js Backend Developer, Python Backend Developer, Scala Backend Developer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
| Cloud & Hosting Providers (cloud-hosting-providers) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
| Cloud Platforms (cloud-platforms) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
All API 3 persistence rows
Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.
| Skill | Tag | Dimension | Skill↔dim | Role↔dim | Outcome | Notes |
|---|---|---|---|---|---|---|
| PySpark | new | ETL and ELT Tooling (etl-and-elt-tooling) | — | — | Skipped — no persistable v3 meta for new skill | skill_not_in_db_v3_proposed |
| PowerShell | in_db | Programming Languages and Scripting (programming-languages-and-scripting) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| PowerShell | in_db | Programming Languages for ML Systems (programming-languages-for-ml-systems) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| Microsoft Azure | in_db | Cloud & Hosting Providers (cloud-hosting-providers) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| Microsoft Azure | in_db | Cloud Platforms (cloud-platforms) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
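The "one row per (skill × dimension) work item" layout can be reproduced with a small expansion step. The sketch below is illustrative only: the skill names and dimension slugs are copied from this run, while the `worklist` structure and the function name `expand_work_items` are assumptions made for the example.

```python
# Dimensions proposed per skill in this run (slugs as shown in the grid).
worklist = {
    "PySpark": ["etl-and-elt-tooling"],
    "PowerShell": [
        "programming-languages-and-scripting",
        "programming-languages-for-ml-systems",
    ],
    "Microsoft Azure": ["cloud-hosting-providers", "cloud-platforms"],
}


def expand_work_items(worklist):
    """Flatten {skill: [dimension, ...]} into one (skill, dimension) pair per row."""
    return [(skill, dim) for skill, dims in worklist.items() for dim in dims]


work_items = expand_work_items(worklist)
for skill, dim in work_items:
    print(f"{skill} x {dim}")
```

For this run the expansion yields five work items, matching the five rows of the grid.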
Library artifacts (this run)
| Kind | Name | Detail | DB id |
|---|---|---|---|
| canonical_skill_proposed | Data Governance | type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Data Engineering | type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Azure Purview | type=Cloud Platforms subtype=Data Governance nature=PLATFORM lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Azure Data Factory | type=Cloud Platforms subtype=Data Integration nature=PLATFORM lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Azure Databricks | type=Cloud Platforms subtype=Data Science nature=PLATFORM lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Azure Logic Apps | type=Cloud Platforms subtype=Integration Services nature=PLATFORM lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Azure Log Analytics | type=Cloud Platforms subtype=Monitoring Tools nature=PLATFORM lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Azure Data Lake Storage Gen2 | type=Cloud Platforms subtype=Data Storage nature=PLATFORM lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Spark SQL | type=Programming Languages subtype=general nature=LANGUAGE lifespan=MULTI_YEAR | |
| dimension_skill_link_proposed | PySpark ↔ ETL and ELT Tooling | | |
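The `type=… subtype=… nature=… lifespan=…` strings in the artifact rows are awkward to split on whitespace because values such as "Data Engineering Tools" contain spaces. A possible way to parse them, assuming the four keys always appear in that spelling (the helper name `parse_artifact_detail` is hypothetical):

```python
import re

# Keys observed in this run's canonical_skill_proposed rows.
_KEYS = "type|subtype|nature|lifespan"
# Capture "key=value", where the value runs until the next "key=" or end of string.
_PAIR = re.compile(rf"\b({_KEYS})=(.*?)(?=\s+(?:{_KEYS})=|$)")


def parse_artifact_detail(detail):
    """Parse 'type=A B subtype=C nature=D lifespan=E' into a dict of stripped values."""
    return {key: value.strip() for key, value in _PAIR.findall(detail)}


parsed = parse_artifact_detail(
    "type=Cloud Platforms subtype=Data Governance nature=PLATFORM lifespan=MULTI_YEAR"
)
print(parsed)
```

The word boundary before the key group keeps `type=` from matching inside `subtype=`.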
nano JD Parser — gpt-4.1-nano
{
"JD_type": "fail",
"archetype_override_applied": true,
"archetype_override_matched_skills": [
"Microsoft Azure",
"Databricks",
"Azure",
"PowerShell"
],
"role_archetype": "Engineering"
}
API 1 — extract-from-jd
{
"final_skills": [
{
"is_primary": true,
"skill_name": "Data Governance"
},
{
"is_primary": true,
"skill_name": "Data Engineering"
},
{
"is_primary": true,
"skill_name": "Azure Purview"
},
{
"is_primary": true,
"skill_name": "Azure Data Factory"
},
{
"is_primary": true,
"skill_name": "Azure Databricks"
},
{
"is_primary": true,
"skill_name": "Azure Logic Apps"
},
{
"is_primary": true,
"skill_name": "Azure Log Analytics"
},
{
"is_primary": true,
"skill_name": "Azure Data Lake Storage Gen2"
},
{
"is_primary": true,
"skill_name": "PySpark"
},
{
"is_primary": true,
"skill_name": "Spark SQL"
},
{
"is_primary": true,
"skill_name": "PowerShell"
},
{
"is_primary": true,
"skill_name": "Microsoft Azure"
}
],
"jd_role": null,
"nano_parsed": {
"JD_type": "fail",
"archetype_override_applied": true,
"archetype_override_matched_skills": [
"Microsoft Azure",
"Databricks",
"Azure",
"PowerShell"
],
"role_archetype": "Engineering"
},
"rejected": false,
"rejection_reason": null,
"run_id": "aa174cd1-62e6-4b1e-8972-4d763e3333fc",
"stage3_signals": {
"alias_found": false,
"alias_match_roles": [],
"kra_match_roles": [
{
"display_name": "Data Engineer",
"kra_matches": [
{
"kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
"sentence": "Has working experience in Data Governance and Data Engineering.",
"similarity": 0.533
},
{
"kra_text": "Optimizes pipeline throughput, partitioning strategies, and query performance across cloud data warehouses like Snowflake, BigQuery, or Redshift.",
"sentence": "Orchestrating and Scheduling the pipeline using Azure Data Factory.",
"similarity": 0.4827
},
{
"kra_text": "Builds data ingestion pipelines to collect data from transactional databases, third-party APIs, event streams, and file sources into centralized data platforms.",
"sentence": "Working Experience Data extraction using Azure Data Factory and Azure DataBricks and storing the data in DataLake Gen2",
"similarity": 0.4407
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 0.4855,
"slug": "data-engineer",
"total_count": null
},
{
"display_name": "Cloud Architect",
"kra_matches": [
{
"kra_text": "Conducts architecture reviews, approves technical design documents, and guides engineering teams through cloud migration and modernization projects.",
"sentence": "Strong communication skills and working experience with Microsoft Azure Purview is mandatory.",
"similarity": 0.4524
},
{
"kra_text": "Conducts architecture reviews, approves technical design documents, and guides engineering teams through cloud migration and modernization projects.",
"sentence": "At least 1 year Working experience in Azure Purview or similar data governance tool along with good working knowledge on Azure Perview",
"similarity": 0.4065
},
{
"kra_text": "Defines cloud adoption roadmaps, lift-and-shift vs. refactor migration strategies, and landing zone architectures for workloads moving to AWS, Azure, or GCP.",
"sentence": "Orchestrating and Scheduling the pipeline using Azure Data Factory.",
"similarity": 0.4029
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 9,
"score": 0.4206,
"slug": "cloud-architect",
"total_count": null
},
{
"display_name": "Cloud Security Engineer",
"kra_matches": [
{
"kra_text": "Designs IAM role policies, service account permissions, resource-based policies, and least-privilege access controls for cloud workloads and pipelines.",
"sentence": "Orchestrating and Scheduling the pipeline using Azure Data Factory.",
"similarity": 0.4156
},
{
"kra_text": "Designs IAM role policies, service account permissions, resource-based policies, and least-privilege access controls for cloud workloads and pipelines.",
"sentence": "Worked experience required services (Azure Data Factory, Azure DataBricks, Azure Logic Apps, Azure Log AnalyticsWorkspace) .",
"similarity": 0.4125
},
{
"kra_text": "Designs and implements cloud security controls including KMS encryption, secrets management, and data-at-rest protection for AWS, Azure, or GCP workloads.",
"sentence": "At least 1 year Working experience in Azure Purview or similar data governance tool along with good working knowledge on Azure Perview",
"similarity": 0.3985
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 23,
"score": 0.4088,
"slug": "cloud-security-engineer",
"total_count": null
},
{
"display_name": "MLOps Engineer",
"kra_matches": [
{
"kra_text": "Automates ML platform operations including scheduled retraining triggers, pipeline orchestration, evaluation workflows, and alerting configuration.",
"sentence": "Orchestrating and Scheduling the pipeline using Azure Data Factory.",
"similarity": 0.5012
},
{
"kra_text": "Automates ML platform operations including scheduled retraining triggers, pipeline orchestration, evaluation workflows, and alerting configuration.",
"sentence": "Worked experience required services (Azure Data Factory, Azure DataBricks, Azure Logic Apps, Azure Log AnalyticsWorkspace) .",
"similarity": 0.3688
},
{
"kra_text": "Validates model performance benchmarks, data schema contracts, and system integration health before signing off on production release readiness.",
"sentence": "Has working experience in Data Governance and Data Engineering.",
"similarity": 0.3492
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 16,
"score": 0.4064,
"slug": "ml-ops-engineer",
"total_count": null
},
{
"display_name": "ML Engineer",
"kra_matches": [
{
"kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
"sentence": "Has working experience in Data Governance and Data Engineering.",
"similarity": 0.4133
},
{
"kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
"sentence": "Orchestrating and Scheduling the pipeline using Azure Data Factory.",
"similarity": 0.4126
},
{
"kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
"sentence": "Working Experience Data extraction using Azure Data Factory and Azure DataBricks and storing the data in DataLake Gen2",
"similarity": 0.3854
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 3,
"score": 0.4037,
"slug": "ml-engineer",
"total_count": null
}
],
"skill_match_roles": [
{
"display_name": "MLOps Engineer",
"kra_matches": null,
"matched_count": 2,
"matched_skills": [
"Microsoft Azure",
"PowerShell"
],
"role_id": 16,
"score": 0.1667,
"slug": "ml-ops-engineer",
"total_count": 12
},
{
"display_name": "ML Engineer",
"kra_matches": null,
"matched_count": 2,
"matched_skills": [
"Microsoft Azure",
"PowerShell"
],
"role_id": 3,
"score": 0.1667,
"slug": "ml-engineer",
"total_count": 12
},
{
"display_name": "Cyber Security Engineer",
"kra_matches": null,
"matched_count": 2,
"matched_skills": [
"Microsoft Azure",
"PowerShell"
],
"role_id": 5,
"score": 0.1667,
"slug": "cybersecurity-engineer",
"total_count": 12
},
{
"display_name": "DevOps Engineer",
"kra_matches": null,
"matched_count": 1,
"matched_skills": [
"Microsoft Azure"
],
"role_id": 10,
"score": 0.0833,
"slug": "devops-engineer",
"total_count": 12
},
{
"display_name": "Data Engineer",
"kra_matches": null,
"matched_count": 1,
"matched_skills": [
"Microsoft Azure"
],
"role_id": 2,
"score": 0.0833,
"slug": "data-engineer",
"total_count": 12
}
]
},
"stage4_decision": {
"alias_collision_detected": false,
"case": "DOMAIN",
"chosen_role": {
"display_name": "Data Governance Engineer",
"kra_matches": null,
"matched_count": null,
"matched_skills": null,
"role_id": 146,
"score": 0.91,
"slug": "data-governance-engineer",
"total_count": null
},
"confidence": 0.91,
"is_new_role": false,
"llm2_fired": false,
"llm2_reasoning": null,
"matched_dimensions": [
"Data governance tooling",
"Azure-based data engineering",
"ETL pipeline development",
"Pipeline orchestration and scheduling",
"Data extraction and lake storage",
"Spark-based transformation logic"
],
"matched_kras": [
"Working experience in Data Governance and Data Engineering",
"Working experience in Azure Purview or similar data governance tool",
"Data extraction using Azure Data Factory and Azure DataBricks",
"Storing the data in DataLake Gen2",
"Strong understanding of complex ETL logic in Azure DataBricks",
"Orchestrating and Scheduling the pipeline using Azure Data Factory",
"Working experience with Microsoft Azure Purview is mandatory"
],
"matched_skills": [
"Data Governance",
"Data Engineering",
"Azure Purview",
"Azure Data Factory",
"Azure DataBricks",
"Azure Logic Apps",
"Azure Log AnalyticsWorkspace",
"DataLake Gen2",
"Pyspark",
"SparkSQL",
"PowerShell",
"Microsoft Azure Purview"
],
"new_role_display_name": null,
"new_role_slug": null,
"queued": false,
"reasoning": "Domain=Data Engineering \u0026 Analytics; The JD centers on Azure Purview-based data governance alongside data extraction, ETL orchestration, and Azure data engineering tasks, which best matches Data Governance Engineer.",
"sub_role": null
},
"stage5_updates": {
"centroid_n_after": 10,
"centroid_updated": true,
"collision_log_id": null,
"new_kra_attached": null,
"new_skills_attached": [
{
"is_primary": true,
"queue_id": 23441,
"role_display_name": "Data Governance Engineer",
"role_slug": "data-governance-engineer",
"skill_name": "Data Governance",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 23442,
"role_display_name": "Data Governance Engineer",
"role_slug": "data-governance-engineer",
"skill_name": "Data Engineering",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 23443,
"role_display_name": "Data Governance Engineer",
"role_slug": "data-governance-engineer",
"skill_name": "Azure Purview",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 23444,
"role_display_name": "Data Governance Engineer",
"role_slug": "data-governance-engineer",
"skill_name": "Azure Data Factory",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 23446,
"role_display_name": "Data Governance Engineer",
"role_slug": "data-governance-engineer",
"skill_name": "Azure Databricks",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 23448,
"role_display_name": "Data Governance Engineer",
"role_slug": "data-governance-engineer",
"skill_name": "Azure Logic Apps",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 23450,
"role_display_name": "Data Governance Engineer",
"role_slug": "data-governance-engineer",
"skill_name": "Azure Log Analytics",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 23452,
"role_display_name": "Data Governance Engineer",
"role_slug": "data-governance-engineer",
"skill_name": "Azure Data Lake Storage Gen2",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 23454,
"role_display_name": "Data Governance Engineer",
"role_slug": "data-governance-engineer",
"skill_name": "PySpark",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 23456,
"role_display_name": "Data Governance Engineer",
"role_slug": "data-governance-engineer",
"skill_name": "Spark SQL",
"status": "pending"
}
],
"queue_entry_id": null,
"v3_pipeline_triggered": false,
"v3_role_slug": null,
"v3_run_id": null
}
}
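The role scores under `stage3_signals` in the payload above look reproducible from the other fields. The following sketch is an inference from the numbers in this run, not a documented formula: each `skill_match_roles` score equals `matched_count / total_count` rounded to four decimals, and each `kra_match_roles` score is close to the mean of its three listed similarities (small differences are presumably due to rounding of the displayed similarities). Both function names are hypothetical.

```python
def skill_overlap_score(matched_count, total_count):
    """Share of the JD's extracted skills that a role already has, rounded to 4 places."""
    return round(matched_count / total_count, 4)


def kra_role_score(similarities):
    """Mean of the per-sentence KRA similarities listed for a role."""
    return sum(similarities) / len(similarities)


# Values copied from this run.
mlops_skill_score = skill_overlap_score(2, 12)   # payload reports 0.1667
devops_skill_score = skill_overlap_score(1, 12)  # payload reports 0.0833
data_engineer_kra = kra_role_score([0.533, 0.4827, 0.4407])   # payload reports 0.4855
cloud_security_kra = kra_role_score([0.4156, 0.4125, 0.3985])  # payload reports 0.4088
```

Note that the chosen role's score (0.91) comes from the `stage4_decision` confidence and is not derived from either of these signals.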
API 2 — extract-details
{
"alias_matches": [
{
"alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
"alias_persisted": false,
"existing_alias_id": 2004,
"existing_alias_text": "Apache Spark",
"input_term": "PySpark",
"matched_canonical": {
"category_id": 5,
"display_name": "Apache Spark",
"id": 1350,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "FRAMEWORK",
"slug": "apache-spark",
"sub_category_id": 1021,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "embedding_alias"
},
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 583,
"existing_alias_text": "PowerShell",
"input_term": "PowerShell",
"matched_canonical": {
"category_id": 6,
"display_name": "PowerShell",
"id": 297,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "LANGUAGE",
"slug": "powershell",
"sub_category_id": 38,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
},
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 258,
"existing_alias_text": "Microsoft Azure",
"input_term": "Microsoft Azure",
"matched_canonical": {
"category_id": 9,
"display_name": "Microsoft Azure",
"id": 97,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "PLATFORM",
"slug": "microsoft-azure",
"sub_category_id": 46,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
}
],
"candidate_roles": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
{
"display_name": "Cyber Security Engineer",
"id": 5,
"rationale": null,
"role_archetype": null,
"slug": "cybersecurity-engineer",
"source": "db"
},
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
},
{
"display_name": "PHP Backend Developer",
"id": 86,
"rationale": null,
"role_archetype": "Engineering",
"slug": "php-backend-developer",
"source": "db"
},
{
"display_name": ".NET Backend Developer",
"id": 83,
"rationale": null,
"role_archetype": "Engineering",
"slug": "dotnet-backend-developer",
"source": "db"
},
{
"display_name": "Backend Developer",
"id": 1,
"rationale": null,
"role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
"slug": "backend-engineer",
"source": "db"
},
{
"display_name": "DevOps Engineer",
"id": 10,
"rationale": null,
"role_archetype": null,
"slug": "devops-engineer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 15,
"rationale": null,
"role_archetype": null,
"slug": "full-stack-engineer",
"source": "db"
},
{
"display_name": "Go Backend Developer",
"id": 81,
"rationale": null,
"role_archetype": "Engineering",
"slug": "go-backend-developer",
"source": "db"
},
{
"display_name": "Java Backend Developer",
"id": 79,
"rationale": null,
"role_archetype": "Engineering",
"slug": "java-backend-developer",
"source": "db"
},
{
"display_name": "Kotlin Backend Developer",
"id": 84,
"rationale": null,
"role_archetype": "Engineering",
"slug": "kotlin-server-backend-developer",
"source": "db"
},
{
"display_name": "Node.js Backend Developer",
"id": 82,
"rationale": null,
"role_archetype": "Engineering",
"slug": "node-backend-developer",
"source": "db"
},
{
"display_name": "Python Backend Developer",
"id": 80,
"rationale": null,
"role_archetype": "Engineering",
"slug": "python-backend-developer",
"source": "db"
},
{
"display_name": "Scala Backend Developer",
"id": 87,
"rationale": null,
"role_archetype": "Engineering",
"slug": "scala-backend-developer",
"source": "db"
}
],
"chosen_role": {
"display_name": "Data Governance Engineer",
"id": 146,
"rationale": "Domain=Data Engineering \u0026 Analytics; The JD centers on Azure Purview-based data governance alongside data extraction, ETL orchestration, and Azure data engineering tasks, which best matches Data Governance Engineer.",
"role_archetype": null,
"slug": "data-governance-engineer",
"source": "db"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"input_skill": "PySpark",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages and Scripting",
"id": 59,
"rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
"slug": "programming-languages-and-scripting",
"source": "db"
},
"input_skill": "PowerShell",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Cyber Security Engineer",
"id": 5,
"rationale": null,
"role_archetype": null,
"slug": "cybersecurity-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for ML Systems",
"id": 39,
"rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
"slug": "programming-languages-for-ml-systems",
"source": "db"
},
"input_skill": "PowerShell",
"llm_role": null,
"roles_from_db": [
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud \u0026 Hosting Providers",
"id": 414,
"rationale": "Knowledge of major cloud and hosting vendor platforms for deploying and managing PHP applications.",
"slug": "cloud-hosting-providers",
"source": "db"
},
"input_skill": "Microsoft Azure",
"llm_role": null,
"roles_from_db": [
{
"display_name": "PHP Backend Developer",
"id": 86,
"rationale": null,
"role_archetype": "Engineering",
"slug": "php-backend-developer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Platforms",
"id": 20,
"rationale": "Underlying cloud providers that host the managed services or infrastructure used by the role, such as AWS, Azure, and GCP.",
"slug": "cloud-platforms",
"source": "db"
},
"input_skill": "Microsoft Azure",
"llm_role": null,
"roles_from_db": [
{
"display_name": ".NET Backend Developer",
"id": 83,
"rationale": null,
"role_archetype": "Engineering",
"slug": "dotnet-backend-developer",
"source": "db"
},
{
"display_name": "Backend Developer",
"id": 1,
"rationale": null,
"role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
"slug": "backend-engineer",
"source": "db"
},
{
"display_name": "Cyber Security Engineer",
"id": 5,
"rationale": null,
"role_archetype": null,
"slug": "cybersecurity-engineer",
"source": "db"
},
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
{
"display_name": "DevOps Engineer",
"id": 10,
"rationale": null,
"role_archetype": null,
"slug": "devops-engineer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 15,
"rationale": null,
"role_archetype": null,
"slug": "full-stack-engineer",
"source": "db"
},
{
"display_name": "Go Backend Developer",
"id": 81,
"rationale": null,
"role_archetype": "Engineering",
"slug": "go-backend-developer",
"source": "db"
},
{
"display_name": "Java Backend Developer",
"id": 79,
"rationale": null,
"role_archetype": "Engineering",
"slug": "java-backend-developer",
"source": "db"
},
{
"display_name": "Kotlin Backend Developer",
"id": 84,
"rationale": null,
"role_archetype": "Engineering",
"slug": "kotlin-server-backend-developer",
"source": "db"
},
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
},
{
"display_name": "Node.js Backend Developer",
"id": 82,
"rationale": null,
"role_archetype": "Engineering",
"slug": "node-backend-developer",
"source": "db"
},
{
"display_name": "Python Backend Developer",
"id": 80,
"rationale": null,
"role_archetype": "Engineering",
"slug": "python-backend-developer",
"source": "db"
},
{
"display_name": "Scala Backend Developer",
"id": 87,
"rationale": null,
"role_archetype": "Engineering",
"slug": "scala-backend-developer",
"source": "db"
}
]
}
],
"input_final_skills": [
"Data Governance",
"Data Engineering",
"Azure Purview",
"Azure Data Factory",
"Azure Databricks",
"Azure Logic Apps",
"Azure Log Analytics",
"Azure Data Lake Storage Gen2",
"PySpark",
"Spark SQL",
"PowerShell",
"Microsoft Azure"
],
"input_llm_skills": [
"Data Governance",
"Data Engineering",
"Azure Purview",
"Azure Data Factory",
"Azure Databricks",
"Azure Logic Apps",
"Azure Log Analytics",
"Azure Data Lake Storage Gen2",
"PySpark",
"Spark SQL",
"PowerShell",
"Microsoft Azure"
],
"new_aliases_persisted": 0,
"run_id": "aa174cd1-62e6-4b1e-8972-4d763e3333fc",
"skills_detail": [
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Data Governance",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Data Engineering Tools",
"skill_nature": "CONCEPT",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "data-governance",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Data Engineering",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Data Engineering Tools",
"skill_nature": "CONCEPT",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "data-engineering",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Azure Purview",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Cloud Platforms",
"skill_nature": "PLATFORM",
"sub_category": "Data Governance",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "azure-purview",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Azure Data Factory",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Cloud Platforms",
"skill_nature": "PLATFORM",
"sub_category": "Data Integration",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "azure-data-factory",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Azure Databricks",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Cloud Platforms",
"skill_nature": "PLATFORM",
"sub_category": "Data Science",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "azure-databricks",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Azure Logic Apps",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Cloud Platforms",
"skill_nature": "PLATFORM",
"sub_category": "Integration Services",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "azure-logic-apps",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Azure Log Analytics",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Cloud Platforms",
"skill_nature": "PLATFORM",
"sub_category": "Monitoring Tools",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "azure-log-analytics",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Azure Data Lake Storage Gen2",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Cloud Platforms",
"skill_nature": "PLATFORM",
"sub_category": "Data Storage",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "azure-data-lake-storage-gen2",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "Apache Spark",
"alias_type": "CANONICAL",
"id": 2004,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "apache spark 3",
"alias_type": "VERSION",
"id": 2006,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark",
"alias_type": "VERSION",
"id": 2510,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark 3",
"alias_type": "VERSION",
"id": 2007,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark 3.x",
"alias_type": "VERSION",
"id": 2009,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark3",
"alias_type": "VERSION",
"id": 2008,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 5,
"display_name": "Apache Spark",
"id": 1350,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "FRAMEWORK",
"slug": "apache-spark",
"sub_category_id": 1021,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"input_skill": "PySpark",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
}
],
"input_skill": "PySpark",
"matched_via": "embedding_alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Spark SQL",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Programming Languages",
"skill_nature": "LANGUAGE",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "spark-sql",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "PowerShell",
"alias_type": "CANONICAL",
"id": 583,
"is_primary": true,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "PowerShell 5",
"alias_type": "VERSION",
"id": 585,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "PowerShell 5.1",
"alias_type": "VERSION",
"id": 588,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "PowerShell 6",
"alias_type": "VERSION",
"id": 586,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "PowerShell 7",
"alias_type": "VERSION",
"id": 587,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "PowerShell 7.x",
"alias_type": "VERSION",
"id": 589,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "PowerShell Core",
"alias_type": "VERSION",
"id": 590,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "Windows PowerShell",
"alias_type": "VERSION",
"id": 591,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "powershell 7",
"alias_type": "VERSION",
"id": 2400,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "powershell 7.x",
"alias_type": "VERSION",
"id": 2401,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "powershell core",
"alias_type": "VERSION",
"id": 2402,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "ps 7",
"alias_type": "VERSION",
"id": 2398,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "pwsh",
"alias_type": "VERSION",
"id": 584,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 6,
"display_name": "PowerShell",
"id": 297,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "LANGUAGE",
"slug": "powershell",
"sub_category_id": 38,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages and Scripting",
"id": 59,
"rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
"slug": "programming-languages-and-scripting",
"source": "db"
},
"input_skill": "PowerShell",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Cyber Security Engineer",
"id": 5,
"rationale": null,
"role_archetype": null,
"slug": "cybersecurity-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for ML Systems",
"id": 39,
"rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
"slug": "programming-languages-for-ml-systems",
"source": "db"
},
"input_skill": "PowerShell",
"llm_role": null,
"roles_from_db": [
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
]
}
],
"input_skill": "PowerShell",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "Microsoft Azure",
"alias_type": "CANONICAL",
"id": 258,
"is_primary": true,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 9,
"display_name": "Microsoft Azure",
"id": 97,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "PLATFORM",
"slug": "microsoft-azure",
"sub_category_id": 46,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud \u0026 Hosting Providers",
"id": 414,
"rationale": "Knowledge of major cloud and hosting vendor platforms for deploying and managing PHP applications.",
"slug": "cloud-hosting-providers",
"source": "db"
},
"input_skill": "Microsoft Azure",
"llm_role": null,
"roles_from_db": [
{
"display_name": "PHP Backend Developer",
"id": 86,
"rationale": null,
"role_archetype": "Engineering",
"slug": "php-backend-developer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Platforms",
"id": 20,
"rationale": "Underlying cloud providers that host the managed services or infrastructure used by the role, such as AWS, Azure, and GCP.",
"slug": "cloud-platforms",
"source": "db"
},
"input_skill": "Microsoft Azure",
"llm_role": null,
"roles_from_db": [
{
"display_name": ".NET Backend Developer",
"id": 83,
"rationale": null,
"role_archetype": "Engineering",
"slug": "dotnet-backend-developer",
"source": "db"
},
{
"display_name": "Backend Developer",
"id": 1,
"rationale": null,
"role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
"slug": "backend-engineer",
"source": "db"
},
{
"display_name": "Cyber Security Engineer",
"id": 5,
"rationale": null,
"role_archetype": null,
"slug": "cybersecurity-engineer",
"source": "db"
},
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
{
"display_name": "DevOps Engineer",
"id": 10,
"rationale": null,
"role_archetype": null,
"slug": "devops-engineer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 15,
"rationale": null,
"role_archetype": null,
"slug": "full-stack-engineer",
"source": "db"
},
{
"display_name": "Go Backend Developer",
"id": 81,
"rationale": null,
"role_archetype": "Engineering",
"slug": "go-backend-developer",
"source": "db"
},
{
"display_name": "Java Backend Developer",
"id": 79,
"rationale": null,
"role_archetype": "Engineering",
"slug": "java-backend-developer",
"source": "db"
},
{
"display_name": "Kotlin Backend Developer",
"id": 84,
"rationale": null,
"role_archetype": "Engineering",
"slug": "kotlin-server-backend-developer",
"source": "db"
},
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
},
{
"display_name": "Node.js Backend Developer",
"id": 82,
"rationale": null,
"role_archetype": "Engineering",
"slug": "node-backend-developer",
"source": "db"
},
{
"display_name": "Python Backend Developer",
"id": 80,
"rationale": null,
"role_archetype": "Engineering",
"slug": "python-backend-developer",
"source": "db"
},
{
"display_name": "Scala Backend Developer",
"id": 87,
"rationale": null,
"role_archetype": "Engineering",
"slug": "scala-backend-developer",
"source": "db"
}
]
}
],
"input_skill": "Microsoft Azure",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
}
],
"unmatched_skills": [
"Data Governance",
"Data Engineering",
"Azure Purview",
"Azure Data Factory",
"Azure Databricks",
"Azure Logic Apps",
"Azure Log Analytics",
"Azure Data Lake Storage Gen2",
"Spark SQL"
]
}
API 3 — final-role-output
{
"chosen_role": {
"display_name": "Data Governance Engineer",
"id": 146,
"rationale": "Domain=Data Engineering \u0026 Analytics; The JD centers on Azure Purview-based data governance alongside data extraction, ETL orchestration, and Azure data engineering tasks, which best matches Data Governance Engineer.",
"role_archetype": null,
"slug": "data-governance-engineer",
"source": "db"
},
"chosen_role_resolution": "in_db",
"final_input_skills": [
{
"skill": "Data Governance",
"tag": "new"
},
{
"skill": "Data Engineering",
"tag": "new"
},
{
"skill": "Azure Purview",
"tag": "new"
},
{
"skill": "Azure Data Factory",
"tag": "new"
},
{
"skill": "Azure Databricks",
"tag": "new"
},
{
"skill": "Azure Logic Apps",
"tag": "new"
},
{
"skill": "Azure Log Analytics",
"tag": "new"
},
{
"skill": "Azure Data Lake Storage Gen2",
"tag": "new"
},
{
"skill": "PySpark",
"tag": "in_db"
},
{
"skill": "Spark SQL",
"tag": "new"
},
{
"skill": "PowerShell",
"tag": "in_db"
},
{
"skill": "Microsoft Azure",
"tag": "in_db"
}
],
"llm_cost_api1_usd": null,
"llm_cost_api2_usd": null,
"llm_cost_api3_usd": null,
"llm_cost_total_usd": null,
"persistence": {
"items": [
{
"chosen_role_id": 146,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"dimension_id": 24,
"input_skill": "PySpark",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
],
"skill_dimension_saved": false,
"skill_id": null,
"skill_tag": "new",
"skipped_reason": "skill_not_in_db_v3_proposed"
},
{
"chosen_role_id": 146,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages and Scripting",
"id": 59,
"rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
"slug": "programming-languages-and-scripting",
"source": "db"
},
"dimension_id": 59,
"input_skill": "PowerShell",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "Cyber Security Engineer",
"id": 5,
"rationale": null,
"role_archetype": null,
"slug": "cybersecurity-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 297,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 146,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for ML Systems",
"id": 39,
"rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
"slug": "programming-languages-for-ml-systems",
"source": "db"
},
"dimension_id": 39,
"input_skill": "PowerShell",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 297,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 146,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud \u0026 Hosting Providers",
"id": 414,
"rationale": "Knowledge of major cloud and hosting vendor platforms for deploying and managing PHP applications.",
"slug": "cloud-hosting-providers",
"source": "db"
},
"dimension_id": 414,
"input_skill": "Microsoft Azure",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "PHP Backend Developer",
"id": 86,
"rationale": null,
"role_archetype": "Engineering",
"slug": "php-backend-developer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 97,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 146,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Platforms",
"id": 20,
"rationale": "Underlying cloud providers that host the managed services or infrastructure used by the role, such as AWS, Azure, and GCP.",
"slug": "cloud-platforms",
"source": "db"
},
"dimension_id": 20,
"input_skill": "Microsoft Azure",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": ".NET Backend Developer",
"id": 83,
"rationale": null,
"role_archetype": "Engineering",
"slug": "dotnet-backend-developer",
"source": "db"
},
{
"display_name": "Backend Developer",
"id": 1,
"rationale": null,
"role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
"slug": "backend-engineer",
"source": "db"
},
{
"display_name": "Cyber Security Engineer",
"id": 5,
"rationale": null,
"role_archetype": null,
"slug": "cybersecurity-engineer",
"source": "db"
},
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
{
"display_name": "DevOps Engineer",
"id": 10,
"rationale": null,
"role_archetype": null,
"slug": "devops-engineer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 15,
"rationale": null,
"role_archetype": null,
"slug": "full-stack-engineer",
"source": "db"
},
{
"display_name": "Go Backend Developer",
"id": 81,
"rationale": null,
"role_archetype": "Engineering",
"slug": "go-backend-developer",
"source": "db"
},
{
"display_name": "Java Backend Developer",
"id": 79,
"rationale": null,
"role_archetype": "Engineering",
"slug": "java-backend-developer",
"source": "db"
},
{
"display_name": "Kotlin Backend Developer",
"id": 84,
"rationale": null,
"role_archetype": "Engineering",
"slug": "kotlin-server-backend-developer",
"source": "db"
},
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
},
{
"display_name": "Node.js Backend Developer",
"id": 82,
"rationale": null,
"role_archetype": "Engineering",
"slug": "node-backend-developer",
"source": "db"
},
{
"display_name": "Python Backend Developer",
"id": 80,
"rationale": null,
"role_archetype": "Engineering",
"slug": "python-backend-developer",
"source": "db"
},
{
"display_name": "Scala Backend Developer",
"id": 87,
"rationale": null,
"role_archetype": "Engineering",
"slug": "scala-backend-developer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 97,
"skill_tag": "in_db",
"skipped_reason": null
}
],
"new_skills_created": 0,
"role_dimension_saved": 0,
"skill_dimension_saved": 0,
"skipped": 1
},
"planner_output": null,
"run_id": "aa174cd1-62e6-4b1e-8972-4d763e3333fc"
}
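A minimal sketch for cross-checking the `persistence` summary counters against the per-item flags in a captured response like the one above. It assumes the counters (`skill_dimension_saved`, `role_dimension_saved`, `skipped`) are meant to tally the matching item fields, which the capture itself does not confirm; they may instead count only rows written during this run. The helper names `tally_persistence` and `counter_mismatches` are hypothetical, and the sample is a trimmed stand-in that keeps only the flags from the five items.

```python
def tally_persistence(persistence: dict) -> dict:
    """Recount the summary counters from the per-item flags."""
    items = persistence.get("items", [])
    return {
        "skill_dimension_saved": sum(1 for it in items if it.get("skill_dimension_saved")),
        "role_dimension_saved": sum(1 for it in items if it.get("role_dimension_saved")),
        # Assumption: an item counts as skipped when skipped_reason is non-null.
        "skipped": sum(1 for it in items if it.get("skipped_reason")),
    }


def counter_mismatches(persistence: dict) -> dict:
    """Return {counter: (reported, recomputed)} for counters that disagree."""
    recomputed = tally_persistence(persistence)
    return {
        key: (persistence.get(key), value)
        for key, value in recomputed.items()
        if persistence.get(key) != value
    }


# Trimmed stand-in for the "persistence" object: flags only, one entry per item.
sample = {
    "items": [
        {"input_skill": "PySpark", "skill_dimension_saved": False,
         "role_dimension_saved": False, "skipped_reason": "skill_not_in_db_v3_proposed"},
        {"input_skill": "PowerShell", "skill_dimension_saved": True,
         "role_dimension_saved": False, "skipped_reason": None},
        {"input_skill": "PowerShell", "skill_dimension_saved": True,
         "role_dimension_saved": False, "skipped_reason": None},
        {"input_skill": "Microsoft Azure", "skill_dimension_saved": True,
         "role_dimension_saved": False, "skipped_reason": None},
        {"input_skill": "Microsoft Azure", "skill_dimension_saved": True,
         "role_dimension_saved": False, "skipped_reason": None},
    ],
    "new_skills_created": 0,
    "role_dimension_saved": 0,
    "skill_dimension_saved": 0,
    "skipped": 1,
}

if __name__ == "__main__":
    print(counter_mismatches(sample))
```

Under that assumption, the only counter that disagrees with the items is `skill_dimension_saved`. Whether that is a defect or an intended "newly written this run" count is a question for whoever owns the endpoint.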