Pipeline run
4f3db78c-1948-496b-ad86-aaf3919e8fa9
Client output enrichment
v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA descriptionvocab breakdown (legacy)
Signals
1 POST /skills/extract-from-jd
2 POST /skills/extract-details
3 POST /skills/final-role-output
Data Engineer
CASE Eslug: data-engineer · id: 2 · source: db
The primary skills strongly align with the responsibilities of a Data Engineer.
Resolution:
in_db
— role exists in library; skill↔dim and role↔dim links saved when applicable.
Job description
Responsibilities: Designed and developed Snowflake Cloud Data Warehouse solutions. Built ETL pipelines using Airflow to ingest JSON and variant data. Created database objects including tables, views, and streams in Snowflake. Developed scripts to transform and load data from staging to DWH layers Processed semi-structured (JSON/Variant) data into structured DWH models. Designed and implemented DAG workflows for reporting pipelines. Performed data validation, unit testing, and QA code deployment. Optimized queries and improved reporting performance. Supported reporting and analytics teams with accurate data delivery. Resolved production issues and improved system reliability.
Skills from this JD
Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.
Aliases — catalog
- Snowflake (CANONICAL) primary
Context tags (catalog)
Stored enrichment (catalog DB)
- Category
- Platform
- Sub-category
- Data Cloud Platform
- Vendor
- Snowflake Inc.
- License
- proprietary
- Year introduced
- 2012
- Confidence
- 0.98
- Version strategy
- NOT_APPLICABLE
Maturity reasoning: Snowflake appears frequently in data/analytics job postings and is a standard cloud data warehouse platform alongside BigQuery and Redshift.
Skill profile (library / DB)
- Skill nature
- PLATFORM
- Volatility
- STABLE
- Typical lifespan
- EVERGREEN
- Category id
- 9
- Sub-category id
- 113
- Extractable
- True
- Also category
- False
Dimensions (API 2 worklist)
-
Cloud Data Warehouses Catalog dimension db id 22
Library dimension (catalog)
Roles linked in library: Data Engineer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
|
Cloud Data Warehouses
cloud-data-warehouses
|
✓ | ✓ | Existing dimension (library) · Role↔dimension saved |
Aliases — catalog
- Airflow (CANONICAL) primary
- airflow 2 (VERSION)
- airflow-2 (VERSION)
- airflow2 (VERSION)
- airflow2.x (VERSION)
- apache airflow 2 (VERSION)
Context tags (catalog)
Stored enrichment (catalog DB)
- Category
- Tool
- Sub-category
- Workflow Orchestration Tool
- Vendor
- Apache Software Foundation
- License
- apache_2
- Year introduced
- 2014
- Confidence
- 0.95
- Version strategy
- SEPARATE_ENTITY
- Version tag
- 2.x
Maturity reasoning: Apache Airflow appears in many data engineering job postings and is a common orchestration choice in production stacks; its GitHub activity and ecosystem remain strong, with no vendor sunset or clear replacement dominating JDs.
Skill profile (library / DB)
- Skill nature
- TOOL
- Volatility
- STABLE
- Typical lifespan
- EVERGREEN
- Category id
- 13
- Sub-category id
- 130
- Extractable
- True
- Also category
- False
Dimensions (API 2 worklist)
-
Workflow Orchestration for ML Pipelines Catalog dimension db id 54
Library dimension (catalog)
Roles linked in library: ML Engineer, MLOps Engineer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
|
Workflow Orchestration for ML Pipelines
workflow-orchestration-for-ml-pipelines
|
✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category
- Data Engineering Tools
- Sub-category
- general
- Skill nature
- PRACTICE
- Volatility
- MEDIUM
- Typical lifespan
- MULTI_YEAR
- Version strategy
- UNVERSIONED
Aliases — catalog
- JSON (CANONICAL) primary
Context tags (catalog)
Stored enrichment (catalog DB)
- Category
- Format
- Sub-category
- Data Interchange Format
- Confidence
- 0.99
- Version strategy
- NOT_APPLICABLE
Maturity reasoning: JSON is a default data interchange format in APIs and web stacks; it appears in a very high volume of job descriptions and is supported by every major language/runtime.
Skill profile (library / DB)
- Skill nature
- STANDARD
- Volatility
- STABLE
- Typical lifespan
- EVERGREEN
- Category id
- 4
- Sub-category id
- 1457
- Extractable
- True
- Also category
- False
Dimensions (API 2 worklist)
-
API Integration and Data Fetching Catalog dimension db id 127
Library dimension (catalog)
Roles linked in library: Angular Frontend Developer, Frontend Developer, Fullstack Developer, Fullstack Developer, React Frontend Developer, Svelte Frontend Developer, Vue Frontend Developer, Web Developer
-
API Interface and Contract Design Catalog dimension db id 289
Library dimension (catalog)
Roles linked in library: .NET Backend Developer, Go Backend Developer, Kotlin Backend Developer, Node.js Backend Developer, PHP Backend Developer, Python Backend Developer, Ruby Backend Developer, Scala Backend Developer
-
Integration Protocols & Standards Catalog dimension db id 271
Library dimension (catalog)
Roles linked in library: Pega Developer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
|
API Integration and Data Fetching
api-integration-and-data-fetching
|
✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
|
API Interface and Contract Design
api-interface-and-contract-design
|
✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
|
Integration Protocols & Standards
integration-protocols-standards
|
✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
Aliases — catalog
- DAG design (CANONICAL) primary
Context tags (catalog)
Stored enrichment (catalog DB)
- Category
- Architecture
- Sub-category
- Dag Based Architecture
- Confidence
- 0.86
- Version strategy
- NOT_APPLICABLE
Maturity reasoning: Common in JDs for Airflow, dbt, and data engineering roles; DAG-based orchestration is a standard pattern for pipelines and workflow scheduling.
Skill profile (library / DB)
- Skill nature
- PATTERN
- Volatility
- STABLE
- Typical lifespan
- EVERGREEN
- Category id
- 1
- Sub-category id
- 1113
- Extractable
- True
- Also category
- False
Dimensions (API 2 worklist)
-
Workflow Orchestration for ML Pipelines Catalog dimension db id 54
Library dimension (catalog)
Roles linked in library: ML Engineer, MLOps Engineer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
|
Workflow Orchestration for ML Pipelines
workflow-orchestration-for-ml-pipelines
|
— | — |
Skipped — no persistable v3 meta for new skill
skill_not_in_db_v3_proposed
|
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category
- Data Engineering Tools
- Sub-category
- general
- Skill nature
- PRACTICE
- Volatility
- MEDIUM
- Typical lifespan
- MULTI_YEAR
- Version strategy
- UNVERSIONED
Aliases — catalog
- Unit Testing (CANONICAL)
Context tags (catalog)
Stored enrichment (catalog DB)
- Category
- Methodology
- Sub-category
- Testing Methodology
- Confidence
- 0.98
- Version strategy
- NOT_APPLICABLE
Maturity reasoning: Unit testing is a standard hiring requirement across software JDs and appears in mainstream frameworks/docs; GitHub and Stack Overflow usage remain consistently high, with no successor replacing it.
Skill profile (library / DB)
- Skill nature
- METHODOLOGY
- Volatility
- STABLE
- Typical lifespan
- EVERGREEN
- Category id
- 8
- Sub-category id
- 44
- Extractable
- True
- Also category
- False
Dimensions (API 2 worklist)
-
React Frontend Development Catalog dimension db id 96
Library dimension (catalog)
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
|
React Frontend Development
d_init_01
|
✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category
- Testing Tools
- Sub-category
- general
- Skill nature
- PRACTICE
- Volatility
- MEDIUM
- Typical lifespan
- MULTI_YEAR
- Version strategy
- UNVERSIONED
Aliases — catalog
- SQL (CANONICAL) primary
Context tags (catalog)
Stored enrichment (catalog DB)
- Category
- Language
- Sub-category
- Query Language
- Vendor
- ANSI
- License
- unknown
- Year introduced
- 1974
- Confidence
- 0.99
- Version strategy
- NOT_APPLICABLE
Maturity reasoning: SQL appears in a large share of data, backend, and analytics job descriptions and remains the default query language for PostgreSQL, MySQL, and cloud warehouses like Snowflake/BigQuery.
Skill profile (library / DB)
- Skill nature
- LANGUAGE
- Volatility
- STABLE
- Typical lifespan
- EVERGREEN
- Category id
- 6
- Sub-category id
- 97
- Extractable
- True
- Also category
- False
Dimensions (API 2 worklist)
-
Pega Programming Languages & DSLs Catalog dimension db id 267
Library dimension (catalog)
Roles linked in library: Pega Developer
-
Programming Languages & DSLs Catalog dimension db id 475
Library dimension (catalog)
Roles linked in library: Engineering Manager
-
Programming Languages for Data Work Catalog dimension db id 21
Library dimension (catalog)
Roles linked in library: Data Engineer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
|
Pega Programming Languages & DSLs
pega-programming-languages-dsls
|
✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
|
Programming Languages & DSLs
programming-languages-dsls
|
✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
|
Programming Languages for Data Work
programming-languages-for-data-work
|
✓ | ✓ | Existing dimension (library) · Role↔dimension saved |
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category
- Data Engineering Tools
- Sub-category
- general
- Skill nature
- CONCEPT
- Volatility
- MEDIUM
- Typical lifespan
- MULTI_YEAR
- Version strategy
- UNVERSIONED
All API 3 persistence rows
Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.
| Skill | Tag | Dimension | Skill↔dim | Role↔dim | Outcome | Notes |
|---|---|---|---|---|---|---|
| Snowflake | in_db |
Cloud Data Warehouses
cloud-data-warehouses
|
✓ | ✓ | Existing dimension (library) · Role↔dimension saved | |
| Airflow | in_db |
Workflow Orchestration for ML Pipelines
workflow-orchestration-for-ml-pipelines
|
✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| JSON | in_db |
API Integration and Data Fetching
api-integration-and-data-fetching
|
✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| JSON | in_db |
API Interface and Contract Design
api-interface-and-contract-design
|
✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| JSON | in_db |
Integration Protocols & Standards
integration-protocols-standards
|
✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| DAG | new |
Workflow Orchestration for ML Pipelines
workflow-orchestration-for-ml-pipelines
|
— | — | Skipped — no persistable v3 meta for new skill | skill_not_in_db_v3_proposed |
| Unit Testing | in_db |
React Frontend Development
d_init_01
|
✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| SQL | in_db |
Pega Programming Languages & DSLs
pega-programming-languages-dsls
|
✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| SQL | in_db |
Programming Languages & DSLs
programming-languages-dsls
|
✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| SQL | in_db |
Programming Languages for Data Work
programming-languages-for-data-work
|
✓ | ✓ | Existing dimension (library) · Role↔dimension saved |
Library artifacts (this run)
| Kind | Detail | DB id |
|---|---|---|
| canonical_skill_proposed | ETL | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Data Validation | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR | |
| canonical_skill_proposed | QA | type=Testing Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Data Warehousing | type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR | |
| dimension_skill_link_proposed | DAG ↔ Workflow Orchestration for ML Pipelines |
nano JD Parser — gpt-4.1-nano click to toggle
Show raw JSON
{
"JD_type": "fail",
"archetype_override_applied": true,
"archetype_override_matched_skills": [
"Snowflake",
"Unit Testing",
"staging",
"production",
"Analytics",
"Airflow",
"Cloud",
"Models",
"Streams",
"Views",
"JSON"
],
"role_archetype": "Engineering"
}
API 1 — extract-from-jd click to toggle
{
"final_skills": [
{
"is_primary": true,
"skill_name": "Snowflake"
},
{
"is_primary": true,
"skill_name": "Airflow"
},
{
"is_primary": true,
"skill_name": "ETL"
},
{
"is_primary": true,
"skill_name": "JSON"
},
{
"is_primary": true,
"skill_name": "DAG"
},
{
"is_primary": true,
"skill_name": "Data Validation"
},
{
"is_primary": true,
"skill_name": "Unit Testing"
},
{
"is_primary": true,
"skill_name": "QA"
},
{
"is_primary": true,
"skill_name": "SQL"
},
{
"is_primary": true,
"skill_name": "Data Warehousing"
}
],
"jd_role": null,
"nano_parsed": {
"JD_type": "fail",
"archetype_override_applied": true,
"archetype_override_matched_skills": [
"Snowflake",
"Unit Testing",
"staging",
"production",
"Analytics",
"Airflow",
"Cloud",
"Models",
"Streams",
"Views",
"JSON"
],
"role_archetype": "Engineering"
},
"rejected": false,
"rejection_reason": null,
"run_id": "4f3db78c-1948-496b-ad86-aaf3919e8fa9",
"stage3_signals": {
"alias_found": false,
"alias_match_roles": [],
"kra_match_roles": [
{
"display_name": "Data Engineer",
"kra_matches": [
{
"kra_text": "Designs dimensional models, star schemas, data vault structures, and curated data mart tables to support BI tools and self-service analytics consumption.",
"sentence": "Developed scripts to transform and load data from staging to DWH layers Processed semi-structured (JSON/Variant) data into structured DWH models.",
"similarity": 0.5941
},
{
"kra_text": "Optimizes pipeline throughput, partitioning strategies, and query performance across cloud data warehouses like Snowflake, BigQuery, or Redshift.",
"sentence": "Designed and developed Snowflake Cloud Data Warehouse solutions.",
"similarity": 0.5882
},
{
"kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
"sentence": "Supported reporting and analytics teams with accurate data delivery.",
"similarity": 0.5801
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 0.5875,
"slug": "data-engineer",
"total_count": null
},
{
"display_name": "MLOps Engineer",
"kra_matches": [
{
"kra_text": "Validates model performance benchmarks, data schema contracts, and system integration health before signing off on production release readiness.",
"sentence": "Performed data validation, unit testing, and QA code deployment.",
"similarity": 0.6061
},
{
"kra_text": "Coordinates model promotion workflows across development, staging, and production environments including integration testing and data contract validation.",
"sentence": "Developed scripts to transform and load data from staging to DWH layers Processed semi-structured (JSON/Variant) data into structured DWH models.",
"similarity": 0.5058
},
{
"kra_text": "Maintains model versioning, experiment lineage, and artifact tracking using MLflow, DVC, or Weights \u0026 Biases for reproducibility and auditability.",
"sentence": "Built ETL pipelines using Airflow to ingest JSON and variant data.",
"similarity": 0.4699
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 16,
"score": 0.5273,
"slug": "ml-ops-engineer",
"total_count": null
},
{
"display_name": "Ruby Backend Developer",
"kra_matches": [
{
"kra_text": "performance and reliability improvements",
"sentence": "Resolved production issues and improved system reliability.",
"similarity": 0.6138
},
{
"kra_text": "performance and reliability improvements",
"sentence": "Optimized queries and improved reporting performance.",
"similarity": 0.5174
},
{
"kra_text": "automated backend checks",
"sentence": "Performed data validation, unit testing, and QA code deployment.",
"similarity": 0.4308
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 85,
"score": 0.5207,
"slug": "ruby-backend-developer",
"total_count": null
},
{
"display_name": "AI Engineer",
"kra_matches": [
{
"kra_text": "Designs and implements prompt engineering workflows, few-shot examples, chain-of-thought patterns, and structured output parsing for AI feature pipelines.",
"sentence": "Designed and implemented DAG workflows for reporting pipelines.",
"similarity": 0.5648
},
{
"kra_text": "Defines evaluation frameworks, automated test suites, and human feedback loops to measure AI feature quality, accuracy, and consistency.",
"sentence": "Performed data validation, unit testing, and QA code deployment.",
"similarity": 0.4643
},
{
"kra_text": "Optimizes AI pipeline efficiency by tuning model selection, context window usage, prompt caching, and batching strategies to reduce cost and latency.",
"sentence": "Optimized queries and improved reporting performance.",
"similarity": 0.4492
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 13,
"score": 0.4928,
"slug": "ai-engineer",
"total_count": null
},
{
"display_name": "Java Backend Developer",
"kra_matches": [
{
"kra_text": "backend performance tuning",
"sentence": "Optimized queries and improved reporting performance.",
"similarity": 0.5301
},
{
"kra_text": "code refactoring and defect fixes",
"sentence": "Performed data validation, unit testing, and QA code deployment.",
"similarity": 0.4948
},
{
"kra_text": "persistence and data modeling",
"sentence": "Developed scripts to transform and load data from staging to DWH layers Processed semi-structured (JSON/Variant) data into structured DWH models.",
"similarity": 0.4498
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 79,
"score": 0.4916,
"slug": "java-backend-developer",
"total_count": null
}
],
"skill_match_roles": [
{
"display_name": "Pega Developer",
"kra_matches": null,
"matched_count": 2,
"matched_skills": [
"JSON",
"SQL"
],
"role_id": 24,
"score": 0.2,
"slug": "pega-developer",
"total_count": 10
},
{
"display_name": "Data Engineer",
"kra_matches": null,
"matched_count": 2,
"matched_skills": [
"SQL",
"Snowflake"
],
"role_id": 2,
"score": 0.2,
"slug": "data-engineer",
"total_count": 10
},
{
"display_name": "Fullstack Developer",
"kra_matches": null,
"matched_count": 1,
"matched_skills": [
"JSON"
],
"role_id": 15,
"score": 0.1,
"slug": "full-stack-engineer",
"total_count": 10
},
{
"display_name": "Frontend Developer",
"kra_matches": null,
"matched_count": 1,
"matched_skills": [
"JSON"
],
"role_id": 7,
"score": 0.1,
"slug": "frontend-engineer",
"total_count": 10
},
{
"display_name": "MLOps Engineer",
"kra_matches": null,
"matched_count": 1,
"matched_skills": [
"Airflow"
],
"role_id": 16,
"score": 0.1,
"slug": "ml-ops-engineer",
"total_count": 10
}
]
},
"stage4_decision": {
"alias_collision_detected": false,
"case": "E",
"chosen_role": null,
"confidence": 0.0,
"is_new_role": false,
"llm2_fired": false,
"llm2_reasoning": null,
"matched_dimensions": [],
"matched_kras": [],
"matched_skills": [],
"new_role_display_name": null,
"new_role_slug": null,
"queued": true,
"reasoning": "Stage 4 crashed: FileNotFoundError: [Errno 2] No such file or directory: \u0027/app/domain_role_classifier_prompt.md\u0027",
"sub_role": null
},
"stage5_updates": null
}
API 2 — extract-details
{
"alias_matches": [
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 299,
"existing_alias_text": "Snowflake",
"input_term": "Snowflake",
"matched_canonical": {
"category_id": 9,
"display_name": "Snowflake",
"id": 105,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "PLATFORM",
"slug": "snowflake",
"sub_category_id": 113,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
},
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 526,
"existing_alias_text": "Airflow",
"input_term": "Airflow",
"matched_canonical": {
"category_id": 13,
"display_name": "Airflow",
"id": 265,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "TOOL",
"slug": "airflow",
"sub_category_id": 130,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
},
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 3018,
"existing_alias_text": "JSON",
"input_term": "JSON",
"matched_canonical": {
"category_id": 4,
"display_name": "JSON",
"id": 1984,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "STANDARD",
"slug": "json",
"sub_category_id": 1457,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
},
{
"alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
"alias_persisted": false,
"existing_alias_id": 2484,
"existing_alias_text": "DAG design",
"input_term": "DAG",
"matched_canonical": {
"category_id": 1,
"display_name": "DAG design",
"id": 1557,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "PATTERN",
"slug": "dag-design",
"sub_category_id": 1113,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "embedding_alias"
},
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 865,
"existing_alias_text": "Unit Testing",
"input_term": "Unit Testing",
"matched_canonical": {
"category_id": 8,
"display_name": "Unit Testing",
"id": 517,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "METHODOLOGY",
"slug": "unit-testing",
"sub_category_id": 44,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
},
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 271,
"existing_alias_text": "SQL",
"input_term": "SQL",
"matched_canonical": {
"category_id": 6,
"display_name": "SQL",
"id": 101,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "LANGUAGE",
"slug": "sql",
"sub_category_id": 97,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
}
],
"candidate_roles": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
},
{
"display_name": "Angular Frontend Developer",
"id": 90,
"rationale": null,
"role_archetype": "Engineering",
"slug": "angular-frontend-developer",
"source": "db"
},
{
"display_name": "Frontend Developer",
"id": 7,
"rationale": null,
"role_archetype": null,
"slug": "frontend-engineer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 435,
"rationale": null,
"role_archetype": "Engineering",
"slug": "fullstack-developer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 15,
"rationale": null,
"role_archetype": null,
"slug": "full-stack-engineer",
"source": "db"
},
{
"display_name": "React Frontend Developer",
"id": 89,
"rationale": null,
"role_archetype": "Engineering",
"slug": "react-frontend-developer",
"source": "db"
},
{
"display_name": "Svelte Frontend Developer",
"id": 92,
"rationale": null,
"role_archetype": "Engineering",
"slug": "svelte-frontend-developer",
"source": "db"
},
{
"display_name": "Vue Frontend Developer",
"id": 91,
"rationale": null,
"role_archetype": "Engineering",
"slug": "vue-frontend-developer",
"source": "db"
},
{
"display_name": "Web Developer",
"id": 25,
"rationale": null,
"role_archetype": null,
"slug": "web-developer",
"source": "db"
},
{
"display_name": ".NET Backend Developer",
"id": 83,
"rationale": null,
"role_archetype": "Engineering",
"slug": "dotnet-backend-developer",
"source": "db"
},
{
"display_name": "Go Backend Developer",
"id": 81,
"rationale": null,
"role_archetype": "Engineering",
"slug": "go-backend-developer",
"source": "db"
},
{
"display_name": "Kotlin Backend Developer",
"id": 84,
"rationale": null,
"role_archetype": "Engineering",
"slug": "kotlin-server-backend-developer",
"source": "db"
},
{
"display_name": "Node.js Backend Developer",
"id": 82,
"rationale": null,
"role_archetype": "Engineering",
"slug": "node-backend-developer",
"source": "db"
},
{
"display_name": "PHP Backend Developer",
"id": 86,
"rationale": null,
"role_archetype": "Engineering",
"slug": "php-backend-developer",
"source": "db"
},
{
"display_name": "Python Backend Developer",
"id": 80,
"rationale": null,
"role_archetype": "Engineering",
"slug": "python-backend-developer",
"source": "db"
},
{
"display_name": "Ruby Backend Developer",
"id": 85,
"rationale": null,
"role_archetype": "Engineering",
"slug": "ruby-backend-developer",
"source": "db"
},
{
"display_name": "Scala Backend Developer",
"id": 87,
"rationale": null,
"role_archetype": "Engineering",
"slug": "scala-backend-developer",
"source": "db"
},
{
"display_name": "Pega Developer",
"id": 24,
"rationale": null,
"role_archetype": null,
"slug": "pega-developer",
"source": "db"
},
{
"display_name": "Engineering Manager",
"id": 121,
"rationale": null,
"role_archetype": null,
"slug": "engineering-manager",
"source": "db"
}
],
"chosen_role": {
"display_name": "Data Engineer",
"id": 2,
"rationale": "The primary skills strongly align with the responsibilities of a Data Engineer.",
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Data Warehouses",
"id": 22,
"rationale": "Managed analytical storage and compute platforms used for curated datasets, reporting, and downstream analytics. These systems are central to data modeling, performance tuning, and cost-aware query design.",
"slug": "cloud-data-warehouses",
"source": "db"
},
"input_skill": "Snowflake",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Workflow Orchestration for ML Pipelines",
"id": 54,
"rationale": "Workflow engines used to coordinate training, evaluation, deployment, and retraining jobs. This cluster covers dependencies, retries, scheduling, and pipeline composition for ML lifecycle automation.",
"slug": "workflow-orchestration-for-ml-pipelines",
"source": "db"
},
"input_skill": "Airflow",
"llm_role": null,
"roles_from_db": [
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "API Integration and Data Fetching",
"id": 127,
"rationale": "Client-side integration with backend endpoints and third-party services, including request shaping, response handling, and synchronization with UI state. This is central to frontend work because most screens depend on remote data.",
"slug": "api-integration-and-data-fetching",
"source": "db"
},
"input_skill": "JSON",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Angular Frontend Developer",
"id": 90,
"rationale": null,
"role_archetype": "Engineering",
"slug": "angular-frontend-developer",
"source": "db"
},
{
"display_name": "Frontend Developer",
"id": 7,
"rationale": null,
"role_archetype": null,
"slug": "frontend-engineer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 435,
"rationale": null,
"role_archetype": "Engineering",
"slug": "fullstack-developer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 15,
"rationale": null,
"role_archetype": null,
"slug": "full-stack-engineer",
"source": "db"
},
{
"display_name": "React Frontend Developer",
"id": 89,
"rationale": null,
"role_archetype": "Engineering",
"slug": "react-frontend-developer",
"source": "db"
},
{
"display_name": "Svelte Frontend Developer",
"id": 92,
"rationale": null,
"role_archetype": "Engineering",
"slug": "svelte-frontend-developer",
"source": "db"
},
{
"display_name": "Vue Frontend Developer",
"id": 91,
"rationale": null,
"role_archetype": "Engineering",
"slug": "vue-frontend-developer",
"source": "db"
},
{
"display_name": "Web Developer",
"id": 25,
"rationale": null,
"role_archetype": null,
"slug": "web-developer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "API Interface and Contract Design",
"id": 289,
"rationale": "Designing backend service interfaces and contracts that other systems consume, including endpoint and operation shape, request/response payloads, schema and validation, pagination, filtering, idempotency, versioning, status codes, and backward compatibility across REST, GraphQL, gRPC, and OpenAPI-based APIs.",
"slug": "api-interface-and-contract-design",
"source": "db"
},
"input_skill": "JSON",
"llm_role": null,
"roles_from_db": [
{
"display_name": ".NET Backend Developer",
"id": 83,
"rationale": null,
"role_archetype": "Engineering",
"slug": "dotnet-backend-developer",
"source": "db"
},
{
"display_name": "Go Backend Developer",
"id": 81,
"rationale": null,
"role_archetype": "Engineering",
"slug": "go-backend-developer",
"source": "db"
},
{
"display_name": "Kotlin Backend Developer",
"id": 84,
"rationale": null,
"role_archetype": "Engineering",
"slug": "kotlin-server-backend-developer",
"source": "db"
},
{
"display_name": "Node.js Backend Developer",
"id": 82,
"rationale": null,
"role_archetype": "Engineering",
"slug": "node-backend-developer",
"source": "db"
},
{
"display_name": "PHP Backend Developer",
"id": 86,
"rationale": null,
"role_archetype": "Engineering",
"slug": "php-backend-developer",
"source": "db"
},
{
"display_name": "Python Backend Developer",
"id": 80,
"rationale": null,
"role_archetype": "Engineering",
"slug": "python-backend-developer",
"source": "db"
},
{
"display_name": "Ruby Backend Developer",
"id": 85,
"rationale": null,
"role_archetype": "Engineering",
"slug": "ruby-backend-developer",
"source": "db"
},
{
"display_name": "Scala Backend Developer",
"id": 87,
"rationale": null,
"role_archetype": "Engineering",
"slug": "scala-backend-developer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Integration Protocols \u0026 Standards",
"id": 271,
"rationale": "Standards and protocols for integrating Pega applications.",
"slug": "integration-protocols-standards",
"source": "db"
},
"input_skill": "JSON",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Pega Developer",
"id": 24,
"rationale": null,
"role_archetype": null,
"slug": "pega-developer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Workflow Orchestration for ML Pipelines",
"id": 54,
"rationale": "Workflow engines used to coordinate training, evaluation, deployment, and retraining jobs. This cluster covers dependencies, retries, scheduling, and pipeline composition for ML lifecycle automation.",
"slug": "workflow-orchestration-for-ml-pipelines",
"source": "db"
},
"input_skill": "DAG",
"llm_role": null,
"roles_from_db": [
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "React Frontend Development",
"id": 96,
"rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
"slug": "d_init_01",
"source": "db"
},
"input_skill": "Unit Testing",
"llm_role": null,
"roles_from_db": []
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Pega Programming Languages \u0026 DSLs",
"id": 267,
"rationale": "Programming languages and domain-specific languages used in Pega development.",
"slug": "pega-programming-languages-dsls",
"source": "db"
},
"input_skill": "SQL",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Pega Developer",
"id": 24,
"rationale": null,
"role_archetype": null,
"slug": "pega-developer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages \u0026 DSLs",
"id": 475,
"rationale": "Oversee and guide the selection and effective use of programming and domain\u2010specific languages in software projects.",
"slug": "programming-languages-dsls",
"source": "db"
},
"input_skill": "SQL",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Engineering Manager",
"id": 121,
"rationale": null,
"role_archetype": null,
"slug": "engineering-manager",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for Data Work",
"id": 21,
"rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
"slug": "programming-languages-for-data-work",
"source": "db"
},
"input_skill": "SQL",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
}
],
"input_final_skills": [
"Snowflake",
"Airflow",
"ETL",
"JSON",
"DAG",
"Data Validation",
"Unit Testing",
"QA",
"SQL",
"Data Warehousing"
],
"input_llm_skills": [
"Snowflake",
"Airflow",
"ETL",
"JSON",
"DAG",
"Data Validation",
"Unit Testing",
"QA",
"SQL",
"Data Warehousing"
],
"new_aliases_persisted": 0,
"run_id": "4f3db78c-1948-496b-ad86-aaf3919e8fa9",
"skills_detail": [
{
"aliases_in_db": [
{
"alias_text": "Snowflake",
"alias_type": "CANONICAL",
"id": 299,
"is_primary": true,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 9,
"display_name": "Snowflake",
"id": 105,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "PLATFORM",
"slug": "snowflake",
"sub_category_id": 113,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Data Warehouses",
"id": 22,
"rationale": "Managed analytical storage and compute platforms used for curated datasets, reporting, and downstream analytics. These systems are central to data modeling, performance tuning, and cost-aware query design.",
"slug": "cloud-data-warehouses",
"source": "db"
},
"input_skill": "Snowflake",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
}
],
"input_skill": "Snowflake",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "Airflow",
"alias_type": "CANONICAL",
"id": 526,
"is_primary": true,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "airflow 2",
"alias_type": "VERSION",
"id": 2477,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "airflow-2",
"alias_type": "VERSION",
"id": 2478,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "airflow2",
"alias_type": "VERSION",
"id": 2476,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "airflow2.x",
"alias_type": "VERSION",
"id": 2479,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "apache airflow 2",
"alias_type": "VERSION",
"id": 2480,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 13,
"display_name": "Airflow",
"id": 265,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "TOOL",
"slug": "airflow",
"sub_category_id": 130,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Workflow Orchestration for ML Pipelines",
"id": 54,
"rationale": "Workflow engines used to coordinate training, evaluation, deployment, and retraining jobs. This cluster covers dependencies, retries, scheduling, and pipeline composition for ML lifecycle automation.",
"slug": "workflow-orchestration-for-ml-pipelines",
"source": "db"
},
"input_skill": "Airflow",
"llm_role": null,
"roles_from_db": [
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
]
}
],
"input_skill": "Airflow",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "ETL",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Data Engineering Tools",
"skill_nature": "PRACTICE",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "etl",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "JSON",
"alias_type": "CANONICAL",
"id": 3018,
"is_primary": true,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 4,
"display_name": "JSON",
"id": 1984,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "STANDARD",
"slug": "json",
"sub_category_id": 1457,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "API Integration and Data Fetching",
"id": 127,
"rationale": "Client-side integration with backend endpoints and third-party services, including request shaping, response handling, and synchronization with UI state. This is central to frontend work because most screens depend on remote data.",
"slug": "api-integration-and-data-fetching",
"source": "db"
},
"input_skill": "JSON",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Angular Frontend Developer",
"id": 90,
"rationale": null,
"role_archetype": "Engineering",
"slug": "angular-frontend-developer",
"source": "db"
},
{
"display_name": "Frontend Developer",
"id": 7,
"rationale": null,
"role_archetype": null,
"slug": "frontend-engineer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 435,
"rationale": null,
"role_archetype": "Engineering",
"slug": "fullstack-developer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 15,
"rationale": null,
"role_archetype": null,
"slug": "full-stack-engineer",
"source": "db"
},
{
"display_name": "React Frontend Developer",
"id": 89,
"rationale": null,
"role_archetype": "Engineering",
"slug": "react-frontend-developer",
"source": "db"
},
{
"display_name": "Svelte Frontend Developer",
"id": 92,
"rationale": null,
"role_archetype": "Engineering",
"slug": "svelte-frontend-developer",
"source": "db"
},
{
"display_name": "Vue Frontend Developer",
"id": 91,
"rationale": null,
"role_archetype": "Engineering",
"slug": "vue-frontend-developer",
"source": "db"
},
{
"display_name": "Web Developer",
"id": 25,
"rationale": null,
"role_archetype": null,
"slug": "web-developer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "API Interface and Contract Design",
"id": 289,
"rationale": "Designing backend service interfaces and contracts that other systems consume, including endpoint and operation shape, request/response payloads, schema and validation, pagination, filtering, idempotency, versioning, status codes, and backward compatibility across REST, GraphQL, gRPC, and OpenAPI-based APIs.",
"slug": "api-interface-and-contract-design",
"source": "db"
},
"input_skill": "JSON",
"llm_role": null,
"roles_from_db": [
{
"display_name": ".NET Backend Developer",
"id": 83,
"rationale": null,
"role_archetype": "Engineering",
"slug": "dotnet-backend-developer",
"source": "db"
},
{
"display_name": "Go Backend Developer",
"id": 81,
"rationale": null,
"role_archetype": "Engineering",
"slug": "go-backend-developer",
"source": "db"
},
{
"display_name": "Kotlin Backend Developer",
"id": 84,
"rationale": null,
"role_archetype": "Engineering",
"slug": "kotlin-server-backend-developer",
"source": "db"
},
{
"display_name": "Node.js Backend Developer",
"id": 82,
"rationale": null,
"role_archetype": "Engineering",
"slug": "node-backend-developer",
"source": "db"
},
{
"display_name": "PHP Backend Developer",
"id": 86,
"rationale": null,
"role_archetype": "Engineering",
"slug": "php-backend-developer",
"source": "db"
},
{
"display_name": "Python Backend Developer",
"id": 80,
"rationale": null,
"role_archetype": "Engineering",
"slug": "python-backend-developer",
"source": "db"
},
{
"display_name": "Ruby Backend Developer",
"id": 85,
"rationale": null,
"role_archetype": "Engineering",
"slug": "ruby-backend-developer",
"source": "db"
},
{
"display_name": "Scala Backend Developer",
"id": 87,
"rationale": null,
"role_archetype": "Engineering",
"slug": "scala-backend-developer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Integration Protocols \u0026 Standards",
"id": 271,
"rationale": "Standards and protocols for integrating Pega applications.",
"slug": "integration-protocols-standards",
"source": "db"
},
"input_skill": "JSON",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Pega Developer",
"id": 24,
"rationale": null,
"role_archetype": null,
"slug": "pega-developer",
"source": "db"
}
]
}
],
"input_skill": "JSON",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "DAG design",
"alias_type": "CANONICAL",
"id": 2484,
"is_primary": true,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 1,
"display_name": "DAG design",
"id": 1557,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "PATTERN",
"slug": "dag-design",
"sub_category_id": 1113,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Workflow Orchestration for ML Pipelines",
"id": 54,
"rationale": "Workflow engines used to coordinate training, evaluation, deployment, and retraining jobs. This cluster covers dependencies, retries, scheduling, and pipeline composition for ML lifecycle automation.",
"slug": "workflow-orchestration-for-ml-pipelines",
"source": "db"
},
"input_skill": "DAG",
"llm_role": null,
"roles_from_db": [
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
]
}
],
"input_skill": "DAG",
"matched_via": "embedding_alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Data Validation",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Data Engineering Tools",
"skill_nature": "PRACTICE",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "data-validation",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "Unit Testing",
"alias_type": "CANONICAL",
"id": 865,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 8,
"display_name": "Unit Testing",
"id": 517,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "METHODOLOGY",
"slug": "unit-testing",
"sub_category_id": 44,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "React Frontend Development",
"id": 96,
"rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
"slug": "d_init_01",
"source": "db"
},
"input_skill": "Unit Testing",
"llm_role": null,
"roles_from_db": []
}
],
"input_skill": "Unit Testing",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "QA",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Testing Tools",
"skill_nature": "PRACTICE",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "qa",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "SQL",
"alias_type": "CANONICAL",
"id": 271,
"is_primary": true,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 6,
"display_name": "SQL",
"id": 101,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "LANGUAGE",
"slug": "sql",
"sub_category_id": 97,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Pega Programming Languages \u0026 DSLs",
"id": 267,
"rationale": "Programming languages and domain-specific languages used in Pega development.",
"slug": "pega-programming-languages-dsls",
"source": "db"
},
"input_skill": "SQL",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Pega Developer",
"id": 24,
"rationale": null,
"role_archetype": null,
"slug": "pega-developer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages \u0026 DSLs",
"id": 475,
"rationale": "Oversee and guide the selection and effective use of programming and domain\u2010specific languages in software projects.",
"slug": "programming-languages-dsls",
"source": "db"
},
"input_skill": "SQL",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Engineering Manager",
"id": 121,
"rationale": null,
"role_archetype": null,
"slug": "engineering-manager",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for Data Work",
"id": 21,
"rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
"slug": "programming-languages-for-data-work",
"source": "db"
},
"input_skill": "SQL",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
}
],
"input_skill": "SQL",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Data Warehousing",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Data Engineering Tools",
"skill_nature": "CONCEPT",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "data-warehousing",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
}
],
"unmatched_skills": [
"ETL",
"Data Validation",
"QA",
"Data Warehousing"
]
}
API 3 — final-role-output
{
"chosen_role": {
"display_name": "Data Engineer",
"id": 2,
"rationale": "The primary skills strongly align with the responsibilities of a Data Engineer.",
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
"chosen_role_resolution": "in_db",
"final_input_skills": [
{
"skill": "Snowflake",
"tag": "in_db"
},
{
"skill": "Airflow",
"tag": "in_db"
},
{
"skill": "ETL",
"tag": "new"
},
{
"skill": "JSON",
"tag": "in_db"
},
{
"skill": "DAG",
"tag": "in_db"
},
{
"skill": "Data Validation",
"tag": "new"
},
{
"skill": "Unit Testing",
"tag": "in_db"
},
{
"skill": "QA",
"tag": "new"
},
{
"skill": "SQL",
"tag": "in_db"
},
{
"skill": "Data Warehousing",
"tag": "new"
}
],
"llm_cost_api1_usd": null,
"llm_cost_api2_usd": null,
"llm_cost_api3_usd": null,
"llm_cost_total_usd": null,
"persistence": {
"items": [
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Data Warehouses",
"id": 22,
"rationale": "Managed analytical storage and compute platforms used for curated datasets, reporting, and downstream analytics. These systems are central to data modeling, performance tuning, and cost-aware query design.",
"slug": "cloud-data-warehouses",
"source": "db"
},
"dimension_id": 22,
"input_skill": "Snowflake",
"llm_role": null,
"matched_chosen_role": true,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
"role_dimension_saved": true,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 105,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Workflow Orchestration for ML Pipelines",
"id": 54,
"rationale": "Workflow engines used to coordinate training, evaluation, deployment, and retraining jobs. This cluster covers dependencies, retries, scheduling, and pipeline composition for ML lifecycle automation.",
"slug": "workflow-orchestration-for-ml-pipelines",
"source": "db"
},
"dimension_id": 54,
"input_skill": "Airflow",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 265,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "API Integration and Data Fetching",
"id": 127,
"rationale": "Client-side integration with backend endpoints and third-party services, including request shaping, response handling, and synchronization with UI state. This is central to frontend work because most screens depend on remote data.",
"slug": "api-integration-and-data-fetching",
"source": "db"
},
"dimension_id": 127,
"input_skill": "JSON",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "Angular Frontend Developer",
"id": 90,
"rationale": null,
"role_archetype": "Engineering",
"slug": "angular-frontend-developer",
"source": "db"
},
{
"display_name": "Frontend Developer",
"id": 7,
"rationale": null,
"role_archetype": null,
"slug": "frontend-engineer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 435,
"rationale": null,
"role_archetype": "Engineering",
"slug": "fullstack-developer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 15,
"rationale": null,
"role_archetype": null,
"slug": "full-stack-engineer",
"source": "db"
},
{
"display_name": "React Frontend Developer",
"id": 89,
"rationale": null,
"role_archetype": "Engineering",
"slug": "react-frontend-developer",
"source": "db"
},
{
"display_name": "Svelte Frontend Developer",
"id": 92,
"rationale": null,
"role_archetype": "Engineering",
"slug": "svelte-frontend-developer",
"source": "db"
},
{
"display_name": "Vue Frontend Developer",
"id": 91,
"rationale": null,
"role_archetype": "Engineering",
"slug": "vue-frontend-developer",
"source": "db"
},
{
"display_name": "Web Developer",
"id": 25,
"rationale": null,
"role_archetype": null,
"slug": "web-developer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 1984,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "API Interface and Contract Design",
"id": 289,
"rationale": "Designing backend service interfaces and contracts that other systems consume, including endpoint and operation shape, request/response payloads, schema and validation, pagination, filtering, idempotency, versioning, status codes, and backward compatibility across REST, GraphQL, gRPC, and OpenAPI-based APIs.",
"slug": "api-interface-and-contract-design",
"source": "db"
},
"dimension_id": 289,
"input_skill": "JSON",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": ".NET Backend Developer",
"id": 83,
"rationale": null,
"role_archetype": "Engineering",
"slug": "dotnet-backend-developer",
"source": "db"
},
{
"display_name": "Go Backend Developer",
"id": 81,
"rationale": null,
"role_archetype": "Engineering",
"slug": "go-backend-developer",
"source": "db"
},
{
"display_name": "Kotlin Backend Developer",
"id": 84,
"rationale": null,
"role_archetype": "Engineering",
"slug": "kotlin-server-backend-developer",
"source": "db"
},
{
"display_name": "Node.js Backend Developer",
"id": 82,
"rationale": null,
"role_archetype": "Engineering",
"slug": "node-backend-developer",
"source": "db"
},
{
"display_name": "PHP Backend Developer",
"id": 86,
"rationale": null,
"role_archetype": "Engineering",
"slug": "php-backend-developer",
"source": "db"
},
{
"display_name": "Python Backend Developer",
"id": 80,
"rationale": null,
"role_archetype": "Engineering",
"slug": "python-backend-developer",
"source": "db"
},
{
"display_name": "Ruby Backend Developer",
"id": 85,
"rationale": null,
"role_archetype": "Engineering",
"slug": "ruby-backend-developer",
"source": "db"
},
{
"display_name": "Scala Backend Developer",
"id": 87,
"rationale": null,
"role_archetype": "Engineering",
"slug": "scala-backend-developer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 1984,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Integration Protocols \u0026 Standards",
"id": 271,
"rationale": "Standards and protocols for integrating Pega applications.",
"slug": "integration-protocols-standards",
"source": "db"
},
"dimension_id": 271,
"input_skill": "JSON",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "Pega Developer",
"id": 24,
"rationale": null,
"role_archetype": null,
"slug": "pega-developer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 1984,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Workflow Orchestration for ML Pipelines",
"id": 54,
"rationale": "Workflow engines used to coordinate training, evaluation, deployment, and retraining jobs. This cluster covers dependencies, retries, scheduling, and pipeline composition for ML lifecycle automation.",
"slug": "workflow-orchestration-for-ml-pipelines",
"source": "db"
},
"dimension_id": 54,
"input_skill": "DAG",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
],
"skill_dimension_saved": false,
"skill_id": null,
"skill_tag": "new",
"skipped_reason": "skill_not_in_db_v3_proposed"
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "React Frontend Development",
"id": 96,
"rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
"slug": "d_init_01",
"source": "db"
},
"dimension_id": 96,
"input_skill": "Unit Testing",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [],
"skill_dimension_saved": true,
"skill_id": 517,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Pega Programming Languages \u0026 DSLs",
"id": 267,
"rationale": "Programming languages and domain-specific languages used in Pega development.",
"slug": "pega-programming-languages-dsls",
"source": "db"
},
"dimension_id": 267,
"input_skill": "SQL",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "Pega Developer",
"id": 24,
"rationale": null,
"role_archetype": null,
"slug": "pega-developer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 101,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages \u0026 DSLs",
"id": 475,
"rationale": "Oversee and guide the selection and effective use of programming and domain\u2010specific languages in software projects.",
"slug": "programming-languages-dsls",
"source": "db"
},
"dimension_id": 475,
"input_skill": "SQL",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "Engineering Manager",
"id": 121,
"rationale": null,
"role_archetype": null,
"slug": "engineering-manager",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 101,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for Data Work",
"id": 21,
"rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
"slug": "programming-languages-for-data-work",
"source": "db"
},
"dimension_id": 21,
"input_skill": "SQL",
"llm_role": null,
"matched_chosen_role": true,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
"role_dimension_saved": true,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 101,
"skill_tag": "in_db",
"skipped_reason": null
}
],
"new_skills_created": 0,
"role_dimension_saved": 0,
"skill_dimension_saved": 0,
"skipped": 1
},
"planner_output": null,
"run_id": "4f3db78c-1948-496b-ad86-aaf3919e8fa9"
}