Pipeline run
69a5b208-d3d6-459c-8c8a-42ea615ee412
Client output enrichment
v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description · vocab breakdown (legacy)
Signals
Post-classification
Captured for admin review
1 POST /skills/extract-from-jd
2 POST /skills/extract-details
3 POST /skills/final-role-output
Data Engineer
CASE A · slug: data-engineer · id: 2 · source: db
Exact alias hit on data-engineer (1.0) — no other alias at this confidence; skill_top data-engineer 0.20 does not contradict
Resolution: in_db (role exists in library; skill↔dim and role↔dim links saved when applicable).
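The Case A reasoning above can be read as a small rule: accept the role when exactly one alias hits at confidence 1.0 and the top skill-match role does not point elsewhere. The following is a minimal sketch of that reading only. `RoleSignal` and `resolve_case_a` are hypothetical names, and the meaning of "contradict" (the top skill-match role is a different role) is an assumption, not the pipeline's confirmed logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RoleSignal:
    slug: str
    score: float


def resolve_case_a(alias_hits: list[RoleSignal],
                   skill_hits: list[RoleSignal],
                   exact: float = 1.0) -> Optional[str]:
    """Return the role slug when a single exact alias hit is uncontested.

    Assumption: "contradict" means the top skill-match role is a different
    role from the alias hit. The real pipeline may use other thresholds.
    """
    exact_hits = [h for h in alias_hits if h.score >= exact]
    if len(exact_hits) != 1:
        return None  # no exact alias, or several aliases tie at this confidence
    chosen = exact_hits[0].slug
    if skill_hits:
        top_skill = max(skill_hits, key=lambda h: h.score)
        if top_skill.slug != chosen:
            return None  # skill evidence points at another role
    return chosen


# Values taken from this run: alias score 1.0, skill_top score 0.20.
alias_hits = [RoleSignal("data-engineer", 1.0)]
skill_hits = [RoleSignal("data-engineer", 0.20)]
print(resolve_case_a(alias_hits, skill_hits))  # data-engineer
```

In this sketch a low skill score on the same role is not treated as a contradiction, which matches the wording of the reasoning line for this run.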
Job description
Job Description:
Profile: Senior Data Engineer / Data Lead
Location: Work From Home
Experience: 3-10 Years

WUElev8 is organising a 12 hour Online-interactive hiring hackathon called Next Pathway Hack Backpackers being presented by Next Pathway Inc. on 6th August 2022 that aims to solve a data problem and hire passionate and dedicated data enthusiasts and experienced professionals to join team as Senior Data Engineer / Data Lead / Data Consultant / Snowflake Consultant. We have already provided hiring opportunities to many talented professionals in our 15+ hackathons. Our work has also been recognized by Hon'ble PM of India, various government and non-government organisations.

Mode: Online Interactive

Guidelines:
- You can participate individually
- Every participant needs to register on the WUElev8 platform and apply for participation in the hackathon
- The mode of the hackathon is online interactive.
- You will work on the problem statement during the hackathon time only
- Based on your participation and solution, you will be screened further for the interview round.
- Offer letter can be released on the same day or next day of the interview
- Do not copy or do plagiarism for the solution. If found, you will be disqualified.

I am sure you might have got super excited by now! Then what are you waiting for? Hurry up & Register before it gets closed!

How to Participate:
- Signup/Login on the platform
- Register for the event
- A welcome email will be sent to you for the event with the further details

Sign up & Register Now Link: https://wuelev8.tech/drills/next-pathway-hack-backpackers

For more hiring & innovation hackathons stay connected with us:
LinkedIn Page: https://www.linkedin.com/company/wuelev8/
Website: https://wuelev8.tech/drill

Skills Required:
- Must have PySpark experience
- Hands-on knowledge on IICS tool
- Hands-on Knowledge on Snowflake
- hands-on knowledge on Synapse
- Hands-on knowledge on ADF
- Hands-on experience on any of the above tool/technology is expected with PySpark being the must to have skill

About Us:
WUElev8(Where you Elevate!) is a platform which empowers engineers to engage themselves in ongoing innovation journeys and thereby allowing them to elevate their careers to new heights.

The engineering undergrads and working professionals get the ‘value of their contributions’ on WUElev8 and which they can use in getting the best recommendations of best jobs, hiring challenges, hackathons which itself help them in elevating their career.

The platform best serves the organizations and startups which value the ongoing innovation and technology to scale their business and serve their customers by providing them the best talent enabling their innovation journeys and thereby elevating their businesses!
Skills from this JD
Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.
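One way to picture the per-skill merge described above is a join on the skill name across the three payloads. This is a rough sketch under assumptions: `merge_skill_rows` is a hypothetical helper, the sample inputs are trimmed copies of fields that appear in this run's API 1, API 2, and API 3 JSON, and the page's real merge logic is not shown in this report.

```python
def merge_skill_rows(api1_skills, api2_details, api3_tags):
    """Join API 1, API 2, and API 3 records on the skill name."""
    details = {d["input_skill"]: d for d in api2_details}
    tags = {t["skill"]: t["tag"] for t in api3_tags}
    rows = []
    for skill in api1_skills:
        name = skill["skill_name"]
        detail = details.get(name, {})
        rows.append({
            "skill": name,
            "is_primary": skill["is_primary"],
            "matched_via": detail.get("matched_via"),
            "dimensions": [d["dimension"]["slug"]
                           for d in detail.get("dimensions", [])],
            "tag": tags.get(name),
        })
    return rows


# Trimmed sample data copied from this run's payloads.
api1 = [
    {"skill_name": "PySpark", "is_primary": True},
    {"skill_name": "IICS", "is_primary": True},
    {"skill_name": "Snowflake", "is_primary": True},
]
api2 = [
    {"input_skill": "PySpark", "matched_via": "embedding_alias",
     "dimensions": [{"dimension": {"slug": "etl-and-elt-tooling"}}]},
    {"input_skill": "IICS", "matched_via": None, "dimensions": []},
    {"input_skill": "Snowflake", "matched_via": "alias",
     "dimensions": [{"dimension": {"slug": "cloud-data-warehouses"}}]},
]
api3 = [
    {"skill": "PySpark", "tag": "in_db"},
    {"skill": "IICS", "tag": "new"},
    {"skill": "Snowflake", "tag": "in_db"},
]

rows = merge_skill_rows(api1, api2, api3)
for row in rows:
    print(row)
```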
Aliases — catalog
- Apache Spark (CANONICAL)
- apache spark 3 (VERSION)
- spark (VERSION)
- spark 3 (VERSION)
- spark 3.x (VERSION)
- spark3 (VERSION)
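The API 2 payload for this run lists these aliases with match_strategy CASE_INSENSITIVE, and reports that the input term "PySpark" matched via embedding_alias rather than a direct alias hit. A minimal sketch of a case-insensitive alias lookup shows why: none of the stored alias strings equals "pyspark". `build_alias_index` and `lookup` are hypothetical helpers, and the embedding fallback is not modelled here.

```python
from typing import Optional

# Alias rows copied from the catalog list above: (alias_text, alias_type).
SPARK_ALIASES = [
    ("Apache Spark", "CANONICAL"),
    ("apache spark 3", "VERSION"),
    ("spark", "VERSION"),
    ("spark 3", "VERSION"),
    ("spark 3.x", "VERSION"),
    ("spark3", "VERSION"),
]


def build_alias_index(canonical: str,
                      aliases: list[tuple[str, str]]) -> dict[str, str]:
    # casefold() gives a case-insensitive key; strip() drops stray whitespace.
    return {text.strip().casefold(): canonical for text, _type in aliases}


def lookup(index: dict[str, str], term: str) -> Optional[str]:
    # Exact match on the normalised term; returns None when no alias matches.
    return index.get(term.strip().casefold())


index = build_alias_index("Apache Spark", SPARK_ALIASES)
print(lookup(index, "Spark 3.X"))  # Apache Spark
print(lookup(index, "PySpark"))    # None
```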
Context tags (catalog)
Stored enrichment (catalog DB)
- Category: Framework
- Sub-category: Distributed Data Processing Framework
- Vendor: Apache Software Foundation
- License: apache_2
- Year introduced: 2010
- Confidence: 0.94
- Version strategy: SEPARATE_ENTITY
- Version tag: 3.x
Maturity reasoning: Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it.
Skill profile (library / DB)
- Skill nature: FRAMEWORK
- Volatility: STABLE
- Typical lifespan: EVERGREEN
- Category id: 5
- Sub-category id: 1021
- Extractable: True
- Also category: False
Dimensions (API 2 worklist)
- ETL and ELT Tooling · Catalog dimension · db id 24
Library dimension (catalog)
Roles linked in library: Data Engineer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
| ETL and ELT Tooling (etl-and-elt-tooling) | — | — | Skipped — no persistable v3 meta for new skill (skill_not_in_db_v3_proposed) |
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Cloud Platforms
- Sub-category: general
- Skill nature: PLATFORM
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Aliases — catalog
- Snowflake (CANONICAL) primary
Context tags (catalog)
Stored enrichment (catalog DB)
- Category: Platform
- Sub-category: Data Cloud Platform
- Vendor: Snowflake Inc.
- License: proprietary
- Year introduced: 2012
- Confidence: 0.98
- Version strategy: NOT_APPLICABLE
Maturity reasoning: Snowflake appears frequently in data/analytics job postings and is a standard cloud data warehouse platform alongside BigQuery and Redshift.
Skill profile (library / DB)
- Skill nature: PLATFORM
- Volatility: STABLE
- Typical lifespan: EVERGREEN
- Category id: 9
- Sub-category id: 113
- Extractable: True
- Also category: False
Dimensions (API 2 worklist)
- Cloud Data Warehouses · Catalog dimension · db id 22
Library dimension (catalog)
Roles linked in library: Data Engineer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
| Cloud Data Warehouses (cloud-data-warehouses) | ✓ | ✓ | Existing dimension (library) · Role↔dimension saved |
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Cloud Platforms
- Sub-category: general
- Skill nature: PLATFORM
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Cloud Platforms
- Sub-category: general
- Skill nature: PLATFORM
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
All API 3 persistence rows
Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.
| Skill | Tag | Dimension | Skill↔dim | Role↔dim | Outcome | Notes |
|---|---|---|---|---|---|---|
| PySpark | new | ETL and ELT Tooling (etl-and-elt-tooling) | — | — | Skipped — no persistable v3 meta for new skill | skill_not_in_db_v3_proposed |
| Snowflake | in_db | Cloud Data Warehouses (cloud-data-warehouses) | ✓ | ✓ | Existing dimension (library) · Role↔dimension saved | |
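The grid above has one row per (skill × dimension) work item, which corresponds to the `persistence.items` array in the API 3 payload. A small sketch of that flattening, assuming the field names seen in this run's JSON; `persistence_rows` is a hypothetical helper and the sample items are trimmed copies of the two items in this run.

```python
def persistence_rows(items):
    """Flatten API 3 persistence items into one display row per work item."""
    rows = []
    for item in items:
        rows.append({
            "skill": item["input_skill"],
            "tag": item["skill_tag"],
            "dimension": item["dimension"]["slug"],
            "skill_dim": item["skill_dimension_saved"],
            "role_dim": item["role_dimension_saved"],
            "outcome": item["outcome_line"],
            # skipped_reason is null for saved items; show it as an empty note.
            "notes": item["skipped_reason"] or "",
        })
    return rows


# Trimmed copies of the two items in this run's API 3 payload.
ITEMS = [
    {
        "input_skill": "PySpark",
        "skill_tag": "new",
        "dimension": {"slug": "etl-and-elt-tooling"},
        "skill_dimension_saved": False,
        "role_dimension_saved": False,
        "outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
        "skipped_reason": "skill_not_in_db_v3_proposed",
    },
    {
        "input_skill": "Snowflake",
        "skill_tag": "in_db",
        "dimension": {"slug": "cloud-data-warehouses"},
        "skill_dimension_saved": True,
        "role_dimension_saved": True,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "skipped_reason": None,
    },
]

table = persistence_rows(ITEMS)
for row in table:
    print(row["skill"], row["tag"], row["dimension"], row["notes"])
```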
Library artifacts (this run)
| Kind | Detail | DB id |
|---|---|---|
| canonical_skill_proposed | IICS · type=Cloud Platforms subtype=general nature=PLATFORM lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Synapse · type=Cloud Platforms subtype=general nature=PLATFORM lifespan=MULTI_YEAR | |
| canonical_skill_proposed | ADF · type=Cloud Platforms subtype=general nature=PLATFORM lifespan=MULTI_YEAR | |
| dimension_skill_link_proposed | PySpark ↔ ETL and ELT Tooling | |
| role_dimension_link_proposed | Data Engineer ↔ ETL and ELT Tooling | |
nano JD Parser — gpt-4.1-nano
{
"JD_type": "pass",
"about_company": {
"source_marker": {
"first_5_words": "WUElev8(Where you Elevate!) is a",
"last_5_words": "elevating their businesses!"
},
"text": "WUElev8(Where you Elevate!) is a platform which empowers engineers to engage themselves in ongoing innovation journeys and thereby allowing them to elevate their careers to new heights.\n\nThe engineering undergrads and working professionals get the \u2018value of their contributions\u2019 on WUElev8 and which they can use in getting the best recommendations of best jobs, hiring challenges, hackathons which itself help them in elevating their career.\n\nThe platform best serves the organizations and startups which value the ongoing innovation and technology to scale their business and serve their customers by providing them the best talent enabling their innovation journeys and thereby elevating their businesses!",
"word_count": 84
},
"certifications": [],
"company_name": "WUElev8",
"ctc": null,
"domain": {
"primary": {
"aliases": [],
"domain": "Other"
},
"secondary": null
},
"education": [],
"experience": {
"max": 10,
"min": 3,
"raw": "3-10 Years"
},
"job_locations": [
{
"aliases": [],
"city": null,
"country": null,
"state": null,
"work_mode": "remote"
}
],
"role": "Senior Data Engineer / Data Lead",
"role_aliases": [
"Data Engineer",
"Data Lead",
"Data Consultant",
"Snowflake Consultant"
],
"role_archetype": "Data",
"roles_and_responsibilities": [
{
"bullet_count": 6,
"heading": "Skills Required",
"heading_was_present": true,
"source_marker": {
"first_5_words": "Must have PySpark experience Hands-on",
"last_5_words": "being the must to have skill"
},
"text": "Must have PySpark experience\nHands-on knowledge on IICS tool\nHands-on Knowledge on Snowflake\nhands-on knowledge on Synapse\nHands-on knowledge on ADF\nHands-on experience on any of the above tool/technology is expected with PySpark being the must to have skill",
"word_count": 43
}
],
"urls": [
{
"type": "other",
"url": "https://wuelev8.tech/drills/next-pathway-hack-backpackers"
},
{
"type": "linkedin",
"url": "https://www.linkedin.com/company/wuelev8/"
},
{
"type": "website",
"url": "https://wuelev8.tech/drill"
}
]
}
API 1 — extract-from-jd
{
"final_skills": [
{
"is_primary": true,
"skill_name": "PySpark"
},
{
"is_primary": true,
"skill_name": "IICS"
},
{
"is_primary": true,
"skill_name": "Snowflake"
},
{
"is_primary": true,
"skill_name": "Synapse"
},
{
"is_primary": true,
"skill_name": "ADF"
}
],
"jd_role": {
"display_name": "Senior Data Engineer / Data Lead",
"rationale": null,
"role_aliases": [
"Data Engineer",
"Data Lead",
"Data Consultant",
"Snowflake Consultant"
],
"role_archetype": "Data",
"slug": ""
},
"nano_parsed": {
"JD_type": "pass",
"about_company": {
"source_marker": {
"first_5_words": "WUElev8(Where you Elevate!) is a",
"last_5_words": "elevating their businesses!"
},
"text": "WUElev8(Where you Elevate!) is a platform which empowers engineers to engage themselves in ongoing innovation journeys and thereby allowing them to elevate their careers to new heights.\n\nThe engineering undergrads and working professionals get the \u2018value of their contributions\u2019 on WUElev8 and which they can use in getting the best recommendations of best jobs, hiring challenges, hackathons which itself help them in elevating their career.\n\nThe platform best serves the organizations and startups which value the ongoing innovation and technology to scale their business and serve their customers by providing them the best talent enabling their innovation journeys and thereby elevating their businesses!",
"word_count": 84
},
"certifications": [],
"company_name": "WUElev8",
"ctc": null,
"domain": {
"primary": {
"aliases": [],
"domain": "Other"
},
"secondary": null
},
"education": [],
"experience": {
"max": 10,
"min": 3,
"raw": "3-10 Years"
},
"job_locations": [
{
"aliases": [],
"city": null,
"country": null,
"state": null,
"work_mode": "remote"
}
],
"role": "Senior Data Engineer / Data Lead",
"role_aliases": [
"Data Engineer",
"Data Lead",
"Data Consultant",
"Snowflake Consultant"
],
"role_archetype": "Data",
"roles_and_responsibilities": [
{
"bullet_count": 6,
"heading": "Skills Required",
"heading_was_present": true,
"source_marker": {
"first_5_words": "Must have PySpark experience Hands-on",
"last_5_words": "being the must to have skill"
},
"text": "Must have PySpark experience\nHands-on knowledge on IICS tool\nHands-on Knowledge on Snowflake\nhands-on knowledge on Synapse\nHands-on knowledge on ADF\nHands-on experience on any of the above tool/technology is expected with PySpark being the must to have skill",
"word_count": 43
}
],
"urls": [
{
"type": "other",
"url": "https://wuelev8.tech/drills/next-pathway-hack-backpackers"
},
{
"type": "linkedin",
"url": "https://www.linkedin.com/company/wuelev8/"
},
{
"type": "website",
"url": "https://wuelev8.tech/drill"
}
]
},
"rejected": false,
"rejection_reason": null,
"run_id": "69a5b208-d3d6-459c-8c8a-42ea615ee412",
"stage3_signals": {
"alias_found": true,
"alias_match_roles": [
{
"display_name": "Data Engineer",
"kra_matches": null,
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 1.0,
"slug": "data-engineer",
"total_count": null
}
],
"kra_match_roles": [
{
"display_name": "Data Engineer",
"kra_matches": [
{
"kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
"sentence": "Hands-on experience on any of the above tool/technology is expected with PySpark being the must to have skill",
"similarity": 0.4584
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 0.4584,
"slug": "data-engineer",
"total_count": null
},
{
"display_name": "ML Engineer",
"kra_matches": [
{
"kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
"sentence": "Hands-on experience on any of the above tool/technology is expected with PySpark being the must to have skill",
"similarity": 0.3678
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 3,
"score": 0.3678,
"slug": "ml-engineer",
"total_count": null
},
{
"display_name": "AI Engineer",
"kra_matches": [
{
"kra_text": "Designs and implements prompt engineering workflows, few-shot examples, chain-of-thought patterns, and structured output parsing for AI feature pipelines.",
"sentence": "Hands-on experience on any of the above tool/technology is expected with PySpark being the must to have skill",
"similarity": 0.348
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 13,
"score": 0.348,
"slug": "ai-engineer",
"total_count": null
},
{
"display_name": "Fullstack Developer",
"kra_matches": [
{
"kra_text": "Implements complete product features end-to-end from database schema design through backend API to frontend UI using JavaScript, TypeScript, Python, or Ruby on Rails.",
"sentence": "Hands-on experience on any of the above tool/technology is expected with PySpark being the must to have skill",
"similarity": 0.3267
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 15,
"score": 0.3267,
"slug": "full-stack-engineer",
"total_count": null
},
{
"display_name": "MLOps Engineer",
"kra_matches": [
{
"kra_text": "Supports ML platform incidents by diagnosing model serving failures, feature store pipeline breaks, and training environment configuration issues.",
"sentence": "Hands-on experience on any of the above tool/technology is expected with PySpark being the must to have skill",
"similarity": 0.299
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 16,
"score": 0.299,
"slug": "ml-ops-engineer",
"total_count": null
}
],
"skill_match_roles": [
{
"display_name": "Data Engineer",
"kra_matches": null,
"matched_count": 1,
"matched_skills": [
"Snowflake"
],
"role_id": 2,
"score": 0.2,
"slug": "data-engineer",
"total_count": 5
}
]
},
"stage4_decision": {
"alias_collision_detected": false,
"case": "A",
"chosen_role": {
"display_name": "Data Engineer",
"kra_matches": null,
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 1.0,
"slug": "data-engineer",
"total_count": null
},
"confidence": 1.0,
"is_new_role": false,
"llm2_fired": false,
"llm2_reasoning": null,
"matched_dimensions": [],
"matched_kras": [],
"matched_skills": [],
"new_role_display_name": null,
"new_role_slug": null,
"queued": false,
"reasoning": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.20 does not contradict",
"sub_role": null
},
"stage5_updates": {
"centroid_n_after": 121,
"centroid_updated": true,
"collision_log_id": null,
"new_kra_attached": null,
"new_skills_attached": [
{
"is_primary": true,
"queue_id": 6768,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "PySpark",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 6769,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "IICS",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 6770,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "Synapse",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 6771,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "ADF",
"status": "pending"
}
],
"queue_entry_id": null,
"v3_pipeline_triggered": false,
"v3_role_slug": null,
"v3_run_id": null
}
}
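In the payload above, the `skill_match_roles` entry for Data Engineer reports matched_count 1, total_count 5, and score 0.2, which is consistent with a simple matched/total ratio over the JD's extracted skills. The sketch below reproduces that arithmetic under that assumption. `skill_overlap` is a hypothetical helper, and the role's skill set is illustrative: only "Snowflake" is confirmed by this run, while "Airflow" and "dbt" are placeholders.

```python
def skill_overlap(jd_skills: list[str], role_skills: set[str]) -> dict:
    """Score a role by the fraction of JD skills found in its skill set."""
    known = {s.casefold() for s in role_skills}
    matched = [s for s in jd_skills if s.casefold() in known]
    total = len(jd_skills)
    score = len(matched) / total if total else 0.0
    return {
        "matched_skills": matched,
        "matched_count": len(matched),
        "total_count": total,
        "score": score,
    }


# JD skills from this run; the role skill set is a placeholder assumption.
jd_skills = ["PySpark", "IICS", "Snowflake", "Synapse", "ADF"]
role_skills = {"Snowflake", "Airflow", "dbt"}

result = skill_overlap(jd_skills, role_skills)
print(result)
# {'matched_skills': ['Snowflake'], 'matched_count': 1, 'total_count': 5, 'score': 0.2}
```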
API 2 — extract-details
{
"alias_matches": [
{
"alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
"alias_persisted": false,
"existing_alias_id": 2004,
"existing_alias_text": "Apache Spark",
"input_term": "PySpark",
"matched_canonical": {
"category_id": 5,
"display_name": "Apache Spark",
"id": 1350,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "FRAMEWORK",
"slug": "apache-spark",
"sub_category_id": 1021,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "embedding_alias"
},
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 299,
"existing_alias_text": "Snowflake",
"input_term": "Snowflake",
"matched_canonical": {
"category_id": 9,
"display_name": "Snowflake",
"id": 105,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "PLATFORM",
"slug": "snowflake",
"sub_category_id": 113,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
}
],
"candidate_roles": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
],
"chosen_role": {
"display_name": "Data Engineer",
"id": 2,
"rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.20 does not contradict",
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"input_skill": "PySpark",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Data Warehouses",
"id": 22,
"rationale": "Managed analytical storage and compute platforms used for curated datasets, reporting, and downstream analytics. These systems are central to data modeling, performance tuning, and cost-aware query design.",
"slug": "cloud-data-warehouses",
"source": "db"
},
"input_skill": "Snowflake",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
}
],
"input_final_skills": [
"PySpark",
"IICS",
"Snowflake",
"Synapse",
"ADF"
],
"input_llm_skills": [
"PySpark",
"IICS",
"Snowflake",
"Synapse",
"ADF"
],
"new_aliases_persisted": 0,
"run_id": "69a5b208-d3d6-459c-8c8a-42ea615ee412",
"skills_detail": [
{
"aliases_in_db": [
{
"alias_text": "Apache Spark",
"alias_type": "CANONICAL",
"id": 2004,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "apache spark 3",
"alias_type": "VERSION",
"id": 2006,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark",
"alias_type": "VERSION",
"id": 2510,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark 3",
"alias_type": "VERSION",
"id": 2007,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark 3.x",
"alias_type": "VERSION",
"id": 2009,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark3",
"alias_type": "VERSION",
"id": 2008,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 5,
"display_name": "Apache Spark",
"id": 1350,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "FRAMEWORK",
"slug": "apache-spark",
"sub_category_id": 1021,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"input_skill": "PySpark",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
}
],
"input_skill": "PySpark",
"matched_via": "embedding_alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "IICS",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Cloud Platforms",
"skill_nature": "PLATFORM",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "iics",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "Snowflake",
"alias_type": "CANONICAL",
"id": 299,
"is_primary": true,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 9,
"display_name": "Snowflake",
"id": 105,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "PLATFORM",
"slug": "snowflake",
"sub_category_id": 113,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Data Warehouses",
"id": 22,
"rationale": "Managed analytical storage and compute platforms used for curated datasets, reporting, and downstream analytics. These systems are central to data modeling, performance tuning, and cost-aware query design.",
"slug": "cloud-data-warehouses",
"source": "db"
},
"input_skill": "Snowflake",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
}
],
"input_skill": "Snowflake",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Synapse",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Cloud Platforms",
"skill_nature": "PLATFORM",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "synapse",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "ADF",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Cloud Platforms",
"skill_nature": "PLATFORM",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "adf",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
}
],
"unmatched_skills": [
"IICS",
"Synapse",
"ADF"
]
}
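In the payload above, the three entries in `unmatched_skills` are exactly the `skills_detail` entries whose `canonical` is null. A minimal sketch of that relationship follows. `unmatched_skills` is a hypothetical helper, the sample is trimmed to the two fields it needs, and the real API may derive the list differently.

```python
def unmatched_skills(skills_detail):
    """List input skills that resolved to no canonical library skill."""
    return [d["input_skill"] for d in skills_detail if d["canonical"] is None]


# Trimmed copy of this run's skills_detail (only the fields used here).
SKILLS_DETAIL = [
    {"input_skill": "PySpark", "canonical": {"slug": "apache-spark"}},
    {"input_skill": "IICS", "canonical": None},
    {"input_skill": "Snowflake", "canonical": {"slug": "snowflake"}},
    {"input_skill": "Synapse", "canonical": None},
    {"input_skill": "ADF", "canonical": None},
]

print(unmatched_skills(SKILLS_DETAIL))  # ['IICS', 'Synapse', 'ADF']
```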
API 3 — final-role-output
{
"chosen_role": {
"display_name": "Data Engineer",
"id": 2,
"rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.20 does not contradict",
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
"chosen_role_resolution": "in_db",
"final_input_skills": [
{
"skill": "PySpark",
"tag": "in_db"
},
{
"skill": "IICS",
"tag": "new"
},
{
"skill": "Snowflake",
"tag": "in_db"
},
{
"skill": "Synapse",
"tag": "new"
},
{
"skill": "ADF",
"tag": "new"
}
],
"llm_cost_api1_usd": null,
"llm_cost_api2_usd": null,
"llm_cost_api3_usd": null,
"llm_cost_total_usd": null,
"persistence": {
"items": [
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"dimension_id": 24,
"input_skill": "PySpark",
"llm_role": null,
"matched_chosen_role": true,
"outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
],
"skill_dimension_saved": false,
"skill_id": null,
"skill_tag": "new",
"skipped_reason": "skill_not_in_db_v3_proposed"
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Data Warehouses",
"id": 22,
"rationale": "Managed analytical storage and compute platforms used for curated datasets, reporting, and downstream analytics. These systems are central to data modeling, performance tuning, and cost-aware query design.",
"slug": "cloud-data-warehouses",
"source": "db"
},
"dimension_id": 22,
"input_skill": "Snowflake",
"llm_role": null,
"matched_chosen_role": true,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
"role_dimension_saved": true,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 105,
"skill_tag": "in_db",
"skipped_reason": null
}
],
"new_skills_created": 0,
"role_dimension_saved": 0,
"skill_dimension_saved": 0,
"skipped": 1
},
"planner_output": null,
"run_id": "69a5b208-d3d6-459c-8c8a-42ea615ee412"
}
LLM Calls
Every model call made for this run, in pipeline order.