Pipeline run
e7cc83d4-091c-4d28-b237-a18cae3b08f1
Client output enrichment
v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description
vocab breakdown (legacy)
Signals
Post-classification
Captured for admin review
1 POST /skills/extract-from-jd
2 POST /skills/extract-details
3 POST /skills/final-role-output
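The three calls above run in order, and the same `run_id` appears in every response for this run. A minimal sketch of that chaining follows. Only the endpoint paths and the `run_id` / `final_skills` response keys come from this page; the `post` callable and the request field names (`jd_text`, `skills`, `details`) are hypothetical placeholders, not the documented request contract.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: takes (path, payload) and returns the decoded JSON body.
Post = Callable[[str, Dict[str, Any]], Dict[str, Any]]

# Endpoint paths as listed for this run.
STEPS = (
    "/skills/extract-from-jd",
    "/skills/extract-details",
    "/skills/final-role-output",
)


def run_pipeline(post: Post, jd_text: str) -> Dict[str, Any]:
    """Call the three endpoints in order, threading run_id through each step."""
    # Request field names below are assumptions made for illustration.
    api1 = post(STEPS[0], {"jd_text": jd_text})
    skills = [s["skill_name"] for s in api1["final_skills"]]
    api2 = post(STEPS[1], {"run_id": api1["run_id"], "skills": skills})
    api3 = post(STEPS[2], {"run_id": api1["run_id"], "details": api2})
    return {"api1": api1, "api2": api2, "api3": api3}
```

Passing the transport in as a callable keeps the sketch independent of any HTTP client and makes it easy to replay a captured run.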
Data Engineer
domain · Data Engineering & Analytics · CASE DOMAIN
slug: data-engineer · id: 2 · source: db
Domain=Data Engineering & Analytics; The JD centers on Spark/PySpark, structured data, and cloud exposure, which aligns best with data engineering responsibilities.
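Matched skills: Spark, Pyspark, structured data, cloud, big data
Matched dimensions: Big Data Processing, Cloud Data Engineering, Structured Data Handling
Matched KRAs: Excellent in Spark/Pyspark; Experience working with structured data; Exposure to cloud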
Resolution: in_db (role exists in library; skill↔dim and role↔dim links saved when applicable).
Job description
Experience: 4+ yrs
Job Location: Bangalore
Notice Period: Immediate to 20 days Joining
Mandatory skills:
- Excellent in Spark/Pyspark
- Experience working with structured data
- Exposure to cloud
- Bid data will be a big plus
About US:
Thank you for expressing interest in a career with TEKsystems Global Services (TGS). At TEKsystems Global Services, we believe in lasting careers with room to run. Giving our people limitless opportunity to make an impact and enable the world’s largest companies to transform how they do business. We’re the professional services division of TEKsystems, accounting for over $1 Billion in revenue. We’re one of India’s fastest growing full-stack services companies, with about 5500 full time employees across the globe of which 2000 are in India (Bangalore and Hyderabad). TGS operates through multiple solution centers across North America, EMEA and APAC, including locations like Dallas, Redmond, Bloomington, Baltimore, Maryland, Europe (Amsterdam, London), Canada (Montreal), and Philippines (Manila).
To support key areas of our growth, we acquired two companies to join our family. One North, a full-service digital agency and 1Strategy a premiere AWS service provider. Through One North we’re able to help our customers elevate their customer, brand and UI/UX experiences. And, through 1Strategy, we’re able to ensure our customers can take full advantage of the complete AWS solutions portfolio.
Certifications
Joining the elite team at TEKsystems Global Services, gives you runway to grow with us. Upskill faster. Surpass your peers. Even earning certifications:
- ISO 27001
- HITRUST Certification
- PCI DSS Certification
- HIPPA Compliance
- PMP certified Project Managers
Partnership
The world’s leading technology and software providers partner with us because of our scale, full-stack capabilities and speed—giving you the room to specialize and sharpen your skills on the most innovative platforms and game-changing technology.
- Amazon Webservices Advanced Consulting Partner
- Microsoft Gold Partner
- Google Cloud Premier Partner
- Other top partnerships – Snowflake, RedHat, Oracle platinum partner, Salesforce Gold partner, ServiceNow Managed Service Provider and reseller, MuleSoft, Tableau system Integrator, SailPoint Systems, Cloudera Specialized, Hortornworks Community Partner
TGS offers a wide range of IT services including but not limited to delivering high end business consulting services and building applications. This is done through multiple centers of excellence including:
- Data Analytics
- Data Insights
- Enterprise Integration
- Enterprise cloud application (Salesforce and Oracle)
- Transformation Operation Management
- Continuous Development
- Continuous Testing
- Transformation Devops Cloud
For more details, please visit us on
https://www.teksystems.com/
https://www.teksystems.com/en-in/services
https://www.teksystems.com/careers-in-india
https://www.linkedin.com/company/teksystems-global-services-india/mycompany/
Skills from this JD
Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.
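One way to picture the merge described above is a join on the skill name across the three responses. The sketch below is an illustration only: the field names (`final_skills`, `skills_detail`, `persistence.items`, `input_skill`, `matched_via`, `skill_tag`) are taken from the raw payloads on this page, while the function name and the shape of the merged row are assumptions.

```python
from typing import Any, Dict, List


def merge_skill_rows(
    api1: Dict[str, Any], api2: Dict[str, Any], api3: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Join API 1 skills with API 2 details and API 3 persistence tags by skill name."""
    details = {d["input_skill"]: d for d in api2.get("skills_detail", [])}

    # A skill can have several persistence items (one per dimension).
    items: Dict[str, List[Dict[str, Any]]] = {}
    for item in api3.get("persistence", {}).get("items", []):
        items.setdefault(item["input_skill"], []).append(item)

    rows = []
    for skill in api1.get("final_skills", []):
        name = skill["skill_name"]
        detail = details.get(name, {})
        canonical = detail.get("canonical") or {}
        rows.append(
            {
                "skill": name,
                "is_primary": skill.get("is_primary"),
                "canonical": canonical.get("display_name"),
                "matched_via": detail.get("matched_via"),
                "tags": [it.get("skill_tag") for it in items.get(name, [])],
            }
        )
    return rows
```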
Aliases — catalog
- Apache Spark (CANONICAL)
- apache spark 3 (VERSION)
- spark (VERSION)
- spark 3 (VERSION)
- spark 3.x (VERSION)
- spark3 (VERSION)
Context tags (catalog)
Stored enrichment (catalog DB)
- Category
- Framework
- Sub-category
- Distributed Data Processing Framework
- Vendor
- Apache Software Foundation
- License
- apache_2
- Year introduced
- 2010
- Confidence
- 0.94
- Version strategy
- SEPARATE_ENTITY
- Version tag
- 3.x
Maturity reasoning: Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it.
Skill profile (library / DB)
- Skill nature
- FRAMEWORK
- Volatility
- STABLE
- Typical lifespan
- EVERGREEN
- Category id
- 5
- Sub-category id
- 1021
- Extractable
- True
- Also category
- False
Dimensions (API 2 worklist)
-
ETL and ELT Tooling Catalog dimension db id 24
Library dimension (catalog)
Roles linked in library: Data Engineer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
| ETL and ELT Tooling (etl-and-elt-tooling) | ✓ | ✓ | Existing dimension (library) · Role↔dimension saved |
Aliases — catalog
- Apache Spark (CANONICAL)
- apache spark 3 (VERSION)
- spark (VERSION)
- spark 3 (VERSION)
- spark 3.x (VERSION)
- spark3 (VERSION)
Context tags (catalog)
Stored enrichment (catalog DB)
- Category
- Framework
- Sub-category
- Distributed Data Processing Framework
- Vendor
- Apache Software Foundation
- License
- apache_2
- Year introduced
- 2010
- Confidence
- 0.94
- Version strategy
- SEPARATE_ENTITY
- Version tag
- 3.x
Maturity reasoning: Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it.
Skill profile (library / DB)
- Skill nature
- FRAMEWORK
- Volatility
- STABLE
- Typical lifespan
- EVERGREEN
- Category id
- 5
- Sub-category id
- 1021
- Extractable
- True
- Also category
- False
Dimensions (API 2 worklist)
-
ETL and ELT Tooling Catalog dimension db id 24
Library dimension (catalog)
Roles linked in library: Data Engineer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
| ETL and ELT Tooling (etl-and-elt-tooling) | — | — | Skipped — no persistable v3 meta for new skill (skill_not_in_db_v3_proposed) |
All API 3 persistence rows
Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.
| Skill | Tag | Dimension | Skill↔dim | Role↔dim | Outcome | Notes |
|---|---|---|---|---|---|---|
| Spark | in_db | ETL and ELT Tooling (etl-and-elt-tooling) | ✓ | ✓ | Existing dimension (library) · Role↔dimension saved | |
| PySpark | new | ETL and ELT Tooling (etl-and-elt-tooling) | — | — | Skipped — no persistable v3 meta for new skill | skill_not_in_db_v3_proposed |
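The grid above has one row per (skill × dimension) work item, so it can be derived directly from `persistence.items` in the API 3 response. A rough sketch, assuming the column mapping below (inferred from comparing the table with the raw JSON, not from the UI source) and a hypothetical `persistence_grid` helper:

```python
from typing import Any, Dict, List

HEADER = ["Skill", "Tag", "Dimension", "Skill↔dim", "Role↔dim", "Outcome", "Notes"]


def mark(saved: bool) -> str:
    """Render a saved flag the way the grid does: a tick, or a dash when not saved."""
    return "✓" if saved else "\u2014"


def persistence_grid(items: List[Dict[str, Any]]) -> str:
    """Build a pipe table with one row per persistence item."""
    lines = ["| " + " | ".join(HEADER) + " |", "|" + "---|" * len(HEADER)]
    for item in items:
        dim = item["dimension"]
        cells = [
            item["input_skill"],
            item["skill_tag"],
            f'{dim["display_name"]} ({dim["slug"]})',
            mark(bool(item["skill_dimension_saved"])),
            mark(bool(item["role_dimension_saved"])),
            item["outcome_line"],
            item.get("skipped_reason") or "",  # Notes column: skip reason, if any
        ]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)
```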
Library artifacts (this run)
| Kind | Detail | DB id |
|---|---|---|
| dimension_skill_link_proposed | PySpark ↔ ETL and ELT Tooling | |
| role_dimension_link_proposed | Data Engineer ↔ ETL and ELT Tooling | |
nano JD Parser — gpt-4.1-nano
{
"JD_type": "pass",
"about_company": {
"source_marker": {
"first_5_words": "Thank you for expressing interest",
"last_5_words": "full advantage of the complete AWS solutions portfolio."
},
"text": "Thank you for expressing interest in a career with TEKsystems Global Services (TGS). At TEKsystems Global Services, we believe in lasting careers with room to run. Giving our people limitless opportunity to make an impact and enable the world\u2019s largest companies to transform how they do business. We\u2019re the professional services division of TEKsystems, accounting for over $1 Billion in revenue. We\u2019re one of India\u2019s fastest growing full-stack services companies, with about 5500 full time employees across the globe of which 2000 are in India (Bangalore and Hyderabad). TGS operates through multiple solution centers across North America, EMEA and APAC, including locations like Dallas, Redmond, Bloomington, Baltimore, Maryland, Europe (Amsterdam, London), Canada (Montreal), and Philippines (Manila). To support key areas of our growth, we acquired two companies to join our family. One North, a full-service digital agency and 1Strategy a premiere AWS service provider. Through One North we\u2019re able to help our customers elevate their customer, brand and UI/UX experiences. And, through 1Strategy, we\u2019re able to ensure our customers can take full advantage of the complete AWS solutions portfolio.",
"word_count": 164
},
"archetype_override_applied": true,
"archetype_override_matched_skills": [
"Snowflake",
"Tableau",
"AWS",
"Make",
"DevOps",
"UI/UX",
"Analytics",
"Cloud",
"ISO 27001",
"Room",
"Provider",
"Location",
"PCI DSS"
],
"certifications": [
"ISO 27001",
"HITRUST Certification",
"PCI DSS Certification",
"HIPPA Compliance",
"PMP certified Project Managers"
],
"company_name": "TEKsystems Global Services",
"ctc": null,
"domain": {
"primary": {
"aliases": [
"ITES",
"BPO"
],
"domain": "IT Services \u0026 Consulting"
},
"secondary": null
},
"education": [],
"experience": {
"max": null,
"min": 4,
"raw": "4+ yrs"
},
"job_locations": [
{
"aliases": [
"Bengaluru"
],
"city": "Bangalore",
"country": "India",
"state": null,
"work_mode": null
}
],
"role": null,
"role_aliases": [],
"role_archetype": "Engineering",
"roles_and_responsibilities": [
{
"bullet_count": 4,
"heading": "Mandatory skills",
"heading_was_present": true,
"source_marker": {
"first_5_words": "Excellent in Spark/Pyspark Experience",
"last_5_words": "data will be a big plus"
},
"text": "Excellent in Spark/Pyspark\nExperience working with structured data\nExposure to cloud\nBid data will be a big plus",
"word_count": 24
}
],
"urls": [
{
"type": "website",
"url": "https://www.teksystems.com/"
},
{
"type": "other",
"url": "https://www.teksystems.com/en-in/services"
},
{
"type": "careers",
"url": "https://www.teksystems.com/careers-in-india"
},
{
"type": "linkedin",
"url": "https://www.linkedin.com/company/teksystems-global-services-india/mycompany/"
}
]
}
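The `experience` object above maps the raw string "4+ yrs" to `min: 4` and `max: null`. The parser on this page is an LLM, so the following is only a deterministic cross-check one might run against its output, not how the field is actually produced. The regular expression, the handling of ranges such as "3-5 yrs", and the function name are all assumptions.

```python
import re
from typing import Any, Dict

# Matches "4+ yrs", "3-5 years", "3 to 5 yrs", "5 years" (patterns are assumptions).
_EXPERIENCE = re.compile(
    r"(\d+)\s*(?:(\+)|(?:-|to)\s*(\d+))?\s*(?:yrs?|years?)", re.IGNORECASE
)


def parse_experience(raw: str) -> Dict[str, Any]:
    """Return {"min", "max", "raw"} in the same shape as the parsed JD field."""
    match = _EXPERIENCE.search(raw)
    if not match:
        return {"min": None, "max": None, "raw": raw}
    low = int(match.group(1))
    # An open-ended "4+" or a bare "5 years" leaves max unset in this sketch.
    high = int(match.group(3)) if match.group(3) else None
    return {"min": low, "max": high, "raw": raw}
```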
API 1 — extract-from-jd
{
"final_skills": [
{
"is_primary": true,
"skill_name": "Spark"
},
{
"is_primary": true,
"skill_name": "PySpark"
}
],
"jd_role": null,
"nano_parsed": {
"JD_type": "pass",
"about_company": {
"source_marker": {
"first_5_words": "Thank you for expressing interest",
"last_5_words": "full advantage of the complete AWS solutions portfolio."
},
"text": "Thank you for expressing interest in a career with TEKsystems Global Services (TGS). At TEKsystems Global Services, we believe in lasting careers with room to run. Giving our people limitless opportunity to make an impact and enable the world\u2019s largest companies to transform how they do business. We\u2019re the professional services division of TEKsystems, accounting for over $1 Billion in revenue. We\u2019re one of India\u2019s fastest growing full-stack services companies, with about 5500 full time employees across the globe of which 2000 are in India (Bangalore and Hyderabad). TGS operates through multiple solution centers across North America, EMEA and APAC, including locations like Dallas, Redmond, Bloomington, Baltimore, Maryland, Europe (Amsterdam, London), Canada (Montreal), and Philippines (Manila). To support key areas of our growth, we acquired two companies to join our family. One North, a full-service digital agency and 1Strategy a premiere AWS service provider. Through One North we\u2019re able to help our customers elevate their customer, brand and UI/UX experiences. And, through 1Strategy, we\u2019re able to ensure our customers can take full advantage of the complete AWS solutions portfolio.",
"word_count": 164
},
"archetype_override_applied": true,
"archetype_override_matched_skills": [
"Snowflake",
"Tableau",
"AWS",
"Make",
"DevOps",
"UI/UX",
"Analytics",
"Cloud",
"ISO 27001",
"Room",
"Provider",
"Location",
"PCI DSS"
],
"certifications": [
"ISO 27001",
"HITRUST Certification",
"PCI DSS Certification",
"HIPPA Compliance",
"PMP certified Project Managers"
],
"company_name": "TEKsystems Global Services",
"ctc": null,
"domain": {
"primary": {
"aliases": [
"ITES",
"BPO"
],
"domain": "IT Services \u0026 Consulting"
},
"secondary": null
},
"education": [],
"experience": {
"max": null,
"min": 4,
"raw": "4+ yrs"
},
"job_locations": [
{
"aliases": [
"Bengaluru"
],
"city": "Bangalore",
"country": "India",
"state": null,
"work_mode": null
}
],
"role": null,
"role_aliases": [],
"role_archetype": "Engineering",
"roles_and_responsibilities": [
{
"bullet_count": 4,
"heading": "Mandatory skills",
"heading_was_present": true,
"source_marker": {
"first_5_words": "Excellent in Spark/Pyspark Experience",
"last_5_words": "data will be a big plus"
},
"text": "Excellent in Spark/Pyspark\nExperience working with structured data\nExposure to cloud\nBid data will be a big plus",
"word_count": 24
}
],
"urls": [
{
"type": "website",
"url": "https://www.teksystems.com/"
},
{
"type": "other",
"url": "https://www.teksystems.com/en-in/services"
},
{
"type": "careers",
"url": "https://www.teksystems.com/careers-in-india"
},
{
"type": "linkedin",
"url": "https://www.linkedin.com/company/teksystems-global-services-india/mycompany/"
}
]
},
"rejected": false,
"rejection_reason": null,
"run_id": "e7cc83d4-091c-4d28-b237-a18cae3b08f1",
"stage3_signals": {
"alias_found": false,
"alias_match_roles": [],
"kra_match_roles": [
{
"display_name": "Data Engineer",
"kra_matches": [
{
"kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
"sentence": "Excellent in Spark/Pyspark\nExperience working with structured data\nExposure to cloud\nBid data will be a big plus",
"similarity": 0.5312
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 0.5312,
"slug": "data-engineer",
"total_count": null
},
{
"display_name": "ML Engineer",
"kra_matches": [
{
"kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
"sentence": "Excellent in Spark/Pyspark\nExperience working with structured data\nExposure to cloud\nBid data will be a big plus",
"similarity": 0.4005
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 3,
"score": 0.4005,
"slug": "ml-engineer",
"total_count": null
},
{
"display_name": "Fullstack Developer",
"kra_matches": [
{
"kra_text": "Designs and queries relational databases like PostgreSQL and document stores like MongoDB, writing migrations, indexes, and optimized queries.",
"sentence": "Excellent in Spark/Pyspark\nExperience working with structured data\nExposure to cloud\nBid data will be a big plus",
"similarity": 0.3796
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 15,
"score": 0.3796,
"slug": "full-stack-engineer",
"total_count": null
},
{
"display_name": "Svelte Frontend Developer",
"kra_matches": [
{
"kra_text": "backend data integration",
"sentence": "Excellent in Spark/Pyspark\nExperience working with structured data\nExposure to cloud\nBid data will be a big plus",
"similarity": 0.3763
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 92,
"score": 0.3763,
"slug": "svelte-frontend-developer",
"total_count": null
},
{
"display_name": "AI Engineer",
"kra_matches": [
{
"kra_text": "Designs and implements prompt engineering workflows, few-shot examples, chain-of-thought patterns, and structured output parsing for AI feature pipelines.",
"sentence": "Excellent in Spark/Pyspark\nExperience working with structured data\nExposure to cloud\nBid data will be a big plus",
"similarity": 0.3653
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 13,
"score": 0.3653,
"slug": "ai-engineer",
"total_count": null
}
],
"skill_match_roles": [
{
"display_name": "Data Engineer",
"kra_matches": null,
"matched_count": 1,
"matched_skills": [
"Apache Spark"
],
"role_id": 2,
"score": 0.5,
"slug": "data-engineer",
"total_count": 2
}
]
},
"stage4_decision": {
"alias_collision_detected": false,
"case": "DOMAIN",
"chosen_role": {
"display_name": "Data Engineer",
"kra_matches": null,
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 0.97,
"slug": "data-engineer",
"total_count": null
},
"confidence": 0.97,
"is_new_role": false,
"llm2_fired": false,
"llm2_reasoning": null,
"matched_dimensions": [
"Big Data Processing",
"Cloud Data Engineering",
"Structured Data Handling"
],
"matched_kras": [
"Excellent in Spark/Pyspark",
"Experience working with structured data",
"Exposure to cloud"
],
"matched_skills": [
"Spark",
"Pyspark",
"structured data",
"cloud",
"big data"
],
"new_role_display_name": null,
"new_role_slug": null,
"queued": false,
"reasoning": "Domain=Data Engineering \u0026 Analytics; The JD centers on Spark/PySpark, structured data, and cloud exposure, which aligns best with data engineering responsibilities.",
"sub_role": null
},
"stage5_updates": {
"centroid_n_after": 375,
"centroid_updated": true,
"collision_log_id": null,
"new_kra_attached": null,
"new_skills_attached": [
{
"is_primary": true,
"queue_id": 17758,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "PySpark",
"status": "pending"
}
],
"queue_entry_id": null,
"v3_pipeline_triggered": false,
"v3_role_slug": null,
"v3_run_id": null
}
}
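In `stage3_signals` above, the skill-match score of 0.5 is consistent with `matched_count / total_count` (1 of 2), and each KRA role score equals the similarity of its single KRA match. The sketch below reproduces those numbers. How the pipeline combines several KRA matches for one role is not visible in this run, so taking the maximum is an assumption, as are the function names.

```python
from typing import Any, Dict, List


def skill_match_score(matched_count: int, total_count: int) -> float:
    """Fraction of the JD's skills that the role already has linked."""
    return matched_count / total_count if total_count else 0.0


def kra_role_score(kra_matches: List[Dict[str, Any]]) -> float:
    """Best KRA similarity for a role (max is an assumed aggregation)."""
    return max((m["similarity"] for m in kra_matches), default=0.0)


def rank_roles(kra_match_roles: List[Dict[str, Any]]) -> List[str]:
    """Return role slugs ordered from strongest to weakest KRA signal."""
    ranked = sorted(
        kra_match_roles,
        key=lambda role: kra_role_score(role["kra_matches"]),
        reverse=True,
    )
    return [role["slug"] for role in ranked]
```

These stage 3 scores are only candidate signals: the stage 4 decision reports its own confidence of 0.97 for the chosen role.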
API 2 — extract-details
{
"alias_matches": [
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 2510,
"existing_alias_text": "spark",
"input_term": "Spark",
"matched_canonical": {
"category_id": 5,
"display_name": "Apache Spark",
"id": 1350,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "FRAMEWORK",
"slug": "apache-spark",
"sub_category_id": 1021,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
},
{
"alias_persist_skipped_reason": "TODO: REMOVE AFTER TESTING \u2014 alias DB write disabled",
"alias_persisted": false,
"existing_alias_id": 2004,
"existing_alias_text": "Apache Spark",
"input_term": "PySpark",
"matched_canonical": {
"category_id": 5,
"display_name": "Apache Spark",
"id": 1350,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "FRAMEWORK",
"slug": "apache-spark",
"sub_category_id": 1021,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "embedding_alias"
}
],
"candidate_roles": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
],
"chosen_role": {
"display_name": "Data Engineer",
"id": 2,
"rationale": "Domain=Data Engineering \u0026 Analytics; The JD centers on Spark/PySpark, structured data, and cloud exposure, which aligns best with data engineering responsibilities.",
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"input_skill": "Spark",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"input_skill": "PySpark",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
}
],
"input_final_skills": [
"Spark",
"PySpark"
],
"input_llm_skills": [
"Spark",
"PySpark"
],
"new_aliases_persisted": 0,
"run_id": "e7cc83d4-091c-4d28-b237-a18cae3b08f1",
"skills_detail": [
{
"aliases_in_db": [
{
"alias_text": "Apache Spark",
"alias_type": "CANONICAL",
"id": 2004,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "apache spark 3",
"alias_type": "VERSION",
"id": 2006,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark",
"alias_type": "VERSION",
"id": 2510,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark 3",
"alias_type": "VERSION",
"id": 2007,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark 3.x",
"alias_type": "VERSION",
"id": 2009,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark3",
"alias_type": "VERSION",
"id": 2008,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 5,
"display_name": "Apache Spark",
"id": 1350,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "FRAMEWORK",
"slug": "apache-spark",
"sub_category_id": 1021,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"input_skill": "Spark",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
}
],
"input_skill": "Spark",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "Apache Spark",
"alias_type": "CANONICAL",
"id": 2004,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "apache spark 3",
"alias_type": "VERSION",
"id": 2006,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark",
"alias_type": "VERSION",
"id": 2510,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark 3",
"alias_type": "VERSION",
"id": 2007,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark 3.x",
"alias_type": "VERSION",
"id": 2009,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark3",
"alias_type": "VERSION",
"id": 2008,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 5,
"display_name": "Apache Spark",
"id": 1350,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "FRAMEWORK",
"slug": "apache-spark",
"sub_category_id": 1021,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"input_skill": "PySpark",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
}
],
"input_skill": "PySpark",
"matched_via": "embedding_alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
}
],
"unmatched_skills": []
}
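The two `matched_via` values above differ because "Spark" equals the stored alias "spark" under the `CASE_INSENSITIVE` strategy, while "PySpark" equals no stored alias and was resolved by embedding similarity instead. A minimal sketch of the exact-match step, using the alias rows from this response; the function name is hypothetical, and the embedding fallback is left out because its details are not shown here.

```python
from typing import Any, Dict, List, Optional

# Alias rows copied from aliases_in_db for Apache Spark (canonical id 1350).
SPARK_ALIASES: List[Dict[str, Any]] = [
    {"id": 2004, "alias_text": "Apache Spark", "match_strategy": "CASE_INSENSITIVE"},
    {"id": 2006, "alias_text": "apache spark 3", "match_strategy": "CASE_INSENSITIVE"},
    {"id": 2510, "alias_text": "spark", "match_strategy": "CASE_INSENSITIVE"},
    {"id": 2007, "alias_text": "spark 3", "match_strategy": "CASE_INSENSITIVE"},
    {"id": 2009, "alias_text": "spark 3.x", "match_strategy": "CASE_INSENSITIVE"},
    {"id": 2008, "alias_text": "spark3", "match_strategy": "CASE_INSENSITIVE"},
]


def resolve_alias(term: str, aliases: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Return the alias row whose text equals the term, ignoring case, else None."""
    key = term.strip().casefold()
    for alias in aliases:
        if alias["match_strategy"] != "CASE_INSENSITIVE":
            continue  # other strategies are out of scope for this sketch
        if alias["alias_text"].casefold() == key:
            return alias
    return None
```

With these rows, `resolve_alias("Spark", SPARK_ALIASES)` returns alias id 2510, matching `existing_alias_id` above, and `resolve_alias("PySpark", SPARK_ALIASES)` returns `None`, which is the case where a fallback matcher would have to take over.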
API 3 — final-role-output
{
"chosen_role": {
"display_name": "Data Engineer",
"id": 2,
"rationale": "Domain=Data Engineering \u0026 Analytics; The JD centers on Spark/PySpark, structured data, and cloud exposure, which aligns best with data engineering responsibilities.",
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
"chosen_role_resolution": "in_db",
"final_input_skills": [
{
"skill": "Spark",
"tag": "in_db"
},
{
"skill": "PySpark",
"tag": "in_db"
}
],
"llm_cost_api1_usd": null,
"llm_cost_api2_usd": null,
"llm_cost_api3_usd": null,
"llm_cost_total_usd": null,
"persistence": {
"items": [
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"dimension_id": 24,
"input_skill": "Spark",
"llm_role": null,
"matched_chosen_role": true,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
"role_dimension_saved": true,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 1350,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"dimension_id": 24,
"input_skill": "PySpark",
"llm_role": null,
"matched_chosen_role": true,
"outcome_line": "Skipped \u2014 no persistable v3 meta for new skill",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
],
"skill_dimension_saved": false,
"skill_id": null,
"skill_tag": "new",
"skipped_reason": "skill_not_in_db_v3_proposed"
}
],
"new_skills_created": 0,
"role_dimension_saved": 0,
"skill_dimension_saved": 0,
"skipped": 1
},
"planner_output": null,
"run_id": "e7cc83d4-091c-4d28-b237-a18cae3b08f1"
}
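In the response above, the per-item flags and the summary counters do not obviously line up: the Spark item reports `skill_dimension_saved: true` and `role_dimension_saved: true`, while the summary reports 0 for both. A small check like the following could surface such differences during admin review. It assumes the counters are meant to be plain counts over `items`; if they instead count only links newly written in this run, a difference is expected and not an error. The function names are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def recount(items: List[Dict[str, Any]]) -> Dict[str, int]:
    """Recompute summary counters from the per-item flags."""
    return {
        "skill_dimension_saved": sum(bool(it.get("skill_dimension_saved")) for it in items),
        "role_dimension_saved": sum(bool(it.get("role_dimension_saved")) for it in items),
        "skipped": sum(it.get("skipped_reason") is not None for it in items),
    }


def counter_mismatches(persistence: Dict[str, Any]) -> Dict[str, Tuple[Any, int]]:
    """Map each differing counter to (reported, recomputed)."""
    expected = recount(persistence.get("items", []))
    return {
        key: (persistence.get(key), value)
        for key, value in expected.items()
        if persistence.get(key) != value
    }
```

The same review could also compare `final_input_skills`, where PySpark is tagged `in_db`, with the persistence item for PySpark, where `skill_tag` is `new`.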