Pipeline run
c90f7764-3162-4ae8-9e8a-6579f118a3a8
Client output enrichment
v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description · vocab breakdown (legacy)
Signals
Post-classification
Captured for admin review
1. POST /skills/extract-from-jd
2. POST /skills/extract-details
3. POST /skills/final-role-output
Data Engineer
domain · Data Engineering & Analytics · CASE DOMAIN · slug: data-engineer · id: 2 · source: db
Domain=Data Engineering & Analytics; The JD centers on leading data strategy and building data pipelines, cloud-native data platforms, analytics, and integration patterns, which aligns best with Data Engineer at an engineering leadership level.
Matched skills
Matched dimensions
Matched KRAs
Resolution: in_db (role exists in library; skill↔dim and role↔dim links saved when applicable).
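The role choice above can be cross-checked against the `stage3_signals.kra_match_roles` entries in this run's API 1 response. In the figures recorded for this run, each role's `score` equals the mean of its three listed KRA similarities, rounded to four decimals. The sketch below reproduces that arithmetic. The averaging rule is inferred from the numbers and is not confirmed by the pipeline, and `role_score` is a hypothetical helper name.

```python
from statistics import mean

# KRA-to-sentence similarities copied from stage3_signals.kra_match_roles in this run.
KRA_SIMILARITIES = {
    "data-engineer": [0.7149, 0.6375, 0.6029],
    "cloud-architect": [0.566, 0.4846, 0.4418],
    "devops-engineer": [0.5519, 0.4782, 0.4449],
    "ml-engineer": [0.5178, 0.4572, 0.4519],
}


def role_score(similarities, top_k=3):
    """Average the top_k highest similarities (assumed rule, inferred from the data)."""
    top = sorted(similarities, reverse=True)[:top_k]
    return round(mean(top), 4)


scores = {slug: role_score(sims) for slug, sims in KRA_SIMILARITIES.items()}
best_role = max(scores, key=scores.get)

for slug, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{slug}: {score}")
print("highest KRA score:", best_role)
```

Under this assumption the highest-scoring KRA match is `data-engineer`, which agrees with the role resolved for this run.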
Job description
You Lead the Way. We’ve Got Your Back. At American Express, we know that with the right backing, people and businesses have the power to progress in incredible ways. Whether we’re supporting our customers’ financial confidence to move ahead, taking commerce to new heights, or encouraging people to explore the world, our colleagues are constantly redefining what’s possible — and we’re proud to back each other every step of the way. When you join #TeamAmex, you become part of a diverse community of over 60,000 colleagues, all with a common goal to deliver an exceptional customer experience every day. We back our colleagues with the support they need to thrive, professionally and personally. That’s why we have Amex Flex, our enterprise working model that provides greater flexibility to colleagues while ensuring we preserve the important aspects of our unique in-person culture. Depending on role and business needs, colleagues will either work onsite, in a hybrid model (combination of in-office and virtual days) or fully virtually. Overview of The Department: This exciting Engineering Director position will be part of the transformation journey supporting GCS (Global Commercial Services) Data Engineering organization, within American Express Technologies (AET). Our Engineering Directors have both deep technical and strategic knowledge that drive superior solutions. If data engineering, leading a team of engineers, working in a data driven organization, cross functional technology environment, with a focus on delivering a Superior Customer experience is your passion, there’s no better place to build a career. This Global Commercial Data Engineering team is fully focused on unleashing the power of commercial data to drive B2B spend, and non-card spend, while maximizing the usage of T&E spend with our commercial clients. 
This role will be based in Bangalore India, reporting to the VP of Global Commercial Engineering in support of Global Commercial Services and will have responsibility for delivering various data platform, API development, and delivering internal and external insights enabled via advanced analytics and Machine Learning capabilities to power ALL parts of Global Commercial Services Business Unit within American Express.
What you get to do every day:
- Contribute to the overall Data Strategy for GCS with a focus on building Data driven products.
- Set goals, performance review and technical leadership to foster a team environment, and provide mentorship and feedback to our awesome engineering colleagues.
- Develop a strong talent pipeline and provide opportunities and challenges to technical team members to learn and grow.
- Apply current industry trends to deliver best in class data driven products to drive growth.
- Collaborate with the Data Product Owners, Enterprise Data Governance and Management stakeholders, data/staff architects, and business users to design the data models and data pipelines for high-performance analytics while ensuring data accuracy and completeness.
- Provide technical and thought leadership on building Cloud native applications in a hybrid cloud environment.
- Work with on prem and public cloud-based data platforms, data lakes, data warehouses, various analytical tools like Spark, Python, Tableau and machine learning capabilities.
- Build and optimize data pipelines using one or more integration patterns and solutions (e.g., ETL, Message driven, Event driven, Streaming API design and more).

- 10+ years of Data engineering, Business Intelligence and Data Warehousing experience.
- 10+ years of Experience in SQL or similar languages and development experience in at least one scripting language (Java, Python etc.).
- Experience with public cloud-based data platforms and managed services, Data Lakes and Analytical tools (Apache Spark, Python, Hive, Tableau).
- Experience with Data governance practices such as metadata management, data lineage.
- Experience with Big Data technologies and architecture with prior working experience with Analytics and Data science teams.
- Experience with implementing Machine learning models.
- Experience with ML based Data Standardization, Matching and Enrichment.
- Experience as a hands-on architect with the ability to design and architect large scale web and analytical applications.
- Experience with building CI/CD pipelines and tools like Jenkins.
- Proven experience hiring and mentoring high-caliber, data-focused engineers with diverse technical strengths and backgrounds.
- Strong written and verbal communication skills.

American Express is an equal opportunity employer and makes employment decisions without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran status, disability status, age, or any other status protected by law. Offer of employment with American Express is conditioned upon the successful completion of a background verification check, subject to applicable laws and regulations.
Skills from this JD
Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.
Aliases — catalog
- Apache Spark (CANONICAL)
- apache spark 3 (VERSION)
- spark (VERSION)
- spark 3 (VERSION)
- spark 3.x (VERSION)
- spark3 (VERSION)
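API 1 extracted this skill as "Spark", while the catalog stores it under the canonical name "Apache Spark" with the aliases listed above. A minimal sketch of how such a lookup could work follows. It assumes a case-insensitive exact match on whitespace-normalized names; the pipeline's real matching rule is not shown in this run, and `build_alias_index` and `resolve` are hypothetical names.

```python
# Canonical name and aliases copied from the catalog entry above.
CATALOG = {
    "Apache Spark": ["apache spark 3", "spark", "spark 3", "spark 3.x", "spark3"],
}


def normalize(name):
    """Collapse runs of whitespace and ignore case (assumed normalization)."""
    return " ".join(name.split()).casefold()


def build_alias_index(catalog):
    """Map every canonical name and alias to its canonical name."""
    index = {}
    for canonical, aliases in catalog.items():
        index[normalize(canonical)] = canonical
        for alias in aliases:
            index[normalize(alias)] = canonical
    return index


def resolve(name, index):
    """Return the canonical skill name, or None when the name is not in the catalog."""
    return index.get(normalize(name))


alias_index = build_alias_index(CATALOG)
print(resolve("Spark", alias_index))
print(resolve("Flink", alias_index))
```

A name that resolves to `None` here would be a candidate for a new canonical skill rather than an `in_db` match.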
Context tags (catalog)
Stored enrichment (catalog DB)
- Category: Framework
- Sub-category: Distributed Data Processing Framework
- Vendor: Apache Software Foundation
- License: apache_2
- Year introduced: 2010
- Confidence: 0.94
- Version strategy: SEPARATE_ENTITY
- Version tag: 3.x
Maturity reasoning: Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it.
Skill profile (library / DB)
- Skill nature: FRAMEWORK
- Volatility: STABLE
- Typical lifespan: EVERGREEN
- Category id: 5
- Sub-category id: 1021
- Extractable: True
- Also category: False
Dimensions (API 2 worklist)
- ETL and ELT Tooling · Catalog dimension · db id 24 · Library dimension (catalog) · Roles linked in library: Data Engineer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
| ETL and ELT Tooling (etl-and-elt-tooling) | ✓ | ✓ | Existing dimension (library) · Role↔dimension saved |
Aliases — catalog
- Python (CANONICAL) primary
- Python 2 (VERSION)
- Python 2.x (VERSION)
- Python 3 (VERSION)
- Python 3.10 (VERSION)
- Python 3.11 (VERSION)
- Python 3.12 (VERSION)
- Python 3.x (VERSION)
- py (VERSION)
- py2 (VERSION)
- py3 (VERSION)
- python 3 (VERSION)
- python 3.x (VERSION)
- python2 (VERSION)
- python3 (VERSION)
- python3.x (VERSION)
Context tags (catalog)
Stored enrichment (catalog DB)
- Category: Language
- Sub-category: Programming Language
- Vendor: PSF
- License: mit
- Year introduced: 1991
- Confidence: 0.99
- Version strategy: SEPARATE_ENTITY
- Version tag: 3
Maturity reasoning: Python appears in a very high volume of job descriptions across data, backend, automation, and ML roles, and remains a default hiring-pipeline language on major job boards and tech stacks.
Skill profile (library / DB)
- Skill nature: LANGUAGE
- Volatility: STABLE
- Typical lifespan: EVERGREEN
- Category id: 6
- Sub-category id: 96
- Extractable: True
- Also category: False
Dimensions (API 2 worklist)
- Cloud Security Scripting & DSL Languages · Catalog dimension · db id 248 · Library dimension (catalog) · Roles linked in library: Cloud Security Engineer
- Programming Languages · Catalog dimension · db id 1 · Library dimension (catalog) · Roles linked in library: Backend Developer, Fullstack Developer, Fullstack Developer
- Programming Languages and Scripting · Catalog dimension · db id 59 · Library dimension (catalog) · Roles linked in library: Cyber Security Engineer
- Programming Languages for Data Work · Catalog dimension · db id 21 · Library dimension (catalog) · Roles linked in library: Data Engineer
- Programming Languages for ML Systems · Catalog dimension · db id 39 · Library dimension (catalog) · Roles linked in library: ML Engineer, MLOps Engineer
- Programming Languages for XR · Catalog dimension · db id 97 · Library dimension (catalog) · Roles linked in library: AR/VR Engineer
- Python Programming · Catalog dimension · db id 290 · Library dimension (catalog) · Roles linked in library: Python Backend Developer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
| Cloud Security Scripting & DSL Languages (cloud-security-scripting-dsl-languages) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
| Programming Languages (programming-languages) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
| Programming Languages and Scripting (programming-languages-and-scripting) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
| Programming Languages for Data Work (programming-languages-for-data-work) | ✓ | ✓ | Existing dimension (library) · Role↔dimension saved |
| Programming Languages for ML Systems (programming-languages-for-ml-systems) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
| Programming Languages for XR (programming-languages-for-xr) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
| Python Programming (python-programming) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
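The outcomes in this table are consistent with a simple rule: the role↔dimension link is saved only when the chosen role (Data Engineer) is among the roles the library links to that dimension. The sketch below applies that rule to Python's seven worklist dimensions. The rule is read off this run's output rather than taken from the service's code, and `link_outcome` is a hypothetical name.

```python
CHOSEN_ROLE = "Data Engineer"

# Roles linked in the library per dimension slug, copied from this run's worklist.
PYTHON_DIMENSIONS = {
    "cloud-security-scripting-dsl-languages": ["Cloud Security Engineer"],
    # "Fullstack Developer" is listed twice in the run output; kept as recorded.
    "programming-languages": ["Backend Developer", "Fullstack Developer", "Fullstack Developer"],
    "programming-languages-and-scripting": ["Cyber Security Engineer"],
    "programming-languages-for-data-work": ["Data Engineer"],
    "programming-languages-for-ml-systems": ["ML Engineer", "MLOps Engineer"],
    "programming-languages-for-xr": ["AR/VR Engineer"],
    "python-programming": ["Python Backend Developer"],
}


def link_outcome(linked_roles, chosen_role):
    """Return (role_dim_saved, outcome_text) for one dimension (assumed rule)."""
    if chosen_role in linked_roles:
        return True, "Role↔dimension saved"
    return False, "Role↔dimension skipped (dimension not under chosen role)"


outcomes = {
    slug: link_outcome(roles, CHOSEN_ROLE)
    for slug, roles in PYTHON_DIMENSIONS.items()
}
saved = [slug for slug, (is_saved, _) in outcomes.items() if is_saved]
print(saved)
```

Only `programming-languages-for-data-work` ends up saved, matching the single ✓ in the Role↔dim column.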
Aliases — catalog
- Tableau (CANONICAL) primary
Context tags (catalog)
Stored enrichment (catalog DB)
- Category: Platform
- Sub-category: Bi Analytics Platform
- Vendor: Tableau Software
- License: proprietary
- Year introduced: 2003
- Confidence: 0.96
- Version strategy: NOT_APPLICABLE
Maturity reasoning: Tableau appears frequently in BI/data analyst job descriptions and remains a standard enterprise analytics platform with strong vendor support and broad adoption.
Skill profile (library / DB)
- Skill nature: PLATFORM
- Volatility: STABLE
- Typical lifespan: EVERGREEN
- Category id: 9
- Sub-category id: 111
- Extractable: True
- Also category: False
Dimensions (API 2 worklist)
- BI and Visualization Tools · Catalog dimension · db id 31 · Library dimension (catalog) · Roles linked in library: Data Engineer
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
| BI and Visualization Tools (bi-and-visualization-tools) | ✓ | ✓ | Existing dimension (library) · Role↔dimension saved |
Aliases — catalog
- Machine Learning (CANONICAL)
Context tags (catalog)
Stored enrichment (catalog DB)
- Category: Concept
- Sub-category: Machine Learning
- Confidence: 0.98
- Version strategy: NOT_APPLICABLE
Maturity reasoning: Machine Learning appears in large volumes of job descriptions across data, product, and platform roles, and major cloud vendors (AWS, Google Cloud, Azure) offer dedicated ML services and certifications, indicating broad adoption.
Skill profile (library / DB)
- Skill nature: CONCEPT
- Volatility: STABLE
- Typical lifespan: EVERGREEN
- Category id: 2
- Sub-category id: 1024
- Extractable: True
- Also category: False
Dimensions (API 2 worklist)
- AI Governance and Model Security · Catalog dimension · db id 50 · Library dimension (catalog) · Roles linked in library: AI Engineer, ML Engineer, MLOps Engineer
- React Frontend Development · Catalog dimension · db id 96 · Library dimension (catalog)
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
| AI Governance and Model Security (ai-governance-and-model-security) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
| React Frontend Development (d_init_01) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Data Engineering Tools
- Sub-category: general
- Skill nature: PRACTICE
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Architectural Concepts
- Sub-category: general
- Skill nature: CONCEPT
- Volatility: STABLE
- Typical lifespan: EVERGREEN
- Version strategy: UNVERSIONED
Aliases — catalog
- Event-Driven Architecture (CANONICAL)
Context tags (catalog)
Stored enrichment (catalog DB)
- Category: Architecture
- Sub-category: Event Driven Architecture
- Confidence: 0.99
- Version strategy: NOT_APPLICABLE
Maturity reasoning: Common in cloud-native JDs and vendor docs; AWS, Azure, and Confluent all market event-driven patterns with Kafka/PubSub, showing broad hiring demand.
Skill profile (library / DB)
- Skill nature: PATTERN
- Volatility: STABLE
- Typical lifespan: EVERGREEN
- Category id: 1
- Sub-category id: 1027
- Extractable: True
- Also category: False
Dimensions (API 2 worklist)
- React Frontend Development · Catalog dimension · db id 96 · Library dimension (catalog)
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
| React Frontend Development (d_init_01) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Data Engineering Tools
- Sub-category: general
- Skill nature: CONCEPT
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Concepts
- Sub-category: general
- Skill nature: CONCEPT
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Concepts
- Sub-category: general
- Skill nature: CONCEPT
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Cloud Platforms
- Sub-category: general
- Skill nature: CONCEPT
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Aliases — catalog
- Data Lakes (CANONICAL)
Context tags (catalog)
Stored enrichment (catalog DB)
- Category: Architecture
- Sub-category: Data Lake Architecture
- Confidence: 0.90
- Version strategy: NOT_APPLICABLE
Maturity reasoning: Data lakes are widely listed in cloud/data platform job descriptions and are a standard architecture in AWS, Azure, and GCP ecosystems; they’re a common hiring-pipeline staple rather than a niche pattern.
Skill profile (library / DB)
- Skill nature: PATTERN
- Volatility: STABLE
- Typical lifespan: EVERGREEN
- Category id: 1
- Sub-category id: 1025
- Extractable: True
- Also category: False
Dimensions (API 2 worklist)
- Cloud Storage and Data Services · Catalog dimension · db id 144 · Library dimension (catalog) · Roles linked in library: Cloud Architect
- React Frontend Development · Catalog dimension · db id 96 · Library dimension (catalog)
API 3 link attempts (this skill)
| Dimension | Skill↔dim | Role↔dim | Outcome |
|---|---|---|---|
| Cloud Storage and Data Services (cloud-storage-and-data-services) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
| React Frontend Development (d_init_01) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) |
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Databases
- Sub-category: general
- Skill nature: TOOL
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
All API 3 persistence rows
Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.
| Skill | Tag | Dimension | Skill↔dim | Role↔dim | Outcome | Notes |
|---|---|---|---|---|---|---|
| Spark | in_db | ETL and ELT Tooling (etl-and-elt-tooling) | ✓ | ✓ | Existing dimension (library) · Role↔dimension saved | |
| Python | in_db | Cloud Security Scripting & DSL Languages (cloud-security-scripting-dsl-languages) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| Python | in_db | Programming Languages (programming-languages) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| Python | in_db | Programming Languages and Scripting (programming-languages-and-scripting) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| Python | in_db | Programming Languages for Data Work (programming-languages-for-data-work) | ✓ | ✓ | Existing dimension (library) · Role↔dimension saved | |
| Python | in_db | Programming Languages for ML Systems (programming-languages-for-ml-systems) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| Python | in_db | Programming Languages for XR (programming-languages-for-xr) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| Python | in_db | Python Programming (python-programming) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| Tableau | in_db | BI and Visualization Tools (bi-and-visualization-tools) | ✓ | ✓ | Existing dimension (library) · Role↔dimension saved | |
| Machine Learning | in_db | AI Governance and Model Security (ai-governance-and-model-security) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| Machine Learning | in_db | React Frontend Development (d_init_01) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| Event-Driven Architecture | in_db | React Frontend Development (d_init_01) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| Data Lakes | in_db | Cloud Storage and Data Services (cloud-storage-and-data-services) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
| Data Lakes | in_db | React Frontend Development (d_init_01) | ✓ | — | Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role) | |
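The "one row per (skill × dimension) work item" layout of this grid can be sketched as a flattening of each skill's dimension worklist. The slugs below are copied from this run; the dictionary shape and the name `persistence_rows` are assumptions made for illustration, not the extractor's actual data model.

```python
from collections import Counter

# Dimension slugs per in_db skill, copied from this run's persistence grid.
WORKLIST = {
    "Spark": ["etl-and-elt-tooling"],
    "Python": [
        "cloud-security-scripting-dsl-languages",
        "programming-languages",
        "programming-languages-and-scripting",
        "programming-languages-for-data-work",
        "programming-languages-for-ml-systems",
        "programming-languages-for-xr",
        "python-programming",
    ],
    "Tableau": ["bi-and-visualization-tools"],
    "Machine Learning": ["ai-governance-and-model-security", "d_init_01"],
    "Event-Driven Architecture": ["d_init_01"],
    "Data Lakes": ["cloud-storage-and-data-services", "d_init_01"],
}


def persistence_rows(worklist, tag="in_db"):
    """Flatten {skill: [dimension, ...]} into one row per (skill, dimension) pair."""
    return [
        {"skill": skill, "tag": tag, "dimension": dimension}
        for skill, dimensions in worklist.items()
        for dimension in dimensions
    ]


rows = persistence_rows(WORKLIST)
rows_per_dimension = Counter(row["dimension"] for row in rows)
print(len(rows), "rows")
print(rows_per_dimension.most_common(1))
```

Counting rows per dimension is a quick way to see which slugs recur across skills; in this run `d_init_01` is the only one that appears under more than one skill.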
Library artifacts (this run)
| Kind | Detail | DB id |
|---|---|---|
| canonical_skill_proposed | ETL · type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Message-Driven Architecture · type=Architectural Concepts subtype=general nature=CONCEPT lifespan=EVERGREEN | |
| canonical_skill_proposed | Streaming · type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR | |
| canonical_skill_proposed | API Design · type=Concepts subtype=general nature=CONCEPT lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Cloud Native · type=Concepts subtype=general nature=CONCEPT lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Hybrid Cloud · type=Cloud Platforms subtype=general nature=CONCEPT lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Data Warehouses · type=Databases subtype=general nature=TOOL lifespan=MULTI_YEAR | |
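Taken together with the persistence grid, these artifacts suggest a two-way split of the 13 skills extracted by API 1: names found in the catalog are tagged `in_db`, and the rest are recorded as `canonical_skill_proposed`. The sketch below reproduces that split for this run. The lookup is a plain membership test on names copied from this page, which is an assumption; `tag_skills` is a hypothetical name.

```python
# Skill names in the order API 1 returned them for this run.
EXTRACTED = [
    "Spark", "Python", "Tableau", "Machine Learning", "ETL",
    "Message-Driven Architecture", "Event-Driven Architecture", "Streaming",
    "API Design", "Cloud Native", "Hybrid Cloud", "Data Lakes", "Data Warehouses",
]

# Extracted names that this run shows with a catalog entry (tag in_db).
IN_CATALOG = {
    "Spark", "Python", "Tableau", "Machine Learning",
    "Event-Driven Architecture", "Data Lakes",
}


def tag_skills(extracted, in_catalog):
    """Split extracted skills into catalog hits and proposed new canonical skills."""
    tagged = {"in_db": [], "canonical_skill_proposed": []}
    for name in extracted:
        key = "in_db" if name in in_catalog else "canonical_skill_proposed"
        tagged[key].append(name)
    return tagged


tagged = tag_skills(EXTRACTED, IN_CATALOG)
print(len(tagged["in_db"]), "in_db;", len(tagged["canonical_skill_proposed"]), "proposed")
```

The seven proposed names come out in the same order as the rows of the artifacts table.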
nano JD Parser: gpt-4.1-nano
{
  "JD_type": "pass",
  "about_company": {
    "source_marker": {
      "first_5_words": "At American Express, we know",
      "last_5_words": "aspects of our unique in-person culture."
    },
    "text": "At American Express, we know that with the right backing, people and businesses have the power to progress in incredible ways. Whether we\u2019re supporting our customers\u2019 financial confidence to move ahead, taking commerce to new heights, or encouraging people to explore the world, our colleagues are constantly redefining what\u2019s possible \u2014 and we\u2019re proud to back each other every step of the way. When you join #TeamAmex, you become part of a diverse community of over 60,000 colleagues, all with a common goal to deliver an exceptional customer experience every day. We back our colleagues with the support they need to thrive, professionally and personally. That\u2019s why we have Amex Flex, our enterprise working model that provides greater flexibility to colleagues while ensuring we preserve the important aspects of our unique in-person culture.",
    "word_count": 100
  },
  "certifications": [],
  "company_name": "American Express",
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [
        "FinTech",
        "BFSI"
      ],
      "domain": "Financial Services"
    },
    "secondary": null
  },
  "education": [],
  "experience": {
    "max": null,
    "min": 10,
    "raw": "10+ years of Data engineering, Business Intelligence and Data Warehousing experience."
  },
  "job_locations": [
    {
      "aliases": [
        "Bengaluru"
      ],
      "city": "Bangalore",
      "country": "India",
      "state": null,
      "work_mode": "hybrid"
    }
  ],
  "role": "Engineering Director",
  "role_aliases": [
    "Director of Engineering",
    "Data Engineering Director",
    "Engineering Lead"
  ],
  "role_archetype": "Engineering",
  "roles_and_responsibilities": [
    {
      "bullet_count": 0,
      "heading": "What you get to do every day",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "Contribute to the overall Data",
        "last_5_words": "integration patterns and solutions (e.g.,"
      },
      "text": "Contribute to the overall Data Strategy for GCS with a focus on building Data driven products.\nSet goals, performance review and technical leadership to foster a team environment, and provide mentorship and feedback to our awesome engineering colleagues.\nDevelop a strong talent pipeline and provide opportunities and challenges to technical team members to learn and grow.\nApply current industry trends to deliver best in class data driven products to drive growth.\nCollaborate with the Data Product Owners, Enterprise Data Governance and Management stakeholders, data/staff architects, and business users to design the data models and data pipelines for high-performance analytics while ensuring data accuracy and completeness.\nProvide technical and thought leadership on building Cloud native applications in a hybrid cloud environment.\nWork with on prem and public cloud-based data platforms, data lakes, data warehouses, various analytical tools like Spark, Python, Tableau and machine learning capabilities.\nBuild and optimize data pipelines using one or more integration patterns and solutions (e.g., ETL, Message driven, Event driven, Streaming API design and more).",
      "word_count": 157
    }
  ],
  "urls": []
}
API 1: extract-from-jd
{
"final_skills": [
{
"is_primary": true,
"skill_name": "Spark"
},
{
"is_primary": true,
"skill_name": "Python"
},
{
"is_primary": true,
"skill_name": "Tableau"
},
{
"is_primary": true,
"skill_name": "Machine Learning"
},
{
"is_primary": true,
"skill_name": "ETL"
},
{
"is_primary": true,
"skill_name": "Message-Driven Architecture"
},
{
"is_primary": true,
"skill_name": "Event-Driven Architecture"
},
{
"is_primary": true,
"skill_name": "Streaming"
},
{
"is_primary": true,
"skill_name": "API Design"
},
{
"is_primary": true,
"skill_name": "Cloud Native"
},
{
"is_primary": true,
"skill_name": "Hybrid Cloud"
},
{
"is_primary": true,
"skill_name": "Data Lakes"
},
{
"is_primary": true,
"skill_name": "Data Warehouses"
}
],
"jd_role": {
"display_name": "Engineering Director",
"rationale": null,
"role_aliases": [
"Director of Engineering",
"Data Engineering Director",
"Engineering Lead"
],
"role_archetype": "Engineering",
"slug": ""
},
"nano_parsed": {
"JD_type": "pass",
"about_company": {
"source_marker": {
"first_5_words": "At American Express, we know",
"last_5_words": "aspects of our unique in-person culture."
},
"text": "At American Express, we know that with the right backing, people and businesses have the power to progress in incredible ways. Whether we\u2019re supporting our customers\u2019 financial confidence to move ahead, taking commerce to new heights, or encouraging people to explore the world, our colleagues are constantly redefining what\u2019s possible \u2014 and we\u2019re proud to back each other every step of the way. When you join #TeamAmex, you become part of a diverse community of over 60,000 colleagues, all with a common goal to deliver an exceptional customer experience every day. We back our colleagues with the support they need to thrive, professionally and personally. That\u2019s why we have Amex Flex, our enterprise working model that provides greater flexibility to colleagues while ensuring we preserve the important aspects of our unique in-person culture.",
"word_count": 100
},
"certifications": [],
"company_name": "American Express",
"ctc": null,
"domain": {
"primary": {
"aliases": [
"FinTech",
"BFSI"
],
"domain": "Financial Services"
},
"secondary": null
},
"education": [],
"experience": {
"max": null,
"min": 10,
"raw": "10+ years of Data engineering, Business Intelligence and Data Warehousing experience."
},
"job_locations": [
{
"aliases": [
"Bengaluru"
],
"city": "Bangalore",
"country": "India",
"state": null,
"work_mode": "hybrid"
}
],
"role": "Engineering Director",
"role_aliases": [
"Director of Engineering",
"Data Engineering Director",
"Engineering Lead"
],
"role_archetype": "Engineering",
"roles_and_responsibilities": [
{
"bullet_count": 0,
"heading": "What you get to do every day",
"heading_was_present": true,
"source_marker": {
"first_5_words": "Contribute to the overall Data",
"last_5_words": "integration patterns and solutions (e.g.,"
},
"text": "Contribute to the overall Data Strategy for GCS with a focus on building Data driven products.\nSet goals, performance review and technical leadership to foster a team environment, and provide mentorship and feedback to our awesome engineering colleagues.\nDevelop a strong talent pipeline and provide opportunities and challenges to technical team members to learn and grow.\nApply current industry trends to deliver best in class data driven products to drive growth.\nCollaborate with the Data Product Owners, Enterprise Data Governance and Management stakeholders, data/staff architects, and business users to design the data models and data pipelines for high-performance analytics while ensuring data accuracy and completeness.\nProvide technical and thought leadership on building Cloud native applications in a hybrid cloud environment.\nWork with on prem and public cloud-based data platforms, data lakes, data warehouses, various analytical tools like Spark, Python, Tableau and machine learning capabilities.\nBuild and optimize data pipelines using one or more integration patterns and solutions (e.g., ETL, Message driven, Event driven, Streaming API design and more).",
"word_count": 157
}
],
"urls": []
},
"rejected": false,
"rejection_reason": null,
"run_id": "c90f7764-3162-4ae8-9e8a-6579f118a3a8",
"stage3_signals": {
"alias_found": true,
"alias_match_roles": [
{
"display_name": "Engineering Manager",
"kra_matches": null,
"matched_count": null,
"matched_skills": null,
"role_id": 121,
"score": 1.0,
"slug": "engineering-manager",
"total_count": null
}
],
"kra_match_roles": [
{
"display_name": "Data Engineer",
"kra_matches": [
{
"kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
"sentence": "Collaborate with the Data Product Owners, Enterprise Data Governance and Management stakeholders, data/staff architects, and business users to design the data models and data pipelines for high-performance analytics while ensuring data accuracy and completeness.",
"similarity": 0.7149
},
{
"kra_text": "Builds data ingestion pipelines to collect data from transactional databases, third-party APIs, event streams, and file sources into centralized data platforms.",
"sentence": "Build and optimize data pipelines using one or more integration patterns and solutions (e.g. , ETL, Message driven, Event driven, Streaming API design and more).",
"similarity": 0.6375
},
{
"kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
"sentence": "Work with on prem and public cloud-based data platforms, data lakes, data warehouses, various analytical tools like Spark, Python, Tableau and machine learning capabilities.",
"similarity": 0.6029
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 0.6518,
"slug": "data-engineer",
"total_count": null
},
{
"display_name": "Cloud Architect",
"kra_matches": [
{
"kra_text": "Conducts architecture reviews, approves technical design documents, and guides engineering teams through cloud migration and modernization projects.",
"sentence": "Provide technical and thought leadership on building Cloud native applications in a hybrid cloud environment.",
"similarity": 0.566
},
{
"kra_text": "Conducts architecture reviews, approves technical design documents, and guides engineering teams through cloud migration and modernization projects.",
"sentence": "Set goals, performance review and technical leadership to foster a team environment, and provide mentorship and feedback to our awesome engineering colleagues.",
"similarity": 0.4846
},
{
"kra_text": "Defines cloud adoption roadmaps, lift-and-shift vs. refactor migration strategies, and landing zone architectures for workloads moving to AWS, Azure, or GCP.",
"sentence": "Work with on prem and public cloud-based data platforms, data lakes, data warehouses, various analytical tools like Spark, Python, Tableau and machine learning capabilities.",
"similarity": 0.4418
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 9,
"score": 0.4975,
"slug": "cloud-architect",
"total_count": null
},
{
"display_name": "DevOps Engineer",
"kra_matches": [
{
"kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
"sentence": "Provide technical and thought leadership on building Cloud native applications in a hybrid cloud environment.",
"similarity": 0.5519
},
{
"kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
"sentence": "Build and optimize data pipelines using one or more integration patterns and solutions (e.g. , ETL, Message driven, Event driven, Streaming API design and more).",
"similarity": 0.4782
},
{
"kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
"sentence": "Develop a strong talent pipeline and provide opportunities and challenges to technical team members to learn and grow.",
"similarity": 0.4449
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 10,
"score": 0.4917,
"slug": "devops-engineer",
"total_count": null
},
{
"display_name": "ML Engineer",
"kra_matches": [
{
"kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
"sentence": "Work with on prem and public cloud-based data platforms, data lakes, data warehouses, various analytical tools like Spark, Python, Tableau and machine learning capabilities.",
"similarity": 0.5178
},
{
"kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
"sentence": "Collaborate with the Data Product Owners, Enterprise Data Governance and Management stakeholders, data/staff architects, and business users to design the data models and data pipelines for high-performance analytics while ensuring data accuracy and completeness.",
"similarity": 0.4572
},
{
"kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
"sentence": "Build and optimize data pipelines using one or more integration patterns and solutions (e.g. , ETL, Message driven, Event driven, Streaming API design and more).",
"similarity": 0.4519
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 3,
"score": 0.4756,
"slug": "ml-engineer",
"total_count": null
},
{
"display_name": "Flutter Developer",
"kra_matches": [
{
"kra_text": "integrate external APIs and data sources",
"sentence": "Build and optimize data pipelines using one or more integration patterns and solutions (e.g. , ETL, Message driven, Event driven, Streaming API design and more).",
"similarity": 0.4962
},
{
"kra_text": "collaborate with design, product, and backend teams",
"sentence": "Collaborate with the Data Product Owners, Enterprise Data Governance and Management stakeholders, data/staff architects, and business users to design the data models and data pipelines for high-performance analytics while ensuring data accuracy and completeness.",
"similarity": 0.4941
},
{
"kra_text": "collaborate with design, product, and backend teams",
"sentence": "Set goals, performance review and technical leadership to foster a team environment, and provide mentorship and feedback to our awesome engineering colleagues.",
"similarity": 0.4364
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 74,
"score": 0.4756,
"slug": "flutter-developer",
"total_count": null
}
],
"skill_match_roles": [
{
"display_name": "Data Engineer",
"kra_matches": null,
"matched_count": 3,
"matched_skills": [
"Apache Spark",
"Python",
"Tableau"
],
"role_id": 2,
"score": 0.2308,
"slug": "data-engineer",
"total_count": 13
},
{
"display_name": "ML Engineer",
"kra_matches": null,
"matched_count": 2,
"matched_skills": [
"Machine Learning",
"Python"
],
"role_id": 3,
"score": 0.1538,
"slug": "ml-engineer",
"total_count": 13
},
{
"display_name": "MLOps Engineer",
"kra_matches": null,
"matched_count": 2,
"matched_skills": [
"Machine Learning",
"Python"
],
"role_id": 16,
"score": 0.1538,
"slug": "ml-ops-engineer",
"total_count": 13
},
{
"display_name": "AR/VR Engineer",
"kra_matches": null,
"matched_count": 1,
"matched_skills": [
"Python"
],
"role_id": 8,
"score": 0.0769,
"slug": "ar-vr-engineer",
"total_count": 13
},
{
"display_name": "Cyber Security Engineer",
"kra_matches": null,
"matched_count": 1,
"matched_skills": [
"Python"
],
"role_id": 5,
"score": 0.0769,
"slug": "cybersecurity-engineer",
"total_count": 13
}
]
},
"stage4_decision": {
"alias_collision_detected": false,
"case": "DOMAIN",
"chosen_role": {
"display_name": "Data Engineer",
"kra_matches": null,
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 0.96,
"slug": "data-engineer",
"total_count": null
},
"confidence": 0.96,
"is_new_role": false,
"llm2_fired": false,
"llm2_reasoning": null,
"matched_dimensions": [
"Data Strategy and Product Engineering",
"Technical Leadership and Mentorship",
"Talent Development",
"Cloud-Native Data Platform Engineering",
"Analytics Data Modeling",
"Data Pipeline Architecture",
"Hybrid Cloud Data Solutions"
],
"matched_kras": [
"Contribute to the overall Data Strategy",
"Set goals, performance review and technical leadership",
"Develop a strong talent pipeline",
"design the data models and data pipelines",
"ensure data accuracy and completeness",
"Provide technical and thought leadership",
"Work with on prem and public cloud-based data platforms",
"Build and optimize data pipelines"
],
"matched_skills": [
"Data Strategy",
"data driven products",
"data models",
"data pipelines",
"Cloud native applications",
"hybrid cloud environment",
"data lakes",
"data warehouses",
"Spark",
"Python",
"Tableau",
"machine learning",
"ETL",
"Message driven",
"Event driven",
"Streaming API design"
],
"new_role_display_name": null,
"new_role_slug": null,
"queued": false,
"reasoning": "Domain=Data Engineering \u0026 Analytics; The JD centers on leading data strategy and building data pipelines, cloud-native data platforms, analytics, and integration patterns, which aligns best with Data Engineer at an engineering leadership level.",
"sub_role": null
},
"stage5_updates": {
"centroid_n_after": 237,
"centroid_updated": true,
"collision_log_id": null,
"new_kra_attached": null,
"new_skills_attached": [
{
"is_primary": true,
"queue_id": 11945,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "ETL",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 11946,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "Message-Driven Architecture",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 11947,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "Streaming",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 11948,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "API Design",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 11949,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "Cloud Native",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 11950,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "Hybrid Cloud",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 11951,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "Data Warehouses",
"status": "pending"
}
],
"queue_entry_id": null,
"v3_pipeline_triggered": false,
"v3_role_slug": null,
"v3_run_id": null
}
}
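Note on the skill_match_roles scores above: each score is consistent with matched_count divided by total_count, rounded to four decimals (3/13 = 0.2308, 2/13 = 0.1538, 1/13 = 0.0769). total_count is 13 for every role, which matches the 13 entries in input_final_skills, so the denominator appears to be the number of skills extracted from the JD rather than the size of the role's own skill list. The sketch below reproduces that arithmetic only. The function name and the rounding step are assumptions made for illustration; the pipeline's actual scoring code is not shown in this capture.

```python
def skill_match_score(matched_count: int, total_count: int) -> float:
    """Hypothetical helper: fraction of extracted JD skills matched by a role.

    Assumes score = matched_count / total_count rounded to 4 decimals,
    which agrees with the values in this run but is not confirmed
    as the pipeline's real implementation.
    """
    if total_count <= 0:
        return 0.0  # avoid division by zero when no skills were extracted
    return round(matched_count / total_count, 4)


# (slug, matched_count, total_count) copied from skill_match_roles in this run
rows = [
    ("data-engineer", 3, 13),
    ("ml-engineer", 2, 13),
    ("ml-ops-engineer", 2, 13),
    ("ar-vr-engineer", 1, 13),
    ("cybersecurity-engineer", 1, 13),
]

for slug, matched, total in rows:
    print(slug, skill_match_score(matched, total))
```

Under this reading, the 0.96 on stage4_decision.chosen_role is a separate confidence value and not a skill-overlap ratio, since its matched_count and total_count are both null.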
API 2 — extract-details
{
"alias_matches": [
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 2510,
"existing_alias_text": "spark",
"input_term": "Spark",
"matched_canonical": {
"category_id": 5,
"display_name": "Apache Spark",
"id": 1350,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "FRAMEWORK",
"slug": "apache-spark",
"sub_category_id": 1021,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
},
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 67,
"existing_alias_text": "Python",
"input_term": "Python",
"matched_canonical": {
"category_id": 6,
"display_name": "Python",
"id": 5,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "LANGUAGE",
"slug": "python",
"sub_category_id": 96,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
},
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 359,
"existing_alias_text": "Tableau",
"input_term": "Tableau",
"matched_canonical": {
"category_id": 9,
"display_name": "Tableau",
"id": 150,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "PLATFORM",
"slug": "tableau",
"sub_category_id": 111,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
},
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 2015,
"existing_alias_text": "Machine Learning",
"input_term": "Machine Learning",
"matched_canonical": {
"category_id": 2,
"display_name": "Machine Learning",
"id": 1356,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "CONCEPT",
"slug": "machine-learning",
"sub_category_id": 1024,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
},
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 2019,
"existing_alias_text": "Event-Driven Architecture",
"input_term": "Event-Driven Architecture",
"matched_canonical": {
"category_id": 1,
"display_name": "Event-Driven Architecture",
"id": 1360,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "PATTERN",
"slug": "event-driven-architecture",
"sub_category_id": 1027,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
},
{
"alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
"alias_persisted": false,
"existing_alias_id": 2017,
"existing_alias_text": "Data Lakes",
"input_term": "Data Lakes",
"matched_canonical": {
"category_id": 1,
"display_name": "Data Lakes",
"id": 1358,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "PATTERN",
"slug": "data-lakes",
"sub_category_id": 1025,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"matched_via": "alias"
}
],
"candidate_roles": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
{
"display_name": "Cloud Security Engineer",
"id": 23,
"rationale": null,
"role_archetype": null,
"slug": "cloud-security-engineer",
"source": "db"
},
{
"display_name": "Backend Developer",
"id": 1,
"rationale": null,
"role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
"slug": "backend-engineer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 435,
"rationale": null,
"role_archetype": "Engineering",
"slug": "fullstack-developer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 15,
"rationale": null,
"role_archetype": null,
"slug": "full-stack-engineer",
"source": "db"
},
{
"display_name": "Cyber Security Engineer",
"id": 5,
"rationale": null,
"role_archetype": null,
"slug": "cybersecurity-engineer",
"source": "db"
},
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
},
{
"display_name": "AR/VR Engineer",
"id": 8,
"rationale": null,
"role_archetype": null,
"slug": "ar-vr-engineer",
"source": "db"
},
{
"display_name": "Python Backend Developer",
"id": 80,
"rationale": null,
"role_archetype": "Engineering",
"slug": "python-backend-developer",
"source": "db"
},
{
"display_name": "AI Engineer",
"id": 13,
"rationale": null,
"role_archetype": null,
"slug": "ai-engineer",
"source": "db"
},
{
"display_name": "Cloud Architect",
"id": 9,
"rationale": null,
"role_archetype": null,
"slug": "cloud-architect",
"source": "db"
}
],
"chosen_role": {
"display_name": "Data Engineer",
"id": 2,
"rationale": "Domain=Data Engineering \u0026 Analytics; The JD centers on leading data strategy and building data pipelines, cloud-native data platforms, analytics, and integration patterns, which aligns best with Data Engineer at an engineering leadership level.",
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"input_skill": "Spark",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Security Scripting \u0026 DSL Languages",
"id": 248,
"rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
"slug": "cloud-security-scripting-dsl-languages",
"source": "db"
},
"input_skill": "Python",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Cloud Security Engineer",
"id": 23,
"rationale": null,
"role_archetype": null,
"slug": "cloud-security-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages",
"id": 1,
"rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
"slug": "programming-languages",
"source": "db"
},
"input_skill": "Python",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Backend Developer",
"id": 1,
"rationale": null,
"role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
"slug": "backend-engineer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 435,
"rationale": null,
"role_archetype": "Engineering",
"slug": "fullstack-developer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 15,
"rationale": null,
"role_archetype": null,
"slug": "full-stack-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages and Scripting",
"id": 59,
"rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
"slug": "programming-languages-and-scripting",
"source": "db"
},
"input_skill": "Python",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Cyber Security Engineer",
"id": 5,
"rationale": null,
"role_archetype": null,
"slug": "cybersecurity-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for Data Work",
"id": 21,
"rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
"slug": "programming-languages-for-data-work",
"source": "db"
},
"input_skill": "Python",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for ML Systems",
"id": 39,
"rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
"slug": "programming-languages-for-ml-systems",
"source": "db"
},
"input_skill": "Python",
"llm_role": null,
"roles_from_db": [
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for XR",
"id": 97,
"rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
"slug": "programming-languages-for-xr",
"source": "db"
},
"input_skill": "Python",
"llm_role": null,
"roles_from_db": [
{
"display_name": "AR/VR Engineer",
"id": 8,
"rationale": null,
"role_archetype": null,
"slug": "ar-vr-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Python Programming",
"id": 290,
"rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
"slug": "python-programming",
"source": "db"
},
"input_skill": "Python",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Python Backend Developer",
"id": 80,
"rationale": null,
"role_archetype": "Engineering",
"slug": "python-backend-developer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "BI and Visualization Tools",
"id": 31,
"rationale": "Tools used to expose curated data to analysts and business users through dashboards, reports, and semantic exploration. Data engineers support these tools by shaping reliable datasets and performant models.",
"slug": "bi-and-visualization-tools",
"source": "db"
},
"input_skill": "Tableau",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "AI Governance and Model Security",
"id": 50,
"rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
"slug": "ai-governance-and-model-security",
"source": "db"
},
"input_skill": "Machine Learning",
"llm_role": null,
"roles_from_db": [
{
"display_name": "AI Engineer",
"id": 13,
"rationale": null,
"role_archetype": null,
"slug": "ai-engineer",
"source": "db"
},
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "React Frontend Development",
"id": 96,
"rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
"slug": "d_init_01",
"source": "db"
},
"input_skill": "Machine Learning",
"llm_role": null,
"roles_from_db": []
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "React Frontend Development",
"id": 96,
"rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
"slug": "d_init_01",
"source": "db"
},
"input_skill": "Event-Driven Architecture",
"llm_role": null,
"roles_from_db": []
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Storage and Data Services",
"id": 144,
"rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
"slug": "cloud-storage-and-data-services",
"source": "db"
},
"input_skill": "Data Lakes",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Cloud Architect",
"id": 9,
"rationale": null,
"role_archetype": null,
"slug": "cloud-architect",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "React Frontend Development",
"id": 96,
"rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
"slug": "d_init_01",
"source": "db"
},
"input_skill": "Data Lakes",
"llm_role": null,
"roles_from_db": []
}
],
"input_final_skills": [
"Spark",
"Python",
"Tableau",
"Machine Learning",
"ETL",
"Message-Driven Architecture",
"Event-Driven Architecture",
"Streaming",
"API Design",
"Cloud Native",
"Hybrid Cloud",
"Data Lakes",
"Data Warehouses"
],
"input_llm_skills": [
"Spark",
"Python",
"Tableau",
"Machine Learning",
"ETL",
"Message-Driven Architecture",
"Event-Driven Architecture",
"Streaming",
"API Design",
"Cloud Native",
"Hybrid Cloud",
"Data Lakes",
"Data Warehouses"
],
"new_aliases_persisted": 0,
"run_id": "c90f7764-3162-4ae8-9e8a-6579f118a3a8",
"skills_detail": [
{
"aliases_in_db": [
{
"alias_text": "Apache Spark",
"alias_type": "CANONICAL",
"id": 2004,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "apache spark 3",
"alias_type": "VERSION",
"id": 2006,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark",
"alias_type": "VERSION",
"id": 2510,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark 3",
"alias_type": "VERSION",
"id": 2007,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark 3.x",
"alias_type": "VERSION",
"id": 2009,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "spark3",
"alias_type": "VERSION",
"id": 2008,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 5,
"display_name": "Apache Spark",
"id": 1350,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "FRAMEWORK",
"slug": "apache-spark",
"sub_category_id": 1021,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"input_skill": "Spark",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
}
],
"input_skill": "Spark",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "Python",
"alias_type": "CANONICAL",
"id": 67,
"is_primary": true,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "Python 2",
"alias_type": "VERSION",
"id": 72,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "Python 2.x",
"alias_type": "VERSION",
"id": 74,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "Python 3",
"alias_type": "VERSION",
"id": 73,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "Python 3.10",
"alias_type": "VERSION",
"id": 76,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "Python 3.11",
"alias_type": "VERSION",
"id": 77,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "Python 3.12",
"alias_type": "VERSION",
"id": 78,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "Python 3.x",
"alias_type": "VERSION",
"id": 75,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "py",
"alias_type": "VERSION",
"id": 2183,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "py2",
"alias_type": "VERSION",
"id": 68,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "py3",
"alias_type": "VERSION",
"id": 69,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "python 3",
"alias_type": "VERSION",
"id": 2186,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "python 3.x",
"alias_type": "VERSION",
"id": 2849,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "python2",
"alias_type": "VERSION",
"id": 70,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "python3",
"alias_type": "VERSION",
"id": 71,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
},
{
"alias_text": "python3.x",
"alias_type": "VERSION",
"id": 2848,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 6,
"display_name": "Python",
"id": 5,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "LANGUAGE",
"slug": "python",
"sub_category_id": 96,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Security Scripting \u0026 DSL Languages",
"id": 248,
"rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
"slug": "cloud-security-scripting-dsl-languages",
"source": "db"
},
"input_skill": "Python",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Cloud Security Engineer",
"id": 23,
"rationale": null,
"role_archetype": null,
"slug": "cloud-security-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages",
"id": 1,
"rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
"slug": "programming-languages",
"source": "db"
},
"input_skill": "Python",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Backend Developer",
"id": 1,
"rationale": null,
"role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
"slug": "backend-engineer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 435,
"rationale": null,
"role_archetype": "Engineering",
"slug": "fullstack-developer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 15,
"rationale": null,
"role_archetype": null,
"slug": "full-stack-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages and Scripting",
"id": 59,
"rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
"slug": "programming-languages-and-scripting",
"source": "db"
},
"input_skill": "Python",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Cyber Security Engineer",
"id": 5,
"rationale": null,
"role_archetype": null,
"slug": "cybersecurity-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for Data Work",
"id": 21,
"rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
"slug": "programming-languages-for-data-work",
"source": "db"
},
"input_skill": "Python",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for ML Systems",
"id": 39,
"rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
"slug": "programming-languages-for-ml-systems",
"source": "db"
},
"input_skill": "Python",
"llm_role": null,
"roles_from_db": [
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for XR",
"id": 97,
"rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
"slug": "programming-languages-for-xr",
"source": "db"
},
"input_skill": "Python",
"llm_role": null,
"roles_from_db": [
{
"display_name": "AR/VR Engineer",
"id": 8,
"rationale": null,
"role_archetype": null,
"slug": "ar-vr-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Python Programming",
"id": 290,
"rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
"slug": "python-programming",
"source": "db"
},
"input_skill": "Python",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Python Backend Developer",
"id": 80,
"rationale": null,
"role_archetype": "Engineering",
"slug": "python-backend-developer",
"source": "db"
}
]
}
],
"input_skill": "Python",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "Tableau",
"alias_type": "CANONICAL",
"id": 359,
"is_primary": true,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 9,
"display_name": "Tableau",
"id": 150,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "PLATFORM",
"slug": "tableau",
"sub_category_id": 111,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "BI and Visualization Tools",
"id": 31,
"rationale": "Tools used to expose curated data to analysts and business users through dashboards, reports, and semantic exploration. Data engineers support these tools by shaping reliable datasets and performant models.",
"slug": "bi-and-visualization-tools",
"source": "db"
},
"input_skill": "Tableau",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
]
}
],
"input_skill": "Tableau",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "Machine Learning",
"alias_type": "CANONICAL",
"id": 2015,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 2,
"display_name": "Machine Learning",
"id": 1356,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "CONCEPT",
"slug": "machine-learning",
"sub_category_id": 1024,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "AI Governance and Model Security",
"id": 50,
"rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
"slug": "ai-governance-and-model-security",
"source": "db"
},
"input_skill": "Machine Learning",
"llm_role": null,
"roles_from_db": [
{
"display_name": "AI Engineer",
"id": 13,
"rationale": null,
"role_archetype": null,
"slug": "ai-engineer",
"source": "db"
},
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "React Frontend Development",
"id": 96,
"rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
"slug": "d_init_01",
"source": "db"
},
"input_skill": "Machine Learning",
"llm_role": null,
"roles_from_db": []
}
],
"input_skill": "Machine Learning",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "ETL",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Data Engineering Tools",
"skill_nature": "PRACTICE",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "etl",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Message-Driven Architecture",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Architectural Concepts",
"skill_nature": "CONCEPT",
"sub_category": "general",
"typical_lifespan": "EVERGREEN",
"version_strategy": "UNVERSIONED",
"volatility": "STABLE"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "message-driven-architecture",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "Event-Driven Architecture",
"alias_type": "CANONICAL",
"id": 2019,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 1,
"display_name": "Event-Driven Architecture",
"id": 1360,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "PATTERN",
"slug": "event-driven-architecture",
"sub_category_id": 1027,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "React Frontend Development",
"id": 96,
"rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
"slug": "d_init_01",
"source": "db"
},
"input_skill": "Event-Driven Architecture",
"llm_role": null,
"roles_from_db": []
}
],
"input_skill": "Event-Driven Architecture",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Streaming",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Data Engineering Tools",
"skill_nature": "CONCEPT",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "streaming",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "API Design",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Concepts",
"skill_nature": "CONCEPT",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "api-design",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Cloud Native",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Concepts",
"skill_nature": "CONCEPT",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "cloud-native",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Hybrid Cloud",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Cloud Platforms",
"skill_nature": "CONCEPT",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "hybrid-cloud",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [
{
"alias_text": "Data Lakes",
"alias_type": "CANONICAL",
"id": 2017,
"is_primary": false,
"match_strategy": "CASE_INSENSITIVE"
}
],
"canonical": {
"category_id": 1,
"display_name": "Data Lakes",
"id": 1358,
"is_also_category": false,
"is_extractable": true,
"skill_nature": "PATTERN",
"slug": "data-lakes",
"sub_category_id": 1025,
"typical_lifespan": "EVERGREEN",
"volatility": "STABLE"
},
"dimensions": [
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Storage and Data Services",
"id": 144,
"rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
"slug": "cloud-storage-and-data-services",
"source": "db"
},
"input_skill": "Data Lakes",
"llm_role": null,
"roles_from_db": [
{
"display_name": "Cloud Architect",
"id": 9,
"rationale": null,
"role_archetype": null,
"slug": "cloud-architect",
"source": "db"
}
]
},
{
"dimension": {
"difficulty_hint": "well_known",
"display_name": "React Frontend Development",
"id": 96,
"rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
"slug": "d_init_01",
"source": "db"
},
"input_skill": "Data Lakes",
"llm_role": null,
"roles_from_db": []
}
],
"input_skill": "Data Lakes",
"matched_via": "alias",
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": null,
"source_tag": "db",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Data Warehouses",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Databases",
"skill_nature": "TOOL",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "data-warehouses",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
}
],
"unmatched_skills": [
"ETL",
"Message-Driven Architecture",
"Streaming",
"API Design",
"Cloud Native",
"Hybrid Cloud",
"Data Warehouses"
]
}
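In the response above, every entry with a null `canonical` also appears in `unmatched_skills`. A minimal sketch of that relationship, assuming the entry shape shown in this capture; the function name `split_by_match` and the sample data are hypothetical, not part of the pipeline:

```python
from typing import Any


def split_by_match(entries: list[dict[str, Any]]) -> tuple[list[str], list[str]]:
    """Split skill entries into (matched, unmatched) input-skill names.

    Assumption: an entry counts as matched when its "canonical" field is
    not null, which is how the captured payload appears to behave.
    """
    matched: list[str] = []
    unmatched: list[str] = []
    for entry in entries:
        target = matched if entry.get("canonical") is not None else unmatched
        target.append(entry["input_skill"])
    return matched, unmatched


# Illustrative entries trimmed to the two fields the sketch reads.
sample_entries = [
    {"input_skill": "Tableau", "canonical": {"id": 150, "slug": "tableau"}},
    {"input_skill": "ETL", "canonical": None},
    {"input_skill": "Streaming", "canonical": None},
]

matched, unmatched = split_by_match(sample_entries)
print(matched)
print(unmatched)
```

Comparing the second list against the response's own `unmatched_skills` array is a quick consistency check when reviewing a run.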
API 3 — final-role-output
{
"chosen_role": {
"display_name": "Data Engineer",
"id": 2,
"rationale": "Domain=Data Engineering \u0026 Analytics; The JD centers on leading data strategy and building data pipelines, cloud-native data platforms, analytics, and integration patterns, which aligns best with Data Engineer at an engineering leadership level.",
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
"chosen_role_resolution": "in_db",
"final_input_skills": [
{
"skill": "Spark",
"tag": "in_db"
},
{
"skill": "Python",
"tag": "in_db"
},
{
"skill": "Tableau",
"tag": "in_db"
},
{
"skill": "Machine Learning",
"tag": "in_db"
},
{
"skill": "ETL",
"tag": "new"
},
{
"skill": "Message-Driven Architecture",
"tag": "new"
},
{
"skill": "Event-Driven Architecture",
"tag": "in_db"
},
{
"skill": "Streaming",
"tag": "new"
},
{
"skill": "API Design",
"tag": "new"
},
{
"skill": "Cloud Native",
"tag": "new"
},
{
"skill": "Hybrid Cloud",
"tag": "new"
},
{
"skill": "Data Lakes",
"tag": "in_db"
},
{
"skill": "Data Warehouses",
"tag": "new"
}
],
"llm_cost_api1_usd": null,
"llm_cost_api2_usd": null,
"llm_cost_api3_usd": null,
"llm_cost_total_usd": null,
"persistence": {
"items": [
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "ETL and ELT Tooling",
"id": 24,
"rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
"slug": "etl-and-elt-tooling",
"source": "db"
},
"dimension_id": 24,
"input_skill": "Spark",
"llm_role": null,
"matched_chosen_role": true,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
"role_dimension_saved": true,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 1350,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Security Scripting \u0026 DSL Languages",
"id": 248,
"rationale": "Proficiency in programming and domain-specific languages used to automate and script cloud security controls.",
"slug": "cloud-security-scripting-dsl-languages",
"source": "db"
},
"dimension_id": 248,
"input_skill": "Python",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "Cloud Security Engineer",
"id": 23,
"rationale": null,
"role_archetype": null,
"slug": "cloud-security-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 5,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages",
"id": 1,
"rationale": "Primary implementation languages used to build client and server feature code. Full stack engineers need enough fluency to move across layers and implement product behavior end to end.",
"slug": "programming-languages",
"source": "db"
},
"dimension_id": 1,
"input_skill": "Python",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "Backend Developer",
"id": 1,
"rationale": null,
"role_archetype": "A Backend Engineer designs, builds, and maintains the server-side logic and data handling that power applications and services. They focus on implementing reliable business functionality, integrating with other systems, and ensuring the backend is scalable, maintainable, and observable.",
"slug": "backend-engineer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 435,
"rationale": null,
"role_archetype": "Engineering",
"slug": "fullstack-developer",
"source": "db"
},
{
"display_name": "Fullstack Developer",
"id": 15,
"rationale": null,
"role_archetype": null,
"slug": "full-stack-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 5,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages and Scripting",
"id": 59,
"rationale": "Languages used to write security automation, analysis scripts, detection logic, and remediation helpers. This is the primary implementation surface for a cybersecurity engineer across tooling and response workflows.",
"slug": "programming-languages-and-scripting",
"source": "db"
},
"dimension_id": 59,
"input_skill": "Python",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "Cyber Security Engineer",
"id": 5,
"rationale": null,
"role_archetype": null,
"slug": "cybersecurity-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 5,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for Data Work",
"id": 21,
"rationale": "Languages used to implement data pipelines, transformations, and operational glue. This is the primary coding surface for building ingestion, enrichment, and automation logic in data engineering.",
"slug": "programming-languages-for-data-work",
"source": "db"
},
"dimension_id": 21,
"input_skill": "Python",
"llm_role": null,
"matched_chosen_role": true,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
"role_dimension_saved": true,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 5,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for ML Systems",
"id": 39,
"rationale": "Languages used to build training code, inference services, evaluation jobs, and ML glue code. This is the primary implementation surface for ML engineers across experimentation and productionization.",
"slug": "programming-languages-for-ml-systems",
"source": "db"
},
"dimension_id": 39,
"input_skill": "Python",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 5,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Programming Languages for XR",
"id": 97,
"rationale": "Primary implementation languages used to build immersive client features, interaction logic, and device-specific runtime behavior. This is the core coding surface for AR/VR experiences.",
"slug": "programming-languages-for-xr",
"source": "db"
},
"dimension_id": 97,
"input_skill": "Python",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "AR/VR Engineer",
"id": 8,
"rationale": null,
"role_archetype": null,
"slug": "ar-vr-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 5,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Python Programming",
"id": 290,
"rationale": "Core Python language skills used to implement backend business logic, request handlers, integrations, and service internals. This is the primary coding surface for the role.",
"slug": "python-programming",
"source": "db"
},
"dimension_id": 290,
"input_skill": "Python",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "Python Backend Developer",
"id": 80,
"rationale": null,
"role_archetype": "Engineering",
"slug": "python-backend-developer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 5,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "BI and Visualization Tools",
"id": 31,
"rationale": "Tools used to expose curated data to analysts and business users through dashboards, reports, and semantic exploration. Data engineers support these tools by shaping reliable datasets and performant models.",
"slug": "bi-and-visualization-tools",
"source": "db"
},
"dimension_id": 31,
"input_skill": "Tableau",
"llm_role": null,
"matched_chosen_role": true,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
"role_dimension_saved": true,
"roles_from_db": [
{
"display_name": "Data Engineer",
"id": 2,
"rationale": null,
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 150,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "AI Governance and Model Security",
"id": 50,
"rationale": "Controls and documentation used to make models safer, auditable, and compliant. ML engineers use this to manage model risk, supply chain integrity, and governance requirements.",
"slug": "ai-governance-and-model-security",
"source": "db"
},
"dimension_id": 50,
"input_skill": "Machine Learning",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "AI Engineer",
"id": 13,
"rationale": null,
"role_archetype": null,
"slug": "ai-engineer",
"source": "db"
},
{
"display_name": "ML Engineer",
"id": 3,
"rationale": null,
"role_archetype": null,
"slug": "ml-engineer",
"source": "db"
},
{
"display_name": "MLOps Engineer",
"id": 16,
"rationale": null,
"role_archetype": null,
"slug": "ml-ops-engineer",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 1356,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "React Frontend Development",
"id": 96,
"rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
"slug": "d_init_01",
"source": "db"
},
"dimension_id": 96,
"input_skill": "Machine Learning",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [],
"skill_dimension_saved": true,
"skill_id": 1356,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "React Frontend Development",
"id": 96,
"rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
"slug": "d_init_01",
"source": "db"
},
"dimension_id": 96,
"input_skill": "Event-Driven Architecture",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [],
"skill_dimension_saved": true,
"skill_id": 1360,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "Cloud Storage and Data Services",
"id": 144,
"rationale": "Cloud-native storage and managed data services used to place workloads, choose durability tiers, and define platform boundaries. This is a coherent cluster because architects evaluate storage fit, access patterns, and managed service tradeoffs.",
"slug": "cloud-storage-and-data-services",
"source": "db"
},
"dimension_id": 144,
"input_skill": "Data Lakes",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [
{
"display_name": "Cloud Architect",
"id": 9,
"rationale": null,
"role_archetype": null,
"slug": "cloud-architect",
"source": "db"
}
],
"skill_dimension_saved": true,
"skill_id": 1358,
"skill_tag": "in_db",
"skipped_reason": null
},
{
"chosen_role_id": 2,
"dimension": {
"difficulty_hint": "well_known",
"display_name": "React Frontend Development",
"id": 96,
"rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
"slug": "d_init_01",
"source": "db"
},
"dimension_id": 96,
"input_skill": "Data Lakes",
"llm_role": null,
"matched_chosen_role": false,
"outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
"role_dimension_saved": false,
"roles_from_db": [],
"skill_dimension_saved": true,
"skill_id": 1358,
"skill_tag": "in_db",
"skipped_reason": null
}
],
"new_skills_created": 0,
"role_dimension_saved": 0,
"skill_dimension_saved": 0,
"skipped": 0
},
"planner_output": null,
"run_id": "c90f7764-3162-4ae8-9e8a-6579f118a3a8"
}
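The summary counters at the end of `persistence` all read 0 in this capture, even though several entries in `items` report `role_dimension_saved` or `skill_dimension_saved` as true. Whether that is intended is not clear from the page. A minimal sketch for recomputing the tallies from `items` during review, assuming the field names shown above; `summarize_persistence` and the sample payload are hypothetical:

```python
from typing import Any


def summarize_persistence(persistence: dict[str, Any]) -> dict[str, int]:
    """Recount per-item flags so they can be compared with the summary fields.

    Assumption: "skipped" corresponds to items with a non-null
    "skipped_reason"; the capture does not confirm this definition.
    """
    items = persistence.get("items", [])
    return {
        "role_dimension_saved": sum(1 for it in items if it.get("role_dimension_saved")),
        "skill_dimension_saved": sum(1 for it in items if it.get("skill_dimension_saved")),
        "skipped": sum(1 for it in items if it.get("skipped_reason") is not None),
    }


# Illustrative payload trimmed to the fields the sketch reads.
sample = {
    "items": [
        {"input_skill": "Spark", "role_dimension_saved": True,
         "skill_dimension_saved": True, "skipped_reason": None},
        {"input_skill": "Python", "role_dimension_saved": False,
         "skill_dimension_saved": True, "skipped_reason": None},
    ],
    "role_dimension_saved": 0,
    "skill_dimension_saved": 0,
    "skipped": 0,
}

recount = summarize_persistence(sample)
# Report any counter whose reported value differs from the recount.
mismatches = {k: (sample[k], v) for k, v in recount.items() if sample[k] != v}
print(recount)
print(mismatches)
```

Each mismatch maps a counter name to a `(reported, recounted)` pair, which makes a discrepancy like the one in this run easy to spot.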
LLM Calls
Every model call made for this run, in pipeline order.