Pipeline run
ac2115f3-bb2c-4372-9c29-23edaa5ba3fc
Client output enrichment
v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description · vocab breakdown (legacy)
Signals
Post-classification
Captured for admin review
1 POST /skills/extract-from-jd
2 POST /skills/extract-details
3 POST /skills/final-role-output
Data Engineer
CASE A · slug: data-engineer · id: 2 · source: db
Exact alias hit on data-engineer (1.0) — no other alias at this confidence; skill_top absent does not contradict
Resolution: in_db. The role exists in the library; skill↔dim and role↔dim links are saved when applicable.
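The Case A rationale above (one exact alias hit, no other alias at that confidence, no contradicting skill signal) can be sketched as a small predicate. This is an illustrative reading of the rationale string, not the pipeline's actual code: the function name `resolve_case_a`, the assumption that `skill_match_roles` is sorted best-first, and the definition of "contradict" as "a different slug" are all hypothetical.

```python
from typing import Optional


def resolve_case_a(alias_match_roles: list[dict],
                   skill_match_roles: list[dict],
                   exact: float = 1.0) -> Optional[dict]:
    """Return the chosen role if the signals look like Case A, else None.

    Hypothetical helper: field names follow stage3_signals in this run,
    the decision rule itself is an assumption.
    """
    exact_hits = [r for r in alias_match_roles if r["score"] >= exact]
    if len(exact_hits) != 1:
        # No exact alias, or several aliases tie at this confidence.
        return None
    chosen = exact_hits[0]
    # Assumed: the list is best-first; an empty list means skill_top is absent.
    skill_top = skill_match_roles[0] if skill_match_roles else None
    if skill_top is not None and skill_top["slug"] != chosen["slug"]:
        # A skill-based top role that disagrees would contradict the alias hit.
        return None
    return chosen


# Signals as reported for this run: one alias at 1.0, no skill matches.
alias_hits = [{"slug": "data-engineer", "role_id": 2, "score": 1.0}]
role = resolve_case_a(alias_hits, skill_match_roles=[])
print(role["slug"] if role else "not Case A")
```

With this run's signals the predicate selects `data-engineer`; a second alias at 1.0 or a disagreeing skill-based top role would make it return `None`.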
Job description
About Uber
Uber is a technology company that is changing the way the world thinks about transportation. We are building technology people use every day. Whether it's heading home from work, getting a meal delivered from a favorite restaurant, or a way to earn extra income, Uber is becoming part of the fabric of daily life.
We're making cities safer, smarter, and more connected. And we're doing it at a global scale-energizing the local economy and bringing opportunity to millions of people around the world. We ignite opportunity by setting the world in motion!
Uber's positive impact is tangible in the communities we operate in, and that drives us to keep moving forward.
About The Team
The Data Engineering team at Hyderabad collaborates with Engineering, Product, and Analytics teams across tech sites to collectively accomplish OKRs to take Uber forward. We constantly enrich our data layer to optimally deal with the next generation of products which are a result of our big bold bets. We design and build data pipelines to schedule & orchestrate a variety of tasks such as extract, cleanse, transform, enrich, and load data as per the business needs. We serve data insights at depth to various teams at Uber-like business analytics, data analytics, data science, and other business partners to make strategic decisions, train DS models, perform health checks, etc.
What We're Looking For
We are seeking a strong and passionate data engineer with experience in large-scale system implementation, with a focus on complex data pipelines. The candidate should be able to design and drive large projects from inception to production. The right person will work with cross-functional businesses', and technology partners to gather requirements and translate them into a data engineering roadmap. Must be a great communicator, standout teammate, and a technology powerhouse.
What You'll Do
- Collaborate with engineering/product/analyst teams across tech sites to collectively accomplish OKRs to take Uber forward
- Enrich data layers to effectively deal with the next generation of products which are a result of Uber's big bold bets
- Design and build data pipelines to schedule & orchestrate a variety of tasks such as extract, cleanse, transform, enrich & load data as per the business needs
What You'll Need
- Strong SQL skills
- Strong in Data Warehousing and Data Modelling concepts
- Hands-on experience in Hadoop tech stack: HDFS, Hive, Oozie, Airflow, MapReduce, Spark.
- Programming languages - Python, Java, Scala, etc.
- Experience in building ETL Data Pipelines
- Performance Troubleshooting and Tuning
Skills from this JD
Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.
Skill enrichment (orchestrator / LLM): Data Pipelines
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Data Engineering Tools
- Sub-category: general
- Skill nature: CONCEPT
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM): ETL
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Data Engineering Tools
- Sub-category: general
- Skill nature: CONCEPT
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM): Data Engineering
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Data Engineering Tools
- Sub-category: general
- Skill nature: CONCEPT
- Volatility: STABLE
- Typical lifespan: EVERGREEN
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM): Large-scale Systems
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Concepts
- Sub-category: general
- Skill nature: CONCEPT
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
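The "Skills from this JD" rows are described as a merge of the API 1, API 2 and API 3 responses. A minimal sketch of such a join, keyed on skill name, might look like the following. The function `merge_skill_rows` and the output field names `has_enrichment` and `tag` are hypothetical; the input field names are taken from the responses captured for this run, trimmed to two of the four skills.

```python
def merge_skill_rows(api1: dict, api2: dict, api3: dict) -> list[dict]:
    """Join the three responses on skill name, one row per API 2 skill."""
    primary = {s["skill_name"]: s["is_primary"] for s in api1["final_skills"]}
    tags = {s["skill"]: s["tag"] for s in api3["final_input_skills"]}
    rows = []
    for detail in api2["skills_detail"]:
        name = detail["input_skill"]
        meta = detail.get("new_skill_meta") or {}
        rows.append({
            "skill": name,
            "is_primary": primary.get(name),          # from API 1
            "canonical": detail.get("canonical"),     # from API 2 library match
            "derived": meta.get("derived") or {},     # from API 2 derived metadata
            "has_enrichment": meta.get("enrichment") is not None,
            "tag": tags.get(name),                    # from API 3
        })
    return rows


# Values trimmed from this run.
api1 = {"final_skills": [
    {"is_primary": True, "skill_name": "ETL"},
    {"is_primary": False, "skill_name": "Large-scale Systems"},
]}
api2 = {"skills_detail": [
    {"input_skill": "ETL", "canonical": None,
     "new_skill_meta": {"derived": {"category": "Data Engineering Tools"},
                        "enrichment": None}},
    {"input_skill": "Large-scale Systems", "canonical": None,
     "new_skill_meta": {"derived": {"category": "Concepts"},
                        "enrichment": None}},
]}
api3 = {"final_input_skills": [
    {"skill": "ETL", "tag": "new"},
    {"skill": "Large-scale Systems", "tag": "new"},
]}

for row in merge_skill_rows(api1, api2, api3):
    print(row)
```

Joining on the skill name works here because all three responses list the same four names; a real implementation would likely need a stable key such as `skill_id`.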
Library artifacts (this run)
| Kind | Detail | DB id |
|---|---|---|
| canonical_skill_proposed | Data Pipelines \| type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR | |
| canonical_skill_proposed | ETL \| type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=MULTI_YEAR | |
| canonical_skill_proposed | Data Engineering \| type=Data Engineering Tools subtype=general nature=CONCEPT lifespan=EVERGREEN | |
| canonical_skill_proposed | Large-scale Systems \| type=Concepts subtype=general nature=CONCEPT lifespan=MULTI_YEAR | |
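Each Detail cell packs a skill name and a run of `key=value` attributes whose values may contain spaces (for example `type=Data Engineering Tools`). The sketch below shows one way to split such a cell back into fields. `parse_detail` is a hypothetical helper, and the assumption that every attribute key is a single word followed by `=` comes only from the four rows in this run.

```python
import re

# A value runs until the next " key=" or the end of the string.
_PAIR = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def parse_detail(detail: str) -> dict:
    """Split 'Name | k1=v1 k2=v2 ...' into a flat dict (hypothetical helper)."""
    name, _, attrs = detail.partition("|")
    parsed = {"skill": name.strip()}
    for key, value in _PAIR.findall(attrs.strip()):
        parsed[key] = value.strip()
    return parsed


cell = ("Data Pipelines | type=Data Engineering Tools subtype=general "
        "nature=CONCEPT lifespan=MULTI_YEAR")
print(parse_detail(cell))
```

The lazy match with a lookahead keeps multi-word values such as `Data Engineering Tools` intact instead of cutting them at the first space.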
nano JD Parser — gpt-4.1-nano
{
"JD_type": "pass",
"about_company": {
"source_marker": {
"first_5_words": "Uber is a technology company",
"last_5_words": "drives us to keep moving forward."
},
"text": "Uber is a technology company that is changing the way the world thinks about transportation. We are building technology people use every day. Whether it\u0027s heading home from work, getting a meal delivered from a favorite restaurant, or a way to earn extra income, Uber is becoming part of the fabric of daily life.\n\nWe\u0027re making cities safer, smarter, and more connected. And we\u0027re doing it at a global scale-energizing the local economy and bringing opportunity to millions of people around the world. We ignite opportunity by setting the world in motion!\n\nUber\u0027s positive impact is tangible in the communities we operate in, and that drives us to keep moving forward.",
"word_count": 104
},
"certifications": [],
"company_name": "Uber",
"ctc": null,
"domain": {
"primary": {
"aliases": [
"Tech Services",
"Technology"
],
"domain": "IT Services \u0026 Consulting"
},
"secondary": null
},
"education": [],
"experience": {
"max": null,
"min": null,
"raw": null
},
"job_locations": [
{
"aliases": [
"Hyderabad, TG"
],
"city": "Hyderabad",
"country": "India",
"state": "Telangana",
"work_mode": null
}
],
"role": "Data Engineer",
"role_aliases": [
"Data Engineer",
"Data Pipeline Engineer",
"ETL Engineer"
],
"role_archetype": "Data",
"roles_and_responsibilities": [
{
"bullet_count": 0,
"heading": "What You\u0027ll Do",
"heading_was_present": true,
"source_marker": {
"first_5_words": "Collaborate with engineering/product/analyst teams",
"last_5_words": "as per the business needs"
},
"text": "Collaborate with engineering/product/analyst teams across tech sites to collectively accomplish OKRs to take Uber forward\nEnrich data layers to effectively deal with the next generation of products which are a result of Uber\u0027s big bold bets\nDesign and build data pipelines to schedule \u0026 orchestrate a variety of tasks such as extract, cleanse, transform, enrich \u0026 load data as per the business needs",
"word_count": 50
},
{
"bullet_count": 0,
"heading": "What We\u0027re Looking For",
"heading_was_present": true,
"source_marker": {
"first_5_words": "We are seeking a strong",
"last_5_words": "and a technology powerhouse."
},
"text": "We are seeking a strong and passionate data engineer with experience in large-scale system implementation, with a focus on complex data pipelines. The candidate should be able to design and drive large projects from inception to production. The right person will work with cross-functional businesses\u0027, and technology partners to gather requirements and translate them into a data engineering roadmap. Must be a great communicator, standout teammate, and a technology powerhouse.",
"word_count": 63
}
],
"urls": []
}
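Each parsed section in the JSON above carries a `source_marker` with `first_5_words` and `last_5_words` next to its `text`. Assuming the markers are meant to anchor the section back to the start and end of the extracted text (the output does not state this), a consistency check could be sketched as follows. `marker_matches` is a hypothetical name.

```python
def marker_matches(section: dict) -> bool:
    """Check that a section's text starts and ends with its source markers.

    Whitespace is collapsed first so that newlines inside the text do not
    matter. Assumption: markers are literal prefixes and suffixes of the text.
    """
    text = " ".join(section["text"].split())
    marker = section["source_marker"]
    first = " ".join(marker["first_5_words"].split())
    last = " ".join(marker["last_5_words"].split())
    return text.startswith(first) and text.endswith(last)


# The "What We're Looking For" section as returned in this run.
looking_for = {
    "source_marker": {
        "first_5_words": "We are seeking a strong",
        "last_5_words": "and a technology powerhouse.",
    },
    "text": "We are seeking a strong and passionate data engineer with "
            "experience in large-scale system implementation, with a focus on "
            "complex data pipelines. The candidate should be able to design "
            "and drive large projects from inception to production. The right "
            "person will work with cross-functional businesses', and "
            "technology partners to gather requirements and translate them "
            "into a data engineering roadmap. Must be a great communicator, "
            "standout teammate, and a technology powerhouse.",
}
print(marker_matches(looking_for))
```

The check compares prefixes and suffixes rather than counting tokens because the `first_5_words` marker for "What You'll Do" in this run contains only four whitespace-separated tokens.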
API 1 — extract-from-jd
{
"final_skills": [
{
"is_primary": true,
"skill_name": "Data Pipelines"
},
{
"is_primary": true,
"skill_name": "ETL"
},
{
"is_primary": true,
"skill_name": "Data Engineering"
},
{
"is_primary": false,
"skill_name": "Large-scale Systems"
}
],
"jd_role": {
"display_name": "Data Engineer",
"rationale": null,
"role_aliases": [
"Data Engineer",
"Data Pipeline Engineer",
"ETL Engineer"
],
"role_archetype": "Data",
"slug": ""
},
"nano_parsed": {
"JD_type": "pass",
"about_company": {
"source_marker": {
"first_5_words": "Uber is a technology company",
"last_5_words": "drives us to keep moving forward."
},
"text": "Uber is a technology company that is changing the way the world thinks about transportation. We are building technology people use every day. Whether it\u0027s heading home from work, getting a meal delivered from a favorite restaurant, or a way to earn extra income, Uber is becoming part of the fabric of daily life.\n\nWe\u0027re making cities safer, smarter, and more connected. And we\u0027re doing it at a global scale-energizing the local economy and bringing opportunity to millions of people around the world. We ignite opportunity by setting the world in motion!\n\nUber\u0027s positive impact is tangible in the communities we operate in, and that drives us to keep moving forward.",
"word_count": 104
},
"certifications": [],
"company_name": "Uber",
"ctc": null,
"domain": {
"primary": {
"aliases": [
"Tech Services",
"Technology"
],
"domain": "IT Services \u0026 Consulting"
},
"secondary": null
},
"education": [],
"experience": {
"max": null,
"min": null,
"raw": null
},
"job_locations": [
{
"aliases": [
"Hyderabad, TG"
],
"city": "Hyderabad",
"country": "India",
"state": "Telangana",
"work_mode": null
}
],
"role": "Data Engineer",
"role_aliases": [
"Data Engineer",
"Data Pipeline Engineer",
"ETL Engineer"
],
"role_archetype": "Data",
"roles_and_responsibilities": [
{
"bullet_count": 0,
"heading": "What You\u0027ll Do",
"heading_was_present": true,
"source_marker": {
"first_5_words": "Collaborate with engineering/product/analyst teams",
"last_5_words": "as per the business needs"
},
"text": "Collaborate with engineering/product/analyst teams across tech sites to collectively accomplish OKRs to take Uber forward\nEnrich data layers to effectively deal with the next generation of products which are a result of Uber\u0027s big bold bets\nDesign and build data pipelines to schedule \u0026 orchestrate a variety of tasks such as extract, cleanse, transform, enrich \u0026 load data as per the business needs",
"word_count": 50
},
{
"bullet_count": 0,
"heading": "What We\u0027re Looking For",
"heading_was_present": true,
"source_marker": {
"first_5_words": "We are seeking a strong",
"last_5_words": "and a technology powerhouse."
},
"text": "We are seeking a strong and passionate data engineer with experience in large-scale system implementation, with a focus on complex data pipelines. The candidate should be able to design and drive large projects from inception to production. The right person will work with cross-functional businesses\u0027, and technology partners to gather requirements and translate them into a data engineering roadmap. Must be a great communicator, standout teammate, and a technology powerhouse.",
"word_count": 63
}
],
"urls": []
},
"rejected": false,
"rejection_reason": null,
"run_id": "ac2115f3-bb2c-4372-9c29-23edaa5ba3fc",
"stage3_signals": {
"alias_found": true,
"alias_match_roles": [
{
"display_name": "Data Engineer",
"kra_matches": null,
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 1.0,
"slug": "data-engineer",
"total_count": null
}
],
"kra_match_roles": [
{
"display_name": "Data Engineer",
"kra_matches": [
{
"kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
"sentence": "Design and build data pipelines to schedule \u0026 orchestrate a variety of tasks such as extract, cleanse, transform, enrich \u0026 load data as per the business needs",
"similarity": 0.6498
},
{
"kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
"sentence": "The right person will work with cross-functional businesses\u0027, and technology partners to gather requirements and translate them into a data engineering roadmap.",
"similarity": 0.5977
},
{
"kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
"sentence": "We are seeking a strong and passionate data engineer with experience in large-scale system implementation, with a focus on complex data pipelines.",
"similarity": 0.5069
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 0.5848,
"slug": "data-engineer",
"total_count": null
},
{
"display_name": "Flutter Developer",
"kra_matches": [
{
"kra_text": "collaborate with design, product, and backend teams",
"sentence": "Collaborate with engineering/product/analyst teams across tech sites to collectively accomplish OKRs to take Uber forward",
"similarity": 0.5705
},
{
"kra_text": "collaborate with design, product, and backend teams",
"sentence": "The right person will work with cross-functional businesses\u0027, and technology partners to gather requirements and translate them into a data engineering roadmap.",
"similarity": 0.4651
},
{
"kra_text": "collaborate with design, product, and backend teams",
"sentence": "The candidate should be able to design and drive large projects from inception to production.",
"similarity": 0.4152
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 74,
"score": 0.4836,
"slug": "flutter-developer",
"total_count": null
},
{
"display_name": "ML Engineer",
"kra_matches": [
{
"kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
"sentence": "Design and build data pipelines to schedule \u0026 orchestrate a variety of tasks such as extract, cleanse, transform, enrich \u0026 load data as per the business needs",
"similarity": 0.5329
},
{
"kra_text": "Translates product requirements into machine learning system specifications including feature definitions, model architecture choices, and success metric definitions.",
"sentence": "The right person will work with cross-functional businesses\u0027, and technology partners to gather requirements and translate them into a data engineering roadmap.",
"similarity": 0.4814
},
{
"kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
"sentence": "We are seeking a strong and passionate data engineer with experience in large-scale system implementation, with a focus on complex data pipelines.",
"similarity": 0.3974
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 3,
"score": 0.4705,
"slug": "ml-engineer",
"total_count": null
},
{
"display_name": "Fullstack Developer",
"kra_matches": [
{
"kra_text": "Works closely with product managers and UX designers to translate requirements and wireframes into working software features through iterative development.",
"sentence": "The right person will work with cross-functional businesses\u0027, and technology partners to gather requirements and translate them into a data engineering roadmap.",
"similarity": 0.4867
},
{
"kra_text": "Designs and queries relational databases like PostgreSQL and document stores like MongoDB, writing migrations, indexes, and optimized queries.",
"sentence": "Design and build data pipelines to schedule \u0026 orchestrate a variety of tasks such as extract, cleanse, transform, enrich \u0026 load data as per the business needs",
"similarity": 0.4456
},
{
"kra_text": "Works closely with product managers and UX designers to translate requirements and wireframes into working software features through iterative development.",
"sentence": "Collaborate with engineering/product/analyst teams across tech sites to collectively accomplish OKRs to take Uber forward",
"similarity": 0.4367
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 15,
"score": 0.4563,
"slug": "full-stack-engineer",
"total_count": null
},
{
"display_name": "Cloud Architect",
"kra_matches": [
{
"kra_text": "Conducts architecture reviews, approves technical design documents, and guides engineering teams through cloud migration and modernization projects.",
"sentence": "The right person will work with cross-functional businesses\u0027, and technology partners to gather requirements and translate them into a data engineering roadmap.",
"similarity": 0.4566
},
{
"kra_text": "Designs backup policies, cross-region replication, and disaster recovery runbooks to meet defined RTO and RPO targets for critical workloads.",
"sentence": "Design and build data pipelines to schedule \u0026 orchestrate a variety of tasks such as extract, cleanse, transform, enrich \u0026 load data as per the business needs",
"similarity": 0.4359
},
{
"kra_text": "Conducts architecture reviews, approves technical design documents, and guides engineering teams through cloud migration and modernization projects.",
"sentence": "The candidate should be able to design and drive large projects from inception to production.",
"similarity": 0.4354
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 9,
"score": 0.4426,
"slug": "cloud-architect",
"total_count": null
}
],
"skill_match_roles": []
},
"stage4_decision": {
"alias_collision_detected": false,
"case": "A",
"chosen_role": {
"display_name": "Data Engineer",
"kra_matches": null,
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 1.0,
"slug": "data-engineer",
"total_count": null
},
"confidence": 1.0,
"is_new_role": false,
"llm2_fired": false,
"llm2_reasoning": null,
"matched_dimensions": [],
"matched_kras": [],
"matched_skills": [],
"new_role_display_name": null,
"new_role_slug": null,
"queued": false,
"reasoning": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
"sub_role": null
},
"stage5_updates": {
"centroid_n_after": 422,
"centroid_updated": true,
"collision_log_id": null,
"new_kra_attached": null,
"new_skills_attached": [
{
"is_primary": true,
"queue_id": 19564,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "Data Pipelines",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 19565,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "ETL",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 19566,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "Data Engineering",
"status": "pending"
},
{
"is_primary": false,
"queue_id": 19567,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "Large-scale Systems",
"status": "pending"
}
],
"queue_entry_id": null,
"v3_pipeline_triggered": false,
"v3_role_slug": null,
"v3_run_id": null
}
}
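In `kra_match_roles` above, each role's `score` agrees, to within about 1e-4, with the arithmetic mean of its three listed `similarity` values. That is an observation from this run's numbers, not a documented rule; the small differences presumably come from the similarities being displayed rounded. A sketch of that reading, with the hypothetical helper `kra_role_score`:

```python
from statistics import mean


def kra_role_score(kra_matches: list[dict]) -> float:
    """Mean similarity over a role's listed KRA matches (assumed scoring rule)."""
    return mean(m["similarity"] for m in kra_matches)


# (similarities, reported score) copied from stage3_signals in this run.
reported = {
    "data-engineer": ([0.6498, 0.5977, 0.5069], 0.5848),
    "flutter-developer": ([0.5705, 0.4651, 0.4152], 0.4836),
    "ml-engineer": ([0.5329, 0.4814, 0.3974], 0.4705),
    "full-stack-engineer": ([0.4867, 0.4456, 0.4367], 0.4563),
    "cloud-architect": ([0.4566, 0.4359, 0.4354], 0.4426),
}

for slug, (sims, score) in reported.items():
    computed = kra_role_score([{"similarity": s} for s in sims])
    # Report whether the mean reproduces the logged score within tolerance.
    print(slug, f"{computed:.4f}", abs(computed - score) < 1e-4)
```

Note that the KRA ranking did not drive this run's decision: `stage4_decision` reports case A with `llm2_fired` false, so the alias hit was sufficient on its own.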
API 2 — extract-details
{
"alias_matches": [],
"candidate_roles": [],
"chosen_role": {
"display_name": "Data Engineer",
"id": 2,
"rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
"dimensions": [],
"input_final_skills": [
"Data Pipelines",
"ETL",
"Data Engineering",
"Large-scale Systems"
],
"input_llm_skills": [
"Data Pipelines",
"ETL",
"Data Engineering",
"Large-scale Systems"
],
"new_aliases_persisted": 0,
"run_id": "ac2115f3-bb2c-4372-9c29-23edaa5ba3fc",
"skills_detail": [
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Data Pipelines",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Data Engineering Tools",
"skill_nature": "CONCEPT",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "data-pipelines",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "ETL",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Data Engineering Tools",
"skill_nature": "CONCEPT",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "etl",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Data Engineering",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Data Engineering Tools",
"skill_nature": "CONCEPT",
"sub_category": "general",
"typical_lifespan": "EVERGREEN",
"version_strategy": "UNVERSIONED",
"volatility": "STABLE"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "data-engineering",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "Large-scale Systems",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Concepts",
"skill_nature": "CONCEPT",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "large-scale-systems",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
}
],
"unmatched_skills": [
"Data Pipelines",
"ETL",
"Data Engineering",
"Large-scale Systems"
]
}
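The `skill_id` values in the response above ("data-pipelines", "etl", "data-engineering", "large-scale-systems") look like lowercase, hyphen-separated forms of `input_skill`. The exact rule the pipeline uses is not shown, so the following is only a sketch that happens to reproduce these four ids; `skill_slug` is a hypothetical name.

```python
import re


def skill_slug(name: str) -> str:
    """Lowercase the name and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


# input_skill -> skill_id pairs as reported by API 2 in this run.
pairs = {
    "Data Pipelines": "data-pipelines",
    "ETL": "etl",
    "Data Engineering": "data-engineering",
    "Large-scale Systems": "large-scale-systems",
}
for name, expected in pairs.items():
    print(name, "->", skill_slug(name), skill_slug(name) == expected)
```

Names containing characters such as `+` or `#` would collapse to hyphens under this rule, which may or may not match how the library actually assigns ids.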
API 3 — final-role-output
{
"chosen_role": {
"display_name": "Data Engineer",
"id": 2,
"rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
"chosen_role_resolution": "in_db",
"final_input_skills": [
{
"skill": "Data Pipelines",
"tag": "new"
},
{
"skill": "ETL",
"tag": "new"
},
{
"skill": "Data Engineering",
"tag": "new"
},
{
"skill": "Large-scale Systems",
"tag": "new"
}
],
"llm_cost_api1_usd": null,
"llm_cost_api2_usd": null,
"llm_cost_api3_usd": null,
"llm_cost_total_usd": null,
"persistence": {
"items": [],
"new_skills_created": 0,
"role_dimension_saved": 0,
"skill_dimension_saved": 0,
"skipped": 0
},
"planner_output": null,
"run_id": "ac2115f3-bb2c-4372-9c29-23edaa5ba3fc"
}