Pipeline run
8ca0f5da-0de7-4fc5-85e1-fba71e558159
Client output enrichment
v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description · vocab breakdown (legacy)
Signals
Post-classification
Captured for admin review
1 POST /skills/extract-from-jd
2 POST /skills/extract-details
3 POST /skills/final-role-output
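The three calls above appear to run in sequence, each building on the previous response. A minimal sketch of that chaining follows, assuming a caller-supplied `post(path, payload)` transport. The request payload keys (`jd_text`, `run_id`, `final_skills`) are hypothetical and not confirmed by this page; only the endpoint paths and the response fields `final_skills`, `skill_name` and `run_id` come from the run shown here.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: takes (path, payload) and returns the decoded JSON response.
Post = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_pipeline(post: Post, jd_text: str) -> Dict[str, Any]:
    """Call the three endpoints in the order listed on this page."""
    # Step 1: extract skills and resolve the role from the raw JD text.
    api1 = post("/skills/extract-from-jd", {"jd_text": jd_text})  # payload key is assumed
    skills = [s["skill_name"] for s in api1["final_skills"]]

    # Step 2: match the extracted skills against the skill library.
    api2 = post(
        "/skills/extract-details",
        {"run_id": api1["run_id"], "final_skills": skills},  # payload shape is assumed
    )

    # Step 3: build the final role output for the same run.
    api3 = post("/skills/final-role-output", {"run_id": api1["run_id"]})  # assumed
    return {"api1": api1, "api2": api2, "api3": api3}
```

Passing the transport in as a function keeps the sketch independent of any particular HTTP client and makes it easy to replay a captured run.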
Data Engineer
CASE A · slug: data-engineer · id: 2 · source: db
Exact alias hit on data-engineer (1.0) — no other alias at this confidence; skill_top absent does not contradict
Resolution: in_db (role exists in library; skill↔dim and role↔dim links saved when applicable).
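The Case A rationale above ("exact alias hit ... no other alias at this confidence") reads like a uniqueness check on the top alias score. The sketch below is one way such a rule could be written; it is inferred from the rationale string only, and the function name and `threshold` parameter are hypothetical rather than the pipeline's actual code.

```python
from typing import Any, Dict, List, Optional


def resolve_case_a(
    alias_matches: List[Dict[str, Any]], threshold: float = 1.0
) -> Optional[Dict[str, Any]]:
    """Return the single alias match at or above the threshold, else None."""
    top = [m for m in alias_matches if m["score"] >= threshold]
    # Exactly one match at top confidence: accept it without further arbitration.
    return top[0] if len(top) == 1 else None


# Values taken from alias_match_roles in this run.
matches = [{"display_name": "Data Engineer", "role_id": 2, "score": 1.0, "slug": "data-engineer"}]
chosen = resolve_case_a(matches)
```

With two aliases at the same score the function returns `None`, which would correspond to a case other than A.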
Job description
Description
To transform data into a format that can be easily analyzed by developing, maintaining, and testing infrastructures for data generation. You will be working closely with data scientists and are largely in charge of architecting solutions for data scientists that enable them to do their jobs.
Grade: T5
Please note that the Job will close at 12am on Posting Close date, so please submit your application prior to the Close Date.
What Your Main Responsibilities Are
• Data Pipeline: Develop and maintain scalable data pipelines and builds out new API integrations to support continuing increases in data volume and complexity.
• Data Integration: Connect offline and online data to continuously improve overall understanding of customer behavior and journeys for personalization. Data pre-processing including collecting, parsing, managing, analyzing and visualizing large sets of data.
• Data Quality Management: Cleanse the data and improve data quality and readiness for analysis. Drive standards, define and implement/improve data governance strategies and enforce best practices to scale data analysis across platforms.
• Data Transformation: Processes data by cleansing data and transforming them to proper storage structure for the purpose of querying and analysis using ETL and ELT process.
• Data Enablement: Ensure data is accessible and usable to wider enterprise to enable a deeper and more timely understanding of operation.
What We Are Looking For
• Masters/Bachelors degree in Engineering/Computer Science/Math/Statistics or equivalent.
• Strong programming skills in Python/Pyspark/SAS.
• Proven experience with large data sets and related technologies Hadoop, Hive, Distributed computing systems, Spark optimization.
• Experience on cloud platforms (preferably Azure) and its services Azure Data Factory (ADF), ADLS Storage, Azure DevOps. Hands-on experience on Databricks, Delta Lake, Workflows.
• Should have knowledge of DevOps process and tools like Docker, CI/CD, Kubernetes, Terraform, Octopus.
• Hands-on experience with SQL and data modeling to support the organization's data storage and analysis needs.
• Experience on any BI tool like Power BI (Good to have).
• Cloud migration experience (Good to have).
• Cloud and Data Engineering certification (Good to have).
• Working in an Agile environment.
Our Company
FedEx was built on a philosophy that puts people first, one we take seriously. We are an equal opportunity/affirmative action employer and we are committed to a diverse, equitable, and inclusive workforce in which we enforce fair treatment, and provide growth opportunities for everyone. All qualified applicants will receive consideration for employment regardless of age, race, color, national origin, genetics, religion, gender, marital status, pregnancy (including childbirth or a related medical condition), physical or mental disability, or any other characteristic protected by applicable laws, regulations, and ordinances.
FedEx is one of the world's largest express transportation companies and has consistently been selected as one of the top 10 Worlds Most Admired Companies by "Fortune" magazine. Every day FedEx delivers for its customers with transportation and business solutions, serving more than 220 countries and territories around the globe. We can serve this global network due to our outstanding team of FedEx team members, who are tasked with making every FedEx experience outstanding.
Our Philosophy
The People-Service-Profit philosophy (P-S-P) describes the principles that govern every FedEx decision, policy, or activity. FedEx takes care of our people; they, in turn, deliver the impeccable service demanded by our customers, who reward us with the profitability necessary to secure our future. The essential element in making the People-Service-Profit philosophy such a positive force for the company is where we close the circle, and return these profits back into the business, and invest back in our people. Our success in the industry is attributed to our people. Through our P-S-P philosophy, we have a work environment that encourages team members to be innovative in delivering the highest possible quality of service to our customers. We care for their well-being, and value their contributions to the company.
Our Culture
Our culture is important for many reasons, and we intentionally bring it to life through our behaviors, actions, and activities in every part of the world. The FedEx culture and values have been a cornerstone of our success and growth since we began in the early 1970s. While other companies can copy our systems, infrastructure, and processes, our culture makes us unique and is often a differentiating factor as we compete and
Skills from this JD
Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.
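The row merge described above can be illustrated with a small join keyed on the skill name. This is a sketch under assumptions: the field names (`final_skills`, `skills_detail`, `input_skill`, `new_skill_meta.derived`, `final_input_skills`, `tag`) are taken from the JSON payloads on this page, while the function name and the choice of output columns are hypothetical.

```python
from typing import Any, Dict, List


def merge_skill_rows(
    api1: Dict[str, Any], api2: Dict[str, Any], api3: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Join the three API responses into one row per extracted skill."""
    details = {d["input_skill"]: d for d in api2["skills_detail"]}
    tags = {t["skill"]: t["tag"] for t in api3["final_input_skills"]}
    rows = []
    for s in api1["final_skills"]:
        name = s["skill_name"]
        d = details.get(name, {})
        # new_skill_meta or derived may be null for skills already in the library.
        derived = (d.get("new_skill_meta") or {}).get("derived") or {}
        rows.append(
            {
                "skill": name,
                "is_primary": s["is_primary"],      # from API 1
                "canonical": d.get("canonical"),     # from API 2
                "category": derived.get("category"),  # from API 2
                "tag": tags.get(name),               # from API 3
            }
        )
    return rows
```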
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Data Engineering Tools
- Sub-category: general
- Skill nature: PRACTICE
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Skill enrichment (orchestrator / LLM)
No Stage 7 enrichment blob on this skill (orchestrator skipped enrichment).
- Category: Data Engineering Tools
- Sub-category: general
- Skill nature: PRACTICE
- Volatility: MEDIUM
- Typical lifespan: MULTI_YEAR
- Version strategy: UNVERSIONED
Library artifacts (this run)
| Kind | Skill | Detail | DB id |
|---|---|---|---|
| canonical_skill_proposed | ETL | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR | |
| canonical_skill_proposed | ELT | type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR | |
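The `type=... subtype=...` strings in the table pack several key/value pairs into one cell, and some values contain spaces. A minimal parsing sketch follows, assuming every key is a single word followed by `=`; the helper name is hypothetical and this is not necessarily how the dashboard itself reads the field.

```python
import re
from typing import Dict

# A value runs lazily until the next " key=" token or the end of the string.
_PAIR = re.compile(r"(\w+)=(.*?)(?=\s+\w+=|$)")


def parse_detail(detail: str) -> Dict[str, str]:
    """Split a 'k1=v1 k2=v2' string into a dict, allowing spaces inside values."""
    return {key: value.strip() for key, value in _PAIR.findall(detail)}


parsed = parse_detail(
    "type=Data Engineering Tools subtype=general nature=PRACTICE lifespan=MULTI_YEAR"
)
```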
nano JD Parser — gpt-4.1-nano
Certifications
{
"JD_type": "pass",
"about_company": {
"source_marker": {
"first_5_words": "FedEx was built on a",
"last_5_words": "laws, regulations, and ordinances."
},
"text": "FedEx was built on a philosophy that puts people first, one we take seriously. We are an equal opportunity/affirmative action employer and we are committed to a diverse, equitable, and inclusive workforce in which we enforce fair treatment, and provide growth opportunities for everyone. All qualified applicants will receive consideration for employment regardless of age, race, color, national origin, genetics, religion, gender, marital status, pregnancy (including childbirth or a related medical condition), physical or mental disability, or any other characteristic protected by applicable laws, regulations, and ordinances.",
"word_count": 84
},
"certifications": [
"Cloud and Data Engineering certification"
],
"company_name": "FedEx",
"ctc": null,
"domain": {
"primary": {
"aliases": [
"Transportation",
"Shipping"
],
"domain": "Logistics \u0026 Supply Chain"
},
"secondary": null
},
"education": [
{
"level": "Bachelor\u0027s",
"qualification": "BTECH/BE/MSc - Engineering/Computer Science/Math/Statistics (or equivalent)",
"raw": "Masters/Bachelors degree in Engineering/Computer Science/Math/Statistics or equivalent.",
"requirement": "required"
}
],
"experience": {
"max": null,
"min": null,
"raw": null
},
"job_locations": [],
"role": "Data Engineer",
"role_aliases": [
"Data Developer",
"Data Pipeline Engineer",
"Data Integration Engineer"
],
"role_archetype": "Data",
"roles_and_responsibilities": [
{
"bullet_count": 5,
"heading": "What Your Main Responsibilities Are",
"heading_was_present": true,
"source_marker": {
"first_5_words": "\u2022 Data Pipeline: Develop and",
"last_5_words": "understanding of operation."
},
"text": "\u2022 Data Pipeline: Develop and maintain scalable data pipelines and builds out new API integrations to support continuing increases in data volume and complexity. \n\u2022 Data Integration: Connect offline and online data to continuously improve overall understanding of customer behavior and journeys for personalization. Data pre-processing including collecting, parsing, managing, analyzing and visualizing large sets of data. \n\u2022 Data Quality Management: Cleanse the data and improve data quality and readiness for analysis. Drive standards, define and implement/improve data governance strategies and enforce best practices to scale data analysis across platforms. \n\u2022 Data Transformation: Processes data by cleansing data and transforming them to proper storage structure for the purpose of querying and analysis using ETL and ELT process. \n\u2022 Data Enablement: Ensure data is accessible and usable to wider enterprise to enable a deeper and more timely understanding of operation.",
"word_count": 134
}
],
"urls": []
}
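Each `source_marker` in the parser output gives the opening and closing words of a section, which suggests the section can be located in the raw JD by those anchors. The sketch below shows one way to do that. It matches by substring after whitespace normalization rather than by counting tokens, because the markers in this run are not always exactly five words (for example `"understanding of operation."`). The function names are hypothetical.

```python
import re
from typing import Optional


def _norm(text: str) -> str:
    """Collapse all whitespace runs (including newlines) to single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def locate_span(jd_text: str, first_words: str, last_words: str) -> Optional[str]:
    """Return the JD substring from first_words through last_words, or None."""
    jd = _norm(jd_text)
    start = jd.find(_norm(first_words))
    if start == -1:
        return None
    last = _norm(last_words)
    # Use the first occurrence of the closing marker after the opening one.
    end = jd.find(last, start)
    if end == -1:
        return None
    return jd[start : end + len(last)]
```

Taking the first closing match after the opening marker keeps the span as short as possible; a section whose closing words also occur earlier inside it would be cut short.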
API 1 — extract-from-jd
{
"final_skills": [
{
"is_primary": true,
"skill_name": "ETL"
},
{
"is_primary": true,
"skill_name": "ELT"
}
],
"jd_role": {
"display_name": "Data Engineer",
"rationale": null,
"role_aliases": [
"Data Developer",
"Data Pipeline Engineer",
"Data Integration Engineer"
],
"role_archetype": "Data",
"slug": ""
},
"nano_parsed": {
"JD_type": "pass",
"about_company": {
"source_marker": {
"first_5_words": "FedEx was built on a",
"last_5_words": "laws, regulations, and ordinances."
},
"text": "FedEx was built on a philosophy that puts people first, one we take seriously. We are an equal opportunity/affirmative action employer and we are committed to a diverse, equitable, and inclusive workforce in which we enforce fair treatment, and provide growth opportunities for everyone. All qualified applicants will receive consideration for employment regardless of age, race, color, national origin, genetics, religion, gender, marital status, pregnancy (including childbirth or a related medical condition), physical or mental disability, or any other characteristic protected by applicable laws, regulations, and ordinances.",
"word_count": 84
},
"certifications": [
"Cloud and Data Engineering certification"
],
"company_name": "FedEx",
"ctc": null,
"domain": {
"primary": {
"aliases": [
"Transportation",
"Shipping"
],
"domain": "Logistics \u0026 Supply Chain"
},
"secondary": null
},
"education": [
{
"level": "Bachelor\u0027s",
"qualification": "BTECH/BE/MSc - Engineering/Computer Science/Math/Statistics (or equivalent)",
"raw": "Masters/Bachelors degree in Engineering/Computer Science/Math/Statistics or equivalent.",
"requirement": "required"
}
],
"experience": {
"max": null,
"min": null,
"raw": null
},
"job_locations": [],
"role": "Data Engineer",
"role_aliases": [
"Data Developer",
"Data Pipeline Engineer",
"Data Integration Engineer"
],
"role_archetype": "Data",
"roles_and_responsibilities": [
{
"bullet_count": 5,
"heading": "What Your Main Responsibilities Are",
"heading_was_present": true,
"source_marker": {
"first_5_words": "\u2022 Data Pipeline: Develop and",
"last_5_words": "understanding of operation."
},
"text": "\u2022 Data Pipeline: Develop and maintain scalable data pipelines and builds out new API integrations to support continuing increases in data volume and complexity. \n\u2022 Data Integration: Connect offline and online data to continuously improve overall understanding of customer behavior and journeys for personalization. Data pre-processing including collecting, parsing, managing, analyzing and visualizing large sets of data. \n\u2022 Data Quality Management: Cleanse the data and improve data quality and readiness for analysis. Drive standards, define and implement/improve data governance strategies and enforce best practices to scale data analysis across platforms. \n\u2022 Data Transformation: Processes data by cleansing data and transforming them to proper storage structure for the purpose of querying and analysis using ETL and ELT process. \n\u2022 Data Enablement: Ensure data is accessible and usable to wider enterprise to enable a deeper and more timely understanding of operation.",
"word_count": 134
}
],
"urls": []
},
"rejected": false,
"rejection_reason": null,
"run_id": "8ca0f5da-0de7-4fc5-85e1-fba71e558159",
"stage3_signals": {
"alias_found": true,
"alias_match_roles": [
{
"display_name": "Data Engineer",
"kra_matches": null,
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 1.0,
"slug": "data-engineer",
"total_count": null
}
],
"kra_match_roles": [
{
"display_name": "Data Engineer",
"kra_matches": [
{
"kra_text": "Implements data transformation, cleansing, deduplication, and enrichment logic to convert raw source data into analytics-ready curated datasets.",
"sentence": "Data Transformation: Processes data by cleansing data and transforming them to proper storage structure for the purpose of querying and analysis using ETL and ELT process.",
"similarity": 0.6929
},
{
"kra_text": "Builds data ingestion pipelines to collect data from transactional databases, third-party APIs, event streams, and file sources into centralized data platforms.",
"sentence": "Data Pipeline: Develop and maintain scalable data pipelines and builds out new API integrations to support continuing increases in data volume and complexity.",
"similarity": 0.6592
},
{
"kra_text": "Implements data transformation, cleansing, deduplication, and enrichment logic to convert raw source data into analytics-ready curated datasets.",
"sentence": "Data Quality Management: Cleanse the data and improve data quality and readiness for analysis.",
"similarity": 0.632
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 0.6614,
"slug": "data-engineer",
"total_count": null
},
{
"display_name": "ML Engineer",
"kra_matches": [
{
"kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
"sentence": "Data Pipeline: Develop and maintain scalable data pipelines and builds out new API integrations to support continuing increases in data volume and complexity.",
"similarity": 0.5143
},
{
"kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
"sentence": "Data pre-processing including collecting, parsing, managing, analyzing and visualizing large sets of data.",
"similarity": 0.5143
},
{
"kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
"sentence": "Data Transformation: Processes data by cleansing data and transforming them to proper storage structure for the purpose of querying and analysis using ETL and ELT process.",
"similarity": 0.5019
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 3,
"score": 0.5102,
"slug": "ml-engineer",
"total_count": null
},
{
"display_name": "Svelte Frontend Developer",
"kra_matches": [
{
"kra_text": "backend data integration",
"sentence": "Data Integration: Connect offline and online data to continuously improve overall understanding of customer behavior and journeys for personalization.",
"similarity": 0.5183
},
{
"kra_text": "backend data integration",
"sentence": "Data Pipeline: Develop and maintain scalable data pipelines and builds out new API integrations to support continuing increases in data volume and complexity.",
"similarity": 0.4727
},
{
"kra_text": "backend data integration",
"sentence": "Drive standards, define and implement/improve data governance strategies and enforce best practices to scale data analysis across platforms.",
"similarity": 0.4631
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 92,
"score": 0.4847,
"slug": "svelte-frontend-developer",
"total_count": null
},
{
"display_name": "MLOps Engineer",
"kra_matches": [
{
"kra_text": "Validates model performance benchmarks, data schema contracts, and system integration health before signing off on production release readiness.",
"sentence": "Data Quality Management: Cleanse the data and improve data quality and readiness for analysis.",
"similarity": 0.4838
},
{
"kra_text": "Automates ML platform operations including scheduled retraining triggers, pipeline orchestration, evaluation workflows, and alerting configuration.",
"sentence": "Data Pipeline: Develop and maintain scalable data pipelines and builds out new API integrations to support continuing increases in data volume and complexity.",
"similarity": 0.4684
},
{
"kra_text": "Validates model performance benchmarks, data schema contracts, and system integration health before signing off on production release readiness.",
"sentence": "Drive standards, define and implement/improve data governance strategies and enforce best practices to scale data analysis across platforms.",
"similarity": 0.4511
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 16,
"score": 0.4678,
"slug": "ml-ops-engineer",
"total_count": null
},
{
"display_name": "AI Engineer",
"kra_matches": [
{
"kra_text": "Integrates AI model API responses with application business logic, database writes, event publishing, and downstream service orchestration.",
"sentence": "Data Pipeline: Develop and maintain scalable data pipelines and builds out new API integrations to support continuing increases in data volume and complexity.",
"similarity": 0.5021
},
{
"kra_text": "Documents AI feature capabilities, known limitations, failure modes, prompt versioning, and operational runbooks for engineering and product teams.",
"sentence": "Data Enablement: Ensure data is accessible and usable to wider enterprise to enable a deeper and more timely understanding of operation.",
"similarity": 0.4296
},
{
"kra_text": "Integrates AI model API responses with application business logic, database writes, event publishing, and downstream service orchestration.",
"sentence": "Data Integration: Connect offline and online data to continuously improve overall understanding of customer behavior and journeys for personalization.",
"similarity": 0.4268
}
],
"matched_count": null,
"matched_skills": null,
"role_id": 13,
"score": 0.4528,
"slug": "ai-engineer",
"total_count": null
}
],
"skill_match_roles": []
},
"stage4_decision": {
"alias_collision_detected": false,
"case": "A",
"chosen_role": {
"display_name": "Data Engineer",
"kra_matches": null,
"matched_count": null,
"matched_skills": null,
"role_id": 2,
"score": 1.0,
"slug": "data-engineer",
"total_count": null
},
"confidence": 1.0,
"is_new_role": false,
"llm2_fired": false,
"llm2_reasoning": null,
"matched_dimensions": [],
"matched_kras": [],
"matched_skills": [],
"new_role_display_name": null,
"new_role_slug": null,
"queued": false,
"reasoning": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
"sub_role": null
},
"stage5_updates": {
"centroid_n_after": 116,
"centroid_updated": true,
"collision_log_id": null,
"new_kra_attached": null,
"new_skills_attached": [
{
"is_primary": true,
"queue_id": 6618,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "ETL",
"status": "pending"
},
{
"is_primary": true,
"queue_id": 6619,
"role_display_name": "Data Engineer",
"role_slug": "data-engineer",
"skill_name": "ELT",
"status": "pending"
}
],
"queue_entry_id": null,
"v3_pipeline_triggered": false,
"v3_role_slug": null,
"v3_run_id": null
}
}
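In the `kra_match_roles` entries above, each role-level `score` is consistent with the arithmetic mean of its three listed `similarity` values, rounded to four decimals (for Data Engineer, (0.6929 + 0.6592 + 0.632) / 3 ≈ 0.6614). The sketch below reproduces that arithmetic. Whether the pipeline actually computes the score this way is an assumption based only on the numbers in this run, and the function name is hypothetical.

```python
from statistics import mean
from typing import Any, Dict, List


def kra_role_score(kra_matches: List[Dict[str, Any]]) -> float:
    """Mean of the listed KRA similarities, rounded to four decimals."""
    return round(mean(m["similarity"] for m in kra_matches), 4)


# Similarities copied from this run's stage3_signals.
data_engineer = [{"similarity": 0.6929}, {"similarity": 0.6592}, {"similarity": 0.632}]
ml_engineer = [{"similarity": 0.5143}, {"similarity": 0.5143}, {"similarity": 0.5019}]
svelte = [{"similarity": 0.5183}, {"similarity": 0.4727}, {"similarity": 0.4631}]
```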
API 2 — extract-details
{
"alias_matches": [],
"candidate_roles": [],
"chosen_role": {
"display_name": "Data Engineer",
"id": 2,
"rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
"dimensions": [],
"input_final_skills": [
"ETL",
"ELT"
],
"input_llm_skills": [
"ETL",
"ELT"
],
"new_aliases_persisted": 0,
"run_id": "8ca0f5da-0de7-4fc5-85e1-fba71e558159",
"skills_detail": [
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "ETL",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Data Engineering Tools",
"skill_nature": "PRACTICE",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "etl",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
},
{
"aliases_in_db": [],
"canonical": null,
"dimensions": [],
"input_skill": "ELT",
"matched_via": null,
"new_alias_persisted": false,
"new_alias_text": null,
"new_skill_meta": {
"derived": {
"category": "Data Engineering Tools",
"skill_nature": "PRACTICE",
"sub_category": "general",
"typical_lifespan": "MULTI_YEAR",
"version_strategy": "UNVERSIONED",
"volatility": "MEDIUM"
},
"enrichment": null,
"keep_log": [],
"locked_dimensions": [],
"merge_log": [],
"placed": null,
"relationships": null,
"skill_id": "elt",
"split_log": [],
"typed": null,
"warnings": []
},
"source_tag": "llm",
"was_in_llm_skills": true
}
],
"unmatched_skills": [
"ETL",
"ELT"
]
}
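In the API 2 response above, both skills have `canonical: null` and `matched_via: null`, and both appear in `unmatched_skills`. A small sketch of how that list could be derived from `skills_detail` follows. The rule is an assumption that fits this run, not a documented contract, and the function name is hypothetical.

```python
from typing import Any, Dict, List


def unmatched_skills(skills_detail: List[Dict[str, Any]]) -> List[str]:
    """Input skills with no canonical library entry and no match route."""
    return [
        d["input_skill"]
        for d in skills_detail
        if d.get("canonical") is None and d.get("matched_via") is None
    ]
```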
API 3 — final-role-output
{
"chosen_role": {
"display_name": "Data Engineer",
"id": 2,
"rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top absent does not contradict",
"role_archetype": null,
"slug": "data-engineer",
"source": "db"
},
"chosen_role_resolution": "in_db",
"final_input_skills": [
{
"skill": "ETL",
"tag": "new"
},
{
"skill": "ELT",
"tag": "new"
}
],
"llm_cost_api1_usd": null,
"llm_cost_api2_usd": null,
"llm_cost_api3_usd": null,
"llm_cost_total_usd": null,
"persistence": {
"items": [],
"new_skills_created": 0,
"role_dimension_saved": 0,
"skill_dimension_saved": 0,
"skipped": 0
},
"planner_output": null,
"run_id": "8ca0f5da-0de7-4fc5-85e1-fba71e558159"
}
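The API 3 response above tags both skills as `new` while `persistence.new_skills_created` is 0 and `items` is empty. That may be expected, since `stage5_updates` in API 1 shows both skills queued with status `pending`, but it is worth surfacing during admin review. The sketch below is a hypothetical review helper over the fields shown on this page, not part of the pipeline.

```python
from typing import Any, Dict


def persistence_gap(api3: Dict[str, Any]) -> Dict[str, Any]:
    """Compare skills tagged 'new' with the number actually created."""
    tagged_new = [s["skill"] for s in api3["final_input_skills"] if s["tag"] == "new"]
    created = api3["persistence"]["new_skills_created"]
    # A positive gap means some 'new' skills were not persisted in this call.
    return {"tagged_new": tagged_new, "created": created, "gap": len(tagged_new) - created}
```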
LLM Calls
Every model call made for this run, in pipeline order. Click a card to see the model's response.