Pipeline run

2facfcf1-fe54-4063-ab89-636e61909f69

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description

Nature of work

—

no_db_connection

Tech stack maturity

Mainstream Modern

AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)

0.50 / 5

· Title match

✓ Has AI skill

· AI skill (primary)

✓ AI skill (secondary)

· On AI team

· Builds AI products

vocab breakdown (legacy)

Assistants (×1): —

Frameworks (×2): —

Models / concepts (×3): RAG, embeddings, LLM, agentic AI, agentic, AI

Evidence — skills matched in JD (14)

Data Engineering · Python · SQL · Java · Spark · Databricks · Flink · AWS · GCP · Azure · Microservices · Embeddings · Data Structures · Algorithms

Skill cluster (0 dimension groups, role-scoped)

No dimension groups computed for this JD.

Status: extract_from_jd_done · Created: 2026-05-10T18:47:37.024585Z · Updated: 2026-05-10T18:47:37.024585Z

Flow · Current 3-step pipeline

1 POST /skills/extract-from-jd

2 POST /skills/extract-details

3 POST /skills/final-role-output
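
As a rough sketch of how these three steps relate to a run's stored payloads, the snippet below reports how far a run got by checking which API payloads are non-empty. The function name `last_completed_step` and the rule that a non-empty payload means the step finished are assumptions made for illustration; the pipeline's real bookkeeping may differ.

```python
# Ordered endpoints of the 3-step pipeline, as listed above.
PIPELINE_STEPS = [
    "POST /skills/extract-from-jd",
    "POST /skills/extract-details",
    "POST /skills/final-role-output",
]


def last_completed_step(payloads):
    """Return the number (1-3) of the last step with a non-empty payload, or 0.

    Assumption: a step counts as completed when its stored payload is a
    non-empty dict, and steps always run in order.
    """
    completed = 0
    for index, payload in enumerate(payloads, start=1):
        if not payload:
            break  # an empty payload means the run stopped before this step
        completed = index
    return completed


# Payloads shaped like this run: API 1 has output, API 2 and API 3 are empty.
run_payloads = [
    {"final_skills": [{"is_primary": True, "skill_name": "Python"}]},
    {},
    {},
]
step = last_completed_step(run_payloads)
print(step, PIPELINE_STEPS[step - 1] if step else "no step completed")
```

For a run like this one, where only the first payload is populated, the helper points at the extract-from-jd step, which is consistent with the `extract_from_jd_done` status.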

Role · Chosen role & resolution

No chosen role stored for this run.

Job description

As a Data Engineer III with Expedia Engineering teams, you will have the opportunity to leverage your technical expertise to design solutions that enrich the Data and Intelligent service Metric Enablement platform with new features and functionality to run the business. This role goes beyond traditional data engineering: you will design, build, deploy, and operate data pipelines and embeddings workflows that power Agentic AI applications in production. You will be expected to own architecture decisions, drive AI platform evolution, and ensure enterprise-grade reliability, governance, and scalability.

You will also have the opportunity to coach and mentor junior developers and to work with senior developers on various tech teams to come up with solutions.

In this role, you will:
• Design and develop cloud-native solutions that are scalable, responsive, and resilient.
• Build scalable ingestion pipelines for structured and unstructured data (documents, logs, knowledge bases, transactional data)
• Design semantic layers and context-building strategies for LLM consumption
• Architect and build production-ready RAG systems (retrieval pipelines, embeddings, vector indexing, ranking strategies) and work with vector databases and retrieval systems
• Develop embedding pipelines and manage vector databases at scale
• Develop, test, own and deliver Sprint tasks and help drive the team forward
• Collaborate with teams and individuals to complete your team assignment on time, with quality
• Be a coach/mentor to junior developers on the team
• Work across multiple layers of the stack as the problem demands.
• Have a strong sense of ownership of all technical issues
• Identify risks and issues, and drive them to mitigation or resolution as required within the scope of your work
• Prototype ideas, execute and learn from them, and enrich the overall team experience

Experience and Qualifications
• 6+ years of development experience in an enterprise-level engineering environment, with increasing levels of technical expertise.
• 4+ years of hands-on backend Data Engineering application development experience with an excellent understanding of products with microservice architecture.
• Proven hands-on experience designing, building, and operating data pipelines that enable LLM-based agentic AI systems, including support for embeddings, retrieval layers, and orchestration workflows.
• Expert-level SQL and strong Python proficiency (Java is a plus)
• Experience with distributed processing frameworks (Spark, Databricks, Flink, etc.)
• Experience building data pipelines in cloud-native environments (AWS/GCP/Azure)
• Experience building scalable, fault-tolerant, observable systems
• Good knowledge of Data Structures and Algorithms.
• Strong understanding of data modeling and semantic layer design
• Understanding of embeddings, chunking strategies, retrieval optimization, and re-ranking

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

Data Engineering · Primary · No API 2 row (run stopped after API 1 or history missing)

Python · Primary · No API 2 row (run stopped after API 1 or history missing)

SQL · Primary · No API 2 row (run stopped after API 1 or history missing)

Java · Secondary · No API 2 row (run stopped after API 1 or history missing)

Spark · Secondary · No API 2 row (run stopped after API 1 or history missing)

Databricks · Secondary · No API 2 row (run stopped after API 1 or history missing)

Flink · Secondary · No API 2 row (run stopped after API 1 or history missing)

AWS · Secondary · No API 2 row (run stopped after API 1 or history missing)

GCP · Secondary · No API 2 row (run stopped after API 1 or history missing)

Azure · Secondary · No API 2 row (run stopped after API 1 or history missing)

Microservices · Secondary · No API 2 row (run stopped after API 1 or history missing)

Embeddings · Secondary · No API 2 row (run stopped after API 1 or history missing)

Data Structures · Secondary · No API 2 row (run stopped after API 1 or history missing)

Algorithms · Secondary · No API 2 row (run stopped after API 1 or history missing)
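
The rows above can be reproduced with a small merge, sketched here under stated assumptions. The API 2 schema is not shown anywhere in this run, so keying `api2_rows` by skill name is hypothetical, as is the helper name `build_skill_rows`; only the API 1 fields (`skill_name`, `is_primary`) and the fallback note come from this page.

```python
# Note shown on this page when a skill has no API 2 data.
MISSING_API2_NOTE = "No API 2 row (run stopped after API 1 or history missing)"


def build_skill_rows(api1_skills, api2_rows):
    """Merge API 1 skills with API 2 details into (name, tier, detail) rows.

    Assumption: api2_rows is a dict keyed by skill name. The real API 2
    payload may be shaped differently.
    """
    rows = []
    for skill in api1_skills:
        name = skill["skill_name"]
        tier = "Primary" if skill["is_primary"] else "Secondary"
        # Fall back to the placeholder note when API 2 has nothing for this skill.
        detail = api2_rows.get(name, MISSING_API2_NOTE)
        rows.append((name, tier, detail))
    return rows


# Two skills from API 1 and an empty API 2 payload, as in this run.
api1_skills = [
    {"is_primary": True, "skill_name": "Data Engineering"},
    {"is_primary": False, "skill_name": "Java"},
]
rows = build_skill_rows(api1_skills, {})
for row in rows:
    print(" | ".join(row))
```

Because API 2 returned `{}` for this run, every skill falls through to the placeholder note, which is why all fourteen rows carry the same message.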

Library artifacts (this run)

No artifact rows for this run.

API 1 — extract-from-jd

{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Data Engineering"
    },
    {
      "is_primary": true,
      "skill_name": "Python"
    },
    {
      "is_primary": true,
      "skill_name": "SQL"
    },
    {
      "is_primary": false,
      "skill_name": "Java"
    },
    {
      "is_primary": false,
      "skill_name": "Spark"
    },
    {
      "is_primary": false,
      "skill_name": "Databricks"
    },
    {
      "is_primary": false,
      "skill_name": "Flink"
    },
    {
      "is_primary": false,
      "skill_name": "AWS"
    },
    {
      "is_primary": false,
      "skill_name": "GCP"
    },
    {
      "is_primary": false,
      "skill_name": "Azure"
    },
    {
      "is_primary": false,
      "skill_name": "Microservices"
    },
    {
      "is_primary": false,
      "skill_name": "Embeddings"
    },
    {
      "is_primary": false,
      "skill_name": "Data Structures"
    },
    {
      "is_primary": false,
      "skill_name": "Algorithms"
    }
  ],
  "run_id": null
}
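
A minimal sketch of consuming this response: the snippet parses the payload and splits skills by the `is_primary` flag. The JSON is copied from the response above and compacted; the function name `split_by_priority` is an illustrative choice, not part of the API.

```python
import json

# API 1 response from this run, compacted onto fewer lines.
API1_RESPONSE = """
{"final_skills": [
  {"is_primary": true, "skill_name": "Data Engineering"},
  {"is_primary": true, "skill_name": "Python"},
  {"is_primary": true, "skill_name": "SQL"},
  {"is_primary": false, "skill_name": "Java"},
  {"is_primary": false, "skill_name": "Spark"},
  {"is_primary": false, "skill_name": "Databricks"},
  {"is_primary": false, "skill_name": "Flink"},
  {"is_primary": false, "skill_name": "AWS"},
  {"is_primary": false, "skill_name": "GCP"},
  {"is_primary": false, "skill_name": "Azure"},
  {"is_primary": false, "skill_name": "Microservices"},
  {"is_primary": false, "skill_name": "Embeddings"},
  {"is_primary": false, "skill_name": "Data Structures"},
  {"is_primary": false, "skill_name": "Algorithms"}
], "run_id": null}
"""


def split_by_priority(response_text):
    """Return (primary, secondary) skill-name lists, preserving response order."""
    data = json.loads(response_text)
    skills = data["final_skills"]
    primary = [s["skill_name"] for s in skills if s["is_primary"]]
    secondary = [s["skill_name"] for s in skills if not s["is_primary"]]
    return primary, secondary


primary, secondary = split_by_priority(API1_RESPONSE)
print("primary:", primary)
print("secondary:", secondary)
print("total:", len(primary) + len(secondary))
```

The total lines up with the 14 skills listed under the Evidence heading, with three of them marked primary.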

API 2 — extract-details

{}

API 3 — final-role-output

{}
