Pipeline run

5f6e76e1-2900-47a8-94df-c46c76ce90f2

Pipeline LLM cost (USD)
API 1: $0.0121 API 2: $0.0000 API 3: $0.0000 Total: $0.0121

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description
role baseline loaded sources · ai_index: jd · nature_of_work: jd · tech_stack_maturity: jd
Nature of work · Data pipeline development
Build Azure data/GenAI solutions end to end: ingest and transform batch/streaming data in Databricks/Data Factory/Synapse, model and govern lakehouse data on ADLS/Delta, and expose it through APIs plus Azure OpenAI/RAG integrations.
"Design and implement scalable data ingestion and transformation pipelines using Azure Data Factory, Azure Synapse Pipelines, Databricks, and serverless compute"
Tech stack maturity
AI-Native & Bleeding-Edge
The stack combines cloud-native Azure data engineering with LLMOps, prompt engineering, LangChain/LlamaIndex, PEFT/LoRA, and Azure AI services, indicating an AI-first, cutting-edge implementation profile.
AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)
1.70 / 5
· Title match
Has AI skill
AI skill (primary)
· AI skill (secondary)
· On AI team
· Builds AI products
vocab breakdown (legacy)
Assistants (×1): Copilot
Frameworks (×2): LangChain, LlamaIndex, Hugging Face, Bedrock, Vertex AI, Azure OpenAI
Models / concepts (×3): OpenAI, Transformers, RAG, fine-tuning, LoRA, LLMOps, MLOps, prompt engineering, AI, GenAI, Machine Learning, Artificial Intelligence
Evidence — skills matched in JD (53)
Azure Data Factory Azure Synapse Analytics Azure Databricks ADLS Gen2 Azure Storage PySpark Python SQL ETL/ELT Data Modeling Azure SQL SQL MI Azure OpenAI Service Azure Cognitive Services Azure Cognitive Search Azure AI Search LangChain LlamaIndex FastAPI Flask Azure Functions Azure App Service Azure AD Managed Identities RBAC +28
Skill cluster (17 dimension groups, role-scoped)
Web Application Frameworks
FastAPI Flask Django
Cloud Platforms
Azure App Service AWS Bedrock
Programming Languages for Data Work
Python SQL
Angular Component Model and Templates
Angular
Authentication and Authorization
RBAC
BI and Visualization Tools
Power BI
Cloud Data Warehouses
Azure Synapse Analytics
Cloud Platforms & Managed Services
Azure Functions
Containerization and Image Builds
Docker
Data Lineage and Metadata
MLOps
Identity and Access Management Products
Azure AD
Infrastructure as Code
Terraform
LLM Serving & Deployment
LLMOps
Messaging and Event Streaming
Kafka
React Component Architecture
React
Secrets and Identity Automation
Azure Key Vault
Cross-cutting / unaligned
Azure Data Factory Azure Databricks ADLS Gen2 Azure Storage PySpark ETL/ELT Data Modeling Azure SQL SQL MI Azure OpenAI Service Azure Cognitive Services Azure Cognitive Search Azure AI Search LangChain LlamaIndex Managed Identities Encryption Git CI/CD Azure Event Hubs Azure Stream Analytics Delta Lake Lakehouse Architecture Azure Purview Azure Machine Learning Bicep Hugging Face Transformers PEFT LoRA Prompt Engineering Azure DevOps GitHub Vertex AI
KRA description
Design and implement scalable data ingestion and transformation pipelines using Azure Data Factory, Azure Synapse Pipelines, Databricks, and serverless compute; build and manage data lakes and lakehouse architectures using ADLS, Delta Lake, and enterprise Azure warehouse components; develop PySpark/Python data processing jobs for batch and streaming; implement real-time ingestion with Azure Event Hubs, Azure Stream Analytics, or Kafka; apply best practices for data modeling, partitioning, indexing, compression, cost optimization, and performance tuning across Azure platforms; ensure data quality, lineage, metadata management, and auditing across the lifecycle; implement security and governance with Azure AD, Managed Identities, Key Vault, network isolation, and fine-grained RBAC; design and develop GenAI applications using Azure OpenAI and Azure AI services; implement RAG architectures using Azure Cognitive Search / Azure AI Search and Azure data stores; integrate GenAI solutions with ADLS, Synapse, SQL, SharePoint, and Microsoft 365; build Python APIs, microservices, and backends with FastAPI/Flask hosted on Azure Functions, App Service, or AKS; implement prompt. Azure Data Factory, Azure Synapse Analytics, Azure Databricks, ADLS Gen2, Azure Storage, PySpark, Python, SQL, Data Warehousing, ETL/ELT, Data Modeling, Azure SQL / SQL MI, Azure OpenAI Service, Azure Cognitive Services, Azure Cognitive Search / Azure AI Search, LangChain, LlamaIndex, REST API Development (FastAPI/Flask/Django), Azure Functions, Azure App Service, Azure AD, Managed Identities, RBAC, Azure Key Vault, Encryption, Docker, Git, CI/CD (Azure DevOps/GitHub). 
Azure Event Hubs, Kafka, Azure Stream Analytics, Delta Lake / Lakehouse Architecture, Azure Purview, Power BI Integration, Azure Machine Learning, Azure OpenAI Integration, Terraform / Bicep, Cost Optimization on Azure Data Services, Real-time Analytics, Hugging Face Transformers, PEFT / LoRA Fine-tuning, Multi-cloud GenAI Platforms (Azure OpenAI / AWS Bedrock / Vertex AI), Prompt Engineering Techniques, Frontend Integration (React / Angular) for chatbots, LLMOps / MLOps, Monitoring & Logging for GenAI, Security & Compliance for AI workloads.

Signals

Skill ml-engineer
0.20
Alias backend-engineer
1.00
KRA data-engineer
0.57
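
The KRA signal of 0.57 for data-engineer is consistent with a plain average of the three per-sentence similarities that `stage3_signals.kra_match_roles` reports for that role (0.6154, 0.5622, 0.532). This page does not state the aggregation rule, so the mean below is an assumption that happens to reproduce the stored score; `kra_role_score` is a hypothetical helper name, not part of the pipeline.

```python
from statistics import fmean


def kra_role_score(similarities: list[float]) -> float:
    """Average per-sentence similarities (assumed aggregation rule)."""
    return round(fmean(similarities), 4)


# Similarities reported for the data-engineer role in this run.
data_engineer = [0.6154, 0.5622, 0.532]
score = kra_role_score(data_engineer)

print(score)           # 0.5699, the score stored in stage3_signals
print(f"{score:.2f}")  # 0.57, the value shown under Signals
```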

Post-classification

Centroid updated · n=252
Alias collision log
New-role queue
New skills captured: 20
New KRA captured

Captured for admin review

Azure Data Factory primary Data Engineer pending
Azure Databricks primary Data Engineer pending
ADLS Gen2 primary Data Engineer pending
Azure Storage primary Data Engineer pending
PySpark primary Data Engineer pending
ETL/ELT primary Data Engineer pending
Data Modeling primary Data Engineer pending
Azure SQL primary Data Engineer pending
SQL MI primary Data Engineer pending
Azure OpenAI Service primary Data Engineer pending
Azure Cognitive Services primary Data Engineer pending
Azure AI Search primary Data Engineer pending
Managed Identities primary Data Engineer pending
Encryption primary Data Engineer pending
Azure Event Hubs primary Data Engineer pending
Azure Stream Analytics primary Data Engineer pending
Lakehouse Architecture primary Data Engineer pending
Azure Purview primary Data Engineer pending
Azure Machine Learning primary Data Engineer pending
Hugging Face Transformers primary Data Engineer pending
Status: extract_from_jd_done Created: 2026-05-27T15:04:35.226840Z Updated: 2026-06-12T16:53:47.563582Z
Flow Current 3-step pipeline

1 POST /skills/extract-from-jd

2 POST /skills/extract-details

3 POST /skills/final-role-output
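
The three steps above run in order, and the run's status (`extract_from_jd_done`) suggests which of them have completed. A minimal sketch of that bookkeeping follows. It assumes each status is the endpoint's last path segment with hyphens turned into underscores plus `_done`. Only the first such status appears on this page, so the other two names are guesses, and `status_for` and `remaining_steps` are hypothetical helpers.

```python
# Endpoints in the order listed for this flow.
PIPELINE_STEPS = [
    "/skills/extract-from-jd",
    "/skills/extract-details",
    "/skills/final-role-output",
]


def status_for(path: str) -> str:
    """Assumed naming: last path segment, hyphens to underscores, plus '_done'."""
    return path.rsplit("/", 1)[-1].replace("-", "_") + "_done"


def remaining_steps(status: str) -> list[str]:
    """Return the endpoints that have not run yet for a given status."""
    done = [status_for(p) for p in PIPELINE_STEPS]
    if status not in done:
        return list(PIPELINE_STEPS)  # unknown status: assume nothing has run
    return PIPELINE_STEPS[done.index(status) + 1:]


print(remaining_steps("extract_from_jd_done"))
# ['/skills/extract-details', '/skills/final-role-output']
```

For this run the result matches the rest of the page: API 2 and API 3 show a cost of $0.0000, and every skill row reports that no API 2 row exists.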

Role Chosen role & resolution

No chosen role stored for this run.

Job description

Line of Service
Advisory


Industry/Sector
Not Applicable


Specialism
Emerging Technologies


Management Level
Senior Associate


Job Description & Summary
At PwC, our people in software and product innovation focus on developing cutting-edge software solutions and driving product innovation to meet the evolving needs of clients. These individuals combine technical experience with creative thinking to deliver innovative software products and solutions.

In emerging technology at PwC, you will focus on exploring and implementing cutting-edge technologies to drive innovation and transformation for clients. You will work in areas such as artificial intelligence, blockchain, and the internet of things (IoT).

Why PwC
At PwC, you will be part of a vibrant community of solvers that leads with trust and creates distinctive outcomes for our clients and communities. This purpose-led and values-driven work, powered by technology in an environment that drives innovation, will enable you to make a tangible impact in the real world. We reward your contributions, support your wellbeing, and offer inclusive benefits, flexibility programmes and mentorship that will help you thrive in work and life. Together, we grow, learn, care, collaborate, and create a future of infinite experiences for each other. Learn more about us.

At PwC, we believe in providing equal employment opportunities, without any discrimination on the grounds of gender, ethnic background, age, disability, marital status, sexual orientation, pregnancy, gender identity or expression, religion or other beliefs, perceived differences and status protected by law. We strive to create an environment where each one of our people can bring their true selves and contribute to their personal growth and the firm’s growth. To enable this, we have zero tolerance for any discrimination and harassment based on the above considerations.

Job Description & Summary: A career in our New Technologies practice, within Application and Emerging Technology services, offers the opportunity to design and build modern, cloud-native solutions on Microsoft Azure that power the next generation of digital businesses. You will help clients create scalable, secure, and high-performing data and AI platforms by leveraging Azure services such as Azure Synapse, Azure Databricks, Azure Data Factory, Azure OpenAI Service, and Azure Cognitive Services. Our team focuses on building end-to-end data pipelines, lakehouse architectures, and enterprise-grade GenAI solutions—including intelligent applications, copilots, and domain-specific assistants—that enable advanced analytics, automation, and innovation. We emphasize secure, compliant, and scalable architectures that help organizations unlock the value of their data and transform how they operate and compete.

Responsibilities:

Design and implement scalable data ingestion and transformation pipelines using Azure Data Factory, Azure Synapse Pipelines, Databricks, and serverless compute; build and manage data lakes and lakehouse architectures using ADLS, Delta Lake, and enterprise Azure warehouse components; develop PySpark/Python data processing jobs for batch and streaming; implement real-time ingestion with Azure Event Hubs, Azure Stream Analytics, or Kafka; apply best practices for data modeling, partitioning, indexing, compression, cost optimization, and performance tuning across Azure platforms; ensure data quality, lineage, metadata management, and auditing across the lifecycle; implement security and governance with Azure AD, Managed Identities, Key Vault, network isolation, and fine-grained RBAC; design and develop GenAI applications using Azure OpenAI and Azure AI services; implement RAG architectures using Azure Cognitive Search / Azure AI Search and Azure data stores; integrate GenAI solutions with ADLS, Synapse, SQL, SharePoint, and Microsoft 365; build Python APIs, microservices, and backends with FastAPI/Flask hosted on Azure Functions, App Service, or AKS; implement prompt

Mandatory skill sets:

Azure Data Factory, Azure Synapse Analytics, Azure Databricks, ADLS Gen2, Azure Storage, PySpark, Python, SQL, Data Warehousing, ETL/ELT, Data Modeling, Azure SQL / SQL MI, Azure OpenAI Service, Azure Cognitive Services, Azure Cognitive Search / Azure AI Search, LangChain, LlamaIndex, REST API Development (FastAPI/Flask/Django), Azure Functions, Azure App Service, Azure AD, Managed Identities, RBAC, Azure Key Vault, Encryption, Docker, Git, CI/CD (Azure DevOps/GitHub)

Preferred skill sets:

Azure Event Hubs, Kafka, Azure Stream Analytics, Delta Lake / Lakehouse Architecture, Azure Purview, Power BI Integration, Azure Machine Learning, Azure OpenAI Integration, Terraform / Bicep, Cost Optimization on Azure Data Services, Real-time Analytics, Hugging Face Transformers, PEFT / LoRA Fine-tuning, Multi-cloud GenAI Platforms (Azure OpenAI / AWS Bedrock / Vertex AI), Prompt Engineering Techniques, Frontend Integration (React / Angular) for chatbots, LLMOps / MLOps, Monitoring & Logging for GenAI, Security & Compliance for AI workloads

Years of experience required:

4-7

Education qualification - Full Time:

B.E/B.Tech/M.Tech/MBA/MCA




Education (if blank, degree and/or field of study not specified)
Degrees/Field of Study required: Master of Business Administration, Bachelor of Technology

Degrees/Field of Study preferred:


Certifications (if blank, certifications not specified)


Required Skills
Microsoft Azure


Optional Skills
Accepting Feedback, Active Listening, Analytical Thinking, Artificial Intelligence, Business Planning and Simulation (BW-BPS), Communication, Competitive Advantage, Conducting Research, Creativity, Digital Transformation, Embracing Change, Emotional Regulation, Empathy, Implementing Technology, Inclusion, Innovation Processes, Intellectual Curiosity, Internet of Things (IoT), Learning Agility, Optimism, Product Development, Product Testing, Prototyping, Quality Assurance Process Management {+ 10 more}


Desired Languages (If blank, desired languages not specified)


Travel Requirements


Available for Work Visa Sponsorship?


Government Clearance Required?


Job Posting End Date

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

Azure Data Factory Primary No API 2 row (run stopped after API 1 or history missing)
Azure Synapse Analytics Primary No API 2 row (run stopped after API 1 or history missing)
Azure Databricks Primary No API 2 row (run stopped after API 1 or history missing)
ADLS Gen2 Primary No API 2 row (run stopped after API 1 or history missing)
Azure Storage Primary No API 2 row (run stopped after API 1 or history missing)
PySpark Primary No API 2 row (run stopped after API 1 or history missing)
Python Primary No API 2 row (run stopped after API 1 or history missing)
SQL Primary No API 2 row (run stopped after API 1 or history missing)
ETL/ELT Primary No API 2 row (run stopped after API 1 or history missing)
Data Modeling Primary No API 2 row (run stopped after API 1 or history missing)
Azure SQL Primary No API 2 row (run stopped after API 1 or history missing)
SQL MI Primary No API 2 row (run stopped after API 1 or history missing)
Azure OpenAI Service Primary No API 2 row (run stopped after API 1 or history missing)
Azure Cognitive Services Primary No API 2 row (run stopped after API 1 or history missing)
Azure Cognitive Search Primary No API 2 row (run stopped after API 1 or history missing)
Azure AI Search Primary No API 2 row (run stopped after API 1 or history missing)
LangChain Primary No API 2 row (run stopped after API 1 or history missing)
LlamaIndex Primary No API 2 row (run stopped after API 1 or history missing)
FastAPI Primary No API 2 row (run stopped after API 1 or history missing)
Flask Primary No API 2 row (run stopped after API 1 or history missing)
Django Secondary No API 2 row (run stopped after API 1 or history missing)
Azure Functions Primary No API 2 row (run stopped after API 1 or history missing)
Azure App Service Primary No API 2 row (run stopped after API 1 or history missing)
Azure AD Primary No API 2 row (run stopped after API 1 or history missing)
Managed Identities Primary No API 2 row (run stopped after API 1 or history missing)
RBAC Primary No API 2 row (run stopped after API 1 or history missing)
Azure Key Vault Primary No API 2 row (run stopped after API 1 or history missing)
Encryption Primary No API 2 row (run stopped after API 1 or history missing)
Docker Primary No API 2 row (run stopped after API 1 or history missing)
Git Primary No API 2 row (run stopped after API 1 or history missing)
CI/CD Primary No API 2 row (run stopped after API 1 or history missing)
Azure DevOps Secondary No API 2 row (run stopped after API 1 or history missing)
GitHub Secondary No API 2 row (run stopped after API 1 or history missing)
Azure Event Hubs Primary No API 2 row (run stopped after API 1 or history missing)
Kafka Primary No API 2 row (run stopped after API 1 or history missing)
Azure Stream Analytics Primary No API 2 row (run stopped after API 1 or history missing)
Delta Lake Primary No API 2 row (run stopped after API 1 or history missing)
Lakehouse Architecture Primary No API 2 row (run stopped after API 1 or history missing)
Azure Purview Primary No API 2 row (run stopped after API 1 or history missing)
Power BI Secondary No API 2 row (run stopped after API 1 or history missing)
Azure Machine Learning Primary No API 2 row (run stopped after API 1 or history missing)
Terraform Primary No API 2 row (run stopped after API 1 or history missing)
Bicep Primary No API 2 row (run stopped after API 1 or history missing)
Hugging Face Transformers Primary No API 2 row (run stopped after API 1 or history missing)
PEFT Primary No API 2 row (run stopped after API 1 or history missing)
LoRA Primary No API 2 row (run stopped after API 1 or history missing)
AWS Bedrock Secondary No API 2 row (run stopped after API 1 or history missing)
Vertex AI Secondary No API 2 row (run stopped after API 1 or history missing)
Prompt Engineering Primary No API 2 row (run stopped after API 1 or history missing)
React Secondary No API 2 row (run stopped after API 1 or history missing)
Angular Secondary No API 2 row (run stopped after API 1 or history missing)
LLMOps Primary No API 2 row (run stopped after API 1 or history missing)
MLOps Primary No API 2 row (run stopped after API 1 or history missing)
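
The rows above pair each API 1 skill with whatever API 2 produced for it. The sketch below shows one way such a merge could work, under assumptions: the `skill_name` and `is_primary` fields come from the API 1 output on this page, while `api2_rows` (a mapping from skill name to a detail string) and `build_rows` are hypothetical. The real API 2 and API 3 payload shapes are not visible in this run.

```python
# Placeholder text shown on this page when a skill has no API 2 data.
NO_API2 = "No API 2 row (run stopped after API 1 or history missing)"


def build_rows(final_skills, api2_rows=None):
    """Merge API 1 skills with optional API 2 details into display rows."""
    api2_rows = api2_rows or {}  # hypothetical shape: {skill_name: detail}
    rows = []
    for skill in final_skills:
        name = skill["skill_name"]
        tier = "Primary" if skill["is_primary"] else "Secondary"
        rows.append((name, tier, api2_rows.get(name, NO_API2)))
    return rows


# Two entries copied from this run's API 1 output.
sample = [
    {"is_primary": True, "skill_name": "Azure Data Factory"},
    {"is_primary": False, "skill_name": "Django"},
]
for row in build_rows(sample):
    print(" ".join(row))
```

With no API 2 data supplied, every row falls back to the placeholder, which is the state shown for all 53 skills in this run.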

Library artifacts (this run)

No artifact rows for this run.
nano JD Parser — gpt-4.1-nano
Role: Senior Associate
Company: PwC
Experience: 4-7
Domain: IT Services & Consulting
JD type: pass
Raw JSON
{
  "JD_type": "pass",
  "about_company": {
    "source_marker": {
      "first_5_words": "At PwC, our people in",
      "last_5_words": "software products and solutions."
    },
    "text": "At PwC, our people in software and product innovation focus on developing cutting-edge software solutions and driving product innovation to meet the evolving needs of clients. These individuals combine technical experience with creative thinking to deliver innovative software products and solutions.",
    "word_count": 42
  },
  "certifications": [],
  "company_name": "PwC",
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [
        "ITES",
        "BPO",
        "Tech Consulting"
      ],
      "domain": "IT Services \u0026 Consulting"
    },
    "secondary": null
  },
  "education": [
    {
      "level": "Master\u0027s",
      "qualification": "MBA - Business Administration",
      "raw": "Master of Business Administration",
      "requirement": "required"
    },
    {
      "level": "Bachelor\u0027s",
      "qualification": "BTECH/BE - Technology",
      "raw": "Bachelor of Technology",
      "requirement": "required"
    }
  ],
  "experience": {
    "max": 7,
    "min": 4,
    "raw": "4-7"
  },
  "job_locations": [],
  "role": "Senior Associate",
  "role_aliases": [
    "Senior Software Engineer",
    "Senior Developer",
    "Senior Data Engineer"
  ],
  "role_archetype": "Engineering",
  "roles_and_responsibilities": [
    {
      "bullet_count": 0,
      "heading": "Responsibilities",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "Design and implement scalable data",
        "last_5_words": "and develop GenAI applications using"
      },
      "text": "Design and implement scalable data ingestion and transformation pipelines using Azure Data Factory, Azure Synapse Pipelines, Databricks, and serverless compute; build and manage data lakes and lakehouse architectures using ADLS, Delta Lake, and enterprise Azure warehouse components; develop PySpark/Python data processing jobs for batch and streaming; implement real-time ingestion with Azure Event Hubs, Azure Stream Analytics, or Kafka; apply best practices for data modeling, partitioning, indexing, compression, cost optimization, and performance tuning across Azure platforms; ensure data quality, lineage, metadata management, and auditing across the lifecycle; implement security and governance with Azure AD, Managed Identities, Key Vault, network isolation, and fine-grained RBAC; design and develop GenAI applications using Azure OpenAI and Azure AI services; implement RAG architectures using Azure Cognitive Search / Azure AI Search and Azure data stores; integrate GenAI solutions with ADLS, Synapse, SQL, SharePoint, and Microsoft 365; build Python APIs, microservices, and backends with FastAPI/Flask hosted on Azure Functions, App Service, or AKS; implement prompt.",
      "word_count": 198
    },
    {
      "bullet_count": 0,
      "heading": "Mandatory skill sets",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "Azure Data Factory, Azure Synapse",
        "last_5_words": "CI/CD (Azure DevOps/GitHub)."
      },
      "text": "Azure Data Factory, Azure Synapse Analytics, Azure Databricks, ADLS Gen2, Azure Storage, PySpark, Python, SQL, Data Warehousing, ETL/ELT, Data Modeling, Azure SQL / SQL MI, Azure OpenAI Service, Azure Cognitive Services, Azure Cognitive Search / Azure AI Search, LangChain, LlamaIndex, REST API Development (FastAPI/Flask/Django), Azure Functions, Azure App Service, Azure AD, Managed Identities, RBAC, Azure Key Vault, Encryption, Docker, Git, CI/CD (Azure DevOps/GitHub).",
      "word_count": 43
    },
    {
      "bullet_count": 0,
      "heading": "Preferred skill sets",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "Azure Event Hubs, Kafka, Azure",
        "last_5_words": "for AI workloads."
      },
      "text": "Azure Event Hubs, Kafka, Azure Stream Analytics, Delta Lake / Lakehouse Architecture, Azure Purview, Power BI Integration, Azure Machine Learning, Azure OpenAI Integration, Terraform / Bicep, Cost Optimization on Azure Data Services, Real-time Analytics, Hugging Face Transformers, PEFT / LoRA Fine-tuning, Multi-cloud GenAI Platforms (Azure OpenAI / AWS Bedrock / Vertex AI), Prompt Engineering Techniques, Frontend Integration (React / Angular) for chatbots, LLMOps / MLOps, Monitoring \u0026 Logging for GenAI, Security \u0026 Compliance for AI workloads.",
      "word_count": 56
    }
  ],
  "urls": []
}
API 1 — extract-from-jd
{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Azure Data Factory"
    },
    {
      "is_primary": true,
      "skill_name": "Azure Synapse Analytics"
    },
    {
      "is_primary": true,
      "skill_name": "Azure Databricks"
    },
    {
      "is_primary": true,
      "skill_name": "ADLS Gen2"
    },
    {
      "is_primary": true,
      "skill_name": "Azure Storage"
    },
    {
      "is_primary": true,
      "skill_name": "PySpark"
    },
    {
      "is_primary": true,
      "skill_name": "Python"
    },
    {
      "is_primary": true,
      "skill_name": "SQL"
    },
    {
      "is_primary": true,
      "skill_name": "ETL/ELT"
    },
    {
      "is_primary": true,
      "skill_name": "Data Modeling"
    },
    {
      "is_primary": true,
      "skill_name": "Azure SQL"
    },
    {
      "is_primary": true,
      "skill_name": "SQL MI"
    },
    {
      "is_primary": true,
      "skill_name": "Azure OpenAI Service"
    },
    {
      "is_primary": true,
      "skill_name": "Azure Cognitive Services"
    },
    {
      "is_primary": true,
      "skill_name": "Azure Cognitive Search"
    },
    {
      "is_primary": true,
      "skill_name": "Azure AI Search"
    },
    {
      "is_primary": true,
      "skill_name": "LangChain"
    },
    {
      "is_primary": true,
      "skill_name": "LlamaIndex"
    },
    {
      "is_primary": true,
      "skill_name": "FastAPI"
    },
    {
      "is_primary": true,
      "skill_name": "Flask"
    },
    {
      "is_primary": false,
      "skill_name": "Django"
    },
    {
      "is_primary": true,
      "skill_name": "Azure Functions"
    },
    {
      "is_primary": true,
      "skill_name": "Azure App Service"
    },
    {
      "is_primary": true,
      "skill_name": "Azure AD"
    },
    {
      "is_primary": true,
      "skill_name": "Managed Identities"
    },
    {
      "is_primary": true,
      "skill_name": "RBAC"
    },
    {
      "is_primary": true,
      "skill_name": "Azure Key Vault"
    },
    {
      "is_primary": true,
      "skill_name": "Encryption"
    },
    {
      "is_primary": true,
      "skill_name": "Docker"
    },
    {
      "is_primary": true,
      "skill_name": "Git"
    },
    {
      "is_primary": true,
      "skill_name": "CI/CD"
    },
    {
      "is_primary": false,
      "skill_name": "Azure DevOps"
    },
    {
      "is_primary": false,
      "skill_name": "GitHub"
    },
    {
      "is_primary": true,
      "skill_name": "Azure Event Hubs"
    },
    {
      "is_primary": true,
      "skill_name": "Kafka"
    },
    {
      "is_primary": true,
      "skill_name": "Azure Stream Analytics"
    },
    {
      "is_primary": true,
      "skill_name": "Delta Lake"
    },
    {
      "is_primary": true,
      "skill_name": "Lakehouse Architecture"
    },
    {
      "is_primary": true,
      "skill_name": "Azure Purview"
    },
    {
      "is_primary": false,
      "skill_name": "Power BI"
    },
    {
      "is_primary": true,
      "skill_name": "Azure Machine Learning"
    },
    {
      "is_primary": true,
      "skill_name": "Terraform"
    },
    {
      "is_primary": true,
      "skill_name": "Bicep"
    },
    {
      "is_primary": true,
      "skill_name": "Hugging Face Transformers"
    },
    {
      "is_primary": true,
      "skill_name": "PEFT"
    },
    {
      "is_primary": true,
      "skill_name": "LoRA"
    },
    {
      "is_primary": false,
      "skill_name": "AWS Bedrock"
    },
    {
      "is_primary": false,
      "skill_name": "Vertex AI"
    },
    {
      "is_primary": true,
      "skill_name": "Prompt Engineering"
    },
    {
      "is_primary": false,
      "skill_name": "React"
    },
    {
      "is_primary": false,
      "skill_name": "Angular"
    },
    {
      "is_primary": true,
      "skill_name": "LLMOps"
    },
    {
      "is_primary": true,
      "skill_name": "MLOps"
    }
  ],
  "jd_role": {
    "display_name": "Senior Associate",
    "rationale": null,
    "role_aliases": [
      "Senior Software Engineer",
      "Senior Developer",
      "Senior Data Engineer"
    ],
    "role_archetype": "Engineering",
    "slug": ""
  },
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": {
      "source_marker": {
        "first_5_words": "At PwC, our people in",
        "last_5_words": "software products and solutions."
      },
      "text": "At PwC, our people in software and product innovation focus on developing cutting-edge software solutions and driving product innovation to meet the evolving needs of clients. These individuals combine technical experience with creative thinking to deliver innovative software products and solutions.",
      "word_count": 42
    },
    "certifications": [],
    "company_name": "PwC",
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [
          "ITES",
          "BPO",
          "Tech Consulting"
        ],
        "domain": "IT Services \u0026 Consulting"
      },
      "secondary": null
    },
    "education": [
      {
        "level": "Master\u0027s",
        "qualification": "MBA - Business Administration",
        "raw": "Master of Business Administration",
        "requirement": "required"
      },
      {
        "level": "Bachelor\u0027s",
        "qualification": "BTECH/BE - Technology",
        "raw": "Bachelor of Technology",
        "requirement": "required"
      }
    ],
    "experience": {
      "max": 7,
      "min": 4,
      "raw": "4-7"
    },
    "job_locations": [],
    "role": "Senior Associate",
    "role_aliases": [
      "Senior Software Engineer",
      "Senior Developer",
      "Senior Data Engineer"
    ],
    "role_archetype": "Engineering",
    "roles_and_responsibilities": [
      {
        "bullet_count": 0,
        "heading": "Responsibilities",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "Design and implement scalable data",
          "last_5_words": "and develop GenAI applications using"
        },
        "text": "Design and implement scalable data ingestion and transformation pipelines using Azure Data Factory, Azure Synapse Pipelines, Databricks, and serverless compute; build and manage data lakes and lakehouse architectures using ADLS, Delta Lake, and enterprise Azure warehouse components; develop PySpark/Python data processing jobs for batch and streaming; implement real-time ingestion with Azure Event Hubs, Azure Stream Analytics, or Kafka; apply best practices for data modeling, partitioning, indexing, compression, cost optimization, and performance tuning across Azure platforms; ensure data quality, lineage, metadata management, and auditing across the lifecycle; implement security and governance with Azure AD, Managed Identities, Key Vault, network isolation, and fine-grained RBAC; design and develop GenAI applications using Azure OpenAI and Azure AI services; implement RAG architectures using Azure Cognitive Search / Azure AI Search and Azure data stores; integrate GenAI solutions with ADLS, Synapse, SQL, SharePoint, and Microsoft 365; build Python APIs, microservices, and backends with FastAPI/Flask hosted on Azure Functions, App Service, or AKS; implement prompt.",
        "word_count": 198
      },
      {
        "bullet_count": 0,
        "heading": "Mandatory skill sets",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "Azure Data Factory, Azure Synapse",
          "last_5_words": "CI/CD (Azure DevOps/GitHub)."
        },
        "text": "Azure Data Factory, Azure Synapse Analytics, Azure Databricks, ADLS Gen2, Azure Storage, PySpark, Python, SQL, Data Warehousing, ETL/ELT, Data Modeling, Azure SQL / SQL MI, Azure OpenAI Service, Azure Cognitive Services, Azure Cognitive Search / Azure AI Search, LangChain, LlamaIndex, REST API Development (FastAPI/Flask/Django), Azure Functions, Azure App Service, Azure AD, Managed Identities, RBAC, Azure Key Vault, Encryption, Docker, Git, CI/CD (Azure DevOps/GitHub).",
        "word_count": 43
      },
      {
        "bullet_count": 0,
        "heading": "Preferred skill sets",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "Azure Event Hubs, Kafka, Azure",
          "last_5_words": "for AI workloads."
        },
        "text": "Azure Event Hubs, Kafka, Azure Stream Analytics, Delta Lake / Lakehouse Architecture, Azure Purview, Power BI Integration, Azure Machine Learning, Azure OpenAI Integration, Terraform / Bicep, Cost Optimization on Azure Data Services, Real-time Analytics, Hugging Face Transformers, PEFT / LoRA Fine-tuning, Multi-cloud GenAI Platforms (Azure OpenAI / AWS Bedrock / Vertex AI), Prompt Engineering Techniques, Frontend Integration (React / Angular) for chatbots, LLMOps / MLOps, Monitoring \u0026 Logging for GenAI, Security \u0026 Compliance for AI workloads.",
        "word_count": 56
      }
    ],
    "urls": []
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "5f6e76e1-2900-47a8-94df-c46c76ce90f2",
  "stage3_signals": {
    "alias_found": true,
    "alias_match_roles": [
      {
        "display_name": "Backend Developer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 1,
        "score": 1.0,
        "slug": "backend-engineer",
        "total_count": null
      }
    ],
    "kra_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": [
          {
            "kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
            "sentence": "Design and implement scalable data ingestion and transformation pipelines using Azure Data Factory, Azure Synapse Pipelines, Databricks, and serverless compute; build and manage data lakes and lakehouse architectures using ADLS, Delta Lake, and enterprise Azure warehouse components; develop PySpark/Python data processing jobs for batch and streaming; implement real-time ingestion with Azure Event Hubs, Azure Stream Analytics, or Kafka; apply best practices for data modeling, partitioning, indexing, compression, cost optimization, and performance tuning across Azure platforms; ensure data quality, lineage, metadata management, and auditing across the lifecycle; implement security and governance with Azure AD, Managed Identities, Key Vault, network isolation, and fine-grained RBAC; design and develop GenAI applications using Azure OpenAI and Azure AI services; implement RAG architectures using Azure Cognitive Search / Azure AI Search and Azure data stores; integrate GenAI solutions with ADLS, Synapse, SQL, SharePoint, and Microsoft 365; build Python APIs, microservices, and backends with FastAPI/Flask hosted on Azure Functions, App Service, or AKS; implement prompt.",
            "similarity": 0.6154
          },
          {
            "kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
            "sentence": "Azure Data Factory, Azure Synapse Analytics, Azure Databricks, ADLS Gen2, Azure Storage, PySpark, Python, SQL, Data Warehousing, ETL/ELT, Data Modeling, Azure SQL / SQL MI, Azure OpenAI Service, Azure Cognitive Services, Azure Cognitive Search / Azure AI Search, LangChain, LlamaIndex, REST API Development (FastAPI/Flask/Django), Azure Functions, Azure App Service, Azure AD, Managed Identities, RBAC, Azure Key Vault, Encryption, Docker, Git, CI/CD (Azure DevOps/GitHub).",
            "similarity": 0.5622
          },
          {
            "kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
            "sentence": "Azure Event Hubs, Kafka, Azure Stream Analytics, Delta Lake / Lakehouse Architecture, Azure Purview, Power BI Integration, Azure Machine Learning, Azure OpenAI Integration, Terraform / Bicep, Cost Optimization on Azure Data Services, Real-time Analytics, Hugging Face Transformers, PEFT / LoRA Fine-tuning, Multi-cloud GenAI Platforms (Azure OpenAI / AWS Bedrock / Vertex AI), Prompt Engineering Techniques, Frontend Integration (React / Angular) for chatbots, LLMOps / MLOps, Monitoring \u0026 Logging for GenAI, Security \u0026 Compliance for AI workloads.",
            "similarity": 0.532
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 0.5699,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "AI Engineer",
        "kra_matches": [
          {
            "kra_text": "Designs and implements prompt engineering workflows, few-shot examples, chain-of-thought patterns, and structured output parsing for AI feature pipelines.",
            "sentence": "Design and implement scalable data ingestion and transformation pipelines using Azure Data Factory, Azure Synapse Pipelines, Databricks, and serverless compute; build and manage data lakes and lakehouse architectures using ADLS, Delta Lake, and enterprise Azure warehouse components; develop PySpark/Python data processing jobs for batch and streaming; implement real-time ingestion with Azure Event Hubs, Azure Stream Analytics, or Kafka; apply best practices for data modeling, partitioning, indexing, compression, cost optimization, and performance tuning across Azure platforms; ensure data quality, lineage, metadata management, and auditing across the lifecycle; implement security and governance with Azure AD, Managed Identities, Key Vault, network isolation, and fine-grained RBAC; design and develop GenAI applications using Azure OpenAI and Azure AI services; implement RAG architectures using Azure Cognitive Search / Azure AI Search and Azure data stores; integrate GenAI solutions with ADLS, Synapse, SQL, SharePoint, and Microsoft 365; build Python APIs, microservices, and backends with FastAPI/Flask hosted on Azure Functions, App Service, or AKS; implement prompt.",
            "similarity": 0.5599
          },
          {
            "kra_text": "Optimizes AI pipeline efficiency by tuning model selection, context window usage, prompt caching, and batching strategies to reduce cost and latency.",
            "sentence": "Azure Event Hubs, Kafka, Azure Stream Analytics, Delta Lake / Lakehouse Architecture, Azure Purview, Power BI Integration, Azure Machine Learning, Azure OpenAI Integration, Terraform / Bicep, Cost Optimization on Azure Data Services, Real-time Analytics, Hugging Face Transformers, PEFT / LoRA Fine-tuning, Multi-cloud GenAI Platforms (Azure OpenAI / AWS Bedrock / Vertex AI), Prompt Engineering Techniques, Frontend Integration (React / Angular) for chatbots, LLMOps / MLOps, Monitoring \u0026 Logging for GenAI, Security \u0026 Compliance for AI workloads.",
            "similarity": 0.5371
          },
          {
            "kra_text": "Integrates AI model API responses with application business logic, database writes, event publishing, and downstream service orchestration.",
            "sentence": "Azure Data Factory, Azure Synapse Analytics, Azure Databricks, ADLS Gen2, Azure Storage, PySpark, Python, SQL, Data Warehousing, ETL/ELT, Data Modeling, Azure SQL / SQL MI, Azure OpenAI Service, Azure Cognitive Services, Azure Cognitive Search / Azure AI Search, LangChain, LlamaIndex, REST API Development (FastAPI/Flask/Django), Azure Functions, Azure App Service, Azure AD, Managed Identities, RBAC, Azure Key Vault, Encryption, Docker, Git, CI/CD (Azure DevOps/GitHub).",
            "similarity": 0.4854
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 13,
        "score": 0.5274,
        "slug": "ai-engineer",
        "total_count": null
      },
      {
        "display_name": "Cloud Architect",
        "kra_matches": [
          {
            "kra_text": "Defines cloud adoption roadmaps, lift-and-shift vs. refactor migration strategies, and landing zone architectures for workloads moving to AWS, Azure, or GCP.",
            "sentence": "Azure Event Hubs, Kafka, Azure Stream Analytics, Delta Lake / Lakehouse Architecture, Azure Purview, Power BI Integration, Azure Machine Learning, Azure OpenAI Integration, Terraform / Bicep, Cost Optimization on Azure Data Services, Real-time Analytics, Hugging Face Transformers, PEFT / LoRA Fine-tuning, Multi-cloud GenAI Platforms (Azure OpenAI / AWS Bedrock / Vertex AI), Prompt Engineering Techniques, Frontend Integration (React / Angular) for chatbots, LLMOps / MLOps, Monitoring \u0026 Logging for GenAI, Security \u0026 Compliance for AI workloads.",
            "similarity": 0.5197
          },
          {
            "kra_text": "Defines cloud adoption roadmaps, lift-and-shift vs. refactor migration strategies, and landing zone architectures for workloads moving to AWS, Azure, or GCP.",
            "sentence": "Design and implement scalable data ingestion and transformation pipelines using Azure Data Factory, Azure Synapse Pipelines, Databricks, and serverless compute; build and manage data lakes and lakehouse architectures using ADLS, Delta Lake, and enterprise Azure warehouse components; develop PySpark/Python data processing jobs for batch and streaming; implement real-time ingestion with Azure Event Hubs, Azure Stream Analytics, or Kafka; apply best practices for data modeling, partitioning, indexing, compression, cost optimization, and performance tuning across Azure platforms; ensure data quality, lineage, metadata management, and auditing across the lifecycle; implement security and governance with Azure AD, Managed Identities, Key Vault, network isolation, and fine-grained RBAC; design and develop GenAI applications using Azure OpenAI and Azure AI services; implement RAG architectures using Azure Cognitive Search / Azure AI Search and Azure data stores; integrate GenAI solutions with ADLS, Synapse, SQL, SharePoint, and Microsoft 365; build Python APIs, microservices, and backends with FastAPI/Flask hosted on Azure Functions, App Service, or AKS; implement prompt.",
            "similarity": 0.5036
          },
          {
            "kra_text": "Defines cloud adoption roadmaps, lift-and-shift vs. refactor migration strategies, and landing zone architectures for workloads moving to AWS, Azure, or GCP.",
            "sentence": "Azure Data Factory, Azure Synapse Analytics, Azure Databricks, ADLS Gen2, Azure Storage, PySpark, Python, SQL, Data Warehousing, ETL/ELT, Data Modeling, Azure SQL / SQL MI, Azure OpenAI Service, Azure Cognitive Services, Azure Cognitive Search / Azure AI Search, LangChain, LlamaIndex, REST API Development (FastAPI/Flask/Django), Azure Functions, Azure App Service, Azure AD, Managed Identities, RBAC, Azure Key Vault, Encryption, Docker, Git, CI/CD (Azure DevOps/GitHub).",
            "similarity": 0.4857
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 9,
        "score": 0.503,
        "slug": "cloud-architect",
        "total_count": null
      },
      {
        "display_name": "AI Compliance Officer",
        "kra_matches": [
          {
            "kra_text": "Maps AI system behaviors and data processing activities to regulatory requirements including EU AI Act, GDPR, CCPA, and sector-specific compliance frameworks.",
            "sentence": "Azure Event Hubs, Kafka, Azure Stream Analytics, Delta Lake / Lakehouse Architecture, Azure Purview, Power BI Integration, Azure Machine Learning, Azure OpenAI Integration, Terraform / Bicep, Cost Optimization on Azure Data Services, Real-time Analytics, Hugging Face Transformers, PEFT / LoRA Fine-tuning, Multi-cloud GenAI Platforms (Azure OpenAI / AWS Bedrock / Vertex AI), Prompt Engineering Techniques, Frontend Integration (React / Angular) for chatbots, LLMOps / MLOps, Monitoring \u0026 Logging for GenAI, Security \u0026 Compliance for AI workloads.",
            "similarity": 0.5393
          },
          {
            "kra_text": "Maps AI system behaviors and data processing activities to regulatory requirements including EU AI Act, GDPR, CCPA, and sector-specific compliance frameworks.",
            "sentence": "Design and implement scalable data ingestion and transformation pipelines using Azure Data Factory, Azure Synapse Pipelines, Databricks, and serverless compute; build and manage data lakes and lakehouse architectures using ADLS, Delta Lake, and enterprise Azure warehouse components; develop PySpark/Python data processing jobs for batch and streaming; implement real-time ingestion with Azure Event Hubs, Azure Stream Analytics, or Kafka; apply best practices for data modeling, partitioning, indexing, compression, cost optimization, and performance tuning across Azure platforms; ensure data quality, lineage, metadata management, and auditing across the lifecycle; implement security and governance with Azure AD, Managed Identities, Key Vault, network isolation, and fine-grained RBAC; design and develop GenAI applications using Azure OpenAI and Azure AI services; implement RAG architectures using Azure Cognitive Search / Azure AI Search and Azure data stores; integrate GenAI solutions with ADLS, Synapse, SQL, SharePoint, and Microsoft 365; build Python APIs, microservices, and backends with FastAPI/Flask hosted on Azure Functions, App Service, or AKS; implement prompt.",
            "similarity": 0.5101
          },
          {
            "kra_text": "Maps AI system behaviors and data processing activities to regulatory requirements including EU AI Act, GDPR, CCPA, and sector-specific compliance frameworks.",
            "sentence": "Azure Data Factory, Azure Synapse Analytics, Azure Databricks, ADLS Gen2, Azure Storage, PySpark, Python, SQL, Data Warehousing, ETL/ELT, Data Modeling, Azure SQL / SQL MI, Azure OpenAI Service, Azure Cognitive Services, Azure Cognitive Search / Azure AI Search, LangChain, LlamaIndex, REST API Development (FastAPI/Flask/Django), Azure Functions, Azure App Service, Azure AD, Managed Identities, RBAC, Azure Key Vault, Encryption, Docker, Git, CI/CD (Azure DevOps/GitHub).",
            "similarity": 0.4581
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 12,
        "score": 0.5025,
        "slug": "ai-compliance-officer",
        "total_count": null
      },
      {
        "display_name": "MLOps Engineer",
        "kra_matches": [
          {
            "kra_text": "Automates ML platform operations including scheduled retraining triggers, pipeline orchestration, evaluation workflows, and alerting configuration.",
            "sentence": "Azure Event Hubs, Kafka, Azure Stream Analytics, Delta Lake / Lakehouse Architecture, Azure Purview, Power BI Integration, Azure Machine Learning, Azure OpenAI Integration, Terraform / Bicep, Cost Optimization on Azure Data Services, Real-time Analytics, Hugging Face Transformers, PEFT / LoRA Fine-tuning, Multi-cloud GenAI Platforms (Azure OpenAI / AWS Bedrock / Vertex AI), Prompt Engineering Techniques, Frontend Integration (React / Angular) for chatbots, LLMOps / MLOps, Monitoring \u0026 Logging for GenAI, Security \u0026 Compliance for AI workloads.",
            "similarity": 0.5231
          },
          {
            "kra_text": "Automates ML platform operations including scheduled retraining triggers, pipeline orchestration, evaluation workflows, and alerting configuration.",
            "sentence": "Design and implement scalable data ingestion and transformation pipelines using Azure Data Factory, Azure Synapse Pipelines, Databricks, and serverless compute; build and manage data lakes and lakehouse architectures using ADLS, Delta Lake, and enterprise Azure warehouse components; develop PySpark/Python data processing jobs for batch and streaming; implement real-time ingestion with Azure Event Hubs, Azure Stream Analytics, or Kafka; apply best practices for data modeling, partitioning, indexing, compression, cost optimization, and performance tuning across Azure platforms; ensure data quality, lineage, metadata management, and auditing across the lifecycle; implement security and governance with Azure AD, Managed Identities, Key Vault, network isolation, and fine-grained RBAC; design and develop GenAI applications using Azure OpenAI and Azure AI services; implement RAG architectures using Azure Cognitive Search / Azure AI Search and Azure data stores; integrate GenAI solutions with ADLS, Synapse, SQL, SharePoint, and Microsoft 365; build Python APIs, microservices, and backends with FastAPI/Flask hosted on Azure Functions, App Service, or AKS; implement prompt.",
            "similarity": 0.4771
          },
          {
            "kra_text": "Automates ML platform operations including scheduled retraining triggers, pipeline orchestration, evaluation workflows, and alerting configuration.",
            "sentence": "Azure Data Factory, Azure Synapse Analytics, Azure Databricks, ADLS Gen2, Azure Storage, PySpark, Python, SQL, Data Warehousing, ETL/ELT, Data Modeling, Azure SQL / SQL MI, Azure OpenAI Service, Azure Cognitive Services, Azure Cognitive Search / Azure AI Search, LangChain, LlamaIndex, REST API Development (FastAPI/Flask/Django), Azure Functions, Azure App Service, Azure AD, Managed Identities, RBAC, Azure Key Vault, Encryption, Docker, Git, CI/CD (Azure DevOps/GitHub).",
            "similarity": 0.4648
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 16,
        "score": 0.4883,
        "slug": "ml-ops-engineer",
        "total_count": null
      }
    ],
    "skill_match_roles": [
      {
        "display_name": "ML Engineer",
        "kra_matches": null,
        "matched_count": 9,
        "matched_skills": [
          "Azure App Service",
          "CI/CD",
          "Delta Lake",
          "LLMOps",
          "LangChain",
          "LlamaIndex",
          "MLOps",
          "Python",
          "Terraform"
        ],
        "role_id": 3,
        "score": 0.2,
        "slug": "ml-engineer",
        "total_count": 45
      },
      {
        "display_name": "MLOps Engineer",
        "kra_matches": null,
        "matched_count": 7,
        "matched_skills": [
          "Azure App Service",
          "Delta Lake",
          "LLMOps",
          "LangChain",
          "LlamaIndex",
          "MLOps",
          "Python"
        ],
        "role_id": 16,
        "score": 0.1556,
        "slug": "ml-ops-engineer",
        "total_count": 45
      },
      {
        "display_name": "Backend Developer",
        "kra_matches": null,
        "matched_count": 7,
        "matched_skills": [
          "Azure App Service",
          "Docker",
          "FastAPI",
          "Flask",
          "Kafka",
          "Python",
          "RBAC"
        ],
        "role_id": 1,
        "score": 0.1556,
        "slug": "backend-engineer",
        "total_count": 45
      },
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": 6,
        "matched_skills": [
          "Azure App Service",
          "Azure Synapse Analytics",
          "Kafka",
          "MLOps",
          "Python",
          "SQL"
        ],
        "role_id": 2,
        "score": 0.1333,
        "slug": "data-engineer",
        "total_count": 45
      },
      {
        "display_name": "DevOps Engineer",
        "kra_matches": null,
        "matched_count": 6,
        "matched_skills": [
          "Azure App Service",
          "Azure Key Vault",
          "Bicep",
          "CI/CD",
          "Docker",
          "Terraform"
        ],
        "role_id": 10,
        "score": 0.1333,
        "slug": "devops-engineer",
        "total_count": 45
      }
    ]
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "DOMAIN",
    "chosen_role": {
      "display_name": "Data Engineer",
      "kra_matches": null,
      "matched_count": null,
      "matched_skills": null,
      "role_id": 2,
      "score": 0.98,
      "slug": "data-engineer",
      "total_count": null
    },
    "confidence": 0.98,
    "is_new_role": false,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "matched_dimensions": [
      "Data Pipeline Engineering",
      "Lakehouse and Data Warehouse Architecture",
      "Batch and Streaming Data Processing",
      "Data Modeling and Performance Optimization",
      "Data Quality and Metadata Management",
      "Security and Governance",
      "GenAI Application Engineering",
      "API and Microservice Development"
    ],
    "matched_kras": [
      "Design and implement scalable data ingestion and transformation pipelines",
      "Build and manage data lakes and lakehouse architectures",
      "Develop PySpark/Python data processing jobs for batch and streaming",
      "Implement real-time ingestion with Azure Event Hubs",
      "Apply best practices for data modeling and performance tuning",
      "Ensure data quality, lineage, metadata management, and auditing",
      "Implement security and governance with Azure AD and Key Vault",
      "Design and develop GenAI applications using Azure OpenAI",
      "Implement RAG architectures using Azure Cognitive Search",
      "Build Python APIs, microservices, and backends with FastAPI/Flask"
    ],
    "matched_skills": [
      "Azure Data Factory",
      "Azure Synapse Pipelines",
      "Databricks",
      "ADLS",
      "Delta Lake",
      "PySpark",
      "Python",
      "Azure Event Hubs",
      "Azure Stream Analytics",
      "Kafka",
      "Azure OpenAI",
      "Azure Cognitive Search / Azure AI Search",
      "FastAPI",
      "Flask",
      "Azure Functions"
    ],
    "new_role_display_name": null,
    "new_role_slug": null,
    "queued": false,
    "reasoning": "Domain=Data Engineering \u0026 Analytics; The JD centers on building Azure-based data pipelines, lakehouse/data warehouse solutions, streaming ingestion, and related engineering, which best matches Data Engineer.",
    "sub_role": null
  },
  "stage5_updates": {
    "centroid_n_after": 252,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": null,
    "new_skills_attached": [
      {
        "is_primary": true,
        "queue_id": 12539,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Azure Data Factory",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12540,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Azure Databricks",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12541,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "ADLS Gen2",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12542,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Azure Storage",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12543,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "PySpark",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12544,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "ETL/ELT",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12545,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Modeling",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12546,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Azure SQL",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12547,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "SQL MI",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12548,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Azure OpenAI Service",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12549,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Azure Cognitive Services",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12550,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Azure AI Search",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12551,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Managed Identities",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12552,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Encryption",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12553,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Azure Event Hubs",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12554,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Azure Stream Analytics",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12555,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Lakehouse Architecture",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12556,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Azure Purview",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12557,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Azure Machine Learning",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 12558,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Hugging Face Transformers",
        "status": "pending"
      }
    ],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}
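
The `score` fields in `stage3_signals` above look consistent with two simple formulas: `matched_count / total_count` for `skill_match_roles`, and the mean of the three listed `similarity` values for `kra_match_roles`. This is inferred from the numbers in this run only; the pipeline's actual scoring code is not shown here. The sketch below is a minimal check under that assumption. The helper names and the tolerance are hypothetical, and the values are copied from the payload.

```python
from statistics import mean

# (matched_count, total_count, reported score) copied from skill_match_roles.
SKILL_MATCH = {
    "ml-engineer": (9, 45, 0.2),
    "ml-ops-engineer": (7, 45, 0.1556),
    "backend-engineer": (7, 45, 0.1556),
    "data-engineer": (6, 45, 0.1333),
    "devops-engineer": (6, 45, 0.1333),
}

# (similarities, reported score) copied from kra_match_roles.
KRA_MATCH = {
    "data-engineer": ([0.6154, 0.5622, 0.532], 0.5699),
    "ai-engineer": ([0.5599, 0.5371, 0.4854], 0.5274),
    "cloud-architect": ([0.5197, 0.5036, 0.4857], 0.503),
    "ai-compliance-officer": ([0.5393, 0.5101, 0.4581], 0.5025),
    "ml-ops-engineer": ([0.5231, 0.4771, 0.4648], 0.4883),
}

# Assumed tolerance: the displayed similarities are already rounded to four
# decimals, so a recomputed mean can differ slightly from the reported score.
TOLERANCE = 2e-4


def skill_ratio(matched: int, total: int) -> float:
    """Assumed skill score: fraction of the role's skills matched."""
    return matched / total


def kra_mean(similarities: list[float]) -> float:
    """Assumed KRA score: arithmetic mean of the listed similarities."""
    return mean(similarities)


skill_ok = {
    slug: abs(skill_ratio(m, t) - reported) <= TOLERANCE
    for slug, (m, t, reported) in SKILL_MATCH.items()
}
kra_ok = {
    slug: abs(kra_mean(sims) - reported) <= TOLERANCE
    for slug, (sims, reported) in KRA_MATCH.items()
}

if __name__ == "__main__":
    print("skill_match consistent:", skill_ok)
    print("kra_match consistent:", kra_ok)
```

The `stage4_decision` score of 0.98 is not covered by either formula; it matches the `confidence` value in the same block rather than any `stage3_signals` score.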
API 2 — extract-details
{}
API 3 — final-role-output
{}

LLM Calls

Every model call made for this run, in pipeline order.
