
Pipeline run

212cb41a-60b7-467d-a650-070a5dcee4fa

Pipeline LLM cost (USD)
API 1: $0.0038 · API 2: $0.0000 · API 3: $0.0000 · Total: $0.0038

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description
role baseline loaded sources · ai_index: jd · nature_of_work: jd · tech_stack_maturity: jd
Nature of work · Data pipeline development
Build and optimize Databricks PySpark ETL pipelines and workflows, moving data from source systems into Data Lake/Warehouse layers with cleansing, validation, and Medallion Architecture. Also tune SQL/Spark performance, add logging/monitoring, and document lineage/governance artifacts.
"designing, developing, and optimizing scalable ETL pipelines and data workflows using Databricks and Apache Spark"
Tech stack maturity
Modern Cloud Native
Apache Spark, Databricks, and SQL query optimization are strongly associated with cloud-native data platforms and modern distributed analytics stacks.
AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)
0.00 / 5
· Title match
· Has AI skill
· AI skill (primary)
· AI skill (secondary)
· On AI team
· Builds AI products
vocab breakdown (legacy)
Assistants (×1):
Frameworks (×2):
Models / concepts (×3):
Evidence — skills matched in JD (16)
Databricks · PySpark · Apache Spark · Spark SQL · SQL · ETL · Data Lake · Data Warehouse · Medallion Architecture · Databricks Workflows · Data Pipelines · Data Quality · Query Optimization · Performance Optimization · Data Governance · Compliance
Skill cluster (4 dimension groups, role-scoped)
ETL and ELT Tooling: Apache Spark
Performance and Cost Optimization: Query Optimization
Programming Languages for Data Work: SQL
Cross-cutting / unaligned: Databricks · PySpark · Spark SQL · ETL · Data Lake · Data Warehouse · Medallion Architecture · Databricks Workflows · Data Pipelines · Data Quality · Performance Optimization · Data Governance · Compliance
KRA description
We are looking for a highly skilled Databricks PySpark Developer to join our data platform implementation team. In this role, you will be responsible for designing, developing, and optimizing scalable ETL pipelines and data workflows using Databricks and Apache Spark. You will work closely with data engineers, data scientists, and BI teams to support advanced analytics and reporting requirements.
• ETL Development & Data Engineering Design, develop, and maintain scalable ETL processes using Databricks PySpark. Extract, transform, and load data from heterogeneous sources into Data Lake and Data Warehouse environments. Optimize ETL workflows for performance, scalability, and cost efficiency using Spark SQL and PySpark. Implement robust error handling, logging, and monitoring mechanisms for ETL jobs. Design and implement data solutions following Medallion Architecture (Bronze, Silver, Gold layers). Ensure data is cleansed, enriched, validated, and optimized at each layer for analytics consumption.
• Data Pipeline Management Hands-on experience in building and managing advanced data pipelines using Databricks Workflows. Develop and maintain reliable, reusable, and scalable pipelines ensuring data quality and integrity. Collaborate with cross-functional teams to translate business and analytics requirements into efficient data pipelines.
• Data Analysis & Query Optimization Write, review, and optimize complex SQL queries for data transformation, aggregation, and analysis. Perform query tuning and performance optimization on large-scale datasets within Databricks.
• Project Coordination & Continuous Improvement Participate in project planning, estimation, and delivery activities. Stay updated with the latest features in Databricks, Spark, and cloud data platforms, and recommend best practices. Document ETL processes, data lineage, metadata, and workflows to support data governance and compliance. Mentor junior developers and contribute to team knowledge sharing where required.

Signals

Skill data-engineer
0.21
Alias data-engineer
1.00
KRA data-engineer
0.68
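
The signal values above can be cross-checked against the API 1 payload for this run. The skill score of 0.21 is consistent with matched_count / total_count (3 / 14), and the KRA score of 0.68 is consistent with the mean of the three listed sentence similarities. The sketch below reproduces both numbers; the function names are hypothetical and the formulas are inferred from this run's figures, not taken from the pipeline's code.

```python
from statistics import mean


def skill_overlap_score(matched_count: int, total_count: int) -> float:
    """Share of skills matched, rounded to 4 places (inferred formula)."""
    if total_count <= 0:
        return 0.0
    return round(matched_count / total_count, 4)


def kra_score(similarities: list[float]) -> float:
    """Mean of the listed KRA sentence similarities (inferred formula)."""
    if not similarities:
        return 0.0
    return round(mean(similarities), 4)


# Values copied from stage3_signals for the data-engineer role in this run.
print(skill_overlap_score(3, 14))
print(kra_score([0.6961, 0.6691, 0.6656]))
```

The same two formulas also agree with the other candidate roles in the payload (for example 1 / 14 for the single-skill matches), which suggests, but does not prove, that this is how the scores are derived.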

Post-classification

Centroid updated · n=213
Alias collision log
New-role queue
New skills captured: 12
New KRA captured

Captured for admin review

PySpark · primary · Data Engineer · pending
Spark SQL · primary · Data Engineer · pending
ETL · primary · Data Engineer · pending
Data Lake · primary · Data Engineer · pending
Data Warehouse · primary · Data Engineer · pending
Medallion Architecture · primary · Data Engineer · pending
Databricks Workflows · primary · Data Engineer · pending
Data Pipelines · primary · Data Engineer · pending
Data Quality · primary · Data Engineer · pending
Performance Optimization · primary · Data Engineer · pending
Data Governance · secondary · Data Engineer · pending
Compliance · secondary · Data Engineer · pending
Status: extract_from_jd_done · Created: 2026-05-27T14:47:32.210817Z · Updated: 2026-06-12T17:18:05.066587Z
Flow: Current 3-step pipeline

1. POST /skills/extract-from-jd

2. POST /skills/extract-details

3. POST /skills/final-role-output
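
A minimal sketch of how a client might drive the three steps in order. Only the endpoint paths come from the flow listed here; the request fields (`jd_text`, passing `run_id` forward) and the injected `post` callable are assumptions for illustration, and the real request contract may differ.

```python
from typing import Callable

# Endpoint paths as listed in the flow; everything else below is assumed.
PIPELINE_STEPS = [
    "/skills/extract-from-jd",
    "/skills/extract-details",
    "/skills/final-role-output",
]


def run_pipeline(post: Callable[[str, dict], dict], jd_text: str) -> dict[str, dict]:
    """Call each step in order and collect the responses by path.

    `post` is any function that sends a JSON body to a path and returns the
    decoded JSON response (for example a thin wrapper around an HTTP client).
    """
    results: dict[str, dict] = {}
    payload = {"jd_text": jd_text}  # hypothetical request field
    for path in PIPELINE_STEPS:
        response = post(path, payload)
        results[path] = response
        # Stop early if a step reports a rejected JD.
        if response.get("rejected"):
            break
        # Hypothetical: later steps are keyed by the run_id from the response.
        payload = {"run_id": response.get("run_id")}
    return results
```

Injecting `post` keeps the sketch independent of any particular HTTP library and makes the step ordering easy to exercise with a stub.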

Role Chosen role & resolution

No chosen role stored for this run.

Job description

Role: Databricks PySpark Developer

Experience: 5+ years

Location: Bangalore (onsite, 5 days a week); no relocation candidates

Notice period: immediate joiners or candidates currently serving their notice period

Role Overview:

We are looking for a highly skilled Databricks PySpark Developer to join our data platform implementation team. In this role, you will be responsible for designing, developing, and optimizing scalable ETL pipelines and data workflows using Databricks and Apache Spark. You will work closely with data engineers, data scientists, and BI teams to support advanced analytics and reporting requirements.

Key Responsibilities:

• ETL Development & Data Engineering Design, develop, and maintain scalable ETL processes using Databricks PySpark. Extract, transform, and load data from heterogeneous sources into Data Lake and Data Warehouse environments. Optimize ETL workflows for performance, scalability, and cost efficiency using Spark SQL and PySpark. Implement robust error handling, logging, and monitoring mechanisms for ETL jobs. Design and implement data solutions following Medallion Architecture (Bronze, Silver, Gold layers). Ensure data is cleansed, enriched, validated, and optimized at each layer for analytics consumption.
• Data Pipeline Management Hands-on experience in building and managing advanced data pipelines using Databricks Workflows. Develop and maintain reliable, reusable, and scalable pipelines ensuring data quality and integrity. Collaborate with cross-functional teams to translate business and analytics requirements into efficient data pipelines.
• Data Analysis & Query Optimization Write, review, and optimize complex SQL queries for data transformation, aggregation, and analysis. Perform query tuning and performance optimization on large-scale datasets within Databricks.
• Project Coordination & Continuous Improvement Participate in project planning, estimation, and delivery activities. Stay updated with the latest features in Databricks, Spark, and cloud data platforms, and recommend best practices. Document ETL processes, data lineage, metadata, and workflows to support data governance and compliance. Mentor junior developers and contribute to team knowledge sharing where required.


Required Qualifications:

Bachelor’s degree in Computer Science, Engineering, or a related field.

5+ years of experience in ETL/Data Engineering roles with strong focus on Databricks PySpark.

Strong proficiency in Python, with hands-on experience in developing and debugging PySpark applications.

In-depth understanding of Apache Spark architecture, including RDDs, DataFrames, and Spark SQL.

Expertise in SQL development and optimization for large-scale data processing.

Proven experience working with data warehousing concepts and ETL frameworks.

Strong problem-solving and troubleshooting skills.

Excellent communication and collaboration skills.

Preferred Qualifications:

Experience working on cloud platforms, preferably AWS.

Hands-on experience with tools such as Databricks, Snowflake, Tableau, or similar data platforms.

Strong understanding of data governance, data quality, and best practices in data engineering.

Relevant certifications in Databricks, PySpark, Spark SQL, or cloud technologies.

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

Databricks · Primary · No API 2 row (run stopped after API 1 or history missing)
PySpark · Primary · No API 2 row (run stopped after API 1 or history missing)
Apache Spark · Primary · No API 2 row (run stopped after API 1 or history missing)
Spark SQL · Primary · No API 2 row (run stopped after API 1 or history missing)
SQL · Primary · No API 2 row (run stopped after API 1 or history missing)
ETL · Primary · No API 2 row (run stopped after API 1 or history missing)
Data Lake · Primary · No API 2 row (run stopped after API 1 or history missing)
Data Warehouse · Primary · No API 2 row (run stopped after API 1 or history missing)
Medallion Architecture · Primary · No API 2 row (run stopped after API 1 or history missing)
Databricks Workflows · Primary · No API 2 row (run stopped after API 1 or history missing)
Data Pipelines · Primary · No API 2 row (run stopped after API 1 or history missing)
Data Quality · Primary · No API 2 row (run stopped after API 1 or history missing)
Query Optimization · Primary · No API 2 row (run stopped after API 1 or history missing)
Performance Optimization · Primary · No API 2 row (run stopped after API 1 or history missing)
Data Governance · Secondary · No API 2 row (run stopped after API 1 or history missing)
Compliance · Secondary · No API 2 row (run stopped after API 1 or history missing)
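
The rows above can be thought of as a left join from the API 1 skill list onto the later stages. The sketch below is one way such a merge could work; joining on `skill_name` and the shape of the API 2 / API 3 lookups are assumptions, and only the `final_skills` fields and the placeholder message are taken from this page.

```python
NO_API2_ROW = "No API 2 row (run stopped after API 1 or history missing)"


def merge_skill_rows(api1_skills, api2_rows=None, api3_tags=None):
    """Build one display row per API 1 skill, filling gaps with a placeholder.

    api1_skills: list of {"skill_name": str, "is_primary": bool} (as in final_skills)
    api2_rows / api3_tags: hypothetical dicts keyed by skill name
    """
    api2_rows = api2_rows or {}
    api3_tags = api3_tags or {}
    rows = []
    for skill in api1_skills:
        name = skill["skill_name"]
        rows.append({
            "skill": name,
            "tier": "Primary" if skill["is_primary"] else "Secondary",
            "api2": api2_rows.get(name, NO_API2_ROW),
            "api3": api3_tags.get(name),  # None when nothing was persisted
        })
    return rows


# With empty API 2 and API 3 payloads, every row carries the placeholder.
for row in merge_skill_rows([
    {"skill_name": "Databricks", "is_primary": True},
    {"skill_name": "Compliance", "is_primary": False},
]):
    print(row["skill"], row["tier"], row["api2"], sep=" · ")
```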

Library artifacts (this run)

No artifact rows for this run.
nano JD Parser — gpt-4.1-nano
Role: Databricks PySpark Developer
Experience: 5+ years
Domain: IT Services & Consulting
Location: Bangalore, India (onsite)
JD type: pass

Certifications

Databricks · PySpark · Spark SQL

Raw JSON
{
  "JD_type": "pass",
  "about_company": null,
  "certifications": [
    "Databricks",
    "PySpark",
    "Spark SQL"
  ],
  "company_name": null,
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [],
      "domain": "IT Services \u0026 Consulting"
    },
    "secondary": null
  },
  "education": [
    {
      "level": "Bachelor\u0027s",
      "qualification": "BTECH/BE - Computer Science (or related)",
      "raw": "Bachelor\u2019s degree in Computer Science, Engineering, or a related field.",
      "requirement": "required"
    }
  ],
  "experience": {
    "max": null,
    "min": 5,
    "raw": "5+ years"
  },
  "job_locations": [
    {
      "aliases": [
        "Bengaluru"
      ],
      "city": "Bangalore",
      "country": "India",
      "state": null,
      "work_mode": "onsite"
    }
  ],
  "role": "Databricks PySpark Developer",
  "role_aliases": [
    "PySpark Developer",
    "ETL Developer",
    "Data Engineer"
  ],
  "role_archetype": "Data",
  "roles_and_responsibilities": [
    {
      "bullet_count": 0,
      "heading": "Role Overview",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "We are looking for a",
        "last_5_words": "analytics and reporting requirements."
      },
      "text": "We are looking for a highly skilled Databricks PySpark Developer to join our data platform implementation team. In this role, you will be responsible for designing, developing, and optimizing scalable ETL pipelines and data workflows using Databricks and Apache Spark. You will work closely with data engineers, data scientists, and BI teams to support advanced analytics and reporting requirements.",
      "word_count": 54
    },
    {
      "bullet_count": 4,
      "heading": "Key Responsibilities",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 ETL Development \u0026 Data Engineering",
        "last_5_words": "knowledge sharing where required."
      },
      "text": "\u2022 ETL Development \u0026 Data Engineering Design, develop, and maintain scalable ETL processes using Databricks PySpark. Extract, transform, and load data from heterogeneous sources into Data Lake and Data Warehouse environments. Optimize ETL workflows for performance, scalability, and cost efficiency using Spark SQL and PySpark. Implement robust error handling, logging, and monitoring mechanisms for ETL jobs. Design and implement data solutions following Medallion Architecture (Bronze, Silver, Gold layers). Ensure data is cleansed, enriched, validated, and optimized at each layer for analytics consumption.\n\u2022 Data Pipeline Management Hands-on experience in building and managing advanced data pipelines using Databricks Workflows. Develop and maintain reliable, reusable, and scalable pipelines ensuring data quality and integrity. Collaborate with cross-functional teams to translate business and analytics requirements into efficient data pipelines.\n\u2022 Data Analysis \u0026 Query Optimization Write, review, and optimize complex SQL queries for data transformation, aggregation, and analysis. Perform query tuning and performance optimization on large-scale datasets within Databricks.\n\u2022 Project Coordination \u0026 Continuous Improvement Participate in project planning, estimation, and delivery activities. Stay updated with the latest features in Databricks, Spark, and cloud data platforms, and recommend best practices. Document ETL processes, data lineage, metadata, and workflows to support data governance and compliance. Mentor junior developers and contribute to team knowledge sharing where required.",
      "word_count": 309
    }
  ],
  "urls": []
}
API 1 — extract-from-jd
{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Databricks"
    },
    {
      "is_primary": true,
      "skill_name": "PySpark"
    },
    {
      "is_primary": true,
      "skill_name": "Apache Spark"
    },
    {
      "is_primary": true,
      "skill_name": "Spark SQL"
    },
    {
      "is_primary": true,
      "skill_name": "SQL"
    },
    {
      "is_primary": true,
      "skill_name": "ETL"
    },
    {
      "is_primary": true,
      "skill_name": "Data Lake"
    },
    {
      "is_primary": true,
      "skill_name": "Data Warehouse"
    },
    {
      "is_primary": true,
      "skill_name": "Medallion Architecture"
    },
    {
      "is_primary": true,
      "skill_name": "Databricks Workflows"
    },
    {
      "is_primary": true,
      "skill_name": "Data Pipelines"
    },
    {
      "is_primary": true,
      "skill_name": "Data Quality"
    },
    {
      "is_primary": true,
      "skill_name": "Query Optimization"
    },
    {
      "is_primary": true,
      "skill_name": "Performance Optimization"
    },
    {
      "is_primary": false,
      "skill_name": "Data Governance"
    },
    {
      "is_primary": false,
      "skill_name": "Compliance"
    }
  ],
  "jd_role": {
    "display_name": "Databricks PySpark Developer",
    "rationale": null,
    "role_aliases": [
      "PySpark Developer",
      "ETL Developer",
      "Data Engineer"
    ],
    "role_archetype": "Data",
    "slug": ""
  },
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": null,
    "certifications": [
      "Databricks",
      "PySpark",
      "Spark SQL"
    ],
    "company_name": null,
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [],
        "domain": "IT Services \u0026 Consulting"
      },
      "secondary": null
    },
    "education": [
      {
        "level": "Bachelor\u0027s",
        "qualification": "BTECH/BE - Computer Science (or related)",
        "raw": "Bachelor\u2019s degree in Computer Science, Engineering, or a related field.",
        "requirement": "required"
      }
    ],
    "experience": {
      "max": null,
      "min": 5,
      "raw": "5+ years"
    },
    "job_locations": [
      {
        "aliases": [
          "Bengaluru"
        ],
        "city": "Bangalore",
        "country": "India",
        "state": null,
        "work_mode": "onsite"
      }
    ],
    "role": "Databricks PySpark Developer",
    "role_aliases": [
      "PySpark Developer",
      "ETL Developer",
      "Data Engineer"
    ],
    "role_archetype": "Data",
    "roles_and_responsibilities": [
      {
        "bullet_count": 0,
        "heading": "Role Overview",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "We are looking for a",
          "last_5_words": "analytics and reporting requirements."
        },
        "text": "We are looking for a highly skilled Databricks PySpark Developer to join our data platform implementation team. In this role, you will be responsible for designing, developing, and optimizing scalable ETL pipelines and data workflows using Databricks and Apache Spark. You will work closely with data engineers, data scientists, and BI teams to support advanced analytics and reporting requirements.",
        "word_count": 54
      },
      {
        "bullet_count": 4,
        "heading": "Key Responsibilities",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 ETL Development \u0026 Data Engineering",
          "last_5_words": "knowledge sharing where required."
        },
        "text": "\u2022 ETL Development \u0026 Data Engineering Design, develop, and maintain scalable ETL processes using Databricks PySpark. Extract, transform, and load data from heterogeneous sources into Data Lake and Data Warehouse environments. Optimize ETL workflows for performance, scalability, and cost efficiency using Spark SQL and PySpark. Implement robust error handling, logging, and monitoring mechanisms for ETL jobs. Design and implement data solutions following Medallion Architecture (Bronze, Silver, Gold layers). Ensure data is cleansed, enriched, validated, and optimized at each layer for analytics consumption.\n\u2022 Data Pipeline Management Hands-on experience in building and managing advanced data pipelines using Databricks Workflows. Develop and maintain reliable, reusable, and scalable pipelines ensuring data quality and integrity. Collaborate with cross-functional teams to translate business and analytics requirements into efficient data pipelines.\n\u2022 Data Analysis \u0026 Query Optimization Write, review, and optimize complex SQL queries for data transformation, aggregation, and analysis. Perform query tuning and performance optimization on large-scale datasets within Databricks.\n\u2022 Project Coordination \u0026 Continuous Improvement Participate in project planning, estimation, and delivery activities. Stay updated with the latest features in Databricks, Spark, and cloud data platforms, and recommend best practices. Document ETL processes, data lineage, metadata, and workflows to support data governance and compliance. Mentor junior developers and contribute to team knowledge sharing where required.",
        "word_count": 309
      }
    ],
    "urls": []
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "212cb41a-60b7-467d-a650-070a5dcee4fa",
  "stage3_signals": {
    "alias_found": true,
    "alias_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 1.0,
        "slug": "data-engineer",
        "total_count": null
      }
    ],
    "kra_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": [
          {
            "kra_text": "Implements data transformation, cleansing, deduplication, and enrichment logic to convert raw source data into analytics-ready curated datasets.",
            "sentence": "Ensure data is cleansed, enriched, validated, and optimized at each layer for analytics consumption.",
            "similarity": 0.6961
          },
          {
            "kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
            "sentence": "Collaborate with cross-functional teams to translate business and analytics requirements into efficient data pipelines.",
            "similarity": 0.6691
          },
          {
            "kra_text": "Maintains data catalog entries, column-level data lineage, and technical documentation to support data discoverability and governance across the organization.",
            "sentence": "Document ETL processes, data lineage, metadata, and workflows to support data governance and compliance.",
            "similarity": 0.6656
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 0.6769,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "Fullstack Developer",
        "kra_matches": [
          {
            "kra_text": "Designs and queries relational databases like PostgreSQL and document stores like MongoDB, writing migrations, indexes, and optimized queries.",
            "sentence": "Data Analysis \u0026 Query Optimization Write, review, and optimize complex SQL queries for data transformation, aggregation, and analysis.",
            "similarity": 0.5774
          },
          {
            "kra_text": "Delivers features through CI/CD pipelines using automated tests, staged rollouts, feature flags, and incremental deployments.",
            "sentence": "Develop and maintain reliable, reusable, and scalable pipelines ensuring data quality and integrity.",
            "similarity": 0.4976
          },
          {
            "kra_text": "Designs and queries relational databases like PostgreSQL and document stores like MongoDB, writing migrations, indexes, and optimized queries.",
            "sentence": "Perform query tuning and performance optimization on large-scale datasets within Databricks.",
            "similarity": 0.4848
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 15,
        "score": 0.5199,
        "slug": "full-stack-engineer",
        "total_count": null
      },
      {
        "display_name": "DevOps Engineer",
        "kra_matches": [
          {
            "kra_text": "Monitors CI/CD pipeline reliability, identifies bottlenecks in delivery workflows, and improves deployment frequency, lead time, and failure recovery rate.",
            "sentence": "Develop and maintain reliable, reusable, and scalable pipelines ensuring data quality and integrity.",
            "similarity": 0.5539
          },
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Collaborate with cross-functional teams to translate business and analytics requirements into efficient data pipelines.",
            "similarity": 0.5081
          },
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Mentor junior developers and contribute to team knowledge sharing where required.",
            "similarity": 0.4956
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 10,
        "score": 0.5192,
        "slug": "devops-engineer",
        "total_count": null
      },
      {
        "display_name": "ML Engineer",
        "kra_matches": [
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "ETL Development \u0026 Data Engineering Design, develop, and maintain scalable ETL processes using Databricks PySpark.",
            "similarity": 0.5113
          },
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "Data Pipeline Management Hands-on experience in building and managing advanced data pipelines using Databricks Workflows.",
            "similarity": 0.4995
          },
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "Develop and maintain reliable, reusable, and scalable pipelines ensuring data quality and integrity.",
            "similarity": 0.4839
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 3,
        "score": 0.4982,
        "slug": "ml-engineer",
        "total_count": null
      },
      {
        "display_name": "Svelte Frontend Developer",
        "kra_matches": [
          {
            "kra_text": "performance tuning",
            "sentence": "Perform query tuning and performance optimization on large-scale datasets within Databricks.",
            "similarity": 0.5383
          },
          {
            "kra_text": "backend data integration",
            "sentence": "Collaborate with cross-functional teams to translate business and analytics requirements into efficient data pipelines.",
            "similarity": 0.4781
          },
          {
            "kra_text": "backend data integration",
            "sentence": "Ensure data is cleansed, enriched, validated, and optimized at each layer for analytics consumption.",
            "similarity": 0.4507
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 92,
        "score": 0.489,
        "slug": "svelte-frontend-developer",
        "total_count": null
      }
    ],
    "skill_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": 3,
        "matched_skills": [
          "Apache Spark",
          "SQL",
          "query optimization"
        ],
        "role_id": 2,
        "score": 0.2143,
        "slug": "data-engineer",
        "total_count": 14
      },
      {
        "display_name": "Pega Developer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "SQL"
        ],
        "role_id": 24,
        "score": 0.0714,
        "slug": "pega-developer",
        "total_count": 14
      },
      {
        "display_name": "Backend Developer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "query optimization"
        ],
        "role_id": 1,
        "score": 0.0714,
        "slug": "backend-engineer",
        "total_count": 14
      },
      {
        "display_name": "Node.js Backend Developer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "query optimization"
        ],
        "role_id": 82,
        "score": 0.0714,
        "slug": "node-backend-developer",
        "total_count": 14
      },
      {
        "display_name": ".NET Backend Developer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "query optimization"
        ],
        "role_id": 83,
        "score": 0.0714,
        "slug": "dotnet-backend-developer",
        "total_count": 14
      }
    ]
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "A",
    "chosen_role": {
      "display_name": "Data Engineer",
      "kra_matches": null,
      "matched_count": null,
      "matched_skills": null,
      "role_id": 2,
      "score": 1.0,
      "slug": "data-engineer",
      "total_count": null
    },
    "confidence": 1.0,
    "is_new_role": false,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "matched_dimensions": [],
    "matched_kras": [],
    "matched_skills": [],
    "new_role_display_name": null,
    "new_role_slug": null,
    "queued": false,
    "reasoning": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.21 does not contradict",
    "sub_role": null
  },
  "stage5_updates": {
    "centroid_n_after": 213,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": null,
    "new_skills_attached": [
      {
        "is_primary": true,
        "queue_id": 10989,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "PySpark",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 10990,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Spark SQL",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 10991,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "ETL",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 10992,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Lake",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 10993,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Warehouse",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 10994,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Medallion Architecture",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 10995,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Databricks Workflows",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 10996,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Pipelines",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 10997,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Quality",
        "status": "pending"
      },
      {
        "is_primary": true,
        "queue_id": 10998,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Performance Optimization",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 10999,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Data Governance",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 11000,
        "role_display_name": "Data Engineer",
        "role_slug": "data-engineer",
        "skill_name": "Compliance",
        "status": "pending"
      }
    ],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}
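
The `stage4_decision.reasoning` string in the payload above describes case "A" as an exact alias hit with no competing alias at the same confidence and a top skill match that "does not contradict" it. The sketch below is one possible reading of that rule, not the pipeline's actual logic: the function name is hypothetical, and "does not contradict" is interpreted here as "the highest-scoring skill role is the same role".

```python
from typing import Optional


def decide_case_a(alias_matches: list[dict], skill_matches: list[dict]) -> Optional[str]:
    """Return the chosen role slug when the assumed case-A rule holds, else None."""
    # Exactly one alias must match at full confidence.
    exact = [m for m in alias_matches if m["score"] >= 1.0]
    if len(exact) != 1:
        return None
    chosen = exact[0]["slug"]
    # Assumed interpretation: the top skill-overlap role must agree.
    if skill_matches:
        top_skill = max(skill_matches, key=lambda m: m["score"])
        if top_skill["slug"] != chosen:
            return None
    return chosen


# Values copied from stage3_signals for this run.
alias = [{"slug": "data-engineer", "score": 1.0}]
skills = [
    {"slug": "data-engineer", "score": 0.2143},
    {"slug": "pega-developer", "score": 0.0714},
]
print(decide_case_a(alias, skills))
```

Under this reading the run resolves to `data-engineer`, which matches the `chosen_role` recorded in the payload.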
API 2 — extract-details
{}
API 3 — final-role-output
{}
