Pipeline run

c8124d96-207f-470a-8c8d-b437d47ba9fc

Pipeline LLM cost (USD)

API 1: $0.0097 · API 2: $0.0000 · API 3: $0.0000 · Total: $0.0097
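
The total is the sum of the per-API costs. A minimal sketch of that arithmetic, assuming the three figures shown are the only cost components (the variable names are illustrative and not taken from the pipeline):

```python
from decimal import Decimal

# Per-API LLM costs in USD, copied from the cost line for this run.
api_costs = {
    "API 1": Decimal("0.0097"),
    "API 2": Decimal("0.0000"),
    "API 3": Decimal("0.0000"),
}

# Decimal avoids binary floating-point noise when adding small currency amounts.
total = sum(api_costs.values(), Decimal("0"))
print(f"Total: ${total:.4f}")
```

Only API 1 contributes here, which matches the later note that the run has no API 2 rows.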

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description

Nature of work · Data Engineering / Data Platform

Build and ship a data governance platform: design scalable tooling for metadata, lineage, quality, and compliance, while setting standards, partnering with cross-functional teams, and mentoring engineers.

"Lead initiatives around metadata management, data lineage, and data cataloging."

Tech stack maturity

Modern Cloud Native · cache hit

The stack centers on cloud data platforms and distributed data tooling like Snowflake, Kafka, Airflow, Spark, and modern languages/services, which is characteristic of modern cloud-native engineering.

AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)

0.00 / 5

· Title match

· Has AI skill

· AI skill (primary)

· AI skill (secondary)

· On AI team

· Builds AI products

vocab breakdown (legacy)

Assistants (×1): —

Frameworks (×2): —

Models / concepts (×3): —

Evidence — skills matched in JD (16)

Python Java Scala Go REST Kafka Spark Airflow Snowflake Apache Atlas Amundsen DataHub AWS Azure GCP HIPAA

Skill cluster (9 dimension groups, role-scoped)

Cloud Provider Platforms

AWS Azure GCP

Asynchronous Messaging and Event Streaming

Kafka

Compliance and Security Frameworks

HIPAA

Go Language and Toolchain

Go

Integration Protocols & Standards

REST

Java Language and JVM

Java

Programming Languages for Data Work

Scala

Python Programming

Python

Cross-cutting / unaligned

Spark Airflow Snowflake Apache Atlas Amundsen DataHub

KRA description

We are looking for a Staff Engineer - Data Governance to lead the design and development of a scalable, secure, and robust data governance platform. You will play a key role in building data platform capabilities for data quality, metadata management, lineage tracking, and compliance across all data layers. If you’re passionate about building foundational data infrastructure that accelerates innovation in healthcare, we’d love to talk.

• Architect, design, and build scalable data governance tools and frameworks.
• Collaborate with cross-functional teams to ensure data compliance, security, and usability.
• Lead initiatives around metadata management, data lineage, and data cataloging.
• Define and evangelize standards and best practices across data engineering teams.
• Own the end-to-end lifecycle of governance tooling – from prototyping to production deployment.
• Mentor and guide junior engineers and contribute to technical leadership across the organization.
• Drive innovation in privacy-by-design, regulatory compliance (e.g., HIPAA), and data observability solutions.
• 8+ years of experience in software engineering, with 3+ years focused on data governance or related tooling.
• Strong experience building distributed systems for metadata management, data lineage, and quality tracking.
• Proficient in backend development (Python, Java, or Scala or Go) and familiar with RESTful API design.
• Expertise in modern data stacks: Kafka, Spark, Airflow, Snowflake etc.
• Experience with open-source data governance frameworks like Apache Atlas, Amundsen, or DataHub is a big plus.
• Familiarity with cloud platforms (AWS, Azure, GCP) and their native data governance offerings.
• Prior experience in building metadata management frameworks for scale.

Signals

Skill · data-engineer · 0.67

Alias · data-engineer · 1.00

KRA · data-engineer · 0.63
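
The skill and KRA scores for data-engineer can be reproduced from the raw API 1 response on this page. The figures are consistent with the skill score being matched_count / total_count and the KRA score being the mean of the three listed sentence similarities. Both formulas are inferred from the numbers, not from the pipeline source, so treat them as assumptions; the function names below are hypothetical.

```python
from statistics import mean

# Values copied from the stage3_signals block for the data-engineer role.
matched_count, total_count = 6, 9
kra_similarities = [0.6556, 0.6336, 0.5975]

def skill_score(matched: int, total: int) -> float:
    """Assumed formula: matched_count divided by total_count."""
    return round(matched / total, 4)

def kra_score(similarities: list[float]) -> float:
    """Assumed formula: mean of the listed KRA-to-sentence similarities."""
    return round(mean(similarities), 4)

print(skill_score(matched_count, total_count))  # 0.6667
print(kra_score(kra_similarities))              # 0.6289
```

Rounded to two decimals these give the 0.67 and 0.63 shown in the Signals panel. The same mean also reproduces the other kra_match_roles scores in the response, for example 0.5017 for cloud-architect.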

Post-classification

Centroid: updated · n=7

Alias collision log: none

New-role queue: none

New skills captured: 3

New KRA captured: yes

Captured for admin review

Apache Atlas ↔ Data Governance Engineer pending

Amundsen ↔ Data Governance Engineer pending

DataHub ↔ Data Governance Engineer pending

R&R fragment (sim 0.00) ↔ Data Governance Engineer pending

Status: extract_from_jd_done · Created: 2026-05-27T15:50:42.921362Z · Updated: 2026-06-12T15:52:19.765761Z

Flow · Current 3-step pipeline

1. POST /skills/extract-from-jd

2. POST /skills/extract-details

3. POST /skills/final-role-output
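
The three steps can be sketched as a small driver that calls each endpoint in order. Only the endpoint paths and the `run_id` and `rejected` response keys come from this page; the `post` transport, the `jd_text` and `previous` payload keys, and the idea of chaining each response into the next request are assumptions made for illustration.

```python
import json
from typing import Callable

# The three endpoints, in the order the flow lists them.
PIPELINE_STEPS = [
    "/skills/extract-from-jd",
    "/skills/extract-details",
    "/skills/final-role-output",
]

def run_pipeline(jd_text: str, post: Callable[[str, dict], dict]) -> dict:
    """Call each step in order, feeding the previous response into the next.

    `post(path, payload)` is a hypothetical transport supplied by the caller;
    the payload keys used here are illustrative assumptions.
    """
    payload = {"jd_text": jd_text}
    response: dict = {}
    for path in PIPELINE_STEPS:
        response = post(path, payload)
        if response.get("rejected"):
            # Stop early if a step marks the JD as rejected.
            break
        # Carry the run_id forward so later steps refer to the same run.
        payload = {"run_id": response.get("run_id"), "previous": response}
    return response

def fake_post(path: str, payload: dict) -> dict:
    """Stand-in transport for local experimentation; it makes no network calls."""
    return {"run_id": "demo-run", "rejected": False, "step": path}

result = run_pipeline("Staff Engineer - Data Governance ...", fake_post)
print(json.dumps(result, indent=2))
```

Passing the transport in as a callable keeps the sketch independent of any particular HTTP client or base URL, neither of which is shown on this page.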

Role Chosen role & resolution

No chosen role stored for this run.

Job description

Hello! You've landed on this page, which means you're interested in working with us. Let's take a sneak peek at what it's like to work at Innovaccer.

Engineering at Innovaccer
With every line of code, we accelerate our customers' success, turning complex challenges into innovative solutions. Collaboratively, we transform each data point we gather into valuable insights for our customers. Join us and be part of a team that's turning dreams of better healthcare into reality, one line of code at a time. Together, we’re shaping the future and making a meaningful impact on the world.

About the Role
We are looking for a Staff Engineer - Data Governance to lead the design and development of a scalable, secure, and robust data governance platform. You will play a key role in building data platform capabilities for data quality, metadata management, lineage tracking, and compliance across all data layers. If you’re passionate about building foundational data infrastructure that accelerates innovation in healthcare, we’d love to talk.

A Day in the Life
• Architect, design, and build scalable data governance tools and frameworks.
• Collaborate with cross-functional teams to ensure data compliance, security, and usability.
• Lead initiatives around metadata management, data lineage, and data cataloging.
• Define and evangelize standards and best practices across data engineering teams.
• Own the end-to-end lifecycle of governance tooling – from prototyping to production deployment.
• Mentor and guide junior engineers and contribute to technical leadership across the organization.
• Drive innovation in privacy-by-design, regulatory compliance (e.g., HIPAA), and data observability solutions.

What You Need
• 8+ years of experience in software engineering, with 3+ years focused on data governance or related tooling.
• Strong experience building distributed systems for metadata management, data lineage, and quality tracking.
• Proficient in backend development (Python, Java, or Scala or Go) and familiar with RESTful API design.
• Expertise in modern data stacks: Kafka, Spark, Airflow, Snowflake etc.
• Experience with open-source data governance frameworks like Apache Atlas, Amundsen, or DataHub is a big plus.
• Familiarity with cloud platforms (AWS, Azure, GCP) and their native data governance offerings.
• Prior experience in building metadata management frameworks for scale.
• Bachelor's or Master’s degree in Computer Science, Engineering, or a related field.

Here’s What We Offer
• Generous Leave Benefits: Enjoy generous leave benefits of up to 40 days.
• Parental Leave: Experience one of the industry's best parental leave policies to spend time with your new addition.
• Sabbatical Leave Policy: Want to focus on skill development, pursue an academic career, or just take a break? We've got you covered.
• Health Insurance: We offer health benefits and insurance to you and your family for medically related expenses related to illness, disease, or injury.
• Pet-Friendly Office*: Spend more time with your treasured friends, even when you're away from home. Bring your furry friends with you to the office and let your colleagues become their friends, too. *Noida office only
• Creche Facility for children*: Say goodbye to worries and hello to a convenient and reliable creche facility that puts your child's well-being first. *India offices

Where and how we work
Our Noida office is situated in a posh techspace, equipped with various amenities to support our work environment. Here, we follow a five-day work schedule, allowing us to efficiently carry out our tasks and collaborate effectively within our team. Innovaccer is an equal-opportunity employer. We celebrate diversity, and we are committed to fostering an inclusive and diverse workplace where all employees, regardless of race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, marital status, or veteran status, feel valued and empowered.

Disclaimer: Innovaccer does not charge fees or require payment from individuals or agencies for securing employment with us. We do not guarantee job spots or engage in any financial transactions related to employment. If you encounter any posts or requests asking for payment or personal information, we strongly advise you to report them immediately to our HR department at px@innovaccer.com. Additionally, please exercise caution and verify the authenticity of any requests before disclosing personal and confidential information, including bank account details.

About Innovaccer
Innovaccer Inc. is the data platform that accelerates innovation. The Innovaccer platform unifies patient data across systems and care settings, and empowers healthcare organizations with scalable, modern applications that improve clinical, financial, operational, and experiential outcomes. Innovaccer’s EHR-agnostic solutions have been deployed across more than 1,600 hospitals and clinics in the US, enabling care delivery transformation for more than 96,000 clinicians, and helping providers work collaboratively with payers and life sciences companies. Innovaccer has helped its customers unify health records for more than 54 million people and generate over $1.5 billion in cumulative cost savings. The Innovaccer platform is the #1 rated Best-in-KLAS data and analytics platform by KLAS, and the #1 rated population health technology platform by Black Book. For more information, please visit innovaccer.com.
Check us out on YouTube, Glassdoor, LinkedIn, and innovaccer.com.

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

Python · Primary · No API 2 row (run stopped after API 1 or history missing)

Java · Primary · No API 2 row (run stopped after API 1 or history missing)

Scala · Primary · No API 2 row (run stopped after API 1 or history missing)

Go · Primary · No API 2 row (run stopped after API 1 or history missing)

REST · Primary · No API 2 row (run stopped after API 1 or history missing)

Kafka · Primary · No API 2 row (run stopped after API 1 or history missing)

Spark · Primary · No API 2 row (run stopped after API 1 or history missing)

Airflow · Primary · No API 2 row (run stopped after API 1 or history missing)

Snowflake · Primary · No API 2 row (run stopped after API 1 or history missing)

Apache Atlas · Secondary · No API 2 row (run stopped after API 1 or history missing)

Amundsen · Secondary · No API 2 row (run stopped after API 1 or history missing)

DataHub · Secondary · No API 2 row (run stopped after API 1 or history missing)

AWS · Secondary · No API 2 row (run stopped after API 1 or history missing)

Azure · Secondary · No API 2 row (run stopped after API 1 or history missing)

GCP · Secondary · No API 2 row (run stopped after API 1 or history missing)

HIPAA · Secondary · No API 2 row (run stopped after API 1 or history missing)

Library artifacts (this run)

No artifact rows for this run.

nano JD Parser · gpt-4.1-nano

Role: Staff Engineer - Data Governance

Company: Innovaccer

Experience: 8+ years of experience in software engineering, with 3+ years focused on data governance or related tooling.

Domain: Healthcare

Location: Noida, India (onsite)

JD type: pass

Raw JSON

{
  "JD_type": "pass",
  "about_company": {
    "source_marker": {
      "first_5_words": "Innovaccer Inc. is the data",
      "last_5_words": "visit innovaccer.com."
    },
    "text": "Innovaccer Inc. is the data platform that accelerates innovation. The Innovaccer platform unifies patient data across systems and care settings, and empowers healthcare organizations with scalable, modern applications that improve clinical, financial, operational, and experiential outcomes. Innovaccer\u2019s EHR-agnostic solutions have been deployed across more than 1,600 hospitals and clinics in the US, enabling care delivery transformation for more than 96,000 clinicians, and helping providers work collaboratively with payers and life sciences companies. Innovaccer has helped its customers unify health records for more than 54 million people and generate over $1.5 billion in cumulative cost savings. The Innovaccer platform is the #1 rated Best-in-KLAS data and analytics platform by KLAS, and the #1 rated population health technology platform by Black Book. For more information, please visit innovaccer.com.",
    "word_count": 128
  },
  "certifications": [],
  "company_name": "Innovaccer",
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [
        "HealthTech",
        "MedTech"
      ],
      "domain": "Healthcare"
    },
    "secondary": null
  },
  "education": [
    {
      "level": "Bachelor\u0027s",
      "qualification": "BTECH/BE/BSC - Computer Science",
      "raw": "Bachelor\u0027s or Master\u2019s degree in Computer Science, Engineering, or a related field.",
      "requirement": "required"
    },
    {
      "level": "Master\u0027s",
      "qualification": "MTECH/ME/MSC - Computer Science",
      "raw": "Bachelor\u0027s or Master\u2019s degree in Computer Science, Engineering, or a related field.",
      "requirement": "preferred"
    }
  ],
  "experience": {
    "max": null,
    "min": 8,
    "raw": "8+ years of experience in software engineering, with 3+ years focused on data governance or related tooling."
  },
  "job_locations": [
    {
      "aliases": [
        "Noida, UP"
      ],
      "city": "Noida",
      "country": "India",
      "state": "Uttar Pradesh",
      "work_mode": "onsite"
    }
  ],
  "role": "Staff Engineer - Data Governance",
  "role_aliases": [
    "Data Governance Engineer",
    "Data Engineer",
    "Staff Data Engineer"
  ],
  "role_archetype": "Engineering",
  "roles_and_responsibilities": [
    {
      "bullet_count": 0,
      "heading": "About the Role",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "We are looking for a",
        "last_5_words": "we\u2019d love to talk."
      },
      "text": "We are looking for a Staff Engineer - Data Governance to lead the design and development of a scalable, secure, and robust data governance platform. You will play a key role in building data platform capabilities for data quality, metadata management, lineage tracking, and compliance across all data layers. If you\u2019re passionate about building foundational data infrastructure that accelerates innovation in healthcare, we\u2019d love to talk.",
      "word_count": 56
    },
    {
      "bullet_count": 7,
      "heading": "A Day in the Life",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 Architect, design, and build",
        "last_5_words": "and data observability solutions."
      },
      "text": "\u2022 Architect, design, and build scalable data governance tools and frameworks.\n\u2022 Collaborate with cross-functional teams to ensure data compliance, security, and usability.\n\u2022 Lead initiatives around metadata management, data lineage, and data cataloging.\n\u2022 Define and evangelize standards and best practices across data engineering teams.\n\u2022 Own the end-to-end lifecycle of governance tooling \u2013 from prototyping to production deployment.\n\u2022 Mentor and guide junior engineers and contribute to technical leadership across the organization.\n\u2022 Drive innovation in privacy-by-design, regulatory compliance (e.g., HIPAA), and data observability solutions.",
      "word_count": 92
    },
    {
      "bullet_count": 7,
      "heading": "What You Need",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 8+ years of experience in",
        "last_5_words": "management frameworks for scale."
      },
      "text": "\u2022 8+ years of experience in software engineering, with 3+ years focused on data governance or related tooling.\n\u2022 Strong experience building distributed systems for metadata management, data lineage, and quality tracking.\n\u2022 Proficient in backend development (Python, Java, or Scala or Go) and familiar with RESTful API design.\n\u2022 Expertise in modern data stacks: Kafka, Spark, Airflow, Snowflake etc.\n\u2022 Experience with open-source data governance frameworks like Apache Atlas, Amundsen, or DataHub is a big plus.\n\u2022 Familiarity with cloud platforms (AWS, Azure, GCP) and their native data governance offerings.\n\u2022 Prior experience in building metadata management frameworks for scale.",
      "word_count": 104
    }
  ],
  "urls": [
    {
      "type": "website",
      "url": "https://innovaccer.com"
    },
    {
      "type": "linkedin",
      "url": "https://www.linkedin.com/company/innovaccer"
    },
    {
      "type": "other",
      "url": "https://www.glassdoor.com/Overview/Working-at-Innovaccer-EI_IE1234567.11,21.htm"
    },
    {
      "type": "youtube",
      "url": "https://www.youtube.com/c/Innovaccer"
    }
  ]
}
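
Each extracted section in the parser output carries a `source_marker` with the opening and closing words of its `text`. One plausible use, assumed here and not confirmed by this page, is to check that the extracted text really begins and ends where the marker says. A minimal sketch of such a check, using a shortened copy of the "About the Role" entry (the middle sentence is omitted for brevity) and a hypothetical helper name:

```python
import json

def marker_matches(section: dict) -> bool:
    """Return True if the section text starts and ends with its source markers."""
    marker = section["source_marker"]
    text = section["text"].strip()
    return text.startswith(marker["first_5_words"]) and text.endswith(marker["last_5_words"])

# Shortened copy of the "About the Role" entry from the parser JSON.
# The raw string keeps the \u2019 escapes intact for json.loads to decode.
raw = r"""
{
  "heading": "About the Role",
  "source_marker": {
    "first_5_words": "We are looking for a",
    "last_5_words": "we\u2019d love to talk."
  },
  "text": "We are looking for a Staff Engineer - Data Governance to lead the design and development of a scalable, secure, and robust data governance platform. If you\u2019re passionate about building foundational data infrastructure that accelerates innovation in healthcare, we\u2019d love to talk."
}
"""

section = json.loads(raw)
print(marker_matches(section))  # True
```

Note that `last_5_words` is not always five words long in this run (the company marker ends with "visit innovaccer.com."), so a suffix comparison is safer than counting words.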

API 1 · extract-from-jd

{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Python"
    },
    {
      "is_primary": true,
      "skill_name": "Java"
    },
    {
      "is_primary": true,
      "skill_name": "Scala"
    },
    {
      "is_primary": true,
      "skill_name": "Go"
    },
    {
      "is_primary": true,
      "skill_name": "REST"
    },
    {
      "is_primary": true,
      "skill_name": "Kafka"
    },
    {
      "is_primary": true,
      "skill_name": "Spark"
    },
    {
      "is_primary": true,
      "skill_name": "Airflow"
    },
    {
      "is_primary": true,
      "skill_name": "Snowflake"
    },
    {
      "is_primary": false,
      "skill_name": "Apache Atlas"
    },
    {
      "is_primary": false,
      "skill_name": "Amundsen"
    },
    {
      "is_primary": false,
      "skill_name": "DataHub"
    },
    {
      "is_primary": false,
      "skill_name": "AWS"
    },
    {
      "is_primary": false,
      "skill_name": "Azure"
    },
    {
      "is_primary": false,
      "skill_name": "GCP"
    },
    {
      "is_primary": false,
      "skill_name": "HIPAA"
    }
  ],
  "jd_role": {
    "display_name": "Staff Engineer - Data Governance",
    "rationale": null,
    "role_aliases": [
      "Data Governance Engineer",
      "Data Engineer",
      "Staff Data Engineer"
    ],
    "role_archetype": "Engineering",
    "slug": ""
  },
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": {
      "source_marker": {
        "first_5_words": "Innovaccer Inc. is the data",
        "last_5_words": "visit innovaccer.com."
      },
      "text": "Innovaccer Inc. is the data platform that accelerates innovation. The Innovaccer platform unifies patient data across systems and care settings, and empowers healthcare organizations with scalable, modern applications that improve clinical, financial, operational, and experiential outcomes. Innovaccer\u2019s EHR-agnostic solutions have been deployed across more than 1,600 hospitals and clinics in the US, enabling care delivery transformation for more than 96,000 clinicians, and helping providers work collaboratively with payers and life sciences companies. Innovaccer has helped its customers unify health records for more than 54 million people and generate over $1.5 billion in cumulative cost savings. The Innovaccer platform is the #1 rated Best-in-KLAS data and analytics platform by KLAS, and the #1 rated population health technology platform by Black Book. For more information, please visit innovaccer.com.",
      "word_count": 128
    },
    "certifications": [],
    "company_name": "Innovaccer",
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [
          "HealthTech",
          "MedTech"
        ],
        "domain": "Healthcare"
      },
      "secondary": null
    },
    "education": [
      {
        "level": "Bachelor\u0027s",
        "qualification": "BTECH/BE/BSC - Computer Science",
        "raw": "Bachelor\u0027s or Master\u2019s degree in Computer Science, Engineering, or a related field.",
        "requirement": "required"
      },
      {
        "level": "Master\u0027s",
        "qualification": "MTECH/ME/MSC - Computer Science",
        "raw": "Bachelor\u0027s or Master\u2019s degree in Computer Science, Engineering, or a related field.",
        "requirement": "preferred"
      }
    ],
    "experience": {
      "max": null,
      "min": 8,
      "raw": "8+ years of experience in software engineering, with 3+ years focused on data governance or related tooling."
    },
    "job_locations": [
      {
        "aliases": [
          "Noida, UP"
        ],
        "city": "Noida",
        "country": "India",
        "state": "Uttar Pradesh",
        "work_mode": "onsite"
      }
    ],
    "role": "Staff Engineer - Data Governance",
    "role_aliases": [
      "Data Governance Engineer",
      "Data Engineer",
      "Staff Data Engineer"
    ],
    "role_archetype": "Engineering",
    "roles_and_responsibilities": [
      {
        "bullet_count": 0,
        "heading": "About the Role",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "We are looking for a",
          "last_5_words": "we\u2019d love to talk."
        },
        "text": "We are looking for a Staff Engineer - Data Governance to lead the design and development of a scalable, secure, and robust data governance platform. You will play a key role in building data platform capabilities for data quality, metadata management, lineage tracking, and compliance across all data layers. If you\u2019re passionate about building foundational data infrastructure that accelerates innovation in healthcare, we\u2019d love to talk.",
        "word_count": 56
      },
      {
        "bullet_count": 7,
        "heading": "A Day in the Life",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 Architect, design, and build",
          "last_5_words": "and data observability solutions."
        },
        "text": "\u2022 Architect, design, and build scalable data governance tools and frameworks.\n\u2022 Collaborate with cross-functional teams to ensure data compliance, security, and usability.\n\u2022 Lead initiatives around metadata management, data lineage, and data cataloging.\n\u2022 Define and evangelize standards and best practices across data engineering teams.\n\u2022 Own the end-to-end lifecycle of governance tooling \u2013 from prototyping to production deployment.\n\u2022 Mentor and guide junior engineers and contribute to technical leadership across the organization.\n\u2022 Drive innovation in privacy-by-design, regulatory compliance (e.g., HIPAA), and data observability solutions.",
        "word_count": 92
      },
      {
        "bullet_count": 7,
        "heading": "What You Need",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 8+ years of experience in",
          "last_5_words": "management frameworks for scale."
        },
        "text": "\u2022 8+ years of experience in software engineering, with 3+ years focused on data governance or related tooling.\n\u2022 Strong experience building distributed systems for metadata management, data lineage, and quality tracking.\n\u2022 Proficient in backend development (Python, Java, or Scala or Go) and familiar with RESTful API design.\n\u2022 Expertise in modern data stacks: Kafka, Spark, Airflow, Snowflake etc.\n\u2022 Experience with open-source data governance frameworks like Apache Atlas, Amundsen, or DataHub is a big plus.\n\u2022 Familiarity with cloud platforms (AWS, Azure, GCP) and their native data governance offerings.\n\u2022 Prior experience in building metadata management frameworks for scale.",
        "word_count": 104
      }
    ],
    "urls": [
      {
        "type": "website",
        "url": "https://innovaccer.com"
      },
      {
        "type": "linkedin",
        "url": "https://www.linkedin.com/company/innovaccer"
      },
      {
        "type": "other",
        "url": "https://www.glassdoor.com/Overview/Working-at-Innovaccer-EI_IE1234567.11,21.htm"
      },
      {
        "type": "youtube",
        "url": "https://www.youtube.com/c/Innovaccer"
      }
    ]
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "c8124d96-207f-470a-8c8d-b437d47ba9fc",
  "stage3_signals": {
    "alias_found": true,
    "alias_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 1.0,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "Data Governance Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 146,
        "score": 1.0,
        "slug": "data-governance-engineer",
        "total_count": null
      }
    ],
    "kra_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": [
          {
            "kra_text": "Maintains data catalog entries, column-level data lineage, and technical documentation to support data discoverability and governance across the organization.",
            "sentence": "Lead initiatives around metadata management, data lineage, and data cataloging.",
            "similarity": 0.6556
          },
          {
            "kra_text": "Designs dimensional models, star schemas, data vault structures, and curated data mart tables to support BI tools and self-service analytics consumption.",
            "sentence": "Architect, design, and build scalable data governance tools and frameworks.",
            "similarity": 0.6336
          },
          {
            "kra_text": "Maintains data catalog entries, column-level data lineage, and technical documentation to support data discoverability and governance across the organization.",
            "sentence": "You will play a key role in building data platform capabilities for data quality, metadata management, lineage tracking, and compliance across all data layers.",
            "similarity": 0.5975
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 0.6289,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "Cloud Architect",
        "kra_matches": [
          {
            "kra_text": "Conducts architecture reviews, approves technical design documents, and guides engineering teams through cloud migration and modernization projects.",
            "sentence": "Mentor and guide junior engineers and contribute to technical leadership across the organization.",
            "similarity": 0.5302
          },
          {
            "kra_text": "Conducts architecture reviews, approves technical design documents, and guides engineering teams through cloud migration and modernization projects.",
            "sentence": "Architect, design, and build scalable data governance tools and frameworks.",
            "similarity": 0.5075
          },
          {
            "kra_text": "Establishes cloud governance guardrails including budget alerts, resource quotas, policy-as-code enforcement, and compliance posture management.",
            "sentence": "Own the end-to-end lifecycle of governance tooling \u2013 from prototyping to production deployment.",
            "similarity": 0.4673
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 9,
        "score": 0.5017,
        "slug": "cloud-architect",
        "total_count": null
      },
      {
        "display_name": "Cyber Security Engineer",
        "kra_matches": [
          {
            "kra_text": "Defines secure engineering standards, secure coding guidelines, threat intelligence feeds, and compliance requirements for the organization.",
            "sentence": "Collaborate with cross-functional teams to ensure data compliance, security, and usability.",
            "similarity": 0.5244
          },
          {
            "kra_text": "Defines secure engineering standards, secure coding guidelines, threat intelligence feeds, and compliance requirements for the organization.",
            "sentence": "Define and evangelize standards and best practices across data engineering teams.",
            "similarity": 0.5217
          },
          {
            "kra_text": "Reviews and enforces access control policies, privilege escalation procedures, role-based access control, and identity governance workflows.",
            "sentence": "We are looking for a Staff Engineer - Data Governance to lead the design and development of a scalable, secure, and robust data governance platform.",
            "similarity": 0.4432
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 5,
        "score": 0.4964,
        "slug": "cybersecurity-engineer",
        "total_count": null
      },
      {
        "display_name": "MLOps Engineer",
        "kra_matches": [
          {
            "kra_text": "Manages the end-to-end ML model release lifecycle from training job completion through validation gates to production deployment approval.",
            "sentence": "Own the end-to-end lifecycle of governance tooling \u2013 from prototyping to production deployment.",
            "similarity": 0.5487
          },
          {
            "kra_text": "Validates model performance benchmarks, data schema contracts, and system integration health before signing off on production release readiness.",
            "sentence": "Collaborate with cross-functional teams to ensure data compliance, security, and usability.",
            "similarity": 0.4664
          },
          {
            "kra_text": "Maintains model versioning, experiment lineage, and artifact tracking using MLflow, DVC, or Weights \u0026 Biases for reproducibility and auditability.",
            "sentence": "Strong experience building distributed systems for metadata management, data lineage, and quality tracking.",
            "similarity": 0.4617
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 16,
        "score": 0.4923,
        "slug": "ml-ops-engineer",
        "total_count": null
      },
      {
        "display_name": "Flutter Developer",
        "kra_matches": [
          {
            "kra_text": "collaborate with design, product, and backend teams",
            "sentence": "Collaborate with cross-functional teams to ensure data compliance, security, and usability.",
            "similarity": 0.5856
          },
          {
            "kra_text": "collaborate with design, product, and backend teams",
            "sentence": "Own the end-to-end lifecycle of governance tooling \u2013 from prototyping to production deployment.",
            "similarity": 0.4323
          },
          {
            "kra_text": "collaborate with design, product, and backend teams",
            "sentence": "Architect, design, and build scalable data governance tools and frameworks.",
            "similarity": 0.4303
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 74,
        "score": 0.4827,
        "slug": "flutter-developer",
        "total_count": null
      }
    ],
    "skill_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": 6,
        "matched_skills": [
          "Apache Spark",
          "Java",
          "Kafka",
          "Python",
          "Scala",
          "Snowflake"
        ],
        "role_id": 2,
        "score": 0.6667,
        "slug": "data-engineer",
        "total_count": 9
      },
      {
        "display_name": "Backend Developer",
        "kra_matches": null,
        "matched_count": 5,
        "matched_skills": [
          "Go",
          "Java",
          "Kafka",
          "Python",
          "REST"
        ],
        "role_id": 1,
        "score": 0.5556,
        "slug": "backend-engineer",
        "total_count": 9
      },
      {
        "display_name": "Fullstack Developer",
        "kra_matches": null,
        "matched_count": 4,
        "matched_skills": [
          "Go",
          "Java",
          "Python",
          "REST"
        ],
        "role_id": 15,
        "score": 0.4444,
        "slug": "full-stack-engineer",
        "total_count": 9
      },
      {
        "display_name": "ML Engineer",
        "kra_matches": null,
        "matched_count": 4,
        "matched_skills": [
          "Airflow",
          "Go",
          "Python",
          "Scala"
        ],
        "role_id": 3,
        "score": 0.4444,
        "slug": "ml-engineer",
        "total_count": 9
      },
      {
        "display_name": "MLOps Engineer",
        "kra_matches": null,
        "matched_count": 4,
        "matched_skills": [
          "Airflow",
          "Go",
          "Python",
          "Scala"
        ],
        "role_id": 16,
        "score": 0.4444,
        "slug": "ml-ops-engineer",
        "total_count": 9
      }
    ]
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "DOMAIN",
    "chosen_role": {
      "display_name": "Data Governance Engineer",
      "kra_matches": null,
      "matched_count": null,
      "matched_skills": null,
      "role_id": 146,
      "score": 0.99,
      "slug": "data-governance-engineer",
      "total_count": null
    },
    "confidence": 0.99,
    "is_new_role": false,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "matched_dimensions": [
      "Data Governance Platform Engineering",
      "Metadata and Lineage Management",
      "Data Quality and Observability",
      "Compliance and Privacy Engineering",
      "Distributed Systems for Governance Tooling",
      "Backend API Development",
      "Cross-functional Technical Leadership"
    ],
    "matched_kras": [
      "lead the design and development of a scalable, secure, and robust data governance platform",
      "build data platform capabilities for data quality, metadata management, lineage tracking, and compliance",
      "architect, design, and build scalable data governance tools and frameworks",
      "collaborate with cross-functional teams to ensure data compliance, security, and usability",
      "lead initiatives around metadata management, data lineage, and data cataloging",
      "define and evangelize standards and best practices across data engineering teams",
      "own the end-to-end lifecycle of governance tooling",
      "drive innovation in privacy-by-design, regulatory compliance, and data observability solutions",
      "mentor and guide junior engineers",
      "contribute to technical leadership across the organization"
    ],
    "matched_skills": [
      "data governance",
      "metadata management",
      "lineage tracking",
      "data quality",
      "data cataloging",
      "data compliance",
      "data security",
      "data observability",
      "Python",
      "Java",
      "Scala",
      "Go",
      "RESTful API design",
      "Kafka",
      "Spark",
      "Airflow",
      "Snowflake",
      "Apache Atlas",
      "Amundsen",
      "DataHub"
    ],
    "new_role_display_name": null,
    "new_role_slug": null,
    "queued": false,
    "reasoning": "Domain=Data Engineering \u0026 Analytics; The JD is centered on building data governance tooling for metadata, lineage, cataloging, quality, compliance, and observability, which aligns most directly with Data Governance Engineer.",
    "sub_role": null
  },
  "stage5_updates": {
    "centroid_n_after": 7,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": {
      "best_kra_similarity": 0.0,
      "queue_id": 1198,
      "r_and_r_preview": "We are looking for a Staff Engineer - Data Governance to lead the design and development of a scalable, secure, and robust data governance platform. You will play a key role in building data platform ",
      "role_display_name": "Data Governance Engineer",
      "role_slug": "data-governance-engineer",
      "status": "pending"
    },
    "new_skills_attached": [
      {
        "is_primary": false,
        "queue_id": 16702,
        "role_display_name": "Data Governance Engineer",
        "role_slug": "data-governance-engineer",
        "skill_name": "Apache Atlas",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 16704,
        "role_display_name": "Data Governance Engineer",
        "role_slug": "data-governance-engineer",
        "skill_name": "Amundsen",
        "status": "pending"
      },
      {
        "is_primary": false,
        "queue_id": 16706,
        "role_display_name": "Data Governance Engineer",
        "role_slug": "data-governance-engineer",
        "skill_name": "DataHub",
        "status": "pending"
      }
    ],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}
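
Note on skill_match_roles scores: the score values reported above appear consistent with matched_count divided by total_count, rounded to four decimals (6/9 = 0.6667, 5/9 = 0.5556, 4/9 = 0.4444). The sketch below only illustrates that relationship. The function name skill_match_score is hypothetical, and the pipeline's actual scoring code is not shown in this output, so treat this as an assumption inferred from the numbers.

```python
def skill_match_score(matched_count: int, total_count: int) -> float:
    """Fraction of a role's skills found in the JD, rounded to 4 decimals.

    Assumption: this mirrors how the skill_match_roles scores above relate
    to their counts; it is not taken from the pipeline's source code.
    """
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    if not 0 <= matched_count <= total_count:
        raise ValueError("matched_count must be between 0 and total_count")
    return round(matched_count / total_count, 4)


# Counts copied from the skill_match_roles entries in the response above.
roles = {
    "data-engineer": (6, 9),
    "backend-engineer": (5, 9),
    "full-stack-engineer": (4, 9),
}
scores = {slug: skill_match_score(m, t) for slug, (m, t) in roles.items()}
```

Under this reading, the stage4_decision score of 0.99 for data-governance-engineer comes from a different path (its matched_count and total_count are null), so the ratio does not apply to it.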

API 2 — extract-details

{}

API 3 — final-role-output

{}

LLM Calls

Every model call made for this run, in pipeline order.