← Back to history

Pipeline run

b90fffca-a264-4009-a738-d1f55f9cc3ad

Pipeline LLM cost (USD)
API 1: $0.0026 API 2: $0.0000 API 3: $0.0000 Total: $0.0026

Client output enrichment

v2 Skill cluster · Nature of work · AI index · Tech stack maturity · Evidence · KRA description
SPARSE JD role baseline loaded sources · ai_index: role_baseline · nature_of_work: jd · tech_stack_maturity: role_baseline
Nature of work · Data pipeline development
Build and maintain Databricks/Spark pipelines that ingest and process data, while working with analysts and data scientists to tune workflows and enforce data quality and engineering best practices.
"Develop and maintain Databricks pipelines for data ingestion and processing"
Tech stack maturity
Modern Cloud Native
Data engineers typically build cloud-based batch and streaming pipelines and warehouse models, but AI is usually incidental rather than central to the role.
AI index (0 = no AI use, 5 = totally AI-dependent · v2.1)
1.20 / 5
· Title match
Has AI skill
· AI skill (primary)
· AI skill (secondary)
· On AI team
· Builds AI products
vocab breakdown (legacy)
Assistants (×1):
Frameworks (×2):
Models / concepts (×3): Machine Learning
Evidence — skills matched in JD (2)
Databricks Apache Spark
Skill cluster (2 dimension groups, role-scoped)
ETL and ELT Tooling
Apache Spark
Cross-cutting / unaligned
Databricks
Show KRA description ↓
• Develop and maintain Databricks pipelines for data ingestion and processing • Collaborate with data scientists and analysts to optimize data workflows • Implement data engineering best practices and ensure data quality and integrity • Utilize Apache Spark for data processing and analysis tasks

Signals

Skill data-engineer
0.50
Alias data-engineer
1.00
KRA data-engineer
0.66

Post-classification

Centroidupdated · n=347
Alias collision log
New-role queue
New skills captured0
New KRA captured
Status: completed Created: 2026-05-27T15:47:05.916480Z Updated: 2026-06-12T15:58:51.268232Z API 3 duration: 6202 ms
Flow Current 3-step pipeline

1 POST /skills/extract-from-jd

2 POST /skills/extract-details

3 POST /skills/final-role-output

Role Chosen role & resolution

Data Engineer

CASE A

slug: data-engineer · id: 2 · source: db

Exact alias hit on data-engineer (1.0) — no other alias at this confidence; skill_top data-engineer 0.50 does not contradict

Resolution: in_db — role exists in library; skill↔dim and role↔dim links saved when applicable.

0
New skills
0
Skill↔dim saved
0
Role↔dim saved
0
Skipped

Job description

Skills:
Apache Spark, Data Engineering, Python, SQL, Machine Learning, Data Analysis, ETL,

Company Overview

CN Solutions partners with clients to provide top-notch solutions for their manpower needs. Specializing in Web Technologies, Databases, ERP, Data warehousing, and more. Offering staffing solutions, leadership hiring, RPO services, and more.

Job Overview

Senior Databricks role with 7 to 10 years of experience in Hyderabad. Full-Time employment with

Qualifications And Skills

• 7-10 years of experience in Data Engineering and Databricks
• Proficiency in Python, SQL, and data analysis tools
• Strong knowledge of Apache Spark and ETL processes
• Experience in Machine Learning and working with large datasets


Roles And Responsibilities

• Develop and maintain Databricks pipelines for data ingestion and processing
• Collaborate with data scientists and analysts to optimize data workflows
• Implement data engineering best practices and ensure data quality and integrity
• Utilize Apache Spark for data processing and analysis tasks

Skills from this JD

Each row merges API 1 extraction, API 2 library match / v3 orchestration (dimensions + locked dims), and API 3 persistence tags.

Databricks Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Databricks id=1202 · databricks

Aliases — catalog

  • Databricks (CANONICAL)

Context tags (catalog)

Apache Spark Databricks Runtime Delta Lake MLflow SQL Analytics Spark cloud integration collaborative workspace data engineering data lakes data pipelines data visualization job scheduling machine learning notebooks real-time analytics

Stored enrichment (catalog DB)

Category
Platform
Sub-category
Data Analytics Platform
Vendor
Databricks, Inc.
License
other_open
Year introduced
2013
Confidence
0.97
Version strategy
NOT_APPLICABLE

Maturity reasoning: Databricks appears frequently in data engineering and analytics job postings, especially alongside Spark, Delta Lake, and lakehouse stacks; strong vendor adoption and broad enterprise usage signal mainstream demand.

Skill profile (library / DB)

Skill nature
PLATFORM
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
9
Sub-category id
911
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • React Frontend Development Catalog dimension db id 96

    Library dimension (catalog)

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
React Frontend Development
d_init_01
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Apache Spark Primary Library skill API 3: existing canonical (in_db) Existing skill (matched library)
Canonical: Apache Spark id=1350 · apache-spark

Aliases — catalog

  • Apache Spark (CANONICAL)
  • apache spark 3 (VERSION)
  • spark (VERSION)
  • spark 3 (VERSION)
  • spark 3.x (VERSION)
  • spark3 (VERSION)

Context tags (catalog)

Apache Kafka Cluster Manager DAGScheduler Data Lake DataFrame ETL Hadoop MLlib Machine Learning PySpark RDD Scala Spark SQL Spark Streaming SparkSession

Stored enrichment (catalog DB)

Category
Framework
Sub-category
Distributed Data Processing Framework
Vendor
Apache Software Foundation
License
apache_2
Year introduced
2010
Confidence
0.94
Version strategy
SEPARATE_ENTITY
Version tag
3.x

Maturity reasoning: Apache Spark appears in many data engineering JDs and remains a standard for distributed ETL/ELT; its GitHub and vendor ecosystem activity stay strong, with Databricks and cloud platforms still promoting it.

Skill profile (library / DB)

Skill nature
FRAMEWORK
Volatility
STABLE
Typical lifespan
EVERGREEN
Category id
5
Sub-category id
1021
Extractable
True
Also category
False

Dimensions (API 2 worklist)

  • ETL and ELT Tooling Catalog dimension db id 24

    Library dimension (catalog)

    Roles linked in library: Data Engineer

API 3 link attempts (this skill)

Dimension Skill↔dim Role↔dim Outcome
ETL and ELT Tooling
etl-and-elt-tooling
Existing dimension (library) · Role↔dimension saved

All API 3 persistence rows

Same grid as the skill-extractor “Persistence items” table: one row per (skill × dimension) work item.

Skill Tag Dimension Skill↔dim Role↔dim Outcome Notes
Databricks in_db
React Frontend Development
d_init_01
Existing dimension (library) · Role↔dimension skipped (dimension not under chosen role)
Apache Spark in_db
ETL and ELT Tooling
etl-and-elt-tooling
Existing dimension (library) · Role↔dimension saved

Library artifacts (this run)

No artifact rows for this run.
nano JD Parser — gpt-4.1-nano click to toggle
RoleSenior Databricks
CompanyCN Solutions
Experience7-10 years of experience
DomainIT Services & Consulting
Location Hyderabad, India (onsite)
JD type pass
Show raw JSON
{
  "JD_type": "pass",
  "about_company": {
    "source_marker": {
      "first_5_words": "CN Solutions partners with clients",
      "last_5_words": "leadership hiring, RPO services, and more."
    },
    "text": "CN Solutions partners with clients to provide top-notch solutions for their manpower needs. Specializing in Web Technologies, Databases, ERP, Data warehousing, and more. Offering staffing solutions, leadership hiring, RPO services, and more.",
    "word_count": 42
  },
  "certifications": [],
  "company_name": "CN Solutions",
  "ctc": null,
  "domain": {
    "primary": {
      "aliases": [
        "ITES",
        "BPO"
      ],
      "domain": "IT Services \u0026 Consulting"
    },
    "secondary": null
  },
  "education": [],
  "experience": {
    "max": 10,
    "min": 7,
    "raw": "7-10 years of experience"
  },
  "job_locations": [
    {
      "aliases": [
        "Hyderabad, AP"
      ],
      "city": "Hyderabad",
      "country": "India",
      "state": null,
      "work_mode": "onsite"
    }
  ],
  "role": "Senior Databricks",
  "role_aliases": [
    "Databricks Engineer",
    "Data Engineer",
    "Senior Data Engineer"
  ],
  "role_archetype": "Data",
  "roles_and_responsibilities": [
    {
      "bullet_count": 4,
      "heading": "Roles And Responsibilities",
      "heading_was_present": true,
      "source_marker": {
        "first_5_words": "\u2022 Develop and maintain Databricks",
        "last_5_words": "for data processing and analysis tasks"
      },
      "text": "\u2022 Develop and maintain Databricks pipelines for data ingestion and processing\n\u2022 Collaborate with data scientists and analysts to optimize data workflows\n\u2022 Implement data engineering best practices and ensure data quality and integrity\n\u2022 Utilize Apache Spark for data processing and analysis tasks",
      "word_count": 42
    }
  ],
  "urls": []
}
API 1 — extract-from-jd click to toggle
{
  "final_skills": [
    {
      "is_primary": true,
      "skill_name": "Databricks"
    },
    {
      "is_primary": true,
      "skill_name": "Apache Spark"
    }
  ],
  "jd_role": {
    "display_name": "Senior Databricks",
    "rationale": null,
    "role_aliases": [
      "Databricks Engineer",
      "Data Engineer",
      "Senior Data Engineer"
    ],
    "role_archetype": "Data",
    "slug": ""
  },
  "nano_parsed": {
    "JD_type": "pass",
    "about_company": {
      "source_marker": {
        "first_5_words": "CN Solutions partners with clients",
        "last_5_words": "leadership hiring, RPO services, and more."
      },
      "text": "CN Solutions partners with clients to provide top-notch solutions for their manpower needs. Specializing in Web Technologies, Databases, ERP, Data warehousing, and more. Offering staffing solutions, leadership hiring, RPO services, and more.",
      "word_count": 42
    },
    "certifications": [],
    "company_name": "CN Solutions",
    "ctc": null,
    "domain": {
      "primary": {
        "aliases": [
          "ITES",
          "BPO"
        ],
        "domain": "IT Services \u0026 Consulting"
      },
      "secondary": null
    },
    "education": [],
    "experience": {
      "max": 10,
      "min": 7,
      "raw": "7-10 years of experience"
    },
    "job_locations": [
      {
        "aliases": [
          "Hyderabad, AP"
        ],
        "city": "Hyderabad",
        "country": "India",
        "state": null,
        "work_mode": "onsite"
      }
    ],
    "role": "Senior Databricks",
    "role_aliases": [
      "Databricks Engineer",
      "Data Engineer",
      "Senior Data Engineer"
    ],
    "role_archetype": "Data",
    "roles_and_responsibilities": [
      {
        "bullet_count": 4,
        "heading": "Roles And Responsibilities",
        "heading_was_present": true,
        "source_marker": {
          "first_5_words": "\u2022 Develop and maintain Databricks",
          "last_5_words": "for data processing and analysis tasks"
        },
        "text": "\u2022 Develop and maintain Databricks pipelines for data ingestion and processing\n\u2022 Collaborate with data scientists and analysts to optimize data workflows\n\u2022 Implement data engineering best practices and ensure data quality and integrity\n\u2022 Utilize Apache Spark for data processing and analysis tasks",
        "word_count": 42
      }
    ],
    "urls": []
  },
  "rejected": false,
  "rejection_reason": null,
  "run_id": "b90fffca-a264-4009-a738-d1f55f9cc3ad",
  "stage3_signals": {
    "alias_found": true,
    "alias_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 1.0,
        "slug": "data-engineer",
        "total_count": null
      }
    ],
    "kra_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": [
          {
            "kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
            "sentence": "Utilize Apache Spark for data processing and analysis tasks",
            "similarity": 0.673
          },
          {
            "kra_text": "Develops batch and real-time streaming data pipelines using Apache Spark, Apache Kafka, Apache Flink, or Airflow for data movement and processing at scale.",
            "sentence": "Develop and maintain Databricks pipelines for data ingestion and processing",
            "similarity": 0.6581
          },
          {
            "kra_text": "Works with data analysts, data scientists, and business stakeholders to define data models, ingestion schedules, and data delivery requirements.",
            "sentence": "Collaborate with data scientists and analysts to optimize data workflows",
            "similarity": 0.6448
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 2,
        "score": 0.6586,
        "slug": "data-engineer",
        "total_count": null
      },
      {
        "display_name": "ML Engineer",
        "kra_matches": [
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "Develop and maintain Databricks pipelines for data ingestion and processing",
            "similarity": 0.4983
          },
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "Implement data engineering best practices and ensure data quality and integrity",
            "similarity": 0.4685
          },
          {
            "kra_text": "Prepares, cleans, and transforms training datasets, manages feature stores, and builds feature engineering pipelines for model training.",
            "sentence": "Collaborate with data scientists and analysts to optimize data workflows",
            "similarity": 0.4487
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 3,
        "score": 0.4719,
        "slug": "ml-engineer",
        "total_count": null
      },
      {
        "display_name": "Svelte Frontend Developer",
        "kra_matches": [
          {
            "kra_text": "backend data integration",
            "sentence": "Implement data engineering best practices and ensure data quality and integrity",
            "similarity": 0.4955
          },
          {
            "kra_text": "backend data integration",
            "sentence": "Develop and maintain Databricks pipelines for data ingestion and processing",
            "similarity": 0.4824
          },
          {
            "kra_text": "backend data integration",
            "sentence": "Collaborate with data scientists and analysts to optimize data workflows",
            "similarity": 0.4256
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 92,
        "score": 0.4678,
        "slug": "svelte-frontend-developer",
        "total_count": null
      },
      {
        "display_name": "MLOps Engineer",
        "kra_matches": [
          {
            "kra_text": "Validates model performance benchmarks, data schema contracts, and system integration health before signing off on production release readiness.",
            "sentence": "Implement data engineering best practices and ensure data quality and integrity",
            "similarity": 0.4984
          },
          {
            "kra_text": "Automates ML platform operations including scheduled retraining triggers, pipeline orchestration, evaluation workflows, and alerting configuration.",
            "sentence": "Develop and maintain Databricks pipelines for data ingestion and processing",
            "similarity": 0.4411
          },
          {
            "kra_text": "Coordinates model promotion workflows across development, staging, and production environments including integration testing and data contract validation.",
            "sentence": "Collaborate with data scientists and analysts to optimize data workflows",
            "similarity": 0.4344
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 16,
        "score": 0.4579,
        "slug": "ml-ops-engineer",
        "total_count": null
      },
      {
        "display_name": "DevOps Engineer",
        "kra_matches": [
          {
            "kra_text": "Builds and maintains CI/CD pipelines using Jenkins, GitHub Actions, GitLab CI, or CircleCI to automate build, test, security scanning, and deployment workflows.",
            "sentence": "Develop and maintain Databricks pipelines for data ingestion and processing",
            "similarity": 0.4589
          },
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Collaborate with data scientists and analysts to optimize data workflows",
            "similarity": 0.4518
          },
          {
            "kra_text": "Collaborates with development teams to improve build processes, reduce deployment friction, containerize applications, and adopt DevOps best practices.",
            "sentence": "Implement data engineering best practices and ensure data quality and integrity",
            "similarity": 0.3892
          }
        ],
        "matched_count": null,
        "matched_skills": null,
        "role_id": 10,
        "score": 0.4333,
        "slug": "devops-engineer",
        "total_count": null
      }
    ],
    "skill_match_roles": [
      {
        "display_name": "Data Engineer",
        "kra_matches": null,
        "matched_count": 1,
        "matched_skills": [
          "Apache Spark"
        ],
        "role_id": 2,
        "score": 0.5,
        "slug": "data-engineer",
        "total_count": 2
      }
    ]
  },
  "stage4_decision": {
    "alias_collision_detected": false,
    "case": "A",
    "chosen_role": {
      "display_name": "Data Engineer",
      "kra_matches": null,
      "matched_count": null,
      "matched_skills": null,
      "role_id": 2,
      "score": 1.0,
      "slug": "data-engineer",
      "total_count": null
    },
    "confidence": 1.0,
    "is_new_role": false,
    "llm2_fired": false,
    "llm2_reasoning": null,
    "matched_dimensions": [],
    "matched_kras": [],
    "matched_skills": [],
    "new_role_display_name": null,
    "new_role_slug": null,
    "queued": false,
    "reasoning": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.50 does not contradict",
    "sub_role": null
  },
  "stage5_updates": {
    "centroid_n_after": 347,
    "centroid_updated": true,
    "collision_log_id": null,
    "new_kra_attached": null,
    "new_skills_attached": [],
    "queue_entry_id": null,
    "v3_pipeline_triggered": false,
    "v3_role_slug": null,
    "v3_run_id": null
  }
}
API 2 — extract-details
{
  "alias_matches": [
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 1838,
      "existing_alias_text": "Databricks",
      "input_term": "Databricks",
      "matched_canonical": {
        "category_id": 9,
        "display_name": "Databricks",
        "id": 1202,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "databricks",
        "sub_category_id": 911,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    },
    {
      "alias_persist_skipped_reason": "alias_text already exists for this canonical skill",
      "alias_persisted": false,
      "existing_alias_id": 2004,
      "existing_alias_text": "Apache Spark",
      "input_term": "Apache Spark",
      "matched_canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "matched_via": "alias"
    }
  ],
  "candidate_roles": [
    {
      "display_name": "Data Engineer",
      "id": 2,
      "rationale": null,
      "role_archetype": null,
      "slug": "data-engineer",
      "source": "db"
    }
  ],
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.50 does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "dimensions": [
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "React Frontend Development",
        "id": 96,
        "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
        "slug": "d_init_01",
        "source": "db"
      },
      "input_skill": "Databricks",
      "llm_role": null,
      "roles_from_db": []
    },
    {
      "dimension": {
        "difficulty_hint": "well_known",
        "display_name": "ETL and ELT Tooling",
        "id": 24,
        "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
        "slug": "etl-and-elt-tooling",
        "source": "db"
      },
      "input_skill": "Apache Spark",
      "llm_role": null,
      "roles_from_db": [
        {
          "display_name": "Data Engineer",
          "id": 2,
          "rationale": null,
          "role_archetype": null,
          "slug": "data-engineer",
          "source": "db"
        }
      ]
    }
  ],
  "input_final_skills": [
    "Databricks",
    "Apache Spark"
  ],
  "input_llm_skills": [
    "Databricks",
    "Apache Spark"
  ],
  "new_aliases_persisted": 0,
  "run_id": "b90fffca-a264-4009-a738-d1f55f9cc3ad",
  "skills_detail": [
    {
      "aliases_in_db": [
        {
          "alias_text": "Databricks",
          "alias_type": "CANONICAL",
          "id": 1838,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 9,
        "display_name": "Databricks",
        "id": 1202,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "PLATFORM",
        "slug": "databricks",
        "sub_category_id": 911,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "React Frontend Development",
            "id": 96,
            "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
            "slug": "d_init_01",
            "source": "db"
          },
          "input_skill": "Databricks",
          "llm_role": null,
          "roles_from_db": []
        }
      ],
      "input_skill": "Databricks",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    },
    {
      "aliases_in_db": [
        {
          "alias_text": "Apache Spark",
          "alias_type": "CANONICAL",
          "id": 2004,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "apache spark 3",
          "alias_type": "VERSION",
          "id": 2006,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark",
          "alias_type": "VERSION",
          "id": 2510,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3",
          "alias_type": "VERSION",
          "id": 2007,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark 3.x",
          "alias_type": "VERSION",
          "id": 2009,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        },
        {
          "alias_text": "spark3",
          "alias_type": "VERSION",
          "id": 2008,
          "is_primary": false,
          "match_strategy": "CASE_INSENSITIVE"
        }
      ],
      "canonical": {
        "category_id": 5,
        "display_name": "Apache Spark",
        "id": 1350,
        "is_also_category": false,
        "is_extractable": true,
        "skill_nature": "FRAMEWORK",
        "slug": "apache-spark",
        "sub_category_id": 1021,
        "typical_lifespan": "EVERGREEN",
        "volatility": "STABLE"
      },
      "dimensions": [
        {
          "dimension": {
            "difficulty_hint": "well_known",
            "display_name": "ETL and ELT Tooling",
            "id": 24,
            "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
            "slug": "etl-and-elt-tooling",
            "source": "db"
          },
          "input_skill": "Apache Spark",
          "llm_role": null,
          "roles_from_db": [
            {
              "display_name": "Data Engineer",
              "id": 2,
              "rationale": null,
              "role_archetype": null,
              "slug": "data-engineer",
              "source": "db"
            }
          ]
        }
      ],
      "input_skill": "Apache Spark",
      "matched_via": "alias",
      "new_alias_persisted": false,
      "new_alias_text": null,
      "new_skill_meta": null,
      "source_tag": "db",
      "was_in_llm_skills": true
    }
  ],
  "unmatched_skills": []
}
API 3 — final-role-output
{
  "chosen_role": {
    "display_name": "Data Engineer",
    "id": 2,
    "rationale": "Exact alias hit on data-engineer (1.0) \u2014 no other alias at this confidence; skill_top data-engineer 0.50 does not contradict",
    "role_archetype": null,
    "slug": "data-engineer",
    "source": "db"
  },
  "chosen_role_resolution": "in_db",
  "final_input_skills": [
    {
      "skill": "Databricks",
      "tag": "in_db"
    },
    {
      "skill": "Apache Spark",
      "tag": "in_db"
    }
  ],
  "llm_cost_api1_usd": null,
  "llm_cost_api2_usd": null,
  "llm_cost_api3_usd": null,
  "llm_cost_total_usd": null,
  "persistence": {
    "items": [
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "React Frontend Development",
          "id": 96,
          "rationale": "Building interactive web user interfaces with React.js, including component composition, state management, hooks, and rendering patterns. React.js belongs here because it is a core library for client-side UI development in modern web applications.",
          "slug": "d_init_01",
          "source": "db"
        },
        "dimension_id": 96,
        "input_skill": "Databricks",
        "llm_role": null,
        "matched_chosen_role": false,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension skipped (dimension not under chosen role)",
        "role_dimension_saved": false,
        "roles_from_db": [],
        "skill_dimension_saved": true,
        "skill_id": 1202,
        "skill_tag": "in_db",
        "skipped_reason": null
      },
      {
        "chosen_role_id": 2,
        "dimension": {
          "difficulty_hint": "well_known",
          "display_name": "ETL and ELT Tooling",
          "id": 24,
          "rationale": "Packaged tools for extracting, loading, and transforming data across systems. This dimension covers connector-based ingestion, transformation frameworks, and managed integration products.",
          "slug": "etl-and-elt-tooling",
          "source": "db"
        },
        "dimension_id": 24,
        "input_skill": "Apache Spark",
        "llm_role": null,
        "matched_chosen_role": true,
        "outcome_line": "Existing dimension (library) \u00b7 Role\u2194dimension saved",
        "role_dimension_saved": true,
        "roles_from_db": [
          {
            "display_name": "Data Engineer",
            "id": 2,
            "rationale": null,
            "role_archetype": null,
            "slug": "data-engineer",
            "source": "db"
          }
        ],
        "skill_dimension_saved": true,
        "skill_id": 1350,
        "skill_tag": "in_db",
        "skipped_reason": null
      }
    ],
    "new_skills_created": 0,
    "role_dimension_saved": 0,
    "skill_dimension_saved": 0,
    "skipped": 0
  },
  "planner_output": null,
  "run_id": "b90fffca-a264-4009-a738-d1f55f9cc3ad"
}

LLM Calls

Every model call made for this run, in pipeline order. Click a card to see the model's response.

Loading…