Senior Data Engineer

December 20, 2024

Job Description

CodaMetrix is revolutionizing Revenue Cycle Management with its AI-powered autonomous coding solution, a multi-specialty AI platform that translates clinical information into accurate sets of medical codes. CodaMetrix's autonomous coding drives efficiency under fee-for-service and value-based care models and supports improved patient care. We are passionate about getting physicians and healthcare providers away from the keyboard and back to clinical care.
Overview

The Senior Data Engineer is a member of the Data & Analytics team, reporting to the VP, Data & Analytics. The Data & Analytics team is responsible for designing the data strategy and maintaining robust data architectures on the Databricks platform that can efficiently handle large-scale, real-time data processing. The team also manages data pipelines to ensure seamless data flow from various source systems. Its goal is to integrate data into a unified Data Lake, enabling more insightful decision-making and analysis. This integration will enhance our data management capabilities, optimize our data pipelines, improve data quality, and boost operational efficiency.

The Data Engineer is responsible for the analytics data ecosystem, creating and maintaining performant data pipelines and repositories, providing the infrastructure to discover and consume data while continually evolving our data storage and analytic capabilities.

The Data Engineer supports our analytics and customer onboarding teams, data scientists, and software engineers on data initiatives and will ensure optimal data access across the organization. As a team member, they will populate and maintain our data and data pipeline architecture, as well as optimize data flow and collection for cross-functional teams. They are an experienced data pipeline author and data wrangler who enjoys optimizing and evolving data systems, and who brings a customer-centric approach to the various teams who provide and consume data.

They are self-directed, comfortable supporting the data engineering and analytics needs of multiple stakeholders and systems, and relentless about data security. The right candidate will be excited by the prospect of optimizing our company's data platform architecture to support deep-dive analytics that power our next generation of AI-driven products and solutions.

Responsibilities

  • Create, maintain, populate and optimize the CodaMetrix data platform and analytics architecture
  • Design, build, and maintain robust, scalable infrastructure for optimal extraction, transformation, and loading of data from a wide variety of sources such as AWS RDS (PostgreSQL), Salesforce, etc.
  • Assemble large, complex data sets that meet functional and non-functional business requirements
  • Identify, design, and implement internal process improvements such as automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Collaborate with the Analytics, Machine Learning, and Product teams to address data-related technical issues and support their data infrastructure needs to enhance data availability and usability
  • Optimize existing data systems and pipelines for performance and scalability
  • Provide accurate and relevant data for standard and ad hoc requests, for incorporation into new and existing product dashboards and reports, with a proactive understanding of what needs to be communicated and when
  • Leverage advanced technical skills to support development of the CodaMetrix data lake, data warehouse, and business intelligence solutions
  • Foster a culture of continuous improvement and learning within the team
  • Evaluate and recommend new data technologies and methodologies to enhance data capabilities
  • Create and maintain comprehensive documentation of data processes, systems, and architectures
  • Provide regular updates to management on data engineering initiatives and project statuses
  • Participate in and lead portions of team Agile ceremonies, including stand-ups, stakeholder demos, and reviews of other engineers' designs and code
  • Provide technical consulting to users of the various data warehouse and dashboarding tools, and advise users on optimizations, conflicts, and appropriate and inappropriate data usage

Requirements

  • Required
    • Bachelor's or Master's degree in Computer Science, Data Science, Information Technology, or a related field
    • 5+ years of experience with Big Data processing technologies such as Apache Spark, Kafka, and Cassandra
    • 5+ years of experience with AWS Cloud infrastructure and database services such as RDS (PostgreSQL), Aurora, or Redshift
    • 5+ years of programming experience ingesting, processing, and reading large volumes of data using PySpark or Scala
    • 5+ years of experience writing SQL and performance tuning
    • Advanced understanding of various structured data in a healthcare setting and the ability to organize it for visualization and consumption
    • Experience building data lakes and data warehouses with both structured and unstructured datasets
    • Experience implementing data pipelines and workflows with ETL processes
    • Experience with the Databricks platform using Unity Catalog and DLT pipelines
    • Advanced knowledge of data modeling, data architecture, and data integration
    • Ability to manage multiple tasks or projects simultaneously
    • Proven analytical and problem-solving skills
    • Effective verbal and written communication skills with both management and peers
  • Preferred
    • Experience with common BI tools; Tableau is a huge plus
    • Knowledge of HIPAA compliance requirements, as well as other security/compliance practices such as PII handling and SOC 2, is a big plus
    • Experience with Streaming workloads and integrating Spark with Apache Kafka
    • Experience consuming or authoring REST and/or SOAP web service APIs
    • Understanding of Infrastructure as Code (IaC) and experience with common tools to implement it