Senior Data Engineer

Ellisor Group

San Francisco, CA

JOB DETAILS

Skills: Analysis Skills, Artificial Intelligence (AI), Construction, Customer/Client Research, Data Analysis, Data Formats, Data Management, Data Sets, Database Extract Transform and Load (ETL), Editing, Finance, Insurance, Loan Structuring, Loans, Manufacturing, Mortgage, Mortgage Lending, Mortgage Risk, Neural Networks, Pricing, Production Systems, Python Programming/Scripting Language, Risk, Risk Modeling, SQL (Structured Query Language), Sales Pipeline, Standards Development, Structured Data, Warehousing

Location: San Francisco, CA

Posted: 2 days ago

Role: Senior Data Engineer

Comp Range: 150-200K USD base + benefits

No 3rd party/C2C

Client is rewriting the risk model for the non-QM mortgage market, combining data analytics, proprietary risk intelligence, and Loan Defect Insurance to turn mortgage manufacturing risk into quantifiable, insurable outcomes for lenders, investors, and RMBS issuers. Client AI (SAI) is the analytical engine powering this platform, built on neural network triage, hazard-based pricing, and AI-driven defect detection.

The Senior Data Engineer owns the foundation that makes all of this possible: if the data layer is unreliable or unauditable, no model output is defensible.

In this role you will:

· Own SAI's data architecture — defining standards, approving design decisions, and providing technical direction to the engineering team.

· Design, build, and maintain SAI's core data infrastructure: loan tape ingestion across all non-QM formats, warehouse tables, canonical identifiers, and gold datasets with full lineage tracking

· Build and maintain the model feature pipeline, engineering non-QM-specific inputs for SAI's triage, hazard pricing, and defect detection models

· Deliver and maintain pipelines for external macro datasets: housing price indices, employment data, market rent feeds, and other third-party enrichment sources

· Implement feature versioning and point-in-time correctness to keep datasets free from look-ahead bias

· Build and maintain warehouse architecture supporting multi-client data segregation and RBAC

· Monitor pipeline health, alerting on delays, schema drift, and quality issues before they reach models

The Ideal Candidate:

· Treats data reliability as a non-negotiable, not a nice-to-have

· Comfortable building from scratch in a fast-moving, pre-production environment

· Thinks about downstream model consumers when making infrastructure decisions

Basic Qualifications:

· 5+ years of data engineering with at least 2 years building production pipelines for ML or analytics platforms

· Strong SQL and Python with hands-on experience in Snowflake, BigQuery, or Redshift

· Proven experience building ELT/ETL pipelines for structured financial data at scale

· Pipeline orchestration (Airflow, Prefect, or equivalent) and dbt

Preferred Qualifications:

· Mortgage loan tape formats, LOS data, or structured finance data pipelines

· Non-QM loan structures: DSCR, bank statement, asset depletion, and associated data formats

· External data feed integrations (CoreLogic, Black Knight, employment verification providers)

· Feature store design and point-in-time correct dataset construction

Tech: Python / SQL · dbt · Snowflake or BigQuery · Airflow or Prefect · AWS S3 or GCS · Great Expectations or equivalent · Git
