Role: Senior Data Engineer
Comp Range: 150-200K USD base + benefits, etc.
No 3rd party/C2C
Client is rewriting the risk model for the non-QM mortgage market, combining data analytics, proprietary risk intelligence, and Loan Defect Insurance to turn mortgage manufacturing risk into quantifiable, insurable outcomes for lenders, investors, and RMBS issuers. Client AI (SAI) is the analytical engine powering this platform, built on neural network triage, hazard-based pricing, and AI-driven defect detection.
The Senior Data Engineer owns the foundation that makes all of this possible: if the data layer is unreliable or unauditable, no model output is defensible.
In this role you will:
· Own SAI's data architecture: defining standards, approving design decisions, and providing technical direction to the engineering team
· Design, build, and maintain SAI's core data infrastructure: loan tape ingestion across all non-QM formats, warehouse tables, canonical identifiers, and gold datasets with full lineage tracking
· Build and maintain the model feature pipeline, engineering non-QM-specific inputs for SAI's triage, hazard pricing, and defect detection models
· Deliver and maintain pipelines for external macro datasets: housing price indices, employment data, market rent feeds, and other third-party enrichment sources
· Implement feature versioning and point-in-time correctness to keep datasets free from look-ahead bias
· Build and maintain warehouse architecture supporting multi-client data segregation and RBAC
· Monitor pipeline health, alerting on delays, schema drift, and quality issues before they reach models
The Ideal Candidate:
· Treats data reliability as a non-negotiable, not a nice-to-have
· Is comfortable building from scratch in a fast-moving, pre-production environment
· Thinks about downstream model consumers when making infrastructure decisions
Basic Qualifications:
· 5+ years of data engineering experience, including at least 2 years building production pipelines for ML or analytics platforms
· Strong SQL and Python with hands-on experience in Snowflake, BigQuery, or Redshift
· Proven experience building ELT/ETL pipelines for structured financial data at scale
· Experience with pipeline orchestration (Airflow, Prefect, or equivalent) and dbt
Preferred Qualifications:
· Mortgage loan tape formats, LOS data, or structured finance data pipelines
· Non-QM loan structures: DSCR, bank statement, asset depletion, and associated data formats
· External data feed integrations (CoreLogic, Black Knight, employment verification providers)
· Feature store design and point-in-time correct dataset construction
Tech: Python / SQL · dbt · Snowflake or BigQuery · Airflow or Prefect · AWS S3 or GCS · Great Expectations or equivalent · Git