Data Engineer - Commodities

Millennium Management LLC

Old Greenwich, CT

JOB DETAILS
SALARY
$175,000–$250,000 Per Year
SKILLS
Analysis Skills, Artificial Intelligence (AI), Best Practices, Cargo/Freight, Communication Skills, Continuous Deployment/Delivery, Continuous Integration, Data Analysis, Data Collection, Data Management, Data Modeling, Data Quality, Data Sets, Data Warehousing, Database Extract Transform and Load (ETL), Detail Oriented, Documentation, Financial Services, Git, GitHub, Investment Management, Market Research, Metadata, Pytest, Python Programming/Scripting Language, Quantitative Research, Reconciliation, SQL (Structured Query Language), Sales Pipeline, Scalable System Development, Snowflake Schema, Source Code/Configuration Management (SCM), Structured Data, Technical Writing, Test Automation, Time Management, Training Data Sets, Use Cases
LOCATION
Old Greenwich, CT
POSTED
5 days ago

The Commodities Technology team builds and operates the data platform that aggregates and curates critical commodities data, including weather, supply/demand, storage, transportation and other fundamental and alternative datasets. This curated "content layer" is central to how our Portfolio Managers and researchers understand markets and construct trades.

We are seeking a Commodities Content Engineer who will focus on building robust ETL workflows and data models on top of our commodities data platform.

In this role, you will use Python and SQL to design, implement and maintain pipelines that ingest, clean, transform, and catalog commodities datasets. You will work closely with quantitative researchers, data analysts, and the broader Commodities Technology team to translate domain requirements into well‑structured, reliable data assets that can be easily discovered and reused across strategies.

This is a hands‑on engineering role with significant exposure to commodities data and the opportunity to shape how that data is represented and consumed across the firm.

Key Responsibilities:

  • Design and implement end‑to‑end ETL workflows in Python and SQL to ingest and transform commodities data from multiple vendors and internal sources.
  • Build and maintain standardized data models, schemas, and metadata that make commodities datasets easy to understand and discover within the platform.
  • Use Airflow (or similar tools) to schedule, monitor, and manage data pipelines, ensuring reliability and timely delivery.
  • Implement robust validation, reconciliation, and anomaly‑detection checks to ensure data completeness, correctness, and consistency.
  • Leverage AI to automate schema inference across structured and semi-structured data sources, manage schema drift, and accelerate development of scalable ingestion pipelines.
  • Apply AI-driven data quality, observability, and documentation capabilities to detect anomalies, monitor data health, and generate clear lineage and technical documentation across complex data workflows.
  • Leverage Git, GitHub Actions, and automated testing (PyTest) to maintain high‑quality code and repeatable deployments.
  • Partner with commodities PMs, researchers, and data strategists to understand use cases and continuously refine datasets, definitions, and documentation.

Required Qualifications:

  • 4 years of experience in data engineering, analytics engineering, or similar roles focused on building and maintaining ETL pipelines.
  • Strong skills in Python and SQL, with experience working with large datasets and complex transformations.
  • Hands‑on experience with Airflow or other workflow schedulers.
  • Familiarity with version control (Git), CI/CD pipelines (GitHub Actions or equivalent), and test automation (e.g., PyTest).
  • Strong attention to detail, data quality, and documentation; ability to reason about edge cases and data integrity.
  • Ability to work independently, communicate clearly with both technical and non‑technical stakeholders, and manage work across multiple concurrent initiatives.

Preferred Qualifications:

  • Knowledge of commodities markets and commodities data (e.g., weather, supply/demand, storage, freight, flows).
  • Experience with data warehousing technologies (e.g., Snowflake, columnar storage formats, or analytic databases).
  • Prior experience in a financial services, trading, or research-driven environment.
  • Exposure to data catalog / data governance tools and best practices.

The estimated base salary range for this position is $175,000 to $250,000, which is specific to New York and may change in the future. Millennium offers a total compensation package that includes a base salary, a discretionary performance bonus, and a comprehensive benefits package. When finalizing an offer, we consider an individual's experience level and the qualifications they bring to the role in order to formulate a competitive total compensation package.
