Sr. Machine Learning Ops Engineer (Director)
Los Angeles, CA
Technology and Operations - Information Technology / Full Time / On-site
ABOUT CIM GROUP
CIM is a community-focused real estate and infrastructure owner, operator, lender, and developer. Our team of experts works together to identify and create value in real assets, benefiting the communities in which we invest. Back in 1994, our three founders focused on projects in Southern California neighborhoods. Today, we are a diverse team of 900+ employees with projects across the Americas. Our projects have delivered jobs; created comfortable places to live, work, and relax; and provided necessary and sustainable infrastructure. Our focus on enhancing communities is unwavering, and we strive to make an even greater impact in the years to come. Join us and make an impact today!
POSITION PURPOSE
The Senior ML Ops Engineer leads the design and maintenance of scalable, secure infrastructure for ML model deployment, lifecycle management, and Generative AI enablement. This role is responsible for building and operating the firm's ML Ops platform on Databricks, with a strategic focus on productionizing GenAI/LLM solutions, including Retrieval-Augmented Generation (RAG) systems and vector database implementations.
RESPONSIBILITIES
ML Model Deployment & Platform Management
Lead the design, implementation, and ongoing maintenance of scalable ML infrastructure on Databricks, including MLflow for experiment tracking, model registry, and model serving endpoints.
Oversee the development of the ML Ops platform and automated pipelines for deploying, monitoring, and maintaining models within production environments.
Implement robust solutions for model versioning, systematic retraining, and comprehensive artifact management using Databricks Unity Catalog for ML governance.
Design and manage Databricks Feature Store for consistent feature engineering across training and inference pipelines.
Generative AI & LLM Operations
Architect and implement Retrieval-Augmented Generation (RAG) systems for document Q&A, enabling business teams to query fund documents, investor letters, and market research.
Design, deploy, and manage vector database solutions (Databricks Vector Search, Pinecone, or similar) for semantic search and retrieval across enterprise documents.
Lead LLM fine-tuning and customization initiatives, training models like Claude or open-source alternatives with CIM proprietary data while ensuring data privacy and compliance.
Develop and optimize document processing pipelines including PDF parsing, chunking strategies, and embedding generation for RAG applications.
Implement prompt engineering best practices and LLM evaluation frameworks to ensure output quality, relevance, and factual accuracy.
Automation & CI/CD Pipelines
Design and implement extensive automation across the ML workflow, covering model training, testing, validation, and deployment using Databricks Workflows and Asset Bundles.
Set up robust CI/CD pipelines for both traditional ML models and GenAI applications, leveraging GitHub Actions, Azure DevOps, or similar tools.
Automate complex data and model workflows utilizing orchestration tools such as Airflow, Prefect, or Databricks Workflows.
Monitoring, Performance & Reliability
Implement comprehensive monitoring and alerting systems for real-time tracking of model performance, data quality, and GenAI output quality.
Utilize specialized tools (Evidently AI, WhyLabs, Prometheus/Grafana) to proactively detect model drift, data quality anomalies, and RAG retrieval degradation.
Develop evaluation frameworks for GenAI applications including relevance scoring, faithfulness metrics, and human feedback loops.
Troubleshoot issues within production environments, including debugging model deployment failures, RAG retrieval issues, and LLM response quality problems.
Data & Feature Engineering Support
Build and maintain sophisticated feature stores on Databricks, ensuring precise alignment between training and inference data pipelines.
Collaborate with data engineers and information architects to build robust ETL pipelines that feed into the Databricks Lakehouse.
Design embedding pipelines and vector index management strategies for RAG applications, including incremental updates and versioning.
Security, Compliance & Trustworthy AI
Integrate robust security measures directly into ML Ops and GenAI pipelines, including access controls via Unity Catalog and data encryption.
Implement Trustworthy AI guardrails addressing bias detection, explainability, prompt injection prevention, and responsible AI practices.
Ensure GenAI applications handling sensitive fund and investor data comply with regulatory requirements and internal policies.
Collaborate with Legal and Compliance to establish AI governance policies and audit trails for model decisions.
Collaboration & Business Partnership
Collaborate closely with data scientists, platform engineers, information architects, and DevOps teams to ensure seamless ML/AI integration.
Partner with business teams (Fund Accounting, FP&A, Investor Relations, Sales, Investments) to identify high-value AI use cases and translate business needs into technical solutions.
Communicate complex AI concepts in business terms, managing expectations and demonstrating ROI of ML/GenAI initiatives.
Provide technical mentorship to team members, including refactoring data scientist code for production readiness.
EDUCATION/EXPERIENCE REQUIREMENTS
GenAI/LLM Technical Requirements
Databricks Platform Requirements
PREFERRED
ABOUT YOU
The ideal candidate demonstrates proven experience with model pipeline and registry tools, including the ability to detect and proactively prevent model drift, automate comprehensive model monitoring, and consistently ensure model accuracy. Experience with RAG systems, vector databases, and LLM deployment is essential for this role.
KEY COMPETENCIES
WHAT CIM OFFERS
At CIM, we believe our success stems from our collective efforts, and we are committed to providing well-rounded support and resources for our employees. In addition to a competitive compensation plan, CIM offers a comprehensive benefits program for employees to thrive both inside and outside of work. Eligible employees can enjoy a wide range of benefits, including:
ACTUAL BASE SALARY RANGE
The anticipated base salary range for the position in Los Angeles, CA is $175,000 - $225,000.
CIM Group is a premier full-service urban real estate and infrastructure fund manager with approximately $20.5 billion of assets under management. Since its founding in 1994, CIM has been a process- and research-driven investor that mitigates risk through the fundamental analysis of the long-term drivers in communities. CIM is a relative value investor that systematically targets investments that are priced below their long-term intrinsic value. Over time, CIM has delivered a strong risk-adjusted track record of returns by relying on its vertically integrated team, investment discipline, and sourcing capabilities.