Cloud Site Reliability Engineer (SRE) - Data Management & Analytics Platform

Bloomberg

Princeton, NJ

JOB DETAILS
SALARY
$160,000–$240,000 Per Year
SKILLS
AWS Lambda, Amazon Simple Storage Service (S3), Amazon Web Services (AWS), Automation, Best Practices, Budgeting, Cloud Architecture, Cloud Computing, Computer Science, Computer Security, Continuous Deployment/Delivery, Continuous Improvement, Continuous Integration, Cost Control, Customer Relations, Data Analysis, Data Management, Data Processing, DevOps, Distributed Computing, Diversity, Docker, Electronic Medical Records, Genetics, High Availability, Incident Management, Incident Response, Large-Scale Systems, Leadership, Mathematics, Metrics, Network Routing, Performance Tuning/Optimization, Process Improvement, Production Support, Production Systems, Python Programming/Scripting Language, Reliability Engineering, Reporting Dashboards, Root Cause Analysis, Security Monitoring, Software Engineering, System Operations, Systems Administration/Management, Systems Reliability, Team Player
LOCATION
Princeton, NJ
POSTED
2 days ago


Location
Princeton

Business Area
Engineering and CTO

Ref #
10050564


Description & Requirements


At Bloomberg, data is at the heart of everything we do. As part of the Data Management and Analytics Platform (DMAP) SRE team, you will play a critical role in driving analytics throughout the organization to improve our products, better engage with our customers, create greater efficiencies, and unlock new business opportunities through data-driven insights.


Our team is responsible for capturing and processing the who, what, when, where, and why of how clients use Bloomberg products, how our systems perform, and how employees interact with customers. We ingest and prepare massive volumes of data to power reporting, dashboards, self-service tools, and advanced analytics used across the company.


We are looking for a Cloud Site Reliability Engineer (SRE) who is passionate about building and operating highly reliable, scalable data platforms in the cloud. In this role, you will focus on ensuring the availability, performance, and scalability of critical data pipelines and analytics infrastructure. You will work at the intersection of software engineering and infrastructure, applying automation, observability, and reliability best practices to support large-scale distributed systems.


You’ll Be Trusted To


+ Design, build, and operate highly available, scalable, and resilient cloud infrastructure supporting large-scale data ingestion and analytics platforms

+ Define, implement, and monitor SLIs/SLOs for data systems and services; drive reliability improvements using error budgets and operational metrics

+ Improve observability across data pipelines and platforms through logging, metrics, tracing, and alerting

+ Automate infrastructure provisioning and system management using Infrastructure as Code (IaC)

+ Lead incident response efforts, perform root cause analysis (RCA), and implement post-incident improvements

+ Optimize performance, reliability, and cost efficiency of cloud-based data systems

+ Ensure data platform reliability, including batch and streaming pipelines, storage systems, and reporting infrastructure

+ Partner with data engineers, software engineers, and stakeholders to improve system reliability and operational maturity

+ Strengthen platform security through proactive monitoring, vulnerability management, and cloud security best practices

+ Continuously improve CI/CD pipelines and deployment processes for data infrastructure


You’ll Need To Have


+ 5+ years of experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles

+ Strong proficiency in at least one programming or scripting language (Python and/or Go)

+ Experience supporting production systems with a focus on reliability, scalability, and observability

+ Hands-on experience operating or designing highly available distributed systems

+ A Bachelor’s degree in Computer Science, Engineering, Mathematics, or a related field, or equivalent professional experience


We’d Love To See


+ Experience supporting large-scale data platforms, data pipelines, or analytics infrastructure

+ Strong experience operating production systems in AWS at scale

+ Experience defining and managing SLIs, SLOs, and error budgets

+ Strong background in monitoring and observability tools (e.g., Prometheus, Grafana, CloudWatch, Datadog)

+ Experience leading incident management and conducting postmortems

+ Hands-on experience with Infrastructure as Code (Terraform or CloudFormation)

+ Experience building and maintaining CI/CD pipelines

+ Strong understanding of distributed systems and cloud architecture

+ Experience with containerized workloads (Docker, Kubernetes)

+ Knowledge of AWS services related to data platforms (e.g., S3, EMR, Lambda, Kinesis, Glue, Redshift)

+ Knowledge of the Databricks or Snowflake platforms

+ Experience with cloud networking concepts (VPCs, routing, security groups)

+ Experience optimizing cloud costs in large-scale environments

+ AWS certification (Associate level or above)

+ A security-first mindset and familiarity with compliance and data governance best practices

+ Experience using operational metrics and data to drive continuous improvement


Our most successful engineers are collaborative, data-driven, and take strong ownership of production systems end-to-end, ensuring the reliability of the data platforms that power Bloomberg’s analytics and insights.


Salary Range = 160,000 - 240,000 USD Annual + Benefits + Bonus


The referenced salary range is based on the Company's good faith belief at the time of posting. Actual compensation may vary based on factors such as geographic location, work experience, market conditions, education/training and skill level.


We offer one of the most comprehensive and generous benefits plans available, along with a range of total rewards that may include merit increases, incentive compensation (exempt roles only), paid holidays, paid time off, medical, dental, vision, short- and long-term disability benefits, 401(k) + match, life insurance, and various wellness programs, among others. The Company does not provide benefits directly to contingent workers/contractors and interns.


Discover what makes Bloomberg unique: get an inside look at our culture, values, and the people behind our success.

Bloomberg is an equal opportunity employer and we value diversity at our company. We do not discriminate on the basis of age, ancestry, color, gender identity or expression, genetic predisposition or carrier status, marital status, national or ethnic origin, race, religion or belief, sex, sexual orientation, sexual and other reproductive health decisions, parental or caring status, physical or mental disability, pregnancy or parental leave, protected veteran status, status as a victim of domestic violence, or any other classification protected by applicable law.

Bloomberg is a disability inclusive employer. Please let us know if you require any reasonable adjustments to be made for the recruitment process. If you would prefer to discuss this confidentially, please email amer_recruit@bloomberg.net
