Senior Database DBA - MemSQL / SingleStore

Qode

Texas, TX

JOB DETAILS
SKILLS
Amazon Web Services (AWS), Architectural Services, Auditing, Authentication, Bash Scripting, Best Practices, CPU (Central Processing Unit), Cloud Computing, Communication Skills, Cryptography, Data Collection, Data Partitioning, Data Recovery, Database Administration, Database Design, Database Programming, Disaster Recovery, Distributed Databases, Failover, GCP (Google Cloud Platform), High Availability, Identify Issues, Incident Response, Information/Data Security (InfoSec), Input/Output, Leadership, Licensing, Linux Operating System, Memory Hardware, Microsoft Azure, Operating Systems, Performance Tuning/Optimization, Python Programming/Scripting Language, Query Optimization, Replication and Remote Mirroring, Right-Sizing, Root Cause Analysis, SQL (Structured Query Language), Scripting (Scripting Languages), Service Level Agreement (SLA), Splunk, Test Strategy
POSTED
7 days ago

Job Title: Senior Database DBA – MemSQL / SingleStore

Location: New Jersey / Irving, TX / Tampa, FL

Role Overview

We are seeking a Senior MemSQL / SingleStore Cluster Administrator to own and manage mission-critical, large-scale distributed database platforms. This role requires a dedicated Database Administrator (DBA) with deep expertise in handling petabyte-scale data, complex distributed clusters, and real-time, latency-sensitive workloads.

Core Technical Expectations

Experience handling petabytes of data ingested every 15 minutes in large-scale environments.

Strong expertise managing large MemSQL / SingleStore clusters (multi-node, multi-TB to multi-PB).

Deep understanding of data distribution across aggregators and leaf nodes.

Expertise in:

  • Partitioning and shard key strategy
  • Data skew mitigation
  • Hot partition resolution
  • Worker node and leaf node optimization

Strong table-level and engine-level knowledge, including:

  • Index strategy
  • Thread management
  • Connection pooling
  • Memory limits
  • Query plan optimization

Strong understanding of different MemSQL/SingleStore versions and corresponding architectural/feature changes.

Key Responsibilities

End-to-end ownership of large MemSQL/SingleStore clusters (design, build, upgrade, operate, decommission).

Architect and maintain High Availability (HA) and Disaster Recovery (DR) setups including:

  • Redundancy levels
  • Availability groups
  • Cross-region replication

Plan and execute:

  • Cluster expansion
  • Downsizing
  • Online partition rebalancing
  • Leaf node management with minimal/no downtime

Proactively monitor cluster health, throughput, latency, and capacity; define and maintain SLAs.

Perform advanced performance tuning:

  • Schema design
  • Shard key design
  • Index strategy
  • NUMA and memory tuning
  • Workload management

Implement backup/restore strategies and regularly test DR & failover.

Lead incident response and perform deep root cause analysis.

Enforce database security best practices:

  • Authentication & authorization
  • Encryption
  • Auditing
  • Network controls

Drive automation using scripting (Python/Bash) and Infrastructure as Code.

Maintain documentation, operational runbooks, and standards.

Evaluate new MemSQL/SingleStore features and lead version upgrades and migrations.

Required Experience & Skills

10+ years of total database engineering/administration experience.

4–5+ years of deep, production-grade experience administering MemSQL/SingleStore clusters at scale.

Strong hands-on experience with:

  • Aggregators & leaf nodes
  • Licensing and memory limits
  • Cluster expansion & partition rebalancing
  • Replication & failover/failback

Proven ability to diagnose:

  • Locking issues
  • Data skew
  • Hot partitions
  • Bad execution plans

Strong Linux system tuning knowledge:

  • CPU/NUMA affinity
  • Disk & I/O optimization
  • Networking
  • ulimits & OS-level tuning

Experience with monitoring & alerting tools:

  • Prometheus / Grafana
  • Datadog
  • Splunk
  • ELK

Strong SQL expertise and scripting (Python/Bash).

Experience in Cloud/Container environments (AWS/Azure/GCP, Kubernetes) is highly preferred.

Excellent communication skills, with the ability to lead production calls and explain technical trade-offs clearly.


About the Company

Qode