Job Details
Job Title: Cockroach DBA
Location: Sunnyvale, CA (On-Site)
Job Description:
Must Have Technical/Functional Skills:
Design, deploy, operate, and scale multi-region CockroachDB clusters in production environments
Ensure high availability, fault tolerance, and data consistency for globally distributed clusters
Monitor cluster health, latency, replication status, and resource utilization using observability tools
Perform capacity planning and proactive scaling for future growth
Troubleshoot complex database and infrastructure issues, including:
  Node failures
  Network partitions
  Leaseholder and range imbalance
  Replication lag
  Hot spotting
  High latency / throughput bottlenecks
Design disaster recovery strategies (multi-region, backup/restore, failover/failback)
Implement and test backup, restore, and point-in-time recovery processes
Automate provisioning, scaling, patching, and upgrades of CockroachDB (CRDB) clusters
Perform rolling upgrades with zero or near-zero downtime
Optimize SQL query performance and database schema efficiency
Create operational runbooks, SOPs, and on-call playbooks for CRDB
Participate in on-call rotations and incident response for production clusters