Site Reliability Engineer (PostgreSQL)

Remote • Posted 1 day ago • Updated 3 hours ago
Full Time

Job Details

Skills

  • Query Optimization
  • Load Balancing
  • Archiving
  • Storage
  • Dashboard
  • Database Security
  • Access Control
  • RBAC
  • Auditing
  • Disaster Recovery
  • Streaming
  • Failover
  • Orchestration
  • Data Integrity
  • Business Intelligence
  • Extract
  • Transform
  • Load
  • Workflow
  • Grafana
  • Zabbix
  • Nagios
  • Regulatory Compliance
  • Capacity Management
  • Forecasting
  • Concurrent Computing
  • Python
  • Bash
  • Ansible
  • Migration
  • Knowledge Sharing
  • Mentorship
  • Scalability
  • Computer Engineering
  • Electrical Engineering
  • Information Technology
  • Incident Management
  • Data Centers
  • Cloud Computing
  • Management
  • Replication
  • Performance Tuning
  • Database Administration
  • PostgreSQL
  • SQL
  • Profit And Loss
  • Database Performance Tuning
  • Backup
  • Recovery
  • High Availability
  • Linux
  • Debian Linux
  • Scripting
  • Analytical Skill
  • Problem Solving
  • Conflict Resolution
  • Communication
  • Collaboration
  • Oracle
  • MySQL
  • Database

Summary

Site Reliability Engineer - Data Center (Level 3) - PostgreSQL

Location: Remote - WFH

Job Summary

We are seeking an experienced Site Reliability Engineer (SRE) to join our Data Center Engineering team at Level 3. This role requires a technically strong and operationally mature engineer who will help design, scale, and maintain the reliability of our physical and virtual data center infrastructure. As a Level 3 SRE, you will be a technical leader responsible for ensuring system uptime, optimizing capacity and performance, and contributing to long-term infrastructure resiliency.

Key Responsibilities
Design, implement, and maintain PostgreSQL databases, including schema design, indexing strategies, query optimization, logical/physical replication, hot standby failover, and load balancing.
Develop and execute backup and recovery strategies, including pg_dump, pg_basebackup, WAL archiving, point-in-time recovery (PITR), and disaster recovery planning.
Monitor and optimize database performance, resource utilization, and storage growth using pg_stat_statements, EXPLAIN ANALYZE, pg_top, and Prometheus/Grafana dashboards; proactively troubleshoot performance bottlenecks.
Ensure database security through role-based access control (RBAC), audit logging with pgaudit, and compliance with regulatory standards.
Implement high availability (HA) and disaster recovery (DR) solutions using Patroni, streaming replication, synchronous/asynchronous replication, and failover orchestration.
Plan and execute database version upgrades and apply security or performance patches with minimal downtime, ensuring data integrity and compatibility checks.
Collaborate with application teams, BI developers, and ETL engineers to support data pipelines and optimize queries and workflow performance.
Implement monitoring and alerting solutions using Prometheus, Grafana, Zabbix, or Nagios to track database health, query latency, and resource usage.
Manage database user accounts, roles, and privileges to enforce security policies and regulatory compliance, including sudo/OS-level permissions for critical operations.
Conduct capacity planning, workload forecasting, and index/partition tuning to handle anticipated growth and high-concurrency workloads.
Automate database maintenance tasks using Python, Bash, or Ansible scripts, including schema migrations, routine checks, and patch deployment.
Document procedures, configurations, operational runbooks, and PostgreSQL best practices for team knowledge sharing.
Mentor and guide team members on PostgreSQL internals, replication setups, and performance tuning techniques.
Evaluate and recommend new database tools, extensions (such as TimescaleDB and pg_stat_statements), and best practices to improve efficiency, scalability, and resilience.

Education and Experience
Bachelor's degree in Computer Engineering, Electrical Engineering, Information Technology, or a related technical field.
4-7 years of experience in database administration and operations.
Experience participating in or leading incident response and postmortem analysis processes.
Previous exposure to hybrid environments integrating on-premise data centers with public or private cloud platforms is desirable.
Experience as a PostgreSQL Database Administrator managing production and non-production PostgreSQL environments.
Skilled in backup and recovery, replication, performance tuning, and high availability.
Proven ability to troubleshoot critical issues, automate DBA tasks, and ensure database reliability.

Expertise
4+ years of hands-on PostgreSQL administration experience.
Strong SQL and PL/pgSQL expertise; experience with database optimization and indexing.
Hands-on experience with backup, recovery, and HA solutions.
Strong proficiency in Linux and Debian environments.
Proficiency in scripting for database automation.
Excellent analytical, problem-solving, and troubleshooting skills.
Strong communication skills for cross-team collaboration.
Understanding of Oracle and MySQL databases is a plus, but not mandatory.
  • Dice Id: RTL155990
  • Position Id: ac1d3a4878b47a804969f6ddd01d7989