Senior Site Reliability Engineer (SRE), Backend & Performance
REMOTE
We are looking for an engineer to improve the reliability, performance, and scalability of our systems. This role combines DevOps, backend performance tuning, and Site Reliability Engineering (SRE), with a strong focus on Python applications, AWS infrastructure, and PostgreSQL databases.
Requirements
- Python performance optimization
- AWS services
- PostgreSQL and database tuning
- Familiarity with observability and monitoring tools
- DevOps and SRE best practices
- Experience with automated testing and code review
Key Responsibilities
Build and maintain cloud infrastructure on AWS
Set up and manage CI/CD pipelines for automated deployments
Monitor application performance using tools like OpenTelemetry
Analyze and optimize Python application performance
Diagnose and tune PostgreSQL database queries and configurations
Define and enforce performance and reliability metrics (SLOs)
Act as a gatekeeper for deployments—block releases if metrics are not met
Investigate production issues and implement long-term fixes
Review and test code using tools like Claude and other methods
Experience with containerization (Docker, Kubernetes)
Knowledge of infrastructure as code (Terraform, CloudFormation)
Experience with load testing and performance benchmarking