Site Reliability Engineering (SRE) Lead

Hybrid in Phoenix, AZ, US • Posted 17 hours ago • Updated 23 minutes ago

Contract W2

12 Months

Hybrid

$60/hr

TechVirtue LLC

Job Details

Skills

Site Reliability Engineering
SRE Lead
AWS
Cloud
IBM
WebSphere

Summary

Job Title: Site Reliability Engineering (SRE) Lead

Location: Phoenix, AZ

Duration: Long-Term Contract

We are seeking an experienced Site Reliability Engineering (SRE) Lead to design, build, and evolve highly available, scalable, and secure payment platforms. The role requires strong expertise across AWS cloud, enterprise middleware (IBM WebSphere, IBM MQ), modern application stacks, observability, and DevOps, along with a deep understanding of payments domain systems.

You will define SRE strategy, reliability architecture, and operational excellence while collaborating closely with application, infrastructure, security, and business teams.

Key Responsibilities

Reliability & Architecture

Design and architect highly resilient, fault-tolerant payment systems that meet high-throughput, low-latency SLAs.
Define SRE principles, including SLOs, SLIs, error budgets, and reliability KPIs, for mission-critical payment services.
Lead architecture decisions for cloud-native, hybrid, and legacy systems, including IBM WebSphere-based platforms.
Drive active-active, DR, and HA strategies for AWS and on-premises integrations.

Cloud & Platform Engineering

Architect and operate workloads on AWS (EC2, EKS/ECS, RDS, S3, IAM, VPC, CloudWatch).
Optimize infrastructure for scalability, availability, security, and cost efficiency.
Guide containerization and orchestration strategies where applicable.

Application & Middleware Expertise

Partner with development teams on Java and Spring Boot-based microservices.
Support the performance and reliability of front-end platforms built with React and Angular.
Architect and operate messaging platforms using Kafka and IBM MQ.
Manage enterprise middleware including IBM WebSphere Application Server.

DevOps & Automation

Build and maintain CI/CD pipelines using Jenkins.
Automate infrastructure provisioning, deployments, monitoring, and recovery processes.
Promote Infrastructure as Code (IaC) and immutable infrastructure best practices.
Champion DevOps and SRE culture across engineering teams.

Observability & Operations

Design and standardize monitoring, logging, and alerting using:
- Splunk
- AWS CloudWatch
- Datadog
Implement proactive monitoring and advanced alerting for payment flows.
Lead incident response, root cause analysis (RCA), and post-incident reviews.
Drive reductions in MTTR and in the number of recurring incidents.

Database & Data Layer

Architect and support PostgreSQL and Oracle databases with a focus on:
- High availability
- Performance tuning
- Backup, restore, and disaster recovery

Payments Domain Leadership

Provide reliability leadership for payment processing systems (authorization, capture, settlement, reconciliation).
Ensure compliance with PCI DSS and with the security and regulatory standards relevant to payments.
Understand dependencies across gateways, processors, fraud, and downstream systems.

Leadership & Collaboration

Act as technical lead/architect for SRE initiatives.
Mentor SREs and engineers; guide best practices and standards.
Work closely with product, architecture, security, and operations teams.
Influence executive stakeholders on reliability, risk, and scalability decisions.

Dice Id: 91170837
Position Id: 9001956
Company Info

About TechVirtue LLC

TechVirtue develops a wide range of solutions for finding candidates who have strong knowledge of their field and fit the company's work culture. We also provide one-stop solutions ranging from software development and maintenance to expert support and advisory services. Our team consists of experts with several years of experience in staffing, recruitment, and web development. Our dedicated and motivated team makes sure to fulfill all of our customers' requirements.

Contact the job poster

Harikrishna Vanga

Recruiter @ TechVirtue LLC
