Genesis10 is currently seeking a Site Reliability Engineer (SRE) Lead - Messaging Services for a hybrid position with a Global Financial Institution located in Plano, TX. This is a 12+ month contract opportunity. The hybrid schedule requires a minimum of 3 days per week onsite.
The client needs an experienced SRE Lead to drive platform reliability, observability, and operational excellence across IBM MQ and Kafka environments. This role combines production engineering and reliability leadership, platform security and resilience engineering, and ownership of large-scale, distributed messaging runtimes.
Responsibilities:
- Leading reliability engineering for high-scale messaging platforms supporting tens of thousands of runtimes and high-volume message throughput
- Driving EOL remediation, patching, and stabilization across MQ queue managers and Kafka clusters
- Implementing SRE best practices, including SLIs / SLOs for message delivery, latency, and availability, as well as incident management, escalation, and a postmortem culture
- Enhancing observability and monitoring for messaging flows, queue depths, lag, and throughput
- Designing proactive fault detection and auto-remediation strategies (e.g., DLQ handling, backlog mitigation, failover recovery)
- Building resilient messaging platforms capable of supporting real-time, event-driven workloads
- Supporting global production messaging environments with on-call rotation and escalation ownership
- Partnering with engineering, application, and security teams to ensure reliability, scalability, and secure message transport
Requirements:
- Strong experience in Site Reliability Engineering / Production Engineering
- Hands-on expertise with IBM MQ (queue managers, clustering, channels, DLQ management), Kafka / Confluent platform (topics, brokers, partitions, consumer groups), and large-scale distributed messaging systems and runtime management
- Deep understanding of system reliability, scalability, and high availability design; messaging reliability patterns (guaranteed delivery, retry handling, replay, ordering); and incident management, root cause analysis, and problem management
- Experience using observability tools (Dynatrace, Splunk, Prometheus, Grafana) to monitor messaging platforms and to detect events and anomalies in high-volume systems
- Strong scripting/automation skills in Shell, Python, and PowerShell
- Experience managing Linux/Unix and Windows production environments
- Knowledge of event-driven architecture and messaging-based integration patterns
- Understanding of messaging platform security (TLS, certificates, channel auth, encryption), as well as vulnerability remediation and risk mitigation in production systems
- Excellent troubleshooting skills in high-pressure, real-time environments (e.g., message backlog, latency spikes, connection failures)
Desired skills:
- Experience implementing SRE frameworks (SLIs, SLOs, error budgets) specifically for messaging workloads
- Familiarity with Kubernetes / containerized messaging platforms
- Experience with Kafka ecosystem components (Schema Registry, Connect, Streams) and IBM MQ advanced features (Native HA, clustering)
- Exposure to AI-driven operations (AIOps), anomaly detection, or automated remediation, as well as large-scale messaging modernization or migration programs
- Messaging or middleware certifications (IBM MQ, Kafka, or equivalent)
- Experience in regulated environments (e.g., financial services)
Only candidates available and ready to work directly as Genesis10 employees will be considered for this position. If you have the described qualifications and are interested in this exciting opportunity, please apply!
Ranked a Top Staffing Firm in the U.S. by Staffing Industry Analysts for six consecutive years, Genesis10 puts thousands of consultants and employees to work across the United States every year in contract, contract-for-hire, and permanent placement roles. With more than 300 active clients, Genesis10 provides access to many of the Fortune 100 firms and a variety of mid-market organizations across the full spectrum of industry verticals.
For contract roles, Genesis10 offers the benefits listed below. If this is a perm-placement opportunity, our recruiter can talk you through the unique benefits offered for that particular client.
Benefits of Working with Genesis10:
- Access to hundreds of clients, most of whom have been working with Genesis10 for 5-20+ years.
- The opportunity to have a career home in Genesis10; many of our consultants have been working exclusively with Genesis10 for years.
- Access to an experienced, caring recruiting team (more than 7 years of experience, on average).
- Behavioral Health Platform
- Medical, Dental, Vision
- Health Savings Account
- Voluntary Hospital Indemnity (Critical Illness & Accident)
- Voluntary Term Life Insurance
- 401(k)
- Sick Pay (for applicable states/municipalities)
- Commuter Benefits (Dallas, NYC, SF, and Illinois)
For multiple years running, Genesis10 has been recognized as a Top Staffing Firm in the U.S., as a Best Company for Work-Life Balance, as a Best Company for Career Growth, and for Diversity and Leadership, among other honors. To learn more and to view all our available career opportunities, please visit us at our website.
Genesis10 is an Equal Opportunity Employer. Candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.