Overview
On Site
Full Time
Skills
SaaS
Virtual Machines
Service Level
Recovery
Root Cause Analysis
Software Development
Quality Assurance
Software Development Methodology
Scalability
Systems Design
Capacity Management
Load Testing
Performance Tuning
Microservices
Knowledge Sharing
Documentation
Knowledge Base
Mentorship
SSO
Continuous Improvement
Optimization
PaaS
Messaging
Storage
Analytics
Log Analysis
Scripting
Windows PowerShell
Bash
Python
Continuous Integration
Continuous Delivery
Version Control
Git
Workflow
GitHub
Computer Networking
TCP/IP
Dragon NaturallySpeaking
DNS
HTTP
HTTPS
TLS
Load Balancing
Firewall
Analytical Skill
Conflict Resolution
Problem Solving
Scrum
C#
.NET
ARM
Terraform
Docker
Orchestration
Kubernetes
SQL Azure
Microsoft SQL Server
NoSQL
Cosmos-Db
Database
JavaScript
React.js
TypeScript
Computer Science
Information Technology
Reliability Engineering
Cloud Computing
System Administration
Software Engineering
FOCUS
Management
Incident Management
Agile
Financial Services
Microsoft Azure
Microsoft
DevOps
ITIL
Service Management
IaaS
Typing
Writing
Cabling
UI
Communication
Active Listening
Bridging
Presentations
System Integration Testing
Laptop
Servers
SAP BASIS
Law
IT Service Management
Innovation
Collaboration
Recruiting
Insurance
Finance
Professional Development
Training
Leadership
CompTIA
Customer Service
Career Counseling
Oracle Application Express
Apex
Job Details
Job#: 2075370
Job Description:
Job Title: Site Reliability Engineer
Type: 6-month contract to hire
Schedule: Onsite 4 days a week
Location: Austin, Texas
SUMMARY:
Our team is seeking a motivated and experienced Site Reliability Engineer (SRE) to join our dynamic organization. This crucial role focuses on safeguarding the availability, performance,
and scalability of our mission-critical, Azure-hosted platform serving thousands of financial professionals nationwide.
As an individual contributor, you will apply your growing expertise in cloud infrastructure, automation, and
observability to maintain and enhance our systems. You will collaborate closely with Agile development teams,
embedding reliability principles throughout the application lifecycle and driving continuous improvement. You will play a key role in ensuring platform stability, participating in our 24x7 on-call rotation to support our advisors around the clock.
ESSENTIAL DUTIES AND RESPONSIBILITIES: To perform this job successfully, this individual must be able to
perform each essential duty satisfactorily:
Monitor, Maintain, and Optimize Azure Infrastructure: Ensure the health, performance, availability, and
capacity of Azure IaaS, PaaS, and SaaS components (including VMs, VNETs, App Services, Function Apps,
Container Apps, Azure SQL, Cosmos DB, Service Bus, Storage, Front Door, App Gateway, CDN).
Enhance Observability: Define, measure, and refine Service Level Indicators (SLIs) and Objectives (SLOs);
implement, configure, and enhance monitoring, logging, and actionable alerting using Azure Monitor,
Application Insights, and Log Analytics (KQL).
Automate Everything: Develop automation and tooling using scripting languages (PowerShell, Bash, Python)
and potentially C#/.NET to eliminate manual tasks ("toil"), improve efficiency, accelerate recovery, and
codify operational best practices.
Incident Response & Resolution: Actively participate in the 24x7 on-call rotation; lead or contribute to
incident triage, mitigation, root cause analysis (RCA), post-mortem processes, and the implementation of
preventive actions.
Collaborate for Reliability: Partner effectively with software development, QA, and other technology teams
throughout the application lifecycle to ensure reliability, scalability, performance, and operational
requirements are met. Provide insights during system design discussions.
Performance and Capacity: Contribute to capacity planning, load testing, and performance tuning initiatives,
particularly across our .NET / React micro-service architecture.
Documentation & Knowledge Sharing: Create and maintain clear documentation for systems, processes,
runbooks, FAQs, and knowledge-base articles; mentor teammates on reliability concepts.
Support Integrations: Troubleshoot and support integrations with third-party systems via APIs, SSO
implementations, and secure file transfer protocols.
Continuous Improvement: Identify opportunities and contribute to initiatives aimed at improving processes,
automation, cost optimization, security posture, and overall engineering excellence.
KNOWLEDGE, SKILLS, AND/OR ABILITIES: To perform this job successfully, individuals should have the
following skills and abilities:
Azure Cloud Expertise: Strong hands-on experience managing and troubleshooting production workloads on
Microsoft Azure, including IaaS & PaaS services (Networking, Compute, Databases, Messaging, Storage,
Security).
Monitoring & Observability: Proficiency with Azure monitoring tools (Azure Monitor, Application Insights,
Log Analytics) and KQL query language. Deep understanding of monitoring concepts, distributed tracing,
and log analysis.
Automation & Scripting: Solid scripting skills for automation (e.g., PowerShell, Bash, Python).
CI/CD: Experience with CI/CD concepts and tools, particularly Azure DevOps pipelines.
Version Control: Proficiency with Git workflows and platforms like Azure Repos or GitHub.
Networking Fundamentals: Solid understanding of networking concepts (TCP/IP, DNS, HTTP/HTTPS, TLS,
Load Balancing, Firewalls, VNETs).
Troubleshooting: Strong analytical and problem-solving skills across complex, distributed systems.
Collaboration & Communication: Excellent communication (written and verbal) and collaboration skills.
Ability to work effectively independently and as part of a team.
Agile Practices: Familiarity with Agile/Scrum methodologies and ceremonies.
C#/.NET Understanding (Desired): Basic understanding or development experience with C#/.NET
applications for troubleshooting and potential tooling development.
Infrastructure as Code (IaC) (Desired): Experience with IaC tools like ARM templates, Bicep, or Terraform.
Containerization (Desired): Familiarity with containerization technologies (Docker) and orchestration
(Kubernetes, Azure Container Apps).
Database Skills (Desired): Experience supporting relational (e.g., Azure SQL, MS SQL Server) and NoSQL
(e.g., Cosmos DB) databases, including query writing and basic troubleshooting.
Frontend Familiarity (Desired): Basic familiarity with modern JavaScript front-end technologies like
React/TypeScript.
EDUCATION AND/OR EXPERIENCE:
Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field, or
equivalent practical experience.
Typically 2-5 years of experience in Site Reliability Engineering, DevOps, Cloud Operations/Engineering,
Systems Administration, or Software Engineering with a strong operational focus.
Demonstrated experience managing production workloads in Microsoft Azure is strongly preferred.
Proven experience with monitoring, alerting, logging systems, and participating in incident response / on-call
rotations.
Track record of successfully automating operational tasks and improving deployment processes.
Experience working in Agile development environments (Desired).
Experience working within financial services or a similarly regulated industry (Desired).
CERTIFICATIONS, LICENSES, REGISTRATIONS:
Microsoft Certified: Azure Administrator Associate (AZ-104) (Desired)
Microsoft Certified: DevOps Engineer Expert (AZ-400) (Desired)
ITIL v4 Foundation or equivalent service-management credential (Desired)
Other relevant cloud, infrastructure, or technology certifications (Desired)
PHYSICAL DEMAND: The physical demands described here are representative of those that must be met by an
employee to successfully perform the essential functions of this job. Reasonable accommodation may be made to
enable individuals with disabilities to perform the essential functions.
Ability to sit or stand at a computer workstation for extended periods while using a keyboard, mouse, and
multiple monitors.
Frequent, repetitive hand-finger motions for typing, writing, and handling small peripherals or cables.
Near-vision sufficient to read electronic documents, review code, and distinguish basic on-screen colors (e.g.,
for UI verification).
Clear spoken communication and active listening for in-person and virtual meetings, incident bridges, and
phone calls.
Ability to walk short distances, navigate a standard office environment, climb one flight of stairs, and stand
during white-boarding or presentations. Sit-stand desks and other ergonomic furniture are available upon
request.
Ability to lift and move equipment or boxed materials weighing up to 20 lbs (e.g., laptops, small servers,
office supplies).
Participation in overnight or weekend on-call rotations and critical production releases may require work
outside standard business hours.
OTHER DUTIES: Please note this job description is not designed to cover or contain a complete comprehensive
listing of activities, duties or responsibilities that are required of the employee for this job. Duties, responsibilities and
activities may change at any time with or without notice.
EEO Employer
Apex Systems is an equal opportunity employer. We do not discriminate or allow discrimination on the basis of race, color, religion, creed, sex (including pregnancy, childbirth, breastfeeding, or related medical conditions), age, sexual orientation, gender identity, national origin, ancestry, citizenship, genetic information, registered domestic partner status, marital status, disability, status as a crime victim, protected veteran status, political affiliation, union membership, or any other characteristic protected by law. Apex will consider qualified applicants with criminal histories in a manner consistent with the requirements of applicable law. If you have visited our website in search of information on employment opportunities or to apply for a position, and you require an accommodation in using our website for a search or application, please contact our Employee Services Department at or .
Apex Systems is a world-class IT services company that serves thousands of clients across the globe. When you join Apex, you become part of a team that values innovation, collaboration, and continuous learning. We offer quality career resources, training, certifications, development opportunities, and a comprehensive benefits package. Our commitment to excellence is reflected in many awards, including ClearlyRated's Best of Staffing in Talent Satisfaction in the United States and Great Place to Work in the United Kingdom and Mexico.
Apex Benefits Overview: Apex offers a range of supplemental benefits, including medical, dental, vision, life, disability, and other insurance plans that offer an optional layer of financial protection. We offer an ESPP (employee stock purchase program) and a 401K program which allows you to contribute typically within 30 days of starting, with a company match after 12 months of tenure. Apex also offers an HSA (Health Savings Account on the HDHP plan), a SupportLinc Employee Assistance Program (EAP) with up to 8 free counseling sessions, a corporate discount savings program and other discounts. In terms of professional development, Apex hosts an on-demand training program, provides access to certification prep and a library of technical and leadership courses/books/seminars once you have 6+ months of tenure, and certification discounts and other perks to associations that include CompTIA and IIBA. Apex has a dedicated customer service team for our Consultants that can address questions around benefits and other resources, as well as a certified Career Coach. You can access a full list of our benefits, programs, support teams and resources within our 'Welcome Packet' as well, which an Apex team member can provide.