Site Reliability Engineer

New York, NY, US • Posted 60+ days ago • Updated 2 hours ago

Full Time

On-site

USD $175,000.00 - $200,000.00 per year

Software Guidance & Assistance

Job Details

Skills

IaaS
Reliability Engineering
Microsoft Azure
Scalability
Management
High Availability
Performance Tuning
Design Patterns
Root Cause Analysis
Delegation
Technical Drafting
Mentorship
Requirements Elicitation
Stakeholder Engagement
DevOps
Amazon Web Services
Cloud Computing
Amazon EC2
Scripting
GitLab
Jenkins
Terraform
Dynatrace
Microservices
Docker
Kubernetes
Analytical Skill
Problem Solving
Communication
Collaboration
Computer Science
Financial Services
Asset Management
Capital Market
Python
Django
Pandas
NumPy
SQL
Regulatory Compliance
Optimization
Continuous Integration
Continuous Delivery
Incident Management
Documentation
Quality Assurance
MEAN Stack
Customer Service
Training And Development
SAP BASIS

Summary

Software Guidance & Assistance, Inc. (SGA) is searching for a Site Reliability Engineer (SRE) for a Direct Placement assignment with one of our premier financial services clients in Midtown New York City. The role follows a hybrid schedule, with 2-3 days onsite per week.

The firm is seeking a hands-on and analytical Site Reliability Engineer to design, build, and maintain reliable, secure, and scalable cloud infrastructure and observability systems. The ideal candidate will have deep experience in AWS, Python, CI/CD automation, and monitoring frameworks, supporting mission-critical applications in a modern cloud ecosystem. This role plays a key part in enhancing system reliability, performance, and resilience across enterprise platforms.

Responsibilities

Architect, deploy, and maintain AWS and Azure infrastructure focused on reliability, scalability, and cost efficiency.
Design and manage monitoring, logging, and alerting systems to ensure high availability and rapid incident response.
Build and maintain CI/CD pipelines (GitLab, Jenkins) for continuous software delivery and automation.
Implement and maintain Infrastructure as Code (IaC) using CDK, Terraform, or CloudFormation.
Collaborate with development teams to improve deployment processes and production reliability.
Contribute to the application codebase to improve resiliency, performance, and observability.
Maintain detailed documentation for architectures, design patterns, and configurations.
Partner with Dev, QA, and AppSecOps teams to promote automation, consistency, and improved security posture.
Perform incident triage and root cause analysis, and develop permanent solutions to production issues.
Continuously improve standards, tools, and processes for platform reliability and efficiency.

Work Allocation (Approximate)

60% - Hands-on development, automation, and operations
25% - Technical design, architecture, and mentoring
15% - Collaboration, requirements gathering, and stakeholder engagement

Required Skills

8+ years of hands-on experience in Site Reliability, DevOps, or Platform Engineering roles.
Strong experience with AWS Cloud Services (ECS, EC2, Lambda, IAM, CloudWatch, etc.).
Proficiency in Python for automation, scripting, and infrastructure integration.
Solid understanding of CI/CD pipelines using GitLab or Jenkins.
Hands-on experience with Infrastructure as Code (CDK, Terraform, or CloudFormation).
Expertise in monitoring and observability tools (Datadog, Dynatrace, ELK).
Working knowledge of microservices, serverless architectures, and containerization (Docker, ECS, Kubernetes).
Strong analytical, troubleshooting, and problem-resolution skills.
Excellent communication and collaboration abilities in cross-functional teams.
Bachelor's degree in Computer Science or related field (or equivalent experience).

Preferred Skills

Experience in financial services, asset management, or capital markets environments.
Familiarity with Python/Django, data libraries (Pandas, NumPy), and SQL.
Knowledge of security, compliance, and cost optimization best practices.
Proven ability to identify and implement process and automation improvements.

Initial Success Criteria (First 6 Months)

The successful candidate will demonstrate measurable impact by:

Implementing or optimizing monitoring and alerting systems to improve visibility and response time.
Playing a key role in modernizing CI/CD pipelines to enhance automation and release consistency.
Contributing hands-on code and tooling improvements that increase platform reliability and performance.
Establishing incident response processes and documentation for key production systems.
Building strong working relationships across engineering, QA, and AppSecOps teams to promote shared reliability ownership.

SGA is a technology and resource solutions provider driven to stand out. We are a women-owned business. Our mission: to solve big IT problems with a more personal, boutique approach. Each year, we match consultants like you to more than 1,000 engagements. When we say let's work better together, we mean it. You'll join a diverse team built on these core values: customer service, employee development, and quality and integrity in everything we do. Be yourself, love what you do and find your passion at work. Please find us at .

SGA is an Equal Opportunity Employer and does not discriminate on the basis of Race, Color, Sex, Sexual Orientation, Gender Identity, Religion, National Origin, Disability, Veteran Status, Age, Marital Status, Pregnancy, Genetic Information, or Other Legally Protected Status. We are committed to providing access, equal opportunity, and reasonable accommodation for individuals with disabilities in employment and in our services, programs, and activities. Please visit our company to request an accommodation or assistance regarding our policy.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions, and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

Dice Id: sgainc
Position Id: 25-03254

Company Info

About Software Guidance & Assistance

Founded in 1981, SGA is a technology and resource solutions provider with a national footprint and headquartered in the shadow of Wall Street. We’re a certified women-owned business. We provide contingent staffing, direct placement, and professional and managed services to transform businesses and evolve careers. We’re small enough to tailor our services to each client and big enough to deliver for some of the world’s largest employers. Our professionals are experts in areas such as IT, finance, accounting, risk, and clinical.

SGA provides contingent staffing, direct placement, and professional and managed services nationwide for Fortune 500 companies, mid-size businesses and select startups.

Our core skillsets include all areas of technology – business & data analysis, cyber & network security, database administration, development & architecture, infrastructure, program & project management, quality assurance & testing. We also deliver talent across professional business functions such as finance, accounting, risk, and clinical.

Our Professional & Managed Services team delivers IT projects through onshore, offshore and hybrid delivery models. We develop software products, modernize applications, add features, and integrate and maintain systems. Our scope covers, among others, complex application suites, data management and visualizations, machine learning and mobile applications.
