Site Reliability Engineer (SRE)

Alpharetta, GA, US • Posted 8 hours ago • Updated 8 hours ago

Contract W2

Contract Independent

Contract Corp To Corp

No Travel Required

On-site

$60 - $65/hr

Keylent

Job Details

Summary

Role: Site Reliability Engineer (SRE)

Location: Alpharetta, GA (Day 1 Onsite - F2F interview)

Duration: Long Term

Note: Candidates with prior experience at Equifax are preferred

Job Description:

Seeking an experienced Site Reliability Engineer who can operate independently with limited guidance and oversight. This individual will be passionate about end-user experience and will be part of a tight-knit, distributed engineering team developing and delivering a comprehensive data operations management solution for Equifax's Data Fabric Platform. The SRE plays a critical role across the entire SDLC, from coding and scaling to ensuring production stability, which includes responding to on-call incidents.

Data Fabric is a modern, cloud-native data management platform built on Google Cloud Platform that allows Equifax to acquire and curate data, provide entity resolution, and ingest data into a single environment. It is deployed globally in multiple regions, is highly secured, and complies with regional and internal regulatory controls under strict governance and oversight. Business units, data scientists, and many other stakeholders use APIs to consume data managed by the Data Fabric and operate data exchanges to monetize data through B2B and B2C channels.

The data operations management solution consists of:

· A web portal UI/UX that provides a single point of access to all data management and data reliability engineering

· A suite of backend API services that serve the UI and integrate with low-level Data Fabric and other third-party system APIs

· Modern data lakehouse (data lake, data warehouse, batch and streaming ELT pipelines)

The data operations roadmap envisions a set of rich management capabilities, including:

· Support for a large community of geographically dispersed data operations stakeholders

· Data quality and observability management to detect, alert, and prevent data anomalies

· Troubleshooting, triaging and resolving data and data pipeline issues

· OLAP, batch and streaming big data processing, and BI reporting

· MLOps

· Real-time dashboards, alerting and notifications, case management, user/group management, AuthZ, and many other foundational capabilities

Tech Stack:

Frontend: Angular 17+, JavaScript, TypeScript, HTML, SCSS, Webpack Module Federation, Tailwind CSS, Angular Material, Angular Elements

Backend: Java (JDK 17+), Spring Framework 6.X.X, Spring Boot 3.X.X, NestJS 10.X.X, REST and GraphQL microservices, NodeJS

Tools & Frameworks: Nx build management, Monorepo architecture, Jenkins CI/CD, Fortify, Sonar, GitHub

Cloud & Data: Google Cloud Platform (GKE, Composer + Airflow, Dataflow + Apache Beam, BigQuery, Bigtable, Firestore, GCS, Pub/Sub, Vertex AI), Terraform, Helm Charts, GitOps

Other Technologies: WebSockets, SSE, event-driven architecture

Environment:

Culture: Fast-paced, creative, results-oriented

Team Structure: Agile, working in 2-week sprints using Aha and Jira for project management

Expectations: Self-starters who can work independently with limited guidance, delivering solutions that end-users value and love

General Responsibilities:

· Contribute to Development Activities: The SRE is expected to participate in SDLC activities, including design, development, testing, deployment, and operations, covering both frontend and backend

· Cross-Functional Work: Collaborate with global teams to integrate with existing internal systems and Google Cloud Platform

· Issue Resolution: Triage and resolve product or system issues, ensuring quality and performance

· Documentation: Write technical documentation, support guides, and runbooks

· Agile Practices: Participate in sprint planning, retrospectives, and other agile activities

· Compliance: Ensure software meets secure development guidelines and engineering standards

SRE Accountability:

General: Use coding, automation, and software engineering principles to deliver scalability, performance, and reliability efficiently and with minimal toil

IaC: Build infrastructure as code (IaC) patterns that meet security and engineering standards using one or more technologies (Terraform, scripting with a cloud CLI, or programming with a cloud SDK)

CI/CD: Build CI/CD pipelines for build, test and deployment of application and cloud architecture patterns, using platform (Jenkins) and cloud-native toolchains

Automation: Build automated tooling that deploys service requests and pushes changes into production. Build comprehensive, detailed runbooks for detecting issues, remediating them, and restoring services

Change Management: Work closely with the dev team to ensure all DevSecOps issues are addressed in a timely manner, in compliance with Equifax security policies and in adherence to the Engineering Handbook

Incident Management: Solve problems and triage issues across complex distributed architecture service maps. Serve on call for high-severity application incidents, and refine runbooks to reduce MTTR

RCA and Postmortem: Lead root cause analyses and blameless postmortems, and own the action items that prevent recurrence

Customer Focus: Address service disruptions and downtime so that end-customer needs are met, and drive processes for a flawless customer experience

Reliability and Availability: Ensure that the SRE golden signals are monitored and that SLOs, SLIs, and SLAs are honoured within error budgets. Work closely with developers, QE, POs, and other stakeholders, providing continuous feedback on uptime, scalability, and reliability, and influence best practices with the aim of providing excellent operational experiences

Reliability Roadmap: Own the reliability roadmap by taking a holistic view of all data operations management capabilities, including participating in Production Readiness Reviews (PRRs) and working with stakeholders to ensure DR plans are in place.

Must-Have Skills:

General experience: 5-7 years of experience in software engineering, systems administration, database administration, and networking. System administration skills, including automation and orchestration of Linux/Windows using Terraform, Chef, Ansible and/or containers (Docker, Kubernetes), and shell scripting

Cloud-Native Application Development: 3+ years. Solid experience with developing and supporting cloud-native applications. Experience with cloud-based security: IAM, AuthZ

End-user Application Experience: 3+ years of experience as an SRE supporting an end-user-facing application, e.g., a web, mobile, or desktop app that includes UI, APIs, and backend systems

Development Experience: 2+ years of general proficiency with Java, or JavaScript/NodeJS

Frontend Experience: Experience with Angular, JavaScript, TypeScript, or modern web application development frameworks

Architecture Knowledge: Understanding of modular systems, performance, scalability, security

Agile Experience: Agile development mindset and experience

Service-Oriented Architecture: Knowledge of RESTful web services, JSON, AVRO

Application Troubleshooting: Debugging, performance tuning, production support

Documentation Skills: Strong written and verbal communication

General SDLC: Experience with CI/CD and release management concepts, and the ability to use tools such as Jenkins or Bamboo. Understanding of Google Cloud Platform big data services such as BigQuery, Dataflow, Pub/Sub, GCS, and Composer/Airflow, or similar solutions in AWS: Redshift, SNS, SQS, S3, Kinesis, and others.

Nice-to-Have Skills:

· Big Data Processing: ETL/ELT experience

· Scripting Languages: Groovy, Python

· Cloud Certification: Relevant certifications in cloud technologies

Dice Id: 10423210A
Position Id: 8954639

Company Info

About Keylent

We established Keylent to provide the Key Talent that our clients seek. We are all about People. About Passion. Professional and Process driven.

We have been involved with the industry for over two decades and have seen the ups and downs. We have weathered bad times and enjoyed good times by putting our clients' needs ahead of ours. We continue to do the same thing.

We take great care of our Talent Acquisition and Administrative staff who in turn put in their best work to fulfill our Consultant and Client needs.

Our Clients and our Consultants have a variety of choices and we are thankful that they have chosen Keylent.


