Site Reliability Engineer

Chandler, AZ, US • Posted 1 day ago • Updated 1 day ago
Contract W2
On-site
Depends on Experience

Job Details

Skills

  • AngularJS
  • DevOps
  • JIRA
  • Jenkins
  • KPI
  • Java
  • Python

Summary

Role: Site Reliability Engineer
Location: Chandler, AZ
Job Description:
Application Operational Services is seeking a Site Reliability Engineer. This role requires a strong IT professional focused on establishing and improving monitoring to measure end-to-end performance and end-user availability of systems via a suite of common monitoring tools. The engineer will interface with business partners and operations teams to develop business and technical monitoring requirements. As part of this role, the person will primarily be responsible for supporting production operations of critical applications. They will ensure the application's operational readiness by evaluating its performance, reliability, scale, resiliency, and observability. They will be responsible for identifying issues in production, triaging identified issues, and partnering with other engineers on the team to identify the root cause. The candidate should possess strong analytical ability in solving IT problems, working towards automation, and eliminating system and/or process bottlenecks.
Responsibilities:
As part of the SRE team, perform full stack triaging of alerts and engage other engineers to identify root cause of application performance & stability issues.
Work with stakeholders such as product owners to define service level objectives (SLOs) for application features and services.
Track performance against SLOs in partnership with development teams or other stakeholders, and ensure systems continue to meet SLOs over time.
Design and develop dashboards and reports to communicate key metrics.
Identify opportunities to improve alerting posture and create/update alerts accordingly.
Work closely with the Engineering team to understand application architecture, perform single-point-of-failure analysis, and create scenarios for testing the resiliency of the application.
Create or derive the non-functional requirements (NFR) and workload model, and ensure performance and resiliency are considered early in the SDLC.
Execute performance and chaos tests, and analyze the results using APM and other tools to identify performance and stability issues.
Document any findings/analysis/results, communicate and present to stakeholders.
Perform analytics on previous incidents to understand root causes and use automation to reduce the probability and/or impact of problem recurrence.
Demonstrate proficiency with DevOps tools, JIRA, ServiceNow, MS Project and perform tasks using the tools.
Required hard and soft skills/experience:
8-10 years of information technology experience, with 6+ years working on a DevOps, SRE, or performance engineering team
Experience triaging production issues using APM tools such as Dynatrace, AppDynamics, or New Relic and log aggregation tools such as Splunk or ELK.
Technical expertise across multiple technology areas, with the ability to diagnose complex issues across many technologies and apply this knowledge to effective monitoring of applications.
Strong experience in Java and front-end development (UI and UX) (React JS, Angular)
Experience with Apache/Tomcat middleware and Java/RESTful services frameworks (MuleSoft is a plus)
Backend database experience is a must - Oracle, SQL Server, Hadoop
Strong Python, UNIX, Wintel, and Perl/shell scripting skills
Strong experience working with CI/CD tools - Bitbucket, JFrog Artifactory, Jenkins, Terraform/Packer, Ansible
Experience working with Business and Technical leaders to develop KPIs for application monitoring.
Experience with SRE concepts such as SLIs, SLOs, and error budgets, and with working alongside developers to track and improve them on a continuous basis.
Must be able to provide oral and written discussion of analytical findings using narrative and graphic forms.
Must be able to use qualitative and quantitative analytical skills to assess the effectiveness of the operations.
Ability to identify symptoms that point to opportunities for process improvement.
Analytical, investigative, and organizational skills.
Communication skills, including the ability to craft content for executive-level presentations.
  • Dice Id: 10507810
  • Position Id: 8938415

Company Info

About Akshaya Inc

Akshaya Inc is one of the premier IT consulting services companies in the Bay Area.

Experienced and trained IT professionals at Akshaya Inc provide a variety of IT services, including IT staff augmentation, application development, IT support, and offshore development.

Our world-class professionals can assist you with development and support in ERP (SAP, Oracle Applications), Business Intelligence and Data Warehousing, Web development, and much more.
