Site Reliability Engineer (W2 position)

Overview

On Site
Contract - W2
Contract - 12+ Month(s)
75% Travel

Skills

SPLUNK
AWS

Job Details

KMM, an ISO 9001:2015 and CMMI Level 2 certified company, provides high-quality IT consulting services and innovative solutions using effective, modern technologies. We have a core group of certified Subject Matter Experts with extensive experience in successfully delivering mission-critical solutions. Our industry experience spans the financial, insurance, health IT, media, marketing, retail, and government markets. We have a proven track record of understanding clients' business challenges, determining customer-focused solutions, and providing the technical implementation and documentation to bring them to fruition.
Interview Mode: 1st round video interview, 2nd round in-person interview
Experience needed: 8+ years
Location requirement: Only consultants based in VA, MD, or DC should apply
Position: Site Reliability Engineer
Location: Reston, VA - Hybrid - Weekly 3 days onsite
Duration: Long Term
Job Description:
  • Independently determine the needs of the customer and create solution frameworks.
  • Design and develop moderately complex software solutions to meet needs.
  • Use a process-driven approach in designing and developing solutions.
  • Implement new software technology and coordinate end-to-end tasks across the team.
  • Support cloud applications and technologies, including containerization, virtualization, microservices, and serverless architecture
  • Work in an Agile, Scrum, or Kanban environment
  • Support application monitoring / observability, including building dashboards, establishing service level indicators / objectives / agreements (SLIs / SLOs / SLAs), and logging / tracing
  • Apply problem-solving skills and proactivity in resolving issues / blockers
  • Use experience with AWS CloudWatch, Splunk, Dynatrace, Catchpoint, and / or Datadog
  • Provide application production / operations support, including incident response, problem management, runbooks, and knowledge articles using tools such as ServiceNow, Moogsoft, StatusHub, and / or Blameless
  • Conduct post-mortems, root-cause analysis (RCA), and / or AWS Correction of Error (COE) reviews
  • Create disaster recovery plans and execute failover tests
  • Perform capacity planning and performance testing / engineering using tools such as JMeter and / or LoadRunner
  • Perform Failure Mode and Effects Analysis (FMEA) and chaos testing / engineering using tools such as Gremlin, Chaos Monkey, Chaos Toolkit, and / or AWS Fault Injection Service (FIS)
Thanks & Regards
Jayanth