Site Reliability Engineer

Overview

Remote
Up to $65
Contract - W2
Contract - 12 Month(s)
No Travel Required

Skills

SRE
Linux systems
Kubernetes
CKA
Python

Job Details

Do something big and innovative! Stretch your creative muscles and work on big issues. Since 1989, we have developed technology environments, applications, and tools by providing experienced teams to implement, enhance, and maintain our clients' essential systems and applications. Come join the Scalence team!

Title: Site Reliability Engineer
Start Date: ASAP
Duration: 12+ months
Location: Fully remote
Pay Rate: $60/hr. - $65/hr. W2

Job Summary
Key Qualifications

  • Strong sense of ownership and integrity demonstrated through clear communication and collaboration.
  • Experience operating production applications at scale, including detailed performance testing, HA and disaster recovery concepts, capacity planning, and managing distributed systems on internal and public cloud infrastructure, principally Kubernetes.
  • Proficiency in authoring and releasing code in Go, Python, Java, or Scala using common configuration management and software delivery platforms.
  • Proficiency with the architecture, deployment, performance tuning, and troubleshooting of open-source Big Data technologies, especially Apache Spark, Flink, Airflow, Hive, Hadoop/HDFS, Trino, Druid, or related software.
  • Experience with storage and coordination systems such as Apache Cassandra, ZooKeeper, and Redis, as well as blob and block storage technologies.
  • Impatience with toil and a strong drive to turn manual operations into automated processes.
  • Understanding of the Linux operating system, containers and virtualization, and standard networking protocols and components.
  • Excellent troubleshooting and problem-solving skills grounded in the scientific method.
  • Ability to participate in our 24x7 weekly on-call rotation.


Skills

  • Kubernetes, Amazon EKS, and/or GKE.
  • Python, Golang, Scala, and/or Java comprehension and development experience.
  • Manual process automation through innovative tools.
  • High-availability architecture.
  • Big Data processing and/or data governance: Apache Spark, Hive, Hadoop/HDFS, Trino, Druid, etc.
  • Fault troubleshooting of virtual and/or containerized services through the stack.
  • Developing and managing CI/CD or software delivery pipelines.
  • Infrastructure-as-Code orchestration tools, such as Terraform or Pulumi.
  • Agile methodology.


Education or Experience

  • 5+ years of software development or production operations experience in a large-scale environment.
  • BS/MS in Computer Science or equivalent.