Job Details
Position/Title: SRE + DevOps Engineer
Location: Dallas, TX & NY/NJ (Onsite)
Employment Type: Full-Time or W2
Mphasis.AI is looking for a highly experienced SRE + DevOps Engineer with 5-10 years of experience to join our growing team. The ideal candidate will be responsible for ensuring the reliability, scalability, and security of our cloud infrastructure while streamlining development and deployment processes. You will collaborate with cross-functional teams to implement DevOps practices, monitor system performance, and resolve incidents efficiently.
Responsibilities:
Infrastructure and Cloud Management: Manage cloud infrastructure on AWS, Azure, Google Cloud Platform, or OCI using Infrastructure-as-Code (IaC) tools such as Terraform, Ansible, or CloudFormation.
Containerization and Orchestration: Work with Docker and Kubernetes to implement, manage, and scale containerized applications.
Monitoring and Incident Management: Ensure robust monitoring, logging, and alerting mechanisms using tools like Prometheus, Grafana, and the ELK Stack. Respond promptly to incidents and perform root cause analysis.
CI/CD Pipelines: Set up and maintain continuous integration/continuous deployment pipelines using Jenkins, GitLab CI, or similar tools to enable automated build, testing, and deployment.
Security and Compliance: Implement security best practices, such as IAM policies, secrets management, and network security. Perform vulnerability assessments and security audits.
Automation and Scripting: Develop automation scripts in Python, Bash, or Java to optimize system performance, automate repetitive tasks, and enhance operational efficiency.
Performance Tuning and Scalability: Conduct performance tuning and optimize resources for high-availability and scalable systems, ensuring reliability even during peak loads.
Release and Change Management: Oversee release and change management processes, ensuring zero-downtime deployments while coordinating closely with development teams.
Collaboration: Work closely with software engineers, product teams, and stakeholders to integrate and align SRE/DevOps practices with business goals.
Qualifications and Skills:
5-10 years of experience in SRE, DevOps, or Infrastructure roles.
Strong knowledge of AWS, Azure, Google Cloud Platform, or OCI cloud services.
Expertise in containerization and orchestration technologies such as Docker and Kubernetes.
Hands-on experience with CI/CD pipelines, particularly Jenkins, GitLab CI, or CircleCI.
Proficiency in scripting languages like Python, Bash, or Java for automation.
Solid understanding of networking principles, security best practices, and identity management.
Experience with monitoring tools like Prometheus, Grafana, ELK Stack, and Datadog.
Strong communication skills and ability to work in a collaborative team environment.
Nice to Have:
Certifications in AWS, Azure, Google Cloud Platform, or Kubernetes (CKA, CKAD).
Experience with microservices architecture and service mesh technologies.
Familiarity with DevSecOps practices.
Education:
Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent work experience).