Job Details
Position: Site Reliability Engineer (SRE) / Software Engineer (SWE)
Location: San Jose, CA
Duration: 6+ months, with possible extensions
Description:
They are looking for a lead-level engineer with strong Terraform and Google Cloud Platform experience. They are now flexible on Go experience, but it is still a nice-to-have.
Role Overview:
We are seeking a dynamic Site Reliability Engineer (SRE) / Software Engineer (SWE) to join our innovative team.
In this hybrid role, you will be responsible for maintaining and enhancing our production applications, monitoring system alerts and logs, troubleshooting and resolving code issues, and ensuring the reliability and scalability of our infrastructure.
This position offers a startup-like culture where collaboration, learning new technologies, and effective communication are highly valued.
Key Responsibilities:
Production Application Management:
- Monitor and maintain the health of production applications.
- Respond to system alerts and logs to ensure high availability and performance.
Code Troubleshooting and Bug Fixing:
- Analyze, troubleshoot, and resolve code issues in Go and Kotlin.
- Collaborate with the development team to implement fixes and improvements.
Infrastructure and Monitoring:
- Design, implement, and manage infrastructure using Terraform.
- Set up and maintain monitoring, logging, and alerting systems to proactively identify and address issues.
Collaboration and Communication:
- Work closely with cross-functional teams to ensure seamless integration and deployment of applications.
- Participate in on-call rotations and provide support as needed.
Required Qualifications:
Technical Skills:
- Proficiency in the Go and/or Kotlin programming languages.
- Experience with Google Cloud Platform (GCP) services and architecture.
- Strong understanding of infrastructure as code (IaC) principles, particularly with Terraform.
- Experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
Experience:
- Previous experience in a DevOps or Site Reliability Engineering role with a focus on cloud environments.
- Demonstrated ability to troubleshoot complex systems and code issues.
Soft Skills:
- Excellent communication and collaboration abilities.
- Adaptability and willingness to learn new technologies.
- Ability to thrive in a fast-paced, startup-like environment.
Preferred Qualifications:
- Experience with the Java programming language.
- Familiarity with containerization and orchestration tools (e.g., Docker, Kubernetes).
- Knowledge of CI/CD pipelines and related tools.
- Understanding of networking concepts and security best practices.