Job Details
DevOps Engineer
Location: Newtown Square, PA
Duration: 12+ months
Key Responsibilities:
- Implement, manage, and improve monitoring solutions that use Prometheus, ensuring high availability and accurate alerting for our systems.
- Contribute to the development of observability strategies to improve our Cloud monitoring posture.
- Collaborate with development teams to integrate observability into the CI/CD pipeline and throughout the application lifecycle.
- Respond to and investigate incidents, providing thorough post-mortem analyses and implementing preventive measures.
- Stay current with the latest trends and best practices in site reliability and observability.
- Work with cross-functional teams to ensure system reliability, scalability, and performance.
Qualifications:
- Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent experience.
- Proven experience with observability tools such as Prometheus, Grafana, and Splunk.
- Hands-on experience with Kubernetes and container orchestration, preferably with Gardener Kubernetes.
- Familiarity with logging, monitoring, and application performance management (APM) tools; experience with Dynatrace is a plus.
- Strong understanding of cloud infrastructure, networking, and distributed systems.
- Excellent problem-solving and analytical skills, with the ability to work independently and as part of a team.
- Strong communication skills and the ability to work effectively with both technical and non-technical stakeholders.
- Experience with scripting and automation tools (Python, Terraform, Ansible, etc.).