Location: Remote
General Responsibilities
Provision, monitor and operate cloud services in a globally distributed team
Analyze and solve operational issues and respond to incidents
Exposure to complex systems administration, database administration and landscape maintenance, including upgrades and hotfixes
Maintain the integrity and security of servers and systems
Exposure to developing and operating monitoring policies and standards
Ensure proper allocation of computing resources across cloud environments
Conduct incident root cause analysis and implement continuous improvements
Partner with product development team to design and enhance service reliability
Exposure to developing and implementing testing strategies and documenting results
Work in a diverse environment and cross-train with other global team members
Willingness to support an on-call rotation schedule
Flexible schedule which may include weekend or after-hours work
Requirements
Expertise with Git
Expertise with Concourse including setup, management and troubleshooting of new pipelines
Expertise with Linux, specifically SUSE and Ubuntu
Expertise with Kafka, ZooKeeper and big data technologies
Expertise in developing automation for testing, deployment, scalability and management of cloud services
Expertise with building, implementing, and/or supporting cloud monitoring tools
Expert knowledge of Cloud Computing and Databases
Expert understanding of web services, networking, virtualization, and internet protocols
Ability to multitask and handle various projects, deadlines and changing priorities
Excellent communication and prioritization skills
Expertise with security fundamentals as they pertain to multitenant SaaS application systems
Strong interpersonal, presentation and customer service skills
Desired Qualifications
Experience with AWS Route 53, EC2, S3, CloudWatch, DynamoDB, RDS, IAM, ACM, KMS, VPC
Experience with Cloud Foundry based environments
Experience with Jenkins and/or Chef automation and Terraform
Expertise with Kubernetes, including troubleshooting, operations, management and configuration of complex Kubernetes services
Exposure to and understanding of troubleshooting IP networks and application stacks
Education
BS/BA degree in Computer Science, Management Information Systems, or related IT discipline preferred
ALLOWABLE SUBSTITUTION: An additional four (4) years of experience can be substituted for a BS or BA degree
8+ years of experience
Additional Requirements
Participation in an on-call rotation for handling P1 incidents is required.
Experience with observability tools such as Prometheus and Grafana.
- Dice Id: sharpdec
- Position Id: 51985
- Posted 13 hours ago