Title: Infrastructure Automation Engineer
Duration: 6 Months
Location: Remote
The Infrastructure Automation Engineer will support the Cloud Automation Engineering team by designing, building, and evolving automation platforms and shared services that reduce manual operational effort across the organization. This role is intended to add resiliency and capacity to a small, highly specialized team and enable consistent progress despite shifting priorities or resource constraints.
This position requires a highly autonomous engineer who can take broad direction, leverage existing documentation, and independently move projects forward while collaborating appropriately with the team.
Team & Reporting Structure
- Reports directly to Manager, Cloud Automation Engineering
- The team is fully remote and works across time zones
Core Responsibilities
- Design, implement, and maintain automation-first solutions that:
  - Reduce manual toil for engineering teams
  - Are simple and intuitive for downstream teams to consume
- Build and maintain Infrastructure as Code for shared services and platforms
- Develop reusable automation components intended for long-term reuse, not one-off solutions
- Contribute to and extend automation pipelines and frameworks
- Support and enhance internally provided automation services, then hand off day-to-day operation to consuming teams
- Work independently from high-level project definitions and documented frameworks
- Review and leverage extensive documentation (Confluence) to self-onboard and execute effectively
Current & Representative Projects
- Heavy use of Terraform to define and manage infrastructure and platform integrations
Migration initiatives, including:
- Moving away from PagerDuty to Datadog Incident Management and On-Call, with Terraform-driven parity between platforms
- Evaluation and potential transition from Chef to Ansible for configuration management
  - Chef experience is not required
  - Ansible experience is needed
- Supporting shared tooling and platforms that span multiple business units
Cloud & Platform Environment
Multi-cloud environment:
- AWS is the primary cloud across the organization
- Significant tooling and automation platforms currently reside in Azure
Engineers are expected to:
- Be strong in at least one major cloud platform
- Adapt easily between AWS and Azure as needed
Observability & incident tooling:
- Datadog is a core platform for monitoring, observability, and incident management
- The team manages vendor relationships, contracts, and onboarding of new business units
- Ability to understand Datadog capabilities and explain them to internal teams is important
Required Skills & Experience
- Strong background in infrastructure automation
- Deep experience with Terraform
- Proven ability to design and build Infrastructure as Code
- Experience working in environments where automation is delivered as a shared service
- Ability to work independently with minimal supervision
- Comfortable navigating and learning from existing documentation
- Strong problem-solving and analytical skills
- Should be comfortable using AI tools (Cloud desktop, Copilot, ...)
Desired Skills & Experience
Experience with:
- Ansible (or strong willingness to learn)
- Datadog (monitoring, observability, or incident management)
- Configuration management tools (Chef experience is optional)
- Exposure to both AWS and Azure environments
- Experience migrating or modernizing tooling platforms
Working Style & Expectations
- Self-directed and comfortable operating without micromanagement
- Able to accept broad project parameters and determine implementation details independently
- Willing to ask questions when blocked, without requiring constant direction
- Focused on maintainability, scalability, and ease of consumption for end users
- Adaptable to changing organizational priorities and tooling decisions