Job Title: Data Center/Cloud & Automation SME
Experience Required: 10 years and above
Location: Culver City, CA (Onsite - Locals preferred) - 100% ONSITE FROM DAY 1
Duration: 6 months
GBaMS ReqID: 10769084
Skills:
- Digital: Amazon Web Services (AWS) Cloud Computing
- Primary Skills: Linux, AWS, Terraform & Ansible, Automation
- Secondary Skills: Windows, Citrix, Nutanix, and VMware
Detailed Responsibilities
- Manage and support Linux servers including installation, patching, user management, performance monitoring, and troubleshooting.
- Design, provision, and maintain AWS infrastructure (EC2, Certificate Manager, Security Groups, VPC, IAM, S3, Load Balancers) ensuring security, availability, and cost efficiency.
- Implement Infrastructure as Code using Terraform to build, update, and manage AWS resources in a consistent and reusable manner.
- Automate system configuration, deployments, and patching using Ansible playbooks and roles.
- Manage code version control through GitHub.
- Monitor systems and cloud resources, respond to incidents, and perform root cause analysis.
- Follow security best practices, access controls, and compliance requirements across OS and cloud platforms.
- Collaborate with application, network, and security teams to support deployments and changes.
- Maintain documentation, SOPs, and continuously improve automation and operational efficiency.
Required Technical Skill Set
- Linux Administration: Strong hands-on experience with RHEL / Amazon Linux / Ubuntu, including user management, patching, troubleshooting, shell scripting, and performance monitoring.
- AWS Cloud: Practical knowledge of core AWS services such as EC2, VPC, Subnets, Certificate Manager, IAM, S3, Load Balancers, Auto Scaling, and CloudWatch, with security and cost optimization awareness.
- Infrastructure as Code (Terraform): Experience in writing and managing Terraform modules, variables, and state files to provision and maintain AWS infrastructure.
- Configuration Management (Ansible): Ability to create and manage Ansible playbooks and roles for OS configuration, automation, and deployment tasks.
- Version Control: Working knowledge of Git for code versioning and collaboration.
- Understanding of networking fundamentals.
- Automation & Scripting: Proficiency in Bash scripting and automation of repetitive operational tasks.
- Monitoring & Troubleshooting: Experience in system and cloud monitoring, incident handling, and root cause analysis.
- Security & Compliance: Understanding of access control, encryption, patch management, and secure configuration practices.
Good-to-Have
- Windows Administration, Citrix - StoreFront, Delivery Controllers, NetScaler, PAM, XenApp Administration.
- Experience with Nutanix and VMware.
- CI/CD Tools: Exposure to Jenkins, GitHub Actions, GitLab CI, or similar pipelines.
- Containers & Orchestration: Basic knowledge of Docker and Kubernetes.
- Advanced AWS Services: Familiarity with RDS, Lambda, DynamoDB, Route53, or EKS.
- Monitoring & Logging Tools: Experience with tools like Prometheus, Grafana, ELK stack, or CloudTrail.
- Ansible AWX / Tower: Experience with Ansible Tower or AWX for centralized automation.
- Terraform Advanced Usage: Exposure to multi-account setups, workspaces, and complex module design.
- Security Practices: Knowledge of vulnerability scanning, security audits, and compliance requirements.
- ITIL / ITSM Awareness: Experience working with incident, problem, and change management processes.
- Scripting Languages: Basic knowledge of Python for automation tasks.
- Documentation & Knowledge Sharing: Ability to create SOPs, runbooks, and technical documentation.
Responsibilities of the Automation & Configuration Management Engineer / Expectations from the Role
1 Manage and support Linux servers, including installation, patching, monitoring, and troubleshooting.
2 Provision, configure, and maintain AWS infrastructure ensuring availability, security, and scalability.
3 Develop and maintain Infrastructure as Code using Terraform for consistent cloud deployments.
4 Automate system configuration, deployments, and operational tasks using Ansible.
5 Monitor systems and cloud resources; handle incidents, changes, and service requests.
6 Perform root cause analysis and implement preventive actions.
7 Enforce security best practices across OS and cloud environments.
8 Collaborate with cross-functional teams to support application deployments and platform stability.
9 Maintain documentation, SOPs, and runbooks.
10 Manage code/version control through GitHub.
11 Continuously improve automation, reliability, and operational efficiency.
Thanks,
Akhil,
akhil dot macharla at adventglobal dot com