Linux Engineer / Automation (Houston)

Overview

On Site
Depends on Experience
Full Time

Skills

Ansible
Bash
EMC
Linux
Linux administration
Python
automation tools
PowerShell
change management
Red Hat

Job Details

We are seeking an experienced Linux Specialist to join a large midstream company in Houston, TX. The ideal candidate will have strong Linux administration, Bash scripting, and automation experience (using Bash, PowerShell, Python, or Perl). Experience working with Ansible is a plus.

What you'll do

This role is responsible for troubleshooting, deployment, operational support, and ongoing maintenance of Linux-based systems. As part of the Linux team, the candidate will work closely with business and other IT groups to provide third-level support and technical expertise as needed.
Responsibilities include:
Support the operation and maintenance of Linux servers, ensuring operational availability & performance, conducting health checks, managing software upgrades, patching (including testing and implementation), system optimization and administration.
Monitor server health and performance to identify issues, bugs, or potential improvements
Strict adherence to change management processes to ensure changes are properly planned, documented, and deployed
Develop, review, and update operational documentation (SOPs, application checklists, playbooks, etc.)
Provide after-hours on-call technical support
Collaborate with the Security Operations Center (SOC) team for process optimization, tool tuning & integration, information sharing, playbook development and incident response
Implement automated near real-time monitoring of all tools to ensure proper operation and collection of pertinent data
Incident and problem management, both during and after incidents, including root cause analysis
Application support, issue management and escalation
Perform incident investigation, diagnosis, and resolution
Perform system monitoring and remediation

What you'll need

The successful candidate will meet the following qualifications:
10+ years of experience installing, administering, and maintaining Oracle or Red Hat Linux-based servers
5+ years of experience designing and implementing redundant systems including data backups/recoveries, high availability, load balancing, and disaster recovery
5+ years of experience designing, analyzing, and repairing large-scale distributed systems
Experience with deploying and maintaining AWS and on-premises Linux servers
Experience in application deployment automation, modern DevOps practices, and infrastructure as code
Experience with IT automation tools such as Ansible Automation Platform, Chef, Puppet, or Terraform
Knowledge of core IT infrastructure technologies including virtualization, networking, and storage management
Technical documentation skills
Comfortable interacting with management at various levels in a professional manner
Takes ownership of areas of responsibility and makes recommendations and decisions on the improvement and operation of those areas
High level of organizational skills
Knowledge of and experience with Security Design and Implementation
Ability to participate in after-hours on-call rotation
Knowledge of backup and recovery methods and verification
Knowledge of EMC PowerMax and Isilon storage, including snapshots
Excellent written and verbal communication skills
Ability to work in a fast-paced, schedule-driven, and customer-oriented environment
Experience with Bash, Perl, and Python scripting
Experience with LVM including online expansion of file systems
Preferred Qualifications:
Experience supporting container-based platforms
SUSE Manager for patching Linux servers
Red Hat Satellite for patching Linux servers
Prometheus and Grafana for system performance monitoring