Overview
On Site
Depends on Experience
Contract - W2
Contract - 2 week(s)
Skills
Linux
Job Details
Server Specialist III
We are seeking an experienced Senior Linux Specialist to join the Linux support team. The ideal candidate will be responsible for troubleshooting, deployment, operational support, and ongoing maintenance of Linux-based systems. As part of the Linux team, the candidate will work closely with business and other IT groups to provide third-level support and technical expertise as needed.
Responsibilities include:
- Support the operation and maintenance of Linux servers: ensure operational availability and performance, conduct health checks, and manage software upgrades, patching (including testing and implementation), system optimization, and administration
- Monitor server health and performance to identify issues, bugs, or potential improvements
- Adhere strictly to change management processes to ensure changes are properly planned, documented, and deployed
- Develop, review, and update operational documentation (SOPs, application checklists, playbooks, etc.)
- Provide after-hours technical support
- Collaborate with the Security Operations Center (SOC) team on process optimization, tool tuning and integration, information sharing, playbook development, and incident response
- Implement automated near real-time monitoring of all tools to ensure proper operation and collection of pertinent data
- Perform incident and problem management, both during and after incidents, including root cause analysis
- Provide application support, issue management, and escalation
- Perform incident investigation, diagnosis, and resolution
- Perform system monitoring and remediation
Qualifications:
The successful candidate will meet the following qualifications:
- 7+ years of experience installing, administering, and maintaining Oracle or Red Hat Linux-based servers
- 5+ years of experience designing and implementing redundant systems including data backups/recoveries, high availability, load balancing, and disaster recovery
- 5+ years of experience designing, analyzing, and repairing large-scale distributed systems
- Experience with deploying and maintaining AWS and on-premises Linux servers
- Experience in application deployment automation, modern DevOps practices, and infrastructure as code
- Experience with IT automation tools such as Ansible Automation Platform, Chef, Puppet, or Terraform
- Knowledge of core IT infrastructure technologies including virtualization, networking, and storage management
- Technical documentation skills
- Comfortable interacting with management at various levels in a professional manner
- Takes ownership of areas of responsibility and makes recommendations and decisions on the improvement and operation of those areas
- High level of organizational skills
- Knowledge of and experience with Security Design and Implementation
- Ability to participate in after-hours technical support
- Knowledge of backup and recovery methods and verification
- Knowledge of EMC PowerMax and Isilon storage, including snapshots
- Excellent written and verbal communication skills
- Ability to work in a fast-paced, schedule-driven, and customer-oriented environment
- Experience with Bash, Perl, and Python scripting
- Experience with LVM including online expansion of file systems
Preferred Qualifications:
- Experience supporting container-based platforms
- Experience with SUSE Manager for patching Linux servers
- Experience with Red Hat Satellite for patching Linux servers
- Experience with Prometheus and Grafana for system performance monitoring
#10847