Job Details
Job Title: IT Infrastructure Monitoring & Tools Specialist
Location: Alpharetta, GA (onsite)
Job Type: Full-time
Interview process: Teams
Role Summary:
We are seeking a skilled IT Infrastructure Monitoring & Tools Specialist to manage and maintain monitoring tools across Windows, Linux, AIX, HP-UX, ESXi, and MPE environments. The successful candidate will install, configure, and ensure the continuous operability of customer-approved monitoring tools, compliance management tools, and discovery agents, and will integrate them with high-availability solutions. The role also includes evaluating and updating the Configuration Management Database (CMDB) using BITSMTS tools as part of the change management process.
Key Responsibilities:
Monitoring & Discovery Tool Installation & Configuration:
- Deploy and configure systems monitoring tools and discovery agents for:
  - Windows Server (physical and virtual, on HP and Dell hardware)
  - Linux (RHEL and Oracle Linux; physical and virtual, on HP and Dell hardware)
  - IBM AIX (IBM Power platform)
  - HP-UX (HPE Integrity platform)
  - VMware ESXi hypervisor
  - MPE (HP3000 emulator running on Linux)
- Ensure tools are continuously operable and provide real-time visibility into infrastructure performance.
- Integrate monitoring solutions with high-availability tools to support business continuity.
Configuration Management & Change Control (BITSMTS Tools):
- Utilize BITSMTS tools to manage change control processes and updates.
- Regularly evaluate and update the Configuration Management Database (CMDB) to reflect accurate system configurations.
- Ensure that all infrastructure changes are documented and comply with IT governance standards.
- Work closely with change management teams to validate and test configuration changes before deployment.
Compliance & SOx Controls:
- Ensure full compliance with SOx regulations and internal IT security policies.
- Audit, manage, and document systems monitoring configurations for compliance reporting.
- Implement automated alerts for non-compliance detection and resolution.
Batch Job Scheduling & Infrastructure Automation:
- Install and configure customer-approved batch job scheduling tools.
- Automate monitoring processes to minimize manual intervention.
- Optimize monitoring and alerting for critical workloads and job scheduling.
Troubleshooting & Performance Tuning:
- Analyze and troubleshoot performance issues across multi-platform environments.
- Work with cross-functional teams to ensure optimal infrastructure health and uptime.
- Conduct root cause analysis (RCA) for system outages and apply corrective actions.
Documentation & Continuous Improvement:
- Develop and maintain detailed documentation for monitoring tools, configurations, and workflows.
- Recommend enhancements to improve infrastructure observability and proactive issue detection.
- Stay current with emerging monitoring technologies and best practices.
Preferred Education, Experience, & Skills:
Bachelor's Degree with 7-8 years of hands-on experience in monitoring tools and infrastructure management.
Qualifications:
- 7+ years of experience in IT infrastructure monitoring, system administration, or IT operations.
- Expertise in installing, configuring, and managing monitoring tools for Windows, Linux, AIX, HP-UX, VMware, and MPE platforms.
- Experience in change management processes and working with BITSMTS tools.
- Knowledge of high-availability integration, system discovery, and batch job scheduling.
- Strong experience maintaining and updating a Configuration Management Database (CMDB).
- Proficiency in troubleshooting, log analysis, and system performance tuning.
- Experience with compliance management (SOx, IT security policies).
- Strong automation skills (PowerShell, Bash, Python, or Ansible preferred).
- Excellent documentation and communication skills.