Infrastructure Engineer - Remote / Telecommute

Overview

Remote
On Site
Hybrid
$47 / hr
Contract - W2
Contract - 1 day(s)

Skills

Infrastructure Engineer

Job Details

Job Description:

Primary Monitoring and Incident Response:
  • Monitor systems using Azure Monitor, Splunk, Dynatrace, and custom dashboards.
  • Respond to alerts and triage P1/P2 escalations via ServiceNow war rooms, performing initial diagnosis and remediation where possible.
  • Adhere to incident, change, and exception management processes.
Capacity and Availability Management:
  • Identify scaling opportunities for virtual machines or services as required, and identify zone-redundancy patterns that support performance.
  • Keep track of capacity forecasts and proactively identify performance bottlenecks.
Backup and Restore Operations:
  • Execute frequent backups (Azure Backup, NetApp Snapshots) and perform basic restore tasks to ensure business continuity.
  • Conduct routine backup verifications/tests to confirm data integrity.
Access and Permissions Management:
  • Maintain Azure/NetApp file shares, setting up and adjusting access controls and AD group permissions according to organizational policy.
  • Perform periodic identity and access reviews to ensure the principle of least privilege.
Logging and Metrics Oversight:
  • Oversee monitoring agents (e.g., Splunk, Dynatrace, Azure Alerts, System Pulse), ensuring they are up to date and generating the right alerts and metrics for L2 to act on.
  • Collaborate with L3 to fine-tune alert thresholds and logging when chronic issues emerge.
Basic Performance Testing:
  • Execute routine performance checks (e.g., load or stress tests) in coordination with L3 teams when potential service degradation is suspected.
  • Document and escalate consistent performance anomalies.
Skill Set:
  • Comfortable reading and troubleshooting logs/metrics (Splunk, Dynatrace, Azure Monitor).
  • Familiar with Azure Backup services, basic restore procedures, and file share permissions.
  • Proficient with ticketing systems (ServiceNow) and experienced in collaborating with other technical teams on escalations.
  • Sufficient knowledge to follow runbooks and standard operating procedures (SOPs).
  • Able to keep documentation of standard operating procedures and IaC changes continuously updated in a central repository (e.g., Git repos).
  • Familiarity with Epic implementations (on-prem / cloud).