Job Details
Job Summary
We are looking for a highly skilled Splunk Subject Matter Expert (SME) and Enterprise Monitoring Engineer to lead the design, implementation, and optimization of our monitoring and observability ecosystem. The ideal candidate will be an expert in Splunk, with a strong background in enterprise IT infrastructure, system performance monitoring, and log analytics. You will play a pivotal role in ensuring end-to-end visibility across systems, applications, and services.
Key Responsibilities
Splunk Administration & Engineering
Serve as the SME for Splunk architecture, deployment, and configuration across the enterprise.
Maintain and optimize Splunk infrastructure (indexers, forwarders, search heads, clusters).
Develop and manage custom dashboards, alerts, saved searches, and visualizations.
Implement and tune log ingestion pipelines using Splunk Universal Forwarders, HTTP Event Collector, etc.
Ensure high availability, scalability, and performance of the Splunk environment.
Apply expertise in SPL (Search Processing Language), dashboard design, advanced searches, parsing, and external lookups.
Monitor and troubleshoot applications using tools such as AppDynamics, Splunk, Grafana, Argos, and OTEL.
Create dashboards to monitor system health and network issues, and configure alerts.
Document runbooks and guidelines for multi-cloud infrastructure and microservices monitoring.
Develop a long-term strategy and roadmap for AI/ML tooling to support observability.
Enterprise Monitoring & Observability
Design and implement holistic enterprise monitoring solutions with Splunk and tools such as AppDynamics, Dynatrace, Prometheus, Grafana, and SolarWinds.
Collaborate with application, infrastructure, and security teams to define monitoring KPIs, SLAs, and thresholds.
Build visibility into application performance, system health, and user experience.
Integrate Splunk with ITSM platforms (e.g., ServiceNow) for automation.
Operations, Troubleshooting & Optimization
Perform data onboarding, parsing, and field extraction.
Support incident response and root cause analysis.
Optimize searches, data retention policies, and index lifecycle management.
Maintain documentation, SOPs, and runbooks.
Required Qualifications
5+ years of experience in IT infrastructure, DevOps, or monitoring roles.
3+ years of hands-on Splunk Enterprise experience (admin, architect, or engineer).
Strong knowledge of SPL, dashboard design, and alerting.
Experience designing and managing large-scale, multi-site Splunk deployments.
Familiarity with Linux, scripting (Bash, Python), and APIs.
Experience with enterprise monitoring tools and Splunk integrations.
Strong understanding of logging, metrics, tracing, and system telemetry.
Solid networking knowledge (DNS, firewall, proxy, SSL/TLS).
Preferred Qualifications
Splunk certifications (Power User, Admin, Architect).
Experience with Splunk ITSI, Enterprise Security, or Observability Suite.
Knowledge of cloud monitoring integrations for AWS, Azure, or Google Cloud Platform.
Experience with compliance-driven logging (PCI, HIPAA, SOX).
Familiarity with CI/CD pipelines and GitOps practices.