Observability Specialist
Chicago, IL, US • Posted 3 days ago • Updated 3 days ago

K-Tek Resourcing LLC
Job Details
Skills
- Amazon Web Services
- Analytical Skill
- Benchmarking
- Cloud Computing
- Collaboration
- Database
- Communication
- Conflict Resolution
- Good Clinical Practice
- Google Cloud Platform
- Continuous Improvement
- Customer Engineering
- Management
- Microsoft Azure
- Optimization
- Performance Monitoring
- Problem Solving
- Recovery
- Regulatory Compliance
- Docker
- FOCUS
- Incident Management
- Root Cause Analysis
- Service Level
- Kubernetes
- Machine Learning (ML)
- Reliability Engineering
- Roadmaps
- KPI
- Observability
- OpenShift
- RCA
- AWS
- Azure
- GCP
- Service Level Objectives
- SLOs
Summary
Role: Observability Specialist
Location: New Jersey / Chicago, IL (Onsite)
Mode: Contract
Work Authorization: USC-EAD/-EAD/L2-EAD/TN/E3
Job Description:
We are seeking an experienced, highly skilled Observability Specialist. In this critical role, you will help ensure the reliability, performance, and efficiency of our systems by implementing and managing modern observability practices and tooling. You will use your deep technical expertise to provide actionable insights, proactively identify potential issues, and drive continuous improvement across our infrastructure and applications. A key part of the role is leading technical discussions with the customer on products such as Datadog, defining a roadmap suited to the customer's landscape, and engaging in new-feature discussions initiated by the vendor or the customer.
Role Summary
As an Observability Specialist, you will be a key contributor to our reliability initiatives, with a specific focus on designing, implementing, and maintaining comprehensive observability solutions. You will work closely with customer, engineering, and product teams to establish robust monitoring, alerting, and analysis capabilities. Your expertise will be crucial in enabling proactive problem detection, accelerating incident resolution, and optimizing system performance.
Key Responsibilities
- Design and implement a comprehensive framework for the observability roadmap.
- Establish automated recovery mechanisms for common failure scenarios.
- Develop and enforce reliable monitoring solutions.
- Create technical standards for resilient monitoring solutions and approaches.
- Participate in Root Cause Analysis (RCA) and postmortem processes.
- Develop frameworks for correlating system failures.
- Design, implement, and manage end-to-end observability solutions encompassing metrics, logs, and traces across our infrastructure and applications.
- Evaluate, deploy, and maintain tools for monitoring, logging, tracing, alerting, and automation.
- Define intelligent alerting rules and escalation policies to ensure timely and effective incident response.
- Lead system performance benchmarking and optimization initiatives, leveraging observability data to identify bottlenecks and areas for improvement.
- Analyze observability data to identify trends, anomalies, and potential risks. Generate actionable insights and reports to improve system reliability and performance.
Qualifications
- Significant experience as an Observability Specialist or in a similar role, with a strong focus on observability and monitoring.
- Deep understanding of observability principles and best practices (metrics, logging, tracing).
- Experience implementing and managing centralized logging and monitoring systems.
- Experience with cloud platforms (e.g., AWS, Azure, Google Cloud Platform) and containerization technologies (e.g., Docker, Kubernetes, OpenShift).
- Familiarity with AIOps and ML-based anomaly detection systems is a plus.
- Background in database performance monitoring and optimization.
- Knowledge of Service Level Objectives (SLOs) and KPI implementation.
- Experience participating in Root Cause Analysis (RCA) and postmortem processes.
- Understanding of compliance requirements related to monitoring and logging.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration skills.
Dice Id: 10411276
Position Id: 8862328
Company Info
Vision
To be a trusted partner and advisor to our customers
Mission
At K-Tek, we believe in understanding the specific needs of each customer and tailoring innovative solutions to meet them. We invest in our employees and customers. We build a relationship of trust with our customers through empathy, solutions, and being right the first time.
Who We Are & What We Do
K-Tek Resourcing is a consulting organization with offices in Houston, TX and St. Paul, MN. It is supported by two global delivery centers located in India. With a global workforce of over 250 employees, K-Tek has been supporting its clients for over 9 years. We have consistently achieved 30% year-on-year growth. We have extensive experience in domains including BFSI, Retail, Healthcare and Pharma, Oil & Gas, Travel & Hospitality, and Insurance. The technologies we service are IT Infrastructure, Mobile Technologies, and Cloud & Big Data Solutions. We understand the needs of our customers and provide them with customized solutions and resources, guided by the tenet of being "First Time Right".
Values
- Commitment to our customers' success through Integrity
- Excellence through Quality
- Growth through customer value creation
