Job Details
We are looking for an Enterprise Observability Engineer with 12+ years of experience who can leverage insightful data to inform our systems and solutions. We are seeking an experienced, pipeline-centric data engineer to put that data to good use in building out an ETL and Data Operations framework (data preparation/normalization and ontological processes).
Technical Skills:
Five or more years of experience with Python, SQL, and data visualization/exploration tools
Full-stack observability lead experience with Splunk (preferred) or Datadog, including infrastructure monitoring, application onboarding, and APM
Familiarity with creating out-of-the-box (OOB) dashboards and templates, and experience integrating ITSI (Splunk IT Service Intelligence) to correlate event data for analytics
Communication skills, especially for explaining technical concepts to nontechnical business leaders
Familiarity with the AWS ecosystem, specifically Redshift and RDS
Ability to work on a dynamic, research-oriented team that has concurrent projects
Experience in building or maintaining ETL processes
Experience in the insurance domain
Professional certification
Strong understanding of distributed systems: They need to understand the complexities of modern architectures, including microservices, cloud-native environments, and hybrid infrastructure.
Proficiency in observability tools: They are familiar with tools for logging, metrics, and tracing, such as ELK Stack, Splunk, Prometheus, Grafana, and distributed tracing systems.
Data analysis and visualization skills: They can analyze telemetry data to identify trends and patterns and create visualizations to communicate insights.
Scripting and automation: They can automate tasks and create scripts to manage observability infrastructure.
Experience with cloud platforms such as AWS, Azure, and Google Cloud Platform
Key Responsibilities and Skills:
Work with Splunk and internal teams to create a factory model to onboard applications to Splunk
Use agile software development processes to make iterative improvements to our back-end systems
Model front-end and back-end data sources to help draw a more comprehensive picture of user flows throughout the system and to enable powerful data analysis
Build data pipelines that clean, transform, and aggregate data from disparate sources
Develop and expand PubSub models and scaling event/messaging architectures
Establish and extrapolate Ontological/Semantic standards
Develop models that can be used to make predictions and answer questions for the overall business.
Designing and Implementing Observability Pipelines: Observability engineers create robust pipelines to collect, aggregate, and analyze data from various sources.
Monitoring and Alerting: They establish monitoring systems and alerts to detect anomalies and performance issues in real time.
Metric and Instrumentation Standards: They define common metric standards for every stage of the application lifecycle, along with instrumentation standards and scripting, including alignment with OpenTelemetry (OTel) standards.
Data Analysis and Visualization: They analyze telemetry data (logs, metrics, traces) to gain insights into system behavior and identify trends.
Incident Response: They investigate and troubleshoot incidents, using observability data to understand the root cause and implement solutions.
Collaboration and Communication: They collaborate with development, SRE, and other teams to ensure observability practices are integrated into workflows and to share insights.
Staying Up-to-Date: They stay current with the latest trends in observability, logging, monitoring, and cloud technologies.
Documentation and Knowledge Sharing: They create comprehensive documentation for observability systems and processes and share knowledge with other teams.