Job Details
Location: Dallas, TX (Hybrid; Relocation Assistance Provided)
Duration: Permanent Direct Hire
Compensation: $200K - $325K base salary, plus bonus
Work Requirements: Authorized to Work in the U.S.
Data & Observability Architect
The Data & Observability Architect will define and lead the strategy for collecting, storing, and serving observability data across the engineering organization. This role will ensure consistency, cost-effectiveness, and actionable insights from diverse telemetry sources spanning Data Center, Compute, Network, Storage, Kubernetes, HPC as a Service, Schedulers, and Applications. The architect will establish a taxonomy for observability data, implement tiered storage and retrieval strategies, and tailor insights for different consumer personas (finance, operations, executives, developers).
Responsibilities:
- Define and maintain a unified observability taxonomy across metrics, logs, and traces, and help design a traceable observability platform.
- Design and implement automated ingestion, storage, and retrieval pipelines for large-scale observability data, with tiered retention (hot/warm/cold).
- Architect multi-tenant observability across all infrastructure layers (data center, network, storage, compute, Kubernetes, HPC, applications).
- Establish tech stack standards (e.g., VictoriaMetrics, Loki, Tempo, OpenTelemetry, Coralogix) for different observability signals.
- Help build persona-oriented views for Finance, Operations, Executives, Developers, Platform, and other consumers.
- Build and guide transparency around cost, observability and resiliency of the observability platform.
- Define and enforce data governance for telemetry (label taxonomy, cardinality budgets, PII handling, etc.).
- Partner with Platform, Security, and Solution Architecture teams to ensure observability onboarding integrates with compliance, incident response, and developer workflows.
- Coach engineering teams on OpenTelemetry instrumentation and best practices for emitting metrics/logs/traces.
Required Skills:
- Strong expertise in observability platforms: Prometheus/VictoriaMetrics, Grafana, Loki/ELK, Tempo/Jaeger, OpenTelemetry.
- Experience designing large-scale telemetry pipelines with ingestion, retention, and query optimization.
- Experience reducing MTTD/MTTR by implementing detective, preventive, and proactive monitoring and controls.
- Familiarity with SRE principles: SLIs, SLOs, error budgets, burn-rate alerting.
- Knowledge of data governance in observability contexts (taxonomy, labeling, cardinality control, PII redaction).
- Hands-on skills with data pipelines (Kafka, Fluent Bit, Vector, Airflow) and object storage for archival.
- Strong communication and documentation skills to serve diverse stakeholders (finance, ops, exec, dev).
Preferred Experience:
- 12+ years in SRE, Platform Engineering, or Data Engineering with a focus on observability.
- Proven track record in building enterprise-wide observability strategies for hybrid/on-prem + cloud environments.
- Experience with high-performance computing (HPC) telemetry and schedulers (Slurm, LSF).
- Experience with Resilience and Business Continuity.
- Exposure to multi-cloud observability integration (e.g., AWS CloudWatch, Azure Monitor) alongside on-prem stacks.
- Familiarity with cost modeling and chargeback tied to resource telemetry.
- Prior leadership in designing persona-based observability views for technical and business consumers.
Technology is our focus and quality is our commitment. As a national expert in delivering flexible technology and talent solutions, we strategically align industry and technical expertise with our clients' business objectives and cultural needs. Our solutions are tailored to each client and include a wide variety of professional services, project, and talent solutions. By always striving for excellence and focusing on the human aspect of our business, we work seamlessly with our talent and clients to match the right solutions to the right opportunities. Learn more about us at inspyrsolutions.com.
INSPYR Solutions provides Equal Employment Opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability, or genetics. In addition to federal law requirements, INSPYR Solutions complies with applicable state and local laws governing nondiscrimination in employment in every location in which the company has facilities.
Information collected and processed through your application with INSPYR Solutions (including any job applications you choose to submit) is subject to INSPYR Solutions Privacy Policy and INSPYR Solutions AI and Automated Employment Decision Tool Policy. By submitting an application, you are consenting to being contacted by INSPYR Solutions through phone, email, or text.