Job Details
Position: Data Pipeline Engineer
Location: San Jose, CA (onsite 2x per week); in-person interview required
Duration: 12+ month contract
Requirements: Background Check
LinkedIn profile required
Local candidates only.
Key Skills: experience building multiple data pipelines; Airflow; Kafka; Python (PySpark); cloud experience. Must have experience working with large-scale data warehouses (multiple TBs).
Description:
We're looking for a Data Pipeline Engineer with deep experience building and orchestrating large-scale ingestion pipelines. This role is ideal for someone who enjoys working across high-volume telemetry sources, optimizing data workflows, and solving schema-drift challenges in real-world distributed environments.
You'll be part of the Security Data Platform and ML Engineering team, helping to onboard and normalize the security data that powers analytics, detection, and ML workflows across the business unit.
Key Responsibilities:
- Design and build scalable batch and streaming data pipelines for ingesting telemetry, log, and event data
- Develop and maintain orchestration workflows using schedulers such as Apache Airflow (a minimal sketch follows this list)
- Onboard new data sources, build connectors (API/Kafka/file-based), and normalize security-related datasets
- Monitor and manage schema drift across changing source systems and formats
- Implement pipeline observability: logging, metrics, and alerts for health and performance
- Optimize ingestion for performance, resilience, and cost-efficiency
- Collaborate across detection, threat intel, and platform teams to align ingestion with security use cases
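To illustrate the kind of orchestration work involved, here is a minimal sketch assuming Airflow 2.4+; the DAG name, schedule, and the ingest/normalize callables are hypothetical placeholders, not part of any actual codebase.

```python
# Minimal sketch, assuming Airflow 2.4+; dag_id, schedule, and the
# ingest/normalize callables are hypothetical placeholders.
import logging
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator

log = logging.getLogger(__name__)

def ingest_events(**context):
    # Placeholder batch step: pull a telemetry window (API/Kafka/file)
    # and land raw records in object storage.
    log.info("Ingesting telemetry batch for %s", context["ds"])

def normalize_events(**context):
    # Placeholder normalization step: map raw events onto a common
    # security schema (e.g., OCSF).
    log.info("Normalizing events for %s", context["ds"])

with DAG(
    dag_id="security_telemetry_ingest",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@hourly",  # "schedule" argument requires Airflow 2.4+
    catchup=False,
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    ingest = PythonOperator(task_id="ingest_events", python_callable=ingest_events)
    normalize = PythonOperator(task_id="normalize_events", python_callable=normalize_events)
    ingest >> normalize
```

Retries and task-level logging, as shown here, are one common way the "observability and resilience" responsibilities above surface in day-to-day DAG code.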
Required Qualifications:
- 5+ years of experience in data engineering or infrastructure roles focused on pipeline development
- Strong experience with Python and distributed data processing tools such as Apache Spark (e.g., via PySpark)
- Hands-on experience with orchestration frameworks such as Apache Airflow or Dagster
- Deep understanding of ingestion best practices, schema evolution, and drift handling (see the sketch after this list)
- Experience working with Kafka, S3, or cloud-native storage and messaging systems
- Experience in cloud environments (AWS, Azure, or Google Cloud Platform)
- Bonus: familiarity with security tools (e.g., CrowdStrike, Wiz), OCSF, or compliance-related data
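As a rough illustration of the drift handling mentioned above, the sketch below reads Parquet drops with schema merging and appends into a Delta table with schema evolution enabled; it assumes a Spark session configured with the delta-spark package, and the bucket paths are hypothetical.

```python
# Minimal drift-handling sketch, assuming PySpark with the delta-spark
# package configured; all paths are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("drift-tolerant-ingest").getOrCreate()

# mergeSchema reconciles Parquet files whose columns have drifted
# (added or reordered fields) into one superset schema at read time.
raw = (
    spark.read.option("mergeSchema", "true")
    .parquet("s3://example-bucket/raw/telemetry/")
)

# Delta's mergeSchema write option lets newly added source columns
# extend the target table instead of failing the append.
(
    raw.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .save("s3://example-bucket/normalized/telemetry/")
)
```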