Job Details
Senior Big Data Engineer - Data Infrastructure
This Senior Big Data Engineer role is critical to the organization's mission of delivering dependable, high-performance smart home and networking products globally. The position is situated within a dedicated engineering team responsible for architecting and optimizing the foundational data infrastructure. You'll drive the development of terabyte-scale data pipelines on AWS, ensuring high performance, fault tolerance, and security for products used by customers in over 170 countries. This role requires technical leadership, deep expertise in distributed systems, and a commitment to integrating emerging data technologies to foster a culture of operational excellence.
Key Responsibilities
- Design and Build Scalable Data Pipelines: Develop and maintain high-performance, large-scale data ingestion and transformation workflows, including ETL/ELT processes, data de-identification, and security management.
- Data Orchestration and Automation: Develop and manage automated data workflows using tools like Apache Airflow to schedule pipelines, manage dependencies, and ensure reliable, timely data processing and availability (a minimal sketch of this kind of workflow follows this list).
- AWS Integration and Cloud Expertise: Build data pipelines integrated with AWS cloud-native storage and compute services, leveraging scalable cloud infrastructure for data processing.
- Monitoring and Data Quality: Implement comprehensive monitoring, logging, and alerting to ensure high availability, fault tolerance, and data quality through self-healing strategies and robust data validation processes.
- Technology Innovation: Stay current with emerging big data technologies and industry trends, recommending and implementing new tools and approaches to continuously improve data infrastructure.
- Technical Leadership: Provide technical leadership for data infrastructure teams and guide architecture decisions and system design best practices. Mentor junior engineers through code reviews and knowledge sharing, lead complex projects from concept to production, and help foster a culture of operational excellence.
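To illustrate the orchestration and automation responsibility above, here is a minimal Apache Airflow DAG sketch. The DAG name, schedule, and the ingest/transform/validate callables are assumptions made for this example, not details taken from the role.

```python
# Hypothetical sketch only: a minimal Airflow DAG of the kind the
# orchestration responsibility describes. Assumes Airflow 2.4+,
# where the `schedule` argument replaces `schedule_interval`.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest_raw_events(**context):
    # Placeholder: pull one day's partition of raw events from storage.
    print("ingesting raw events for", context["ds"])


def transform_events(**context):
    # Placeholder: de-identify and reshape the ingested partition.
    print("transforming events for", context["ds"])


def validate_output(**context):
    # Placeholder: row-count and schema checks before publishing downstream.
    print("validating output for", context["ds"])


with DAG(
    dag_id="daily_event_pipeline",  # assumed name, for illustration only
    start_date=datetime(2024, 1, 1),
    schedule="@daily",              # run once per day
    catchup=False,
    default_args={
        "retries": 2,                          # basic self-healing: retry failed tasks
        "retry_delay": timedelta(minutes=10),
    },
) as dag:
    ingest = PythonOperator(task_id="ingest_raw_events", python_callable=ingest_raw_events)
    transform = PythonOperator(task_id="transform_events", python_callable=transform_events)
    validate = PythonOperator(task_id="validate_output", python_callable=validate_output)

    # Dependency management: transform waits for ingest, validation waits for transform.
    ingest >> transform >> validate
```

The explicit task dependencies and retry settings are the kind of scheduling, dependency management, and reliability work the bullet refers to.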
Required Qualifications
- Experience requirements: 5+ years in data engineering, software engineering, or data infrastructure with proven experience building and operating large-scale data pipelines and distributed systems in production, including terabyte-scale big data environments.
- Programming Proficiency: Strong Python skills for building data pipelines and processing jobs. Experience with Git version control and collaborative development workflows is required.
- Distributed Systems Expertise: Deep knowledge of distributed systems and parallel processing concepts. Must be proficient in debugging and performance tuning large-scale data systems, with a strong understanding of data partitioning, sharding, and fault tolerance.
- Big Data Frameworks: Strong proficiency in big data processing frameworks such as Apache Spark for batch processing (see the sketch after this list).
- Database and Data Warehouse Expertise: Strong understanding of relational database concepts and data warehouse principles.
- Workflow Orchestration: Hands-on experience with data workflow orchestration tools like Apache Airflow or AWS Step Functions.
- Soft Skills: Excellent problem-solving skills, strong attention to detail, and ability to work effectively in collaborative team environments.
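As a rough illustration of the Python and Spark proficiency listed above, the following is a minimal PySpark batch-processing sketch. The bucket paths, column names, and hashing-based de-identification step are illustrative assumptions, not specifics from this posting.

```python
# Hypothetical sketch only: a small PySpark batch job that reads raw events,
# de-identifies a user identifier, and writes a curated, partitioned dataset.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events_batch_transform").getOrCreate()

# Read a partitioned Parquet dataset (placeholder path; an S3 path would
# also require the hadoop-aws connector to be configured).
events = spark.read.parquet("s3a://example-bucket/raw/events/")

cleaned = (
    events
    .dropDuplicates(["event_id"])                           # basic data-quality step
    .withColumn("user_id", F.sha2(F.col("user_id"), 256))   # de-identify a direct identifier
    .withColumn("event_date", F.to_date("event_ts"))
)

# Write back partitioned by date for downstream warehouse loads.
(
    cleaned.write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3a://example-bucket/curated/events/")
)

spark.stop()
```

Partitioning the output by date is one common way to keep downstream loads and reprocessing efficient at terabyte scale.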
Preferred Qualifications
- Advanced Degree: Master's degree in Computer Science or a related field.
- Modern Data Technology: Exposure to agentic AI patterns or knowledge base systems. Experience with real-time stream processing frameworks like Apache Kafka or Apache Flink is a plus.
- Advanced Database and Data Warehouse Expertise: Familiarity with diverse database technologies (NoSQL, NewSQL, etc.) and experience in modern data warehouses (Redshift, BigQuery, Databricks, Snowflake).
- Additional Programming Languages: Proficiency in Java or Scala is a plus.
- Cloud and Infrastructure Expertise: Experience with AWS cloud platforms and hands-on skills in infrastructure as code (Terraform) and container orchestration (Docker/Kubernetes).
location: Irvine, California
job type: Permanent
salary: $180,000 - $200,000 per year
work hours: 8am to 5pm
education: Master's
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.
At Randstad Digital, we welcome people of all abilities and want to ensure that our hiring and interview process meets the needs of all applicants. If you require a reasonable accommodation to make your application or interview experience a great one, please contact
Pay offered to a successful candidate will be based on several factors including the candidate's education, work experience, work location, specific job duties, certifications, etc. In addition, Randstad Digital offers a comprehensive benefits package, including: medical, prescription, dental, vision, AD&D, and life insurance offerings, short-term disability, and a 401K plan (all benefits are based on eligibility).
This posting is open for thirty (30) days.
Qualified applicants in San Francisco with criminal histories will be considered for employment in accordance with the San Francisco Fair Chance Ordinance.
Qualified applicants with arrest or conviction records will be considered for employment in accordance with the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act.
We will consider for employment all qualified Applicants, including those with criminal histories, in a manner consistent with the requirements of applicable state and local laws, including the City of Los Angeles' Fair Chance Initiative for Hiring Ordinance.