Job Details
Data Engineer - Irvine, CA or Los Angeles, CA (Hybrid)
Job Summary:
We are seeking a highly skilled and motivated Data Engineer with hands-on experience in Python, PySpark, Apache Airflow, AWS services, and database management. The ideal candidate will design, develop, and maintain scalable data pipelines and infrastructure for large-scale data processing and analytics.
Key Responsibilities:
Design and implement scalable ETL/ELT pipelines using PySpark and Airflow
Develop and maintain robust Python scripts and automation tools
Work with structured and unstructured data across AWS services such as S3, Glue, Lambda, EMR, and Redshift
Design efficient data models and manage databases (SQL and NoSQL) for analytics and reporting
Optimize the performance of big data processing workflows and ensure data quality and reliability
Collaborate with data scientists, analysts, and other engineering teams to support data needs
Monitor and troubleshoot data pipelines, ensuring smooth operations and minimal downtime
Implement best practices for code versioning, testing, and CI/CD in data engineering workflows