Data Engineer PySpark & Scala
Location: Irving, TX / Richardson, TX
Full-time | Onsite
Job Description
We are seeking an experienced, highly motivated Data Engineer with deep expertise in PySpark and modern data engineering practices. This role involves designing, developing, and optimizing big data pipelines and solutions in a cloud-native environment. Ideal candidates will have strong analytical skills, proficiency in distributed data processing, and a solid understanding of data architecture and governance.
Required Qualifications
Bachelor's degree in Computer Science, Engineering, or a related field (or foreign equivalent)
In lieu of each year of education, three years of relevant experience may be considered
4+ years of professional experience in Information Technology
3+ years of hands-on experience with PySpark and Scala in large-scale data environments
Experience designing and developing data pipelines and ETL/ELT processes using Spark (PySpark & Scala)
Proficient in SQL for data transformation, analysis, and optimization
Solid understanding of distributed data processing and performance tuning in Spark
Experience with cloud platforms (e.g., AWS, Azure, or Google Cloud Platform) and associated data services
Preferred Qualifications
Experience with Apache Airflow or other workflow orchestration tools
Strong understanding of data warehousing principles and modern data lake/lakehouse architectures
Familiarity with CI/CD, Git, and DevOps practices in data engineering projects
Hands-on experience with data quality, validation frameworks, and data lineage tools
Exposure to Kafka, Delta Lake, or other real-time/batch processing frameworks
Knowledge of data security, access control, and governance best practices
Excellent problem-solving and communication skills
Ability to work in a global delivery model and collaborate with cross-functional teams