Job Details
Job Title: PySpark with Scala Developer
Location: Irving, TX / Richardson, TX
Job Type: Full-Time Permanent

Job Summary:
We are seeking an experienced and highly motivated PySpark with Scala Developer to join our Big Data engineering team. The ideal candidate will have strong experience building scalable data processing pipelines using Apache Spark, with expertise in both PySpark and Scala. This role requires strong problem-solving skills, attention to detail, and the ability to work collaboratively in a fast-paced, data-driven environment.

Key Responsibilities:
- Design, develop, and optimize large-scale data processing pipelines using Apache Spark, with a focus on PySpark and Scala.
- Build and maintain reliable data ingestion, transformation, and validation workflows.
- Integrate data from various sources, including files, databases, APIs, and streaming systems.
- Write efficient and reusable code for ETL and data analytics use cases.
- Collaborate with data scientists, data engineers, and business teams to understand requirements and deliver data solutions.
- Tune, debug, and optimize the performance of Spark jobs.
- Implement data quality, lineage, and governance practices.
- Develop unit tests and support automated deployment pipelines (CI/CD).
- Work with large datasets in both batch and real-time environments on cloud or on-premises platforms.

Required Skills and Experience:
- 5+ years of experience in Big Data or data engineering roles.
- Strong hands-on experience with Apache Spark, using both PySpark and Scala.
- Deep understanding of Spark internals, including RDDs, DataFrames, Datasets, and Spark SQL.
- Proficiency in data modeling, data wrangling, and data transformations.
- Experience with the Hadoop ecosystem, including HDFS, Hive, or HBase.
- Proficiency in querying with SQL and working with structured and unstructured data.
- Experience working with Git, CI/CD tools, and Agile methodologies.
- Familiarity with cloud platforms such as AWS, Azure, or Google Cloud Platform (e.g., S3, EMR, Databricks) is a plus.
- Ability to write clean, efficient, and maintainable code.
- Strong communication and interpersonal skills.