Role: Data Engineer with Spark & Scala (on our W2)
Location - Bentonville, AR
Onsite Requirement (Y/N) - Y
Number of days onsite - 5 days
If Onsite - Office Address -
Mandatory Areas
Must-Have Skills - Data Engineer with Scala
Skill 1 - Scala, Spark, Python, SQL, Big Data, Hadoop
Google Cloud Platform data tools: BigQuery, Dataproc, Vertex AI, Pub/Sub, Cloud Functions
Skill 2 - PySpark, Python, SparkSQL, and data modeling
Only W2 candidates will be considered; no Glider assessment is needed for this role.
We are seeking a Data Engineer with Spark, Scala, and streaming skills who builds real-time, scalable data pipelines using tools like Spark, Kafka, and cloud services (Google Cloud Platform) to ingest, transform, and deliver data for analytics and ML.
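For context, a pipeline of this kind often takes the shape of the minimal Scala sketch below: a Spark Structured Streaming job that ingests JSON events from a Kafka topic, aggregates them over event-time windows, and writes the results to a sink. This is an illustration only, not this team's actual code; the broker address, topic name, schema, and paths are all hypothetical.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object EventPipeline {
  def main(args: Array[String]): Unit = {
    // Requires the spark-sql-kafka connector on the classpath.
    val spark = SparkSession.builder().appName("kafka-event-pipeline").getOrCreate()
    import spark.implicits._

    // Hypothetical event schema, for illustration only.
    val eventSchema = new StructType()
      .add("userId", StringType)
      .add("action", StringType)
      .add("ts", TimestampType)

    // Ingest: read JSON events from a Kafka topic (placeholder broker/topic).
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "user-events")
      .load()
      .select(from_json($"value".cast("string"), eventSchema).as("e"))
      .select("e.*")

    // Transform: count actions per user over 5-minute event-time windows,
    // tolerating up to 10 minutes of late-arriving data.
    val counts = events
      .withWatermark("ts", "10 minutes")
      .groupBy(window($"ts", "5 minutes"), $"userId", $"action")
      .count()

    // Deliver: write to a sink (Parquet here; placeholder paths).
    counts.writeStream
      .format("parquet")
      .option("path", "/data/out/user_action_counts")
      .option("checkpointLocation", "/data/chk/user_events")
      .outputMode("append")
      .start()
      .awaitTermination()
  }
}

On Google Cloud Platform, a job like this would typically run on Dataproc, with BigQuery or Pub/Sub taking the place of the Parquet sink.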
Responsibilities:
As a Senior Data Engineer, you will:
Design, develop, and maintain ETL/ELT data pipelines for batch and real-time data ingestion, transformation, and loading using Spark (PySpark/Scala) and streaming technologies (Kafka, Flink).
Build and optimize scalable data architectures, including data lakes, data warehouses (BigQuery), and streaming platforms.
Performance Tuning: Optimize Spark jobs, SQL queries, and data processing workflows for speed, efficiency, and cost-effectiveness.
Data Quality: Implement data quality checks, monitoring, and alerting systems to ensure data accuracy and consistency (a minimal example follows this list).
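To make the data quality responsibility concrete, here is a minimal Scala sketch of the kind of checks such a pipeline might run before publishing a table. The table name, column names, and thresholds are hypothetical, and in practice these checks would feed a monitoring/alerting system rather than only failing the job.

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, count}

object QualityChecks {
  // Fail fast when the null rate in a key column exceeds a threshold.
  def checkNullRate(df: DataFrame, column: String, maxRate: Double): Unit = {
    val total = df.count()
    val nulls = df.filter(col(column).isNull).count()
    val rate = if (total == 0) 0.0 else nulls.toDouble / total
    require(rate <= maxRate, s"Null rate for $column is $rate (limit $maxRate)")
  }

  // Fail fast when a supposed primary-key column contains duplicates.
  def checkUnique(df: DataFrame, column: String): Unit = {
    val dupes = df.groupBy(col(column))
      .agg(count("*").as("n"))
      .filter(col("n") > 1)
      .count()
    require(dupes == 0, s"Found $dupes duplicated values in $column")
  }
}

// Hypothetical usage before publishing a table:
//   QualityChecks.checkNullRate(orders, "order_id", 0.001)
//   QualityChecks.checkUnique(orders, "order_id")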
Required Skills & Qualifications:
Programming: Strong proficiency in Python, SQL, and Scala or Java.
Big Data: Expertise in Apache Spark (Spark SQL, DataFrames, Streaming); a short illustrative sketch follows this list.
Streaming: Experience with messaging systems such as Apache Kafka or Pub/Sub.
Cloud: Familiarity with Google Cloud Platform and Azure data services.
Databases: Knowledge of data warehousing (Snowflake, Redshift) and NoSQL databases.
Tools: Experience with Airflow, Databricks, Docker, and Kubernetes is a plus.
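As a small illustration of the Spark SQL and DataFrame expertise above, and of a common performance-tuning technique from the Responsibilities section, the sketch below joins a large fact table to a small dimension table with a broadcast hint (avoiding a shuffle of the large side), then aggregates with Spark SQL. The table paths and column names are hypothetical.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object SalesReport {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("sales-report").getOrCreate()

    // Hypothetical inputs: a large fact table and a small dimension table.
    val sales  = spark.read.parquet("/data/sales")
    val stores = spark.read.parquet("/data/stores")

    // DataFrame API: broadcast the small table so the large one is not shuffled.
    val enriched = sales.join(broadcast(stores), Seq("store_id"))
    enriched.createOrReplaceTempView("sales_enriched")

    // Spark SQL: the same engine, expressed as a query.
    val report = spark.sql(
      """SELECT region, SUM(amount) AS total_sales
        |FROM sales_enriched
        |GROUP BY region""".stripMargin)

    report.write.mode("overwrite").parquet("/data/out/sales_by_region")
    spark.stop()
  }
}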
Experience Level:
Total IT Experience - Minimum 8 years
Google Cloud Platform - 4+ years of recent Google Cloud Platform experience