Data Engineer


Prabhav Services Inc
Job Details
Skills
- Programming
- Databases
- ETL Tools
Summary
W2 profile only
Location: Whippany, NJ (Hybrid)
Interview Mode: In-person interview required
Job Description:
· Hands‑on senior engineer, working directly with developers on design and implementation of modernization initiatives.
· Strong data engineer with more than 8 years of experience.
· Strong hands-on experience in Python.
· Strong hands-on experience in PySpark and stream processing with Kafka.
· Lead containerization and cloud onboarding of services.
· Drive adoption of GitLab CI/CD, M1 pipelines, and release automation.
· Champion modern testing practices.
· Drive Kafka adoption for event-driven design standards.
Role Overview
Pipeline Development: Build and maintain ETL/ELT pipelines for ingesting and transforming data.
Data Warehousing: Design and manage data warehouses and lakes (Snowflake, BigQuery, Redshift).
Big Data Processing: Optimize large-scale data workflows using Apache Spark or Hadoop.
Data Governance: Ensure data quality, lineage, and compliance with regulations.
Workflow Orchestration: Use Airflow or similar tools to schedule and monitor pipelines.
Integration: Connect APIs, databases, and streaming sources (Kafka).
Collaboration: Partner with analysts, data scientists, and business teams to deliver usable datasets.
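For candidates gauging the pipeline work described above, the following is a minimal sketch of an extract-transform-load flow using only the Python standard library. The sample data, the `orders` table, and the function names are illustrative assumptions and are not taken from the employer's systems; a production pipeline would more likely use PySpark, Airflow, or a cloud warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical sample input; a real pipeline would read from files, APIs, or Kafka.
RAW_CSV = """order_id,customer,amount
1,alice,19.99
2,bob,
3,carol,42.50
"""


def extract(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows with a missing amount and cast the remaining fields to typed tuples."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records instead of loading bad data
        cleaned.append(
            (int(row["order_id"]), row["customer"].title(), float(row["amount"]))
        )
    return cleaned


def load(conn, records):
    """Create the target table if needed and upsert the records by order_id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(conn, text=RAW_CSV):
    """Run extract, transform, and load in order; return the number of rows loaded."""
    records = transform(extract(text))
    load(conn, records)
    return len(records)
```

Calling `run_pipeline(sqlite3.connect(":memory:"))` loads the two complete rows and skips the one with no amount. Using `INSERT OR REPLACE` keyed on `order_id` keeps reruns idempotent, which matters when an orchestrator retries a failed task.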
Required Skills
| Skill Area | Key Tools/Technologies | Why It Matters |
|---|---|---|
| Programming | Python, SQL, Scala, Java | Core for building pipelines and transformations |
| Databases | MySQL, PostgreSQL, MongoDB, Cassandra | Supports structured and unstructured data |
| Big Data | Apache Spark, Hadoop, Kafka | Enables processing of massive datasets |
| ETL Tools | Airflow, dbt, Talend, Informatica | Automates and manages workflows |
| Cloud Platforms | AWS (Glue, Redshift, S3), Azure (Synapse, Data Lake), Google Cloud Platform (BigQuery) | Provides scalability and cost efficiency |
| Data Modeling | Star/Snowflake schemas, partitioning | Ensures optimized storage and query performance |
| Security | Role-based access, encryption | Critical for compliance and governance |
Data Quality Issues: Poor validation can lead to unreliable analytics.
Pipeline Failures: Inadequate monitoring may cause downtime and data loss.
Cost Overruns: Inefficient queries or storage can inflate cloud costs.
Compliance Risks: Missing GDPR/DPDP controls can lead to legal exposure.
Best Practices
Automate pipeline monitoring with Airflow/Kafka.
Use data profiling before ingestion to detect anomalies.
Implement partitioning and indexing for performance.
Collaborate closely with data science teams to align schema design.
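As one illustration of the profiling practice listed above, here is a small sketch that summarizes a column and flags numeric outliers before data is ingested. The function names and the z-score threshold are assumptions chosen for the example; a team would typically use a dedicated profiling or data quality tool instead.

```python
from statistics import mean, pstdev


def _numeric(values):
    """Keep only int/float values, excluding None and booleans."""
    return [
        v for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]


def profile_column(values):
    """Summarize one column: row count, null count, distinct count, numeric stats."""
    non_null = [v for v in values if v is not None]
    summary = {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
    }
    numeric = _numeric(non_null)
    if numeric:
        summary["mean"] = mean(numeric)
        summary["stdev"] = pstdev(numeric)
    return summary


def find_outliers(values, z_threshold=3.0):
    """Return numeric values whose z-score exceeds z_threshold (hypothetical default)."""
    numeric = _numeric(values)
    if len(numeric) < 2:
        return []
    mu, sigma = mean(numeric), pstdev(numeric)
    if sigma == 0:
        return []  # all values identical, nothing to flag
    return [v for v in numeric if abs(v - mu) / sigma > z_threshold]
```

For a column such as `[10, 12, 11, None, 13, 9, 11, 10, 12, 500]`, `profile_column` reports one null, and `find_outliers(..., z_threshold=2.0)` flags the value 500. With very small samples a single extreme value cannot reach a z-score of 3, so the threshold needs tuning to the batch size.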
- Dice Id: 10371609
- Position Id: 9001896
- Posted 12 hours ago
Company Info
About Prabhav Services Inc
High ROI
Many companies find that constant maintenance eats into their budget for new technology. By outsourcing your IT management to us, you can focus on what you do best--running your business.
Satisfaction Guaranteed
Our goal is to provide an experience tailored to your company's needs. Whatever the budget, we pride ourselves on providing professional customer service.
Technical Experience
We are well-versed in a variety of operating systems, networks, and databases. We use this expertise to help our customers with a variety of small to mid-sized projects.