We are looking for a talented Big Data Engineer with extensive experience in software engineering, data engineering, and leadership to join our dynamic team. The ideal candidate will have hands-on experience with ETL processes using big data technologies such as Apache Spark, Hadoop, and Hive. You will be responsible for developing and optimizing data pipelines, providing technical guidance to junior team members, and delivering innovative data solutions to support business needs.
Key Responsibilities:
- Design, develop, and maintain scalable ETL pipelines using Apache Spark, Hadoop, Hive, and other big data technologies.
- Perform data processing and transformation using PySpark/Python, Scala, and UNIX/shell scripting.
- Manage and optimize data workflows on Hadoop clusters and related infrastructure.
- Develop reports and dashboards in Power BI for data analytics and visualization.
- Collaborate with data scientists and business teams to understand data requirements and deliver solutions.
- Lead and mentor junior team members, providing guidance on best practices and technical solutions.
- Ensure data quality, security, and compliance standards are met.
- Troubleshoot and resolve data pipeline issues promptly.
- Stay updated with emerging big data technologies and recommend improvements.
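The core of the responsibilities above is the extract-transform-load (ETL) pattern. As a rough illustration only (not part of the posting's requirements), the pattern can be sketched in plain Python; in practice these steps would run at scale on Apache Spark via PySpark DataFrames and write to Hive or Parquet rather than in-memory lists:

```python
# Minimal sketch of the extract-transform-load (ETL) pattern.
# In production the same stages would run on Spark/PySpark; the record
# layout ("id,amount") here is purely illustrative.

def extract(rows):
    """Extract: parse raw comma-separated records into dicts."""
    return [dict(zip(("id", "amount"), r.split(","))) for r in rows]

def transform(records):
    """Transform: cast types and drop records with negative amounts."""
    out = []
    for rec in records:
        amount = float(rec["amount"])
        if amount >= 0:
            out.append({"id": int(rec["id"]), "amount": amount})
    return out

def load(records):
    """Load: aggregate as a stand-in for writing to a warehouse table."""
    return sum(r["amount"] for r in records)

raw = ["1,10.5", "2,-3.0", "3,4.5"]
total = load(transform(extract(raw)))
print(total)  # 15.0
```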
Qualifications:
- Proven experience in software engineering and data engineering roles.
- Strong hands-on experience with ETL processes using Apache Spark, Hadoop, Hive, and the surrounding ecosystem.
- Proficiency in PySpark/Python, Scala, and UNIX/shell scripting.
- Experience working with RDBMS and other database systems.
- Solid understanding of big data architecture; cloud platform experience is a plus.
- Experience leading tech teams and providing technical guidance.
- Strong problem-solving and communication skills.
- Experience with Power BI or similar business intelligence tools.
Preferred Skills:
- Knowledge of containerization and orchestration tools like Docker and Kubernetes.
- Familiarity with cloud platforms such as AWS, Azure, or Google Cloud Platform.
- Experience with data modeling and database design.
- Knowledge of data governance and security practices.