Role: Data Engineer
Location: Pittsburgh, PA (Day 1 onsite; candidate needs to work 5 days at the client office)
Duration: 12+ months
| Level | Skill | Core Concepts | Total Years of Experience |
| --- | --- | --- | --- |
| Expert | Databricks and Spark Pipeline Development | Databricks, Apache Spark, Delta Lake, Data Engineering Pipelines, Notebook Orchestration | 8-10 |
| Expert | Python & PySpark Development | PySpark, Python Libraries, Lambda Functions, Error Handling, Modular Code Design | 8+ |
| Proficient | AWS Cloud Data Integration | AWS Glue, Athena, Redshift, S3, Kafka, Elasticsearch, RDS, Lambda | 5+ |
| Proficient | CI/CD and Source Control | Git, Jenkins, Build Automation, Branching Strategy, Code Reviews | |
| Proficient | ETL Tools and Methodologies | Informatica, ETL Concepts, Functional Design, Technical Specs, Data Mapping | |
| Proficient | Database Querying and Optimization | Oracle, Redshift, MongoDB, DynamoDB, SQL Optimization, Indexes, UDFs, Views | |
Job Description:
- Minimum of 8-10 years of professional IT experience.
- Experience with Databricks, Data/Delta Lake, and relational databases such as Oracle and AWS Redshift.
- Extensive experience in Databricks/Spark-based data engineering pipeline development.
- 8+ years of working experience in Python-based data integration and pipeline development.
- Data lake and Delta Lake experience with AWS Glue and Athena.
- 5+ years of experience with AWS Cloud data integration across the Apache Spark, Glue, Kafka, Elasticsearch, Lambda, S3, Redshift, RDS, and MongoDB/DynamoDB ecosystems.
- Strong hands-on experience in Python development, especially PySpark in an AWS Cloud environment.
- Design, develop, test, deploy, maintain, and improve data integration pipelines (a minimal sketch of this kind of pipeline follows this list).
- Experience with Python and common Python libraries.
- Lead the engineering team to drive project initiatives; flexibility to work in an onsite/offshore model with India and China teams.
- Strong analytical database experience: writing complex queries, query optimization, debugging, user-defined functions, views, indexes, etc.
- Strong experience with source control systems such as Git, and with build and continuous integration tools such as Jenkins.
- Highly self-driven and execution-focused, with a willingness to do "what it takes" to deliver results, as you will be expected to rapidly cover a considerable volume of data integration demands.
- Understanding of development methodology and hands-on experience writing functional and technical design specifications.
- Excellent verbal and written communication skills, in person, by telephone, and with large teams.
- Strong prior technical and development background in either data services or engineering.
- Demonstrated experience resolving complex data integration problems.
- Must be able to work cross-functionally.
- Above all else, must be equal parts data-driven and results-driven.
- Scrum and Agile experience; must be able to participate in and run a Scrum team as tech lead.
- Background with Informatica or another ETL tool is required.
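
For illustration, below is a minimal sketch of the kind of Databricks/PySpark pipeline work this role involves: reading raw data from S3, cleaning it, and writing a Delta table. The bucket paths, column names, and schema are hypothetical; on Databricks the Delta format is available out of the box, while elsewhere it requires the delta-spark package.

```python
# Minimal PySpark pipeline sketch: ingest raw events from S3, clean them,
# and append to a partitioned Delta table. All paths and columns are
# hypothetical examples, not the client's actual data model.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("orders-ingest")
    .getOrCreate()
)

# Read raw JSON landed in S3 (path is an assumption for illustration).
raw = spark.read.json("s3://example-raw-bucket/orders/")

# Basic cleanup: deduplicate, standardize types, filter bad records.
clean = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .filter(F.col("amount") > 0)
)

# Write to a Delta table partitioned by date; a Databricks job or AWS Glue
# could orchestrate this as a scheduled pipeline step.
(
    clean.withColumn("order_date", F.to_date("order_ts"))
         .write.format("delta")
         .mode("append")
         .partitionBy("order_date")
         .save("s3://example-curated-bucket/orders_delta/")
)
```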