Job Details
Data Engineer with AWS, Kafka, Lambda, Python, Hadoop, Google Cloud Platform, AWS Glue, PySpark, Snowflake, Databricks, S3, Redshift, HDFS, HBase, Azure, NoSQL - Remote - 10+ years of experience required
2. Job summary: Around 10 years of experience working with almost all Hadoop ecosystem components, AWS cloud services, Microsoft Azure, Google Cloud Platform, and Apache Spark, with a strong working background in designing, developing, and deploying complex data integration solutions.
Experience with star schema modeling, knowledge of snowflake dimensional modeling, and applied Inmon and Kimball data modeling concepts.
Extensive hands-on experience with major Hadoop ecosystem components such as core MapReduce, HDFS, Hive, HBAS
3. Experience: 9 to 13 years
4. Required Skills: Python, AWS Lambda, Snowflake, PySpark, Databricks, AWS Glue, Redshift, Kafka
Responsibilities:
Around 10 years of experience working with almost all Hadoop ecosystem components, AWS cloud services, Microsoft Azure, Google Cloud Platform, and Apache Spark, with a strong working background in designing, developing, and deploying complex data integration solutions.
Experience with star schema modeling, knowledge of snowflake dimensional modeling, and applied Inmon and Kimball data modeling concepts.
Extensive hands-on experience with major Hadoop ecosystem components such as core MapReduce, HDFS, Hive, HBase, Sqoop, and Apache Solr.
Proficient in designing, optimizing, and maintaining NoSQL databases to handle high-throughput, low-latency workloads, leveraging features like data partitioning, replication, and consistency models.
Experience in Azure Cloud, including Azure Data Factory, Azure Data Lake Storage, Azure Synapse Analytics, Snowflake, Azure analytical services, Azure Cosmos NoSQL DB, and Databricks.
Experienced Data Engineer with advanced skills in Databricks, Delta Lake, and cloud platforms such as Google Cloud Platform, Azure, and AWS.
Proven track record in building robust data pipelines, managing data access and security with Unity Catalog, and supporting advanced analytics initiatives.
Led Databricks administration and Unity Catalog implementation, optimizing data access and security across large datasets for enhanced data governance.
Experience in creating pipelines, data flows, and complex data transformations and manipulations using ADF with Databricks.
Experience in on-premises-to-cloud implementation projects using ADF and Python scripting for data extraction from relational databases or files.
Expertise in designing and developing Spark applications using PySpark and Spark SQL for data extraction, transformation, and aggregation from multiple file formats for analyzing and transforming the data.
Extensive hands-on experience with AWS services such as EC2, S3, EMR, Redshift, Lambda, Glue, and DynamoDB, with a focus on designing and deploying scalable data solutions.
Expertise in designing and orchestrating multi-step data workflows using AWS Step Functions, integrating services like Lambda, Glue, and S3 for efficient, automated data pipelines with error handling and retries.
Expertise in designing and developing cloud-based data pipelines using AWS Glue and EMR, facilitating data extraction, transformation, and loading (ETL) from sources like S3, Redshift, and RDBMS.
Experience in migrating on-prem SQL Server data to the Snowflake platform, employing AWS Glue and Snowpipe for seamless integration and data flow automation.
Expertise in designing and implementing Snowflake-based data pipelines, including data ingestion, transformation, and loading (ETL/ELT), leveraging Snowpipe, Snowflake Streams, and Tasks for real-time data processing and seamless integration with cloud storage services like AWS S3 and Azure Blob.
Experience designing and implementing workflows using AWS Step Functions, including defining states, tasks, and transitions.
Knowledge of best practices for designing AWS Step Functions workflows, including error handling and retry logic.
In-depth understanding of and experience with real-time data streaming technologies such as Kafka and Spark Streaming.
Experience with integrating Google Cloud Dataflow with other Google Cloud Platform services such as BigQuery, Google Cloud Storage, and Google Cl