Overview
Remote
Depends on Experience
Full Time
Accepts corp to corp applications
No Travel Required
Skills
Senior Data Engineer
Data Migration
Data Lake/Lakehouse ecosystem
Data Analytics
Job Details
Role: Senior Data Engineer
Duration: 12+ months
Remote
Need a candidate with a minimum of 14 years of experience and strong Data Migration experience
LinkedIn profile needs to date back to before 2016
Job Description
8-12+ years of proven experience in Data Engineering; has worked on designing and developing software with Big Data, the Data Lake/Lakehouse ecosystem, Data Analytics, backend microservices architecture, and heterogeneous data types at scale.
Proven in-depth experience creating ELT/integration pipelines using Databricks, Spark, Python, SQL, Scala, Kafka, Presto, Parquet, streaming, events, bots, and the AWS/cloud ecosystem (an illustrative pipeline sketch follows this list).
Proficient in developing microservices and using AWS services such as SQS, Stream, Kubernetes, EC2, S3, Lambda, etc.
Experience with data pipeline/analysis/visualization tooling such as the Elastic stack, Logstash, Kibana, Kafka, Grafana, Splunk, Pandas, message brokers, and data modeling.
Expertise in Data Lakehouse architecture and end-to-end Databricks techniques.
Has worked on connecting the Data Lake to the SAP ecosystem; aware of how to onboard data from the SAP ECC application layer and ingest it back into SAP via the application layer.
Has designed and built PB-sized scalable data lakes, along with structured/unstructured data query interfaces and microservices to ingest, index, mine, transform, and compose large datasets.
Worked on the end-to-end data lifecycle across the Data Ingestion, Data Transformation, and Data Consumption layers.
Strong experience building Data Lake-led APIs and their use for consuming large-scale data and ingesting from real-time ecosystems.
Expert in Spark, Parquet, streaming, events, Kafka, telemetry, MapReduce, Hadoop, Hive, Presto, data query approaches, and dashboarding.
Has implemented enterprise use cases such as CMDB, governance, time-series classification, telemetry anomaly detection, logs, and real-time data ingestion through APIs.
Experience with structured data formats such as Avro, Parquet, Protobuf, and Thrift, and with concepts like schema evolution (a schema-evolution sketch follows this list).
Enable an end-to-end monitoring pipeline for associated applications, ingest into the Data Lake, and build metrics and log monitoring.
End-to-end data pipeline monitoring for the Data Platform, and data monitoring for data integrity to meet the business-agreed SLA.
Work on the multi-die project and associated deliverables for Observability, enabling the application lifecycle and leveraging these use cases to enable a scalable self-service data pipeline.
Build self-service data ingestion and self-service consumption through APIs; enable data streaming to the Data Lake, services for DB replication from other DBs to the Data Lake, and streaming pipelines using cloud events, while ensuring data parsing and metadata extraction, e.g., for streams, files, etc.
Building analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics.
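To illustrate the kind of ELT/streaming pipeline described in the requirements above, here is a minimal sketch using PySpark's Kafka source and Parquet file sink. The broker address, topic name, schema fields, and lake paths are placeholders for illustration only, not details taken from this posting, and the spark-sql-kafka connector must be available on the cluster.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import StructType, StructField, StringType, TimestampType

    # Placeholder session; on Databricks a SparkSession is already provided as `spark`.
    spark = SparkSession.builder.appName("events-elt-sketch").getOrCreate()

    # Hypothetical event schema used only for this example.
    event_schema = StructType([
        StructField("event_id", StringType()),
        StructField("event_type", StringType()),
        StructField("event_ts", TimestampType()),
    ])

    # Read raw events from a Kafka topic (broker and topic are placeholders).
    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "broker:9092")
           .option("subscribe", "events")
           .load())

    # Parse the JSON payload into typed columns.
    parsed = (raw
              .select(from_json(col("value").cast("string"), event_schema).alias("e"))
              .select("e.*"))

    # Land the parsed stream as Parquet in a data-lake path (paths are placeholders).
    query = (parsed.writeStream
             .format("parquet")
             .option("path", "s3://example-lake/events/")
             .option("checkpointLocation", "s3://example-lake/_checkpoints/events/")
             .outputMode("append")
             .trigger(processingTime="1 minute")
             .start())

In a Databricks Lakehouse setting the sink would more likely be a Delta table; Parquet is used here only to keep the sketch dependency-free.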
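For the schema-evolution requirement, a minimal Parquet-on-read sketch: partitions written at different times may carry different columns, and merging their schemas surfaces the union of fields. The path is again a placeholder.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("schema-evolution-sketch").getOrCreate()

    # Merge per-file schemas so columns added over time appear alongside older data.
    merged = (spark.read
              .option("mergeSchema", "true")
              .parquet("s3://example-lake/events/"))
    merged.printSchema()

Format-level schema evolution (e.g., adding optional fields in Avro or Protobuf) follows the same principle: readers tolerate fields they do not yet know about.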