PySpark Kafka Admin (Hybrid)

  • Charlotte, NC
  • Posted 7 hours ago | Updated 7 hours ago

Overview

Hybrid
Depends on Experience
Accepts corp to corp applications
Contract - Independent
Contract - W2
Contract - 12 Month(s)

Skills

Amazon Lambda
Amazon RDS
Amazon Web Services
Apache HTTP Server
Apache Kafka
Cloud Computing
Collaboration
Continuous Delivery
Continuous Integration
Data Flow
Data Lake
Data Processing
Data Quality
Data Storage
Data Warehouse
Database Administration
Decision-making
Extract, Transform, Load (ETL)
Geospatial Analysis
High Availability
Management
Mentorship
PySpark
Python
Real-time
Relational Databases
Remote Desktop Services
SQL
Snowflake Schema
Splunk
Streaming
Terraform
Testing
Workflow

Job Details

PySpark Kafka Admin

Charlotte, NC (3-4 days in office)

Contract

Skills:

PySpark

Python

Data Management

Kafka Admin

Job Summary: The following skills are needed for this position. The role is approximately 75% developer and 25% architect.

Mandatory skills:

1. PySpark, Python - 5/5

2. SQL, Airflow, AWS (Glue, RDS, Lambda, Kafka), data lakes, data pipelines (batch & streaming), data warehousing - 4/5

Good to have: Snowflake, Splunk, Apache Iceberg, Terraform, CI/CD pipelines.

3. Experience: 8-12 years

4. Required Technical Skills: Kafka, AWS RDS, AWS Lambda, AWS Glue, Snowflake, data warehousing, AWS Data Pipeline, data lake concepts, Airflow, SQL, PySpark, Python

Responsibilities:

  • Oversee the deployment and management of data pipelines using AWS Data Pipeline and AWS Glue to ensure seamless data flow.
  • Provide expertise in data warehousing and data lake concepts to optimize data storage and retrieval processes.
  • Utilize Kafka for real-time data streaming to support dynamic data processing needs.
  • Implement AWS RDS and Snowflake solutions to ensure robust and scalable database management.
  • Develop and maintain ETL processes using Airflow to automate data workflows and improve efficiency.
  • Apply SQL and PySpark skills to analyze and transform large datasets for actionable insights.
  • Collaborate with cross-functional teams to align geospatial data initiatives with business objectives.
  • Ensure data quality and integrity through rigorous testing and validation processes.
  • Support the development of data-driven strategies to enhance decision-making capabilities.
  • Monitor and optimize system performance to ensure high availability and reliability of data services.
  • Provide technical guidance and mentorship to team members to foster skill development.

Qualifications

  • Possess extensive experience with AWS services, including AWS Lambda and AWS Glue, to manage cloud-based data solutions.
  • Demonstrate proficiency in Python and PySpark for data processing and analysis tasks.
  • Have a strong understanding of data warehousing and data lake concepts to support scalable data architectures.
  • Show expertise in SQL for querying and managing relational databases effectively.
  • Exhibit knowledge of Kafka for handling real-time data streaming requirements.
  • Experience with Snowflake for cloud-based data warehousing solutions is highly desirable.

Years of Experience: 12

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.

About Digitive LLC