Software Engineer (Artificial Intelligence / Machine Learning Big Data)

Overview

On Site

Depends on Experience

Contract - W2

Contract - Independent

Contract - 24 Month(s)

Able to Provide Sponsorship

Skills

Agile

Amazon Web Services

Analytical Skill

Apache HBase

Apache Hadoop

Apache Hive

Apache Maven

Apache Oozie

Apache Spark

Artificial Intelligence

Big Data

Cloud Computing

Collaboration

Communication

Concurrent Computing

Conflict Resolution

Continuous Delivery

Continuous Improvement

Continuous Integration

Distributed Computing

Docker

FOCUS

File Systems

Git

HDFS

Health Care

Integration Testing

JIRA

Jenkins

Kubernetes

Machine Learning (ML)

Management

Mentorship

Microservices

Microsoft SQL Server

Multitasking

Multithreading

NoSQL

OCI

Performance Tuning

Problem Solving

Project Management

PySpark

Python

RESTful

Relational Databases

Reporting

SAFe

SOA

SaaS

Scrum

Shell Scripting

Soft Skills

Software Development Methodology

Job Details

We are seeking an experienced Software Engineer (AI/ML Big Data) to work on optimizing Python/PySpark jobs within a Hadoop ecosystem. The engineer will design and develop scalable data-driven applications, manage large datasets, and collaborate across teams to deliver high-quality solutions. The ideal candidate will have strong expertise in distributed computing, cloud platforms, and modern data pipelines.

Key Responsibilities:

Develop and optimize Python/PySpark modules in the Hadoop ecosystem (Spark, HDFS, YARN, Hive, Oozie).

Design and develop cloud applications (AWS, OCI, or similar).

Work with large datasets and implement data aggregation, quality checks, and reporting.

Conduct unit and integration testing; troubleshoot and resolve technical issues.

Collaborate with cross-functional and global teams to translate business requirements into technical solutions.

Participate in all phases of the SDLC with a focus on continuous improvement.

Mentor junior developers and provide technical guidance.

Required Skills:

5+ years of Python/PySpark development.

5+ years optimizing PySpark jobs in the Hadoop ecosystem (Spark, HDFS, YARN, Hive, Oozie).

5+ years in cloud application development (AWS, OCI, or similar).

Strong experience with distributed/cluster computing concepts.

Expertise in relational databases (MS SQL Server or similar) and NoSQL (HBase preferred).

3+ years creating and consuming RESTful APIs.

Hands-on experience with multi-threaded applications, concurrency, performance tuning, and memory management.

Strong knowledge of shell scripting and file systems.

Preferred Skills:

Experience with CI/CD tools (Git, Maven, Jenkins, Artifactory).

Knowledge of microservices, SOA, Docker, Kubernetes, OpenShift.

Familiarity with Agile/Scrum and SAFe practices.

Experience with project management tools such as JIRA.

Healthcare industry background (highly preferred but not required).

Soft Skills:

Strong analytical and problem-solving abilities.

Excellent communication and collaboration skills.

Ability to prioritize and multitask in a fast-paced environment.

