Software Engineer (Artificial Intelligence / Machine Learning Big Data)

Overview

On Site
Depends on Experience
Contract - W2
Contract - Independent
Contract - 24 Month(s)
Able to Provide Sponsorship

Skills

Agile
Amazon Web Services
Analytical Skill
Apache HBase
Apache Hadoop
Apache Hive
Apache Maven
Apache Oozie
Apache Spark
Artificial Intelligence
Big Data
Cloud Computing
Collaboration
Communication
Concurrent Computing
Conflict Resolution
Continuous Delivery
Continuous Improvement
Continuous Integration
Distributed Computing
Docker
FOCUS
File Systems
Git
HDFS
Health Care
Integration Testing
JIRA
Jenkins
Kubernetes
Machine Learning (ML)
Management
Mentorship
Microservices
Microsoft SQL Server
Multitasking
Multithreading
NoSQL
OCI
Performance Tuning
Problem Solving
Project Management
PySpark
Python
RESTful
Relational Databases
Reporting
SAFe
SOA
SaaS
Scrum
Shell Scripting
Soft Skills
Software Development Methodology

Job Details

We are seeking an experienced Software Engineer (AI/ML Big Data) to work on optimizing Python/PySpark jobs within a Hadoop ecosystem. The engineer will design and develop scalable data-driven applications, manage large datasets, and collaborate across teams to deliver high-quality solutions. The ideal candidate will have strong expertise in distributed computing, cloud platforms, and modern data pipelines.

Key Responsibilities:

Develop and optimize Python/PySpark modules in the Hadoop ecosystem (Spark, HDFS, YARN, Hive, Oozie).

Design and develop cloud applications (AWS, OCI, or similar).

Work with large datasets and implement data aggregation, quality checks, and reporting.

Conduct unit and integration testing; troubleshoot and resolve technical issues.

Collaborate with cross-functional and global teams to translate business requirements into technical solutions.

Participate in all phases of the SDLC with a focus on continuous improvement.

Mentor junior developers and provide technical guidance.

Required Skills:

5+ years of Python/PySpark development.

5+ years optimizing PySpark in the Hadoop ecosystem (Spark, HDFS, YARN, Hive, Oozie).

5+ years in cloud application development (AWS, OCI, or similar).

Strong experience with distributed/cluster computing concepts.

Expertise in relational databases (MS SQL Server or similar) and NoSQL databases (HBase preferred).

3+ years creating and consuming RESTful APIs.

Hands-on experience with multi-threaded applications, concurrency, performance tuning, and memory management.

Strong knowledge of shell scripting and file systems.

Preferred Skills:

Experience with CI/CD tools (Git, Maven, Jenkins, Artifactory).

Knowledge of microservices, SOA, Docker, Kubernetes, OpenShift.

Familiarity with Agile/Scrum and SAFe practices.

Experience with project management tools like JIRA.

Healthcare industry background (highly preferred but not required).

Soft Skills:

Strong analytical and problem-solving abilities.

Excellent communication and collaboration skills.

Ability to prioritize and multitask in a fast-paced environment.
