Overview
On Site
Depends on Experience
Accepts corp to corp applications
Contract - W2
Contract - Independent
Contract - 12 Month(s)
Skills
H2O
Python
Job Details
Position: Data Scientist / Data Engineer
Location: Plano, TX (Day 1 Onsite)
Duration: Long Term
Job Summary:
We are seeking a highly skilled and versatile Data Scientist / Data Engineer to join our data and analytics team. The ideal candidate will have strong experience designing and building scalable data pipelines, as well as advanced analytics and machine learning solutions, using tools like Azure, Databricks, Python, PySpark, and H2O.ai. You will be instrumental in delivering data-driven insights and enabling robust machine learning capabilities across the organization.
Key Responsibilities:
- Design, build, and maintain scalable and efficient ETL/ELT data pipelines using Azure Data Factory and Databricks.
- Develop and deploy machine learning models using H2O, PySpark MLlib, and Python-based frameworks.
- Collaborate with business stakeholders to identify data-driven opportunities and define analytical requirements.
- Conduct data preprocessing, feature engineering, and transformation for large-scale datasets.
- Implement and manage model versioning, tracking, and monitoring in production environments.
- Optimize performance and scalability of data solutions using distributed computing tools on Azure.
- Ensure data quality, integrity, and governance through robust data validation and monitoring.
- Create clear, actionable visualizations and reports using Power BI or similar BI tools (optional).
- Work in an Agile/DevOps environment, collaborating with cross-functional teams including data engineers, software developers, and product managers.
Required Skills and Qualifications:
- Bachelor's or Master's degree in Computer Science, Data Science, Engineering, or a related field.
- 4+ years of hands-on experience in data science and/or data engineering roles.
- Strong proficiency in Python and PySpark for data processing and ML development.
- Expertise in Azure cloud services (e.g., Azure Data Lake, Azure Data Factory, Azure ML).
- Proven experience with Databricks and its collaborative analytics features.
- Hands-on experience with H2O.ai platforms (Driverless AI or H2O-3) for scalable ML.
- Deep understanding of machine learning concepts, statistical modeling, and algorithm development.
- Solid knowledge of SQL and experience with relational and non-relational databases.
- Familiarity with CI/CD, Git, and MLOps practices is a strong advantage.
Preferred Qualifications:
- Azure certifications such as Azure Data Engineer Associate or Azure AI Engineer Associate.
- Experience with real-time data processing using tools like Kafka or Azure Event Hubs.
- Exposure to data security and privacy practices in cloud environments.
- Knowledge of containerization and orchestration tools such as Docker and Kubernetes.