Data Scientist (Big Data Engineer) 2 (C0022)
Telework
5-Month Contract
Required
4 Years - Implement ETL/ELT workflows for both structured and unstructured data
4 Years - Automate deployments using CI/CD tools
4 Years - Collaborate with cross-functional teams including data scientists, analysts, and stakeholders
4 Years - Design and maintain data models, schemas, and database structures to support analytical and operational use cases
4 Years - Evaluate and implement appropriate data storage solutions, including data lakes (Azure Data Lake Storage) and data warehouses
4 Years - Implement data validation and quality checks to ensure accuracy and consistency
4 Years - Contribute to data governance initiatives, including metadata management, data lineage, and data cataloging
4 Years - Implement data security measures, including encryption, access controls, and auditing; ensure compliance with regulations and best practices
4 Years - Proficiency in Python and R programming languages
4 Years - Strong SQL querying and data manipulation skills
4 Years - Experience with Azure cloud platform
4 Years - Experience with DevOps, CI/CD pipelines, and version control systems
4 Years - Experience working in agile, multicultural environments
4 Years - Strong troubleshooting and debugging capabilities
3 Years - Design and develop scalable data pipelines using Apache Spark on Databricks
3 Years - Optimize Spark jobs for performance and cost-efficiency
3 Years - Integrate Databricks solutions with cloud services (Azure Data Factory)
3 Years - Ensure data quality, governance, and security using Unity Catalog or Delta Lake
3 Years - Deep understanding of Apache Spark architecture, RDDs, DataFrames, and Spark SQL
3 Years - Hands-on experience with Databricks notebooks, clusters, jobs, and Delta Lake
Preferred
1 Year - Knowledge of ML libraries (MLflow, Scikit-learn, TensorFlow)
1 Year - Databricks Certified Associate Developer for Apache Spark certification
1 Year - Azure Data Engineer Associate certification