Job Details
Role name: Data Scientist
Role Description:
1. Create and set best practices for data ingestion, integration, and access patterns to support both real-time and batch-based consumer data needs.
2. Assist with the design and lead development of scalable, high-performance data architecture solutions that support both the consumer side of the business and analytic use cases.
3. Create comprehensive documentation for designs and processes to support ongoing maintenance and knowledge sharing for both GMP and non-GMP solutions.
4. Drive continuous data transformation to minimize technical debt.
5. Create test protocols, test scripts, and other validation deliverables.
6. Provide technical support to local end users on the data pipelines and advanced analytics solutions developed.
Competencies: Digital: Python, Digital: Apache Spark, Digital: Kafka
Experience (Years): 6-8
Essential Skills:
- Demonstrated experience in designing and implementing complex data systems from the ground up.
- Strong experience with programming languages and frameworks such as Python, SQL, and Spark.
- Experience building batch and streaming pipelines using complex SQL, PySpark, Pandas, and similar frameworks.
- Develop, refine, and optimize advanced analytics solutions using machine learning models to extract insights from complex data sources.
- Transform data using SQL, NoSQL, and Python; visualize data using a diverse tool set including, but not limited to, Python and R.
- Experience with cloud services in AWS and/or Microsoft Azure.
- Experience with message brokers and event-driven architectures (e.g., MQTT, Kafka, RabbitMQ).
- Experience handling data streams, APIs, events, and container orchestration products such as OpenShift, EKS, and ECS.
- Experience testing, troubleshooting, and establishing API connectivity using software documentation and tools such as Postman.
- Strong experience transforming data using common ETL/ELT patterns.
- Experience orchestrating complex workflows and data pipelines using Airflow or similar tools.
- Knowledge of and/or experience in predictive modeling and machine learning is a plus.
- Manufacturing/Pharma experience is a plus.
Desirable Skills:
- Demonstrated experience in designing and implementing complex data systems from the ground up.
- Strong experience with programming languages and frameworks such as Python, SQL, and Spark.
- Experience building batch and streaming pipelines using complex SQL, PySpark, Pandas, and similar frameworks.
- Develop, refine, and optimize advanced analytics solutions using machine learning models to extract insights from complex data sources.
- Transform data using SQL, NoSQL, and Python; visualize data using a diverse tool set including, but not limited to, Python and R.
- Experience with cloud services in AWS and/or Microsoft Azure.
- Experience with message brokers and event-driven architectures (e.g., MQTT, Kafka, RabbitMQ).
- Experience handling data streams, APIs, events, and container orchestration products such as OpenShift, EKS, and ECS.
- Experience testing, troubleshooting, and establishing API connectivity using software documentation and tools such as Postman.
- Strong experience transforming data using common ETL/ELT patterns.
- Experience orchestrating complex workflows and data pipelines using Airflow or similar tools.
- Knowledge of and/or experience in predictive modeling and machine learning is a plus.
- Manufacturing/Pharma experience is a plus.