The Leidos Health group has an opening for a Data Scientist, contingent upon contract award, to contribute to various data science projects working within a cross-functional team at the Centers for Disease Control and Prevention (CDC) in Atlanta, GA.
The Data Scientist would analyze structured and unstructured datasets to determine data relationships, model datasets for loading and storage in relational databases, create new data pipelines and ETL processes, and build R Shiny dashboards and R Markdown documents, working closely with the team and CDC staff. This position requires an entrepreneurial mindset and close work with stakeholders to elicit and understand solution requirements. The developed solutions would meet CDC security and compliance requirements.
The specific tasks handled can vary between engagements but typically include building data pipelines to pull together information from different source systems; integrating, consolidating, and cleansing data; and structuring it for use in individual analytics applications for center- and agency-specific needs.
Architect solutions and mentor junior Data Scientists
Lead the design and development of complex enterprise-level applications and other system software using a strong working knowledge of data transformation and Extract, Transform, Load (ETL) processes.
Liaise with CDC technical support teams to resolve data ingestion and quality issues to ensure quality data
Work in a collaborative environment and attend Agile Scrum development meetings to support the Leidos team.
Perform data wrangling, typically in R, Python, or SAS (or similar technologies), to prepare data stored in our data lake or in SQL Server databases for visualization in Power BI.
Support development of ETL scripts in both SQL Server and R.
Provide production support and ticket resolution for data-centric applications
Follow Leidos standards in creating the project life cycle application structures, application quality assurance
Perform data modeling, data design, and metadata and repository creation. Review object and data models and the metadata repository to structure the data for better management and quicker access.
Ensure adherence to legal and agency regulations on data usage and management
Bachelor's degree with 8+ years of experience working on data science projects.
4 years of experience in R development.
1-2 years of experience in data visualization, including Microsoft Power BI, R Shiny development, and R Markdown reports.
Significant experience with SQL and with connecting to on-premises SQL databases as well as Azure Synapse cloud databases.
Data wrangling experience with R, Python, and/or SAS.
Hands-on experience with SQL Server databases and SQL Server Management Studio.
Experience with source control software such as GitHub.
Works well as an individual contributor as well as in a cross-functional team applying Agile Scrum development processes.
Willingness and personal drive to learn new skills quickly to meet customer requirements.
Experience building data visualizations in Microsoft Power BI.
Familiarity with Agile development methodologies and Scrum.
Ability to offer creative technical solutions to data and analysis problems and to clearly articulate designed solutions to fellow data scientists and customers.
Graduate degree in Data Science, Statistics, Information Technology, Public Health, or another relevant discipline.
Experience working at CDC or other federal agencies.
Knowledge of GitHub.
Knowledge of or familiarity with public health, medical data, HL7, and/or electronic health records.
Experience in applying AI/ML in data management.
Experience working with private or hybrid cloud-based data lakes.
Experience with other relational and/or NoSQL databases.
Working knowledge of Power BI.