Job Details
Title: Sr. Data Engineer
Clearance: TS/SCI w/ FSP
Location: Herndon/Chantilly, VA
Rate: $180K-$190K
Summary:
Develop new tools, code, and services to execute data engineering activities involving data of varying types and in varying conditions. Activities include the following tasks:
- Move structured and unstructured data using approved methods.
- Execute data ingestion activities for storing data in a local or enterprise-level location.
- Develop code to format data that supports exploration.
- Analyze source data formats and work with Data Scientists and partners to determine the formats and transforms that best meet mission objectives.
- Develop code and tools for one-time and ongoing data extraction from various repositories, and for formatting and transforming that data into enterprise or standalone data models.
- Develop new ETL code, and perform O&M and enhancements on existing ETL code, using best practices and standards.
- Develop and deliver documentation for each project, including ETL mappings, code use guide, code location, and access instructions.
Responsibilities:
- Design and optimize data pipelines using tools such as Spark, Apache Iceberg, Trino, OpenSearch, EMR cloud services, NiFi, and Kubernetes containers.
- Ensure the pedigree and provenance of the data are maintained such that access to the data is protected.
- Clean and preprocess data to enable access for advanced analytics.
- Collaborate with enterprise working groups to advance the state of data standards.
- Collaborate with the engineering team, data stewards, and mission partners to aid in getting actionable value out of the data holdings.
- Collaborate with software engineers to update, configure, and maintain data services based on the requirements.
- Ensure data quality by working with the testing and data quality team to enhance standardization of data conditioning pipelines.
- Adapt to various types and formats of data, and work with development teams to integrate new data processing platforms.
Required Skills:
10+ years' experience with:
- Data lifecycle engineering.
- Development and maintenance of extract, transform, and load (ETL) tools and services.
- Cloud and on-prem data storage and processing solutions.
- Python, SQL, Spark, and other data engineering programming.
- COTS and open-source data engineering tools such as Elasticsearch and NiFi.
- Processing data within the Agile Lifecycle.
Preferred Skills/Experience:
- Experience using AI/LLMs to accelerate the processing and transformation of data.