Data Testing Engineer
Location: Baltimore, MD, United States
Clearance Level Must Be Able to Obtain: Public Trust
Overview:
Skysoft is looking for a Data Testing Engineer with hands-on experience in Databricks and modern data platforms to help ensure the accuracy, reliability, and quality of our data pipelines and analytical systems. You will play a key role in validating data across distributed systems; supporting robust data workflows, data extracts, and reports; and helping build automated test coverage in a data warehouse environment.
Job Duties:
Design and implement comprehensive data testing strategies across Databricks-based pipelines and workflows.
Build and maintain automated test suites for data validation using PySpark, SQL, and Databricks notebooks.
Develop and execute test plans and data quality checks across ingestion, transformation, and consumption layers.
Validate data transformations in Delta Lake tables and ensure schema evolution is properly handled.
Collaborate with Data Engineers, Analysts, and Architects to identify test cases aligned with business requirements.
Integrate testing processes into Databricks Workflows and CI/CD pipelines using tools like GitHub.
Leverage tools or custom frameworks for scalable and reusable test patterns.
Monitor production pipelines and participate in root cause analysis of data quality issues.
Document test cases, expected results, and test coverage for ongoing reference and audit purposes.
Qualifications:
Bachelor’s degree in Computer Science, Engineering, or a related field.
3+ years of experience in a data QA, data engineering, or testing-focused role.
Strong experience with Databricks, including notebooks, Delta Lake, PySpark, and SQL.
Strong experience testing reports and dashboards built using BI tools such as QuickSight or Tableau.
Deep understanding of data lakehouse architecture, ETL/ELT workflows, and data modeling concepts.
Hands-on experience testing distributed data pipelines and large-scale datasets.
Proficiency in writing and optimizing SQL for validation across structured and semi-structured data.
Experience analyzing and decomposing requirements and creating testing strategies and plans, as well as testing estimates and work plans.
Experience detailing test plans and test cases, tracking defects, and reporting status to various levels of the project organization.
Familiarity with CI/CD practices for data (e.g., automated testing via GitHub Actions, Databricks Repos).
Preferred Qualifications:
Experience with Unity Catalog, data governance, and access controls within Databricks.
Background in Python scripting and test automation for data workflows.
Exposure to MLflow or testing data science model inputs/outputs.