Role: Lead Data Engineer/Architect
Location: Richmond, VA/Remote
Duration: 12 Months
Job description:
The Lead Data Engineer will be responsible for architecting, designing, and optimizing enterprise-grade data pipelines and data platforms. The role demands strong hands-on experience in SQL and NoSQL databases (MongoDB), ETL/ELT engineering, cloud data ecosystems, and building scalable data solutions. The ideal candidate will lead data engineering initiatives, mentor junior engineers, and ensure high-quality data delivery across multiple business functions.
This role aligns with enterprise data engineering responsibilities referenced in internal Data Engineer and Lead Data Engineering roles.
Key Responsibilities:
1. Data Pipeline Engineering
Design, develop, and maintain scalable, reliable, and optimized ETL/ELT pipelines to support enterprise data flows.
Build data ingestion frameworks to extract data from structured/unstructured sources into data lakes and warehouses.
Implement batch and real-time data processing using modern workflow orchestration tools.
2. Database Engineering (SQL + MongoDB)
Develop, tune, and optimize complex SQL queries, stored procedures, and indexing strategies for high-volume systems.
Architect and maintain MongoDB collections, schema designs, aggregation pipelines, and NoSQL data models supporting high-scale systems.
Conduct data validation and quality checks across SQL and NoSQL environments.
3. Cloud and Data Platform Engineering
Work with cloud data services on AWS/Azure/Google Cloud Platform to build scalable, secure, and high-performance data platforms.
Leverage cloud-native ETL tools (Airflow, AWS Glue, DataStage, SSIS, Azure ADF, PySpark) to support transformation workloads.
4. Data Architecture and Modeling
Lead data modeling efforts, including dimensional modeling, normalization, schema optimization, and design for analytics.
Implement metadata management, governance processes, and data quality frameworks.
5. Performance, Optimization, and Security
Monitor and tune pipeline performance, optimize storage, and maintain cost-efficient cloud data workloads.
Ensure compliance with data security standards, access controls, and regulatory requirements.
6. Leadership and Collaboration
Lead and mentor data engineers, providing guidance on best practices, tooling, architecture, and design decisions.
Collaborate with cross-functional teams including software engineers, DBAs, analysts, and data scientists to support enterprise data needs.
Participate in sprint planning, architectural reviews, and project governance.
Required Skills & Qualifications:
10+ years of hands-on experience in data engineering.
Strong expertise in:
SQL (advanced query optimization, indexing, tuning).
MongoDB (schema design, aggregation, performance tuning).
Python or similar scripting languages for data manipulation.
Proven experience with ETL tools: Airflow, PySpark, SSIS, DataStage, Informatica, AWS Glue, Azure ADF.
Hands-on cloud data experience on AWS/Azure/GCP (S3, Redshift, BigQuery, RDS, Lambda, Azure SQL, Data Lake, etc.).
Strong understanding of data warehousing concepts, DWH optimization, and data modeling frameworks.
Experience in building scalable, secure data pipelines and implementing data governance.
Excellent communication, documentation, and stakeholder management skills.
Preferred Qualifications:
Experience with Kafka, Spark, Snowflake, or Databricks.
Knowledge of data security practices, compliance frameworks, and best practices.
Background in healthcare or enterprise platforms is a plus.
Mandatory Skills:
Data Engineering, MongoDB, Data Modeling, Data Warehousing, SSIS, PySpark, Airflow.