Job Openings - Sr. Lead Data Engineer
LOCATION: PREFERRED: Home office in Huntsville, TX. May work remotely, but must be able to report to the office with advance notice.
Duration: 12+ Months
ONLY W2
POSITION REQUIREMENTS:
We are seeking a highly skilled and experienced professional to lead the design, implementation, and management of end-to-end enterprise-grade data solutions. This role involves expertise in building and optimizing data warehouses, data lakes, and lakehouse platforms, with a strong emphasis on data engineering, data science, and machine learning. You will work closely with cross-functional teams to create scalable and robust architectures that support advanced analytics and machine learning use cases while adhering to industry standards and best practices.
- Education: Bachelor's degree in Computer Science, Data Science, Engineering, or a related field.
- Experience: Minimum 10 years in data engineering, data architecture, or a similar role, with at least 3 years in a lead capacity.
Responsibilities Include:
- Architect, design, and manage the entire data lifecycle, from data ingestion, transformation, storage, and processing to advanced analytics and machine learning databases and large-scale processing systems.
- Implement robust data governance frameworks, including metadata management, lineage tracking, security, compliance, and business glossary development.
- Identify, design, and implement internal process improvements, including redesigning infrastructure for greater scalability, optimizing data delivery, and automating manual processes.
- Ensure high data quality and reliability through automated data validation and testing, and provide high-quality, clean, and usable data from data sets in varying states of disorder.
- Develop and enforce architecture standards, patterns, and reference models for large-scale data platforms.
- Architect and implement Lambda and Kappa architectures for real-time and batch data processing workflows, supported by strong data modeling capabilities.
- Identify and implement the most appropriate data management system and enable integration capabilities for external tools to perform ingestion, compilation, analytics, and visualization.
REQUIRED SKILLS:
- Proficient in SQL, Python, and big data processing frameworks (e.g., Spark, Flink).
- Strong experience with cloud platforms (AWS, Azure, Google Cloud Platform) and related data services.
- Hands-on experience with data warehousing tools (e.g., Snowflake, Redshift, BigQuery), Databricks running on multiple cloud platforms (AWS, Azure and Google Cloud Platform) and data lake technologies (e.g., S3, ADLS, HDFS).
- Expertise in containerization and orchestration tools like Docker and Kubernetes.
- Knowledge of MLOps frameworks and tools (e.g., MLflow, Kubeflow, Airflow).
- Experience with real-time streaming architectures (e.g., Kafka, Kinesis).
- Familiarity with Lambda and Kappa architectures for data processing.
- Ability to enable integration capabilities for external tools to perform ingestion, compilation, analytics, and visualization.
PREFERRED SKILLS:
- Certifications in cloud platforms or data-related technologies.
- Familiarity with graph databases, NoSQL, or time-series databases.
- Knowledge of data privacy regulations (e.g., GDPR, CCPA) and compliance requirements.
- Experience implementing and managing business glossaries, data governance rules, and metadata lineage, and ensuring data quality.
- Extensive experience with the AWS cloud platform and Databricks Lakehouse.