Job Details
Job Title: Data Engineer
Locations: Dallas, TX / Plano, TX / Austin, TX / Houston, TX / Richardson, TX
Experience Required: 6-10 Years
About the Role
We're seeking a skilled and passionate Data Engineer to design, build, and optimize data pipelines and architectures that enable efficient data processing and analytics.
You'll work closely with data scientists, analysts, and software engineers to ensure data flows are reliable, scalable, and secure, supporting real-time insights and enterprise-grade decision-making.
This role is ideal for someone with hands-on experience in data integration, ETL pipelines, big data platforms, and cloud data solutions (AWS, Azure, or Google Cloud Platform).
Key Responsibilities
- Design, develop, and maintain ETL/ELT pipelines to process large volumes of structured and unstructured data.
- Build and optimize data models and data warehouses/lakes for analytics and reporting.
- Integrate data from multiple sources using tools such as Apache Spark, Kafka, or Airflow.
- Implement and manage data pipelines on cloud platforms like AWS (Glue, Redshift, S3) or Azure (Data Factory, Synapse).
- Ensure data quality, integrity, and performance through validation, monitoring, and optimization.
- Collaborate with stakeholders to define data requirements and deliver robust data solutions.
- Apply data governance, access control, and security best practices.
- Troubleshoot performance issues and optimize query execution across large datasets.
- Work in an Agile environment and contribute to continuous improvement and automation initiatives.
Primary Skills (Must Have)
- Strong experience in Python or Scala for data processing.
- Proficiency with SQL and database design (MySQL, PostgreSQL, SQL Server).
- Hands-on experience with Apache Spark, Kafka, or Hadoop ecosystems.
- Cloud experience with AWS (Glue, S3, Redshift, EMR) or Azure (ADF, Synapse, Databricks).
- Expertise in ETL/ELT design, data modeling, and pipeline orchestration.
- Understanding of data warehousing, data lakes, and distributed data systems.
Secondary Skills (Good to Have)
- Familiarity with NoSQL databases (MongoDB, Cassandra, DynamoDB).
- Experience with containerization (Docker, Kubernetes) and CI/CD pipelines.
- Exposure to Airflow, NiFi, or dbt for workflow orchestration.
- Knowledge of machine learning data pipelines or data quality frameworks.
- Excellent analytical and problem-solving skills.
- Strong communication and collaboration abilities.