Role: Principal Data Engineer (Apache Spark / Databricks / Azure).
Client: DXC / American Airlines
Hybrid / Dallas-Fort Worth, TX
Job Description
Minimum Qualifications: Education & Experience
* Bachelor's degree in Computer Science, Computer Engineering, Information Systems (CIS/MIS), Engineering, or a related technical discipline, or equivalent practical experience.
* 9+ years of end-to-end Software Development Life Cycle (SDLC) experience designing, developing, and delivering large-scale data analytics, data warehousing, and data engineering solutions.
* Strong hands-on experience in data analytics, including data wrangling, mining, integration, modeling, analysis, visualization, and reporting.
Required Technical Skills
* Apache Spark, Scala, Azure Databricks *10+ years*
* Proven expertise in building and optimizing distributed data processing solutions.
Nice-to-Have Skills
* SQL *5+ years*
* CI/CD pipelines *5+ years*
Core Technical Expertise
* Minimum of 5 years of experience optimizing Spark jobs for performance and cost efficiency using advanced techniques such as:
  * Partitioning and caching strategies
  * Cluster configuration tuning
  * Performance bottleneck identification and troubleshooting
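To illustrate the kind of tuning involved, a hypothetical Spark configuration fragment is sketched below; the property names are standard Spark settings, but the values are purely illustrative and would be sized to the actual workload and cluster:

```
# Illustrative only: size shuffle partitions to the data rather than the default 200
spark.sql.shuffle.partitions        400
# Adaptive query execution coalesces small shuffle partitions at runtime
spark.sql.adaptive.enabled          true
# Kryo serialization reduces shuffle and cache footprint versus Java serialization
spark.serializer                    org.apache.spark.serializer.KryoSerializer
# Match executor memory and cores to node size to avoid oversubscription
spark.executor.memory               16g
spark.executor.cores                4
```

Bottleneck identification would then proceed from the Spark UI (skewed tasks, spill metrics, long shuffle reads) rather than from configuration alone.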
* Strong hands-on tuning experience with the ability to independently research and solve complex technical challenges.
* Proactive, self-driven problem solver with a strong ownership mindset.
Key Responsibilities
* Provide technical leadership by analyzing long-term opportunities, designing modern and scalable data solutions, and contributing to core development as needed.
* Act as the *Product Technical Lead* for data engineering initiatives.
* Serve as a subject matter expert (SME) for the data domain within IT and partner with business teams to deliver self-service data products.
* Collaborate closely with business leaders, analysts, project managers, architects, technical leads, developers, and cross-functional teams to implement the enterprise data strategy.
* Design and build reusable, scalable, efficient, and maintainable data engineering frameworks that:
  * Recover gracefully from failures
  * Support easy reprocessing and extensibility
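As a minimal sketch of the "recover gracefully from failures" idea, the hypothetical helper below (the `Resilience` object and `retry` function are illustrative names, not part of any framework named here) shows how a pipeline framework might wrap a task so transient failures are retried automatically:

```scala
import scala.annotation.tailrec
import scala.util.{Failure, Success, Try}

// Hypothetical sketch: a generic retry helper a data-engineering
// framework might expose so transient failures recover without
// manual intervention.
object Resilience {
  @tailrec
  def retry[A](attempts: Int)(task: () => A): Try[A] =
    Try(task()) match {
      case Success(v)                 => Success(v)
      case Failure(_) if attempts > 1 => retry(attempts - 1)(task)
      case failure                    => failure
    }
}
```

A real framework would add backoff between attempts and only retry failures it knows to be transient; the sketch keeps the structure minimal.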
* Drive data quality, best practices, and coding standards, including:
  * Test-Driven Development (TDD)
  * Single source of truth identification across systems
  * Quality analytics (mean time to repair (MTTR), mean time between failures (MTBF), and failure pattern analysis)
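To make the quality-analytics bullet concrete, here is a hypothetical sketch (the `Incident` case class and `QualityMetrics` object are illustrative names) of how MTTR and MTBF could be derived from a log of failure/recovery timestamps:

```scala
// Hypothetical sketch: deriving MTTR and MTBF from incident records.
// Times are in hours; each incident records when a pipeline failed
// and when it recovered.
final case class Incident(failedAt: Double, recoveredAt: Double)

object QualityMetrics {
  // Mean time to repair: average downtime per incident.
  def mttr(incidents: Seq[Incident]): Double =
    incidents.map(i => i.recoveredAt - i.failedAt).sum / incidents.size

  // Mean time between failures: average uptime between one
  // recovery and the next failure.
  def mtbf(incidents: Seq[Incident]): Double = {
    val gaps = incidents.sliding(2).collect {
      case Seq(prev, next) => next.failedAt - prev.recoveredAt
    }.toSeq
    gaps.sum / gaps.size
  }
}
```

Fed from a data pipeline over incident logs, metrics like these support the failure pattern analysis the role calls for.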
* Leverage data pipelines to deliver actionable insights into data quality and product performance.
* Identify and implement process improvements by automating manual workflows, optimizing data delivery, and enhancing infrastructure scalability.
* Continuously research industry best practices and evaluate optimal usage of cloud services and data engineering tools across the enterprise.
* Ensure data products are designed with privacy, security, and regulatory compliance embedded by design.
* Partner with product teams to help prioritize objectives, initiatives, and feature roadmaps.
* Conduct internal roadshows and knowledge-sharing sessions to promote data products and capabilities across the organization.
* Champion agile methodologies and test-driven development practices while using modern data engineering tools to analyze, model, design, build, and test reusable components.