Job Details
Job Title: Senior Data Engineer
Location: McLean, VA (onsite 5 days a week, Monday to Friday)
Duration: Long-term
Call Notes:
Looking for a Senior Cloud/Data Engineer with expertise in Python, Spark, PySpark, AWS, ETL, and Snowflake.
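As a rough illustration of the stack named above, here is a minimal PySpark ETL sketch that reads raw files from S3 and loads them into Snowflake via the Spark-Snowflake connector. All bucket paths, table names, and connection details are hypothetical placeholders, not details from this posting.

```python
# Illustrative only: a minimal PySpark ETL job in the spirit of this stack.
# Bucket names, paths, and Snowflake connection details are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("loan-etl-sketch")  # hypothetical job name
    .getOrCreate()
)

# Extract: raw loan records landed in S3 (path is an assumption).
raw = spark.read.option("header", "true").csv("s3://example-bucket/raw/loans/")

# Transform: basic cleansing and typing.
loans = (
    raw.withColumn("loan_amount", F.col("loan_amount").cast("double"))
       .withColumn("origination_date", F.to_date("origination_date", "yyyy-MM-dd"))
       .filter(F.col("loan_amount") > 0)
       .dropDuplicates(["loan_id"])
)

# Load: write to Snowflake via the Spark-Snowflake connector, assuming the
# connector jar is on the classpath and credentials come from a secrets
# manager rather than the hard-coded placeholders shown here.
sf_options = {
    "sfURL": "example_account.snowflakecomputing.com",  # hypothetical
    "sfUser": "etl_user",                               # hypothetical
    "sfPassword": "<from-secrets-manager>",
    "sfDatabase": "ANALYTICS",
    "sfSchema": "MORTGAGE",
    "sfWarehouse": "ETL_WH",
}

(loans.write
      .format("net.snowflake.spark.snowflake")
      .options(**sf_options)
      .option("dbtable", "LOANS")
      .mode("append")
      .save())
```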
Job Description:
• Senior Data Engineer with 10+ years of experience designing and implementing scalable, cloud-native data solutions across mortgage, finance, telecommunications, and banking sectors using Java, Spring Boot, Python, PySpark, and AWS.
• Proficient in building real-time data pipelines, event-driven microservices, and scalable RESTful APIs, with strong experience in Kafka, Spark Streaming, and Spring Boot within the mortgage and multi-family domain.
• Hands-on expertise in Informatica IICS for orchestrating data ingestion, transformation, and integration workflows across on-premises and cloud data platforms, ensuring high data quality and governance.
• Deep experience in developing and optimizing big data pipelines using Spark (RDDs, DataFrames, Spark SQL), Databricks, and Hive on cloud platforms such as AWS EMR, Glue, and Azure Databricks (a brief Spark SQL sketch follows this list).
• Strong background in cloud engineering with AWS (EC2, S3, Redshift, Glue, Lambda, EMR) and Azure, including automation using Step Functions, CloudFormation, and Terraform for deploying end-to-end data workflows.
• Skilled in PostgreSQL, MongoDB, and other relational and NoSQL databases, applying performance tuning, indexing, and data modeling best practices (star and snowflake schemas) for efficient querying and reporting.
• Experienced in modernizing legacy ETL frameworks by rearchitecting them into cloud-native solutions using Informatica IICS, Spring Boot, and AWS Glue for scalable mortgage data processing.
• Strong understanding of the mortgage and multi-family business domain, with a proven track record of translating complex business rules into performant, maintainable data engineering solutions.
• Well-versed in building resilient, event-driven systems using Kafka and Java microservices, ensuring real-time data availability and consistency across distributed platforms.
• Proficient in developing automation and validation scripts using Python and shell scripting, ensuring robust data ingestion, transformation, and quality assurance processes across structured and semi-structured sources.
• Experienced with CI/CD pipelines using Jenkins, GitLab CI, Docker, Kubernetes, and infrastructure-as-code tools such as Terraform and Ansible for scalable deployment and maintenance of data platforms.
• Adept at leading Agile teams and collaborating with cross-functional stakeholders to translate business needs into data-driven solutions, with a strong focus on regulatory compliance, data governance, and secure development practices.
• Excellent communication and mentoring skills, with a passion for continuous learning, innovation, and delivering enterprise-grade solutions that support strategic decision-making.
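As referenced in the big data pipelines bullet above, here is a brief Spark SQL sketch of the kind of DataFrame-plus-SQL aggregation described there, written out as partitioned Parquet. Dataset paths and column names are assumptions for illustration only.

```python
# Illustrative only: a small Spark SQL aggregation step; table and
# column names are hypothetical, not taken from this posting.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("servicing-rollup-sketch").getOrCreate()

# Assume a curated payments dataset already exists in Parquet.
payments = spark.read.parquet("s3://example-bucket/curated/payments/")
payments.createOrReplaceTempView("payments")

# Spark SQL keeps the business rule readable for review.
monthly = spark.sql("""
    SELECT loan_id,
           date_format(payment_date, 'yyyy-MM') AS payment_month,
           SUM(amount)                          AS total_paid,
           COUNT(*)                             AS payment_count
    FROM payments
    GROUP BY loan_id, date_format(payment_date, 'yyyy-MM')
""")

# Partitioning by month keeps downstream reads selective.
(monthly.write
        .mode("overwrite")
        .partitionBy("payment_month")
        .parquet("s3://example-bucket/marts/monthly_payments/"))
```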