Hello,

This is Hima from Appridat Solutions LLC. I was reviewing your resume online and would like to talk to you about an exciting Data Engineer opportunity in Rockville, MD or McLean, VA (hybrid: 3 days onsite, 2 days remote) with one of Appridat Solutions LLC's premier clients.
Title: Data Engineer
Location: Rockville, MD or McLean, VA (Hybrid: 3 days onsite, 2 days remote)
Duration: 6 Months with possible extension
Interview process: Prescreen, Phone, Onsite panel
Job Summary:
The Data Engineer works with moderate supervision across two equally weighted domains:
(1) large-scale data pipeline development processing high-volume event data in a cloud environment, and
(2) design and development of agentic AI systems, including LLM-powered data assistants, MCP servers, and agent harness architectures.
This position contributes to overall product quality throughout the software development lifecycle.
Responsibilities
Build and maintain ETL/ELT pipelines using Apache Spark, Hive, and Trino across S3-based data lake environments
Develop and optimize SQL for large-scale datasets, including window functions, multi-table joins, and complex aggregations
Build and engineer big data systems (EMR-on-EC2, EMR-on-EKS) and develop solutions on analytical platforms (SageMaker, Domino, Dataiku)
Participate in data quality monitoring, anomaly detection, and production incident investigation
Develop AI agent systems using AWS Bedrock and agent frameworks (e.g., Strands Agents SDK, LangChain/LangGraph, or equivalent)
Build agent harness architectures combining LLM reasoning with deterministic execution (e.g., RAG-based SQL generation and structured output validation)
Implement agent memory, context management, and tool integration (MCP servers, API connectors, data catalog lookups)
Build evaluation frameworks for agent accuracy (e.g., paraphrase robustness, routing precision, structural consistency)
Stay informed of advances in LLM frameworks and emerging AI capabilities
Write clean, well-tested code; contribute to CI/CD pipelines and infrastructure-as-code on AWS
Ensure secure handling of sensitive data across both data pipelines and AI agent outputs, including auditable execution traces
Adhere to internal standards for secure development practices and technology policies
Partner across teams, communicate technical information effectively, and maintain documentation
Actively learn from senior team members and contribute to process improvement
Essential Technical Skills
Data Engineering & Big Data Technologies
Experience building data pipelines using Apache Spark (PySpark preferred) and SQL
Experience with SQL query engines (Hive, Trino/Presto, or similar) and cloud platforms (AWS S3, EMR, Lambda)
Understanding of data skew, large-scale data processing challenges, and debugging strategies
Generative AI & Agentic Systems
Experience building LLM-powered agent systems that use tools and produce structured outputs
Hands-on experience with agent frameworks (LangChain, LangGraph, AWS Strands, or equivalent)
Knowledge of prompt engineering, RAG architectures, and memory/context management
Experience with foundation model APIs (e.g., Anthropic Claude, Amazon Nova, OpenAI, or similar)
Memory & Agent Design
Understanding of memory architectures (working, episodic, semantic memory)
Familiarity with agent harness patterns (tool routing, guardrails, verification loops, fallback handling)
AI Tool Proficiency
Experience with AI development tools (e.g., GitHub Copilot, Q Developer, ChatGPT, Claude)
Experience with spec-driven development and AI-assisted coding workflows
Cloud Technologies
Experience with AWS services such as S3, EMR, Lambda, Bedrock, Step Functions
Familiarity with monitoring/logging tools (CloudWatch, CloudTrail)
Exposure to platforms like Google Vertex AI or similar
Programming (Python)
Strong Python skills for data engineering and automation
Ability to write clean, modular, and performant code
Understanding of functional programming concepts, concurrency, and memory management
SQL
Strong proficiency in SQL (window functions, joins, aggregations)
Ability to optimize complex queries and handle edge cases (NULLs, duplicates, ordering)
Nice to Have
Experience with agent frameworks and advanced patterns (evaluation harnesses, verification loops)
Model fine-tuning techniques (LoRA, PEFT, managed tuning platforms)
Vector databases (FAISS, Pinecone, OpenSearch)
Containerization and orchestration (Docker, Kubernetes, EKS)
Infrastructure as Code (Terraform, CloudFormation)
CI/CD tools (Jenkins, GitLab CI, GitHub Actions, ArgoCD)
Observability tools (Prometheus, Grafana, ELK stack)
Cloud or AI-related certifications
Education / Experience
Bachelor's degree in Computer Science, Data Science, Information Systems, or related field, with 2+ years of relevant experience (or equivalent practical experience)
Experience delivering enterprise-quality software solutions using object-oriented and database technologies
Knowledge of modern software engineering practices (test automation, build automation, configuration management)
Strong written and verbal communication skills
Ability to build effective working relationships and collaborate across teams
Ability to learn new technologies quickly and work in a fast-paced environment
Regards,
Hima Bindu
Appridat Solutions LLC
5655 Peachtree Parkway, Suite 212, Peachtree Corners, GA 30092
Phone: *533