Senior Data Engineer (Active Secret Clearance Required)

Overview

On Site
$50 - $60
Contract - W2
Contract - Independent
Contract - 60 Month(s)

Skills

MLOps
MLflow
Git
SQL
visualization tools
NLP
LLMs

Job Details

Hello,

I hope you are doing well.

Position: Senior Data Engineer

Location: Herndon, VA (Onsite)

Contract: Long-Term

Citizenship/Clearance: Active Secret clearance required

If you are interested in this position, please share your updated resume.

Job Overview

We are seeking a Senior ML Engineer with NLP expertise to design, build, and deploy scalable AI/ML models for use within the DoD's Search Portfolio. This role requires a strong background in natural language processing, generative AI (LLMs, RAG), distributed computing, and cloud-native architecture. The successful candidate will collaborate with interdisciplinary teams and apply the latest advancements in AI research to deliver secure, mission-ready solutions that process and analyze massive datasets.
Responsibilities will include, but are not limited to:

  • Design, develop, test, and support AI/ML pipelines on Databricks using Python to support a variety of Department of Defense (DoD) technical missions.
  • Develop and operationalize NLP solutions for large datasets using modern techniques such as context extraction, topic extraction, and keyword extraction (e.g., RAKE, TF-IDF, and other statistical or embedding-based methods).
  • Leverage advanced NLP libraries and frameworks including Spark NLP, Hugging Face, and TensorFlow to design and deploy scalable machine-learning models.
  • Build, train, and deploy GPU-based models optimized for performance and cost-efficiency across distributed compute environments (Apache Spark/Databricks/Kubernetes).
  • Apply MLOps best practices using MLflow for model lifecycle management, experiment tracking, and reproducibility.
  • Integrate AI capabilities with Elasticsearch and Neo4j to enhance search, graph analytics, and semantic understanding across enterprise datasets.
  • Collaborate with cross-functional teams of data scientists, software engineers, and mission stakeholders to integrate AI/ML capabilities across the Search Portfolio and other data products.
  • Manage the full lifecycle of AI/ML components, from research and model development through deployment, monitoring, and iterative improvement.
  • Diagnose and solve complex data challenges using analytical modeling, AI-driven reasoning, and modern informatics techniques.
  • Document and present technical design alternatives, trade-offs, and implementation strategies to technical and non-technical stakeholders.
  • Build and maintain shared ML tools, libraries, and reusable assets to accelerate innovation and ensure engineering consistency.
  • Support strategic AI roadmap development and architectural planning to enable rapid prototyping and experimentation with advanced AI capabilities.
  • Ensure compliance, security, and traceability in all AI/ML workflows and infrastructure aligned with DoD and federal standards.

What You'll Need:

  • Bachelor's degree with 5 years of relevant experience.
  • 5+ years of hands-on experience with Natural Language Processing (NLP), Large Language Models (LLMs), semantic search, text embedding, Retrieval-Augmented Generation (RAG), and generative AI applications.
  • Extensive knowledge of NLP techniques for large datasets, including context, topic, and keyword extraction methods.
  • Proficiency in NLP libraries and frameworks such as Spark NLP, Hugging Face, and TensorFlow.
  • 4+ years of experience working in Databricks as an ML Engineer, including building and managing distributed ML pipelines.
  • Strong Python expertise, including experience developing Flask APIs and reusable ML utilities.
  • Experience with MLOps and MLflow for model tracking, deployment automation, and governance.
  • Hands-on experience developing and tuning GPU-based models in production environments.
  • Working knowledge of Elasticsearch and Neo4j for search and graph-based AI applications (preferred).
  • Deep understanding of machine-learning subfields such as computer vision, reinforcement learning, and statistical learning theory.
  • Proven experience with data preprocessing, feature engineering, and model evaluation.
  • Proficiency with version control systems (e.g., Git) for collaborative ML development.
  • Demonstrated experience with Apache Spark or Databricks for distributed data and ML workloads.
  • Experience working with petabyte-scale datasets, data exploration, SQL, and visualization tools.
