Job Details
Job Title: Machine Learning Engineer (LLMs & Information Extraction)
Location: Dallas, TX
Job Summary:
We are seeking a highly skilled Machine Learning Engineer with hands-on experience building and deploying production-grade AI/ML systems, particularly those focused on large language models (LLMs) and information extraction workflows, rather than chatbot development.
This role will involve designing, fine-tuning, and integrating open-source and commercial LLMs (e.g., OpenAI, Cohere, Hugging Face) into systems that perform tasks such as prompt engineering, document analysis, knowledge retrieval, and workflow automation. The ideal candidate will bring a strong foundation in Python, deep learning frameworks, and real-world experience building scalable ML systems in production.
Key Responsibilities:
- Design and deploy end-to-end LLM-powered systems for tasks like data extraction, information retrieval, summarization, and document classification
- Engineer prompts and evaluation frameworks to optimize performance of commercial (OpenAI, Anthropic, Cohere) and open-source LLMs (e.g., Mistral, LLaMA, Falcon)
- Develop robust, production-grade ML pipelines that include model fine-tuning, inference optimization, and post-processing logic
- Work with cross-functional teams to integrate ML components into broader data or business process automation workflows
- Monitor and improve system performance, latency, accuracy, and cost-efficiency
- Stay current with advances in foundation models, open-source ML tools, and deployment techniques
Must-Have Qualifications:
- 3+ years of experience building and deploying machine learning systems in production
- Expert-level proficiency in Python and modern ML libraries such as PyTorch, TensorFlow, and Hugging Face Transformers
- Hands-on experience working with LLMs (e.g., OpenAI, Anthropic, Cohere, or open-source models)
- Strong understanding of prompt engineering, fine-tuning, embeddings, and LLM evaluation
- Experience with data extraction, semantic search, NER, or document processing
- Familiarity with inference deployment on cloud or on-prem infrastructure (e.g., AWS SageMaker, Docker, REST APIs)
Nice-to-Haves:
- Experience with vector databases (e.g., FAISS, Pinecone, Weaviate)
- Familiarity with LangChain, LLM orchestration, or RAG (retrieval-augmented generation) systems
- Exposure to workflow automation, document pipelines, or enterprise data systems
- Understanding of evaluation metrics for LLM-driven applications