Overview
On Site
BASED ON EXPERIENCE
Full Time
Skills
Employment Authorization
Large Language Models (LLMs)
Fluency
Data Extraction
Customization
Creative Problem Solving
Natural Language
Streaming
Prompt Engineering
Extraction
Performance Metrics
Testing
Regulatory Compliance
Data Security
Rapid Prototyping
Knowledge Sharing
Computer Science
LangChain
LlamaIndex
PyTorch
Python
Scripting
Prototyping
Interfaces
Vector Databases
Semantic Search
Reasoning
Conflict Resolution
Problem Solving
Debugging
GitHub
Productivity
Management
Vertex AI
Computer Vision
OpenCV
Communication
Collaboration
Teamwork
AutoGen
Evaluation
Unstructured Data
Database
Optical Character Recognition
Workflow
Cloud Computing
Amazon SageMaker
Microsoft Azure
Machine Learning (ML)
Google Cloud Platform
Docker
Orchestration
Kubernetes
Open Source
Publications
Auditing
Health Insurance
Professional Development
Artificial Intelligence
Professional Services
Job Details
Role: AI Engineer
Location: Houston, TX 77007
Duration: Direct Hire
Work Authorization: Authorized to Work in the US
Job Description:
We are seeking a talented AI Engineer with deep expertise in Large Language Model (LLM) engineering and design. The ideal candidate will be fluent in manipulating and integrating pre-trained LLMs within complex codebases to tackle practical challenges in data extraction, processing, and interactive systems. This role prioritizes hands-on application, such as customizing LLMs for specific tasks (including retrieval-augmented generation (RAG) pipelines and vision-based workflows), over building models from scratch. You'll focus on leveraging LLMs for solutions like advanced chatbots, natural language interfaces, semantic search, and creative problem solving across data-intensive scenarios, while collaborating in a fast-paced team to push our AI products forward.
Key Responsibilities
- Integrate and fine-tune pre-trained LLMs into our codebase using APIs, frameworks, and orchestration tools to enable features like natural language querying, automated summarization, and intelligent anomaly detection in data streams.
- Design, build, and optimize chatbot systems and conversational AI, incorporating LLMs for seamless user experiences, including multi-turn dialogues, context-aware responses, and integration with external data sources.
- Implement RAG architectures to enhance LLM performance by combining retrieval from vector databases with generation, enabling accurate responses grounded in large-scale document corpora (a minimal sketch of this pattern appears after this list).
- Apply advanced LLM techniques such as prompt engineering, chain-of-thought prompting, retrieval-augmented generation (RAG), and agentic workflows to solve general problems like automating workflows, debugging data pipelines, or generating insights from unstructured inputs.
- Work with open-source LLMs (e.g., Gemma, Llama) for local deployment and inference, optimizing for on-premises or edge environments to ensure low-latency performance and data sovereignty.
- Incorporate computer vision tasks, such as OCR for text extraction from images or documents, and broader CV techniques for processing visual data in hybrid LLM pipelines.
- Experiment iteratively with LLM configurations, hyperparameters, and embeddings to boost performance metrics like accuracy, latency, and cost-efficiency in real-world scenarios, including vector search optimizations.
- Maintain scalable codebase integrations, conduct thorough testing (e.g., unit tests for LLM outputs, A/B evaluations), and ensure compliance with AI best practices, including bias mitigation and data security.
- Collaborate with cross-functional teams on code reviews, rapid prototyping, and knowledge sharing, while leveraging AI coding assistants to accelerate development.
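To make the RAG responsibility above concrete, here is a minimal, provider-agnostic sketch: it embeds a small placeholder corpus with Sentence Transformers, retrieves the closest chunks with FAISS, and assembles a grounded prompt. The documents, model name, and prompt wording are illustrative assumptions, and the final LLM call is left out so any API (OpenAI, Groq, or a locally served model) could be dropped in.

```python
# Minimal RAG sketch (illustrative only): embed documents, retrieve the closest
# chunks with FAISS, and build a grounded prompt for an LLM.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Invoice 1042 was issued on 2024-03-01 for $12,500.",  # placeholder corpus
    "The Houston site maintenance window is Sunday nights.",
    "Scanned contracts are stored after OCR in the documents bucket.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

index = faiss.IndexFlatIP(doc_vecs.shape[1])  # inner product == cosine on normalized vectors
index.add(np.asarray(doc_vecs, dtype="float32"))

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embedder.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    return [docs[i] for i in ids[0]]

def grounded_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(grounded_prompt("When was invoice 1042 issued?"))
# The resulting prompt would then be sent to whichever chat/completions API the team uses.
```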
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field.
- 3+ years of professional experience in AI engineering, specifically manipulating and deploying LLMs in production (e.g., via Hugging Face Transformers, LangChain, LlamaIndex, or OpenAI/Groq APIs), including hands-on work with open-source models like Gemma and Llama for local deployment (e.g., using Ollama, vLLM, or direct PyTorch inference setups; see the local-inference sketch after this list).
- Advanced proficiency in Python, including scripting for LLM pipelines, handling dependencies with tools like Poetry or Pipenv, and integrating with libraries such as Sentence Transformers for embeddings, FAISS for vector search, or Streamlit/Gradio for prototyping interfaces.
- Experience with vector databases and semantic search (e.g., Pinecone, Weaviate, or FAISS) to support efficient retrieval in LLM applications.
- Demonstrated expertise in RAG systems, from building retrieval components to integrating them with LLMs for enhanced reasoning and factuality.
- Proven track record building and optimizing chatbots or conversational agents (e.g., using Rasa, Dialogflow, or custom LLM-based setups), with examples of deploying them in user-facing applications.
- Strong general problem-solving abilities, demonstrated through projects involving breaking down ambiguous tasks into structured AI solutions, debugging LLM hallucinations, or optimizing for edge cases.
- Hands-on experience with AI-powered coding assistants (e.g., Cursor, GitHub Copilot, or similar tools) to enhance productivity in LLM engineering workflows.
- Experience with Google Cloud Platform (GCP) for deploying and managing AI workloads, including services like Vertex AI, Cloud Run, or AI Platform.
- Familiarity with vision tasks, including OCR (e.g., via Tesseract or EasyOCR) and computer vision libraries (e.g., OpenCV) for processing images, PDFs, or multimodal data in conjunction with LLMs.
- Excellent communication and teamwork skills for thriving in collaborative, iterative environments.
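As a companion to the local-deployment qualification above, here is a minimal sketch of querying a locally served open-source model, assuming an Ollama server running on its default port with a Llama-family model already pulled (the endpoint, model name, and prompt are placeholders to adjust):

```python
# Sketch of local inference against a locally running Ollama server.
# Assumes `ollama serve` is running on the default port and a model such as
# llama3 (or a Gemma variant) has already been pulled.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default generate endpoint

def ask_local_llm(prompt: str, model: str = "llama3") -> str:
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(ask_local_llm("List three checks for validating extracted invoice fields."))
```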
Preferred Qualifications
- Expertise in complementary LLM ecosystem tools, such as AutoGen for multi-agent systems, Haystack for RAG pipelines, or evaluation frameworks like Rouge/BLEU for assessing outputs.
- Familiarity with data-intensive applications, including processing unstructured text, integrating with vector databases (e.g., Pinecone, Weaviate), or handling multimodal inputs like images for OCR-enhanced workflows (see the OCR sketch after this list).
- Experience with other cloud platforms (e.g., AWS SageMaker, Azure ML) alongside Google Cloud Platform, including containerization with Docker and orchestration via Kubernetes.
- Contributions to open-source LLM projects, hackathons, or publications showcasing innovative uses of models like GPT-series, Llama, or Mistral.
- Knowledge of ethical AI practices, such as implementing guardrails for safe LLM interactions or conducting fairness audits.
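For the OCR-enhanced workflows mentioned above, a short sketch that feeds Tesseract output into an extraction prompt; it assumes the Tesseract binary plus the pytesseract and Pillow packages are installed, and leaves the downstream LLM call abstract:

```python
# OCR-to-LLM sketch: pull raw text from a scanned page, then wrap it in an
# extraction prompt. The field list and prompt wording are placeholders.
from PIL import Image
import pytesseract

def ocr_page(image_path: str) -> str:
    """Return raw text from a scanned page or screenshot via Tesseract."""
    return pytesseract.image_to_string(Image.open(image_path))

def extraction_prompt(image_path: str) -> str:
    text = ocr_page(image_path)
    return (
        "Extract the vendor name, invoice number, and total amount as JSON "
        f"from this OCR text:\n\n{text}"
    )
```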
What We Offer
- Competitive salary and benefits package, including health insurance, retirement plans, and professional development opportunities.
- A collaborative, innovative work environment in Houston with flexible hybrid options.
- The chance to work on impactful AI projects that directly influence industries.
- Opportunities for growth in a seed-stage company backed by strong funding and a visionary team.
About INSPYR Solutions
Technology is our focus and quality is our commitment. As a national expert in delivering flexible technology and talent solutions, we strategically align industry and technical expertise with our clients' business objectives and cultural needs. Our solutions are tailored to each client and include a wide variety of professional services, project, and talent solutions. By always striving for excellence and focusing on the human aspect of our business, we work seamlessly with our talent and clients to match the right solutions to the right opportunities. Learn more about us at inspyrsolutions.com.
INSPYR Solutions provides Equal Employment Opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability, or genetics. In addition to federal law requirements, INSPYR Solutions complies with applicable state and local laws governing nondiscrimination in employment in every location in which the company has facilities.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions, and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.