Overview
On Site
BASED ON EXPERIENCE
Full Time
Skills
Employment Authorization
Large Language Models (LLMs)
Fluency
Data Extraction
Customization
Creative Problem Solving
Natural Language
Streaming
Prompt Engineering
Extraction
Performance Metrics
Testing
Regulatory Compliance
Data Security
Rapid Prototyping
Knowledge Sharing
Computer Science
LangChain
LlamaIndex
PyTorch
Python
Scripting
Prototyping
Interfaces
Vector Databases
Semantic Search
Reasoning
Conflict Resolution
Problem Solving
Debugging
GitHub
Productivity
Management
Vertex AI
Computer Vision
OpenCV
Communication
Collaboration
Teamwork
AutoGen
Evaluation
Unstructured Data
Database
Optical Character Recognition
Workflow
Cloud Computing
Amazon SageMaker
Microsoft Azure
Machine Learning (ML)
Google Cloud Platform
Docker
Orchestration
Kubernetes
Open Source
Publications
Auditing
Health Insurance
Professional Development
Artificial Intelligence
Professional Services
Job Details
Role: AI Engineer
Location: Houston, TX 77007
Duration: Direct Hire
Work Authorization: Authorized to Work in the US
Job Description:
We are seeking a talented AI Engineer with deep expertise in Large Language Model (LLM) engineering and design. The ideal candidate will be fluent in manipulating and integrating pre-trained LLMs within complex codebases to tackle practical challenges in data extraction, processing, and interactive systems. This role prioritizes hands-on application, such as customizing LLMs for specific tasks (including retrieval-augmented generation (RAG) pipelines and vision-based workflows), over building models from scratch. You'll focus on leveraging LLMs for solutions like advanced chatbots, natural language interfaces, semantic search, and creative problem solving across data-intensive scenarios, while collaborating in a fast-paced team to push our AI products forward.
Key Responsibilities
- Integrate and fine-tune pre-trained LLMs into our codebase using APIs, frameworks, and orchestration tools to enable features like natural language querying, automated summarization, and intelligent anomaly detection in data streams.
- Design, build, and optimize chatbot systems and conversational AI, incorporating LLMs for seamless user experiences, including multi-turn dialogues, context-aware responses, and integration with external data sources.
- Implement RAG architectures to enhance LLM performance by combining retrieval from vector databases with generation, enabling accurate responses grounded in large-scale document corpora (a minimal sketch of this pattern appears after this list).
- Apply advanced LLM techniques such as prompt engineering, chain-of-thought prompting, retrieval-augmented generation (RAG), and agentic workflows to solve general problems like automating workflows, debugging data pipelines, or generating insights from unstructured inputs.
- Work with open-source LLMs (e.g., Gemma, Llama) for local deployment and inference, optimizing for on-premises or edge environments to ensure low-latency performance and data sovereignty.
- Incorporate computer vision tasks, such as OCR for text extraction from images or documents, and broader CV techniques for processing visual data in hybrid LLM pipelines.
- Experiment iteratively with LLM configurations, hyperparameters, and embeddings to boost performance metrics like accuracy, latency, and cost-efficiency in real-world scenarios, including vector search optimizations.
- Maintain scalable codebase integrations, conduct thorough testing (e.g., unit tests for LLM outputs, A/B evaluations), and ensure compliance with AI best practices, including bias mitigation and data security.
- Collaborate with cross-functional teams on code reviews, rapid prototyping, and knowledge sharing, while leveraging AI coding assistants to accelerate development.
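To make the RAG responsibility above concrete, here is a minimal, provider-agnostic sketch: it embeds a small placeholder corpus with Sentence Transformers, retrieves the closest chunks with FAISS, and assembles a grounded prompt. The documents, model name, and prompt wording are illustrative assumptions, and the final LLM call is left out so any API (OpenAI, Groq, or a locally served model) could be dropped in.

```python
# Minimal RAG sketch (illustrative only): embed documents, retrieve the closest
# chunks with FAISS, and build a grounded prompt for an LLM.
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

docs = [
    "Invoice 1042 was issued on 2024-03-01 for $12,500.",  # placeholder corpus
    "The Houston site maintenance window is Sunday nights.",
    "Scanned contracts are stored after OCR in the documents bucket.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(docs, normalize_embeddings=True)

index = faiss.IndexFlatIP(doc_vecs.shape[1])  # inner product == cosine on normalized vectors
index.add(np.asarray(doc_vecs, dtype="float32"))

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embedder.encode([query], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    return [docs[i] for i in ids[0]]

def grounded_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(grounded_prompt("When was invoice 1042 issued?"))
# The resulting prompt would then be sent to whichever chat/completions API the team uses.
```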
Required Qualifications
- Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Machine Learning, or a related field.
- 3+ years of professional experience in AI engineering, specifically manipulating and deploying LLMs in production (e.g., via Hugging Face Transformers, LangChain, LlamaIndex, or OpenAI/Groq APIs), including hands-on work with open-source models like Gemma and Llama for local deployment (e.g., using Ollama, vLLM, or direct PyTorch inference setups; see the local-inference sketch after this list).
- Advanced proficiency in Python, including scripting for LLM pipelines, handling dependencies with tools like Poetry or Pipenv, and integrating with libraries such as Sentence Transformers for embeddings, FAISS for vector search, or Streamlit/Gradio for prototyping interfaces.
- Experience with vector databases and semantic search (e.g., Pinecone, Weaviate, or FAISS) to support efficient retrieval in LLM applications.
- Demonstrated expertise in RAG systems, from building retrieval components to integrating them with LLMs for enhanced reasoning and factuality.
- Proven track record building and optimizing chatbots or conversational agents (e.g., using Rasa, Dialogflow, or custom LLM-based setups), with examples of deploying them in user-facing applications.
- Strong general problem-solving abilities, demonstrated through projects involving breaking down ambiguous tasks into structured AI solutions, debugging LLM hallucinations, or optimizing for edge cases.
- Hands-on experience with AI-powered coding assistants (e.g., Cursor, GitHub Copilot, or similar tools) to enhance productivity in LLM engineering workflows.
- Experience with Google Cloud Platform (GCP) for deploying and managing AI workloads, including services like Vertex AI, Cloud Run, or AI Platform.
- Familiarity with vision tasks, including OCR (e.g., via Tesseract or EasyOCR) and computer vision libraries (e.g., OpenCV) for processing images, PDFs, or multimodal data in conjunction with LLMs.
- Excellent communication and teamwork skills for thriving in collaborative, iterative environments.
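As a companion to the local-deployment qualification above, here is a minimal sketch of querying a locally served open-source model, assuming an Ollama server running on its default port with a Llama-family model already pulled (the endpoint, model name, and prompt are placeholders to adjust):

```python
# Sketch of local inference against a locally running Ollama server.
# Assumes `ollama serve` is running on the default port and a model such as
# llama3 (or a Gemma variant) has already been pulled.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default generate endpoint

def ask_local_llm(prompt: str, model: str = "llama3") -> str:
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(ask_local_llm("List three checks for validating extracted invoice fields."))
```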
Preferred Qualifications
- Expertise in complementary LLM ecosystem tools, such as AutoGen for multi-agent systems, Haystack for RAG pipelines, or evaluation frameworks like Rouge/BLEU for assessing outputs.
- Familiarity with data-intensive applications, including processing unstructured text, integrating with vector databases (e.g., Pinecone, Weaviate), or handling multimodal inputs like images for OCR-enhanced workflows (see the OCR sketch after this list).
- Experience with other cloud platforms (e.g., AWS SageMaker, Azure ML) alongside Google Cloud Platform, including containerization with Docker and orchestration via Kubernetes.
- Contributions to open-source LLM projects, hackathons, or publications showcasing innovative uses of models like GPT-series, Llama, or Mistral.
- Knowledge of ethical AI practices, such as implementing guardrails for safe LLM interactions or conducting fairness audits.
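For the OCR-enhanced workflows mentioned above, a short sketch that feeds Tesseract output into an extraction prompt; it assumes the Tesseract binary plus the pytesseract and Pillow packages are installed, and leaves the downstream LLM call abstract:

```python
# OCR-to-LLM sketch: pull raw text from a scanned page, then wrap it in an
# extraction prompt. The field list and prompt wording are placeholders.
from PIL import Image
import pytesseract

def ocr_page(image_path: str) -> str:
    """Return raw text from a scanned page or screenshot via Tesseract."""
    return pytesseract.image_to_string(Image.open(image_path))

def extraction_prompt(image_path: str) -> str:
    text = ocr_page(image_path)
    return (
        "Extract the vendor name, invoice number, and total amount as JSON "
        f"from this OCR text:\n\n{text}"
    )
```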
What We Offer
- Competitive salary and benefits package, including health insurance, retirement plans, and professional development opportunities.
- A collaborative, innovative work environment in Houston with flexible hybrid options.
- The chance to work on impactful AI projects that directly influence industries.
- Opportunities for growth in a seed-stage company backed by strong funding and a visionary team.
About INSPYR Solutions
Technology is our focus and quality is our commitment. As a national expert in delivering flexible technology and talent solutions, we strategically align industry and technical expertise with our clients' business objectives and cultural needs. Our solutions are tailored to each client and include a wide variety of professional services, project, and talent solutions. By always striving for excellence and focusing on the human aspect of our business, we work seamlessly with our talent and clients to match the right solutions to the right opportunities. Learn more about us at inspyrsolutions.com.
INSPYR Solutions provides Equal Employment Opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability, or genetics. In addition to federal law requirements, INSPYR Solutions complies with applicable state and local laws governing nondiscrimination in employment in every location in which the company has facilities.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions, and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.