Mid-level Generative AI / LLM-focused Software Developer (AI Engineer)

Hybrid in Philadelphia, PA, US • Posted 15 hours ago • Updated 15 hours ago
Contract W2
Contract Independent
No Travel Required
Hybrid
$60 - $65/hr

Job Details

Skills

  • Python
  • LLM deployment (Llama/Mistral)
  • RAG pipelines
  • vector databases (Qdrant/Chroma/Milvus/pgvector)
  • embeddings & semantic search
  • CPU-based inference optimization
  • enterprise security & data privacy
  • monitoring/logging

Summary

Dear Partner,

Good morning,
Greetings from Nukasani group Inc! We have an urgent, long-term contract project immediately available for a mid-level Generative AI / LLM-focused Software Developer (AI Engineer) in Philadelphia, PA (Hybrid), and we need submissions. Please review the role below. If you are available, please send me your updated resume in Word format, along with the candidate submission details listed below, as soon as possible. If you are not available, any referrals would be greatly appreciated.

Interviews are in progress, so an urgent response is appreciated. Looking forward to your prompt response and to working with you.

**Candidate Submission Format - needed from you**
Full Legal Name
Personal Cell No. (not a Google phone number)
Email Id
Skype Id
Interview Availability
Availability to start, if selected
Current Location
Open to Relocate
Work Authorization
Total Relevant Experience
Education / Year of Graduation
University Name, Location
Last 5 digits of SSN
Country of Birth
Contractor Type
DOB (mm/dd)
Home Zip Code

LinkedIn ID

Assigned Job Details

Job Title: Mid-level Generative AI / LLM-focused Software Developer (AI Engineer)
Location: Philadelphia, PA, Hybrid
Rate: Best competitive rate

**Position Overview**

We are seeking a mid-level Software Developer/Engineer with strong expertise in Generative AI systems, particularly in deploying Large Language Models (LLMs) within secure, enterprise environments.

The ideal candidate will have hands-on experience with on-premise LLM deployments, Retrieval-Augmented Generation (RAG) pipelines, and vector database integration, along with a solid foundation in Python-based backend development.

This role involves working on cutting-edge AI solutions while ensuring performance, scalability, and data security in enterprise-grade systems.

**Key Responsibilities**

Deploy and manage open-source LLMs (e.g., Llama 3, Mistral, Mixtral) in on-premise or private cloud environments
Design, build, and optimize LLM inference pipelines using Python
Develop and implement Retrieval-Augmented Generation (RAG) workflows
Design and integrate vector databases for semantic search and retrieval
Optimize model performance through quantization and CPU-based inference tuning
Ensure data privacy, governance, and security compliance in enterprise environments
Implement access controls, logging, and monitoring for AI systems
Create reference architectures, prototypes, and technical documentation
Collaborate with cross-functional teams to support deployment, adoption, and knowledge transfer

**Required Qualifications**

5–9 years of experience in software development or engineering
Strong proficiency in Python for backend and AI/ML development
Hands-on experience deploying open-source LLMs (e.g., Llama 3, Mistral, Mixtral)
Experience building and optimizing RAG pipelines
Practical knowledge of vector databases (e.g., Qdrant, Chroma, Milvus, pgvector)
Understanding of embeddings, similarity search, and metadata filtering
Experience with CPU-based inference optimization techniques
Familiarity with enterprise security practices, including data privacy and air-gapped environments

**Preferred Qualifications**

Experience with LangChain or LlamaIndex
Familiarity with Docker and Kubernetes
Exposure to Rust, Go, or C++ for high-performance systems
Experience with LLM inference frameworks (e.g., vLLM, llama.cpp, Hugging Face Transformers)
Prior experience working in regulated or enterprise environments

**Deliverables**

End-to-end reference architecture for LLM and vector database solutions
Fully functional prototype (LLM + RAG + Vector Database)
Comprehensive technical documentation and knowledge transfer


Best,

Bhavani
Recruiter | IT & Digital Marketing

P:
540 W Galena Blvd, Suite 200
Aurora, IL 60506

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: 10211499
  • Position Id: 8936085