About R Systems:
R Systems is a leading digital product engineering company that designs and develops chip-to-cloud software products, platforms, and digital experiences that empower its clients to achieve higher revenues and operational efficiency. Our product mindset and engineering capabilities in Cloud, Data, AI, and CX enable us to serve key players in the high-tech industry, including ISVs, SaaS, and Internet companies, as well as product companies in telecom, media, finance, manufacturing, and health verticals. We are Great Place to Work Certified in the 10 countries where we have a full-time workforce (India, USA, Canada, Poland, Romania, Moldova, Indonesia, Singapore, Malaysia, and Thailand). We are recognized as one of the Best Tech Brands 2024 by the Times Group and one of India's Top 500 Value Creators 2023 by Dun & Bradstreet.
Responsibilities:
Performance Leadership:
Define and implement performance engineering strategies for our Generative AI full stack, including services, applications, LLMs, RAG pipelines, and related infrastructure.
Lead performance testing, profiling, and analysis efforts to identify and resolve performance bottlenecks.
Establish and maintain performance benchmarks and SLAs for critical AI services.
Provide technical leadership and mentorship to performance engineering team members.
LLM Capacity and Tuning:
Analyze and improve LLM inference performance, including latency, throughput, and resource utilization.
Develop and implement strategies for LLM capacity planning and scaling.
Collaborate with AI researchers to evaluate and improve LLM architectures and training techniques for performance.
Optimize LLM inference through techniques such as quantization, distillation, and optimized kernel implementation.
RAG Performance Optimization:
Design and implement performance tests for RAG pipelines, including retrieval, ranking, and generation components.
Identify and optimize performance bottlenecks in RAG systems, such as database queries, vector search, and document processing.
Evaluate and optimize RAG system architectures for scalability and efficiency.
Tune vector databases for optimal recall and latency.
Infrastructure Optimization:
Collaborate with infrastructure teams to optimize hardware and software configurations for AI workloads.
Evaluate and recommend new technologies and tools for performance monitoring and analysis.
Develop and maintain performance dashboards and reports to track key metrics.
Optimize GPU utilization and memory management for LLM inference.
Collaboration and Communication:
Work closely with AI researchers, software engineers, and product managers to ensure performance requirements are met.
Communicate performance findings and recommendations to stakeholders at all levels.
Stay up to date with the latest developments in Generative AI and performance engineering.
Qualifications:
Education:
Bachelor's degree in Computer Science, Engineering, or a related field (Master's preferred).
Experience:
10+ years of experience in performance engineering, with a focus on large-scale distributed systems.
2+ years of experience working with AI/ML technologies.
Proven experience in performance testing, profiling, and analysis of complex software systems.
Deep understanding of NLP architectures, training, and inference.
Experience with vector databases and search technologies.
Experience with cloud computing platforms (e.g., AWS, Azure, Google Cloud Platform) and containerization technologies (e.g., Docker, Kubernetes).
Strong programming skills in Python.
Experience with performance analysis tools (e.g., profilers, debuggers, monitoring tools).
Skills:
Strong analytical and problem-solving skills.
Excellent communication and collaboration skills.
Ability to work in a fast-paced and dynamic environment.
Passion for AI and a desire to push the boundaries of performance engineering.
- Frequent Internal Hackathons: Engage in dynamic competitions with exciting prizes to keep your skills sharp.
- Cultural Celebrations: Strengthen our familial bonds through shared celebrations, fostering a sense of community.
- Diverse Project Exposure: Work on a variety of projects across sectors like Healthcare, Banking, e-commerce, and Retail, collaborating with leading global brands.
- Centre of Excellence (COE): Benefit from technical guidance and upskilling opportunities provided by our team of technology experts, helping you navigate your career path.
- E-Learning Platform: Gain access to comprehensive e-learning platforms coupled with a robust mentorship program to enhance your skills.
- Open Door Policy: Embrace a culture of mutual support, respect, and open dialogue, promoting a collaborative work environment.