Job Details
Work Location: New York (on-site)
Industry: Media & Entertainment / Large-Scale Event Readiness
Role and responsibilities:
Role Overview
We are seeking a Google AI Architect with deep technical expertise in AI/ML systems on Google Cloud Platform (GCP) to lead code-level reviews, model optimization, and performance tuning for a high-concurrency, real-time conversational application. The role requires hands-on experience with Vertex AI, unstructured data handling, and latency-sensitive AI workloads. The architect will collaborate with infrastructure and database specialists to ensure end-to-end system efficiency.
Key Responsibilities
* Conduct deep-dive reviews of the AI/ML codebase, including model inference, data pipelines, and response generation logic.
* Optimize AI components for performance, scalability, and responsiveness, especially under high concurrency.
* Work with the Infrastructure Specialist to align AI workloads with Cloud Run, Cloud Armor, and Model Armor configurations.
* Work with the AlloyDB Specialist to ensure efficient data access patterns and caching strategies using AlloyDB and Redis.
* Identify and resolve latency bottlenecks in AI workflows, including preprocessing, model execution, and post-processing.
* Provide architectural guidance on unstructured data handling, retrieval-augmented generation (RAG), and real-time inference.
* Document optimization strategies and share best practices with engineering teams.
Required Skills & Experience
* 8+ years of experience in AI/ML architecture and solution engineering.
* Strong expertise in Google Cloud Platform, including Vertex AI and Cloud Run.
* Experience with Google Cloud Platform infrastructure and database components, including Cloud Armor, Model Armor, AlloyDB, and Redis.
* Proven track record in building and optimizing real-time AI systems with low-latency requirements.
* Experience with unstructured data, NLP pipelines, and conversational AI architectures.
Preferred Qualifications
* Google Cloud Certified: Professional Machine Learning Engineer or Professional Cloud Architect.
* Experience with post-code-freeze optimization and production readiness reviews.
* Familiarity with caching, query acceleration, and AI security best practices.