Gen AI - Platform Cloud Engineer


Transcend IT Solutions
Job Details
Skills
- GenAI
- Large Language Models (LLMs)
- AI/ML Infrastructure
- NLP
- Platform Engineering
- Cloud Architecture
- Hybrid Cloud
- Azure
- Google Cloud Platform (GCP)
- Google Kubernetes Engine (GKE)
- Kubernetes
- Docker
- Terraform
- Ansible
- Infrastructure as Code (IaC)
- High Performance Computing (HPC)
- GPU Computing
- Cluster Computing
- Grid Computing
- Object Storage
- High Performance Storage
- Elasticsearch
- BigQuery
- Python
- REST APIs
- API Development
- API Gateway
- Virtual Private Cloud (VPC)
- Identity and Access Management (IAM)
- Landing Zone
- Azure App Service Environment
- Azure Blueprints
- Control Plane
- CI/CD
- DevOps
- Site Reliability Engineering (SRE)
- Apache Spark
- Data Processing
- Data Virtualization
- Dremio
- TIBCO Data Virtualization
- Microservices
- Load Balancing
- F5 Load Balancer
- Cloud Security
- Capacity Planning
- Infrastructure Automation
- Platform Resiliency
- Fault Tolerance
- Scalability
- Performance Optimization
- Agile
- Scrum
- Git
- Linux
- Cloud Certifications (Azure, GCP, Kubernetes)
Summary
- Lead and design the platform and infrastructure architecture for AI/ML and NLP in a modern hybrid cloud environment.
- Participate in daily standups for infrastructure and platform scrums, provide architectural solutions for public and private cloud, and resolve technical blockers for the team.
- Work closely with the engineering strategy, platform engineering, and development teams to understand infrastructure requirements and drive all aspects of AI/ML and GenAI solution delivery to the cloud platforms.
- Lead teams within DTI and across the enterprise in researching and proposing highly available, resilient, and fault-tolerant infrastructure solutions for AI/ML and GenAI workloads.
- Define and drive infrastructure and platform roadmaps that align with technology and business strategy. Perform hardware and capacity planning, analysis, and forecasts for your portfolio of applications with focus on highest availability, scalability, performance, and timely delivery.
- Provide SME guidance around infrastructure and platform for product and business partners.
- Research industry best practices, evaluate new technologies, develop standards and engineering best practices and recommend innovative solutions that support automation and improve platform resiliency and fault tolerance of critical applications.
Minimum Requirements
- 3+ years of experience leading the design and implementation of grid/cluster computing infrastructure with CPUs and GPUs supporting AI/ML and NLP workloads.
- 3+ years of experience with Azure and/or Google Cloud Platform/GKE, as well as experience building complex infrastructure programmatically with IaC tools (Terraform, Ansible, etc.).
- 1+ years of experience designing solutions and working with high-performance storage technologies including Object Storage.
- 2+ years of experience working with and supporting infrastructure for high-throughput, low-latency high-performance computing (HPC).
- 2+ years of experience with Elasticsearch.
- 1+ years of experience working with big data (BigQuery).
- Working knowledge and understanding of developing APIs using Python.
- Excellent understanding and working knowledge of cloud computing concepts such as Virtual Private Cloud (VPC), landing zones, Identity and Access Management (IAM), App Service Environment, Blueprints, and Control Plane.
- Excellent verbal, written, and interpersonal communication skills. Ability to articulate technical solutions to both technical and business audiences.
- Recent and demonstrated ability to influence management on technical or business solutions.
- Experience with CI/CD, DevOps concepts and SRE principles.
Preferred Skills
- 1+ years of experience with LLMs and Generative AI (developing capabilities or dev/ops).
- Experience in developing APIs on Google Cloud Platform/Azure/API Gateways
- Experience with data processing technology (Apache Spark etc.)
- Experience with data virtualization technology (TIBCO DV, Dremio, etc.)
- Understanding of Agile practices and ability to work with Agile teams to define and track user stories.
- Experience designing and implementing complex F5 or other load balancer technologies.
- Knowledge and understanding of cloud computing, PaaS design principles, microservices, and Kubernetes (k8s) containers.
- Cloud certifications (Kubernetes, Google Cloud Platform, and Azure) preferred.
Dice Id: 91140092
Position Id: TISG8272324
Posted 19 hours ago
Company Info
About Transcend IT Solutions
Our business has grown on the strength of long-lasting, loyal relationships with our clients, partners, and candidates, who are the foundation of Transcend IT Solutions' position as a market leader.
Our team has an outstanding track record of success in project staffing, project development and management, and consulting services. With firsthand experience in these areas, we have a sound understanding of our clients' needs.
Transcend IT Solutions provides assistance with staffing for all Information Technology roles. Providing the best-fit person of the highest caliber is our recruiting team’s goal. Our success is based on longevity of client-candidate relationships. Whether it is an individual team member or an entire team, Transcend IT Solutions will assist you with your staffing needs.
Our consulting services cover all aspects of Information Technology. Our consultants provide expertise in project management, requirements gathering and management, analysis and design, construction, quality control and assurance, and system deployment. In addition to working as individual contributors, they provide mentoring, process improvement expertise, and reviews and audits of existing projects.

