Dallas, TX
Job Description:
>> Design, implement, and maintain ML pipelines for training, testing, and deploying AI/ML models.
>> Manage and optimize cloud-based ML infrastructure (Google Cloud Platform Vertex AI, AWS SageMaker, or equivalent).
>> Implement CI/CD pipelines for ML and AI-driven applications.
>> Monitor, troubleshoot, and optimize model performance and system reliability.
>> Automate workflows for data ingestion, model training, deployment, and monitoring.
>> Collaborate with cross-functional teams to ensure secure, scalable, and compliant ML operations.
>> Apply MLOps best practices for reproducibility, versioning, and governance of ML models.
Requirements:
>> 5+ years of experience in DevOps, CloudOps, or ML
>> 5+ years of experience with Google Cloud Platform AI/ML services (Vertex AI, AI Platform, BigQuery ML) or AWS ML services (SageMaker, etc.)
>> 5+ years of experience with containerization and orchestration (Docker, Kubernetes)
>> Proficiency in infrastructure-as-code (Terraform, CloudFormation, or Deployment Manager)
>> Familiarity with CI/CD pipelines (Jenkins, GitHub Actions, GitLab CI, or ArgoCD)
>> Strong programming skills in Python, Bash, or Go, with experience in ML frameworks (TensorFlow, PyTorch, Scikit-learn)