AWS Cloud Engineer

Overview

Remote
Depends on Experience
Full Time

Skills

AWS
Lambda
SageMaker
Bedrock
Python
Go
Machine Learning
Data Science

Job Details

We are unable to sponsor H-1B or other visas.

DUTIES AND RESPONSIBILITIES:
Design, develop, and maintain modular AI services on AWS using Lambda, SageMaker, Bedrock, S3, and related components built for scale, governance, and cost-efficiency.
Lead the end-to-end development of RAG pipelines that connect internal datasets (e.g., logs, S3 docs, structured records) to inference endpoints using vector embeddings.
Design and fine-tune LLM-based applications, including Retrieval-Augmented Generation (RAG) using LangChain and other frameworks.
Tune retrieval performance using semantic search techniques, proper metadata handling, and effective prompt construction patterns.
Collaborate with internal stakeholders to understand business goals and translate them into secure, scalable AI systems.
Own the software release lifecycle, including CI/CD pipelines, GitHub-based SDLC, and infrastructure as code (Terraform).
Support the development and evolution of reusable platform components for AI/ML operations.
Create and maintain technical documentation for the team to reference and share with our internal customers.
Communicate clearly with the team and internal customers, both verbally and in writing, in English.

REQUIRED KNOWLEDGE, SKILLS, AND ABILITIES:

10+ years of proven software engineering experience with a strong focus on Python, plus Go and/or Node.js.
Demonstrated contributions to open-source AI/ML/Cloud projects, with either merged pull requests or public repos showing real usage (forks, stars, or clones).
Direct, hands-on development of RAG, semantic search, or LLM-augmented applications using frameworks and ML tooling such as Transformers, PyTorch, TensorFlow, and LangChain, not just experimentation in a notebook.
Ph.D. in AI/ML/Data Science and/or named inventor on pending or granted patents in machine learning or artificial intelligence.
Deep expertise with AWS services, especially Bedrock, SageMaker, ECS, and Lambda.
Proven experience fine-tuning large language models, building datasets, and deploying ML models to production.
Demonstrated success delivering production-ready software with release pipeline integration.
NICE-TO-HAVES:
Policy-as-Code development (e.g., HashiCorp Sentinel for Terraform) to manage and automate cloud policies, ensuring compliance.
Experience optimizing cost-performance in AI systems (FinOps mindset).
Awareness of data privacy and compliance best practices (e.g., PII handling, secure model deployment).
Demonstrated experience with AWS Organizations and policy guardrails (SCPs, AWS Config).
