Senior AWS AgentCore Platform Engineer

Hybrid in Reading, PA, US • Posted 7 hours ago • Updated 7 hours ago
Contract Independent
Contract W2
Hybrid
$50 - $60/hr

Job Details

Skills

  • AWS CloudWatch
  • X-Ray
  • Bedrock logging
  • MCP servers
  • vector databases
  • Attribute-Based Access Control (ABAC)
  • AgentCore
  • AWS-native

Summary

Role: Senior AWS AgentCore Platform Engineer (Client: )

Position Type: Contract-to-hire (C2H), with conversion to hire after the initial 6 months


Location: Reading, PA or Exton, PA (hybrid, 2-3 days a week in the office)


Job Description:

1. Observability & Distributed Tracing
Gap Analysis: Assess AWS CloudWatch, X-Ray, Bedrock logging, and AgentCore traces against agentic workflow requirements; produce a comprehensive gap analysis and lead the setup of observability within Dynatrace.

Validation Pipelines: Design and implement post-deployment validation pipelines for agents and Model Context Protocol (MCP) servers, ensuring deployment health and successful tool registration.

Tracing & Logging: Implement distributed tracing and structured logging to capture LLM decision logic, tool selections, sub-agent calls, and MCP interactions.

Architecture Strategy: Evaluate LangFuse and LiteLLM proxies against AWS-native solutions; deliver a target-state observability architecture recommendation.

2. Cost Tracking & TCO (Total Cost of Ownership)
Taxonomy Expansion: Extend tagging taxonomy to capture costs across agent runtimes, MCP servers, vector databases, and Bedrock token consumption per namespace.

Cost Modeling: Design a granular cost visibility model to aggregate expenses for agents, MCPs, and LLM tokens by team and department.

Dashboards & Alerting: Build CloudWatch (or equivalent) dashboards for per-team spending; configure AWS Budgets with proactive alerting thresholds.

Automation: Automate cost reporting via email and Microsoft Teams, incorporating anomaly detection rules to identify spend spikes.

3. Monitoring & Incident Management
Alerting Framework: Define and implement P1-P4 alerting rules covering deployment failures, runtime errors, tool invocation failures, and MCP connectivity issues.

Incident Integration: Integrate alert notifications with Microsoft Teams and email, utilizing resource ownership tags for intelligent routing.

Operational Excellence: Author detailed runbooks for every alert; publish and maintain these in Confluence to facilitate developer self-service resolution.

Stack Evaluation: Compare AWS-native vs. third-party monitoring stacks to deliver a long-term recommendation aligned with the broader observability architecture.

4. Security & Governance
Risk Assessment: Evaluate current IAM and tagging strategies for multi-team isolation; identify scalability gaps and potential security risks.

Policy Engines: Assess the Cedar policy engine (AgentCore) for fine-grained tool access control and document gaps for enterprise-scale deployment.

Identity Architecture: Design a scalable Attribute-Based Access Control (ABAC) identity model to ensure multi-team isolation without IAM policy sprawl; deliver production-ready Terraform modules.

Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.
  • Dice Id: RTX1c3151
  • Position Id: 8964504