Job Details
Role: Systems Performance Engineer
Location: O'Fallon, Missouri (hybrid; Wednesday and Thursday in office)
Duration: Contract
Job Description
A DevOps/Cloud Engineer with development experience, or a strong Lead Software Engineer who has handled performance testing, would be a good fit.
Location: Hybrid (Wednesday and Thursday in office), O'Fallon, MO
High Level Project Description: Working with application development and infrastructure teams to monitor the performance of the system applications.
Interviews: Two or three virtual interviews if the candidate isn't local; one or two in-person interviews if the candidate is local.
The two main skills are Spring and experience in a high-transaction industry. This individual will need a strong understanding of Java/Spring Boot development and must be able to dive deep into the code to resolve performance issues and troubleshoot across the entire application in a cloud environment.
Communication skills should be very strong.
Systems Performance Engineer - Lead SE
Summary:
Responsible for identifying and resolving end-to-end performance bottlenecks across distributed systems, Spring Boot services, middleware components, and hybrid cloud environments (private cloud + AWS). This role goes far beyond traditional testing: it involves deep analysis of container orchestration, networking paths, and system interactions under load. The engineer maps full system workflows, sets realistic latency budgets, and ensures each component meets its SLOs. Ideal candidates have extensive experience with high-scale, multi-region, high-transaction platforms (e.g., financial systems, payment processing, or large enterprise SaaS) running in a cloud environment.
Key Responsibilities
- Define service-level objectives (SLOs), performance budgets, and latency/throughput targets across services.
- Architect and champion comprehensive distributed tracing strategies (Dynatrace, AWS X-Ray, etc.).
- Analyze application, platform, and cloud behavior using deep-dive techniques such as heap dumps, thread dumps, flame graphs, logs, network traces, and storage I/O profiling.
- Review service and system architectures for performance risks (e.g., synchronous hops, excessive dependencies, misconfigured connection pools, poor cache placement).
- Conduct and lead root-cause analysis for performance incidents in production and pre-production environments.
- Develop capacity models and performance baselines for services running across cloud environments.
Requirements
Areas of Expertise
- Application Layer: Spring Boot internals, JVM tuning, thread/heap management, concurrency debugging, optimization
- Container Runtime: PCF, Docker, container resource limits, CPU throttling, memory pressure
- Orchestrators: PCF, Kubernetes, ECS (autoscaling, pod health, scheduling issues)
- Networking: Service-to-service hops, TLS overhead, DNS, routing, load balancer configs (F5, Nginx, ALB/NLB), service mesh performance
- Storage: Latency, IOPS constraints, distributed file system behavior
- Caching & Middleware: Redis, Hazelcast, NATS, Kafka, RabbitMQ configuration and throughput tuning
- Databases: Connection pool tuning, slow queries, indexing, replication lag
- Cloud Layer: AWS compute/storage/network performance, regional latency, cross-cloud traffic patterns