Job Details
Location: Remote (IST), with 3-4 hrs of overlap with US Eastern Time (ET)
Company Description:
Mechanized AI modernizes legacy systems into secure, scalable, production-grade platforms. We partner with enterprises to deliver AI-driven products that meet performance, reliability, and cost targets. We value engineers who solve real customer problems and own outcomes end-to-end.
Job Summary:
We are seeking an experienced Backend Engineer to build and operate AI products across AWS/Azure/Google Cloud Platform using Python, Flask, and containerized services on ECS. Your mandate: make things work end-to-end, quickly, from API to DB to network, prioritizing real-world throughput and latency over ceremony.
Key Responsibilities (debugging-first, e2e outcomes):
- Own e2e flows: API → auth → service → queue/cache → DB → network → deployment; deliver working software fast.
- Debug quickly: isolate problems across app, DB, infra, and network layers; produce clear RCAs and durable fixes.
- Spin-up speed (TTFD): bootstrap a new Flask-on-ECS service with CI/CD, secrets, logs, metrics, health checks, and a runbook within 1 business day.
- Golden signals & SRE hygiene: define and track targets (MTTD 15 min, MTTR 2 hrs for Sev2), alert routing, and error budgets; prevent regressions with tests and deployment guards.
- Networking you can trust: VPC/subnets/NACLs/SGs, ALB/NLB listeners/target groups/health checks, Route 53/DNS, TLS/mTLS, NAT/VPC endpoints/PrivateLink, ECS awsvpc port mappings.
- Performance in practice: reduce tail latency (p95/p99), eliminate head-of-line blocking; tune queries, caching, connection pools, and gunicorn worker models.
- Observability for debugging: high-signal logs, correlation IDs, metrics, and traces to cut MTTR; dashboards and actionable alerts.
- Ship & operate: IaC (Pulumi/Terraform), CI/CD (AWS CodePipeline or equivalents), safe rollouts/rollbacks; on-call with runbooks and post
Required Skills & Experience:
- Bachelor's degree in CS/SE or related field; 6 years as a Backend Engineer.
- Cloud: build on AWS or Azure; containers on ECS (Fargate or EC2).
- IaC: Pulumi/Terraform (2 yrs); CI/CD with AWS CodePipeline or equivalents.
- Python + Flask: blueprints, dependency injection, config management; WSGI and gunicorn worker tuning; can stand up a minimal service in hours.
- Debugging expert: `pdb/ipdb`, `py-spy`, `cProfile`, `tracemalloc`, flamegraphs; log/trace correlation; binary search through systems; proven MTTR 2 hrs for Sev2-class incidents.
- Networking expert: VPC/subnets/route tables/NACLs/SGs, ALB/NLB, TLS/mTLS, Route 53, NAT, VPC endpoints/PrivateLink; ECS service discovery and port mapping; can explain 502/504 causes and fixes.
- Auth & security: Auth0/Cognito, OAuth2/OIDC, secret management, least-privilege IAM.
- Data: SQL (Postgres/MySQL) and one NoSQL store; schema design, indexing, query tuning, connection pooling.
- Proven experience building production products with ECS, Flask, and Python; collaborates well with FE/ML/Data.
- Experience with concurrent and parallel workloads using Celery, multiprocessing, or equivalents.
- 2+ years of experience with both serverful and serverless architectures.
- 2+ years building complex ECS tasks on AWS.
- Follows TDD & BDD best practices for development.
Preferred Qualifications:
- Security experience (threat modeling, IAM reviews, remediation).
- Strong AWS knowledge; certification (e.g., SAA) is a plus.
- Packet-level debugging or SRE background; React experience is a plus (not required).