12+ months
Backfill Role
Senior Linux Platform Engineer (CPaaS Infrastructure)
Overview
We are seeking a hands-on Senior Linux Platform Engineer to architect, build, automate, and operate the infrastructure powering a large-scale CPaaS voice platform. This role focuses heavily on Linux systems engineering, infrastructure automation, networking, observability, and scalable cloud infrastructure.
You will help design and maintain the software supply chain and infrastructure lifecycle that transforms source code into production-ready artifacts and deploys them across cloud and customer-hosted environments.
This is a highly technical engineering role focused on reliability, scalability, automation, and performance optimization within high-traffic production environments.
Responsibilities
Linux Platform Engineering
· Administer and optimize enterprise Linux environments including RHEL, Rocky Linux, and AlmaLinux
· Tune Linux systems for high-performance networking workloads
· Write and maintain systemd unit files, startup scripts, and operating system configurations
· Troubleshoot production issues across infrastructure, networking, and application layers
Infrastructure Automation
· Design and maintain Infrastructure-as-Code using Terraform
· Provision and manage cloud infrastructure including networking, compute, security groups, routing, and load balancers
· Build reusable and modular Terraform components
· Manage Terraform state in collaborative team environments
Configuration Management
· Develop and maintain Ansible roles and playbooks
· Automate provisioning and configuration of bare-metal and virtual machine environments
· Ensure automation is repeatable, idempotent, and production-ready
Release Engineering & Packaging
· Build and maintain RPM packages and private repositories
· Manage package dependencies, versioning, and deployment processes
· Create reproducible build environments using Docker or similar tooling
· Support CI/CD pipeline automation and deployment workflows
Cloud & Network Architecture
· Design and support cloud infrastructure across AWS and/or Oracle Cloud Infrastructure (OCI)
· Configure VPCs (VCNs in OCI), subnets, routing, NAT gateways, and Layer 4 load balancers
· Support high-throughput TCP and UDP workloads
· Assist with multi-cloud connectivity and hybrid networking solutions
Observability & Reliability
· Build and maintain monitoring and observability solutions using Grafana, Prometheus, OpenTelemetry, and related tooling
· Monitor infrastructure health, performance, and availability
· Develop operational dashboards, alerting, and troubleshooting runbooks
· Participate in production support and incident response activities
Documentation & Collaboration
· Produce architecture diagrams, operational documentation, and deployment procedures
· Partner with engineering, networking, and operations teams to improve platform reliability and scalability
· Contribute to technical standards and infrastructure best practices
Required Qualifications
· 8+ years of experience in Systems Engineering, DevOps, Infrastructure Engineering, or Site Reliability Engineering (SRE)
· Expert-level Linux administration experience in production environments
· Strong experience with Terraform and Infrastructure-as-Code practices
· Strong experience with Ansible or similar configuration management tools
· Experience supporting cloud infrastructure in AWS or OCI
· Strong networking fundamentals including TCP/IP, routing, NAT, firewalls, and load balancing
· Experience supporting high-traffic, production SaaS environments
· Proficiency with Bash scripting and at least one additional scripting/programming language such as Python or Go
· Experience with CI/CD pipelines, automation, and deployment tooling
· Strong troubleshooting and problem-solving skills across infrastructure and application stacks
· Experience with monitoring and observability platforms such as Grafana, Prometheus, or OpenTelemetry
· Strong written and verbal communication skills
Preferred Qualifications
· Experience with CPaaS, VoIP, SIP, RTP, or telecom infrastructure
· Familiarity with Jambonz or related voice platforms
· Experience with technologies such as FreeSWITCH, drachtio, rtpengine, Redis, MySQL, or Homer
· Experience managing Layer 4 load balancing for stateful TCP/UDP traffic
· Experience tuning Linux kernels for networking or real-time communications workloads
· Experience with RPM packaging and repository management
· Experience supporting multi-cloud networking between AWS and OCI
· Familiarity with voice quality and call performance metrics such as MOS (Mean Opinion Score), ASR (Answer-Seizure Ratio), and PDD (Post-Dial Delay)
What Success Looks Like
· Infrastructure is automated, scalable, and maintainable
· Deployments are reliable and repeatable
· Production systems remain stable under heavy load
· Monitoring and observability provide rapid insight into platform health
· Documentation and operational processes are clear and actionable
· Engineering teams can deploy and scale services confidently