Senior AWS Platform Administrator

Overview

Remote
$110,000 - $140,000
Full Time
No Travel Required

Skills

Amazon Web Services
Cloud Computing
Disaster Recovery
Emerging Technologies
High Availability
Problem Solving
Scalability
Ubuntu
Performance Monitoring
Security
Design
IoT
Network Design
Linux Administration

Job Details

This role is responsible for the operation and evolution of Prism, our IoT software platform, and the systems that support it. This involves planning, design, monitoring, management, and troubleshooting of the infrastructure that hosts our customers' large-scale IoT applications, ensuring availability, scalability, security, and performance. Success in this role means our platform consistently meets its 99.999% uptime goal, incidents are resolved within SLA requirements, and the infrastructure scales economically with customer growth.

It merges traditional network and infrastructure engineering and management knowledge with a focus on IoT applications that carry a five-nines availability target. Subject domains include the core network fabric, both cloud and on-prem; Linux servers; software such as Postgres, Redis/Valkey, RabbitMQ, Node.js, and others; common IoT protocols and technologies such as MQTT, CoAP, WebSockets, and LoRaWAN; and monitoring and management tools such as Ansible and Zabbix, along with log aggregation and analysis.

Summary

  • Manage, monitor, and troubleshoot the Prism platform and infrastructure, and the hosted customer applications it supports.
  • Manage, plan, and deploy software and security patches and updates.
  • Analyze, recommend, and implement infrastructure changes and enhancements based on evolving application requirements, updated technologies, and cost optimization.
  • Review and recommend security measures and strategies.
  • Design and deploy sandbox environments to test and evaluate new software and simulate failure scenarios.
  • Collaborate with project managers and developers to plan, test, and implement new platform capabilities to support new customer applications.
  • Collaborate with software engineers and developers to troubleshoot system issues and optimize application performance and availability.
  • Provide Tier 3 troubleshooting of customer application questions and issues.
  • Utilize a variety of tools for monitoring, automation, and capacity planning.
  • This role is 100% remote.

Key Responsibilities

  • Own the uptime and operation of the Prism platform and its supporting infrastructure and services.
  • Ensure that Prism and the customer applications it hosts are running optimally.
  • Troubleshoot issues to determine their root cause and implement short- and long-term solutions to minimize future occurrences.
  • Monitor Prism and the customer applications it hosts to understand how to improve their reliability, availability, security, and performance.
  • Monitor network performance and troubleshoot connectivity issues to ensure minimal downtime and high availability.
  • Recommend, design, test, and implement changes, upgrades, and enhancements.
  • Serve as a knowledge resource on capabilities and limitations of the platform for engineers, developers, project managers, and customers.
  • Design, implement, and regularly test backup, failover, and disaster recovery procedures for all critical systems and data.
  • Provide Tier 3 support to internal stakeholders and customers to assist in troubleshooting reported anomalies.
  • Collaborate with software engineers, software developers, device integrators, and product teams to align platform functionality with application requirements.
  • Design and conduct simulations, feasibility studies, and capacity planning to support application scaling.
  • Monitor and ensure the security and integrity of the platform through encryption, device authentication, segmentation, intrusion detection, and other security measures.
  • Define and track reliability metrics (uptime, SLOs/SLAs, MTTR, etc.) and report on platform performance against these targets.
  • Research, evaluate, and recommend emerging technologies and tools to continuously improve the platform and infrastructure.
  • Identify, design, and implement ways to improve processes, increase efficiency, and automate manual processes.
  • Maintain network documentation, diagrams, and configuration details.
  • Be part of the on-call rotation for after-hours monitoring and incident response.

Qualifications

  • Bachelor's or Master's degree in Computer Science, Information Technology, or related field, or equivalent experience.
  • Direct experience (typically 5+ years) with network infrastructure and systems management, with a focus on IoT or similar environments.
  • Direct experience (typically 5+ years) with AWS.
  • Solid understanding of networking protocols, services, and technologies (TCP/IP, MQTT, CoAP, HTTP, AMQP, DNS, DHCP, load balancers, LoRaWAN, etc.).
  • Hands-on experience with network hardware and cloud services: configuration, installation, monitoring, management, and troubleshooting.
  • Experience with and/or knowledge of security frameworks (SOC2, ISO27001, PCI DSS).
  • Excellent problem-solving skills, proven ability to collaborate in cross-functional teams, and the ability to communicate effectively with developers, project managers, and customers.

Required Experience

These are some of the key domains and technologies you'll work with every day. You should have expert-level or high-level working knowledge of these or closely related alternatives.

  • AWS (VPC, EC2, S3, CloudFront, SQS, SES, Route53, etc.)
  • Ubuntu Linux (Administration, shell scripting, monitoring, management, etc.)
  • Postgres Database Clusters
  • Redis/Valkey Clusters
  • RabbitMQ Clusters
  • eMQTT/eMQX Clusters
  • Security Frameworks (SOC2, ISO27001, PCI DSS)
  • Backup and Disaster Recovery Planning
  • Integration with third-party systems/services via APIs

Beneficial Experience

These are some of the other domains and technologies we currently use or may use in the future. Familiarity with or exposure to these will be beneficial to you in this role.

  • Ansible (or similar)
  • Zabbix (or similar)
  • Log Aggregation (Graylog, ELK, or similar)
  • CI/CD Pipeline and Release Management
  • Security Compliance and Audits
  • Docker/Swarm and/or Portainer
  • etcd Distributed Key-Value Cluster
  • Google Apps
  • Wireshark and tcpdump
  • Experience with scripting in JavaScript, Python, etc.
  • Familiarity with other Security Frameworks (OWASP, NIST, etc.)
  • Familiarity with Azure or Google Cloud Services

About ObjectSpectrum (For the Bold, Curious, and Driven)

Are you someone who thrives on action, challenges the status quo, and loves to dive into new technology? At ObjectSpectrum, this is exactly the energy we look for. We build powerful, reliable Internet of Things (IoT) solutions with Prism, our platform where creativity, curiosity, and a get-it-done spirit shape every project.

If you see problems as puzzles to solve, love tinkering with hardware and software, and see learning as a lifelong adventure, you'll fit right in. We're a remote-first, collaborative team of engineers, developers, and designers who believe progress comes from honest feedback, relentless curiosity, and the courage to try, test, and improve new solutions.

What drives us:

  • Bias for action: We value self-starters and innovators who take initiative and aren't afraid to experiment with new ideas or tech.
  • Challengers welcome: Your voice matters. Challenge how things are done, suggest better ways, and help us raise the bar.
  • Integrity & honesty: Open, transparent communication and mutual trust are how we build and deliver great products.
  • Growth mindset: Don't just bring your skills; bring your curiosity and your hunger for learning. We support your growth with education opportunities, mentorship from seasoned engineers, and the freedom to explore new technology.
  • Real impact: Whether you're crafting a new solution or directly shaping customer experiences, your contribution will make a difference.

Beyond the paycheck:

  • Flexible schedules and flexible PTO: work in the way that fuels your creativity.
  • Competitive pay, profit sharing, and comprehensive health plans (Medical, Dental, Vision, Life, HSA).
  • Support for your home office, plus budgets for conferences and continuous learning.
  • A culture where diversity of thought and background is genuinely valued; we know innovation comes from many perspectives.

If you love to push boundaries and test innovative solutions, and are eager to make an impact in the IoT space with smart, kind teammates, ObjectSpectrum is where you'll thrive. Join us and let's build something amazing together.
