Senior Cloud Reliability Engineer - Azure


Work from home
Mauldin, SC
Full Time



Job Details

Address: USA-SC-Mauldin-211 BiLo Boulevard
Store Code: Greenville Data Center - IT (5118640)

Retail Business Services, ranked No. 25 on Fast Company's 2022 100 Best Workplaces for Innovators, is the services company of leading grocery retail group Ahold Delhaize USA, which includes Food Lion, Giant Food, The GIANT Company, Hannaford and Stop & Shop.

Primary Purpose:
The Platform Reliability Engineer will help ensure service availability by identifying and automating manual processes and bridging the gaps between product development teams and operations. Implementing operational improvements in availability, latency, performance, efficiency, change management, monitoring, incident response, patch management, and capacity planning are all within scope for this role. Whether it's done through code, the introduction of modern tools, or better processes, continuous improvement and efficiency are the goal.
You'll provide operational excellence through strong troubleshooting skills and ownership in supporting various Azure services.

Duties and Responsibilities:
• Builds, manages, and operates Azure Core Services with automation and infrastructure as code
• Manages and operates the continuous delivery framework and tools, manages and automates the lifecycle of the different cloud platform components, and helps support product teams
• Leverage cloud architecture, applying site reliability principles and full-stack troubleshooting skills across network, application, security, identity, OS, container, on-prem, and distributed services layers.
• Reason about system and application architecture, and be comfortable reading code and offering feedback on how it can be improved to increase reliability.
• Identify opportunities and drive the implementation of automation to improve patch management, service health, manageability, reliability, and telemetry.
• Own, triage, investigate and resolve service issues with an emphasis on broad communications, learning & teaching throughout the process
• Design process or technology solutions that monitor, identify, and resolve platform, system, deployment, and environmental issues both before and after production releases, and ensure measurable improvements against service KPIs.
• Drive security and compliance aspects for services in accordance with Azure compliance requirements.
• Engage in service capacity planning and demand forecasting, and work towards Azure cost optimization.
• Create and document runbooks, operational procedures, and standards on Confluence
• Communicate on a deeply technical level with product engineering, project management and product teams to improve and optimize products, improve infrastructure, and evolve services.
• Work within project management/Agile Scrum teams in a support role as part of a wider team
• Remain current on new technologies, methods and procedures including, but not limited to, coding practices such as Test Driven Development, Continuous Integration, Continuous Deployment and Operational excellence

Qualifications:
• Bachelor's Degree in Computer Science, Information Technology, Engineering, or related field
• 7+ years of IT experience focused on infrastructure, including server, storage, network, security, and identity
• 3+ years of experience supporting, maintaining, and automating Azure environments
• 2+ years of experience using IaC tools (ARM, Terraform, JSON, YAML, PowerShell, GitHub, etc.)
• Production experience in cloud technologies: Azure IaaS, PaaS, networking, Azure Functions, Azure Automation and runbooks, Workbooks, Insights, Security Center, Azure Monitor, Log Analytics.
• Ability to read, write, configure, design, and script end-to-end service telemetry, alerting and self-healing capabilities for platform services
• Ability to work in an Extreme Programming environment and in a pair programming/operating model
• Able to facilitate diverse teams, multi-task, and work under pressure to meet aggressive schedule targets
• Hands-on experience with IaC tools like ADO, ARM, Terraform, Ansible, PowerShell, Python, Azure CLI, and GitHub
• Experience in service capacity planning, demand forecasting, software performance analysis and system tuning
• Technical and operational expertise in Windows/Linux/VMware/Hyper-V/AKS, SQL and NoSQL DBs, IaaS, PaaS, FaaS, Data, BCDR, Security, Management, Storage, Networking, Monitoring, Identity, and Connectivity
• Experience managing and maintaining code repos, build systems, and CI/CD pipelines
• Experience in infrastructure and configuration as code, as well as service auto-scale capabilities.
• Experience working in DevOps and Agile environments, with a blend of development and SRE mindsets
• Systematic problem-solving and troubleshooting skills coupled with a strong sense of ownership and drive.
• Participate in the on-call rotation. Participate, collaborate, and provide guidance in retrospectives.
• At least 4 years of hands-on operational experience supporting the following or related experience:
o Azure Virtual Network, VWAN, ExpressRoute, Load Balancer (L4/L7), Traffic Manager, CDN, Azure DNS, routing and routing protocols like BGP, firewall concepts
o Azure Identity including any of the following: Azure AD, PIM, Conditional Access, MFA, Azure AD Connect, passwordless sign-in, Microsoft Defender, Key Vault
o Azure Governance, Security, Monitoring, Workbooks, Compliance, and cost awareness
o Azure Virtual Machines, Containers and/or Kubernetes and/or OpenShift (infrastructure perspective)
o Azure Storage Account, Disk, Snapshot, Backup, Site Recovery, File Sync, Data Lake

Preferred Qualifications:
• Certification in Azure Administrator, Azure DevOps (preferred), Azure Solutions Architect (preferred)

Retail Business Services currently provides services to five omnichannel grocery brands, including Food Lion, Giant Food, The GIANT Company, Hannaford and Stop & Shop. Retail Business Services leverages the scale of the local brands to drive synergies and provide industry-leading expertise, insights and analytics to local brands to support their strategies. We are committed to diversity, equity and inclusion and we foster a community of belonging where everyone is valued.

Retail Business Services is an equal opportunity employer. We comply with all applicable federal, state and local laws. Qualified applicants are considered without regard to sex, race, color, ancestry, national origin, citizenship status, religion, age, marital status (including civil unions), military service, veteran status, pregnancy (including childbirth and related medical conditions), genetic information, sexual orientation, gender identity, legally recognized disability, domestic violence victim status or any other characteristic protected by law. We provide reasonable accommodations to applicants and employees with disabilities. As important as what we do is how we do it. Our team embodies our values of Courage, Care, Teamwork, Integrity and Humor in everything that they do. We have a culture of care that values and celebrates the qualities and perspectives that make us all unique.

If you have a disability and require assistance in the application process, please contact our Talent Acquisition Department at .

For more information, visit .

Job Requisition: 306575_external_USA-SC-Mauldin_6152023

About Retail Business Services