Platform Engineer - Enterprise Monitoring

Overview

On Site
USD 85,000.00 - 111,000.00 per year
Full Time

Skills

Bloomberg
Wholesale
Retail
Roadmaps
Scalability
Management
High-level Design
Technical Drafting
IBM Tivoli Monitoring
Onboarding
Design Review
Testing
Debugging
UPS
Dashboard
Root Cause Analysis
Regulatory Compliance
Documentation
Incident Management
Hardware Development
Regression Analysis
Research
Regression Testing
Integration Testing
Operating Systems
Continuous Integration and Development
DevOps
Agile
SaaS
Dynatrace
Microsoft SCOM
Tivoli
HP
Network
Grafana
Data Visualization
Microsoft Power BI
Terraform
Scripting
MySQL
Windows PowerShell
Python
Cloud Computing
Google Cloud
Google Cloud Platform
Communication
Computer Networking
Analytical Skill
Scheduling
SAP BASIS
Red Hat Linux
Docker
Microsoft Azure
IaaS
PaaS
Orchestration
Jenkins
Apache Maven
Continuous Integration
Continuous Delivery
IBM WebSphere
Gmail
Privacy
Pharmacy
Health Care
Insurance
Life Insurance
Recruiting
Authorization
Employment Authorization

Job Details

Costco IT is responsible for the technical future of Costco Wholesale, the third-largest retailer in the world with wholesale operations in fourteen countries. Despite our size and explosive international expansion, we continue to provide a family-oriented, employee-centric atmosphere in which our employees thrive and succeed.

This is an environment unlike anything in the high-tech world and the secret of Costco's success is its culture. The value Costco puts on its employees is well documented in articles from a variety of publishers including Bloomberg and Forbes. Our employees and our members come FIRST. Costco is well known for its generosity and community service and has won many awards for its philanthropy. The company joins with its employees to take an active role in volunteering by sponsoring many opportunities to help others.

Come join the Costco Wholesale IT family. Costco IT is a dynamic, fast-paced environment, working through exciting transformation efforts. We are building the next-generation retail environment where you will be surrounded by dedicated and highly professional employees.

Platform Engineers translate high-level platform design into low-level technical design and are responsible for implementing, administering, supporting and patching their corresponding platforms. Platform Engineers work closely with Solution Architects to enable the capabilities defined on roadmaps and blueprints supporting platform programs and initiatives. Platform Engineers are well versed in modern data, infrastructure and integration platforms and in industry/technology best practices, and actively work on improving the reliability and scalability of infrastructure.

The Enterprise Monitoring (EM) team Platform Engineer will be responsible for the design, deployment and management of robust and scalable Monitoring and Logging platforms for the Enterprise. This role involves the execution, maintenance and delivery of new features/capabilities across multiple monitoring and observability solutions, to ensure end-to-end visibility into the health, performance and availability of applications and platforms. We are looking for a highly motivated, technically savvy, business-focused Platform Engineer who has the skills to work in a highly collaborative and fast-paced environment and is willing to adapt to multiple strategic priorities to produce high-quality deliverables.

*This position will be filled onsite in Issaquah, WA or Dallas, TX.

If you want to be a part of one of the BEST companies to work for worldwide, simply apply and let your career be reimagined.

ROLE

Designs, implements and maintains monitoring platforms across multi-cloud (Azure, Google Cloud Platform, etc.) and on-prem environments.

Manages and optimizes observability solutions (such as Dynatrace, Prometheus and OpenTelemetry) for end-to-end systems visibility.

Assesses technical components, translates high-level design into low-level technical design and executes updates based on SLAs.

Integrates logging solutions and ensures proper ingestion, parsing and indexing of logs across platforms.

Administers Enterprise Monitoring tools such as SCOM, IBM Tivoli Monitoring, HP Operations Agent, Network Node Manager (NNMi) and Dynatrace to ensure continuous availability and performance.

Develops automation scripts (e.g., PowerShell, Python) to streamline alerting, onboarding and configuration tasks.

Manages the platform on an ongoing basis while performing typical run functions like monitoring, patching and support.

Leads and conducts code reviews, design reviews, testing, and debugging activities at the application level.

Conducts regular platform checkups and reports inefficiencies to relevant business and technology stakeholders.

Develops and instruments monitoring dashboards depicting platform health and performance.

Develops and builds scalable and generalized frameworks to support the integration of internal and third-party APIs.

Collaborates with DevOps, SRE, Cloud Engineering and Application teams to define and implement SLIs, SLOs, alerts, reports and dashboards.

Troubleshoots monitoring gaps and issues across platforms and provides root cause analysis and resolution guidance to teams.

Maintains platform upgrades, patching and configurations in line with compliance and security requirements.

Participates in the creation of documentation and artifacts used to describe the mechanisms used for deployment, monitoring, maintenance and best practices for platform usage, configuration and alert tuning.

Supports incident response, resolution and post-incident reviews for Production issues to identify failure points or performance degradation.

Performs coding tests to validate hardware design correctness, and creates software regression tests to ensure its reliability.

Conducts research and makes recommendations on standards, products, and services.

Tests diagnostics, including automated regression testing.

Develops and executes integration testing plans (as needed).

Evaluates and documents all operating systems according to required standards.

Participates in the development of continuous integration/continuous development frameworks in support of DevOps and Agile practices.

REQUIRED

Experience with Cloud Platforms, SaaS Products, On-Prem solutions, and end-to-end development and delivery roles and responsibilities.

Hands-on experience with Enterprise Monitoring tools such as Dynatrace, Google Cloud Logging, SCOM, IBM Tivoli, HP Operations Agent, Network Node Manager (NNMi) or equivalent.

Working knowledge of Observability frameworks such as OpenTelemetry, Grafana, Prometheus, or equivalent.

Working knowledge of data visualization tools such as Microsoft Power BI, Google Looker, or equivalent.

Proficiency in IaC scripting using tools such as Terraform or equivalent.

Proficiency with scripting and query languages such as SQL (MySQL), PowerShell, Python, or equivalent.

Proficient with OS utilities.

Holds certifications in Cloud technologies such as Azure, Google Cloud Platform, or equivalent.

Excellent verbal and written communication skills.

Foundational networking knowledge.

Excellent analytical skills and ability to effectively troubleshoot and provide solutions.

Ability to work both independently and within a close team environment.

Scheduling flexibility to meet the needs of the business, including weekends, holidays, and 24/7 on call responsibilities on a rotational basis.

RECOMMENDED

Experience with Red Hat OpenShift, Docker/containers, MS Azure, IaaS or PaaS solutions.

Experience with CI/CD orchestration tools such as Jenkins, Maven, or similar.

Experience with WebSphere Platform and application deployments.

Proficient in Google Workspace applications, including Sheets, Docs, Slides, and Gmail.

Required Documents

Cover Letter

Resume

California applicants, please click here to review the Costco Applicant Privacy Notice.

Pay Ranges:

Level 1 - $85,000 - $111,000

Level 2 - $105,000 - $135,000

Level 3 - $130,000 - $160,000

Level SR - $150,000 - $190,000, Bonus and Restricted Stock Unit (RSU) eligible

Level Staff - $180,000 - $225,000, Bonus and Restricted Stock Unit (RSU) eligible

We offer a comprehensive package of benefits to eligible employees, including paid time off, health benefits (medical, dental, vision, hearing aid, pharmacy, behavioral health and employee assistance), health care reimbursement account, dependent care assistance plan, short-term and long-term disability insurance, AD&D insurance, life insurance, 401(k), and stock purchase plan.

Costco is committed to a diverse and inclusive workplace. Costco is an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or any other legally protected status. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please send a request to

If hired, you will be required to provide proof of authorization to work in the United States. In some cases, applicants and employees for selected positions will not be sponsored for work authorization, including, but not limited to, H-1B visas.

Employers have access to artificial intelligence language tools ("AI") that help generate and enhance job descriptions, and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.