Observability Engineer

  • Charlotte, NC
  • Posted 14 hours ago | Updated 4 hours ago

Overview

Remote
On Site
$70.00 - $80.00
Full Time

Skills

Observability
Monitoring
Site Reliability
Infrastructure
Performance Optimization

Job Details

Client - Financial Services


Title - Observability Engineer (Lead)


Location - Charlotte, NC OR Dallas, TX OR Baltimore, MD OR Salt Lake City, UT (Hybrid)


Type - Contract to hire


Day-to-Day Responsibilities:



  • Identify gaps in logging, observability & performance measurement for critical transactions. Help establish new instrumentation so we have more visibility into performance.

  • Review end-to-end performance of critical transactions to identify patterns, trends, bottlenecks & risks

  • Make recommendations to improve performance of critical transactions and optimize capacity. Collaborate with cross-functional teams to align on improvements, prioritize focus areas and implement them.

  • Troubleshoot issues in production and lower environments to identify their causes and brainstorm options to resolve them

  • Develop and execute a comprehensive capacity plan for select technology services

  • Assist the organization in conducting load & performance testing on critical systems to identify potential bottlenecks

  • Develop and maintain documentation related to capacity planning and performance testing

  • Provide leadership to establish a performance and capacity management practice


Qualifications



  • Prior enterprise-level experience

  • Proven ability to enable change and lead IT efforts

  • Bachelor's degree in Computer Science, Information Systems, Mathematics or a related field

  • 10+ years of experience with multiple types of technology and various operating systems including cloud computing environments and virtualization technologies

  • 5+ years of experience writing code and using command line interfaces, including experience writing SQL statements and performing system administration functions

  • 5+ years of experience evaluating transactional performance and system capacity issues & limitations

  • Experience building instrumentation for logging, observability & monitoring to provide visibility into transaction performance and bottlenecks

  • Experience aggregating and analyzing data to draw conclusions about performance risks & issues

  • Experience using a combination of monitoring tools and system logs to troubleshoot incidents, identify the cause and determine how to resolve them

  • Experience forecasting how much capacity is needed to accommodate potential future increases in traffic

  • Experience recommending and implementing improvements that address transactional performance and capacity challenges

  • Excellent communication skills and ability to summarize complex technical details into a story executives can understand

  • Ability to work collaboratively in a team environment

  • Ability to thrive in a fast-paced, dynamic environment
