Location: Concord, CA
Salary: $78.00 USD Hourly - $84.00 USD Hourly
Description: Software Engineer / Site Reliability Engineer (SRE)
Location: Concord, CA (1755 Grant St)
Work Model: Hybrid - 3 days onsite (Monday & Tuesday preferred)
Schedule: Start at 7:00 AM PT to coordinate with India-based teams
Employment Type: 12-month contract (with potential extension or conversion)
Line of Business: TCOO
Positions Available: 1
About the Role
In this contingent assignment, you will serve as a senior-level Software Engineer / Site Reliability Engineer responsible for designing and implementing end-to-end monitoring and observability for a high-visibility enterprise platform. This role focuses on strategy, systems reliability, and operational excellence rather than application development.
You will collaborate closely with engineering, networking, and infrastructure teams to identify system dependencies, define reliability thresholds, and build dashboards and alerts that provide insight into platform health, performance, and data flow. This role requires strong analytical skills, enterprise-scale experience, and the ability to work across teams to drive complex initiatives forward.
Responsibilities
- Consult on complex, large-scale Software Engineering and SRE initiatives with broad organizational impact.
- Design and implement end-to-end observability, monitoring, and alerting strategies across distributed systems.
- Analyze and resolve multi-faceted system reliability and performance challenges, including unprecedented or ambiguous scenarios.
- Build dashboards and alerts using enterprise monitoring tools to track system health, latency, and data flow.
- Identify system choke points, latency thresholds, and failure scenarios across application, storage, and network layers.
- Partner with internal networking, infrastructure, and operations teams to define monitoring requirements and ensure visibility into network traffic.
- Support and monitor third-party platforms hosted within the enterprise environment, ensuring alignment with internal reliability standards.
- Ensure compliance with internal policies, procedures, and regulatory requirements while meeting operational deliverables.
- Strategically collaborate with client stakeholders and provide consultative guidance on system reliability and observability best practices.
Minimum Qualifications
- 5+ years of experience in Software Engineering, Site Reliability Engineering, or a related technical field, or equivalent practical experience demonstrated through work, consulting, training, military service, or education.
- Experience with observability and monitoring tools such as Grafana, Splunk, AppDynamics, or ThousandEyes.
- Hands-on experience analyzing system and network performance in enterprise environments.
- Experience working with relational databases such as PostgreSQL or MySQL.
- Experience with object storage solutions such as Amazon S3 or NAS storage.
- Ability to collaborate across teams and clearly articulate technical requirements and solutions.
Preferred Qualifications
- Experience with OpenShift (OCP) and Kubernetes containerized platforms.
- Experience with enterprise-grade monitoring implementations.
- Familiarity with Skan.AI or similar third-party enterprise platforms.
- Understanding of distributed system architecture and data flow monitoring.
- Experience defining observability strategies for newly deployed or rapidly evolving platforms.
Additional Information
- This is a net-new role intended to add end-to-end SRE expertise to the team.
- The platform supported includes a third-party system hosted internally, with storage managed via S3 and integrations with Splunk and Grafana.
- The role will involve monitoring data movement between compute, storage, and network layers, including latency and throughput analysis.
- Occasional early or overnight meetings may be required to align with global teams; start times will not be earlier than 6:00 AM PT.
Supplier Expectations
- All resumes must be submitted through Beeline to be considered.
- No direct solicitation of resumes or communication with the hiring manager while the role is active.
By providing your phone number, you consent to: (1) receive automated text messages and calls from the Judge Group, Inc. and its affiliates (collectively "Judge") to such phone number regarding job opportunities, your job application, and for other related purposes. Message & data rates apply and message frequency may vary. Consistent with Judge's Privacy Policy, information obtained from your consent will not be shared with third parties for marketing/promotional purposes. Reply STOP to opt out of receiving telephone calls and text messages from Judge and HELP for help.
Contact: This job and many more are available through The Judge Group. Please apply with us today!