Job ID: T5880 - Lead Analytics Platform Engineer (Enterprise Infrastructure)
PLEASE NOTE: This is a 9-month contract-to-hire position and must meet Client full-time conversion policies. Candidates who require work-permit sponsorship now or at any time in the future (e.g., H-1B, OPT, CPT) do not meet Client requirements for this opening.
Position Summary: The Technical Engineer Lead serves as the senior technical consultant and guide for the Advanced Analytics Engineering team within Enterprise Technology Solutions (ETS) at Travelers Insurance. This role is responsible for the management, upgrade, and support of the enterprise analytical platform serving 500+ analytics users across all Travelers lines of business. The position combines deep technical expertise in multi-platform analytics infrastructure with team leadership, business stakeholder engagement, and strategic technology planning. The role also encompasses organizational responsibilities including Disaster Recovery (DR) coordination, production pipeline signoff governance, and AI tool advocacy to modernize team workflows.
Core Platform Responsibilities:
1. Miniforge Python "Condaplus" - Linux, macOS, Windows, and AWS
o Manage and maintain the Condaplus Python distribution across all enterprise platforms. This includes version upgrades and rollouts (desktop, Linux server, Windows server, macOS), coordinating security vulnerability remediation across devices, managing Conda virtual environments for application deployments, integrating with Nexus IQ firewall for automated security scanning of packages, and supporting AWS-based analytics environments including EC2 AMIs. Condaplus serves as the foundation for Python-based analytics, data science workflows, and internal tool development across the organization.
2. SAS - Linux and Windows Across 9 Physical Servers
o Oversee the SAS analytics platform including SAS 9.4 on RHEL 8 Linux servers and SAS Enterprise Guide on Windows. Responsibilities include planning and executing major SAS version upgrades (e.g., SAS 9.4M8, SAS Enterprise Guide 8.5), managing SAS/ACCESS client connectivity to enterprise databases (Teradata, DB2, Oracle) via ODBC configurations, coordinating user migration communications and timelines, troubleshooting SAS server performance and connectivity issues, and managing SAS web services and IIS integration for business-facing applications. This includes maintaining SAS environments across multiple lines of business including Claims, BI, and PI.
3. R Tools - Windows and 5 Physical Servers
o Manage the R analytics ecosystem, including RStudio Server (transitioned from the commercial edition to the open-source edition, saving $66K+ annually), R version upgrades across platforms (Linux, Windows Desktop, AWS AMIs), Rtools for Windows package compilation, CRAN repository management and Nexus proxy configuration, and GitHub Copilot integration with RStudio Server for development workflows. Ensure R environments are consistent and accessible for the analytics user community.
Software Products & Technologies Managed:
The following is a comprehensive inventory of the software products, platforms, and technologies that this role is responsible for managing, supporting, and maintaining across the enterprise.
Analytics Platforms
o SAS 9.4 (M8) - Enterprise analytics platform on RHEL 8 Linux (Claims, BI, PI environments)
o SAS Enterprise Guide 8.5 - Windows desktop analytics client for 500+ users
o SAS/ACCESS Clients - Database connectivity modules for Teradata, DB2, Oracle, and other data sources
o SAS Web Services / SAS Mid-Tier - Web-based SAS application delivery via IIS integration
o SAS Stored Processes - Server-side SAS programs invoked via web or application interfaces
o RStudio Server (Open Source) - R development environment on Linux servers
o R (multiple versions: 3.6.3, 4.4.3) - Statistical computing across Linux, Windows Desktop, and AWS AMIs
o Rtools - Windows-based R package compilation toolchain
o Condaplus (Miniforge Python) - Custom Python distribution on Linux, Windows, macOS, and AWS
o IBM SPSS - Statistical analysis software for business analytics users
o PyCharm Professional - Python IDE for analytics development
o KNIME Analytics Platform (Desktop) - Visual data science and analytics workflow tool
Infrastructure & Operating Systems
o RHEL 8 (Red Hat Enterprise Linux) - Primary Linux platform across 14+ servers
o Ubuntu 22.04 - AWS EC2 instances and GPU AMI environments
o Windows Server - SAS and R platform hosting, IIS web services
o GPFS (General Parallel File System) - High-performance shared storage (/gpfs2/PI_SharedData)
o IIS (Internet Information Services) - SAS web application hosting on Windows servers
AWS Cloud Services
o AWS EC2 - Elastic compute instances for analytics workloads and ML environments
o AWS S3 - Object storage for data lakes with security governance
o AWS Athena - Serverless SQL query service
o AWS AMI Management - Custom analytics AMIs (Cloudex) with R, Condaplus, and encrypted EBS
o AWS Tagging Compliance - ICS-V2 Cloud Tagging Standard enforcement (AppId, SystemNumber, Owner, CapabilityId)
o Amazon Bedrock - Generative AI service (enabled for UConn project)
o Terraform Enterprise (TFE) - Infrastructure as Code for AWS resource provisioning
o Predictive Modeling Competition Infrastructure - Annual competition environment: 33 EC2 instances, dedicated GitLab server, S3 security governance (31 teams, 100+ participants)
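For candidates unfamiliar with tagging compliance work, the kind of check involved might look like the minimal sketch below. It is illustrative only: the four tag keys come from the list above, but the function name and the rule that blank values count as missing are assumptions, not a description of the Client's actual tooling.

```python
# Tag keys named in the ICS-V2 Cloud Tagging Standard entry above.
REQUIRED_TAGS = ("AppId", "SystemNumber", "Owner", "CapabilityId")


def missing_required_tags(tags):
    """Return the required tag keys that are absent or blank in `tags`.

    `tags` is a plain dict of tag key -> value. Treating blank values as
    missing is an assumption made for this sketch.
    """
    missing = []
    for key in REQUIRED_TAGS:
        value = tags.get(key)
        if value is None or not str(value).strip():
            missing.append(key)
    return missing


if __name__ == "__main__":
    # Hypothetical resource tags, for illustration only.
    example = {"AppId": "APP-0001", "Owner": "analytics-eng"}
    print(missing_required_tags(example))
```

A check like this could run against the tags of each resource before provisioning, with any non-empty result blocking the rollout.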
Authentication & Security
o Quest Authentication Services (VAS) - Active Directory integration for Linux/SAS via PAM
o SSSD (System Security Services Daemon) - Open-source AD integration (Ubuntu/RHEL)
o Kerberos - Authentication protocol for AD-integrated Linux environments
o PAM (Pluggable Authentication Modules) - Linux authentication framework for SAS, SSH, and application access
o Nexus IQ / Nexus Repository - Package security scanning and repository proxy for Python, R, and Conda packages
o IIQ (IdentityIQ) REST API - Automated EC2 access provisioning and AD group registration
o SSL/TLS Certificate Management - Server certificate lifecycle management
Monitoring, Alerting & Operations
o RAMMON (Resource and Memory Monitoring) - Custom web-based server monitoring dashboard for 14+ servers
o Connect:Direct - File transfer monitoring for RHEL 8 production job protection
o OOM (Out of Memory) Protection Scripts - Automated memory management and process governance
o Custom Email Alerting System - Proactive incident notification via smtpxfer.prodlb.travp.net
o Server Load Analysis Tools - CPU/RAM heatmap analysis for capacity planning
o Filesystem Health Check Scripts - Automated monitoring of GPFS and local filesystems
Internal Web Applications (Flask/Python)
o AA Engineering Portal - Central SSO-authenticated hub for all team applications
o Team Accomplishments Tracker - Monthly/quarterly achievement tracking with email reports and Excel import/export
o Documentation Inventory - Searchable documentation reference system (migrated from SharePoint Excel)
o Server Monitoring Dashboard (RAMMON) - Real-time RAM, CPU, and process monitoring across all servers
Development & Version Control
o Git / GitHub Enterprise - Version control and repository management for team code and configurations
o GitHub Copilot - AI-assisted code development integrated with RStudio Server and development workflows
o Nginx - Reverse proxy and web server for Flask application routing and SSL termination
o Gunicorn - Python WSGI HTTP server for production Flask deployments
o systemd - Linux service management for all deployed applications
Project Management & Collaboration
o ServiceNow - Incident management, user stories, and Scrumban workflow adoption
o SharePoint - Documentation hosting and team collaboration
o Agile/Scrumban - Sprint planning using ServiceNow Strategic Planning Workspace
Key Responsibilities:
Technical Leadership & Team Guidance
o Serve as the senior technical consultant to the Advanced Analytics Engineering team, providing guidance on architecture decisions, troubleshooting complex infrastructure issues, and setting technical direction for platform evolution. Consult with and mentor team members on best practices for server administration, deployment automation, and cross-platform integration. Responsible for the overall technical output and delivery of the engineering team.
Infrastructure Monitoring & Reliability
o Design and maintain comprehensive server monitoring and alerting solutions across 14+ Linux servers supporting analytics workloads. This includes RAM and CPU usage monitoring dashboards (RAMMON), automated memory management and OOM protection scripts, filesystem health checks, proactive email alerting (which reduced incidents/outages by 90%), and server load analysis for capacity planning conversations with lines of business.
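As a rough illustration of the memory-monitoring work described above, the sketch below computes memory usage from the contents of a Linux /proc/meminfo file and decides whether to alert. It is a minimal example under stated assumptions: the function names and the 90% default threshold are hypothetical and do not describe how RAMMON or the existing alerting scripts are built. The MemTotal and MemAvailable fields are standard on current Linux kernels.

```python
def memory_used_percent(meminfo_text):
    """Return used memory as a percentage, given the text of /proc/meminfo.

    Usage is computed as 100 * (1 - MemAvailable / MemTotal).
    """
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])  # values are reported in kB
    total = fields["MemTotal"]
    available = fields["MemAvailable"]
    return 100.0 * (1 - available / total)


def should_alert(used_percent, threshold=90.0):
    """Return True when usage meets or exceeds the (assumed) threshold."""
    return used_percent >= threshold


if __name__ == "__main__":
    # Reads the live file on a Linux host; illustrative usage only.
    with open("/proc/meminfo") as handle:
        used = memory_used_percent(handle.read())
    print(f"memory used: {used:.1f}% alert={should_alert(used)}")
```

In practice a check like this would feed an alerting step such as an email notification, but that wiring is site-specific and is left out here.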
Disaster Recovery Coordination
o Serve as the DR Coordinator for the Advanced Analytics organization. Ensure business continuity through proactive monitoring, documented recovery procedures, and regular DR testing. Coordinate with infrastructure teams and business stakeholders to maintain recovery readiness across all analytics platforms.
Production Pipeline Governance
o Serve as the designated pipeline signoff authority for the organization, maintaining rigorous testing and validation protocols for all production deployments. Ensure zero-impact rollouts of database drivers (e.g., Teradata Vantage), platform upgrades (e.g., SAS 9.4M8, SAS EG 8.5), and security patches (e.g., Condaplus 3.0 across 93 devices) through structured testing, communication plans, and staged deployment strategies.