We are expanding our efforts into complementary data technologies for decision support, operating a big data platform while supporting fast development of intelligent applications. Our interests lie in enabling, developing, deploying, and monitoring these big data platforms and intelligent applications using containers.
To that end, this role will engage with team counterparts to deploy and operate big data platform technologies for intelligent and other analytical applications. Deployment activities include identifying and enabling opportunities to automate fast integration and deployment of applications. Operation activities include monitoring and troubleshooting incidents, enabling security policies, and managing data storage and compute resources. The role is also responsible for coding, testing, and documenting new or modified automation for deployment and monitoring. This role works with team counterparts to architect an end-to-end framework built on a group of core data technologies, and it helps develop standards and processes for big data platforms in support of projects and initiatives.
- Manage Hadoop and Spark cluster environments on bare-metal and container infrastructure, including service allocation and configuration, capacity planning, performance tuning, and ongoing monitoring.
- Work with data engineering groups to support the deployment of Hadoop and Spark jobs.
- Work with IT Operations and Information Security Operations on monitoring and troubleshooting incidents to maintain service levels.
- Work with Information Security Vulnerability Management and vendors to remediate known, impactful vulnerabilities.
- Contribute to the evolving distributed systems architecture to meet changing requirements for scaling, reliability, performance, manageability, and cost.
- Report utilization and performance metrics to user communities.
- Contribute to planning and implementing new and upgraded hardware and software releases.
- Monitor the Linux, Hadoop, and Spark communities and vendors, and report important defects, feature changes, and enhancements to the team.
- Research and recommend innovative and, where possible, automated approaches to administration tasks. Identify ways to improve resource utilization, provide economies of scale, and simplify support.
Reports to: AVP Emerging Data Technology
- Excellent knowledge of Linux, AIX, or other Unix flavors
- Deep understanding of Hadoop and Spark cluster security, network connectivity, and I/O throughput, along with other factors that affect distributed system performance
- Strong working knowledge of disaster recovery, incident management, and security best practices
- Working knowledge of containers (e.g., Docker) and major orchestrators (e.g., Mesos, Kubernetes, Docker Datacenter)
- Working knowledge of automation tools (e.g., Puppet, Chef, Ansible)
- Working knowledge of software defined networking
- Working knowledge of parcel-based upgrades with Hadoop (e.g., Cloudera)
- Working knowledge of hardening Hadoop with Kerberos, TLS, and HDFS encryption
- Ability to quickly perform critical analysis and use creative approaches for solving complex problems
- Excellent written and verbal communication skills
- 5+ years of hands-on experience supporting Linux production environments
- 3+ years of hands-on experience supporting Hadoop and/or Spark ecosystem technologies in production
- 3+ years of hands-on experience scripting with Bash, Perl, Ruby, or Python
- 2+ years of hands-on development/administration experience with Kafka, HBase, Solr, and Hue
- Experience with networking infrastructure, including VLANs and firewalls
- Proven track record with Red Hat Enterprise Linux administration
- Proven track record with Cloudera Distribution of Hadoop administration
- Proven track record with troubleshooting YARN jobs
- Proven track record with HBase administration, including tuning
- Proven track record with Apache Spark development and/or administration
- Experience with BlueData administration
Work is primarily in a climate-controlled environment. The role is mostly stationary, with occasional travel between nearby DFW office locations to visit business partner customers and attend meetings. Occasional handling and lifting of computer and/or networking equipment is involved, as is occasional travel to conferences or training for professional development.