Flink Administrator

Overview

Full Time

Skills

Python

SQL

Big Data

Shell Scripting

Database

Linux

PL/SQL

Oracle

DevOps

Continuous Integration/Delivery

Scripting

Splunk

Change Management

Object Oriented Programming

Networking

Amazon Web Services

GCP

Metrics

Hadoop

Jenkins

Subversion

Puppet

Chef

GitHub

GitLab

Docker

Kubernetes

Terraform

Application Deployment

Software Configuration

Fault-Tolerant

IaaS

Infrastructure Management

Streaming

Software Life Cycle

Performance Analysis

Forecasting

Packer

Welding

Inventory

Capacity Planning

Chef (All)

Switch Capacity

Job Details

Flink Administrator
Location : Dallas TX

Who are we looking for?

As a Big Data Administrator, you will help maintain and administer on-premises and cloud-based big data platforms. You will help with platform setup, automation, maintaining the knowledge base and runbooks, troubleshooting, restoring service on the platform, and providing support.

Your responsibilities:

Build and support on-premises Hadoop, Flink (Cloudera Streaming Analytics) platform infrastructure and applications.
Deploy Flink-based applications on the platform and use configuration management tools (such as Ansible, SaltStack, etc.) to manage them.
Deploy software to improve the availability, scalability, and efficiency of the platform.
Facilitate capacity planning and demand forecasting, software performance analysis, and system tuning.
Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services.
Partner with development teams in defining and implementing improvements.
Propose solutions related to server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, technology security and compliance.
Troubleshoot priority incidents and facilitate blameless post-mortems.
Perform analytics on previous incidents and usage patterns to better predict issues and take proactive actions.
Work with development teams throughout the software life cycle ensuring sustainable software releases.
Lead and participate in tests; identify bottlenecks, opportunities for optimization, and capacity demands.
Participate in 24x7 support coverage as needed.
Measure and optimize service performance.
Build tooling to enable observability services and automate CI/CD pipelines.
Provide technical escalation and contribute to the on-call rotation.
Automate monitoring systems to ensure uptime of production systems.
Be able to troubleshoot end-to-end on private or public cloud infrastructure.
Monitor infrastructure and report on all performance metrics.

Technical Skills:

10+ years' experience in Hadoop and YARN infrastructure management and application deployment
Hands-on experience maintaining and supporting Flink/Spark streaming applications in Hadoop/cloud environments.
4+ years of experience in DevOps and Shell Scripting
Site reliability engineer (SRE) with strong experience in monitoring, troubleshooting, and support.
Support rapid development and engineering productivity via release engineering, CI/CD & IaC automation, and build tools.
Perform health checks on applications and infrastructure to identify and proactively pre-empt issues (verification, alerts, etc.).
Experience with Python, including object-oriented programming.
Working experience with Splunk for log inventory, creating dashboards, etc., for various streams such as Linux.
Experience with Ansible, Puppet, SaltStack
Container administration and development utilizing Kubernetes, Docker, Mesos, or similar.
Infrastructure automation through Terraform, Chef, Ansible, Puppet, Packer or similar.
Experience with Cloud Orchestration frameworks, development and SRE support of these systems.
Experience with CI/CD pipelines including VCS (Git, SVN, etc.), GitLab Runners, Jenkins, Rundeck
Oracle Database knowledge in ATP and ADW, and programming in SQL and PL/SQL
Cloud network experience
Experience with Linux
Experience working with fault-tolerant, highly available, high-throughput, distributed, scalable systems.
Integration with CodeDeploy / GitHub Actions
Experience with IaaS tools such as CFT and Terraform

Nice to have:

Experience with Kubernetes or another container orchestration framework.
Experience with public cloud-based solutions such as Azure, Google Cloud Platform, or AWS
