Overview
Full Time
Skills
Python
SQL
Big Data
Shell Scripting
Database
Linux
PL/SQL
Oracle
DevOps
Continuous Integration/Delivery
Scripting
Splunk
Change Management
Object Oriented Programming
Networking
Amazon Web Services
GCP
Metrics
Hadoop
Jenkins
Subversion
Puppet
Chef
GitHub
GitLab
Docker
Kubernetes
Terraform
Application Deployment
Software Configuration
Fault-Tolerant
IaaS
Infrastructure Management
Streaming
Software Life Cycle
Performance Analysis
Forecasting
Packer
Welding
Inventory
Capacity Planning
Chef (All)
Switch Capacity
Job Details
Flink Administrator
Location: Dallas, TX
Who are we looking for?
As a Big Data Administrator, you will help maintain and administer on-premises and cloud-based big data platforms. You will help with platform setup, automation, maintaining the knowledge base and runbooks, troubleshooting, restoring service on the platform, and providing support.
Your responsibilities:
- Build and support on-premises Hadoop, Flink (Cloudera Streaming Analytics) platform infrastructure and applications.
- Deploy Flink-based applications on the platform and use configuration management tools (such as Ansible, SaltStack, etc.) to manage them.
- Deploy software to improve the availability, scalability, and efficiency of the platform.
- Facilitate capacity planning and demand forecasting, software performance analysis, and system tuning.
- Understand the end-to-end configuration, technical dependencies, and overall behavioral characteristics of production services.
- Partner with development teams in defining and implementing improvements.
- Propose solutions related to server hardware and software configuration, networking, standard internet services, scripting languages, cloud computing patterns, technology security and compliance.
- Troubleshoot priority incidents and facilitate blameless post-mortems.
- Perform analytics on previous incidents and usage patterns to better predict issues and take proactive actions.
- Work with development teams throughout the software life cycle ensuring sustainable software releases.
- Lead and participate in tests; identify bottlenecks, opportunities for optimization, and capacity demands.
- Participate in the 24x7 support coverage as needed.
- Measure and optimize service performance.
- Build tooling to enable observability services and automate CI/CD pipelines.
- Provide technical escalation and contribute to the on-call rotation.
- Automate system monitoring to ensure uptime on production systems.
- Troubleshoot end-to-end on private or public cloud infrastructure.
- Monitor infrastructure and report on all performance metrics.
Technical Skills:
- 10+ years of experience in Hadoop and YARN infrastructure management and application deployment
- Hands-on experience maintaining and supporting Flink/Spark streaming applications in Hadoop/cloud environments.
- 4+ years of experience in DevOps and Shell Scripting
- Site reliability engineering (SRE) background with strong experience in monitoring, troubleshooting, and support.
- Support rapid development and engineering productivity via release engineering, CI/CD & IaC automation, and build tools.
- Perform health checks on applications and infrastructure to identify and pre-empt issues (verification, alerts, etc.).
- Experience with Python, including object-oriented programming.
- Working experience with Splunk, including log inventory and creating dashboards for various streams such as Linux
- Experience with Ansible, Puppet, SaltStack
- Container administration and development utilizing Kubernetes, Docker, Mesos, or similar.
- Infrastructure automation through Terraform, Chef, Ansible, Puppet, Packer or similar.
- Experience with Cloud Orchestration frameworks, development and SRE support of these systems.
- Experience with CI/CD pipelines, including VCS (Git, SVN, etc.), GitLab Runners, Jenkins, Rundeck
- Oracle Database knowledge (ATP, ADW) and programming in SQL and PL/SQL
- Cloud network experience
- Experience with Linux
- Experience working with fault-tolerant, highly available, high-throughput, distributed, scalable systems.
- Integration with Code Deploy / GitHub Actions
- Experience with IaaS tooling such as CFT and Terraform
Nice to have:
- Experience with Kubernetes or another container orchestration framework.
- Experience with public cloud-based solutions such as Azure, Google Cloud Platform, or AWS