Full-Time Position
Strong scripting skills in Python, including Pandas (data analysis library)
Experience with SQL and NoSQL databases
Big Data exposure: Hadoop, large-scale data processing, real-time streaming, and batch processing
Machine learning experience: Spark ML, scikit-learn, TensorFlow, NLP, etc.
8+ years of solid industrial application development experience, with at least 2 years of development expertise in the big data ecosystem (Kafka, ZooKeeper, ELK, Apache Spark, Hadoop cluster/HDFS, etc.)
Good knowledge of different database technologies, such as Redis (in-memory) and OpenTSDB (distributed time-series)
Solid performance-tuning, configuration, and development experience with Kafka producers/consumers and Avro schemas
Solid configuration and development experience with StreamSets, ELK, Logstash, and RabbitMQ
Solid configuration and application development experience with Docker containers
Solid working experience configuring SSL/TLS and LDAP security for StreamSets, ELK, and Kafka/ZooKeeper, as well as implementing encryption for data at rest
Advanced Ansible automation development and test automation skills
Experience with UI visualization tools such as Grafana
Experience with Linux environments, including OS installation, IPv4/IPv6 networking, and disk/file system configuration
Experience with virtualization environments such as hypervisors, VMware ESXi, and OpenStack
Knowledge of network management, syslog, SNMP MIBs, etc.
Must be a self-starter and fast learner, able to complete tasks with high quality, on time, under minimal supervision
Must be flexible and willing to work with global project team members across different time zones
Nice to have:
Experience with UCS setup and network configuration
Knowledge of machine learning algorithms
Experience with Spark MLlib