- Big data
- Bitbucket, GitHub
Experience with the following big data technologies: Netezza, Hadoop (HDFS), Python, Spark SQL, and MapReduce with PySpark.
Hands-on experience building CI/CD pipelines is required.
Outstanding coding, debugging, and analytical skills; core problem-solving skills; and experience with data migration from different databases.
Good knowledge of using Spark APIs to cleanse, explore, aggregate, transform, and store data.
Ability to analyze available data and potential solutions, eliminate unsuitable ones, and select an optimal solution.
Experience with distributed processing and storage frameworks, and with RDD and DataFrame operations, including different actions and transformations.
Experience with UDFs, lambda functions, pandas, and NumPy.
Experience in installing, configuring, debugging, and troubleshooting Hadoop clusters is a desired skill.
Knowledge of deploying and managing ETL pipelines and of RDBMS technologies (PostgreSQL, MySQL, Oracle, etc.).
Knowledge of modern DevOps toolchains.
Knowledge of DataFrames, pandas, data visualization tools, and data mining.
Knowledge of statistical and machine learning models and business intelligence.
Knowledge of JIRA, Bitbucket, GitHub, and scheduling tools such as Automic and Airflow.