Client Reference Code:
Spectrum Reach (www.spectrumreach.com) grows businesses of all sizes with custom, multiscreen advertising solutions, backed by the power of TV, data, innovation, community experts, and unforgettable creative.
JOB SUMMARY
Responsible for maintaining scalable, reliable, consistent, and repeatable systems that support data operations for Data Science by gathering and processing raw data at scale. Profiles data to measure quality, integrity, accuracy, and completeness, and delivers solutions by developing, testing, and implementing code and scripts.
MAJOR DUTIES AND RESPONSIBILITIES
Actively and consistently supports all efforts to simplify and enhance the customer experience.
Create and maintain scalable, reliable, consistent and repeatable systems that support data operations for Data Science.
Gather and process raw data at scale (including writing scripts, web scraping, calling APIs, writing SQL queries, etc.).
Implement ETL processes using Big Data tools and frameworks to maintain, improve, clean, and manipulate data.
Profile data to measure quality, integrity, accuracy, and completeness.
Develop and implement tools, scripts, queries, and applications for ETL/ELT and data operations.
Design, build, and automate ML data pipelines.
Deliver solutions by developing, testing, and implementing code and scripts.
Produce reports and uphold data delivery schedules.
Manage life cycle of multiple data sources.
Work closely with stakeholders on the data demand side (analysts and data scientists).
Increase speed to delivery by implementing workload/workflow automation solutions.
Perform other duties as assigned.
REQUIRED QUALIFICATIONS
Skills/Abilities and Knowledge
Strong coding/scripting experience with Python, Perl, and shell scripts
Strong experience with SQL, Tableau, ML pipeline techniques, and ETL techniques
Strong background in Linux/Unix/CentOS installation and administration; Windows experience preferred
Expertise in data storage that demonstrates knowledge of when to use a file system, relational database, or NoSQL variant
Strong experience with AWS architecture, Spark, and Hadoop/Hive
Ability to identify and resolve end-to-end performance, network, server, and platform issues
Attention to detail with the ability to effectively prioritize and execute multiple tasks
Education
Bachelor's degree in an engineering discipline or computer science
Related Work Experience
Hands-on working experience with RDBMS, SQL, scripting, and coding (3+ yrs)
Linux/Unix/CentOS system admin (5+ yrs)
PREFERRED QUALIFICATIONS
Skills/Abilities and Knowledge
Familiarity with data workflow/data prep platforms, such as Alteryx, Pentaho, or KNIME
Familiarity with automation/configuration management tools, such as Airflow, Puppet, Chef, or an equivalent
Knowledge of best practices and IT operations in an always-up, always-available service
Experience receiving, converting, and cleansing big data
Experience with visualization or BI tools, such as Tableau, Zoomdata, MicroStrategy, or Microsoft Power BI
Experience creating proof-of-concept experiments for analytics, machine learning, or visualization tools, including hypotheses, test plans, and outcome analysis
Extensive leadership experience in advanced operational analytics
Windows experience
WORKING CONDITIONS
Travel depending upon project needs
Job Code: EGN321 Data Engineer Exempt