This position is responsible for hands-on design, implementation, and development in the Hadoop framework using Spark/PySpark/Spark SQL, along with other Hadoop ecosystem components such as HDFS, Hive, Hue, Impala, and Zeppelin.
* 3-5 years' SQL-on-Hadoop development experience - Spark preferred
* 3-5 years' Python experience
* 2-3 years' experience with dimensional models and data warehousing
* 3-5 years' Hadoop experience - Cloudera preferred
* Bachelor's degree in Computer Science (or equivalent)
* Certified Apache Spark Developer is a plus
Duties and Responsibilities:
* Design and develop solutions around Apache Spark and the Hadoop framework.
* Make extensive use of DataFrames in Spark.
* Work with gigabytes to terabytes of data, and understand the challenges of transforming and enriching such large datasets.
* Provide effective strategic and tactical solutions to business problems.
* Develop new applications and data flows based on functional requirements.
* Collaborate with team members, project managers, business analysts and business users in conceptualizing, estimating and developing new solutions and enhancements.
* Read, extract, transform, stage and load data to multiple targets, including Hadoop and Oracle.
* Modify existing programs and code for new requirements and enhancements.
* Perform unit testing and debugging, and perform root cause analysis (RCA) for any failed processes.
* Migrate existing file processing from standalone or legacy technology scripts to Hadoop framework processing.
* Complete production programming, deployments and deliveries within defined SLAs.
* Convert business requirements into technical design specifications and execute on them.
* Execute new development as per design specifications and business rules/requirements.
* Participate in code reviews and keep applications/code base in sync with version control.
* Communicate effectively, stay self-motivated, and work independently while remaining fully aligned within a team environment.
Additional Skills that are a plus:
* Data cleaning/wrangling
* Data visualization and reporting
* Production support/troubleshooting
* Experience designing, developing, and implementing ETL/ELT
* Experience with performance tuning for large data sets
* Proven experience working with, processing and managing large data sets (multi-TB scale)
* Experience working with relational data - Oracle preferred
* Experience ensuring effective and efficient risk and compliance practices for data security - experience with PII and HIPAA data preferred
* Understanding of Kerberos, AD integration and networking.
* Knowledge of Java/J2EE & Web Services
* Oracle Data Integrator (ODI)
Conditions of Employment
All job offers are contingent upon successful completion of certain background checks, which, unless prohibited by applicable law, may include criminal history checks, employment verification, education verification, drug screens, credit checks, DMV checks (for driving positions only) and fingerprinting.
Great People Deserve Great Benefits
We know that we have some of the brightest and most talented associates in the world, and we believe in rewarding them accordingly. If you work here, expect competitive pay, comprehensive health coverage, and endless opportunities to advance your career. From tuition reimbursement to scholarship programs to employee stock purchase plans and 401(k)s, we offer associates a variety of benefits that work as hard for them as they work for us.
Epsilon® is an all-encompassing global marketing innovator. We provide unrivaled data intelligence and customer insights, world-class technology including loyalty, email and CRM platforms and data-driven creative, activation and execution. Epsilon's digital media arm, Conversant, is a leader in personalized digital advertising and insights through its proprietary technology and trove of consumer marketing data, delivering digital marketing with unprecedented scale, accuracy and reach through personalized media programs and through CJ Affiliate, one of the world's largest affiliate marketing networks. Together, we bring personalized marketing to consumers across offline and online channels, at moments of interest, that help drive business growth for brands. Recognized by Ad Age as the #1 World's Largest CRM/Direct Marketing Network, #1 Largest U.S. Agency from All Disciplines and #1 Largest U.S. Mobile Marketing Agency, Epsilon employs over 8,000 associates in 70 offices worldwide. Epsilon is an Alliance Data company. For more information, visit www.epsilon.com and follow us on Twitter @EpsilonMktg.
Alliance Data provides equal employment opportunities without regard to race, color, religion, gender, age, national origin, disability, sexual orientation, gender identity, veteran status or any other characteristic protected by law.
Alliance Data participates in E-Verify.
For San Francisco Bay Area:
Alliance Data will consider for employment qualified applicants with criminal histories in a manner consistent with the requirements of San Francisco Police Code Sections 4901 - 4919, commonly referred to as the San Francisco Fair Chance Ordinance.