Sr. Spark / Scala Developer

Matlen Silver
Contract W2, Contract

Job Description


Job Details:
We are looking for a Senior Spark/Scala Developer who has hands-on experience working with the Hadoop ecosystem.

Roles and Responsibilities:
  • Develop end-to-end data pipelines using Spark, Hive, and Impala
  • Write Spark jobs to fetch large volumes of data from source systems
  • Understand business needs, analyze functional specifications, and map them to the design and development of Apache Spark programs and algorithms.
  • Optimize Spark code, Impala queries, and Hive partitioning strategy for better scalability, reliability, and performance.
  • Work with leading BI technologies such as MicroStrategy (MSTR) and Tableau over the Hadoop ecosystem through ODBC/JDBC connections
  • Work on Hive performance optimizations such as using the distributed cache for small datasets, partitioning, bucketing, and map-side joins.
  • Build machine learning algorithms using Spark.
  • Design and deploy enterprise-wide scalable operations
  • Wrangle data and create workable datasets; work with file formats such as Parquet, ORC, and sequence files, and serialization formats such as Avro
  • Build applications using Maven and SBT, integrated with continuous integration servers such as Jenkins to build jobs.
  • Run Hadoop ecosystem tools and applications through Apache Hue
  • Feasibility analysis (for the deliverables): evaluate the feasibility of requirements against complexity and timelines.
  • Tune the performance of Impala queries
  • Document operational problems, following standards and procedures, using the JIRA software reporting tool
  • Install, configure, and use Hadoop components such as Spark, Spark Job Server, Spark Thrift Server, Phoenix on HBase, Flume, and Sqoop
  • Expertise in shell scripts, cron automation, and regular expressions
  • Coordinate development, integration, and production deployments.
  • Use REST services to access HBase data and use that data for further processing in downstream systems
  • Good experience debugging issues using Hadoop and Spark log files
  • Prepare technical specifications, analyze functional specs, and develop and maintain code
  • Create mapping documents to outline data flow from source to target.
  • Migrate data from legacy RDBMS databases to the Hadoop ecosystem
  • Use Cloudera Manager, an end-to-end tool for managing Hadoop operations in a Cloudera cluster
  • Create various database objects like tables, views, functions, and triggers using SQL
  • Experience with Spark and Spark SQL
  • Must have hands-on experience in Java, Spark, Scala, Akka, Hive, Maven/SBT, and Amazon S3
  • Experience in Kafka and REST services is a plus.
  • Experience in Hadoop, HBase, MongoDB, or other NoSQL platforms
  • Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
  • Knowledge of Sqoop and Flume preferred
  • Excellent communication skills with both technical and business audiences
  • Experience in Apache Phoenix and text search (Solr, Elasticsearch, CloudSearch)

Education: Bachelor's degree in Computer Science plus 4 years of experience, or MS in Computer Science with 2 years of experience

Required Technical Skills
  • 3+ years strong native SQL skills
  • 3+ years strong experience in database and data warehousing concepts and techniques. Must understand: relational and dimensional modeling, star/snowflake schema design, BI, Data Warehouse operating environments and related technologies, ETL, MDM, and data governance practices.
  • 3+ years experience working in Linux
  • 3+ years experience with Spark
  • 3+ years experience with Scala
  • 1+ years experience with Hadoop, Hive, Impala, HBase, and related technologies
  • 1+ years strong experience with low-latency (near-real-time) systems and working with TB data sets, loading and processing billions of records per day
  • 1+ years experience with MapReduce/YARN
  • 1+ years experience with Lambda architectures
  • 1+ years experience with MPP, shared-nothing database systems, and NoSQL systems
  • Ability to work in a fast-paced, team-oriented environment
  • Ability to complete the full lifecycle of software development and deliver on time
  • Ability to work with end-users to gather requirements and convert them to working documents
  • Strong interpersonal skills, including a positive, solution-oriented attitude
  • Must be passionate, flexible, and innovative in using the available tools, their own experience, and any other resources to deliver effectively, and with continued success, against very challenging and constantly changing business requirements
  • Must be able to interface with various solution/business areas to understand the requirements and prepare documentation to support development
  • Healthcare and/or reference data experience is a plus
  • A willingness and ability to travel
  • Right to work in the recruiting country.

Company Information

Matlen Silver is the hardest working staffing team in the U.S. We do what we know is right for consultants and companies, creating a unique and powerful recruiting and talent experience. We don’t just say we’re hard-working. We are. We don’t just invest in great people, we invest in people with guts, who don’t stand alone with integrity, but together as one united front. Our core is a powerhouse that can’t be described but should be experienced.

Dice Id : matlennj
Position Id : 100999535219292
Originally Posted : 2 years ago
