is a provider of wholesale and audit intelligence in the automotive industry. DataScan's solutions are the most comprehensive in the industry, providing banks, independent finance companies, and captive financial institutions with clarity critical to their success, helping them manage risk while increasing productivity and profits. All of our solutions are web-based and delivered from our state-of-the-art DataScan-managed data centers.
Job Description
The Data Engineer will build solutions to implement and operationalize a data lake, with full support for ingesting data from various sources. Additionally, the Data Engineer will implement security and governance controls within the data, manage a cataloging solution, and automate the transformation of data sets using state-of-the-art big-data technologies (Hive, Spark, etc.).
- Build and maintain one or more data lakes to support scalable ingestion, manipulation, and reporting of data.
- Manipulate data to produce and maintain new data elements using repeatable, automated processes.
- Combine and transform data into sets usable by others in the organization, such as data scientists and business users.
- Build new software from the ground up using industry-standard best practices.
- Experience working within the AWS ecosystem (AWS S3, Glue, EMR, Redshift, Athena, QuickSight) or comparable cloud environments (Azure).
- Experience extracting, ingesting, and transforming large data sets.
- Experience working with multiple kinds of databases (relational, document storage, key-value, time-series) and their associated query languages -- specifically, understanding when to use one vs. another.
- Experience with big data platforms such as Hadoop and Spark.
- Experience utilizing enterprise data warehousing systems such as Redshift.
- Strong coding skills (Python, Java, and C# are all acceptable).
- Extensive experience working with common data formats and manipulating data within them -- JSON, CSV, Parquet.
- Experience working with Infrastructure as Code tools such as CloudFormation, CDK, or Terraform.
- Experience handling sensitive data and governing access to it.