Data Engineer (Hybrid Remote/Onsite)

Amazon S3, Apache Airflow, Apache Avro, Apache Kafka, Apache Maven, Apache Parquet, Apache Spark, BMC Control-M, Business intelligence, Data engineering, Data governance, Data integrity, Data science, Data storage, Database, Docker, ELT, ETL, Gradle, HDFS, Java, Kubernetes, Machine learning, NoSQL, PHP, Python, R, SQL, Scala, Shell scripting, Software development
Full Time
Depends on Experience
Work from home available. Travel not required.

Job Description

Agile Premier has a client that is looking to hire a Data Engineer. The position can be located in either Oklahoma City, OK, or Grapevine, TX. The Data Engineer will be responsible for the following:

  • Build, test, and validate robust production-grade data pipelines that can ingest, aggregate, and transform large datasets according to the specifications of the internal teams who will be consuming the data (a minimal pipeline sketch follows this list)
  • Build frameworks and custom tooling for data pipeline code development
  • Deploy data pipelines and data connectors to production environments
  • Configure connections to source data systems and validate schema definitions with the teams responsible for the source data
  • Monitor data pipelines and data connectors and troubleshoot issues as they arise
  • Monitor data lake environment for performance and data integrity
  • Manage data infrastructure such as Kafka and Kubernetes clusters
  • Collaborate with IT and database teams to maintain the overall data ecosystem
  • Assist data science, business intelligence, and other teams in using the data provided by the data pipelines
  • Deploy machine learning models to production environments
  • Gather requirements and determine scope of new projects
  • Research and evaluate new technologies and set up proof-of-concept deployments
  • Test proof-of-concept deployments of new technologies
  • Collaborate with data governance and compliance teams to ensure data pipelines and data storage environments meet requirements
  • Serve as on-call for production issues related to data pipelines and other data infrastructure maintained by the data engineering team
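
For illustration, here is a minimal sketch of the kind of ingest-aggregate-transform pipeline described above, written in PySpark. The bucket paths, column names, and aggregation logic are hypothetical placeholders, not the client's actual pipeline.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-daily-aggregate").getOrCreate()

# Ingest: read raw JSON events from a (hypothetical) S3 landing zone.
raw = spark.read.json("s3a://example-landing-zone/orders/")

# Transform: normalize types and drop malformed rows.
orders = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .dropna(subset=["order_id", "order_ts", "amount"])
)

# Aggregate: daily revenue per customer, as a consuming team might specify.
daily = (
    orders.groupBy(F.to_date("order_ts").alias("order_date"), "customer_id")
          .agg(F.sum("amount").alias("revenue"),
               F.count("*").alias("order_count"))
)

# Write partitioned Parquet back to the data lake for downstream consumers.
daily.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3a://example-data-lake/curated/daily_revenue/"
)

In practice, the transformations and output layout would be driven by the specifications gathered from the consuming teams.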


MINIMUM BASIC QUALIFICATIONS

  • Degree in Computer Science or related field
  • 5+ years of data engineering work experience
  • Experience coding in Java or Scala and with build tools such as Maven, Gradle, or SBT is required
  • Experience with SQL databases
  • Experience working with HDFS or S3 storage environments
  • Experience with Apache Spark or Databricks, including reading and writing Parquet, Avro, and JSON
  • Experience working in a Unix or Linux environment including writing shell scripts
  • Experience with ETL and ELT processes in data pipelines
  • Experience with CI/CD tools and processes
  • Experience coding in Python, R, C#, and/or PHP
  • Experience with NoSQL solutions
  • Experience with Apache Kafka or Confluent is highly preferred (a consumer sketch follows this list)
  • Experience with data lake query engines such as Presto or Dremio
  • Experience with Docker and Kubernetes is highly preferred
  • Experience with workflow orchestration tools like Apache Airflow, Control-M, or Argo is highly preferred (a DAG sketch follows this list)
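
As context for the Kafka qualification above, here is a minimal consumer sketch using the confluent-kafka Python client. The broker address, topic, and consumer group are hypothetical.

import json
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "broker.example.com:9092",  # hypothetical broker
    "group.id": "orders-pipeline",                   # hypothetical group
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["orders"])  # hypothetical topic

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            # Surface broker/partition errors so on-call can troubleshoot.
            print(f"consumer error: {msg.error()}")
            continue
        event = json.loads(msg.value())
        # Hand the event off to the pipeline's transformation logic here.
        print(event.get("order_id"))
finally:
    consumer.close()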
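
And for the orchestration qualification, a minimal Apache Airflow DAG sketch, assuming Airflow 2.4+ (for the schedule argument). The DAG id, schedule, and task bodies are hypothetical placeholders.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull raw files from the source system")

def transform_and_load():
    print("run the Spark aggregation and publish Parquet output")

with DAG(
    dag_id="daily_orders_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract",
                                  python_callable=extract)
    load_task = PythonOperator(task_id="transform_and_load",
                               python_callable=transform_and_load)
    # Enforce extract-before-load ordering.
    extract_task >> load_task
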
Dice Id : 10522368
Position Id : 7166339
