Lead Data Engineer

MIS, Computer Science/Engineering, software development, data integration, software engineering, microservices, AWS Lambda, Spark, Kafka, StreamSets, ETL tools, Talend, Pentaho, SQL, NoSQL databases, JavaScript, PySpark, Python, T-SQL, HTTPS, FTP(S), EDI, JSON, APIs, Storm, Spark Streaming, Amazon Web Services (AWS), Cloudera/HDP
Full Time
Depends on Experience

Job Description

At NYPA, we will help bring about the flexible, distributed, consumer-driven energy system of the future by investing in the things that customers truly value. We will realize this vision by becoming the first end-to-end “digital utility” in the country, redesigning how we manage our assets and using innovative digital technologies. NYPA has been making strategic investments in tools and capabilities to fuel its transition to an end-to-end fully digital utility. Fundamental to this transformation is investment in and modernization of our technology infrastructure.
Key to the successful execution of this objective is the development of a world-class well-governed Analytics Platform and Data Hub that will allow NYPA to manage “Data As An Asset” and further its data integration and analytics capabilities. The goal of this platform is to enable users to seamlessly combine data from disparate sources – both internal and external – and analyze data to gain critical business insights (from simple reporting to advanced analytics).
NYPA is looking for an experienced technologist to help lead NYPA’s Analytics platform and capability into the future. The Lead Data Engineer will work with the Data Services team to identify, design, develop and manage big data centric solutions that meet the strategic business and technology direction of the organization.

Responsibilities

Solution Design: Design and develop big data solutions, using technologies like AWS, Spark, and Databricks, that are flexible, extensible, elastic, secure, and reliable at large scale; Use cloud infrastructure to design and deliver data-as-a-service solutions; Design, develop, and manage data ingestion design patterns for consistency and reusability; Design and develop a microservices strategy for data provisioning
Solution Development:  Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Python and AWS ‘big data’ technologies; Build and manage data pipelines and promote data and analytics use cases to production
Collaboration & Communication: Collaborate with the Data Governance team to capture and manage metadata and implement data quality rules; Work with stakeholders, including the executive, product, data, and business teams, to assist with data-related technical issues and support their data infrastructure needs.
Standards & Best Practices: Work with Project Managers, Solution Engineers, and development teams, using an agile project framework, to ensure solutions align with business objectives while adhering to defined development standards; Ensure data engineering solutions adhere to the framework put in place by the data architecture and governance team
Continuous Improvement:  Continuously learn and be at the leading edge of data integration, cloud, containerization and other industry leading trends; Identify, design, and implement internal process improvements (automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.); Stay on top of industry news, technology products, platforms and partners to ensure you and your team maintain deep industry and ecosystem expertise; Work with Solution Engineering, Cyber and business users to optimize the cost of cloud components; Identify ways to improve data reliability, efficiency and quality
Infrastructure & Security:   Conceptualize and facilitate the creation of infrastructure that allows big data to be accessed and analyzed; Work closely with IT security to monitor the company's data security and privacy; Work with Developers and Infrastructure teams to optimize the cloud components for better performance and scalability

Knowledge, Skills and Abilities

• Experience with any scripting language or command line tools (Lambda, PowerShell, Azure CLI, Python, etc.)
• Full-stack engineering or DevOps experience (OpenStack or others)
• Experience in traditional and cloud data management components (MS SQL, RDS, Athena, etc.)
• Experience in designing and developing micro services and APIs 
• Experience building and optimizing data pipelines, architectures and data sets.
• Familiarity with DevOps and Agile methodologies
• Strong analytic skills related to working with structured, semi-structured, and unstructured datasets.
• Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
• Advanced working knowledge of SQL, including query authoring and experience with relational databases, as well as working familiarity with a variety of database platforms.
• Strong understanding of cloud concepts: virtualization technologies, IaaS, PaaS, SaaS, HA, distributed systems, and cloud delivery models.
• Knowledge and understanding of cloud security policies and infrastructure deployments in enterprise-wide environments
• Experience supporting and working with cross-functional teams in a dynamic environment
• Possesses strong organizational and time management skills, driving tasks to completion.
• Able to work independently with minimum supervision.
• Ability to quickly learn, understand, and work with new emerging technologies, methodologies and solutions.
• Exceptional verbal and written communication skills with the ability to effectively communicate with a diverse group of customers, partners, and colleagues

Education, Experience and Certifications

• Bachelor of Science Degree in MIS or Computer Science/Engineering (or similar)
• Minimum of 8 years of total experience in software development and data integration, including at least 6 years of Data Engineering experience
• Experience with software engineering best-practices such as build automation, continuous integration and continuous deployment
• Experience in building microservices using AWS Lambda and other technologies
• Experience with big data tools: Spark, Kafka, StreamSets, etc. and ETL tools: Talend, Pentaho, etc.
• Experience with relational SQL and NoSQL databases.
• Scripting experience with JavaScript, PySpark, Python, T-SQL, or other similar languages
• Experience with various data transfer mechanisms, including HTTPS, FTP(S), EDI, JSON, and APIs
• Experience with stream-processing systems: Storm, Spark Streaming, etc.
• Amazon Web Services (AWS) Certified Big Data – Specialty or Cloudera/HDP Certified Apache Spark Developer certification.
• Willingness to travel if necessary

Physical Requirements

N/A

The New York Power Authority is an Equal Opportunity Employer.

 


 

Dice Id : 10102261
Position Id : 9627