Data Architecture Design
1. Architect and implement a scalable data hub solution on AWS using best practices for data ingestion, transformation, storage, and access control.
2. Define data models, data lineage, and data quality standards for the DataHub.
3. Select appropriate AWS services (S3, Glue, Redshift, Athena, Lambda) based on data volume, access patterns, and performance requirements.
4. Produce a design that accommodates AI/ML applications in the next phase.
Data Ingestion and Integration
1. Design and build data pipelines to extract, transform, and load data from various sources (databases, APIs, flat files) into the DataHub using AWS Glue, AWS Batch, or custom ETL processes.
2. Implement data cleansing and normalization techniques to ensure data quality.
3. Manage data ingestion schedules and error handling mechanisms.
Data Governance and Access Control
1. Establish data access controls and security policies to protect sensitive data within the DataHub using IAM roles and policies.
2. Develop data governance frameworks, including data quality checks, data lineage tracking, and data retention policies.
Data Analytics Enablement
1. Create data catalogs and metadata management systems to facilitate data discovery and understanding by business users and data analysts.
2. Design and implement data views and dashboards using Power BI to enable data exploration and visualization.
3. Create data warehouses and data marts to meet the needs of the business.
Monitoring and Optimization
1. Monitor data pipeline performance, data quality, and system health to identify and resolve issues proactively.
2. Optimize data storage and processing costs by leveraging AWS cost optimization features.
Data Exchange
1. Develop the required governance, security, monitoring, and guardrails to enable efficient data exchange between internal applications and their external vendors, partners, and SaaS providers.
2. Develop an intake process, SLAs, and usage rules for internal and external data set producers and consumers.
Required Skills and Experience
AWS Expertise: Deep understanding of AWS data services, including S3, Glue, Redshift, Athena, Lake Formation, Step Functions, CloudWatch, and EventBridge.
Data Modeling: Proficiency in designing dimensional and snowflake data models for data warehousing and data lakes.
Data Engineering Skills: Experience with ETL/ELT processes, data cleansing, data transformation, and data quality checks. Experience with Informatica IICS and ICDQ is a plus.
Programming Languages: Proficiency in Python, SQL, and potentially PySpark for data processing and manipulation.
Data Governance: Knowledge of data governance best practices, including data classification, access control, and data lineage tracking.
Preferred Qualifications
Experience with data lakehouse architectures and the ability to leverage both structured and unstructured data
Familiarity with data visualization tools like Tableau or Power BI
Strong communication and collaboration skills to work with stakeholders across business and technical teams
AWS certifications related to data analytics and architecture
Skills
Mandatory Skills: AWS Glue, AWS Lambda, AWS S3, AWS Step Functions, Dimensional Data Modeling
Good to Have Skills: DynamoDB
We are an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex (including pregnancy, sexual orientation, or gender identity), national origin, citizenship status, age, disability, genetic information, protected veteran status, or any other characteristic protected by applicable law.