Overview
On Site
USD 131,750.00 - 178,250.00 per year
Full Time
Job Details
Description:
Role: AI Data Engineering Architect
Location options: San Francisco Bay Area, New York / New Jersey, Atlanta, Chicago, and Dallas.
Preface
The AI Data Engineering Architect in TCS's Americas region designs and implements robust data infrastructures for enterprise AI and analytics solutions. This client-facing hybrid role ensures that organizational data assets are architected to support AI initiatives across industries including BFSI, Manufacturing, Life Sciences, Telecom, Retail, Travel, and Consumer Goods.
What You Would Be Doing
Data Architecture Design: Design data flows from source systems to AI/analytics models, creating conceptual and logical architecture diagrams.
Enterprise Data Assessment: Evaluate client data ecosystems, identify gaps, and recommend modernization strategies for AI readiness.
Pipeline & ETL Strategy: Define ingestion and processing strategies (batch/real-time), select tools, and outline data transformation workflows.
Data Storage & Modeling: Architect storage solutions (data lakes, warehouses, vector/graph databases) and design efficient schemas.
Integration of Heterogeneous Data: Plan integration of structured and unstructured data, ensuring consistency and alignment across systems.
Quality, Governance, and Security: Implement data quality checks, lineage, compliance, and security measures throughout the architecture.
Scalability & Performance: Design scalable, high-performance solutions using distributed computing, cloud auto-scaling, and performance tuning.
Technology Selection & Blueprint: Recommend platforms/tools and create comprehensive reference architecture blueprints.
Collaborate with AI/Machine Learning Teams: Align data structures with model requirements and coordinate on feature engineering and retrieval pipelines.
Prototype and Validate: Build proofs-of-concept for pipelines or retrieval solutions to validate approach and refine architecture.
Industry-Specific Data Solutions: Tailor architectures for domain-specific data types, volumes, and compliance needs.
Client Engagement & Thought Leadership: Lead workshops, produce documentation, and advise on data strategy and governance.
Oversee Implementation: Govern solution realization, ensure alignment with designs, and resolve technical challenges during execution.
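As a rough illustration of the pipeline, quality, and storage responsibilities above, here is a minimal batch ETL sketch in Python (standard library only; the source records, table name, and quality rule are all hypothetical, not part of any specific client architecture):

```python
import sqlite3

# Hypothetical source records, e.g. an export from a CRM system.
SOURCE_ROWS = [
    {"id": 1, "name": "Acme Corp", "revenue": "1200.50"},
    {"id": 2, "name": "Globex", "revenue": "980.00"},
    {"id": 3, "name": "Initech", "revenue": None},  # fails the quality check
]

def extract():
    """Extract: yield raw records from the source system."""
    yield from SOURCE_ROWS

def transform(rows):
    """Transform: cast types and apply a simple data-quality gate."""
    for row in rows:
        if row["revenue"] is None:
            continue  # a real pipeline would quarantine this for review
        yield (row["id"], row["name"], float(row["revenue"]))

def load(rows, conn):
    """Load: write cleaned rows into an analytical store."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS accounts (id INT, name TEXT, revenue REAL)"
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
count = conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0]
print(count)  # 2 -- one record was dropped by the quality gate
```

In practice the same extract/transform/load separation is what orchestrators such as Airflow schedule as individual tasks; this sketch only shows the shape of the flow.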
What Skills Are Expected
Data Architecture Expertise: Experience designing complex data architectures with strong modeling and abstraction abilities.
Big Data and ETL/ELT Mastery: Proficient in Hadoop, Spark, and distributed data processing, with batch vs. streaming expertise.
Cloud Data Services: Skilled in AWS, Azure, or Google Cloud Platform data services for cloud-native data pipelines and cost/performance optimization.
Databases and Storage Systems: Proficient in relational, NoSQL, and analytical databases; able to recommend storage by use case.
Data Integration & APIs: Experienced with ETL, enterprise integration, and API-based data service design.
Data Governance & Quality: Knowledge of data cataloging, lineage, and quality monitoring frameworks and tools.
Security & Compliance: Understanding of encryption, access controls, and compliance standards (GDPR, HIPAA, etc.).
Collaboration & Leadership: Strong communication, client engagement, and team leadership abilities.
Analytical and Problem-Solving: Aptitude for troubleshooting, identifying bottlenecks, and designing mitigations.
Domain Knowledge: General awareness of data environments in finance, manufacturing, retail, and other key industries.
Project Management: Ability to coordinate architecture deliverables, align tasks, and manage technical workstreams.
Emerging Tech & Trends: Up to date on the latest data technologies, paradigms, and trends relevant to data architecture.
Certifications (nice-to-have): AWS, Azure, Google Cloud Platform, or Cloudera Data Engineering/Architecture certifications are preferred.
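To make the storage and modeling expectations above concrete, here is a toy dimensional model (a hypothetical star schema with one fact and one dimension table, using only Python's built-in sqlite3; table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Dimension table: descriptive attributes used for slicing.
cur.execute(
    "CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT)"
)
# Fact table: numeric measures keyed to the dimension.
cur.execute("""CREATE TABLE fact_sales (
    sale_id INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    amount REAL)""")

cur.executemany("INSERT INTO dim_product VALUES (?, ?)",
                [(1, "hardware"), (2, "software")])
cur.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                [(10, 1, 100.0), (11, 1, 50.0), (12, 2, 200.0)])

# A typical analytical query: aggregate measures by dimension attribute.
rows = cur.execute("""
    SELECT p.category, SUM(f.amount)
    FROM fact_sales f JOIN dim_product p USING (product_id)
    GROUP BY p.category ORDER BY p.category
""").fetchall()
print(rows)  # [('hardware', 150.0), ('software', 200.0)]
```

The same fact/dimension split underlies warehouse platforms such as Snowflake or Redshift; choosing when to normalize dimensions further (a snowflake schema) is exactly the kind of trade-off this role is asked to make.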
Key Technology Capabilities
Big Data Frameworks: Advanced use of Apache Spark and Hadoop for batch/stream processing and optimization.
Data Pipeline Orchestration: Skilled in Apache Airflow, AWS Step Functions, or similar for workflow management.
Relational Databases & SQL: Strong SQL skills; experience with Snowflake, Redshift, BigQuery, Synapse, and columnar storage.
NoSQL and Specialized Stores: Experienced with MongoDB, Cassandra, Redis, and graph databases like Neo4j.
Streaming & Messaging: Proficient in Kafka, RabbitMQ, AWS Kinesis, Google Pub/Sub for real-time/event ingestion.
Search and Indexing: Familiarity with Elasticsearch/OpenSearch and vector databases for text/semantic search.
Cloud Data Ecosystems: Proficient with key data components on AWS, Azure, or Google Cloud Platform, including storage, ETL, and analytics services.
DevOps & Infrastructure as Code: Experience with Terraform, CloudFormation, Docker, Kubernetes, and CI/CD for data engineering.
Data Modeling Tools: Competency with ERwin, UML, Jupyter, or SQL IDEs for data modeling and prototyping.
Metadata & Catalog: Skilled in Apache Atlas, AWS Glue Data Catalog, Azure Purview, and lineage/metadata management.
Monitoring & Logging: Use of Splunk/ELK, cloud monitors, and APM for pipeline performance and alerting.
Machine Learning Integration: Understanding of feature stores, ML pipelines, and integration with data architectures.
Testing & Validation: Experience with Great Expectations and scripting for automated data integrity testing.
Workflow Management & Agile: Proficient with Confluence, JIRA, and Git for documentation and collaboration.
High-Level Languages: Ability to write and review Python or Scala for data pipeline development and reference implementations.
Enterprise Systems Knowledge: Experience integrating with ERP, CRM, and mainframe data using standard methods.
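The vector-database and semantic-search capability above boils down to nearest-neighbor ranking over embeddings. A stripped-down sketch (toy 3-dimensional vectors and document names are invented for illustration; production systems use learned embeddings and approximate-nearest-neighbor indexes):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical document embeddings (real systems use hundreds of dimensions).
DOCS = {
    "invoice policy": [0.9, 0.1, 0.0],
    "shipping faq": [0.1, 0.8, 0.2],
    "returns guide": [0.2, 0.7, 0.6],
}

def search(query_vec, k=2):
    """Return the k documents most similar to the query embedding."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, DOCS[d]), reverse=True)
    return ranked[:k]

print(search([0.85, 0.15, 0.05]))  # ['invoice policy', 'returns guide']
```

Vector stores and Elasticsearch/OpenSearch k-NN indexes replace this linear scan with an index so the same ranking scales to millions of documents.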
Salary Range: $131,750 - $178,250 a year
#LI-AD1
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.