Role: Data Architect
Location: Fremont, CA (Onsite)
Mandatory Skills: Splunk, PowerShell or Python, Alerts & Logs Monitoring, Confluence, and SharePoint
12+ years of IT experience in development/architecture roles.
5+ years of proven experience in data engineering and architecture roles (with Azure Data Factory, Azure Databricks & PySpark, Azure Synapse, and Azure SQL).
Key Skills: Azure Data Factory (ADF), Azure Databricks & PySpark, Azure Synapse, Azure SQL, Python, and Spark SQL
Skill Requirements:
Strong experience with ETL/ELT tools such as ADF, Informatica, and Talend, and with data warehousing technologies such as Azure Synapse, Azure SQL, Amazon Redshift, Snowflake, and Google BigQuery.
Strong hands-on experience with Azure Data Factory (ADF) for data orchestration (building and managing pipelines), Azure Databricks for big data processing and analytics, and Apache Spark for distributed data processing.
Adept with big data tools (Databricks, Spark, etc.).
Experience with Power BI, Tableau, OBIEE, etc.
Proficiency in PySpark, Python, and Spark SQL.
Experience with CI/CD pipelines for data platforms.
Solid understanding of data warehouse best practices, development standards, and methodologies.
Strong expertise in SQL and relational data modeling.
Experience designing data lakes, data warehouses, or lakehouse architectures.
Good understanding of data governance, metadata management, and security best practices.
Solid understanding of distributed data processing and performance tuning.
Experience with ETL/ELT patterns and best practices.
Strong analytical and problem-solving skills, and the ability to work in a fast-paced, dynamic environment.
Excellent communication and documentation skills.
Key Responsibilities:
Design and implement end-to-end data architecture for batch and real-time data processing, delivering enterprise-grade data solutions using Azure and Microsoft Fabric.
Design, build, and optimize ETL/ELT pipelines for ingestion, transformation, and publishing using ADF, Spark, and Databricks.
Architect and optimize data pipelines using Azure Data Factory (ADF).
Develop and manage scalable data processing frameworks using Databricks and Apache Spark.
Develop efficient data transformation logic using PySpark and Spark SQL.
Build reusable, high-performance data models for analytics and reporting.
Develop and maintain data ingestion, transformation, and orchestration workflows.
Ensure data quality, consistency, security, and governance across platforms.
Define and enforce data engineering best practices, including CI/CD, versioning, testing, and monitoring.
Mentor and guide data engineers, fostering a culture of innovation and excellence.
Collaborate with data engineers, analysts, data scientists, and business stakeholders.
Nice to Have:
DevOps & CI/CD: Azure DevOps, GitHub Actions, Jenkins
Real-time streaming (Azure Event Hubs, Kafka), Microsoft Fabric