Overview
On Site
Depends on Experience
Full Time
No Travel Required
Skills
Hadoop
PySpark
Scala
HDFS
S3
Data Lake
Apache Iceberg
AWS
Azure
GCP
Job Details
Job Title: Big Data Architect
Job Location: Tampa, FL (Onsite)
Job Type: Full-time/Permanent Hire
Years of Experience: 15+ Years
Primary Responsibilities:
Drive Data Automation Initiatives:
Architectural Vision: Lead the architectural vision for scalable and reliable data automation across data pipelines, platforms, and consumption layers.
Solution Design: Design and implement end-to-end frameworks for data ingestion, transformation, validation, and delivery.
Stakeholder Engagement: Collaborate with internal/external stakeholders to gather requirements and translate them into actionable solutions.
Technology Evaluation: Continuously research and recommend cutting-edge tools and methodologies.
Mentor and Coach the Data Quality Engineering Team:
Technical Guidance: Act as the primary mentor on architectural patterns, principles, and best practices for data quality.
Expertise Dissemination: Provide expertise on data profiling, validation, error detection, and remediation.
Capability Building: Foster a strong data quality culture via code reviews, guidance, and knowledge sharing.
Problem Solving: Conduct technical reviews and root cause analyses to ensure system scalability and robustness.
Lead Data Quality Test Automation:
Strategy & Customization: Customize enterprise data quality tools (e.g., Broadcom CA TDM, Informatica DQ).
Pipeline Design: Develop strategies for automation integration with defect management systems.
CI/CD Integration: Implement automated data quality checks within CI/CD pipelines.
Optimization: Review solution performance and suggest improvements across domains.
Support Interviews and Talent Evaluation:
Participate in interviews, conduct technical assessments, and help with hiring for data engineering and big data-focused roles.
Hands-on Technical Knowledge:
Prototyping & POCs: Lead hands-on development, prototyping, and proof-of-concept work for data quality (DQ) solutions.
Advanced SQL: High proficiency in writing and optimizing SQL queries for data profiling, cleansing, and validation.
Python: Strong experience in scripting automation, data engineering, and integrating with Big Data frameworks (e.g., PySpark).
Required Qualifications:
Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
15+ years of software engineering experience with 5+ years in roles such as Technical Architect, Lead Data Architect, or Principal Data Engineer.
Proficiency in Apache Spark (PySpark, Scala, or Java).
Expertise in:
Big Data Technologies: Spark SQL, DataFrame/Dataset APIs
Data Storage: Hadoop, HDFS, S3, Data Lake, Apache Iceberg
Messaging: Apache Kafka
Cloud Platforms: AWS (EMR, Glue, Redshift, Lambda, Step Functions), Azure (Databricks, Synapse, Data Lake), Google Cloud Platform (Dataproc, BigQuery, Cloud Storage)
SQL: Advanced experience with complex queries, data ingestion, and optimization.
Data Quality Automation: Proven experience building large-scale data quality frameworks.
Knowledge of:
Data lifecycle and SDLC
DevOps integration and CI/CD pipelines
Root cause analysis, anomaly detection, and continuous improvement strategies
Strong communication and leadership skills.
Employers have access to artificial intelligence language tools (“AI”) that help generate and enhance job descriptions and AI may have been used to create this description. The position description has been reviewed for accuracy and Dice believes it to correctly reflect the job opportunity.