Technical Data Architect

Overview

Hybrid
$120,000 - $130,000
Full Time
Accepts corp to corp applications
75% Travel
Able to Provide Sponsorship

Skills

Databricks
Snowflake
MLOps
AWS
Azure
GCP

Job Details

Role: Senior Technical Architect
Location: Cambridge, MA
Job type: W2 / Full time
 
About the Role
We are looking for a hands-on Data Architect who can roll up their sleeves and build modern data solutions from the ground up. They will define the architecture, implement the solutions, and establish the standards that enable our organization to leverage data as a strategic asset. They will bridge the gap between business needs and technical execution, working directly with stakeholders to rapidly design solutions and deliver value. They will work closely with technology vertical teams and data science associates across the company to enable advanced analytics and ML capabilities while implementing robust DataOps practices for operational excellence. As a technical lead, they will demonstrate excellence through action, building and shipping production-quality solutions that set the standard for the data engineering team they are part of.
What Is Expected (What You Will Do)
Partner directly with business and IT stakeholders to understand requirements, identify opportunities, and co-create data solutions
Translate business problems into technical architectures and implementation roadmaps that accelerate time to value
Drive rapid prototyping and iterative delivery to get data products into stakeholders' hands quickly
Facilitate solution design sessions and architecture reviews with cross-functional teams
Design and implement scalable data architectures using cloud-native technologies (AWS/Azure/Google Cloud Platform)
Build and maintain a modern lakehouse architecture that unifies data lake and warehouse capabilities
Design and implement data marts tailored to specific business domains and analytical use cases
Optimize lakehouse performance including table formats (Delta, Iceberg), partitioning strategies, and query optimization
Build and deploy data pipelines, APIs, and microservices that deliver reliable data products
Lead by example: write production code, release software, and demonstrate best practices in action rather than just in documentation
Mentor data engineers through pair programming, code reviews, and hands-on collaboration on real projects
Establish patterns and reference implementations that data engineers can learn from and replicate
Create and manage data stores including data lakes, warehouses, and operational databases
Define data models, schemas, and metadata standards that ensure consistency and quality
Implement modern data services using tools like Snowflake, Databricks, dbt, Airflow, and Kafka
Design and build infrastructure that supports data science workflows, including feature stores, model training pipelines, and experiment tracking
Collaborate with data scientists to understand requirements for ML model development, deployment, and monitoring
Establish DataOps practices including CI/CD for data pipelines, automated testing, monitoring, and observability
Implement data quality checks, alerting, and incident response procedures to ensure reliability
Create reusable frameworks, templates, and tooling that accelerate team productivity and enforce standards
Establish data governance frameworks including cataloging, lineage, and quality controls
Remove technical blockers and streamline processes to shorten delivery cycles
Evaluate and integrate emerging technologies to continuously improve our data capabilities
Key Requirements and Needs (What Is Needed)
5+ years of hands-on experience designing and implementing enterprise data solutions
Deep expertise with modern data stack tools (e.g., Snowflake, Databricks, Fivetran, dbt)
Strong business and data acumen, with the ability to understand information domain challenges and translate functional needs into technical solutions
Proven track record of collaborating with business stakeholders and delivering solutions that drive measurable outcomes
Experience balancing technical excellence with pragmatic delivery to accelerate time to value
Strong experience with lakehouse architectures and table formats (Delta Lake, Apache Iceberg, Apache Hudi)
Proven ability to design and build dimensional data marts, star/snowflake schemas, and aggregation layers
Strong proficiency in SQL, Python, and at least one cloud platform (AWS/Azure/Google Cloud Platform)
Experience building both batch and real-time data pipelines
Proven track record of shipping production-quality data solutions and software releases
Strong coding skills with commitment to clean code, testing, and documentation standards
Experience mentoring and upskilling engineering teams through demonstration and collaboration
Understanding of or exposure to data science workflows and tools, the ML lifecycle, and model operationalization (MLOps)
Knowledge of feature engineering, model serving infrastructure, and experiment tracking tools (MLflow, Weights & Biases)
Proven experience implementing DataOps practices including version control, automated testing, and CI/CD for data pipelines
Familiarity with monitoring and observability tools (Datadog, Grafana, Monte Carlo, Great Expectations)
Solid understanding of data modeling techniques (dimensional, vault, normalized)
Knowledge of data governance, security, and compliance best practices
Excellent problem-solving skills with a pragmatic, results-oriented approach
Outstanding communication and facilitation skills to engage leaders, translate technical concepts for non-technical stakeholders, and work collaboratively with IT teams
Bonus Points
Experience with containerization (Docker/Kubernetes) and IaC (Terraform/CloudFormation)
Background in data mesh or decentralized data architecture patterns
Hands-on experience with ML frameworks (TensorFlow, PyTorch, scikit-learn)
Knowledge of vector databases and LLM application infrastructure
Experience with stream processing and event-driven architectures
Deep expertise in query optimization, indexing strategies, and performance tuning
Experience with data virtualization and federated query engines (Trino, Presto)
Background in agile delivery methodologies and product management practices
Open source contributions or technical blog demonstrating thought leadership
Certifications in cloud platforms or data technologies
This is a high-impact role for someone who wants to build, not just design. They will be a working architect who leads from the front, setting the technical standard through shipped code and delivered solutions while maintaining strong partnerships with business and IT stakeholders. The ideal candidate will love working with data, solving business problems, building scalable architectures, enabling data science innovation, implementing production-grade DataOps practices, and elevating engineering teams through example.

About Virtualan Software