Overview
On Site
Hybrid, 2-3 days a week
Depends on Experience
Full Time
Skills
Python
PySpark
Databricks
data lake
coding
Spark
Scala
Hadoop
Big data
Job Details
Job Title: Big Data Databricks Architect
Client Location: Boston, MA (Hybrid 2-3 days a week onsite)
Duration: Long-term contract
Education and Experience Qualifications
- Bachelor's degree in Computer Science, Information Systems, or equivalent education or work experience
- 8+ years of IT experience in the big data space (Hadoop, data lakes, data engineering) using Python and Spark
- Any AWS and/or Databricks certification is a plus
Roles & Responsibilities
- Understand the current application infrastructure and suggest new approaches to improve performance
- Document the best practices and strategies associated with application deployment and infrastructure support
- Produce reusable, efficient, and scalable programs, as well as cost-effective migration strategies
- Develop data engineering and ML pipelines in Databricks and AWS services, including S3, EC2, API, RDS, Kinesis/Kafka, and Lambda, to build serverless applications
- Work jointly with the IT team and other departments to migrate data engineering and ML applications to Databricks/AWS
- Comfortable working to tight timelines when required
Skill Sets Required
- Good decision-making and problem-solving skills
- Solid understanding of Databricks fundamentals and architecture, with hands-on experience setting up Databricks clusters and working in Databricks modules (Data Engineering, ML, and SQL warehouse)
- Knowledge of medallion architecture, DLT, and Unity Catalog within Databricks
- Experience in migrating data from on-prem Hadoop to Databricks/AWS
- Understanding of core AWS services and their uses, and of AWS architecture best practices
- Hands-on experience in different domains, such as database architecture, business intelligence, machine learning, advanced analytics, and big data
- Solid knowledge of Airflow
- Solid knowledge of CI/CD pipelines using AWS technologies
- Application migration of RDBMS, Java/Python applications, model code, Elastic, etc.
- Solid programming background in Scala and Python
- Experience with Docker and Kubernetes is a plus