Project Description :
Seeking a Sr. Cloud Systems Engineer to serve as Databricks Administrator and hands-on technical owner of the Databricks platform supporting the EDP (Enterprise Data Platform). This role is accountable end-to-end for platform operations, security, and governance configuration, ensuring the environment is compliant, reliable, and cost-controlled and enables secure analytics and AI/ML workloads at scale.
Period of Performance: March 16, 2026 to December 31, 2027.
Place of Performance: On-site at Washington, DC location(s).
Qualification Requirements :
The candidate shall also demonstrate the following knowledge and experience:
- Hands-on experience administering Databricks (workspace administration, clusters/compute policies, jobs, SQL warehouses, repos, runtime management) and expertise using the Databricks CLI.
- Strong Unity Catalog administration: metastores; catalogs/schemas; grants; service principals; external locations; storage credentials; governed storage access.
- Identity & Access Management proficiency: SSO concepts, SCIM provisioning, group-based RBAC, service principals, least-privilege patterns.
- Security fundamentals: secrets management, secure connectivity, audit logging, access monitoring, and evidence-ready operations.
- Automation skills: scripting and/or IaC using Terraform/CLI/REST APIs for repeatable configuration and environment promotion.
- Experience implementing data governance controls (classification/tagging, lineage/metadata integrations) in partnership with governance teams.
- CI/CD practices for jobs/notebooks/config promotion across SDLC environments.
- Understanding of lakehouse concepts (e.g., Delta, table lifecycle management, separation of storage/compute).
- Strong troubleshooting and problem-solving skills; clear communication during incidents and changes.
- Experience administering Databricks serverless compute, Workspace Git integrations (GitLab), Databricks Asset Bundles (DABs) for deployment automation, and modern workspace features supporting DevOps workflows.
- Bachelor's degree in a related field or equivalent practical experience.
- 7+ years in cloud/data platform administration and operations, including 3+ years administering Databricks.
- U.S. citizenship or is acceptable.
Skills Requirements :
Highly valued (desirable, but not required) knowledge, skills, and experience:
- Cloud platform expertise (AWS): IAM roles/policies, object storage security patterns, networking basics (VPC concepts), logging/monitoring integration.
- SQL proficiency and data engineering fundamentals for troubleshooting query performance issues, understanding ETL/ELT workflow patterns, and debugging data pipeline failures; basic Python/Scala familiarity for notebook/code issue diagnosis.
- Experience with compliance and regulatory frameworks (FedRAMP, HIPAA, SOC 2, or similar), including implementation of data residency requirements, retention policies, and audit-ready evidence collection.
- Hands-on experience with AWS security and networking services, including PrivateLink, Secrets Manager/Systems Manager integration, CloudWatch/CloudTrail integration, S3 bucket policies, cross-account access patterns, and KMS encryption key management.
- Demonstrated experience in Databricks and cloud FinOps and budget management.
- SLA/SLO management, incident management, and stakeholder communication skills; ability to define platform service levels, produce operational reports, translate technical issues for business stakeholders, and manage vendor relationships (Databricks account teams).
- 5+ years of demonstrated experience administering Databricks.
- Databricks Platform Administrator / Databricks AWS Platform Architect
- Databricks Certified Data Engineer Associate/Professional
- AWS Certified Solutions Architect Associate or Professional
Responsibilities :
The Contractor shall deliver, including but not limited to, the following:
- Administer Databricks account and workspaces across SDLC environments; standardize configuration, naming, and operational patterns.
- Configure and maintain clusters/compute, job compute, SQL warehouses, runtime versions, libraries, repos, and workspace settings.
- Implement platform monitoring/alerting, operational dashboards, and health checks; maintain runbooks and operational procedures.
- Manage change control for upgrades, feature rollouts, configuration changes, and integration changes; document impacts and rollback plans.
- Enforce least privilege across platform resources (workspaces, jobs, clusters, SQL warehouses, repos, secrets) using role/group-based access patterns.
- Enable and maintain audit logging and access/event visibility; support security reviews and evidence requests.
- Administer Unity Catalog governance: metastores, catalogs/schemas/tables, ownership, grants, and environment/domain patterns.
- Configure and manage external locations, storage credentials, and governed access to cloud object storage.
- Coordinate secure connectivity and guardrails with cloud/network teams: private connectivity patterns, egress controls, firewall/proxy needs.
- Implement cost guardrails: cluster policies, auto-termination, scheduling, workload sizing standards, and capacity planning.
- Produce usage/cost insights and optimization recommendations; address waste drivers (idle compute, oversized clusters, inefficient jobs).
- Automate administration and configuration using APIs/CLI/IaC (e.g., Terraform) to reduce manual drift and improve repeatability.
- Maintain platform documentation: configuration baselines, security/governance standards, onboarding guides, and troubleshooting references.
- Monitor and optimize platform performance, including SQL warehouse query tuning, cluster autoscaling configuration, Photon enablement, and Delta Lake optimization guidance (OPTIMIZE, VACUUM, Z-ordering strategies).
- Administer Delta Live Tables (DLT) pipelines and coordinate with data engineering teams on pipeline health, data quality monitoring, failed job remediation, and pipeline configuration best practices.
- Manage third-party integrations and ecosystem connectivity, including BI tool integrations (e.g., Power BI) and external metadata catalog integrations.
- Implement Databricks Asset Bundles (DABs) for standardized deployment patterns; automate workspace resource deployment (jobs, pipelines, dashboards) across SDLC environments using bundle-based CI/CD workflows.
- Conduct capacity planning and scalability analysis, including forecasting concurrent user/workload growth, platform scaling strategies, and proactive resource allocation during peak usage periods.
- Facilitate user onboarding and enablement, including new user/team onboarding procedures, training coordination, workspace access provisioning, and creation of self-service documentation/guides.
Job ID : 1505
- Dice Id: 10301578