Data Operations Engineer
Location: Cupertino, CA / Austin, TX (Onsite)
Experience: 15+ years required
As a Data Operations Engineer, you will collaborate with infrastructure, platform, product, engineering and science teams to identify requirements that drive the creation of sensible operations for building and managing data analytics foundations. The ideal candidate is a self-motivated teammate with strong problem-solving and communication skills, the ability to adapt and learn quickly, deliver results with limited direction, and make sound operational decisions.
CORE RESPONSIBILITIES -
- Lead day-to-day operations to ensure organizational delivery of quality results.
- Provision, enable, scale and maintain the team's data, analytics and ML infrastructure for batch and real-time systems, including pipelines, frameworks, tools and services in a hybrid cloud.
- Shepherd the zero-downtime deployment process through continuous delivery practices, rapidly releasing features that give business users critical insights faster.
- Collaborate with the platform team to build the right tools for observability, monitoring, alerting and self-healing for the day-to-day management of analytics foundations.
- Debug complex problems in distributed environments, run production incidents efficiently, and follow up with post-incident reviews.
- Develop self-service tools and automation to improve engineering efficiency and the quality of services.

QUALIFICATIONS -
- Excellent verbal and written communication skills.
- Self-starter with forward-thinking capability, a strong execution track record, and accountability for business priorities.
- Hands-on experience with CI/CD pipelines and cloud environments such as GitLab, Spinnaker and Docker/Kubernetes; observability experience with Splunk, Grafana or Datadog is a huge plus.
- Proficiency in one or more programming languages such as Python, Go, Rust, Java or Scala for automation and API integrations.
- Experience implementing security controls, governance processes, compliance validation, and infrastructure cost analysis and optimization.
- Knowledge of the analytics and applied ML stack, including Apache Spark, Trino/Pinot, Iceberg, Atlas, Flink, Airflow/Luigi, Tableau, Snowflake, Databricks, MLflow, data catalogs, Jupyter notebooks, vector databases and Cassandra.
- Solid understanding of infrastructure-as-code (IaC) techniques such as Terraform, along with orchestration and tooling.
- Experience scaling operations in a fast-paced and dynamic environment.
- Experience working in agile or evolving product environments.