FinOps Data Engineer (W2 Only)

Overview

Remote
$55 - $65 per hour
Contract - W2
Contract - 12 Month(s)

Skills

FinOps
Data Engineering

Job Details

This engagement covers three (3) data engineering contractor positions specializing in cloud billing data modeling and platform development. These engineers will build and maintain the data models, pipelines, and analytics infrastructure on our data platform to support FinOps capabilities across AWS, Google Cloud Platform, and Azure.

2. Position Overview

All three positions are engineering roles focused on building data models and analytics capabilities on the data platform. Each role specializes in a different domain of cloud financial data engineering.

Position 1: Billing Data Ingestion Engineer

  • Scope: Raw billing ingestion, normalized schemas, staging models
  • Focus: Source systems, ETL, data quality

Position 2: Cost Allocation Data Engineer

  • Scope: Allocation models, cost attribution, showback/chargeback data marts
  • Focus: Business logic, allocation rules, dimensional models

Position 3: Optimization Analytics Engineer

  • Scope: Utilization models, commitment analytics, optimization data products
  • Focus: Analytics models, metrics, optimization insights

3. Foundational Knowledge Required (All Positions)

In line with FinOps Foundation best practices, all engineers must demonstrate a deep understanding of cloud billing data structures so they can model this data effectively on the platform.

3.1 Cloud Billing Data Structures

Engineers must understand that cloud billing data is:

  • Usage-based (metered resources): Charges are based on actual consumption, requiring time-series modeling
  • Time-series oriented: Hourly or per-second granularity requires appropriate partitioning strategies
  • Highly dimensional: Accounts, projects, subscriptions, services, SKUs, regions, and tags require flexible dimensional models
  • Large scale: Billions of line items per month require efficient storage and query optimization

Key Billing Concepts for Data Modeling

  • Raw usage vs. cost: Separate fact tables for consumption metrics and financial charges
  • Amortized costs: Upfront fees distributed across billing periods; requires time-based allocation logic (see the sketch after this list)
  • Unblended rates: The actual rates charged, which vary by volume and time within the month
  • Blended rates: Standardized rates averaged across the billing family; require calculation models
  • Fully loaded costs: Amortized costs plus shared costs and discounts; complex derived calculations
  • Commitment coverage: RI/Savings Plan/CUD application logic and utilization tracking
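
For illustration only, a minimal sketch (Snowflake-style SQL) of the time-based allocation logic that amortized costs call for. The table and column names (ri_fee_line_items, hourly_calendar, upfront_fee) are hypothetical and stand in for whatever the staging models expose:

```sql
-- Hypothetical sketch: spread a one-time upfront RI fee evenly across every
-- hour of its term so each billing period carries its share.
-- ri_fee_line_items and hourly_calendar are illustrative names, not a real
-- provider schema (Snowflake-style date syntax).
with ri_fees as (
    select
        reservation_id,
        upfront_fee,                                         -- one-time charge at purchase
        term_start,
        term_end,
        datediff('hour', term_start, term_end) as term_hours
    from ri_fee_line_items
)

select
    f.reservation_id,
    c.usage_hour,
    f.upfront_fee / f.term_hours as amortized_upfront_cost_per_hour
from ri_fees as f
join hourly_calendar as c
    on c.usage_hour >= f.term_start
   and c.usage_hour <  f.term_end
```

The same pattern would extend to Savings Plans and CUDs, with the provider-specific amortization fields replacing the upfront fee.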

3.2 AWS Billing Data (CUR)

AWS Cost and Usage Report schema knowledge:

  • Line item types: Usage, DiscountedUsage, SavingsPlanCoveredUsage, RIFee, Credit, Tax
  • Linked account / payer account hierarchy modeling
  • Savings Plans and RI amortization fields and calculation logic
  • Instance size flexibility normalization factors
  • Resource tags column structure (dynamic columns)
  • Pricing columns: public, effective, reservation rates

3.3 Google Cloud Platform Billing Data (BigQuery Export)

Google Cloud Platform Billing Export schema knowledge:

  • SKU structure and pricing tier modeling
  • Project / folder / organization hierarchy
  • Committed Use Discount (CUD) credit application
  • Sustained Use Discount calculations
  • Labels structure and propagation
  • Credits, promotions, and negotiated pricing fields

3.4 Azure Billing Data (Cost Management Export)

Azure Cost Management Export schema knowledge:

  • Subscription / resource group / management group hierarchy
  • Reservation utilization and amortization fields
  • EA vs. MCA schema differences
  • Meter categories and normalization requirements
  • Tags structure and resource metadata

4. Position 1: Billing Data Ingestion Engineer

This engineer builds the foundational data models that ingest and normalize raw billing data from all cloud providers into the data platform.

  • 4-6 years of data engineering experience
  • This role requires deep expertise in ETL/ELT patterns, handling schema evolution, and building robust pipelines at scale. The complexity of cloud billing data (billions of rows, late-arriving data, provider schema changes) demands someone who's solved these problems before.

4.1 Key Deliverables

  1. Source System Connectors: Build ingestion pipelines for AWS CUR (S3), Google Cloud Platform Billing Export (BigQuery), and Azure Cost Management Exports (Blob Storage)
  2. Staging Data Models: Create staging tables that preserve raw billing data with appropriate partitioning and clustering
  3. Normalized Base Models: Design and implement a unified schema that normalizes billing data across all three providers
  4. Data Quality Framework: Implement data quality tests, freshness checks, and reconciliation models
  5. Schema Evolution Handling: Build processes to detect and adapt to provider schema changes

4.2 Data Models to Build

  • stg_aws_cur: Staging model for AWS Cost and Usage Report with all raw columns
  • stg_gcp_billing: Staging model for Google Cloud Platform Billing Export
  • stg_azure_cost: Staging model for Azure Cost Management Export
  • base_cloud_usage: Normalized usage fact table with common schema across providers
  • base_cloud_cost: Normalized cost fact table with provider-agnostic structure (a sample unioning model follows this list)
  • dim_cloud_account: Unified dimension for AWS accounts, Google Cloud Platform projects, Azure subscriptions
  • dim_cloud_service: Service/SKU dimension with provider mapping
  • dim_cloud_region: Region dimension normalized across providers
  • dim_resource_tags: Flattened tag/label dimension for cost allocation
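
As a rough sketch of what base_cloud_cost could look like, a dbt-style model that unions the three staging models into one provider-agnostic cost fact. The staging column names below follow the general shape of each provider's export but are assumptions, not a prescribed schema:

```sql
-- Illustrative dbt-style model (base_cloud_cost): normalize the three provider
-- staging models into one cost fact with a shared column set.
-- Staging column names are assumptions, not exact export schemas.
with aws as (
    select
        'aws'                       as cloud_provider,
        line_item_usage_account_id  as account_id,
        line_item_product_code      as service,
        product_region              as region,
        line_item_usage_start_date  as usage_start_at,
        line_item_unblended_cost    as unblended_cost
    from {{ ref('stg_aws_cur') }}
),

gcp as (
    select
        'gcp'                       as cloud_provider,
        project_id                  as account_id,
        service_description         as service,
        location_region             as region,
        usage_start_time            as usage_start_at,
        cost                        as unblended_cost
    from {{ ref('stg_gcp_billing') }}
),

azure as (
    select
        'azure'                     as cloud_provider,
        subscription_id             as account_id,
        meter_category              as service,
        resource_location           as region,
        usage_date                  as usage_start_at,
        cost_in_billing_currency    as unblended_cost
    from {{ ref('stg_azure_cost') }}
)

select * from aws
union all
select * from gcp
union all
select * from azure
```

A base_cloud_usage counterpart would follow the same pattern, carrying consumption metrics rather than cost columns.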

4.3 Engineering Responsibilities

  • Design incremental ingestion logic handling daily/monthly billing updates
  • Implement late-arriving data handling (billing adjustments, credits); see the incremental-model sketch after this list
  • Build reconciliation models comparing processed totals to provider invoices
  • Create data contracts and documentation for downstream consumers
  • Optimize model performance for large-scale billing datasets (partitioning, clustering)
  • Implement CI/CD for data model deployments and testing
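
One possible shape for the incremental and late-arriving-data logic above, as a dbt incremental model for a single provider's feed. The 35-day lookback, model name, and columns are assumptions for the sketch; the real window would be tuned to how far back each provider restates line items:

```sql
-- Illustrative dbt incremental model: reprocess a trailing window of billing
-- days on each run so late-arriving adjustments and credits are picked up.
-- The 35-day lookback and column names are assumptions (Snowflake-style syntax).
{{
    config(
        materialized = 'incremental',
        unique_key   = 'line_item_id'
    )
}}

select
    identity_line_item_id            as line_item_id,
    line_item_usage_start_date::date as usage_date,
    line_item_usage_account_id       as account_id,
    line_item_product_code           as service,
    line_item_unblended_cost         as unblended_cost
from {{ ref('stg_aws_cur') }}

{% if is_incremental() %}
  -- Rebuild the current and prior billing period rather than only new rows,
  -- since providers restate recent line items.
  where line_item_usage_start_date >= dateadd('day', -35, current_date)
{% endif %}
```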

4.4 Required Technical Skills

  • Expert SQL and data modeling (dbt, Dataform, or similar)
  • Experience with data warehouse platforms (Snowflake, BigQuery, Redshift, Databricks)
  • Python for data pipeline scripting and automation
  • Understanding of cloud billing export formats (CUR, BigQuery export, Azure exports)
  • Data quality testing frameworks (dbt tests, Great Expectations)

5. Position 2: Cost Allocation Data Engineer

This engineer builds the data models that allocate, attribute, and distribute cloud costs to business dimensions, enabling showback and chargeback capabilities.

  • 5-7 years of data engineering experience
  • This is the most complex position from a business logic perspective. Building allocation rules, amortization calculations, and understanding how to model financial constructs requires both technical depth and domain experience. Ideally someone who has built chargeback/showback systems before.

5.1 Key Deliverables

  1. Cost Allocation Models: Build models that map costs to teams, products, environments, and cost centers
  2. Amortization Logic: Implement models for RI/Savings Plan/CUD amortization across billing periods
  3. Shared Cost Distribution: Create models for allocating shared costs (support, shared services) to business units
  4. Showback/Chargeback Data Marts: Build final consumption-ready models for finance and team reporting
  5. Tagging Analytics Models: Create models tracking tag coverage, compliance, and untagged attribution

5.2 Data Models to Build

  • int_cost_with_amortization: Intermediate model applying amortized commitment costs
  • int_cost_fully_loaded: Intermediate model with discounts, credits, and custom rates applied
  • dim_cost_center: Business unit / cost center dimension with hierarchy
  • dim_team: Team/owner dimension mapped from tags and account ownership
  • dim_environment: Environment dimension (prod, staging, dev, sandbox)
  • bridge_cost_allocation: Allocation rules mapping resources to business dimensions
  • fct_allocated_cost: Final allocated cost fact table with full business attribution (illustrated after this list)
  • fct_shared_cost_distribution: Model distributing shared costs (proportional, even, fixed)
  • mart_showback_by_team: Team-level showback data mart
  • mart_chargeback_by_cost_center: Cost center chargeback data mart for finance
  • rpt_tag_coverage: Tag compliance and coverage reporting model
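
To make the allocation flow concrete, a hypothetical sketch of how fct_allocated_cost might resolve a team for each cost line: tag first, account ownership as the fallback. All model and column names are illustrative, not a required design:

```sql
-- Illustrative allocation sketch: attribute each cost line to a team using its
-- tag when present, falling back to the owning account's default team.
-- All model and column names (resource_id, default_team_id, etc.) are assumed.
select
    c.usage_date,
    c.unblended_cost,
    coalesce(t.team_id, a.default_team_id) as team_id,
    case when t.team_id is not null
         then 'tag'
         else 'account_owner_fallback'
    end                                    as allocation_method
from {{ ref('base_cloud_cost') }}        as c
left join {{ ref('dim_resource_tags') }} as rt
    on c.resource_id = rt.resource_id
left join {{ ref('dim_team') }}          as t
    on rt.team_tag_value = t.team_tag_value
left join {{ ref('dim_cloud_account') }} as a
    on c.account_id = a.account_id
```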

5.3 Engineering Responsibilities

  • Design allocation rule engine supporting tag-based, account-based, and custom mappings
  • Build amortization models for upfront payments (1-year, 3-year commitments)
  • Implement shared cost distribution models (proportional, even split, fixed coefficient), as sketched after this list
  • Create untagged cost attribution logic using account ownership and heuristics
  • Build budget vs. actual variance models
  • Design forecasting input models based on historical trends
  • Ensure data models align with accounting periods and matching principles
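
A minimal sketch of the proportional option for shared cost distribution, assuming a fct_allocated_cost model with a cost_category column; the even-split and fixed-coefficient variants would swap out the weighting expression:

```sql
-- Illustrative proportional shared-cost distribution: spread shared platform
-- costs across teams in proportion to each team's directly allocated spend.
-- All model and column names are assumptions for this sketch.
with direct_spend as (
    select team_id, sum(allocated_cost) as direct_cost
    from {{ ref('fct_allocated_cost') }}
    where cost_category = 'direct'
    group by team_id
),

shared_pool as (
    select sum(allocated_cost) as shared_cost
    from {{ ref('fct_allocated_cost') }}
    where cost_category = 'shared'
)

select
    d.team_id,
    d.direct_cost,
    s.shared_cost
        * d.direct_cost / sum(d.direct_cost) over () as distributed_shared_cost
from direct_spend as d
cross join shared_pool as s
```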

5.4 Required Technical Skills

  • Expert SQL and dimensional modeling
  • Experience building allocation and chargeback systems
  • Understanding of FinOps cost allocation methodologies
  • Data modeling tools (dbt, LookML, or similar)
  • Understanding of financial accounting concepts (amortization, accruals)

6. Position 3: Optimization Analytics Engineer

This engineer builds the data models that power cost optimization, including utilization analysis, commitment planning, rightsizing recommendations, and savings tracking.

  • 4-6 years of data/analytics engineering experience
  • Requires strong analytical modeling skills and understanding of cloud infrastructure. The commitment planning and unit economics work benefits from someone with experience bridging infrastructure metrics and financial data.

6.1 Key Deliverables

  1. Utilization Data Models: Build models integrating utilization metrics with cost data for waste identification
  2. Commitment Analytics Models: Create models tracking RI/Savings Plan/CUD utilization, coverage, and recommendations
  3. Rightsizing Models: Build recommendation models identifying oversized and idle resources
  4. Savings Tracking Models: Implement models measuring cost avoidance and realized savings
  5. Unit Cost Models: Create unit economics models (cost per request, per GB, per transaction)

6.2 Data Models to Build

  • stg_cloudwatch_metrics / stg_stackdriver_metrics / stg_azure_monitor: Utilization metric staging
  • int_resource_utilization: Resource-level utilization (CPU, memory, network, storage)
  • fct_commitment_inventory: All active RIs, Savings Plans, CUDs with terms and rates
  • fct_commitment_utilization: Daily utilization of each commitment
  • fct_commitment_coverage: Coverage analysis (on-demand vs. covered usage)
  • fct_idle_resources: Resources with zero or minimal utilization
  • fct_rightsizing_recommendations: Oversized resources with target size and savings estimate
  • fct_savings_realized: Tracked savings from rate optimization
  • fct_cost_avoidance: Tracked cost avoidance from usage optimization
  • mart_unit_cost: Unit economics ($/request, $/GB, $/CPU-hour); a sample calculation follows this list
  • mart_optimization_opportunities: Prioritized optimization recommendations
  • rpt_commitment_recommendations: RI/SP/CUD purchase recommendations based on usage patterns
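
As an example of the unit economics pattern, a sketch of a cost-per-request calculation. The model and column names are assumptions, and the stg_business_metrics source stands in for whatever business telemetry the platform ingests:

```sql
-- Illustrative unit-cost model: divide a service's daily allocated cost by a
-- daily business metric (requests served) to get cost per request.
-- Model and column names are assumptions for this sketch.
with daily_cost as (
    select usage_date, service, sum(allocated_cost) as total_cost
    from {{ ref('fct_allocated_cost') }}
    group by usage_date, service
),

daily_requests as (
    select metric_date, service, sum(request_count) as total_requests
    from {{ ref('stg_business_metrics') }}
    group by metric_date, service
)

select
    c.usage_date,
    c.service,
    c.total_cost,
    r.total_requests,
    c.total_cost / nullif(r.total_requests, 0) as cost_per_request
from daily_cost as c
join daily_requests as r
    on c.usage_date = r.metric_date
   and c.service    = r.service
```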

6.3 Engineering Responsibilities

  • Integrate utilization metrics (CloudWatch, Stackdriver, Azure Monitor) with cost data
  • Build commitment utilization and coverage calculations by provider
  • Implement rightsizing logic using utilization thresholds and cost impact
  • Create commitment recommendation models analyzing usage stability
  • Build unit cost calculations tied to business metrics
  • Design what-if scenario models for commitment planning
  • Implement savings tracking models comparing actual vs. baseline costs

6.4 Required Technical Skills

  • Expert SQL and analytics modeling
  • Experience with cloud monitoring data (CloudWatch, Stackdriver, Azure Monitor)
  • Understanding of cloud commitment mechanisms (RIs, Savings Plans, CUDs)
  • Statistical analysis for usage pattern identification
  • Data modeling tools (dbt, Dataform, or similar)

7. Acceptance Criteria

All data models must meet the following engineering standards:

  • Data Accuracy: Reconciliation models must validate within 0.1% of provider invoices (a sample reconciliation test follows this list)
  • Freshness: Models refresh within 24 hours of source data availability
  • Test Coverage: All models include data quality tests (uniqueness, not null, referential integrity)
  • Documentation: All models documented with descriptions, column definitions, and business logic
  • Performance: Query performance meets SLAs for downstream dashboards and reports
  • Version Control: All code in Git with PR reviews and CI/CD deployment
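
For instance, the 0.1% reconciliation criterion could be enforced as a dbt singular test along these lines; rpt_invoice_reconciliation and its columns are assumptions for the sketch:

```sql
-- Illustrative dbt singular test: fail if any billing period's processed total
-- drifts more than 0.1% from the provider invoice total.
-- rpt_invoice_reconciliation and its columns are assumed names.
select
    billing_period,
    cloud_provider,
    processed_total,
    invoice_total
from {{ ref('rpt_invoice_reconciliation') }}
where abs(processed_total - invoice_total) / nullif(invoice_total, 0) > 0.001
```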
