Overview
Skills
Job Details
EXISTING TECHNOLOGY ENVIRONMENT
Existing technology environment includes the following:
1. Data Collection
a. Email: Most data collection is currently done by email. Obligated entities share spreadsheets by email in response to requests made by the department.
b. Manual Download: Third party data is often manually downloaded from the web by staff members and converted into Excel spreadsheets.
c. API: There are limited instances where data is collected via API integration with Power BI.
2. Data storage
a. SharePoint: Most data assets are stored in Excel files in SharePoint.
b. Local Storage: Some data assets are stored in Excel files on local hard drives.
3. ETL/Analytics
a. Most analysis is conducted manually in Excel workbooks.
b. Some ETL and analysis is done Power BI / Power Query
4. Sharing
a. Most data is shared manually via email in response to a direct request or is manually uploaded to ePUC (the Public Utility Commission s case management platform).
b. Analysis and visualizations are often imbedded in Power Point presentations or Word documents which are then posted to the Department s web site or are uploaded to ePUC. REQUIREMENTS:
Knowledge Requirements
1. Azure SQL DB The vendor must demonstrate strong experience in designing, implementing, and maintaining Azure SQL Databases. This includes expertise in database architecture, performance tuning, indexing strategies, security configurations, backup and restore operations, and managing high availability and disaster recovery setups. Familiarity with scaling databases to meet varying workloads and implementing monitoring and alerting solutions is required.
2. ETL Processing with Python in Data Factory The vendor must have hands-on experience in building, deploying, and managing ETL pipelines using Azure Data Factory, with Python-based transformations. This includes extracting data from multiple sources, transforming it according to business rules, and loading it into target systems efficiently. Knowledge of orchestrating , pipelines, scheduling, error handling, logging, and implementing reusable and maintainable Python scripts is essential.
3. Azure Blob Storage The vendor must have practical knowledge of Azure Blob Storage for storing, managing, and retrieving large datasets. This includes configuring containers, managing access policies, implementing security measures, optimizing storage for performance and cost, and integrating Blob Storage with other Azure services. Experience with automated data movement, lifecycle management, and data archiving is preferred.
Functional Requirements
1. Requirements Elaboration o Working with PSD and ADS staff, refine functional (business) and non functional (technical) requirements, culminating in a requirements repository. o Working with PSD and ADS staff, establish minimum performance expectations and success criteria for the end product.
2. Documentation & Training o Deliver schema design and database structure documentation. o Provide security configurations and access policies. o Train VT PSD personnel on database usage and Power BI integration.
Non-Functional Requirements
1. Data Management & Storage o Support storage and retrieval of specified data types and sources. o Handle expected data volumes efficiently. o Ensure data integrity and accuracy during migration/import.
2. Database Schema & Design o Implement an optimized, normalized relational database schema. o Support relational integrity with primary and foreign keys. o Include indexing for performance optimization. o Allow for future scalability and extensibility.
3. Extract Transform Load (ETL) Processing o Implement an optimized ETL solution that utilizes Python scripting to handle the transformation in Azure infrastructure. o ETL processing should be able to be scheduled when new data files are received. o ETL processing needs be standardized between the data files and source systems. o ETL Templates based on the data sources for future expansion.
4. Integration Capabilities o Enable integration with existing VT PSD systems. o Ensure Azure SQL Database connectivity with Power BI. o Support direct querying and/or data export to Power BI.
Provide necessary data pipelines or views for Power BI. o Enable API connections with current and future utility data systems, including, NEPOOL GIS, ISO-NE, and VELCO's VX platform. 5. Security & Compliance o Implement role-based access control (RBAC). o Ensure data encryption at rest and in transit. o Adhere to compliance requirements from VT PSD and VT ADS. 6. Testing & Validation o Validate database integrity and security controls. o Implement and validate a database maintenance plan. 7. Deployment & Support o Deploy the database in the Azure SQL production environment. o Provide post-implementation monitoring and issue resolution. o Ensure database connectivity via VPN and office locations. 8. Performance & Optimization o Optimize database performance for querying and analytics. o Ensure query performance under expected load conditions. Figure 1. ADS proposed data architecture for the Department. Phase 1: Requirements Gathering 1. Collaborate with stakeholders at VT PSD to refine business requirements, including: o Data types, sources, and expected data volumes. o Key metrics and reports needed for Power BI dashboards. Page 7 of 14 o Data security, compliance, and access control requirements.
2. Document the functional and non-functional requirements for the database. 3. Identify any existing systems or processes that need to integrate with the new database including available staff capacity. Phase 2: Database Schema Design 1. Logical Design: o Design an optimized, normalized schema to support VT PSD s data structure, ensuring scalability and performance. o Include relational tables, primary keys, foreign keys, indexes, and relationships. o Plan for future extensibility of the database. 2. Physical Design: o Define storage requirements and configurations for the Azure SQL Database. o Optimize database performance for querying and analytics. 3. Present and review the schema design with VT PSD stakeholders for approval before implementation. 4. PSD project manager sign-off
Phase 3: Database Implementation
1. Set up the Azure SQL environment, including resource groups, server configurations, and database deployment.
2. Create the approved schema in the Azure SQL environment.
3. Migrate or import existing data, if applicable, ensuring data integrity and accuracy.
4. Implement security measures identified in Phase 1, including: o Role-based access control (RBAC).
o Data encryption at rest and in transit.
5. Perform data validation and integrity checks.
6. Implement Database Maintenance plan in accordance with ADS best practices.
Phase 4: Testing and Validation
1. Conduct thorough testing of the database, including:
o Query performance under expected load conditions.
o Security tests to verify data access controls.
2. Test database maintenance plan.
3. Address and resolve any issues identified during testing.
Phase 5: Documentation and Training
1. Deliver comprehensive documentation, including:
o Schema design and database structure.
o Security configurations and access control policies.
2. Provide training for VT PSD personnel on:
o Database usage and maintenance.
Phase 6: Deployment and Support
1. Deploy the database to the production environment in Azure SQL.
2. Monitor performance and resolve any post-deployment issues.
3. Confirm connectivity and major functionality works for PSD staff via VPN and office locations.
4. Provide post-implementation support for a defined period (e.g., 90 days).
Deliverables
1. Approved database schema design.
2. Fully implemented Azure SQL Database.
3. Documentation and training materials.
4. Knowledge transfer to enable the state to support the solution post-implementation.
Timeline
Estimated duration: 3 months from finalized SOW Agreement and purchase order.
Qualifications Required
1. Proficiency in Azure SQL Database architecture and implementation.
2. Experience designing and optimizing relational databases.
3. Expertise in integrating Azure SQL with Power BI.
4. Knowledge of data security and compliance best practices.
5. Strong communication skills for collaboration with VT PSD stakeholders.
Assumptions and Dependencies
VT PSD will provide timely access to stakeholders and existing data systems.
Azure infrastructure (subscriptions and accounts) will be provisioned by VT ADS.
Any third-party tools required will be procured by the State.