Role Title: Data Engineer (W2 Only)
Location: Remote
Visa: GC or USC only
Role Description: Design, build, and operate reliable archiving and purging solutions on Microsoft SQL Server for datasets stored or represented in EBCDIC format, using SQL Server stored procedures and Python automation.
Own batch execution and scheduling via Control‑M, ensuring performance, auditability, retention compliance, and safe deletion practices.
Responsibilities:
Build and maintain SQL Server stored procedures for data archiving, purge eligibility determination, and purge execution (soft or hard deletes as required).
Handle EBCDIC data processing requirements (e.g., EBCDIC↔ASCII/Unicode conversion, fixed-length records, packed decimals, and COBOL-style encodings where applicable) and ensure data integrity post-conversion.
Develop Python utilities for orchestration, validation, reconciliation, exception handling, and operational automation (e.g., pre/post checks, file-based extracts, run reports).
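As a hedged illustration of the EBCDIC processing described above, the sketch below decodes a hypothetical fixed-length record containing a CP037 text field and a COMP-3 (packed decimal) amount. The record layout, field names, and scale are illustrative assumptions, not details from this role.

```python
def unpack_comp3(raw: bytes, scale: int = 0):
    """Decode an IBM packed-decimal (COMP-3) field.

    Each byte holds two decimal digits; the low nibble of the final
    byte is the sign (0xD = negative, 0xC or 0xF = positive).
    """
    digits = []
    for b in raw[:-1]:
        digits.append((b >> 4) & 0xF)
        digits.append(b & 0xF)
    digits.append((raw[-1] >> 4) & 0xF)  # high nibble of last byte is a digit
    sign_nibble = raw[-1] & 0xF

    value = 0
    for d in digits:
        if d > 9:
            raise ValueError("invalid packed-decimal digit")
        value = value * 10 + d
    if sign_nibble == 0xD:
        value = -value
    return value / (10 ** scale) if scale else value


def parse_record(record: bytes):
    """Parse a hypothetical 15-byte record: 10-char EBCDIC name + 5-byte COMP-3 amount."""
    name = record[:10].decode("cp037").rstrip()  # EBCDIC space 0x40 decodes to ' '
    amount = unpack_comp3(record[10:15], scale=2)
    return name, amount
```

A post-conversion integrity check would typically re-encode the text field and compare byte-for-byte, and reconcile decoded totals against control counts from the source extract.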
Implement purge controls: retention rules, legal holds/exclusions, referential integrity management, dependency ordering, and rollback/restore strategy.
Optimize performance for large volumes: partitioning strategies, indexing, batching, minimal-logging approaches where appropriate, and transaction scoping.
Integrate and run jobs in Control‑M: job definitions, calendars, dependencies, resource limits, alerting, and rerun/restart logic.
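One way to sketch the dependency-ordering requirement above: derive a child-first delete order from foreign-key relationships so purges never violate referential integrity. The table names here are hypothetical; in practice the parent→children map would be built from SQL Server's `sys.foreign_keys` catalog view. Uses the stdlib `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

def delete_order(parents_to_children: dict) -> list:
    """Return tables ordered so every child table is purged before its parent."""
    ts = TopologicalSorter()
    for parent, children in parents_to_children.items():
        # deleting `parent` depends on its child tables being purged first
        ts.add(parent, *children)
    return list(ts.static_order())
```

For example, with `{"Order": ["OrderLine", "Shipment"], "Customer": ["Order"]}` (hypothetical tables), `OrderLine` and `Shipment` come before `Order`, which comes before `Customer`. A cycle in the FK graph raises `graphlib.CycleError`, which is exactly the case that needs manual dependency-breaking.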
Create operational runbooks, monitoring dashboards/metrics, and support production incidents (root cause, fixes, postmortems).
Ensure security and compliance: least-privilege access, audit trails, PII handling, and evidence for retention/purge execution.
Collaborate with DBAs, data governance, and application teams to validate retention policy interpretation and downstream impacts.
Required technical skills
SQL Server: T‑SQL, stored procedures, transactions, error handling, temp tables, indexing, query tuning, SSMS, SQL Agent fundamentals (even if Control‑M is primary).
EBCDIC: understanding of common code pages (e.g., 037/1047), conversion pitfalls, fixed-width layouts, and validation strategies.
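A minimal sketch of one validation strategy for the conversion pitfalls mentioned above: strict decoding plus a round-trip check, so lossy or mis-mapped bytes fail loudly instead of silently corrupting data. CP037 ships with Python's stdlib codecs; 1047 is not in the standard codec registry and would need a custom or third-party mapping.

```python
def validate_conversion(ebcdic_bytes: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC bytes and verify the conversion is lossless."""
    text = ebcdic_bytes.decode(codepage, errors="strict")  # fail fast on bad bytes
    # Round-trip check: re-encoding must reproduce the original bytes exactly.
    if text.encode(codepage) != ebcdic_bytes:
        raise ValueError("lossy EBCDIC conversion detected")
    return text
```

In a real pipeline this would run per field, with failures routed to an exception file rather than aborting the whole batch.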
Python: building maintainable scripts/modules for ETL-style tasks, database connectivity (ODBC), logging, configuration, and testability.
Batch scheduling: Control‑M job creation/execution, dependencies, alerts, reruns, and operational support.
Data lifecycle: archiving patterns, retention, purge frameworks, auditability, and safe-delete patterns.
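As a small, testable sketch of the safe-delete patterns mentioned above: a generator that yields bounded key ranges, so each purge statement (e.g., a `DELETE ... WHERE id BETWEEN lo AND hi` issued over ODBC) touches a limited number of rows and keeps transactions short. The key column and batch size are illustrative assumptions.

```python
def key_batches(min_key: int, max_key: int, batch_size: int):
    """Yield inclusive (lo, hi) key ranges covering [min_key, max_key]."""
    lo = min_key
    while lo <= max_key:
        hi = min(lo + batch_size - 1, max_key)
        yield lo, hi
        lo = hi + 1
```

Batching this way also gives natural checkpoints for audit logging and for rerun/restart logic: a failed run can resume from the last committed range rather than starting over.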
Nice-to-have skills
Experience with very large tables (VLDB), table partitioning/switching, and archival to cheaper storage tiers.
Familiarity with mainframe-originated data models, COBOL copybooks, and packed numeric formats.
CI/CD for database objects (DACPAC/SSDT or equivalent), Python packaging, and automated testing.
Knowledge of data governance tools/processes and legal hold workflows.
Experience in regulated environments (SOX, PCI, HIPAA, GDPR-like retention controls).