Job Title: ETL Data Architect – Telecom Domain
Location: West Chester, PA
Experience: 10+ years (preferred)
Technology Focus: PySpark, Spark Cluster, Python, ETL Data Pipelines, Databricks, Rundeck
Domain Expertise: Telecom (Buyflow, Billing, Provisioning, Customer Journeys, Order Processing, etc.)
Role Overview
We are looking for an experienced Telecom ETL Data Architect who can design, architect, and optimize large-scale data pipelines within the telecom ecosystem. This role requires deep hands-on skills in PySpark, Spark clusters, Databricks, and ETL orchestration, along with strong communication, articulation, and storytelling abilities.
The ideal candidate can translate complex technical workflows into clear business narratives, influence stakeholders through structured communication, and derive new use cases from data insights across telecom domains.
Key Responsibilities
Data Architecture & Engineering
• Design and architect scalable, high-performance ETL/ELT data pipelines using PySpark, Python, and Spark clusters.
• Develop data models and frameworks aligned with telecom processes such as buyflow, billing, usage, customer management, and order processing.
• Build, optimize, and monitor pipelines on Databricks (Delta Lake, Workflows, cluster configuration); see the pipeline sketch after this list.
• Define and enforce ETL standards, data quality rules, and engineering best practices.
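For illustration, the sketch below shows the general shape of the batch pipelines this role owns: a PySpark job on Databricks that reads raw events, applies data-quality rules, aggregates, and writes a curated Delta table. All paths, table names, and columns (billing_events, curated.billing_daily) are hypothetical examples, not references to an existing system.

```python
# Minimal sketch of a batch ETL step: raw billing events -> curated Delta table.
# Paths, table names, and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("billing_daily_etl").getOrCreate()

# Extract: raw events landed by an upstream ingestion job.
raw = spark.read.format("delta").load("/mnt/raw/billing_events")

# Transform: basic data-quality filter, then a daily aggregate per account.
daily = (
    raw.filter(F.col("amount").isNotNull() & (F.col("amount") >= 0))
       .withColumn("event_date", F.to_date("event_ts"))
       .groupBy("account_id", "event_date")
       .agg(F.sum("amount").alias("billed_amount"),
            F.count("*").alias("event_count"))
)

# Load: write the curated table, partitioned by day, for downstream reporting.
(daily.write.format("delta")
      .mode("overwrite")
      .partitionBy("event_date")
      .saveAsTable("curated.billing_daily"))
```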
Domain & Business Integration
• Work closely with business and product teams across telecom functions:
o Customer / Account onboarding
o Buyflow journeys
o Billing & payments
o Network provisioning
o Customer service & troubleshooting
• Translate domain processes into logical and physical data flows.
Communication & Storytelling
• Clearly articulate technical solutions to non-technical stakeholders.
• Create data stories and high-impact presentations that connect data insights to business outcomes.
• Communicate architectural decisions, trade-offs, and roadmap recommendations.
• Present complex architecture in simplified, visual storytelling formats.
Orchestration & Operations
• Implement and maintain job scheduling/orchestration using Rundeck or similar tools.
• Build monitoring, logging, and automated recovery mechanisms for pipelines (see the retry sketch after this list).
• Ensure end-to-end performance tuning, cost optimization, and SLA adherence.
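As an illustration of the recovery behavior described above, here is a minimal plain-Python wrapper of the kind a Rundeck-scheduled job might invoke. The step and alerting functions (run_billing_pipeline, notify_on_call) are hypothetical placeholders and not part of any specific tool's API.

```python
# Sketch of a retry/alerting wrapper around a pipeline step; a scheduler such as
# Rundeck would simply invoke this script on its defined schedule.
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("etl.runner")

def notify_on_call(step_name):
    # Hypothetical alerting hook; in practice this might page on-call via PagerDuty or Slack.
    log.error("Escalating repeated failure of %s to on-call", step_name)

def run_with_retry(step, max_attempts=3, backoff_seconds=60):
    """Run a pipeline step, retrying transient failures with linear backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            log.info("Starting %s (attempt %d/%d)", step.__name__, attempt, max_attempts)
            step()
            log.info("%s succeeded", step.__name__)
            return
        except Exception:
            log.exception("%s failed on attempt %d", step.__name__, attempt)
            if attempt == max_attempts:
                notify_on_call(step.__name__)
                raise
            time.sleep(backoff_seconds * attempt)

# Example usage with a hypothetical step:
# run_with_retry(run_billing_pipeline, max_attempts=3)
```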
Collaboration & Leadership
• Work with cross-functional engineering teams, product owners, and solution architects.
• Mentor junior data engineers and set coding and architecture standards.
• Drive design sessions, code reviews, and architecture assessments.
Technical Skills
• Strong expertise in PySpark, Spark SQL, and Spark Streaming (see the streaming sketch after this list).
• Proven experience building pipelines in Databricks (Delta Lake, Jobs, Clusters).
• Solid Python programming and modular ETL development experience.
• Experience with telecom systems, data domains, and operational workflows.
• Knowledge of CI/CD, version control (Git), and cloud platforms (Azure/AWS/Google Cloud Platform).
• Experience with orchestration tools such as Rundeck, Airflow, or Control-M.
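To ground the streaming skill listed above, the following sketch shows a Spark Structured Streaming job that consumes order events from Kafka and appends them to a Delta table. Broker addresses, topic name, schema, and paths are assumptions for illustration only.

```python
# Sketch of a Structured Streaming job: Kafka order events -> Delta table.
# Broker, topic, schema, and paths are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("order_events_stream").getOrCreate()

order_schema = StructType([
    StructField("order_id", StringType()),
    StructField("account_id", StringType()),
    StructField("status", StringType()),
    StructField("event_ts", TimestampType()),
])

# Read raw Kafka messages and parse the JSON payload into typed columns.
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker1:9092")
         .option("subscribe", "order-events")
         .load()
         .select(F.from_json(F.col("value").cast("string"), order_schema).alias("e"))
         .select("e.*")
)

# Append parsed events to a Delta table, with a checkpoint for exactly-once recovery.
query = (
    events.writeStream.format("delta")
          .option("checkpointLocation", "/mnt/checkpoints/order_events")
          .outputMode("append")
          .start("/mnt/curated/order_events")
)
# query.awaitTermination()  # block until the stream is stopped
```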
Soft Skills
• Excellent articulation and structured communication.
• Ability to simplify complex data concepts into business-friendly narratives.
• Strong problem-solving, analytical, and decision-making skills.
• Ability to lead conversations with senior stakeholders.
• A natural storyteller able to craft compelling architecture and data narratives.
Preferred Qualifications
• Background working with large telecom providers (e.g., Comcast/Xfinity, AT&T, Verizon).
• Experience with event-driven architecture, Kafka, or real-time streaming.
• Understanding of telecom KPIs, customer journeys, and operational systems.
• Experience in performance tuning of Spark workloads.