Job Details
* Analyze large-scale wireless (RAN, Core) and IEN network alarm data from OSS/NMS systems
* Identify patterns, trends, and recurring fault signatures across network domains
* Develop KPIs and dashboards to track network health and fault trends
* Build machine learning models for alarm correlation, noise reduction, root cause analysis, anomaly detection, and predictive fault forecasting
* Apply supervised and unsupervised learning techniques such as clustering, classification, and time-series analysis
* Clean, normalize, and enrich alarm data from multiple sources
* Integrate data from OSS, EMS, NMS, CMDB, and performance systems
* Automate fault insight pipelines and model deployment
* Collaborate with NOC, Network Engineering, and Reliability teams to translate analytical findings into operational recommendations
* Support proactive maintenance and incident prevention initiatives
* Create interactive dashboards and reports for real-time fault monitoring
* Present insights clearly to technical and non-technical stakeholders
* Strong proficiency in Python (Pandas, NumPy, Scikit-learn, PySpark) or R
* Experience with time-series data and event/alarm analytics
* Knowledge of machine learning algorithms for classification, clustering, and anomaly detection
* Experience with SQL and big data platforms such as Spark or Hadoop
* Familiarity with visualization tools like Tableau, Power BI, Grafana, or Python visualization libraries
* Understanding of wireless networks (2G/3G/4G/5G, RAN, Core)
* Knowledge of IEN/IP/Ethernet networking concepts
* Familiarity with network alarms, fault management, and OSS/NMS systems
* Understanding of MTTR, SLA, availability, and reliability metrics
* Strong analytical and problem-solving skills
* Ability to communicate insights effectively and work in cross-functional operational teams
* Experience in telecom, ISP, or network operations environments preferred
* Knowledge of AIOps or network intelligence platforms preferred
* Experience with real-time streaming data tools such as Kafka or Flink preferred
* Exposure to ITIL and incident/problem management frameworks preferred
* Deliverables include: alarm correlation and RCA models that reduce false positives; predictive fault alerts that improve proactive maintenance; operational dashboards for NOC and engineering teams; and documentation with model performance reports