On Wednesday 14th November, Modern Networks' telecoms partner, Gamma Telecom, suffered a major service outage (MSO) affecting the Horizon telephony platform. Many of our clients who use Horizon were affected by the outage and experienced disruption to their business activities. We have now received a Service Incident Report from Gamma.
The incident was caused by a bug on the platform. Although the bug was fixed within an hour and a half of its discovery, the wider platform remained unstable for the rest of the day. Following planned work on the Horizon servers, one server began to experience connectivity problems. Initially, this affected only a small number of users. An emergency patch was applied, and the server was restarted and tested. However, similar connectivity issues began to appear on several other servers as the volume of traffic across the platform increased. By 2.50pm, Horizon engineers and their partners had made changes to core equipment, which resulted in a significant increase in the number of handsets successfully re-registering with the system. At 4.25pm, configuration changes were made to improve the stability of registered handsets. By 1am the following morning, patches had been applied to all Horizon servers, completely resolving the registration and stability issues experienced by users.
Horizon and its partners are now conducting a more in-depth investigation into the causes of the incident and the company’s capacity to deal with such incidents, so that lessons can be learned and implemented.