Performance Degraded
Incident Report for Apteryx XVWeb
Postmortem

Context/Background

XVWeb runs on multiple server instances in the Azure Cloud. To better serve our customers' security and reliability needs, XVWeb has moved to a new isolated service environment over the past month, providing us with our own dedicated hardware within the Azure Cloud.

Problem Summary

Starting at approximately 1:56 pm EST on Jan 04, 2023, one of our 35 server instances crashed after a rise in traffic and then failed to restart. Normally, an automated system removes an unhealthy server instance from circulation so our customers are unaffected. However, we identified a previously unknown condition in our new isolated service environment in which the unhealthy server instance was not detected, so traffic continued to go to the affected server instance.
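The behaviour expected of the automated system can be sketched as follows. This is a minimal illustration only, not the actual XVWeb or Azure mechanism: the `InstancePool` class, the `probe` callback, the instance names, and the threshold of three consecutive failures are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class InstancePool:
    """Hypothetical model of a pool that routes traffic only to healthy instances."""
    probe: Callable[[str], bool]      # returns True when the instance passes its health check
    failure_threshold: int = 3        # assumed value, not taken from the report
    failures: Dict[str, int] = field(default_factory=dict)
    in_rotation: List[str] = field(default_factory=list)

    def add(self, name: str) -> None:
        self.failures[name] = 0
        self.in_rotation.append(name)

    def run_health_checks(self) -> None:
        for name in list(self.failures):
            if self.probe(name):
                # A passing probe resets the counter and restores the instance.
                self.failures[name] = 0
                if name not in self.in_rotation:
                    self.in_rotation.append(name)
            else:
                self.failures[name] += 1
                # Remove the instance once it has failed enough consecutive probes.
                if (self.failures[name] >= self.failure_threshold
                        and name in self.in_rotation):
                    self.in_rotation.remove(name)


# Simulate 35 instances where one (hypothetically "instance-07") never recovers.
down = {"instance-07"}
pool = InstancePool(probe=lambda name: name not in down)
for i in range(1, 36):
    pool.add(f"instance-{i:02d}")
for _ in range(3):
    pool.run_health_checks()
print(len(pool.in_rotation))  # 34
```

In this incident the equivalent of `probe` did not report the failed instance as unhealthy, so it was never removed and kept receiving its share of traffic.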

Affected customers would see the error "500.30 ASP.NET Core app failed to start" when loading the XVWeb site. A refresh of the browser would resolve this issue for most XVWeb.net customers. Denticon customers would need to reload the Dentiray Web imaging window until the patient's images loaded. Once the XVWeb imaging page was loaded, customers would be able to continue their workflow normally. Customers would sporadically see thumbnails fail to load in the series bar, and some customers whose traffic was sent to the unhealthy server instance would see slow loading times on images as the requests were retried on healthy servers.
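The slow image loads follow from how retries work: a request that first lands on the unhealthy instance must fail before it is sent elsewhere. The sketch below illustrates this under stated assumptions. The report does not say which component performed the retries, and `fetch_with_retry`, `fake_send`, and the instance names are hypothetical.

```python
from typing import Callable, Sequence, Tuple


def fetch_with_retry(send: Callable[[str], int],
                     instances: Sequence[str],
                     max_attempts: int = 3) -> Tuple[int, int]:
    """Try successive instances until one returns a non-5xx status.

    Returns (final_status, attempts_made).
    """
    status = 503   # reported if no instance could be tried at all
    attempts = 0
    for name in instances[:max_attempts]:
        attempts += 1
        status = send(name)
        if status < 500:
            break
    return status, attempts


def fake_send(name: str) -> int:
    # Hypothetical stand-in for an HTTP request: one instance always fails.
    return 500 if name == "instance-07" else 200


result = fetch_with_retry(fake_send, ["instance-07", "instance-08", "instance-09"])
print(result)  # (200, 2)
```

The request succeeds, but only on the second attempt, so the customer waits for the failed first attempt as well as the successful one.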

Mitigation

At 3:54 pm EST, we identified the issue causing the intermittent failures.

At 4:10 pm EST, XVWeb was restarted with fixes to the automated system that removes unhealthy server instances.

Posted Jan 05, 2023 - 16:44 EST

Resolved
Monitoring has shown system performance to be stable since the fix was implemented.
Posted Jan 04, 2023 - 18:14 EST
Monitoring
A fix has been implemented and we are monitoring to ensure system performance returns to normal.
Posted Jan 04, 2023 - 16:52 EST
Investigating
XVWeb is experiencing a spike in exceptions, and our engineers are actively investigating.
Posted Jan 04, 2023 - 16:00 EST
This incident affected: XVWeb (XVWeb).