Intelligent Cloud Services Status

ICRT Service Affecting Issues 14 February 2018, 09:42:30 – 10:03:45utc and 10:00 – 12:45utc
On 14 February 2018 at 09:42utc, a tenant unintentionally initiated Salesforce Outbound Messages that generated over 500K process instances within a 20-minute period. At approximately 10:00utc, a second tenant unintentionally initiated a significantly greater number of processes, also using Salesforce Outbound Messages. The two occurrences were coincidental.

Between 09:42utc and 10:02utc, adjustments were made to distribute the processing load; this was completed by 10:02utc. During this period, high CPU utilization resulted in 90th percentile response times of 1 to 9 seconds, with outliers as high as 103 seconds.

At approximately 10:03utc, API response times returned to approximately 0.5 seconds above normal. Between 09:42utc and 12:45utc, Process Designer was sluggish, and listing processes in Process Console was slow and could time out. This Process Designer and Process Console behavior continued until 11:54utc, at which point the conditions started to subside. Processing returned to normal operating conditions at 12:45utc.

We have taken initial steps to better handle the high CPU utilization conditions that manifested as unacceptably high 90th percentile response times and a sluggish Process Designer and Process Console. XML DOM processing of large payloads under high load conditions, such as those experienced between 09:42utc and 12:45utc, is being reviewed.
Posted Feb 14, 2018 - 17:48 PST