What you need to know
- Azure virtual machines were hit by a global outage that began around 1 AM EST on October 13, 2021.
- The outage lasted roughly six and a half hours and affected several services that depend on Azure virtual machines.
- Microsoft explains that the outage occurred because "required artifact version data could not be queried."
Microsoft's Azure virtual machine service was disrupted for about six and a half hours last night. The outage started at 1:12 AM EST and lasted until 7:45 AM EST. During that window, a subset of Windows Virtual Machines may have failed when performing service management operations, including start, create, update, and delete. Services that depend on Windows Virtual Machines may also have failed during the same period. According to Microsoft, non-Windows Virtual Machines and Windows Virtual Machines that were already running should not have been affected.
Microsoft identified the preliminary root cause, which is outlined on the Azure status page:
We identified that calls made during service management operations were failing as a required artifact version data could not be queried. Our investigation focused on the backend compute resource provider (CRP) to determine why the calls were failing, and identified that a required VMGuestAgent could not be queried from the repository. The VM Guest Agent Extension publishing architecture was being migrated (as part of a migration of legacy service management backend systems) to a new platform which leverages the latest Azure Resource Manager (ARM) capabilities.
Microsoft mitigated the impact of the issue by "marking the appropriate extensions to the correct expected level (in this case, public)." The company will perform a full Root Cause Analysis (RCA) within 72 hours of the incident.