Most common deployment slot swap failures and how to fix them

Azure Web App Deployment Slots are used to deploy new versions of an application code into production with no interruption to the production traffic. In order to achieve this the swap process involves multiple steps that are performed to prepare the new version of the code to successfully handle the load once it is in production slot. Some of these steps may go wrong, especially when the new version of the code does not cooperate well. This in turn either causes the swap to fail or it results in swapping new code in production while it is still not ready to handle the production load. This post describes the most common reasons why this may happen and how to correct them.

In order to better understand the reasons for the swap failures it is first necessary to explain how the application code in the staging slot is initialized / warmed up prior to the swap to production. Failures during these steps are the most common reasons for the overall failure of the swap operation.

The swap operation is done by an internal process that runs within a scale unit where web app is hosted. Here are the steps that it performs to ensure the application is initialized prior to the swap. Note that the same sequence of actions happens during Auto-Swap and Swap with Preview.

  1. Apply the production configuration settings to all web app’s instances in the staging slot. This happens when web app has appsettings or connection strings marked as “Slot settings” or if Continuous Deployment is enabled for the site or if Site Authentication is enabled. This will trigger all instances in the staging slot to restart. (For Swap with Preview this the first phase of the swap after which the swap process is paused and you can validate that the application works correctly with production settings)
  2. Wait for every instance to complete its restart. If some instance failed to restart then the swap process will revert any configuration changes to the app in staging slot and will not proceed further. If that happens the first place to look into is the D:\home\LogFiles\eventlog.xml file of the application specific error log (such as php_errors.log for PHP apps) where you may find more clues what prevents application from starting.
  3. If Local Cache is enabled then swap process will trigger Local Cache initialization by making an HTTP request to the root directory URL path (“/”) of the web app on every web worker. Local Cache Initialization consists of copying the site’s content files from network share to the local disk of the worker and then re-pointing the web app to use local disk for its content. This causes another restart of the web app. The swap process will wait until the Local Cache is completely initialized and restarted on every instance before proceeding further. A common reason why Local Cache Initialization may fail is when site content size exceeds the local disk quota specified for the Local Cache. If that is the case the the quota can be increased by following instructions from Local Cache Documentation.
  4. If Application Initialization (AppInit) is enabled then swap process will make another HTTP request to the root URL path on every web worker. The AppInit is a module that runs within the web app request processing pipeline and it gets executed when web app starts. The only thing the swap process does with its first HTTP request to the web app is it triggers the AppInit module to do its work. After that it just waits until AppInit reports that it has completed the warmup. AppInit module uses the list of URL paths specified inside web.config file and makes internal HTTP requests to each of those. All these requests are within the web app process. It does not call any external URL’s and its requests are not going through the scale unit’s front ends. Also, neither the initial HTTP request nor AppInit internal requests follow HTTP redirects. That causes the most common problem that users run into with this module. If web app has such rewrite rules as “Enforce Domain” or “Enforce HTTPs” then none of the warmup requests will actually reach the application code. All the requests will be shortcut by the rewrite rules. In order to prevent that the rewrite rules need to be modified like below:
    <rule name="Canonical Host Name" stopProcessing="true">
    <match url="(.*)" />
    <conditions>
    <add input="{WARMUP_REQUEST}" pattern="1" negate="true" />
    <add input="{REMOTE_ADDR}" pattern="^100?\." negate="true" />
    <add input="{HTTP_HOST}" negate="true" pattern="^ruslany\.net$" />
    </conditions>
    <action type="Redirect" url="http://ruslany.net/{R:1}" redirectType="Permanent" />
    </rule>
    
    <rule name="Redirect to HTTPS" stopProcessing="true">
    <match url="(.*)" />
    <conditions>
    <add input="{WARMUP_REQUEST}" pattern="1" negate="true" />
    <add input="{REMOTE_ADDR}" pattern="^100?\." negate="true" />
    <add input="{HTTPS}" pattern="^OFF$" />
    </conditions>
    <action type="Redirect" url="https://{HTTP_HOST}/{R:1}" redirectType="Permanent" />
    </rule>
    

    The {WARMUP_REQUEST} is a server variable that is set by AppInit module for each of its internal requests. That is a reliable way to distinguish whether the request is external or is made by AppInit module. The {REMOTE_ADDR} is a server variable that contains the IP address of HTTP client. The IP address ranges starting with “10.” or “100.” are internal to the scale unit and no outside HTTP client can use them.

  5. If AppInit is not enabled then swap process just makes an HTTP request to the root path of the webapp on each web worker and as long as it receives some HTTP response it considers the warmup complete. Again the rewrite rules in the web app can cause the site to return HTTP redirect response and the actual application code will not be executed at all. Since the AppInit is not involved here the only way to prevent the rewrite rules is to use the {REMOTE_ADDR} server variable in the rule’s conditions as shown below.
    <conditions>
    <add input="{REMOTE_ADDR}" pattern="^100?\." negate="true" />
    </conditions>
    
    
  6. After all the above steps are completed successfully the actual swap is performed by switching the routing rules on the scale unit’s front ends. More details on what happens during the swap can be found in other blog post “How to warm up Azure Web App during deployment slots swap

Some other common problems that cause the swap to fail:

  • An HTTP request to the root URL path times out. The swap process waits for 90 seconds for the request to return. In case of timeout the request will be retried for up to 5 times. If after that the request still times out then the swap operation will be aborted. If that happens then check the eventlog.xml or application specific error log to see if there are any indications of what causes the timeout.
  • An HTTP request to the root URL path is aborted. This may happen if web app has a rewrite rule to abort some requests. If that is the case then the rule can be modified by adding the {REMOTE_ADDR} check as shown in previous examples.
  • Web App has IP restriction rules that prevent the swap process from connecting to it. In that case you’ll need to allow the internal IP address range used by the swap process:

2,867 views

ruslany on November 2nd 2017 in WAWS

PoorFairAverageGoodExcellent (5 votes, average: 4.40 out of 5)

Trackback URI | Comments RSS

Leave a Reply

XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

XML Markup: If You want to add XML code to the comment please XML encode it first, otherwise the code will not show up.

Recently Published Articles