Elastic Compute Service: Known issues of public images

Last Updated: Nov 14, 2024

Public images may have known security vulnerabilities or configuration issues. This topic describes the known issues of public images and their solutions so that you can identify potential security risks and resolve the issues at the earliest opportunity.

Known issues of Windows images

Specific features do not work as expected when a Windows operating system is used on an instance type that has 512 MB of memory

  • Problem description

    When the Windows Server Version 2004 Datacenter 64-bit (Simplified Chinese, Without UI) operating system is used on an Elastic Compute Service (ECS) instance that has 512 MB of memory, issues occur. For example, the password configured during instance creation does not take effect, the instance password fails to be changed, and commands fail to be run.

  • Cause

    Virtual memory cannot be allocated because paging file management is disabled. As a result, exceptions occur when programs are run.

  • Solution

    Due to the small memory size of the problematic instance, you cannot attach Pre-installation Environment (PE) disks to the instance. You also cannot log on to the instance because the password that you configured during instance creation does not take effect. Therefore, you can enable paging file management for the instance only by using Cloud Assistant.

    1. Use Cloud Assistant to run commands on the instance.

    2. Run the following command to enable paging file management:

      Wmic ComputerSystem set AutomaticManagedPagefile=True
      Note
      • The preceding command may fail. If it fails, rerun it until it succeeds.

      • You can also run the Wmic ComputerSystem get AutomaticManagedPagefile command to check whether paging file management is enabled. The following command output indicates that paging file management is enabled:

        AutomaticManagedPagefile
        TRUE
    3. Restart the instance for the changes to take effect.
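
    The retry-and-verify advice in the preceding note can be sketched as a small helper. This is a minimal illustration in Python, not part of Cloud Assistant: run_command is a hypothetical callable that you supply (for example, a wrapper that sends a command through Cloud Assistant and returns its text output), and the default of five attempts is an arbitrary assumption.

```python
from typing import Callable

ENABLE_CMD = "Wmic ComputerSystem set AutomaticManagedPagefile=True"
QUERY_CMD = "Wmic ComputerSystem get AutomaticManagedPagefile"


def pagefile_management_enabled(query_output: str) -> bool:
    """Return True if the query output reports AutomaticManagedPagefile as TRUE."""
    lines = [line.strip() for line in query_output.splitlines() if line.strip()]
    # Expected shape: a header line followed by the value line.
    return (
        len(lines) >= 2
        and lines[0] == "AutomaticManagedPagefile"
        and lines[1].upper() == "TRUE"
    )


def enable_with_retry(run_command: Callable[[str], str], max_attempts: int = 5) -> bool:
    """Rerun the enable command until the query confirms it, up to max_attempts times.

    run_command is a hypothetical hook: it takes a command line and returns its output.
    """
    for _ in range(max_attempts):
        run_command(ENABLE_CMD)
        if pagefile_management_enabled(run_command(QUERY_CMD)):
            return True
    return False
```

    If enable_with_retry returns True, continue with the restart in step 3.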

Windows Server 2016: The operating system does not respond when a software installation package is run

  • Problem description

    When a software installation package is downloaded and run within Windows Server 2016, the operating system does not respond.

  • Cause

    1. For security reasons, Windows applies the Express settings by configuring the ProtectYourPC option during the Sysprep phase when the operating system is first started. As a result, the SmartScreen system process runs after the operating system starts. In most cases, the SmartScreen system process is used to protect the operating system from redirection to malicious websites and from insecure downloads.

    2. When you try to download or run a software installation package from the Internet, the web identifier (the mark that Windows attaches to files downloaded from the Internet) carried by the package triggers the SmartScreen system process. The SmartScreen system process recognizes that the software originates from the Internet and may lack reputation information. As a result, the software is blocked by the SmartScreen system process.

  • Solutions

    Use one of the following solutions to resolve the issue:

    Unblock the software installation package

    1. Select Unblock in the Properties dialog box of the software installation package.

    2. Rerun the software installation package.

    Disable SmartScreen Filter

    1. Go to the C:\Windows\System32 directory.

    2. Double-click the SmartScreenSettings.exe file.

    3. Select Don't do anything (turn off Windows SmartScreen) in the Windows SmartScreen dialog box. Then, click OK.

    4. Rerun the software installation package.

    Modify Group Policy settings

    1. Open the Run dialog box, enter gpedit.msc, and then click OK.

    2. In the Local Group Policy Editor window, choose Computer Configuration > Windows Settings > Security Settings > Local Policies > Security Options.

    3. Find the User Account Control: Admin Approval Mode for the Built-in Administrator account policy, right-click the policy, and then select Properties.

    4. On the Local Security Settings tab, select Enabled and click OK.

    5. Restart the operating system for the configurations to take effect.

    6. Rerun the software installation package.

Windows Server 2022: The KB5034439 patch fails to be installed

  • Problem description

    The KB5034439 patch fails to be installed in the Windows Server 2022 operating system.

  • Cause

    The KB5034439 patch is a Windows Recovery Environment update that Microsoft released in January 2024. By default, images use the Alibaba Cloud internal Windows Server Update Services (WSUS) server as the update repository, and this server does not provide the patch. If you configure Microsoft Windows Update as the update repository and check for updates, the system finds the patch and attempts to install it, but the installation fails. This failure is expected and does not affect normal use of the operating system. For more information, see KB5034439: Windows Recovery Environment update for Windows Server 2022: January 9, 2024.

A patch released by Microsoft in June 2022 causes RRAS issues on servers for which NAT is enabled

  • Problem description: According to an announcement from Microsoft on June 23, 2022, the installation of a security patch released by Microsoft in June 2022 may pose the following risks: A Windows server that is using the Routing and Remote Access Service (RRAS) might lose connection to the Internet, and devices that connect to the server might be unable to connect to the Internet.

  • Affected versions of Windows Server:

    • Windows Server 2022

    • Windows Server 2019

    • Windows Server 2016

    • Windows Server 2012 R2

    • Windows Server 2012

    When you check for system updates for Windows Server 2012 R2 and Windows Server 2012, select the Check for updates option that is marked ①, as shown in the following figure. The ① option is linked to the Alibaba Cloud internal WSUS server, and the ② option is linked to the official Microsoft Windows Update server. Security updates occasionally cause issues. To reduce this risk, Alibaba Cloud checks the Windows security updates from Microsoft and releases only the updates that pass the check to the internal WSUS server.

    [Figure: Check for updates]

  • Solution: The relevant patch has been removed from Alibaba Cloud WSUS. To prevent your Windows Server operating system from being affected by the issue, we recommend that you check whether the patch is installed on the operating system. Run one of the following commands based on the version of your operating system:

    Windows Server 2012 R2: wmic qfe get hotfixid | find "5014738"
    Windows Server 2019: wmic qfe get hotfixid | find "5014692"
    Windows Server 2016: wmic qfe get hotfixid | find "5014702"
    Windows Server 2012: wmic qfe get hotfixid | find "5014747"
    Windows Server 2022: wmic qfe get hotfixid | find "5014678"

    If the command output indicates that the patch is installed and you are experiencing RRAS issues on the Windows Server operating system, we recommend that you uninstall the patch to restore functionality to the Windows server. Run one of the following commands based on the version of your operating system to uninstall the patch:

    Windows Server 2012 R2: wusa /uninstall /kb:5014738
    Windows Server 2019: wusa /uninstall /kb:5014692
    Windows Server 2016: wusa /uninstall /kb:5014702
    Windows Server 2012: wusa /uninstall /kb:5014747
    Windows Server 2022: wusa /uninstall /kb:5014678
    Note

    For further updates and operational guidance on the issue, follow the instructions in the official Microsoft documentation. For more information, see RRAS Servers can lose connectivity if NAT is enabled on the public interface.
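
    When you manage several Windows Server versions, the version-to-patch mapping above can be kept in one place. The following Python sketch only builds the command strings listed in this section. The dictionary and function names are illustrative, not part of any Alibaba Cloud or Microsoft tool.

```python
# KB numbers of the June 2022 patch for each version, as listed in this section.
JUNE_2022_KB = {
    "Windows Server 2012": "5014747",
    "Windows Server 2012 R2": "5014738",
    "Windows Server 2016": "5014702",
    "Windows Server 2019": "5014692",
    "Windows Server 2022": "5014678",
}


def check_command(os_version: str) -> str:
    """Return the command that checks whether the patch is installed."""
    kb = JUNE_2022_KB[os_version]  # raises KeyError for a version not listed above
    return f'wmic qfe get hotfixid | find "{kb}"'


def uninstall_command(os_version: str) -> str:
    """Return the command that uninstalls the patch."""
    kb = JUNE_2022_KB[os_version]
    return f"wusa /uninstall /kb:{kb}"
```

    For example, check_command("Windows Server 2019") returns the same check command that is listed above for Windows Server 2019.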

A patch released in January 2022 causes abnormal behavior on Windows Server domain controllers (DCs)

  • Problem description: According to an announcement from Microsoft on January 13, 2022, the installation of a security patch released by Microsoft in January 2022 may pose the following risks: Virtual machines in Hyper-V cannot start, Windows Server DCs fail to restart or enter a restart loop, and IP security (IPSec) virtual private network (VPN) connections fail.

  • Affected versions of Windows Server:

    • Windows Server 2022

    • Windows Server, version 20H2

    • Windows Server 2019

    • Windows Server 2016

    • Windows Server 2012 R2

    • Windows Server 2012

    When you check for system updates for Windows Server 2012 R2 and Windows Server 2012, select the Check for updates option that is marked ①, as shown in the following figure. The ① option is linked to the Alibaba Cloud internal WSUS server, and the ② option is linked to the official Microsoft Windows Update server. Security updates occasionally cause issues. To reduce this risk, Alibaba Cloud checks the Windows security updates from Microsoft and releases only the updates that pass the check to the internal WSUS server.

    [Figure: Check for updates]

  • Solution: The relevant patch has been removed from Alibaba Cloud WSUS. To prevent your Windows Server operating system from being affected by the issue, we recommend that you check whether the patch is installed on your operating system. Run one of the following commands based on the version of your operating system:

    Windows Server 2012 R2: wmic qfe get hotfixid | find "5009624"
    Windows Server 2019: wmic qfe get hotfixid | find "5009557"
    Windows Server 2016: wmic qfe get hotfixid | find "5009546"
    Windows Server 2012: wmic qfe get hotfixid | find "5009586"
    Windows Server 2022: wmic qfe get hotfixid | find "5009555"

    If the patch is already installed on your operating system and the DCs cannot be used or the virtual machines cannot start, we recommend that you uninstall the patch to restore the operating system. Run one of the following commands based on the version of your operating system to uninstall the patch:

    Windows Server 2012 R2: wusa /uninstall /kb:5009624
    Windows Server 2019: wusa /uninstall /kb:5009557
    Windows Server 2016: wusa /uninstall /kb:5009546
    Windows Server 2012: wusa /uninstall /kb:5009586
    Windows Server 2022: wusa /uninstall /kb:5009555
    Note

    For further updates and operational guidance on the issue, follow the instructions in the official Microsoft documentation.
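
    Instead of filtering with find, you can capture the full output of wmic qfe get hotfixid and test it in a script. The following Python sketch assumes that the output lists one identifier per line in the form KB followed by digits, after a HotFixID header. The function names are illustrative.

```python
import re
from typing import Set

# KB numbers of the January 2022 patch for each version, as listed in this section.
JAN_2022_KB = {
    "Windows Server 2012": "5009586",
    "Windows Server 2012 R2": "5009624",
    "Windows Server 2016": "5009546",
    "Windows Server 2019": "5009557",
    "Windows Server 2022": "5009555",
}


def installed_kb_numbers(qfe_output: str) -> Set[str]:
    """Extract the numeric part of every KB identifier in the command output."""
    return set(re.findall(r"\bKB(\d+)\b", qfe_output, flags=re.IGNORECASE))


def january_patch_installed(os_version: str, qfe_output: str) -> bool:
    """Return True if the patch for the given version appears in the output."""
    return JAN_2022_KB[os_version] in installed_kb_numbers(qfe_output)
```

    If january_patch_installed returns True and the symptoms described above occur, uninstall the patch by running the wusa command listed for your version.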

.NET Framework 3.5 fails to be installed in Windows Server 2012 R2

  • Problem description: If the Windows Server 2012 R2 operating system uses the images that are mentioned in this section, you cannot install .NET Framework 3.5 in the operating system, because one of the following patches is installed in the images: the KB5027141 patch released in June 2023, KB5028872 patch released in July 2023, KB5028970 patch released in August 2023, or KB5029915 patch released in September 2023.

    Important

    If you still want to use the Windows Server 2012 R2 operating system, we recommend that you create instances in the ECS console by using one of the following Windows Server 2012 R2 community images that have .NET Framework 3.5 installed: win2012r2_9600_x64_dtc_zh-cn_40G_.Net3.5_alibase_20231204.vhd and win2012r2_9600_x64_dtc_en-us_40G_.Net3.5_alibase_20231204.vhd. For information about how to search for an image that you want to use, see Find an image.

    Windows Server 2012 R2 images in which the preceding patches are installed

    • Images in which the KB5027141 patch released in June 2023 is installed

      • win2012r2_9600_x64_dtc_en-us_40G_alibase_20230615.vhd

      • win2012r2_9600_x64_dtc_zh-cn_40G_alibase_20230615.vhd

    • Images in which the KB5028872 patch released in July 2023 is installed

      • win2012r2_9600_x64_dtc_en-us_40G_alibase_20230718.vhd

      • win2012r2_9600_x64_dtc_zh-cn_40G_alibase_20230718.vhd

    • Images in which the KB5028970 patch released in August 2023 is installed

      • win2012r2_9600_x64_dtc_en-us_40G_alibase_20230811.vhd

      • win2012r2_9600_x64_dtc_zh-cn_40G_alibase_20230811.vhd

    • Images in which the KB5029915 patch released in September 2023 is installed

      • win2012r2_9600_x64_dtc_en-us_40G_alibase_20230915.vhd

      • win2012r2_9600_x64_dtc_zh-cn_40G_alibase_20230915.vhd

  • Solution:

    1. In Control Panel on the ECS instance, find the installed KB5027141, KB5028872, KB5028970, or KB5029915 patch, right-click the patch, and then select Uninstall.

    2. Restart the ECS instance.

      For more information, see Restart an instance.

    3. Install .NET Framework 3.5 by using one of the following methods.

      Installation by using Server Manager

      1. In the Server Manager window, click Add roles and features.

      2. Keep the default settings in the wizard until you reach the Features page, and then select .NET Framework 3.5 Features.

        Follow the wizard to confirm the settings and wait until the installation is complete.

      Installation by running PowerShell commands

      Run one of the following commands:

      • Dism /Online /Enable-Feature /FeatureName:NetFX3 /All

      • Install-WindowsFeature -Name NET-Framework-Features
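
    To find out whether an instance was created from one of the affected images, you can match the image name against the list in the problem description. The following Python sketch hard-codes the four build dates from that list. The pattern and function name are illustrative assumptions.

```python
import re
from typing import Optional

# Build date in the image name -> patch contained in that image (from the list above).
PATCH_BY_BUILD_DATE = {
    "20230615": "KB5027141",
    "20230718": "KB5028872",
    "20230811": "KB5028970",
    "20230915": "KB5029915",
}

_IMAGE_RE = re.compile(
    r"^win2012r2_9600_x64_dtc_(?:en-us|zh-cn)_40G_alibase_(\d{8})\.vhd$"
)


def blocking_patch(image_name: str) -> Optional[str]:
    """Return the patch that blocks .NET Framework 3.5 installation, or None."""
    match = _IMAGE_RE.match(image_name)
    if match is None:
        return None
    return PATCH_BY_BUILD_DATE.get(match.group(1))
```

    The function returns None for images that are not in the list, including the community images that have .NET Framework 3.5 installed.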

Known issues of Linux images

CentOS

CentOS 8.0: The image version numbers of created instances change after the public image is updated

  • Problem description: After you connect to an instance created from the centos_8_0_x64_20G_alibase_20200218.vhd public image, you find that the operating system version of the instance is CentOS 8.1.

    testuser@ecshost:~$ lsb_release -a
    LSB Version:    :core-4.1-amd64:core-4.1-noarch
    Distributor ID:    CentOS
    Description:    CentOS Linux release 8.1.1911 (Core)
    Release:    8.1.1911
    Codename:    Core
  • Cause: The centos_8_0_x64_20G_alibase_20200218.vhd image is a public image that was updated by using the latest community update package. The version of CentOS in the image is upgraded to 8.1. Therefore, the actual operating system version is CentOS 8.1.

  • Affected image: centos_8_0_x64_20G_alibase_20200218.vhd.

  • Solution: You can call an API operation, such as the RunInstances operation, and set the ImageId parameter to centos_8_0_x64_20G_alibase_20191225.vhd to create an instance whose operating system version is CentOS 8.0.
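
    If your automation depends on a specific CentOS minor version, you can verify the version after the instance starts instead of inferring it from the image ID. The following Python sketch parses lsb_release -a output such as the sample above. The function names are illustrative.

```python
def parse_lsb_release(output: str) -> dict:
    """Parse `lsb_release -a` output into a dict of field name -> value."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, because values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def release_matches(output: str, expected: str) -> bool:
    """Return True if the Release field equals `expected` or starts with `expected.`"""
    release = parse_lsb_release(output).get("Release", "")
    return release == expected or release.startswith(expected + ".")
```

    With the sample output above, release_matches(output, "8.0") is False and release_matches(output, "8.1") is True, which reveals the mismatch described in this section.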

CentOS 7: An issue may be caused by updates of specific image IDs

  • Problem description: The IDs of specific CentOS 7 public images were updated, which may affect the policies for obtaining image IDs during automated O&M.

  • Affected images: CentOS 7.5 and CentOS 7.6.

  • Cause: The image IDs used by the latest versions of CentOS 7.5 and CentOS 7.6 public images are in the following format: %<OS type>%_%<Major version number>%_%<Minor version number>%_%<Special field>%_alibase_%<Date>%.%<Format>%. For example, the image ID prefix of CentOS 7.5 public images is updated from centos_7_05_64 to centos_7_5_x64. If your automated O&M policies depend on the previous image ID format, you must modify them. For information about image IDs, see Release notes for 2023.
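
    One way to make automated O&M policies robust against this kind of prefix change is to parse the version fields instead of comparing string prefixes. The following Python sketch assumes full image IDs that contain _alibase_ and follow the format above. The class and function names are illustrative.

```python
import re
from typing import NamedTuple


class ImageVersion(NamedTuple):
    os_type: str
    major: int
    minor: int


_IMAGE_ID_RE = re.compile(
    r"^(?P<os>[a-z]+)_(?P<major>\d+)_(?P<minor>\d+)_(?P<special>.+)"
    r"_alibase_(?P<date>\d+)\.(?P<fmt>\w+)$"
)


def image_version(image_id: str) -> ImageVersion:
    """Extract (OS type, major, minor) from an image ID in either prefix style."""
    match = _IMAGE_ID_RE.match(image_id)
    if match is None:
        raise ValueError(f"unrecognized image ID: {image_id}")
    # int() drops zero padding, so "05" and "5" compare as equal.
    return ImageVersion(match["os"], int(match["major"]), int(match["minor"]))
```

    Because the minor version is compared as an integer, an ID that uses the zero-padded style and an ID that uses the new style resolve to the same version tuple.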

CentOS 7: The hostname changes from uppercase letters to lowercase letters after an instance is restarted

  • Problem description: The first time some instances that run CentOS 7 are restarted, the hostnames of these instances change from uppercase letters to lowercase letters. The following table describes some examples.

    | Hostname | Hostname after the instance is restarted for the first time | The hostname remains in lowercase after the instance restarts |
    | --- | --- | --- |
    | iZm5e1qe*****sxx1ps5zX | izm5e1qe*****sxx1ps5zx | Yes |
    | ZZHost | zzhost | Yes |
    | NetworkNode | networknode | Yes |

  • Affected images: The following CentOS public images and the custom images derived from these public images are affected:

    • centos_7_2_64_40G_base_20170222.vhd

    • centos_7_3_64_40G_base_20170322.vhd

    • centos_7_03_64_40G_alibase_20170503.vhd

    • centos_7_03_64_40G_alibase_20170523.vhd

    • centos_7_03_64_40G_alibase_20170625.vhd

    • centos_7_03_64_40G_alibase_20170710.vhd

    • centos_7_02_64_20G_alibase_20170818.vhd

    • centos_7_03_64_20G_alibase_20170818.vhd

    • centos_7_04_64_20G_alibase_201701015.vhd

  • Affected hostnames: If applications deployed on the instances are sensitive to the letter case of hostnames, services may be affected when you restart these instances. The following table describes whether the hostname changes after an instance is restarted.

    | Current state of hostname | The hostname changes after an instance is restarted | Time when the hostname changes | Continue to read this section |
    | --- | --- | --- | --- |
    | The hostname contains uppercase letters when you create the instance in the ECS console or by calling ECS API operations. | Yes | The first time the instance restarts. | Yes |
    | The hostname contains only lowercase letters when you create the instance in the ECS console or by calling ECS API operations. | No | N/A | No |
    | The hostname contains uppercase letters, and you modify the hostname after you log on to the instance. | No | N/A | Yes |

  • Solution: To retain uppercase letters in the hostname of an instance after you restart the instance, perform the following operations:

    1. Connect to an instance.

      For more information, see Methods for connecting to an ECS instance.

    2. View the existing hostname.

      [testuser@izbp193*****3i161uynzzx ~]# hostname
      izbp193*****3i161uynzzx
    3. Run the following command to set the static hostname to the original name that contains uppercase letters:

      hostnamectl set-hostname --static iZbp193*****3i161uynzzX
    4. Run the following command to view the updated hostname:

      [testuser@izbp193*****3i161uynzzx ~]# hostname
      iZbp193*****3i161uynzzX
  • What to do next: If you use an affected custom image, we recommend that you update cloud-init to the latest version and then create another custom image. To prevent this issue, you can use the new custom image to create instances. For more information, see Install cloud-init and Create a custom image from an instance.
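
    The check in the preceding steps can be sketched in Python for use in a fleet-wide script. The sketch only compares strings and builds the hostnamectl command shown above. How you collect the current hostname from each instance is left to your tooling, and the function names are illustrative.

```python
import shlex


def hostname_was_lowercased(intended: str, current: str) -> bool:
    """Return True if `current` is the all-lowercase form of a mixed-case `intended`."""
    return intended != current and intended.lower() == current


def restore_command(intended: str) -> str:
    """Build the command that sets the static hostname back to the intended name."""
    return "hostnamectl set-hostname --static " + shlex.quote(intended)
```

    For example, hostname_was_lowercased("ZZHost", "zzhost") is True, and restore_command("ZZHost") produces the command to run on that instance.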

CentOS 6.8: An instance on which the NFS client is installed does not respond

  • Problem description: A CentOS 6.8 instance on which the NFS client is installed does not respond and must be restarted.

  • Cause: When you use the NFS service on instances whose operating system kernel versions range from 2.6.32-696 to 2.6.32-696.10, the NFS client attempts to end a TCP connection if a glitch occurs due to communication latency. If the NFS server is slow in responding to NFS requests, the connection initiated by the NFS client may remain in the FIN_WAIT2 state for an extended period of time. In most cases, the connection times out and is closed 1 minute after the connection enters the FIN_WAIT2 state. Then, the NFS client can initiate a new connection. However, kernel versions 2.6.32-696 to 2.6.32-696.10 have issues with establishing TCP connections. As a result, the connection remains in the FIN_WAIT2 state, the NFS client is unable to recover the TCP connection, and a new TCP connection cannot be initiated. This causes the requests to freeze, and the only way to fix the issue is to restart the instance.

  • Affected images: centos_6_08_32_40G_alibase_20170710.vhd and centos_6_08_64_20G_alibase_20170824.vhd.

  • Solution: Run the yum update command to update the kernel to 2.6.32-696.11 or later.

    Important

    Before you perform operations on the instance, you must create a snapshot to back up your data. For more information, see Create a snapshot.
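
    To decide whether an instance needs the kernel update, you can test the output of uname -r against the affected range. The following Python sketch treats every 2.6.32-696 build whose sub-release number is 10 or lower as affected, which is an interpretation of the range stated above. The release strings in the usage note are illustrative.

```python
import re

# Matches 2.6.32-696 optionally followed by ".<sub-release>", but not 2.6.32-6961 and so on.
_AFFECTED_RE = re.compile(r"^2\.6\.32-696(?:\.(\d+))?(?!\d)")


def nfs_hang_kernel(release: str) -> bool:
    """Return True if the kernel release falls in 2.6.32-696 .. 2.6.32-696.10."""
    match = _AFFECTED_RE.match(release)
    if match is None:
        return False
    sub_release = int(match.group(1) or 0)  # a missing sub-release counts as 0
    return sub_release <= 10
```

    For example, nfs_hang_kernel("2.6.32-696.el6.x86_64") is True, whereas a release string that starts with 2.6.32-696.11 or with a later build number is reported as not affected.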

Debian

Debian 9.6: Instances in the classic network have network configuration issues

  • Problem description: Instances in the classic network that were created from Debian 9 public images cannot be pinged.

  • Cause: By default, the systemd-networkd service is disabled in Debian 9. Instances in the classic network that were created from Debian 9 public images cannot be automatically assigned IP addresses by using the Dynamic Host Configuration Protocol (DHCP).

  • Affected image: debian_9_06_64_20G_alibase_20181212.vhd.

  • Solution: Run the following commands in sequence:

    systemctl enable systemd-networkd 
    systemctl start systemd-networkd

Fedora CoreOS

The hostnames of instances created from Fedora CoreOS custom images do not take effect

  • Problem description: After you use a Fedora CoreOS image to create Instance A, you create a Fedora CoreOS custom image from Instance A and use the custom image to create Instance B. The hostname of Instance B remains the same as that of Instance A and the hostname specified for Instance B does not take effect.

    For example, you create a Fedora CoreOS custom image from Instance A that runs a Fedora CoreOS operating system and set the hostname of Instance A to test001. Then, you create Instance B from the custom image and set the hostname of Instance B to test002. After Instance B is created and connected, the hostname of Instance B remains test001.

  • Cause: Fedora CoreOS public images provided by Alibaba Cloud use Ignition offered by Fedora CoreOS to initialize instance configurations. Ignition is a utility used by Fedora CoreOS and RHEL CoreOS to manage disks in the initramfs during startup. The first time a Fedora CoreOS instance starts, coreos-ignition-firstboot-complete.service in Ignition checks whether the /boot/ignition.firstboot file exists and determines whether to initialize instance configurations. If the /boot/ignition.firstboot file exists, the system initializes instance configurations (including the hostname configuration) and deletes the /boot/ignition.firstboot file.

    The Fedora CoreOS instance must have been started at least once before it is used to create a Fedora CoreOS custom image. The first time the instance starts, the system deletes the /boot/ignition.firstboot file from the image of the instance. Hence, the Fedora CoreOS custom image created from the instance does not contain the /boot/ignition.firstboot file. The first time instances created from the Fedora CoreOS custom image start, the system does not initialize the instance configurations. In this case, the hostnames of the instances remain unchanged.

  • Solution:

    Note

    To ensure the security of data stored on the instance, we recommend that you create snapshots for the instance. If data exceptions occur on the instance, you can use snapshots to roll back the disks of the instance to the normal status. For more information, see Create a snapshot.

    Before you use the Fedora CoreOS instance to create custom images, use root (administrator) permissions to create the ignition.firstboot file in the /boot directory. Perform the following operations:

    1. Run the following command to re-mount /boot in read/write mode:

      sudo mount /boot -o rw,remount
    2. Run the following command to create the /boot/ignition.firstboot file:

      sudo touch /boot/ignition.firstboot
    3. Run the following command to re-mount /boot in read-only mode:

      sudo mount /boot -o ro,remount

    For information about how to configure Ignition, see Change /boot/ignition/config.ign permissions to 0600 and delete it after provisioning.
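
    Before you create the custom image, you may want to confirm that the flag file is in place. The following Python sketch is a minimal pre-imaging check that you run on the instance. The function name and the boot_dir parameter are illustrative, and the parameter exists only so that the check can be tried against a test directory.

```python
from pathlib import Path


def firstboot_flag_present(boot_dir: str = "/boot") -> bool:
    """Return True if the ignition.firstboot flag file exists in boot_dir."""
    return (Path(boot_dir) / "ignition.firstboot").is_file()
```

    If the function returns False, instances created from the custom image skip initialization and keep the hostname of the source instance, as described in the cause above.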

openSUSE

openSUSE 15: Kernel updates may cause the system to freeze during startup

  • Problem description: When the openSUSE kernel is updated to 4.12.14-lp151.28.52-default, instances that have specific CPU types may freeze during startup. The known affected CPU type is Intel® Xeon® CPU E5-2682 v4 @ 2.50GHz. The following call trace is logged:

    [    0.901281] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [    0.901281] CR2: ffffc90000d68000 CR3: 000000000200a001 CR4: 00000000003606e0
    [    0.901281] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [    0.901281] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [    0.901281] Call Trace:
    [    0.901281]  cpuidle_enter_state+0x6f/0x2e0
    [    0.901281]  do_idle+0x183/0x1e0
    [    0.901281]  cpu_startup_entry+0x5d/0x60
    [    0.901281]  start_secondary+0x1b0/0x200
    [    0.901281]  secondary_startup_64+0xa5/0xb0
    [    0.901281] Code: 6c 01 00 0f ae 38 0f ae f0 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 90 31 d2 65 48 8b 34 25 40 6c 01 00 48 89 d1 48 89 f0 <0f> 01 c8 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 ** **
  • Cause: The new kernel version is incompatible with the CPU microcode. For more information, see Issues of freezing during startup.

  • Affected image: opensuse_15_1_x64_20G_alibase_20200520.vhd.

  • Solution: In the /boot/grub2/grub.cfg file, append the idle=nomwait kernel parameter to the line that starts with linux. The following example shows the modified file:

    menuentry 'openSUSE Leap 15.1'  --class opensuse --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-20f5f35a-fbab-4c9c-8532-bb6c66ce****' {
            load_video
            set gfxpayload=keep
            insmod gzio
            insmod part_msdos
            insmod ext2
            set root='hd0,msdos1'
            if [ x$feature_platform_search_hint = xy ]; then
              search --no-floppy --fs-uuid --set=root --hint='hd0,msdos1'  20f5f35a-fbab-4c9c-8532-bb6c66ce****
            else
              search --no-floppy --fs-uuid --set=root 20f5f35a-fbab-4c9c-8532-bb6c66ce****
            fi
            echo    'Loading Linux 4.12.14-lp151.28.52-default ...'
            linux   /boot/vmlinuz-4.12.14-lp151.28.52-default root=UUID=20f5f35a-fbab-4c9c-8532-bb6c66ce****  net.ifnames=0 console=tty0 console=ttyS0,115200n8 splash=silent mitigations=auto quiet idle=nomwait
            echo    'Loading initial ramdisk ...'
            initrd  /boot/initrd-4.12.14-lp151.28.52-default
    }
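
    The edit shown above can also be scripted. The following Python sketch works on the text of the file and appends the parameter to every line whose first word is linux, unless the parameter is already present. It does not read or write /boot/grub2/grub.cfg itself, so back up the file before you write the result. The function name is illustrative.

```python
def add_kernel_param(grub_cfg: str, param: str = "idle=nomwait") -> str:
    """Append `param` to each line whose first word is `linux`, if not already there."""
    result = []
    for line in grub_cfg.splitlines():
        words = line.split()
        if words and words[0] == "linux" and param not in words:
            line = line.rstrip() + " " + param
        result.append(line)
    # Preserve a trailing newline if the input had one.
    return "\n".join(result) + ("\n" if grub_cfg.endswith("\n") else "")
```

    Applying the function twice gives the same result as applying it once, so it is safe to rerun after a kernel update adds new menu entries.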

Red Hat Enterprise Linux

Red Hat Enterprise Linux 8 64-bit: The kernel version cannot be updated by running the yum update command

  • Problem description: After you run the yum update command on an ECS instance that runs a RHEL 8 64-bit operating system to update its kernel version, the kernel version of the instance operating system remains unchanged even after the instance is restarted.

  • Cause: In the RHEL 8 64-bit operating system, the size of the /boot/grub2/grubenv file that stores GRUB2 environment variables is not 1,024 bytes. As a result, the kernel version cannot be updated.

  • Solution: After you update the kernel version, set the new kernel version to the default startup version. Perform the following operations:

    1. Run the following command to update the kernel version:

      yum update kernel -y
    2. Run the following command to obtain the kernel startup parameter of the operating system:

      grub2-editenv list | grep kernelopts
    3. Run the following command to back up the old /boot/grub2/grubenv file:

      mv /boot/grub2/grubenv /home/grubenv.bak
    4. Run the following command to create a new /boot/grub2/grubenv file:

      grub2-editenv /boot/grub2/grubenv create
    5. Run the following command to set the new kernel version to the default startup version.

      In this example, the new kernel version is /boot/vmlinuz-4.18.0-305.19.1.el8_4.x86_64.

      grubby --set-default /boot/vmlinuz-4.18.0-305.19.1.el8_4.x86_64
    6. Run the following command to set the kernel startup parameter.

      In this example, the - set kernelopts command sets kernelopts to the value of the kernel startup parameter obtained in Step 2.

      grub2-editenv - set kernelopts="root=UUID=0dd6268d-9bde-40e1-b010-0d3574b4**** ro crashkernel=auto net.ifnames=0 vga=792 console=tty0 console=ttyS0,115200n8 noibrs nosmt"
    7. Run the following command to restart the instance for the new kernel version to take effect:

      reboot
      Warning

      The restart operation stops the instance for a short period of time and may interrupt services that are running on the instance. We recommend that you restart the instance during off-peak hours.
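
    Two parts of this procedure lend themselves to scripting: detecting the incorrect file size, and carrying the kernelopts value from Step 2 to Step 6. The following Python sketch assumes that grub2-editenv list prints one name=value pair per line. The function names are illustrative.

```python
import os
from typing import Optional

GRUBENV_SIZE = 1024  # size in bytes that the grubenv file is expected to have


def grubenv_size_ok(path: str = "/boot/grub2/grubenv") -> bool:
    """Return True if the grubenv file has the expected size."""
    return os.path.getsize(path) == GRUBENV_SIZE


def extract_kernelopts(editenv_list_output: str) -> Optional[str]:
    """Return the kernelopts value from `grub2-editenv list` output, or None."""
    prefix = "kernelopts="
    for line in editenv_list_output.splitlines():
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


def set_kernelopts_command(kernelopts: str) -> str:
    """Build the command from Step 6 for the given kernelopts value."""
    return f'grub2-editenv - set kernelopts="{kernelopts}"'
```

    Save the output of Step 2 before you move the old file in Step 3, because the value is needed again in Step 6.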

SUSE Linux Enterprise Server

SUSE Linux Enterprise Server: The SMT server cannot be connected

  • Problem description: When you use a paid Alibaba Cloud image for SUSE Linux Enterprise Server or SUSE Linux Enterprise Server for SAP, connections to the Subscription Management Tool (SMT) server may fail, for example, due to a connection timeout. When you download or update a component through the SMT server, error messages similar to the following ones are returned:

    • Registration server returned 'This server could not verify that you are authorized to access this service.' (500)

    • Problem retrieving the repository index file for service 'SMT-http_mirrors_cloud_aliyuncs_com' location ****

  • Affected images: SUSE Linux Enterprise Server and SUSE Linux Enterprise Server for SAP.

  • Solution: Register and activate SMT again.

    1. Run the following commands in sequence to register and activate SMT:

      SUSEConnect -d
      SUSEConnect --cleanup
      systemctl restart guestregister
    2. Run the following command to verify whether SMT is activated:

      SUSEConnect -s

      If SMT is activated, a command output similar to the following one is returned:

      [{"identifier":"SLES_SAP","version":"12.5","arch":"x86_64","status":"Registered"}]
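
    Because the status is returned as JSON, the verification in step 2 can be automated. The following Python sketch assumes that the output is a JSON array of objects with identifier and status fields, as in the sample above. The function name is illustrative.

```python
import json
from typing import List


def unregistered_products(status_output: str) -> List[str]:
    """Return the identifiers of products whose status is not "Registered"."""
    products = json.loads(status_output)
    return [
        product.get("identifier", "?")
        for product in products
        if product.get("status") != "Registered"
    ]
```

    An empty list means that every listed product is registered. Any other result indicates that the registration commands in step 1 need to be rerun.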

SLES 12 SP5: Kernel updates may cause the system to freeze during startup

  • Problem description: When an earlier SLES version is upgraded to SLES 12 SP5 or when the kernel of SLES 12 SP5 is updated, instances that have specific CPU types may freeze during startup. The known affected CPU types are Intel® Xeon® CPU E5-2682 v4 @ 2.50GHz and Intel® Xeon® CPU E7-8880 v4 @ 2.20GHz. The following call trace is logged:

    [    0.901281] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [    0.901281] CR2: ffffc90000d68000 CR3: 000000000200a001 CR4: 00000000003606e0
    [    0.901281] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [    0.901281] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [    0.901281] Call Trace:
    [    0.901281]  cpuidle_enter_state+0x6f/0x2e0
    [    0.901281]  do_idle+0x183/0x1e0
    [    0.901281]  cpu_startup_entry+0x5d/0x60
    [    0.901281]  start_secondary+0x1b0/0x200
    [    0.901281]  secondary_startup_64+0xa5/0xb0
    [    0.901281] Code: 6c 01 00 0f ae 38 0f ae f0 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 90 31 d2 65 48 8b 34 25 40 6c 01 00 48 89 d1 48 89 f0 <0f> 01 c8 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 ** **
  • Cause: The new kernel version is incompatible with the CPU microcode.

  • Solution: In the /boot/grub2/grub.cfg file, append the idle=nomwait kernel parameter to the line that starts with linux. The following example shows the modified file:

    menuentry 'SLES 12-SP5'  --class sles --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-fd7bda55-42d3-4fe9-a2b0-45efdced****' {
            load_video
            set gfxpayload=keep
            insmod gzio
            insmod part_msdos
            insmod ext2
            set root='hd0,msdos1'
            if [ x$feature_platform_search_hint = xy ]; then
              search --no-floppy --fs-uuid --set=root --hint='hd0,msdos1'  fd7bda55-42d3-4fe9-a2b0-45efdced****
            else
              search --no-floppy --fs-uuid --set=root fd7bda55-42d3-4fe9-a2b0-45efdced****
            fi
            echo    'Loading Linux 4.12.14-122.26-default ...'
            linux   /boot/vmlinuz-4.12.14-122.26-default root=UUID=fd7bda55-42d3-4fe9-a2b0-45efdced****  net.ifnames=0 console=tty0 console=ttyS0,115200n8 mitigations=auto splash=silent quiet showopts idle=nomwait
            echo    'Loading initial ramdisk ...'
            initrd  /boot/initrd-4.12.14-122.26-default
    }
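
The manual edit above can also be scripted. The following is a minimal sketch, not an official tool: the function name add_idle_nomwait and the .bak backup suffix are assumptions made for illustration, and the sed expression assumes GNU sed. It appends idle=nomwait to each line that starts with linux and does not already set an idle= parameter.

```shell
#!/bin/bash
# Hypothetical helper (not part of SLES or Alibaba Cloud tooling).
# Appends idle=nomwait to every "linux" line in the given GRUB
# configuration file unless the line already sets an idle= parameter.
add_idle_nomwait() {
    local cfg=$1

    # Keep a copy so that the original file can be restored.
    cp -p "$cfg" "$cfg.bak" || return 1

    # Assumes GNU sed: on lines that start with "linux", skip those that
    # already contain idle=, otherwise append the parameter at the end.
    sed -i '/^[[:space:]]*linux[[:space:]]/{/[[:space:]]idle=/!s/[[:space:]]*$/ idle=nomwait/;}' "$cfg"
}

# Example (run as root): add_idle_nomwait /boot/grub2/grub.cfg
```

Running the function a second time leaves the file unchanged, because lines that already contain idle= are skipped. Note that grub.cfg is usually a generated file, so the parameter may need to be added again if the file is regenerated.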

Other issues

A call trace may occur when instances of specific instance types that run operating systems with more recent kernel versions are started

  • Problem description: If an instance of a specific instance type such as ecs.i2.4xlarge runs an operating system with a more recent kernel version, such as Red Hat Enterprise Linux (RHEL) 8.3 or CentOS 8.3 with the 4.18.0-240.1.1.el8_3.x86_64 kernel version, a call trace may occur when the instance is started. Call trace example:

    Dec 28 17:43:45 localhost SELinux:  Initializing.
    Dec 28 17:43:45 localhost kernel: Dentry cache hash table entries: 8388608 (order: 14, 67108864 bytes)
    Dec 28 17:43:45 localhost kernel: Inode-cache hash table entries: 4194304 (order: 13, 33554432 bytes)
    Dec 28 17:43:45 localhost kernel: Mount-cache hash table entries: 131072 (order: 8, 1048576 bytes)
    Dec 28 17:43:45 localhost kernel: Mountpoint-cache hash table entries: 131072 (order: 8, 1048576 bytes)
    Dec 28 17:43:45 localhost kernel: unchecked MSR access error: WRMSR to 0x3a (tried to write 0x000000000000****) at rIP: 0xffffffff8f26**** (native_write_msr+0x4/0x20)
    Dec 28 17:43:45 localhost kernel: Call Trace:
    Dec 28 17:43:45 localhost kernel:  init_ia32_feat_ctl+0x73/0x28b
    Dec 28 17:43:45 localhost kernel:  init_intel+0xdf/0x400
    Dec 28 17:43:45 localhost kernel:  identify_cpu+0x1f1/0x510
    Dec 28 17:43:45 localhost kernel:  identify_boot_cpu+0xc/0x77
    Dec 28 17:43:45 localhost kernel:  check_bugs+0x28/0xa9a
    Dec 28 17:43:45 localhost kernel:  ?  __slab_alloc+0x29/0x30
    Dec 28 17:43:45 localhost kernel:  ?  kmem_cache_alloc+0x1aa/0x1b0
    Dec 28 17:43:45 localhost kernel:  start_kernel+0x4fa/0x53e
    Dec 28 17:43:45 localhost kernel:  secondary_startup_64+0xb7/0xc0
    Dec 28 17:43:45 localhost kernel: Last level iTLB entries: 4KB 64, 2MB 8, 4MB 8
    Dec 28 17:43:45 localhost kernel: Last level dTLB entries: 4KB 64, 2MB 0, 4MB 0, 1GB 4
    Dec 28 17:43:45 localhost kernel: FEATURE SPEC_CTRL Present
    Dec 28 17:43:45 localhost kernel: FEATURE IBPB_SUPPORT Present
  • Cause: The kernel version is updated by using the latest community update package to include the patches for writes to Model-Specific Registers (MSRs). However, some instance types such as ecs.i2.4xlarge do not support writes to MSRs due to the limits imposed by virtualization.

  • Solution: The call trace does not affect system operation or stability. You can ignore this issue.
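
Before ignoring a call trace, it can help to confirm that it is the one described above. The following is a minimal sketch under stated assumptions: the function name matches_known_msr_trace is hypothetical, and the two search strings are taken from the example output above rather than from any official signature.

```shell
#!/bin/bash
# Hypothetical helper: reads kernel log text on stdin and succeeds only if it
# contains the WRMSR-to-0x3a error together with the init_ia32_feat_ctl frame
# shown in the example above.
matches_known_msr_trace() {
    local log
    log=$(cat)
    grep -q 'unchecked MSR access error: WRMSR to 0x3a' <<< "$log" &&
        grep -q 'init_ia32_feat_ctl' <<< "$log"
}

# Example: dmesg | matches_known_msr_trace && echo "known call trace found"
```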

Compatibility issues between specific Linux kernel versions and the hfg6 general-purpose instance family with high clock speeds may cause kernel panic

  • Problem description: When the kernels of some open source Linux distributions such as CentOS 8, SUSE Linux Enterprise Server (SLES) 15 SP2, and openSUSE 15.2 are updated to the latest versions on hfg6 instances, a kernel panic error may occur. The following figure shows an example of the call trace. (Figure: kernel panic)

  • Cause: Some Linux kernel versions are incompatible with the hfg6 general-purpose instance family with high clock speeds.

  • Solution:

    • The compatibility issue is fixed in the latest kernel versions of SLES 15 SP2 and openSUSE 15.2. The following output lists the commits that contain the fix. If your kernel version contains these commits, the kernel version is compatible with the hfg6 instance family.

      commit 1e33d5975b49472e286bd7002ad0f689af33fab8
      Author: Giovanni Gherdovich <ggherdovich@suse.cz>
      Date:   Thu Sep 24 16:51:09 2020 +0200
      
          x86, sched: Bail out of frequency invariance if
          turbo_freq/base_freq gives 0 (bsc#1176925).
      
          suse-commit: a66109f44265ff3f3278fb34646152bc2b3224a5
          
          
      commit dafb858aa4c0e6b0ce6a7ebec5e206f4b3cfc11c
      Author: Giovanni Gherdovich <ggherdovich@suse.cz>
      Date:   Thu Sep 24 16:16:50 2020 +0200
      
          x86, sched: Bail out of frequency invariance if turbo frequency
          is unknown (bsc#1176925).
      
          suse-commit: 53cd83ab2b10e7a524cb5a287cd61f38ce06aab7
      
      commit 22d60a7b159c7851c33c45ada126be8139d68b87
      Author: Giovanni Gherdovich <ggherdovich@suse.cz>
      Date:   Thu Sep 24 16:10:30 2020 +0200
      
          x86, sched: check for counters overflow in frequency invariant
          accounting (bsc#1176925).
    • If you run the yum update command to update the kernel of CentOS 8 to kernel-4.18.0-240 or later in hfg6 instances, a kernel panic error may occur. If this error occurs, roll the kernel back to the previous version.
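
One way to look for the fix on SLES 15 SP2 or openSUSE 15.2 is to search the kernel package changelog for the bug reference that appears in the commits above. The following is a minimal sketch: the function name has_freq_invariance_fix is hypothetical, the package name kernel-default is an assumption, and it is also an assumption that the package changelog records the bsc#1176925 reference.

```shell
#!/bin/bash
# Hypothetical helper: reads changelog text on stdin and succeeds if it
# mentions the bug reference carried by the three commits above.
has_freq_invariance_fix() {
    grep -q -F 'bsc#1176925'
}

# Example, assuming the kernel package is named kernel-default:
#   rpm -q --changelog kernel-default | has_freq_invariance_fix && echo "fix present"
```

If the reference is not found, that does not prove that the fix is missing. It only means that this particular check could not confirm it.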

Pip requests time out

  • Problem description: Pip requests occasionally time out or fail.

  • Affected images: CentOS, Debian, Ubuntu, SUSE, openSUSE, and Alibaba Cloud Linux.

  • Cause: Alibaba Cloud provides three pip repository addresses. The default address is mirrors.aliyun.com. To access this address, instances must be able to access the Internet. If your instance is not assigned a public IP address, pip requests time out.

    • Default public repository address: mirrors.aliyun.com

    • Internal repository address in virtual private clouds (VPCs): mirrors.cloud.aliyuncs.com

    • Internal repository address in the classic network: mirrors.aliyuncs.com

  • Solution: Use one of the following methods to resolve the issue:

    • Method 1

      Associate an elastic IP address (EIP) with the instance. For more information, see Associate an EIP with an ECS instance.

      You can also re-assign a public IP address to a subscription instance when you change the instance configurations. For more information, see Upgrade the instance types of subscription instances.

    • Method 2

      If a pip request fails, you can run the fix_pypi.sh script on your instance and then retry the pip operation. Perform the following steps:

      1. Connect to an instance.

        For more information, see Connect to an instance by using VNC.

      2. Run the following command to obtain the script file:

        wget http://image-offline.oss-cn-hangzhou.aliyuncs.com/fix/fix_pypi.sh
      3. Run one of the following commands based on the network type of the instance:

        • If your instance resides in a VPC, run the bash fix_pypi.sh "mirrors.cloud.aliyuncs.com" command.

        • If your instance resides in the classic network, run the bash fix_pypi.sh "mirrors.aliyuncs.com" command.

      4. Retry the pip operation.

      The following code shows the content of the fix_pypi.sh script:

      #!/bin/bash
      
      function config_pip() {
          pypi_source=$1
      
          if [[ ! -f ~/.pydistutils.cfg ]]; then
      cat > ~/.pydistutils.cfg << EOF
      [easy_install]
      index-url=http://$pypi_source/pypi/simple/
      EOF
          else
              sed -i "s#index-url.*#index-url=http://$pypi_source/pypi/simple/#" ~/.pydistutils.cfg
          fi
      
          if [[ ! -f ~/.pip/pip.conf ]]; then
              mkdir -p ~/.pip
      cat > ~/.pip/pip.conf << EOF
      [global]
      index-url=http://$pypi_source/pypi/simple/
      [install]
      trusted-host=$pypi_source
      EOF
          else
              sed -i "s#index-url.*#index-url=http://$pypi_source/pypi/simple/#" ~/.pip/pip.conf
              sed -i "s#trusted-host.*#trusted-host=$pypi_source#" ~/.pip/pip.conf
          fi
      }
      
      config_pip "$1"
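
After the script runs, you may want to confirm which repository pip is now configured to use. The following is a minimal sketch: the function name show_pip_index_url is hypothetical, and it only reads the index-url line that the script above writes to ~/.pip/pip.conf.

```shell
#!/bin/bash
# Hypothetical helper: prints the index-url value found in a pip
# configuration file (default: ~/.pip/pip.conf, the file the script writes).
show_pip_index_url() {
    local conf=${1:-$HOME/.pip/pip.conf}
    sed -n 's/^index-url=//p' "$conf"
}

# Example: show_pip_index_url
```

For an instance in a VPC, running bash fix_pypi.sh "mirrors.cloud.aliyuncs.com" and then show_pip_index_url should print http://mirrors.cloud.aliyuncs.com/pypi/simple/, because that is the value the script writes.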