Comment 2 for bug 1357368

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/114539
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=aa9104ccedb3ff13cc34a498b11f5e8ff100fd99
Submitter: Jenkins
Branch: master

commit aa9104ccedb3ff13cc34a498b11f5e8ff100fd99
Author: Jeegn Chen <email address hidden>
Date: Fri Aug 15 21:40:14 2014 +0800

    Clean up iSCSI multipath devices in Post Live Migration

    When a volume is attached to a VM on the source compute node through
    multipath, the related files in /dev/disk/by-path/ look like this:

    stack@ubuntu-server12:~/devstack$ ls /dev/disk/by-path/*24
    /dev/disk/by-path/ip-192.168.3.50:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.a5-lun-24
    /dev/disk/by-path/ip-192.168.4.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b4-lun-24

    The information on its corresponding multipath device looks like this:

    stack@ubuntu-server12:~/devstack$ sudo multipath -l 3600601602ba03400921130967724e411
    3600601602ba03400921130967724e411 dm-3 DGC,VRAID
    size=1.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
    |-+- policy='round-robin 0' prio=-1 status=active
    | `- 19:0:0:24 sdl 8:176 active undef running
    `-+- policy='round-robin 0' prio=-1 status=enabled
      `- 18:0:0:24 sdj 8:144 active undef running

    After the VM is migrated to the destination, however, the related
    information can differ, as in the following example, because we CANNOT
    guarantee that all nodes can access the same iSCSI portals and see the same
    target LUN number. This destination information overwrites connection_info
    in the DB before the post live migration logic is executed.

    stack@ubuntu-server13:~/devstack$ ls /dev/disk/by-path/*100
    /dev/disk/by-path/ip-192.168.3.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b5-lun-100
    /dev/disk/by-path/ip-192.168.4.51:3260-iscsi-iqn.1992-04.com.emc:cx.fnm00124500890.b4-lun-100

    stack@ubuntu-server13:~/devstack$ sudo multipath -l 3600601602ba03400921130967724e411
    3600601602ba03400921130967724e411 dm-3 DGC,VRAID
    size=1.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
    |-+- policy='round-robin 0' prio=-1 status=active
    | `- 19:0:0:100 sdf 8:176 active undef running
    `-+- policy='round-robin 0' prio=-1 status=enabled
      `- 18:0:0:100 sdg 8:144 active undef running

    As a result, if post live migration on the source side uses the <IP>, <IQN>
    and <TARGET LUN Number> from the overwritten connection_info to find the
    devices to clean up, it may look for 192.168.3.51,
    iqn.1992-04.com.emc:cx.fnm00124500890.b5 and 100.
    However, the correct values on the source are 192.168.3.50,
    iqn.1992-04.com.emc:cx.fnm00124500890.a5 and 24.

    The approach used for https://bugs.launchpad.net/nova/+bug/1327497 can be
    applied here as well: leverage the unchanged multipath_id to find the
    correct devices to delete.

    Change-Id: I875293c3ade9423caa2b8afe9eca25a74606d262
    Closes-Bug: #1357368
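
The idea in the commit message can be illustrated with a minimal Python sketch. This is not the code merged into nova. The helper names `by_path_name` and `devices_from_multipath_output` are hypothetical, and the sample text is the source-node `multipath -l` output quoted above. The sketch assumes the path lines keep the `H:C:T:L <device> <major>:<minor>` layout shown there. It shows that the devices to clean up can be read from the multipath device itself, rather than rebuilt from an <IP>, <IQN> and <TARGET LUN Number> that may already describe the destination node.

```python
import re

# Sample `multipath -l <multipath_id>` output copied from the commit message
# (source compute node).
SAMPLE_OUTPUT = """\
3600601602ba03400921130967724e411 dm-3 DGC,VRAID
size=1.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='round-robin 0' prio=-1 status=active
| `- 19:0:0:24 sdl 8:176 active undef running
`-+- policy='round-robin 0' prio=-1 status=enabled
  `- 18:0:0:24 sdj 8:144 active undef running
"""

# Matches a path line such as "| `- 19:0:0:24 sdl 8:176 active undef running":
# an H:C:T:L address, then the block device name, then major:minor.
_PATH_LINE = re.compile(r"(\d+):(\d+):(\d+):(\d+)\s+(\w+)\s+\d+:\d+")


def by_path_name(ip, iqn, lun, port=3260):
    """Build the /dev/disk/by-path name of one iSCSI path (hypothetical helper)."""
    return "/dev/disk/by-path/ip-%s:%d-iscsi-%s-lun-%d" % (ip, port, iqn, lun)


def devices_from_multipath_output(output):
    """Return (device, lun) pairs for every path listed in `multipath -l` output."""
    devices = []
    for line in output.splitlines():
        match = _PATH_LINE.search(line)
        if match:
            devices.append(("/dev/" + match.group(5), int(match.group(4))))
    return devices


if __name__ == "__main__":
    # Name derived from the destination's values: not present on the source.
    print(by_path_name("192.168.3.51",
                       "iqn.1992-04.com.emc:cx.fnm00124500890.b5", 100))
    # Devices derived from the unchanged multipath_id: valid on the source.
    print(devices_from_multipath_output(SAMPLE_OUTPUT))
```

Because the multipath_id is the same on both nodes, looking devices up through it avoids depending on values that live migration has overwritten.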