[SRU] Volume creation from image fails for UEC and Glance API version 2

Bug #1439371 reported by Jon Bernard
This bug affects 3 people
Affects              | Status       | Importance | Assigned to | Milestone
Cinder               | Fix Released | High       | Jon Bernard |
  Kilo               | Fix Released | High       | Unassigned  |
Ubuntu Cloud Archive | Fix Released | Undecided  | Unassigned  |
  Icehouse           | In Progress  | Undecided  | Unassigned  |
cinder (Ubuntu)      | Invalid      | Undecided  | Unassigned  |
  Trusty             | Fix Released | Undecided  | Unassigned  |

Bug Description

[Description]
[Test Case]
When creating a volume from a glance UEC image, the RBD driver fails to combine the rootfs, kernel, and ramdisk into a single volume suitable for booting an instance. Instead, only the rootfs is written to the volume, which is empty. This causes nova's boot-from-volume operation to fail if the volume is ceph-backed and the image is of type UEC.

By contrast, the same operation on an LVM volume yields a volume that contains all the necessary pieces to boot an instance.

This is also likely the reason for recent ceph CI job failures in test_volume_boot_pattern, as tempest executes this exact set of operations.

[Regression Potential]
Regression potential is low. Patch is cherry-picked from upstream stable/liberty branch without any changes.

Jon Bernard (jbernard)
tags: added: ceph drivers
removed: driver
Revision history for this message
Matt Riedemann (mriedem) wrote :

Sounds like dansmith is going to work on changing the ceph job to use the disk image to work around this in the gate, so it wouldn't be a release blocker.

Revision history for this message
Matt Riedemann (mriedem) wrote :

The nova bug reported for the ceph job was bug 1439273.

Revision history for this message
Deepak C Shetty (dpkshetty) wrote :

For glusterfs CI too, we are seeing failures for test_volume_boot_pattern, so just wondering whether this is ceph-specific?

Revision history for this message
Matt Riedemann (mriedem) wrote :

d-g patch to archive the ceph logs and ceph.conf during job runs: https://review.openstack.org/#/c/170900/

Revision history for this message
Matt Riedemann (mriedem) wrote :

This tempest patch skips test_volume_boot_pattern until this bug is fixed https://review.openstack.org/#/c/170903/.

Revision history for this message
Matt Riedemann (mriedem) wrote :

This change makes the ceph job voting in the check queue on cinder/glance/nova changes:

https://review.openstack.org/#/c/170913/

That depends on the tempest change to skip the test until this bug is fixed.

Revision history for this message
Dan Smith (danms) wrote :

Just an update on what seems to be happening:

If we do a BFV nova boot from a UEC image per the docs, the command looks something like this:

nova boot --flavor 1 --block-device source=image,id=72dbd780-8fa2-4fb5-8dbe-94c32cd0a511,dest=volume,shutdown=preserve,bootindex=0,size=1 foo

That creates a volume from an image in cinder during the boot process. In the nova code, it uses the image id on the BDM to get the image metadata from glance. When it does, it sees that it needs direct kernel boot and does so. Thus, with ceph and a UEC image, this type of BFV succeeds.

The other way to do this is to pre-create the volume from a glance image with cinder, and then call nova-boot on that volume once it's done. In that case, nova relies on cinder's volume_image_metadata to be a copy of the glance image metadata. This is where the (important part of the) breakage is. The cinder metadata is not complete and is missing the kernel and ramdisk info when using ceph as the backend:

| volume_image_metadata | {u'container_format': u'ami', u'min_ram': u'0', u'disk_format': u'ami', u'image_name': u'cirros-0.3.2-x86_64-uec', u'image_id': u'72dbd780-8fa2-4fb5-8dbe-94c32cd0a511', u'checksum': u'4eada48c2843d2a262c814ddc92ecf2c', u'min_disk': u'0', u'size': u'25165824'} |

In this case, nova can't see that this guest needs direct kernel boot and just assumes the disk is bootable. It looks like it *is* complete when doing the same with the LVM driver, which is why it works there.
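
A minimal sketch of the distinction described above, assuming the decision hinges on whether the metadata names a kernel image. The function name needs_direct_kernel_boot and the placeholder ids are hypothetical, and this is not nova's actual code; it only illustrates why the incomplete ceph-backed volume_image_metadata leads nova to treat the disk as directly bootable.

```python
# Hypothetical helper, NOT nova's implementation. Assumption: a guest needs
# direct kernel boot when its image metadata carries a non-empty kernel_id.

def needs_direct_kernel_boot(image_meta):
    """Return True when the metadata dict names a kernel image."""
    return bool(image_meta.get("kernel_id"))


# volume_image_metadata as reported above for the ceph-backed volume:
# kernel_id and ramdisk_id are missing.
ceph_volume_meta = {
    "container_format": "ami",
    "min_ram": "0",
    "disk_format": "ami",
    "image_name": "cirros-0.3.2-x86_64-uec",
    "image_id": "72dbd780-8fa2-4fb5-8dbe-94c32cd0a511",
    "checksum": "4eada48c2843d2a262c814ddc92ecf2c",
    "min_disk": "0",
    "size": "25165824",
}

# The same metadata with the boot fields present. The two ids are
# placeholders for illustration, not values taken from this report.
complete_meta = dict(
    ceph_volume_meta,
    kernel_id="<kernel-image-id>",
    ramdisk_id="<ramdisk-image-id>",
)

if __name__ == "__main__":
    print(needs_direct_kernel_boot(ceph_volume_meta))
    print(needs_direct_kernel_boot(complete_meta))
```

With the incomplete metadata the check comes back false, which corresponds to the "just assumes the disk is bootable" path; with the boot fields present it comes back true.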

Revision history for this message
Deepak C Shetty (dpkshetty) wrote :

It looks like the glusterfs failure is different from the ceph one.
FWIW: I opened https://bugs.launchpad.net/cinder/+bug/1441050 for tracking the glusterfs issue.

Revision history for this message
John Griffith (john-griffith) wrote :

Spent a little time last night looking at this finally. Not sure why Nova isn't just going to glance for the metadata in the Ceph case? I believe that's how the other backends work, but Ceph has some different paths for when it is the Glance store for optimization (which doesn't apply in this case).

Revision history for this message
Jon Bernard (jbernard) wrote :

In order for users to take advantage of COW volumes created from a glance image, Cinder must be configured to use Glance API version 2 (default is 1). In version 2, the required boot metadata (kernel_id and ramdisk_id) are no longer stored in the 'properties' dict, but as standalone fields in the GET response from glance. The existing cinder parser for the glance response is not aware of this, so a volume created from a v2 image will lack this required metadata. This is the cause of the tempest failure.

I have a small patch to update the parser in testing now, should be posted soon.
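
To illustrate the difference between the two response shapes, here is a rough sketch that uses plain dicts to stand in for glance image records. The function extract_boot_properties and the sample ids are hypothetical; this is not the patch itself, only an example of a parser that looks both inside 'properties' and at the top-level fields.

```python
# Illustrative only: plain dicts stand in for glance responses, and the
# helper below is a hypothetical example, not cinder's parser.

BOOT_KEYS = ("kernel_id", "ramdisk_id")


def extract_boot_properties(image):
    """Collect kernel_id/ramdisk_id from a v1-style or v2-style image dict."""
    # v1-style: boot metadata lives inside the nested 'properties' dict.
    properties = dict(image.get("properties") or {})
    for key in BOOT_KEYS:
        # v2-style: the same keys appear as standalone top-level fields.
        if key not in properties and image.get(key) is not None:
            properties[key] = image[key]
    return {key: properties[key] for key in BOOT_KEYS if key in properties}


# Placeholder ids for illustration; they are not taken from this report.
v1_image = {
    "id": "image-1",
    "properties": {"kernel_id": "kernel-1", "ramdisk_id": "ramdisk-1"},
}
v2_image = {
    "id": "image-1",
    "kernel_id": "kernel-1",
    "ramdisk_id": "ramdisk-1",
}

if __name__ == "__main__":
    print(extract_boot_properties(v1_image))
    print(extract_boot_properties(v2_image))
```

A parser that reads only 'properties' would return nothing for the v2-style record, which matches the missing kernel and ramdisk info seen in volume_image_metadata.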

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/171312

Changed in cinder:
status: New → In Progress
Jon Bernard (jbernard)
summary: - Volume creation from image fails for UEC+Ceph
+ Volume creation from image fails for UEC and Glance API version 2
Revision history for this message
Dan Smith (danms) wrote : Re: Volume creation from image fails for UEC and Glance API version 2

John, the nova code for boot-from-volume is driver-independent. In other words, it doesn't matter whether you use lvm or ceph, it still decides whether to get the metadata from glance or cinder based on the same criteria.

Revision history for this message
Matt Riedemann (mriedem) wrote :

Sounds like this is also affecting tempest runs with test_volume_boot_pattern for glusterfs and gpfs, but we just don't have jobs for those running in infra (I haven't dug into third party CI for those backends in cinder).

Changed in cinder:
importance: Undecided → High
Matt Riedemann (mriedem)
tags: added: kilo-rc-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/171312
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=ea109b5f24dca93fd6f660bc436a685d6101bcea
Submitter: Jenkins
Branch: master

commit ea109b5f24dca93fd6f660bc436a685d6101bcea
Author: Jon Bernard <email address hidden>
Date: Tue Apr 7 13:57:36 2015 -0400

    Include boot properties from glance v2 images

    In order for users to take advantage of COW volumes created from
    a glance image, Cinder must be configured to use Glance API version
    2 (default is 1). In version 2, the required boot metadata (kernel_id
    and ramdisk_id) are no long stored in the 'properties' dict, but as
    standalone fields in the GET response from glance. The existing cinder
    parser for the glance request is not aware of this and the volume
    created form a v2 image will lack this required metadata.

    This was causing the recent Ceph CI gate failures for
    test_volume_boot_pattern.

    Change-Id: I688898b3841691369d73887f7eabdceb05155db1
    Closes-Bug: #1439371

Changed in cinder:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (proposed/kilo)

Fix proposed to branch: proposed/kilo
Review: https://review.openstack.org/172978

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/174050

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (proposed/kilo)

Change abandoned by Doug Hellmann (<email address hidden>) on branch: proposed/kilo
Review: https://review.openstack.org/172978
Reason: replaced by https://review.openstack.org/174050 on stable/kilo

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/kilo)

Reviewed: https://review.openstack.org/174050
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=51bfd201e888caecf8dbdce8c8999bfa8ed05a26
Submitter: Jenkins
Branch: stable/kilo

commit 51bfd201e888caecf8dbdce8c8999bfa8ed05a26
Author: Jon Bernard <email address hidden>
Date: Tue Apr 7 13:57:36 2015 -0400

    Include boot properties from glance v2 images

    In order for users to take advantage of COW volumes created from
    a glance image, Cinder must be configured to use Glance API version
    2 (default is 1). In version 2, the required boot metadata (kernel_id
    and ramdisk_id) are no long stored in the 'properties' dict, but as
    standalone fields in the GET response from glance. The existing cinder
    parser for the glance request is not aware of this and the volume
    created form a v2 image will lack this required metadata.

    This was causing the recent Ceph CI gate failures for
    test_volume_boot_pattern.

    Change-Id: I688898b3841691369d73887f7eabdceb05155db1
    Closes-Bug: #1439371
    (cherry picked from commit ea109b5f24dca93fd6f660bc436a685d6101bcea)

Thierry Carrez (ttx)
tags: removed: kilo-rc-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/179287

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/179287
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=cabe7c1a1d5b35e58fc4ed34b12fcccd4416835e
Submitter: Jenkins
Branch: master

commit 5987bb2290f629e59b0bcced2f8fe22cdeb9cc6d
Author: John Griffith <email address hidden>
Date: Thu Apr 23 12:07:12 2015 -0600

    Add external genconfig calls

    After moving to oslo.config we still were using
    incubator config generator. This was ok, but the
    problem is we haven't been pulling config options
    from the oslo libs.

    This is a hack that just appends external lib calls
    and appends those options to the sample file being built.

    Change-Id: I2634b20ef4abd3bf7990f845d59ad3d208db234f
    (cherry picked from commit 51a22591a44932463847ed3247899db32ac49444)
    Closes-Bug: #1447380

commit b05274c96bc48e749e6ad21633b39158838c313e
Author: Brant Knudson <email address hidden>
Date: Wed Apr 22 14:57:53 2015 -0500

    service child process normal SIGTERM exit

    service.py had some code where the child process would catch the
    SIGTERM from the parent just so it could exit with 1 status rather
    than with an indication that it exited due to SIGTERM. When
    shutting down the parent doesn't care in what way the child ended,
    only that they're all gone, so this code is unnecessary.

    Also, for some reason this caused the child to never exit while
    there was an open connection from a client. Probably something
    with eventlet and signal handling.

    This is a cherry-pick of oslo-incubator commit
    702bc569987854b602ef189655c201c348de84cb .

    Change-Id: I87f3ca4da64fb8070e4d6c3876a2f1ce1a3ca71d
    Closes-Bug: #1446583
    (cherry picked from commit d73ac96d18c66aa4dd5b7d7f8d7c22e8f8434683)

commit 2727e8865ce7b9ef4eec81f7f07b7a0726eb304b
Author: Lucian Petrut <email address hidden>
Date: Fri Mar 27 14:15:25 2015 +0200

    Windows SMBFS: fix volume extend

    The Windows SMBFS driver inherits the Linux SMBFS driver,
    overriding Windows specific methods.

    This commit Ic89cffc93940b7b119cfcde3362f304c9f2875df added the
    volume name as an extra argument to the _do_extend_volume in order
    to check if differencing images are pointing to backing files other
    than the according volume disks.

    Although this is not required on Windows, this method should accept
    this extra argument in order to have the same signature as the
    method it overrides. At the moment, this raises the following
    exception:

    TypeError: _do_extend_volume() takes exactly 3 arguments (4 given)

    Closes-Bug: #1437290
    (cherry picked from commit dca29e9ab3cdde210d3777e7c6b4a6849447058a)
    Change-Id: I868d7de4a2c68f3fc520ba476a5660a84f440bb1

commit cc9bd73479ab4f0d14ee66eccab6fa285b8836b9
Author: Daisuke Fujita <email address hidden>
Date: Wed Apr 15 14:03:31 2015 +0900

    Fix a wrong argument of create method

    Change the argument 'QoSSpecs.create' to 'qos_specs.create'.

    Closes-Bug: #1443331
    (cherry picked from commit a3c0a4104f95acff00d3a9721caa4da730619fb7)
    Change-Id: Iabebc5f1681be75fb06d83...


Thierry Carrez (ttx)
Changed in cinder:
milestone: none → liberty-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: liberty-1 → 7.0.0
James Page (james-page)
Changed in cloud-archive:
status: New → Fix Released
summary: - Volume creation from image fails for UEC and Glance API version 2
+ [SRU] Volume creation from image fails for UEC and Glance API version 2
Changed in cinder (Ubuntu):
status: New → Invalid
Changed in cinder (Ubuntu Trusty):
status: New → In Progress
description: updated
description: updated
Revision history for this message
Martin Pitt (pitti) wrote : Please test proposed package

Hello Jon, or anyone else affected,

Accepted cinder into trusty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/cinder/1:2014.1.5-0ubuntu2.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in cinder (Ubuntu Trusty):
status: In Progress → Fix Committed
tags: added: verification-needed
Revision history for this message
Seyeong Kim (seyeongkim) wrote :

This is in addition to https://bugs.launchpad.net/cinder/+bug/1323660, for compatibility. Verified.

tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cinder - 1:2014.1.5-0ubuntu2.1

---------------
cinder (1:2014.1.5-0ubuntu2.1) trusty; urgency=medium

  * Include boot properties from glance v2 images (LP: #1439371):
    - d/p/include-boot-properties-from-glance-v2-images.patch
  * Fix extract properties from image with glance api v2 (LP: #1323660):
    - d/p/fix-properties-extracting-from-image-with-glance-api.patch

 -- Seyeong Kim <email address hidden> Fri, 11 Nov 2016 11:23:53 +0900

Changed in cinder (Ubuntu Trusty):
status: Fix Committed → Fix Released
Revision history for this message
Robie Basak (racb) wrote : Update Released

The verification of the Stable Release Update for cinder has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
