Thursday, August 21, 2008

Live Upgrade with ZFS root using SXCE (Nevada)

A couple of weeks ago I did a fresh install of "Solaris Express Community Edition snv_94" and selected ZFS root to see what it was like on my desktop. Then I wanted to upgrade to snv_95, but I could not find any information about doing a Live Upgrade with a ZFS root. What was I going to do?

After some searching I found a couple of old pages which hinted at it but gave no full example, so there was only one way to find out......

Step-by-step guide.
  1. Download the latest SXCE, in this case sol-nv-b95-x86-dvd.iso
  2. Mount the ISO on the system
    # lofiadm -a /export/iso/sol-nv-b95-x86-dvd.iso
    /dev/lofi/1
    # mount -F hsfs -o ro /dev/lofi/1 /mnt
  4. Upgrade to the latest Live Upgrade Packages
    # pkginfo SUNWluu SUNWluzone SUNWlur SUNWlucfg
    application SUNWlucfg Live Upgrade Configuration
    application SUNWlur Live Upgrade (root)
    application SUNWluu Live Upgrade (usr)
    application SUNWluzone Live Upgrade (zones support)

    # pkgrm SUNWluu SUNWluzone SUNWlur SUNWlucfg
    # cd /mnt/Solaris_11/Product
    # pkgadd -d . SUNWlucfg SUNWlur SUNWluu SUNWluzone

  6. Now the fun! Create a new boot environment. lucreate has a new option, -p, which specifies the ZFS pool:
    -p zfs_root_pool
    Specifies the ZFS pool in which a new BE will reside.
    This option can be omitted if the source and target BEs are within the same pool.
    e.g. lucreate -c b94 -n b95a -p newpool
    # lucreate -c b94 -n b95a
    Checking GRUB menu...
    System has findroot enabled GRUB
    Analyzing system configuration.
    Comparing source boot environment <b94> file systems with the file
    system(s) you specified for the new boot environment. Determining which
    file systems should be in the new boot environment.
    Updating boot environment description database on all BEs.
    Updating system configuration files.
    Creating configuration for boot environment <b95a>.
    Source boot environment is <b94>.
    Creating boot environment <b95a>.
    Cloning file systems from boot environment <b94> to create boot environment <b95a>.
    Creating snapshot for <rpool/ROOT/snv_94> on <rpool/ROOT/b95a@b95a>.
    Creating clone for <rpool/ROOT/b95a@b95a> on <rpool/ROOT/b95a>.
    Setting canmount=noauto for </> in zone <global> on <rpool/ROOT/b95a>.
    Saving existing file </boot/grub/menu.lst> in top level dataset for BE <b95a> as <mount-point>//boot/grub/menu.lst.prev.
    File </boot/grub/menu.lst> propagation successful
    Copied GRUB menu from PBE to ABE
    No entry for BE <b95a> in GRUB menu
    Population of boot environment <b95a> successful.
    Creation of boot environment <b95a> successful.

  8. Check all is well
    # lustatus
    Boot Environment           Is       Active Active    Can    Copy
    Name                       Complete Now    On Reboot Delete Status
    -------------------------- -------- ------ --------- ------ ----------
    b94                        yes      yes    yes       no     -
    b95a                       yes      no     no        yes    -

    # zfs list
    NAME                USED   AVAIL  REFER  MOUNTPOINT
    rpool               112G   116G   41K    /rpool
    rpool/ROOT          14.7G  116G   18K    legacy
    rpool/ROOT/snv_94   14.6G  116G   6.78G  /
    rpool/ROOT/b95a     81.8M  116G   9.51G  /.alt.tmp.b-Cz.mnt/
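The lustatus check above can also be done by a script before going on to the upgrade. This is only a sketch: the function name be_ready_for_upgrade is made up for illustration, and it assumes the column order shown in the lustatus output above (name, complete, active now) and a POSIX-style shell such as bash or ksh.

```shell
# Hypothetical helper: read `lustatus` output on stdin and succeed only if
# the named BE is marked complete and is not the BE currently running.
# Assumed columns: Name, Is Complete, Active Now, (remaining columns ignored).
be_ready_for_upgrade() {
    status=1
    while read -r name complete active_now rest; do
        if [ "$name" = "$1" ]; then
            if [ "$complete" = yes ] && [ "$active_now" = no ]; then
                status=0
            fi
        fi
    done
    return $status
}
```

It could be used as `lustatus | be_ready_for_upgrade b95a && echo "b95a looks ready"`. A BE that is missing from the table counts as not ready.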

  10. Now do the upgrade
    # luupgrade -u -n b95a -s /mnt
    System has findroot enabled GRUB
    No entry for BE <b95a> in GRUB menu
    Copying failsafe kernel from media.
    Uncompressing miniroot
    Uncompressing miniroot archive (Part2)
    13364 blocks
    Creating miniroot device
    miniroot filesystem is <ufs>
    Mounting miniroot at </mnt/Solaris_11/Tools/Boot>
    Mounting miniroot Part 2 at </mnt/Solaris_11/Tools/Boot>
    Validating the contents of the media </mnt>.
    The media is a standard Solaris media.
    The media contains an operating system upgrade image.
    The media contains <Solaris> version <11>.
    Constructing upgrade profile to use.
    Locating the operating system upgrade program.
    Checking for existence of previously scheduled Live Upgrade requests.
    Creating upgrade profile for BE <b95a>.
    Checking for GRUB menu on ABE <b95a>.
    Saving GRUB menu on ABE <b95a>.
    Checking for x86 boot partition on ABE.
    Determining packages to install or upgrade for BE <b95a>.
    Performing the operating system upgrade of the BE <b95a>.
    CAUTION: Interrupting this process may leave the boot environment unstable or unbootable.
    Upgrading Solaris: 100% completed
    Installation of the packages from this media is complete.
    Restoring GRUB menu on ABE <b95a>.
    Adding operating system patches to the BE <b95a>.
    The operating system patch installation is complete.
    ABE boot partition backing deleted.
    PBE GRUB has no capability information.
    PBE GRUB has no versioning information.
    ABE GRUB is newer than PBE GRUB. Updating GRUB.
    GRUB update was successful.
    Configuring failsafe for system.
    Failsafe configuration is complete.
    INFORMATION: The file </var/sadm/system/logs/upgrade_log> on boot environment <b95a> contains a log of the upgrade operation.
    INFORMATION: The file </var/sadm/system/data/upgrade_cleanup> on boot environment <b95a> contains a log of cleanup operations required.
    WARNING: <1> packages failed to install properly on boot environment <b95a>.
    INFORMATION: The file </var/sadm/system/data/upgrade_failed_pkgadds> on boot environment <b95a> contains a list of packages that failed to upgrade or install properly.
    INFORMATION: Review the files listed above. Remember that all of the files are located on boot environment <b95a>. Before you activate boot environment <b95a>, determine if any additional system maintenance is required or if additional media of the software distribution must be installed.
    The Solaris upgrade of the boot environment <b95a> is partially complete.
    Installing failsafe
    Failsafe install is complete.
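The luupgrade output is long, and the one WARNING line above is easy to miss. A small filter can pull out just the lines that need attention. This is a sketch only: lu_problems is a made-up name, and it assumes the output was saved to a file or piped in, and that problem lines contain "WARNING:" or "ERROR:" as they do in the output above.

```shell
# Hypothetical helper: print the WARNING and ERROR lines found on stdin.
# Returns 0 when nothing was flagged and 1 when at least one line was.
lu_problems() {
    found=0
    while IFS= read -r line; do
        case $line in
            *WARNING:*|*ERROR:*)
                printf '%s\n' "$line"
                found=1
                ;;
        esac
    done
    return $found
}
```

For example, save the output with `luupgrade -u -n b95a -s /mnt 2>&1 | tee /var/tmp/luupgrade.log` (the log path is just an example) and then run `lu_problems < /var/tmp/luupgrade.log`.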

  12. Check status
    # lustatus
    Boot Environment           Is       Active Active    Can    Copy
    Name                       Complete Now    On Reboot Delete Status
    -------------------------- -------- ------ --------- ------ ----------
    b94                        yes      yes    yes       no     -
    b95a                       yes      no     no        yes    -

  14. If you want to look at the new filesystem
    # zfs mount rpool/ROOT/b95a
    # df -F zfs
    /                  (rpool/ROOT/snv_94 ):243553838 blocks 243553838 files
    /export            (rpool/export      ):243553838 blocks 243553838 files
    /export/home       (rpool/export/home ):243553838 blocks 243553838 files
    /rpool             (rpool             ):243553838 blocks 243553838 files
    /.alt.tmp.b-Cz.mnt (rpool/ROOT/b95a   ):243553838 blocks 243553838 files
    # zfs unmount rpool/ROOT/b95a
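While the new BE is mounted, it is a good moment to read the list of packages that luupgrade said it could not install. A minimal sketch, assuming the file path given in the luupgrade messages above; the function name failed_pkgs is made up, and the argument is wherever the new BE happens to be mounted (/.alt.tmp.b-Cz.mnt in this example).

```shell
# Hypothetical helper: show the packages that failed to upgrade, given the
# directory where the new BE's root file system is mounted.
failed_pkgs() {
    be_root=${1%/}    # drop a trailing slash, if any
    list="$be_root/var/sadm/system/data/upgrade_failed_pkgadds"
    if [ -s "$list" ]; then
        cat "$list"
    else
        echo "no failed packages recorded under ${be_root:-/}" >&2
    fi
}
```

Run it between the zfs mount and zfs unmount commands, e.g. `failed_pkgs /.alt.tmp.b-Cz.mnt`.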

  16. Make it live
    # luactivate b95a
    System has findroot enabled GRUB
    Generating boot-sign, partition and slice information for PBE <b94>
    Saving existing file </etc/bootsign> in top level dataset for BE <b94> as <mount-point>//etc/bootsign.prev.
    WARNING: <1> packages failed to install properly on boot environment <b95a>.
    INFORMATION: </var/sadm/system/data/upgrade_failed_pkgadds> on boot environment <b95a> contains a list of packages that failed to upgrade or install properly. Review the file before you reboot the system to determine if any additional system maintenance is required.

    Generating boot-sign for ABE <b95a>
    Saving existing file </etc/bootsign> in top level dataset for BE <b95a> as <mount-point>//etc/bootsign.prev.
    Generating partition and slice information for ABE <b95a>
    Copied boot menu from top level dataset.
    Generating direct boot menu entries for PBE.
    Generating xVM menu entries for PBE.
    Generating direct boot menu entries for ABE.
    Generating xVM menu entries for ABE.
    Disabling splashimage
    Re-enabling splashimage
    No more bootadm entries. Deletion of bootadm entries is complete.
    GRUB menu default setting is unaffected
    Done eliding bootadm entries.

    **********************************************************************
    The target boot environment has been activated. It will be used when you
    reboot. NOTE: You MUST NOT USE the reboot, halt, or uadmin commands. You
    MUST USE either the init or the shutdown command when you reboot. If you
    do not use either init or shutdown, the system will not boot using the
    target BE.
    **********************************************************************
    In case of a failure while booting to the target BE, the following process
    needs to be followed to fallback to the currently working boot environment:
    1. Boot from Solaris failsafe or boot in single user mode from the Solaris
    Install CD or Network.
    2. Mount the Parent boot environment root slice to some directory (like
    /mnt). You can use the following command to mount:

    mount -Fzfs /dev/dsk/c1d0s0 /mnt

    3. Run <luactivate> utility with out any arguments from the Parent boot environment root slice, as shown below:

    /mnt/sbin/luactivate

    4. luactivate, activates the previous working boot environment and indicates the result.

    5. Exit Single User mode and reboot the machine.

    **********************************************************************

    Modifying boot archive service
    Propagating findroot GRUB for menu conversion.
    File </etc/lu/installgrub.findroot> propagation successful
    File </etc/lu/stage1.findroot> propagation successful
    File </etc/lu/stage2.findroot> propagation successful
    File </etc/lu/GRUB_capability> propagation successful
    Deleting stale GRUB loader from all BEs.
    File </etc/lu/installgrub.latest> deletion successful
    File </etc/lu/stage1.latest> deletion successful
    File </etc/lu/stage2.latest> deletion successful
    Activation of boot environment <b95a> successful.
    #

  18. Reboot (with init or shutdown, as the luactivate output insists) and see what happens.....
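The steps above can be collected into one short plan. The sketch below only prints the commands, it does not run them, so it is safe to try anywhere. The function name lu_plan and its arguments are made up for illustration; the commands themselves are the ones used in this post, with the lofiadm and mount steps chained so the lofi device name does not have to be typed by hand.

```shell
# Hypothetical helper: print (not run) the commands for one upgrade cycle.
# Usage: lu_plan ISO_PATH CURRENT_BE NEW_BE [MOUNT_POINT]
lu_plan() {
    iso=$1 cur_be=$2 new_be=$3 mnt=${4:-/mnt}
    pkgs="SUNWlucfg SUNWlur SUNWluu SUNWluzone"
    # The escaped \$( keeps the lofiadm call literal in the printed plan.
    cat <<EOF
mount -F hsfs -o ro \$(lofiadm -a $iso) $mnt
pkgrm $pkgs
pkgadd -d $mnt/Solaris_11/Product $pkgs
lucreate -c $cur_be -n $new_be
lustatus
luupgrade -u -n $new_be -s $mnt
luactivate $new_be
init 6
EOF
}
```

For this post the call would be `lu_plan /export/iso/sol-nv-b95-x86-dvd.iso b94 b95a`. Review the printed commands and run them one at a time as root, checking the output of each before moving on.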

17 comments:

.:: Aurora ::. said...

Hi,

Thanks for this !
Just a question, what was all this about: "WARNING: <1> packages failed to install properly on boot environment <b95a>.
INFORMATION: </var/sadm/system/data/upgrade_failed_pkgadds> on boot
environment <b95a> contains a list of packages that failed to upgrade or
install properly. Review the file before you reboot the system to
determine if any additional system maintenance is required."

Thanks,
Edward.

Andrew Watkins said...

Thanks for the comment. I should have mentioned that.

# cat system/data/upgrade_failed_pkgadds
SUNWmccom

It's a known bug and already fixed in B96
6731827 SUNWmccom did not install due to bad postinstall script in snv_95.

markm said...

You might want to fix the output of those commands - a lot of the LU messages use < and > characters, so the lucreate output looks a bit odd...

Andrew Watkins said...

Thanks Mark or should I say NOT ;-)

Looks a little better now.. Cheers

John said...

Bad and good news:

Your instructions worked like a champ
for me, from snv_95 to snv_96.

However, trying the same steps to
go from snv_96 to snv_97 crashed
my box during `luupgrade`. Looking
at the zpool history of my rpool,
it looks like LU got really confused
between my old snv_95 and new snv_97
BE's and FS'. I'll try to analyze
the dump later.

For now, beware.

Andrew Watkins said...

John sorry to hear that.
I am in the middle of trying it myself on my work machine (snv_95 to snv_97). No errors have shown up so far, but I will not reboot the system until tomorrow, so we will see then.
Summary:
# pkgrm SUNWluu SUNWluzone SUNWlur SUNWlucfg
# cd /mnt/Solaris_11/Product
# pkgadd -d . SUNWlucfg SUNWlur SUNWluu SUNWluzone
# lucreate -n b97
# lustatus
# luupgrade -u -n b97 -s /mnt

Anonymous said...

Did not work for me either. Everything went through fine, but luactivate did not write the new BE into the menu.lst file.

Andrew Watkins said...

So, did you reboot your system (init 6) and nothing happened?

- "lustatus" shows the correct information?
- "luactivate" displayed no errors
- "zpool history" shows no errors

Maybe I am just lucky for once, since mine is working fine. I will look at it again when B98 comes out.

Anonymous said...

It worked the second time around. The only difference in my steps was init 6 versus shutting down from the desktop menu. Thanks for the wonderful information. I am now up on snv_97. Now, if I could get VirtualBox to run with host interface networking over my wireless card, I would be all set.

Hans van der Made said...

Worked like a charm. Just updated snv_96 to snv_97 without any problems. Thanks!

CrossMod said...

Upgrade worked flawlessly with these instructions going from NV95 to NV98. I did have a problem trying to switch back to NV95 though:


bash-3.2# luactivate NV95
System has findroot enabled GRUB
Generating boot-sign, partition and slice information for PBE NV98
ERROR: cannot mount '/.alt.tmp.b-nnb.mnt/': directory is not empty
ERROR: cannot mount mount point /.alt.tmp.b-nnb.mnt/ device rpool/ROOT/snv_95
ERROR: failed to mount file system rpool/ROOT/snv_95 on /.alt.tmp.b-nnb.mnt/
ERROR: unmounting partially mounted boot environment file systems
ERROR: cannot mount boot environment by icf file /etc/lu/ICF.1
ERROR: Unable to mount the boot environment NV95.


Any advice? I can confirm that the tmp directory gets created with /export/home inside, which seems to be killing the mount attempt.

Andrew Watkins said...

CrossMod:
I think it is a matter of cleaning up some old directories.
I guess try the following:
# lucurr
NV98


Check that only the current system is mounted (no /.alt.* entries):
# df -F zfs
/ (rpool/ROOT/snv_98 )
/export (rpool/export )
/export/home (rpool/export/home )
/rpool (rpool )

# ls -ad /.alt*
/.alt.NV95 /.alt.tmp.b-Cz.mnt

Check that they do not contain any useful data, and then delete them (BE CAREFUL with -r):
# rm -ir /.alt.NV95 /.alt.tmp.b-Cz.mnt
# lumount NV95
# luunmount NV95


Andrew

CrossMod said...

Andrew:

Thanks for the suggestions, but I tried that already (deleting all the .alt* directories). A new .alt directory is added each time you try to run the luactivate. After it bombs out with the error I posted, if you go check out the .alt directory that was created, it contains /export/home (empty).

Andrew Watkins said...

Sorry, I think you are on your own with this one...
I have not done an luactivate to go back, but some commands you could try to see if it points you in the right direction:
# zpool history
# zpool status
# zfs list
# zfs mount rpool/ROOT/snv_95
# ls /.alt*
# ls /.alt*/export/home
# zfs umount rpool/ROOT/snv_95
# cd /etc/lu
# lufslist NV95
# more ICF*

Sorry, I can't give a better answer..

Martin said...

I get a weird error on luupgrade. Everything else seems to go fine though.

# lucreate -c snv_101 -n snv_103

(no errors)

# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
snv_101                    yes      yes    yes       no     -
snv_103                    yes      no     no        yes    -
#


# luupgrade -u -n snv_103 -s /mnt

System has findroot enabled GRUB
No entry for BE < snv_103 > in GRUB menu
Uncompressing miniroot
Copying failsafe kernel from media.
52161 blocks
miniroot filesystem is < lofs >
Mounting miniroot at < /mnt/Solaris_11/Tools/Boot >
Validating the contents of the media < /mnt >.
The media is a standard Solaris media.
The media contains an operating system upgrade image.
The media contains < Solaris > version < 11 >.
Constructing upgrade profile to use.
Locating the operating system upgrade program.
Checking for existence of previously scheduled Live Upgrade requests.
Creating upgrade profile for BE < snv_103 >.
Checking for GRUB menu on ABE < snv_103 >.
Saving GRUB menu on ABE < snv_103 >.
Checking for x86 boot partition on ABE.
Determining packages to install or upgrade for BE < snv_103 >.
Performing the operating system upgrade of the BE < snv_103 >.
CAUTION: Interrupting this process may leave the boot environment unstable
or unbootable.
ERROR: Installation of the packages from this media of the media failed; pfinstall returned these diagnostics:

Processing profile

Loading local environment and services

Generating upgrade actions

ERROR: No upgradeable file systems found at specified mount point.
Restoring GRUB menu on ABE < snv_103 >.
ABE boot partition backing deleted.
PBE GRUB has no capability information.
PBE GRUB has no versioning information.
ABE GRUB is newer than PBE GRUB. Updating GRUB.
GRUB update was successful.
Configuring failsafe for system.
Failsafe configuration is complete.
The Solaris upgrade of the boot environment < snv_103 > failed.
Installing failsafe
Failsafe install is complete.
#

Andrew Watkins said...

Are you sure you installed the latest Live Upgrade packages from the new version?

# pkgadd -d . SUNWlucfg SUNWlur SUNWluu SUNWluzone

Dan Reiland said...

Thanks for the detailed instructions -- nice to have everything well documented in a single place. Procedure worked flawlessly for snv_111 to snv_112.