Solaris Containers, available starting with Solaris 10, allow us to partition a physical server into one or more logical units. Whilst Containers are a form of virtualisation, they are not virtualisation in the traditional sense (multiple OS instances with VMware, or hardware partitioning with LDoms).
The container can be thought of more like a "chroot" environment (in the case of sparse zones) where system resources are also in effect "chrooted", so that processes cannot run away and consume all of the resources of the physical parent (the global zone), thus rendering the system inoperable. Only a single instance of Solaris 10 is ever installed (in the global zone), making package and patch management simple. Just apply the patch to the global zone, and all child zones will use the same binary set.
Some, or all, of the parent's filesystems can be mounted read-write or read-only within the zone. Special care must be taken when mounting a global zone filesystem read-write, as the child zone may be able to cause a denial of service to the global zone by filling a disk.
Up until recently, Solaris Containers could only share the TCP/IP stack of the global zone (i.e. the physical parent). Now, we can assign exclusive physical interfaces to a Container. In this article, however, I'll be creating a dynamic link aggregation of three NICs in the global zone, and then allowing the zones to create virtual interfaces on this aggregation.
There are *many* more features of Solaris Containers I've not had time to mention - Sun do a perfectly good job of documenting these over at docs.sun.com, along with the other elementary topics that have not been covered in this discussion.
We're running this setup on a T2000 (SPARC T1 with 4 cores, 4 threads per core - i.e. 16 logical cores) with 8GB RAM. Ensure you partition so that your zoneroot (i.e. the filesystem on which you will store your zones) has plenty of free capacity. The system has 4 x e1000g interfaces - I will assign a single interface for the exclusive use of the global zone. I will then assign the remaining three interfaces to an aggregation.
I'll employ resource controls, and bind a dynamic pool of between 2 and 15 logical cores to the Containers (using Fair Share Scheduling to assign CPU shares on a zone-by-zone basis).
The terms "Containers" and "zones" are used interchangeably and at my whim. They both refer to the same thing.
The first thing to do is aggregate three of the parent's physical NICs into a dynamic link - this will create a high throughput link on which the zones can reside.
First, configure your switch. Our Cisco switches support EtherChannel, so configuring the aggregation switch-side with LACP was straightforward. I will not cover switch configuration here, but may add something to the Tips and Tricks section in the future.
Once your NICs are connected to a configured switch, you can start configuring the aggregation.
If you're using the ipge driver for your NICs, they will be seen as legacy devices, and will not be appropriate for aggregation. On a T2000, you can transition to the e1000g driver via patch 123334-02.
Check if the patch is installed
# showrev -p | grep 123334-02
If not, download from sunsolve.sun.com, and install it
# patchadd 123334-02
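The check-then-install step above can be scripted. What follows is only a sketch for a POSIX shell: patch_listed is a helper name I've made up for illustration, and it assumes showrev -p prints one "Patch: <id> ..." line per installed patch.

```shell
#!/bin/sh
# Sketch: succeed if the given patch ID appears in `showrev -p` output on stdin.
# patch_listed is a hypothetical helper name, not a Solaris command.
patch_listed() {
    # Each installed patch is reported as "Patch: <id> Obsoletes: ..."
    grep "^Patch: $1 " >/dev/null
}

# Intended use on the global zone (left commented so the sketch has no side effects):
#   showrev -p | patch_listed 123334-02 || patchadd 123334-02
```

Note that this is an exact match on the patch ID, so a later revision of the same patch would not satisfy it.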
Once this is complete, you can perform the transition
# svcadm milestone single-user
# /usr/sbin/e1000g_transition
Answer "y" when asked to proceed, and "n" when asked about halting the system. Once the transition is complete, you can reboot
# shutdown -y -g0 -i6
Once the system comes up, you can begin configuring the aggregation. First, ensure that all NICs are now using the e1000g driver.
# dladm show-dev
e1000g0 link: up speed: 100 Mbps duplex: full
e1000g1 link: down speed: 0 Mbps duplex: half
e1000g2 link: down speed: 0 Mbps duplex: half
e1000g3 link: down speed: 0 Mbps duplex: half
Good, we can now check that the types are no longer legacy devices
# dladm show-link
e1000g0 type: non-vlan mtu: 1500 device: e1000g0
e1000g1 type: non-vlan mtu: 1500 device: e1000g1
e1000g2 type: non-vlan mtu: 1500 device: e1000g2
e1000g3 type: non-vlan mtu: 1500 device: e1000g3
If they were still legacy, you'd see type: legacy
Notice that the show-dev subcommand displays link information, whilst show-link shows device information? Hmm...
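If you have more than a handful of links, the legacy check can be scripted rather than eyeballed. This is a rough sketch: legacy_links is a made-up helper name, and it assumes the show-link output layout shown above (name, then "type:", then the type).

```shell
#!/bin/sh
# Sketch: print the name of every link whose type is "legacy".
# legacy_links is a hypothetical helper; it reads `dladm show-link` output on stdin.
legacy_links() {
    awk '$2 == "type:" && $3 == "legacy" { print $1 }'
}

# Intended use on the global zone:
#   dladm show-link | legacy_links
```

No output means every link is ready for aggregation.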
If this all checks out, you're ready to create your aggregation
# dladm create-aggr -P L3 -l active -d e1000g1 -d e1000g2 -d e1000g3 -u 00:14:4f:6a:ce:c4 1
Here we specify the load-balancing policy (L3), and an LACP mode of active (this must match the switch configuration). Next, we define the devices to use in the aggregation, the MAC address to assign, and finally a unique key (instance number) to assign to the aggregation. So our configured aggregation will have an "aggr1" logical device name.
Check that the aggregation has been formed correctly
# dladm show-aggr -L
key: 1 (0x0001) policy: L3 address: 0:14:4f:6a:ce:c4 (fixed)
LACP mode: active LACP timer: short
device activity timeout aggregatable sync coll dist defaulted expired
e1000g3 active short yes yes yes yes no no
e1000g2 active short yes yes yes yes no no
e1000g1 active short yes yes yes yes no no
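In the output above, the columns to watch are sync, coll and dist - all three should read "yes" for every port once LACP has negotiated with the switch. A sketch of an automated check follows; unsynced_ports is a made-up helper name, and the column positions are taken from the output shown above.

```shell
#!/bin/sh
# Sketch: print any aggregation port that is not yet synchronised, collecting
# and distributing. unsynced_ports is a hypothetical helper; it reads
# `dladm show-aggr -L` output on stdin.
unsynced_ports() {
    # Port rows have 9 fields: device activity timeout aggregatable
    # sync coll dist defaulted expired. The header row starts with "device".
    awk 'NF == 9 && $1 != "device" &&
         ($5 != "yes" || $6 != "yes" || $7 != "yes") { print $1 }'
}

# Intended use on the global zone:
#   dladm show-aggr -L | unsynced_ports
```

No output means all ports have joined the aggregation.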
Check that the aggregation can be plumbed
# ifconfig aggr1 plumb
# ifconfig aggr1
aggr1: flags=1000842<BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 3
inet 0.0.0.0 netmask 0
ether 0:14:4f:6a:ce:c4
Make the interface persistent with a dummy IP
# echo "0.0.0.0" > /etc/hostname.aggr1
And test the changes persist across the reboot
# shutdown -y -g0 -i6
Now we can move on to configuring our dynamic processor pool
We will provision resources as follows. The zones will have a dynamic processor pool of between 2 and 15 logical cores, which will vary as utilisation calls for it. The default pool will have a minimum of 1 logical core allocated to it. The default pool will be left available for exclusive use of the global zone, again so that zones cannot consume all resources. We'll use the TS scheduler for the default pool, but will use FSS (Fair Share Scheduling) to assign weighted CPU shares to our zones.
First, enable the pools and pools/dynamic SMF services
# svcadm enable svc:/system/pools:default
# svcadm enable svc:/system/pools/dynamic:default
Once done, we can set the default pool scheduler to TS
# poolcfg -c 'modify pool pool_default ( string pool.scheduler="TS" )'
Instantiate the configuration
# pooladm -c
Next, we define our zone processor set (zone_pset) and the pool that the zones will use (zone_pool), and configure the pool scheduler to be FSS
# poolcfg -c 'create pset zone_pset ( uint pset.min=2; uint pset.max=15 )'
# poolcfg -c 'create pool zone_pool'
# poolcfg -c 'associate pool zone_pool ( pset zone_pset )'
# poolcfg -c 'modify pool zone_pool ( string pool.scheduler="FSS" )'
# pooladm -c
Once this has been completed, we can begin provisioning our Containers.
I will provision the container as follows. The container will be a sparse root zone, and will inherit read-only filesystems (such as /lib, /usr) from the global zone. The container will mount my home directory from the global zone, and will do so read-write (/home is on a different partition in the global zone). This allows me to have a consistent home environment across all zones on the server, as well as the global zone itself. It will be assigned 100 CPU shares.
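A note on what "100 CPU shares" means: FSS shares are relative weights, not percentages. When zones in the same pool are competing for CPU, each is entitled to its shares divided by the total shares of the competing zones. The sketch below works through the arithmetic; share_percent is a made-up helper and other-zone is a hypothetical second zone with 300 shares, purely for illustration.

```shell
#!/bin/sh
# Sketch: read "zone shares" pairs on stdin and print each zone's entitlement
# as a whole-number percentage of the total. share_percent is a hypothetical
# helper name.
share_percent() {
    awk '{ name[NR] = $1; share[NR] = $2; total += $2 }
         END { for (i = 1; i <= NR; i++)
                   printf "%s %d%%\n", name[i], share[i] * 100 / total }'
}

# test-zone is from this article; other-zone is hypothetical.
share_percent <<EOF
test-zone 100
other-zone 300
EOF
# prints:
#   test-zone 25%
#   other-zone 75%
```

With only one busy zone in the pool, that zone can use all of the pool's CPU regardless of its share count.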
Configure the zone using zonecfg
# zonecfg -z test-zone
> create -F
> set zonepath=/var/zones/test-zone
> set autoboot=true
> set pool=zone_pool
> add net
> set address=192.168.x.x
> set physical=aggr1
> end
> add rctl
> set name=zone.cpu-shares
> add value ( priv=privileged,limit=100,action=none )
> end
> add fs
> set dir=/home/kevin
> set special=/home/kevin
> set type=lofs
> set options=[rw,nodevices]
> end
> verify
> commit
> exit
Stepping through this - we create the zone "test-zone", set the zonepath (the directory in which this zone will be installed), set the zone to automatically boot when the global zone boots and bind the zone to the zone_pool resource pool.
Next, we add a network interface, bound to aggr1 in the global zone, and assign an IP address. Once your zone is in operation, you'll see an interface alias (aggr1:1) in the global zone.
A resource control is added next, assigning our zone 100 FSS CPU shares. My home directory is then imported, from /home/kevin in the global zone (special) to /home/kevin in the child zone (dir). See that we specify "rw" (read-write) in our options.
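The same configuration can also be fed to zonecfg non-interactively with its -f option, which is handy when provisioning more than one zone. This is only a sketch for a POSIX shell: ZONE, ZONEPATH, ADDRESS and zone_commands are names I've introduced for illustration, 192.168.x.x is the same placeholder used above, and ZONEPATH is assumed to be /var/zones/test-zone to match the paths used later in this article.

```shell
#!/bin/sh
# Sketch only: emit the zonecfg commands from the session above as a command file.
# ZONE, ZONEPATH, ADDRESS and zone_commands are illustrative names.
ZONE=test-zone
ZONEPATH=/var/zones/$ZONE
ADDRESS=192.168.x.x          # placeholder - substitute a real address

zone_commands() {
    cat <<EOF
create -F
set zonepath=$ZONEPATH
set autoboot=true
set pool=zone_pool
add net
set address=$ADDRESS
set physical=aggr1
end
add rctl
set name=zone.cpu-shares
add value (priv=privileged,limit=100,action=none)
end
add fs
set dir=/home/kevin
set special=/home/kevin
set type=lofs
set options=[rw,nodevices]
end
verify
commit
EOF
}

# Intended use on the global zone (left commented so the sketch has no side effects):
#   zone_commands > /tmp/$ZONE.cfg && zonecfg -z "$ZONE" -f /tmp/$ZONE.cfg
```

Keeping the commands in a function makes it easy to review the generated file before handing it to zonecfg.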
Once the configuration is verified and committed, we can install the zone.
# zoneadm -z test-zone install
The zone will appear in the output of zoneadm list -i (without -i, only running zones are listed)
# zoneadm list -i
global
test-zone
Once the install has completed, we can boot the zone. However, I like to perform some post-install steps on the zone first.
First, I disable the /home automount. I dislike automount - I don't use NFS.
Here, I show a handy technique you can use to administer the zone from the global zone - the zone's filesystems will appear under
<zonepath>/root
For example, on our zone the auto_master file can be reached from the global zone at
/var/zones/test-zone/root/etc/auto_master
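One way to disable the /home automount from the global zone is to comment out its entry in that file. The following is a sketch rather than a definitive method: disable_home_automount is a made-up helper name, and it assumes the stock auto_master, where the entry is a line beginning with /home.

```shell
#!/bin/sh
# Sketch: comment out the /home entry in an auto_master file.
# disable_home_automount is a hypothetical helper; $1 is the path to the
# zone's auto_master as seen from the global zone.
disable_home_automount() {
    # Keep a backup, then rewrite the original in place so that its
    # ownership and permissions are preserved.
    cp "$1" "$1.orig" &&
    sed 's|^/home|#/home|' "$1.orig" > "$1"
}

# Intended use on the global zone, before the zone's first boot:
#   disable_home_automount /var/zones/test-zone/root/etc/auto_master
```

Only do this kind of editing from the global zone while the zone is not running.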
Disable the NFS4 domain question from appearing on the zone's initial boot
# touch /var/zones/test-zone/root/etc/.NFS4inst_state.domain
I also create a sysidcfg file, that (like JumpStart) allows for an unattended first boot of the zone.
# cat > ~root/sysidcfg
system_locale=C
timezone=Your/TimeZone
terminal=ansi
security_policy=NONE
root_password=eNcRyPtEdPa55w0rD
timeserver=localhost
name_service=NONE
network_interface=primary { hostname=test-zone
netmask=255.255.255.0
default_route=192.168.x.1
protocol_ipv6=no }
nfs4_domain=dynamic
^D
# chmod 400 ~root/sysidcfg
# cp -p ~root/sysidcfg /var/zones/test-zone/root/etc
Once this is done, you can boot the zone
# zoneadm -z test-zone boot
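To confirm from a script that the zone has come up, the parsable output of zoneadm list -cp can be used. A minimal sketch, assuming the colon-separated layout zoneid:zonename:state:zonepath (later fields, if present, are ignored); zone_state is a made-up helper name.

```shell
#!/bin/sh
# Sketch: print the state of the named zone.
# zone_state is a hypothetical helper; it reads `zoneadm list -cp` output on
# stdin, where each line is zoneid:zonename:state:zonepath[:...].
zone_state() {
    awk -F: '$2 == zone { print $3 }' zone="$1" -
}

# Intended use on the global zone:
#   zoneadm list -cp | zone_state test-zone
```

A freshly booted zone should report "running"; an installed zone that has not been booted reports "installed".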
If you hadn't performed the post install, you'd have to use the zlogin command to connect to the console of the zone, and run through the initial boot dialog
# zlogin -C -e'#.' test-zone
If you don't need the console, connect a virtual tty to the zone, and start working!
# zlogin test-zone
I have written two scripts to automate the zone provisioning process.
configure_pset_pool.sh - This script is designed to be run on a freshly installed Solaris 10 global zone (on a 16 thread T2000) and will set up the zone resource pools described in this article.
provision_zone.sh - An incredibly detailed script, that will not only provision zones as above, but also perform a large number of additional post-installation steps (for example setting up user accounts, a few security steps, etc).
Instead of typing everything above (except the Link Aggregation), I could have achieved the same with
# configure_pset_pool.sh
# provision_zone.sh -z test-zone -p aggr1 -i 192.168.x.x -a -b
But where would the fun have been in that?! ;-)
I hope this article has proved to be informative, and has highlighted methods for deploying Solaris Containers on your hardware, as well as controlling basic zone resources.
Comments and feedback welcome as always.
Cheers
Kevin Waldron
kevin@zazzybob.com
Disclaimer! - This article is provided for guidance only, and does not replace the relevant official documentation and manuals. I will not be held liable for any hosed systems and/or data.