KVM IO Benchmark, lvm2 vs. qcow2

1 LVM Installation

1.1 Create logical volume

lvcreate -n wheezy-lvm --size 10g vg0

1.2 Create virtual instance (LVM)

virt-install -d --name wheezy-lvm --ram 1024 \
             --disk path=/dev/vg0/wheezy-lvm,bus=virtio,cache=none \
             --network bridge=natbr,model=virtio,mac=00:00:00:00:50:02 \
             --os-variant debianWheezy \
             --os-type linux \
             --virt-type kvm \
             --vnc \
             -k de \
             -c /home/thomas/linux/isos/debian/wheezy/debian-7.0.0-amd64-netinst.iso

2 QCOW2 Installation

2.1 Create image disk

qemu-img create -f qcow2 -o preallocation=metadata /var/lib/libvirt/images/wheezy.qcow2 10G

2.2 Create virtual instance (Imagedisk)

virt-install -d --name wheezy-qcow --ram 1024 \
             --disk path=/var/lib/libvirt/images/wheezy.qcow2,bus=virtio,cache=none \
             --network bridge=natbr,model=virtio,mac=00:00:00:00:50:03 \
             --os-variant debianWheezy \
             --os-type linux \
             --virt-type kvm \
             --vnc \
             -k de \
             -c /home/thomas/linux/isos/debian/wheezy/debian-7.0.0-amd64-netinst.iso

3 DD Experiment

#!/bin/bash
# Written by Thomas Stibor <thomas@stibor.net>, Sun 12 May 2013 08:44:43 PM CEST
type="qcow2"           # which underlying device/image type is used
outputf="results.txt"  # file contains result for later processing, e.g. with R
tmpdata="/tmp/zero"    # dummy output file filled with zeros read from /dev/zero
blksize=$((2**20))     # block size of 1 MiB read from /dev/zero and written to $tmpdata 
R=15                   # number of experimental repetitions
j=0                    # helper variable
runtime=0              # helper variable for run-time
mibsecs=0              # helper variable for MiB/s 
truntime=0             # integrated run-time over R repetitions
tmibsecs=0             # integrated MiB/s over R repetitions

echo -e "size_MiB\truntime_secs\tMiB_per_secs\ttype" > $outputf
for (( i=7; i <= 11; i++ )) # 2^7 = 128 MiB, ..., 2^11 = 2048 MiB
do
  cval=$((2**i)) 
  for (( r = 1; r <= R; r++ )) # repeat experiment R times
  do 
    echo 3 > /proc/sys/vm/drop_caches                                             # clear free pagecache, dentries and inodes (requires root permission)
    result=$(dd if=/dev/zero of="$tmpdata" bs="$blksize" count="$cval" 2>&1)      # execute the benchmark command, here good old disk dump (dd), and capture its summary from stderr
    sync                                                                          # force changed blocks to disk, update the super block
    runtime=$(echo ${result} | awk '{print $12}')                                 # extract run-time sec. field (unquoted echo joins dd's output lines into one)
    mibsecs=$(echo ${result} | awk '{print $14}')                                 # extract throughput field (dd prints decimal MB/s here, the unit is not checked)
    truntime=$(echo "scale=5; $truntime + $runtime" | bc)                         # add run-time values up, use "bc" otherwise we concat strings
    tmibsecs=$(echo "scale=3; $tmibsecs + $mibsecs" | bc)                         # add MiB/s values up
    echo -e "$cval\t$runtime\t$mibsecs\t$type" >> $outputf                        # write plain measurements to file for later processing
  done
  avgruntime=$(echo "scale=5; $truntime / $R" | bc)
  avgmibs=$(echo "scale=3; $tmibsecs / $R" | bc)
  echo "size $cval MiB, average run-time $avgruntime sec, average MiB/s $avgmibs, experimental repetitions $R"
  truntime=0
  tmibsecs=0
  j=$((j + 1))  # arithmetic increment; a plain j+=1 would append the string "1"
done

rm $tmpdata
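
The fixed awk field numbers in the script ($12 and $14) match the dd summary layout on Debian Wheezy. Newer GNU coreutils releases add an IEC figure to that line, which shifts the fields. A minimal sketch of a more tolerant extraction that counts fields from the end of the last line; the function names dd_seconds and dd_rate are hypothetical, and GNU dd is assumed:

```shell
# Hypothetical helpers, not part of the original script.
# Both take dd's captured stderr text as their only argument and
# read the last line, e.g. "... copied, 0.5 s, 268 MB/s".
dd_seconds() { printf '%s\n' "$1" | tail -n 1 | awk '{ print $(NF-3) }'; }
dd_rate()    { printf '%s\n' "$1" | tail -n 1 | awk '{ print $(NF-1) }'; }

# Harmless demonstration: copy 16 KiB from /dev/zero to /dev/null.
summary=$(dd if=/dev/zero of=/dev/null bs=1024 count=16 2>&1)
echo "seconds=$(dd_seconds "$summary") rate=$(dd_rate "$summary")"
```

The unit of the rate (the last field) is still not checked, so very fast runs that dd reports in GB/s would need extra handling.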

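Run the bash script in each of the three environments, with type set to native, qcow2, or lvm2 accordingly, and concatenate the output into a single file results.txt (with only one header line), which is used as input for the following R script: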

# Written by Thomas Stibor <thomas@stibor.net>, Mon 13 May 2013 03:31:47 PM CEST
data <- read.delim("results.txt");

from.2 <- 7;
to.2   <- 11;

type <- data$type == "qcow2";
data.qcow2 <- data[type,];
type <- data$type == "lvm2";
data.lvm2 <- data[type,];
type <- data$type == "native";
data.native <- data[type,];

stats.qcow2  <- matrix(nrow = to.2 - from.2 + 1, ncol = 5);
stats.lvm2   <- matrix(nrow = to.2 - from.2 + 1, ncol = 5);
stats.native <- matrix(nrow = to.2 - from.2 + 1, ncol = 5);
j <- 1;
for (i in 2^(from.2:to.2)) {

  qcow2  <- data.qcow2[data.qcow2$size_MiB == i, ];
  lvm2   <- data.lvm2[data.lvm2$size_MiB == i, ];
  native <- data.native[data.native$size_MiB == i, ];

  stats.qcow2[j,1:3]  <- c(i, mean(qcow2$runtime_secs), sd(qcow2$runtime_secs));
  stats.lvm2[j,1:3]   <- c(i, mean(lvm2$runtime_secs), sd(lvm2$runtime_secs));
  stats.native[j,1:3] <- c(i, mean(native$runtime_secs), sd(native$runtime_secs));

  stats.qcow2[j,4:5]  <- c(mean(qcow2$MiB_per_secs), sd(qcow2$MiB_per_secs));
  stats.lvm2[j,4:5]   <- c(mean(lvm2$MiB_per_secs), sd(lvm2$MiB_per_secs));
  stats.native[j,4:5] <- c(mean(native$MiB_per_secs), sd(native$MiB_per_secs));

  j <- j + 1;
}

# Create dd runtime plot.
png("dd_runtime_secs.png");
plot(data.qcow2$size_MiB, data.qcow2$runtime_secs, log = "x",
     xlab = "dd size in MiB", ylab = "dd runtime in secs", xaxt="n",
     cex.lab = 1.25, cex = 1.25, cex.axis = 1.25,
     col = "red", pch = 18, ylim = c(min(data$runtime_secs),
                                     max(data$runtime_secs)));
axis(1, at = 2^(from.2:to.2), las = 2, cex.axis = 1.25);

legend('topleft',
       legend = c("lvm2","qcow2","native"),
       lty = c(1, 1, 1),
       lwd = c(2.5, 2.5, 2.5), col = c("blue","red","green"));

points(data.lvm2$size_MiB, data.lvm2$runtime_secs,
       col = "blue", pch = 19);

points(data.native$size_MiB, data.native$runtime_secs,
       col = "green", pch = 20);

# Plot mean values of each type and connect them with a line.
points(stats.qcow2[,1], stats.qcow2[,2], type = "o", col = "red", lwd = 2, pch = 10, cex = 2);
points(stats.qcow2[,1], stats.qcow2[,2], type = "l", col = "red", lwd = 2, pch = 10);

points(stats.lvm2[,1], stats.lvm2[,2], type = "o", col = "blue", lwd = 2, pch = 10, cex = 2);
points(stats.lvm2[,1], stats.lvm2[,2], type = "l", col = "blue", lwd = 2, pch = 10);

points(stats.native[,1], stats.native[,2], type = "o", col = "green", lwd = 2, pch = 10, cex = 2);
points(stats.native[,1], stats.native[,2], type = "l", col = "green", lwd = 2, pch = 10);

dev.off();

# Create dd MiB per seconds plot.
png("dd_mib_per_secs.png");
plot(data.qcow2$size_MiB, data.qcow2$MiB_per_secs, log = "x",
     xlab = "dd size in MiB", ylab = "dd MiB per secs", xaxt="n",
     cex.lab = 1.25, cex = 1.25, cex.axis = 1.25,
     col = "red", pch = 18, ylim = c(min(data$MiB_per_secs),
                                     max(data$MiB_per_secs)));
axis(1, at = 2^(from.2:to.2), las = 2, cex.axis = 1.25);

legend('topleft',
       legend = c("lvm2","qcow2","native"),
       lty = c(1, 1, 1),
       lwd = c(2.5, 2.5, 2.5), col = c("blue","red","green"));

points(data.lvm2$size_MiB, data.lvm2$MiB_per_secs,
       col = "blue", pch = 19);

points(data.native$size_MiB, data.native$MiB_per_secs,
       col = "green", pch = 20);

# Plot mean values of each type and connect them with a line.
points(stats.qcow2[,1], stats.qcow2[,4], type = "o", col = "red", lwd = 2, pch = 10, cex = 2);
points(stats.qcow2[,1], stats.qcow2[,4], type = "l", col = "red", lwd = 2, pch = 10);

points(stats.lvm2[,1], stats.lvm2[,4], type = "o", col = "blue", lwd = 2, pch = 10, cex = 2);
points(stats.lvm2[,1], stats.lvm2[,4], type = "l", col = "blue", lwd = 2, pch = 10);

points(stats.native[,1], stats.native[,4], type = "o", col = "green", lwd = 2, pch = 10, cex = 2);
points(stats.native[,1], stats.native[,4], type = "l", col = "green", lwd = 2, pch = 10);

dev.off();
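
The R script above reads a single results.txt that holds rows of all three types, whereas the benchmark script writes a fresh results.txt on every run. A minimal sketch for merging per-environment files while keeping only one header line; the function name merge_results and the file names in the usage comment are hypothetical:

```shell
# Hypothetical helper: merge_results OUTPUT INPUT...
# Copies the header of the first input once, then appends the data rows
# (everything after line 1) of every input. OUTPUT must not be one of the inputs.
merge_results() {
  out=$1
  shift
  head -n 1 "$1" > "$out"
  for f in "$@"; do
    tail -n +2 "$f" >> "$out"
  done
}

# Usage, assuming the three runs were saved under these names:
#   merge_results results.txt results_native.txt results_qcow2.txt results_lvm2.txt
```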

3.1 Results

The dd benchmark script is executed first on the host Wheezy system (labeled native in the plots), then in the qcow2 KVM instance, and finally in the lvm2 KVM instance. Surprisingly, the dd experiment turns out to be inappropriate as a simple (poor man's) benchmark: the qcow2 KVM instance outperforms the native system, which is "fishy".

http://web-docs.gsi.de/~tstibor/iozone/qcow.vs.lvm/dd_results.png
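
The per-size means shown in the plot can also be cross-checked without R. A minimal awk sketch, assuming a merged results.txt with the tab-separated columns written by the benchmark script; the function name summarize_results is hypothetical:

```shell
# Hypothetical helper: summarize_results FILE
# Prints one line per type and size: type, size_MiB, mean run-time,
# mean throughput, and number of repetitions (tab-separated).
summarize_results() {
  awk -F '\t' '
    $1 == "size_MiB" { next }              # skip header line(s)
    {
      key = $4 "\t" $1                     # group by type and size
      n[key]++
      rt[key] += $2
      tp[key] += $3
    }
    END {
      for (k in n)
        printf "%s\t%.3f\t%.3f\t%d\n", k, rt[k] / n[k], tp[k] / n[k], n[k]
    }' "$1" | sort -t "$(printf '\t')" -k1,1 -k2,2n
}

# Usage:
#   summarize_results results.txt
```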

Please note that different block sizes would of course need to be tested (here I used only 1 MiB). Moreover, I am roughly testing plain sequential write performance only, rather than random read/write and so on. Consequently, a proper benchmark is performed next.
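
Testing further block sizes mainly means keeping the total amount of data fixed while bs and count vary. A minimal sketch of such a sweep, assuming a small 8 MiB total and a scratch file from mktemp purely for illustration (the original experiment used 128 MiB to 2048 MiB and additionally dropped caches as root):

```shell
# Illustrative sweep, not the original experiment.
total=$((8 * 1024 * 1024))     # total bytes written per run (assumed, small on purpose)
tmpdata=$(mktemp)              # scratch file (assumed location)

for bs in 4096 65536 1048576; do
  count=$((total / bs))        # keep bs * count constant across runs
  # dd prints its statistics on stderr; keep only the final summary line
  summary=$(dd if=/dev/zero of="$tmpdata" bs="$bs" count="$count" 2>&1 | tail -n 1)
  echo "bs=$bs count=$count: $summary"
done

rm -f "$tmpdata"
```

As in the original script, dd returns once the data has been handed to the page cache. GNU dd's conv=fdatasync option, which the experiment above did not use, would include the flush to disk in the measured time.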

4 IOzone Experiment

The following IOzone command is executed in the qcow2 and lvm2 KVM instances to verify whether the results of the poor man's dd benchmark are indeed "fishy".

iozone -a 2G -R -b iozone_result.wks > iozone_result.txt

4.1 Results

The IOzone result files are statistically compared and visualized with iozone-results-comparator. The IOzone benchmark shows that the IO performance of the qcow2 and lvm2 KVM instances is roughly equal, with a slight advantage for the lvm2 KVM instance.

http://web-docs.gsi.de/~tstibor/iozone/qcow.vs.lvm/html.iozone.comparator/Compare-html/summary.png

The complete analysis and results are provided here: All results.

5 Experimental Data

All generated data is available here: Data

Date: 2013-05-14 12:16:38 CEST

Author: Thomas Stibor