Cloning a VirtualBox openSUSE image

I recently tried to clone an openSUSE VirtualBox image. I thought it would be as simple as copying the virtual disk image (vdi) file and creating a new virtual machine based on it. So that’s what I did. But when I tried to attach the new hard disk file to the virtual machine, VirtualBox gave me the following error:

Apparently a new virtual disk image in VirtualBox needs a unique identifier. After some research I found out that it’s possible to make a clone of a hard disk that gets a different unique identifier. Unfortunately that’s not the whole story: you have to make some additional changes to the clone to make it all work. So here are the steps you need to follow to make a successful clone.

1. Clone the hard disk

To clone the hard disk, open a terminal window and issue the following command:

VBoxManage clonevdi <original>.vdi <clone>.vdi
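
As a rough sketch of this step, the script below clones a disk image and then prints the clone's details so you can check that its UUID differs from the original's. The file names and the `run` helper are hypothetical and only there for illustration. The subcommand names are the ones from the VirtualBox release I used; as far as I know, newer releases document them as clonemedium and showmediuminfo.

```shell
#!/bin/sh
# Sketch only: the file names below are placeholders, not real images.
ORIGINAL="${ORIGINAL:-opensuse.vdi}"
CLONE="${CLONE:-opensuse-clone.vdi}"

# run CMD ARGS...: hypothetical helper that prints the command and
# executes it only when the program is actually installed.
run() {
    echo "+ $*"
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    fi
}

# Clone the disk; VirtualBox assigns the copy a new UUID.
run VBoxManage clonevdi "$ORIGINAL" "$CLONE"

# Show the clone's details; its UUID should differ from the original's.
run VBoxManage showhdinfo "$CLONE"
```

On a machine without VirtualBox the script only echoes the two commands, which makes it safe to try out first.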

2. Create a new Virtual Machine

Now create a new virtual machine with basically the same settings as the original virtual machine and couple it to the new cloned virtual disk image. As this image has been given a new unique identifier, you should have no problem registering it now.

3. Alter the hard disk identifiers

Having cloned the virtual disk image, I thought I was ready to roll. But on startup I encountered the following problem:

Apparently part of the UUID of the virtual disk image is used to identify the hard disk on startup. We have to change these references to their appropriate new ids.

3a. Startup in rescue mode

To start up the openSUSE VM in rescue mode, insert the ISO file you used for installation into the CD drive of the VM. Upon startup select the Rescue System option from the menu.

3b. Mount the hard disk

Log in as root and mount the hard disk (on my system this was /dev/sda2; this could be different on your system) via the following command:

mount /dev/sda2 /mnt

3c. Alter the identifiers

First find out what the new identifier of the hard disk should be. Issue the following command:

hdparm -i /dev/sda

Note the identifier called SerialNo. This is the one you need. On my system it was VBa79c17fb-f28bb7c1.
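
If you'd rather not copy the serial by hand, something like the sketch below should pull it out of the hdparm output. It assumes the output contains a field of the form SerialNo=<value>, as it did on my system; `extract_serial` is a hypothetical helper name, and /dev/sda is simply the disk from this post.

```shell
#!/bin/sh
# extract_serial: read `hdparm -i` output on stdin and print the value
# of the SerialNo field (assumed format: "..., SerialNo=<value>").
extract_serial() {
    sed -n 's/.*SerialNo=\([^ ,]*\).*/\1/p'
}

# In the rescue system; assumes the disk is /dev/sda, as in this post.
if [ -b /dev/sda ] && command -v hdparm >/dev/null 2>&1; then
    NEW_SERIAL=$(hdparm -i /dev/sda | extract_serial)
    echo "New serial: $NEW_SERIAL"
fi
```
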
Now there are two files you need to alter. First edit the file /mnt/etc/fstab and replace every identifier between /dev/disk/by-id/ata-VBOX_HARDDISK_ and -partX (where X is the partition number) with the new identifier.
Next make the corresponding changes to the file /mnt/boot/grub/menu.lst.
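
Editing both files by hand is error-prone, so here is a minimal sketch of doing the replacement with sed. It assumes the old identifiers have the same VBxxxxxxxx-xxxxxxxx shape as the serial shown above and that the root partition is mounted on /mnt; `rewrite_disk_ids` is a hypothetical helper, and NEW_SERIAL defaults to the value from my system, so set it to your own.

```shell
#!/bin/sh
# NEW_SERIAL: the SerialNo reported by hdparm (default is the value from this post).
NEW_SERIAL="${NEW_SERIAL:-VBa79c17fb-f28bb7c1}"

# rewrite_disk_ids FILE: hypothetical helper. Keeps a copy in FILE.bak and
# replaces every old VirtualBox disk identifier in FILE with NEW_SERIAL.
rewrite_disk_ids() {
    cp "$1" "$1.bak" &&
    sed "s|ata-VBOX_HARDDISK_VB[0-9a-f]*-[0-9a-f]*|ata-VBOX_HARDDISK_${NEW_SERIAL}|g" \
        "$1.bak" > "$1"
}

# Only touch the files if they exist, i.e. when run in the rescue system
# with the root partition mounted on /mnt.
for f in /mnt/etc/fstab /mnt/boot/grub/menu.lst; do
    if [ -f "$f" ]; then
        rewrite_disk_ids "$f"
    fi
done
```

The -partX suffix is left untouched, and the .bak copies let you compare or restore the originals if something looks wrong.
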
After this you can reboot the system:

shutdown -r now

4. Repair your network settings

If you made no typos, openSUSE should start up without problems. There’s one piece of configuration left to do, though: cloning has messed up the network configuration. This can easily be repaired via the YaST GUI tool. Open it and select Network Devices > Network Settings. You should see two Ethernet controllers, one of which is not configured. Configure this controller with the default settings and delete the other one. Now your clone is ready for use.

Devoxx 2010 impressions – day 2

On my second and (unfortunately) last day at Devoxx I visited two completely different talks.

The first talk was about Cassandra, one of the many NoSQL databases that have emerged in recent years. NoSQL was definitely one of the biggest topics at Devoxx: four university slots were reserved for it. Apart from the Cassandra talk, there were talks on Hadoop, MongoDB and HBase.

The second talk discussed two of the most used JavaScript libraries today, Prototype and jQuery.

Cassandra by Example

by Jonathan Ellis

Cassandra is one of the many NoSQL databases around. It offers scalability, reliability and high availability, and it has good clustering and failover capabilities. It also integrates some Hadoop functionality for analyzing its datastore; currently it has support for Pig and Hive. The scalability and reliability come at a price though. Cassandra doesn’t support ACID transactions and has only limited support for ad-hoc queries (hence the name NoSQL). This makes it especially suitable for very large internet applications where ACID isn’t that big a deal. Applications like Twitter and Facebook come to mind.

The idea behind Cassandra (and other NoSQL databases) is that IO should be optimized. Data that is closely related is stored (often redundantly) close together on disk, thus significantly reducing IO overhead. Compared to relational databases, where scaling is accomplished predominantly by adding CPUs and memory, Cassandra scales by optimizing IO, which comes at the cost of increased disk space. But since disk space has become relatively cheap these days compared to memory and processors, this is a fair price to pay.

There is another price to pay however. As mentioned, Cassandra stores a lot of data redundantly. Data related to a particular query, for example, is stored in one row. This greatly increases complexity. Cassandra doesn’t support referential integrity, so there’s no such thing as a cascaded delete, for example. It’s up to the application (hence the programmer) to make sure that the stored data remains consistent with the data model. And as mentioned before, there is only limited support for ad-hoc querying.

A good deal of the talk consisted of examples using Cassandra’s shell client and Hector, a Java API for Cassandra. The examples were pretty complex, and coding against the API comes with a lot of boilerplate. Jonathan made an interesting comparison: he compared Hector with JDBC (for relational databases) and foresaw a JPA-like API in the near future. Now this would be really interesting from a JEE programmer’s point of view. JPA could abstract away the fact that you’re programming against a NoSQL database! All the examples Jonathan showed belonged to an online Twitter-like example application called Twissandra. The example project regarding Hector is called Twissjava.

At the end Jonathan mentioned some more advanced features of Cassandra:

  • Batch insertions;
  • Secondary indexes, i.e. the ability to add indexes on additional columns in a row;
  • SuperColumns, which are basically maps of columns, used to denormalize data to avoid extra queries;
  • RipCord, a management suite for Cassandra;
  • JMX monitoring.

All in all, a very interesting talk. I’m definitely going to dive into this topic in the near future.

And here’s a link to the slides.

Ajax Library Smack down: Prototype vs. jQuery

by Nathaniel Schutta

In the first half of his very inspiring talk Nathaniel Schutta compared two of the most used JavaScript libraries, i.e. Prototype and jQuery. Both have their pros and cons and which one is best suited for the job is mostly a matter of taste.

Both libraries offer a lot of the same goodies:

  • cross browser abstractions;
  • simplified AJAX;
  • CSS selectors;
  • event handling;
  • widgets, effects, animations;
  • JavaScript utilities.

Both have excellent online documentation available, are widely used, have good community support and are very small libraries.

Apart from the similarities, there are also a lot of differences between the two. According to Nathaniel, Prototype has been developed from a programmer’s viewpoint (API centric), whereas jQuery is more focussed on HTML elements. Here’s a list of the pros and cons.

Prototype pros:

  • adds useful functions to core elements (although on very large pages this can cause a significant memory footprint);
  • widgets and effects available via script.aculo.us;
  • Ruby-flavored JavaScript;
  • widely used.

Prototype cons:

  • no minified version;
  • performance not always a priority;
  • pollutes the global namespace.

jQuery pros:

  • focussed on HTML elements;
  • doesn’t pollute global namespace;
  • DOM traversal is a snap;
  • extensive array of plugins available.

jQuery cons:

  • parameter ordering in APIs not always intuitive;
  • plugins required for a variety of functionality;
  • some functions reassign this.

I’m definitely not an expert on either of these libraries, but after Nathaniel’s comparison I think jQuery has a slight advantage. This is because it was designed with plugins in mind from the start. There are a lot of plugin libraries available, and you can pick the ones you need and omit the ones you don’t. In Prototype you only have two options: with or without script.aculo.us.

In the second half of his talk he showed some nice examples of using jQuery. One of the points he made is that, contrary to popular opinion, JavaScript isn’t that hard a language. And indeed all his examples were pretty simple. He also stressed that code should be self-explanatory, so the name of a method, for example, is very important and should state clearly and beyond debate what the method actually does. And of course on more than one occasion he made the case that whether or not to use one of the open-source JavaScript libraries is a no-brainer.
All of this is probably stating the obvious to most programmers, but I still honestly believe that in practice a lot of these rules are violated.

Here’s a link to the slides.
