[clug] Random Thought: support for Hot-code swap in the kernel

Tue Jun 30 01:04:12 GMT 2009

jm wrote:
> What would it take to do hot-code swapping at the OS level in Linux?
> 
> More fully, I was wondering what it would take to be able to have one 
> version of an application start up, inherit its state from another 
> version that is already running, have the old version shutdown and the 
> new version continue running without a noticeable break in service.
> 
> There are a number of languages which can do this. However, it's a 
> language specific feature and it relies on an image or virtual machine. 
> Update the vm and you lose the continuity. Ideally, you'd also be able 
> to migrate processes between machines so that you could stop one machine 
> to upgrade the OS. The Xen hypervisor, for example, supports the 
> migration of host OSes between machines when the host OS's filesystem is 
> mounted via nfs.
> 
> Ignoring such migration for the moment, ie limiting the application to 
> one host OS instance. What is currently available to make the transfer 
> of OS process state possible and what is missing?
> 
> If it wasn't for the process state in kernel space it would just be a 
> matter of having a set of functions in the application which would 
> detect the presence of the new app, serialise the state, send the state 
> to the new app, etc. It's things like sockets, file descriptors, etc 
> which screw this idea up.
> 
> Of course, there's also problems with resuming correct execution after 
> the upgrade and security which I haven't even thought about. One step at 
> a time.
> 
> Just thinking aloud,
> 
> Jeff.

So, there is a distinct difference between migrating from one _version_
of a given app to another and moving an _instance_ of an app from one VM
to another. Most VMs can support moving instances of any app from one VM
to another, sockets and all.

Migrating from one _version_ of an arbitrary app to another requires
complete co-operation from the app itself (i.e. the app programmers must
have allowed for this). There is simply no way that the OS can assume
that things like memory structures do not change between versions,
which makes this, for all intents and purposes, impossible for the OS
to do on its own. Serialising the data won't help if the new version
has new fields in its structures.
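
To make the new-fields problem concrete, here is a minimal sketch in
Python. Everything in it is hypothetical (the "user", "hits" and
"last_seen" fields, the function names, JSON as the wire format). It
only illustrates that the upgrade step has to be written by someone who
knows what the new field means, which the OS cannot know.

```python
import json

# Hypothetical state layouts, for illustration only:
#   v1: {"version": 1, "user": str, "hits": int}
#   v2: {"version": 2, "user": str, "hits": int, "last_seen": float or None}

CURRENT_VERSION = 2


def dump_state_v1(user, hits):
    """What the old (v1) program might send to its successor."""
    return json.dumps({"version": 1, "user": user, "hits": hits})


def _v1_to_v2(state):
    # "last_seen" did not exist in v1, so the new code has to choose a
    # default. Only the app's authors know what a sensible one is.
    upgraded = dict(state)
    upgraded["version"] = 2
    upgraded["last_seen"] = None
    return upgraded


# One upgrade step per old version, written by the app programmers.
MIGRATIONS = {1: _v1_to_v2}


def load_state(blob):
    """What the new (v2) program might do with the state it inherits."""
    state = json.loads(blob)
    while state["version"] != CURRENT_VERSION:
        step = MIGRATIONS.get(state["version"])
        if step is None:
            raise ValueError("no upgrade path from version %r" % state["version"])
        state = step(state)
    return state
```

For example, load_state(dump_state_v1("jm", 3)) gives the v2 layout with
"last_seen" set to None, while state tagged with a version the new code
has never heard of is refused rather than guessed at.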

Once the app programmers design for live migration, the choice of
language can be important, but most things can be dealt with through
clever design.
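
As one example of that kind of design, on the same host a co-operating
old process can hand its open file descriptors (sockets included) to its
successor over a Unix domain socket using SCM_RIGHTS. The sketch below
is in Python and the helper names (hand_over_fds, take_over_fds, demo)
are made up for illustration. The demo runs both ends in one process
and uses a pipe in place of a real listening socket, purely to keep it
self-contained.

```python
import array
import os
import socket


def hand_over_fds(channel, note, fds):
    """Old process: send open descriptors as SCM_RIGHTS ancillary data."""
    payload = array.array("i", fds)
    return channel.sendmsg(
        [note], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, payload)]
    )


def take_over_fds(channel, maxfds, notelen=1024):
    """New process: receive the descriptors the old process passed on."""
    fds = array.array("i")
    note, ancdata, _flags, _addr = channel.recvmsg(
        notelen, socket.CMSG_LEN(maxfds * fds.itemsize)
    )
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            # Drop any trailing partial integer before unpacking.
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
    return note, list(fds)


def demo():
    # Both "processes" live here for brevity; in practice the two ends
    # of the channel would belong to the old and the new version.
    old_end, new_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"still open")

    hand_over_fds(old_end, b"fds", [read_fd])
    os.close(read_fd)  # the old side lets go of its copy

    _note, received = take_over_fds(new_end, 1)
    data = os.read(received[0], 64)  # the new side reads via its own fd

    for fd in (received[0], write_fd):
        os.close(fd)
    old_end.close()
    new_end.close()
    return data
```

This only moves the descriptors themselves. Whatever the app knows
about each one (protocol state, buffered data, which client is on the
other end) still has to be serialised and understood by the new
version.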

Cheers,

Bob Edwards.