[clug] Random Thought: support for Hot-code swap in the kernel

Tue Jun 30 02:40:32 GMT 2009

jm <jeffm at ghostgun.com> writes:
> Daniel Pittman wrote:
>> jm <jeffm at ghostgun.com> writes:
>>
>>> What would it take to do hot-code swapping at the OS level in Linux?
>>
>> For the kernel?  Installing the ksplice patches, probably, although you can
>> hand-write the patch code or potentially find other options.
>
> Maybe I should have used the words user-space to avoid confusion with
> kernel-space.

I think you clarified this later, and I considered going back and removing the
comments, but figured that it wasn't bad to leave them in since you were after
general discussion on the topic. :)

> I was wondering whether it is possible to do this without a VM for the
> application to run on, ie the application runs directly on the OS, just as
> /bin/ls does.

In case that wasn't later made clear: absolutely.  You can do it internally,
as I suggested below, or you can use ptrace to inject executable code into the
running application from outside.

These tools give you the capabilities required to do this, although they ...

>>> More fully, I was wondering what it would take to be able to have one
>>> version of an application start up, inherit its state from another version
>>> that is already running, have the old version shutdown and the new version
>>> continue running without a noticeable break in service.
>>
>> Write it in a real, sane language.
>
> eg, Erlang, Lisp. Though sane is to a certain extent subjective :-).

... are made enormously easier to use through a language that supports the
activity, or an application that rewrites those same capabilities on a less
advanced base.  (eg: C can do anything that CL or Erlang can do, if you write
enough code.)

[...]

>> Not if the VM supports that.  IIRC, an Erlang application written
>> appropriately can have multiple instances on a single machine[1], and you
>> can shut down and restart the individual components.
>
> only 2 versions of code may be run at once - the old version and the new
> version.  There is SMP support in the current release of Erlang.

My knowledge may be out of date; I don't do any professional work with Erlang,
so what I know comes from playing around at home.  Anyway, the basic idea,
that you can run multiple cooperating instances and use those to migrate,
should stand.

(Specifically, I am confident of that, because Erlang targets systems with
 zero percent downtime, and achieves it, in production.  When run by experts. ;)

[...]

>>> If it wasn't for the process state in kernel space it would just be a matter
>>> of having a set of functions in the application which would detect the
>>> presence of the new app, serialise the state, send the state to the new app,
>>> etc. It's things like sockets, file descriptors, etc which screw this idea
>>> up.
>>
>> All of those are inheritable over exec, with appropriate care.  Heck, you
>> don't even need to ask the kernel for permission: just bind a trivial core
>> and an ELF dynamic linker in, then call down to that when you want to
>> restart.
>>
>> You can serialize in memory, unmap all the other code, map in new segments
>> and dynamically link appropriately, then return to the "application" rather
>> than the "restart" portion without having to do anything to the kernel
>> process context.
>>
>> Then, of course, you need to deal with any application level data structure
>> changes while you resume. :)
>
> Alright, I get the call to exec, because you inherit access to the parent
> process's kernel structures (sockets, etc).
>
> You then serialise the required data and send that to the new version.

*nod*
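
A minimal sketch of the exec hand-over, assuming the old image passes the
descriptor number to the new one through an environment variable.  The
variable name HANDOVER_FD and the helper names are hypothetical, made up for
this example; the only real requirement is that FD_CLOEXEC is clear on the
descriptor before exec.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical environment variable that tells the new image which
 * descriptor it inherited; the name is an assumption for this sketch. */
#define HANDOVER_ENV "HANDOVER_FD"

/* Clear FD_CLOEXEC so the descriptor stays open across exec.
 * Returns 0 on success, -1 on error. */
int keep_fd_across_exec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC);
}

/* Replace the running image with new_path, handing over fd.
 * Only returns (with -1) if something failed. */
int reexec_with_fd(const char *new_path, char *const argv[], int fd)
{
    char buf[16];

    if (keep_fd_across_exec(fd) == -1)
        return -1;
    snprintf(buf, sizeof buf, "%d", fd);
    if (setenv(HANDOVER_ENV, buf, 1) == -1)
        return -1;
    execv(new_path, argv);
    return -1;              /* execv only returns on failure */
}

/* Called early in the new image: returns the inherited descriptor,
 * or -1 if this is a cold start. */
int inherited_fd(void)
{
    const char *s = getenv(HANDOVER_ENV);
    return s ? atoi(s) : -1;
}
```

The new version would call inherited_fd() first thing and, if it gets a
descriptor back, carry on using the socket instead of opening a fresh one.
Application state that lives only in memory still has to be serialised and
passed along separately, for example through a pipe or a temporary file.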

> I'm unfamiliar with unmap/map except in the general sense, not having used
> them myself. Can you clarify what you're unmapping and mapping in?

The code, my boy.  The code.

Look, you know how the ELF interpreter works, right?  It reads the ELF segment
descriptors from the files, maps them into memory, adjusts permissions on the
pages, then performs dynamic linking to join all the bits together, right?

When you want to change the code version you just reverse the process: ensure
that everything stops execution somewhere safe, serialize or mark your data as
needed, unmap the code and data as appropriate (exempting the tiny core of the
"code upgrade manager"), find the new code, then act like the ELF interpreter,
except preserving what is presently in memory.
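
As a rough illustration of just the unmap-and-map step, here is a sketch that
swaps one file-backed mapping for another at the same address.  The function
name map_segment is hypothetical, and this is nowhere near a full loader: a
real one would walk the ELF program headers, map every loadable segment with
the right permissions, and apply relocations before jumping into anything.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Map len bytes of the file at path, starting at page-aligned offset off.
 * If addr is non-NULL it must be page-aligned, and the new mapping
 * replaces whatever was mapped there (MAP_FIXED); that is how an old
 * segment gets swapped for a new one without a separate munmap call.
 * Returns the mapping address, or MAP_FAILED on error. */
void *map_segment(void *addr, size_t len, const char *path, off_t off,
                  int prot)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return MAP_FAILED;

    int flags = MAP_PRIVATE | (addr ? MAP_FIXED : 0);
    void *p = mmap(addr, len, prot, flags, fd, off);

    close(fd);      /* the mapping survives closing the descriptor */
    return p;
}
```

For code you would pass PROT_READ | PROT_EXEC; calling it once with a NULL
address maps the first version, and calling it again with the returned
address and the new file replaces it in place.  The "code upgrade manager"
itself must of course live outside the range being replaced.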

For sanity I would probably pick a dynamic entry point for resume that acted
like the default dynamic libc startup point, in that it called into (some
equivalent of) main, which would know that it needed to reconstruct all the
state, then call into my main loop.

main would, obviously, construct rather than reconstruct state, then call into
the same code path.
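
A small sketch of that two-entry-point shape, assuming the saved state is a
flat struct copied byte for byte.  The struct and the names app_main,
app_resume and main_loop are hypothetical, invented for this example; a real
program would need a versioned format so the new code can migrate old state.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical application state; a real program would version this. */
struct app_state {
    unsigned long requests_served;
    int listen_fd;
};

/* Stand-in for the real main loop; both entry points end up here. */
static int main_loop(struct app_state *st)
{
    st->requests_served++;
    return 0;
}

/* Cold start: construct state from scratch, then enter the loop. */
int app_main(struct app_state *st)
{
    st->requests_served = 0;
    st->listen_fd = -1;
    return main_loop(st);
}

/* Resume after an upgrade: reconstruct state from the saved blob,
 * then enter the very same loop.  Returns -1 if the blob does not
 * match the current layout. */
int app_resume(struct app_state *st, const void *blob, size_t len)
{
    if (len != sizeof *st)
        return -1;  /* layout changed: needs a real migration step */
    memcpy(st, blob, sizeof *st);
    return main_loop(st);
}
```

The size check is the crudest possible guard against data structure changes
between versions; it only shows where that handling would have to go.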

> How would you pick where to start? Have a set of marshall points throughout
> your program at which the program may update and resume?

Pretty much, yeah.

> and this information is conveyed to the new version, which resumes at that
> point. This would imply an event loop style program would it not?

It isn't strictly necessary to do it that way; it would be reasonably sane to
have an FSM based application do this check on all edge transitions, or you
could just inject "checkpoint" calls somehow.

An event loop style program would be *MUCH* easier, because there is a well
defined "the application is at rest" point, but there is no specific technical
reason you couldn't do this in whatever style of application you wanted.
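
To make the "at rest" point concrete, here is a sketch of an event loop that
checks for an upgrade request only between events.  It assumes the request
arrives as SIGUSR1, which is purely a choice made for this example, and the
names event_loop, install_upgrade_trigger and count_down are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

/* Set from the signal handler, read only at the rest point. */
static volatile sig_atomic_t upgrade_requested;

static void on_upgrade_signal(int sig)
{
    (void)sig;
    upgrade_requested = 1;
}

/* Ask for SIGUSR1 to mean "upgrade when convenient".
 * Returns 0 on success, -1 on error. */
int install_upgrade_trigger(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_upgrade_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Run until the event source is exhausted or an upgrade is requested.
 * next_event handles one event and returns 0 when there are no more.
 * Returns 1 if the loop stopped for an upgrade, 0 otherwise. */
int event_loop(int (*next_event)(void *), void *ctx)
{
    for (;;) {
        /* Rest point: no event is half-processed here, so this is
         * where it is safe to serialise state and swap code. */
        if (upgrade_requested)
            return 1;
        if (next_event(ctx) == 0)
            return 0;
    }
}

/* Toy event source for demonstration: ctx points at an int holding
 * the number of events still pending. */
int count_down(void *ctx)
{
    int *remaining = ctx;

    if (*remaining == 0)
        return 0;
    (*remaining)--;
    return 1;
}
```

An FSM based program could make the same check on each edge transition; the
event loop is just the case where there is exactly one obvious place for it.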

>> Anyway, nothing you are asking for is even particularly difficult.  The reason
>> that it is more common in VM hosted environments is that they almost always
>> inherently build in the dynamic linker, and probably even the compiler, not
>> because of any inherent difference in their abilities in this area.
>
> Would it be possible to produce a C-library to do this "today" then?

Sure.  Not easy, and doing something that supported more than one programming
style would be much, much, much harder, but there is nothing to stop you.

Heck, there is nothing to stop it running on much older kernels just as
effectively, because none of these facilities are new or different; most of
them go all the way back to prehistoric versions of Unix.

Regards,
        Daniel

Personally, I like the Erlang style approach of allowing multiple concurrent
versions during the transition, run in different processes.  Because it is so
much easier to understand, it costs a lot less to build.

Failing that, picking a language environment that has already solved the
problem will save both time and money, because doing it yourself is possible,
but not easy.