rsync lib
John E. Malmberg
wb8tyw at qsl.net
Mon Jul 11 03:31:14 GMT 2005
Olivier Thauvin wrote:
> On Tuesday 5 July 2005 17:48, John E. Malmberg wrote:
>
>>Olivier Thauvin wrote:
>>
>>>Currently there is no rsync library for the rsync network functions. The
>>>librsync project provides functions for the file access/md4 part.
>>>
>>>That is exactly the reason why I just started a rewrite of rsync using a
>>>struct, to create a real library.
>>>
> It is a cvs co made recently, true.
>
> The rsync code uses many global and static variables, which is not usable for
> a library. So to make a lib, I took rsync, created a struct and its typedef
> (see rspeer.h and rspeer.c) and I am removing all global/static variables from
> the code to put them inside the rspeer struct.
There are about 300 to 400 global or static variables in rsync.
Of these, about 75% appear to be set at startup and only read by the
other processes that are created.
Another group of them is only ever used by a single process.
> The current code doesn't work and doesn't compile at this time. Note that I
> started this one or two weeks ago and I still have to modify all functions to
> pass the rspeer struct as an argument:
>
> -int allow_access(char *addr, char *host, char *allow_list, char *deny_list)
> +int allow_access(rspeer rsp, char *addr, char *host, char *allow_list, char *deny_list)
Since I was not looking at using this in a reentrant library, I took the
approach that only an integer thread index was needed to find the correct
variable. Only the variables that were used by more than one process
after the additional processes were forked needed to be changed.
I probably do not have all of the variables classified correctly yet.
It would help if their names were tagged with their use, or if each had a
comment describing it.
I am told on this list that only three processes are active, and in
getting just the client to work, I am seeing only two. So I only
need an array of three structures, which I set up as a static. I
still have to find out what the third process is used for and how to do
the same on OpenVMS without a fork() routine.
A routine could get the thread index either by having it passed as a
parameter, or by making a POSIX thread call to find it out.
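As a rough sketch of the second option, the index could be kept in POSIX
thread-specific data. The calls pthread_key_create(), pthread_setspecific()
and pthread_getspecific() are standard; the names set_thread_index(),
get_thread_index() and worker() are hypothetical and do not come from the
rsync source or from the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static pthread_key_t index_key;
static pthread_once_t index_once = PTHREAD_ONCE_INIT;
static int key_rc = -1;              /* result of pthread_key_create() */

static void make_key(void)
{
    key_rc = pthread_key_create(&index_key, NULL);
}

/* Hypothetical helper: each thread calls this once when it starts.
 * The value stored is idx + 1, so that NULL means "never assigned". */
static int set_thread_index(int idx)
{
    assert(idx >= 0);
    pthread_once(&index_once, make_key);
    if (key_rc != 0)
        return key_rc;
    return pthread_setspecific(index_key, (void *)(intptr_t)(idx + 1));
}

/* Hypothetical helper: returns the caller's index, or -1 if it has none. */
static int get_thread_index(void)
{
    pthread_once(&index_once, make_key);
    if (key_rc != 0)
        return -1;
    return (int)(intptr_t)pthread_getspecific(index_key) - 1;
}

/* Example thread body: assign the index passed in, then read it back. */
static void *worker(void *arg)
{
    set_thread_index((int)(intptr_t)arg);
    return (void *)(intptr_t)get_thread_index();
}
```

A routine such as write_int() could then call get_thread_index() at its top
instead of taking an extra parameter.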
A library could also use a process id as an index for storage maintained
internally, with some care for garbage collection. On UNIX it appears
that each image is run with its own process id. On OpenVMS that is not
the case, so a different method is needed to detect that the calling
image exited without cleaning up.
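For the UNIX case only, a minimal sketch of storage indexed by process id
might look like the following. It assumes that kill(pid, 0) failing with
ESRCH is an acceptable test for "owner has exited", and it ignores process id
reuse. The names pid_slot, slot_for_pid(), reap_dead_slots() and my_slot()
are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_SLOTS 3                  /* assumption: three rsync processes */

struct pid_slot {
    pid_t owner;                     /* 0 means the slot is free */
};

static struct pid_slot slots[MAX_SLOTS];

/* Free every slot whose owning process no longer exists.
 * Returns the number of slots that were freed. */
static int reap_dead_slots(void)
{
    int i, freed = 0;

    for (i = 0; i < MAX_SLOTS; i++) {
        if (slots[i].owner != 0 &&
            kill(slots[i].owner, 0) == -1 && errno == ESRCH) {
            slots[i].owner = 0;
            freed++;
        }
    }
    return freed;
}

/* Return the slot index for pid, claiming a free slot if needed.
 * Returns -1 when every slot belongs to a live process. */
static int slot_for_pid(pid_t pid)
{
    int i, free_slot = -1;

    assert(pid > 0);
    for (i = 0; i < MAX_SLOTS; i++) {
        if (slots[i].owner == pid)
            return i;
        if (slots[i].owner == 0 && free_slot < 0)
            free_slot = i;
    }
    if (free_slot < 0 && reap_dead_slots() > 0) {
        for (i = 0; i < MAX_SLOTS; i++) {
            if (slots[i].owner == 0) {
                free_slot = i;
                break;
            }
        }
    }
    if (free_slot >= 0)
        slots[free_slot].owner = pid;
    return free_slot;
}

/* Convenience wrapper for the calling process. */
static int my_slot(void)
{
    return slot_for_pid(getpid());
}
```

This does not help on OpenVMS, where several images can run under one process
id, so the owner test would have to be replaced there.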
Having many of the routines like write_int() look up the thread index
instead of getting it passed significantly reduces the amount of
source code that needs to be changed and the changes that need to be tracked.
In many cases, only the top of a file needs to be changed.
As per your example, access.c, which contains allow_access, is one of the
modules that I did not need to change at all, since it never references
any of the global variables directly and has no local static variables.
Compiler macros are also used to minimize the code changes.
For example:
int am_sender; is a member of a structure of global thread-specific
variables.
When I compile for POSIX threads, a macro gets defined:
#define am_sender main_global[thread].am_sender
So none of the references in the code to am_sender need to be changed.
These macros and structures reside in one module, thread_global.h.
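A cut-down illustration of that macro arrangement follows. Only am_sender,
main_global and the macro itself are taken from the description above; the
structure name, the array size and the two small routines are assumptions
made up for this sketch, and the real thread_global.h may look different.

```c
#include <assert.h>

#define MAX_THREADS 3                /* assumption: one slot per process */

/* Hypothetical layout of the per-thread globals. */
struct thread_globals {
    int am_sender;
};

static struct thread_globals main_global[MAX_THREADS];

/* This must come after the structure definition, otherwise the member
 * declaration above would be rewritten by the macro as well.  Inside the
 * expansion the name am_sender is not expanded a second time. */
#define am_sender main_global[thread].am_sender

/* Existing code keeps using the bare name; it only needs an integer
 * called `thread` in scope, here passed as a parameter. */
static void set_role(int thread, int sending)
{
    assert(thread >= 0 && thread < MAX_THREADS);
    am_sender = sending;
}

static int is_sender(int thread)
{
    assert(thread >= 0 && thread < MAX_THREADS);
    return am_sender;
}
```

A routine that looks the index up itself would declare a local
int thread at its top instead of taking the parameter.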
>
>>I have an interest in such a project as the normal user interface for
>>OpenVMS is a bit different than on UNIX.
>>
>>In order for me to use such a library, all routines must be thread safe,
>>and allow a single process to do the work.
>
> I currently do not plan to change the rsync code other than to make it work
> as a library, but I am at the first step of the project, and open to all
> improvements/suggestions.
>
> I am open to any help too.
See http://encompasserve.org/rsync_pthread_pre.zip.
It contains GNU unified diffs against a snapshot of the rsync source taken
today, plus some additional files.
The *.gdiff files are difference files; the *_xxx.new files need to be
renamed to *.xxx. The resulting files matching *_vms_*.*, *.com, or *.mms
are only for OpenVMS use.
Some routines now take an integer thread parameter; others look it up.
I also made some changes as ANSI C will not allow unsigned char and char
types to be mixed without a cast, and fixed some other things that VMS
will need.
The resulting code with the macro USE_PTHREADS defined currently
compiles and links on my OpenVMS 8.2 system.
It will probably not run with USE_PTHREADS because I still have to write
the routine that looks up the thread index. This is a thread index
number that I will assign to a thread when it starts, as I cannot
predict what the actual thread number would be. I also have to add code
to set the stack size for each thread. By default on OpenVMS Alpha,
only an 8K stack per thread is allocated, and that is not enough for rsync.
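Setting the stack size would normally go through a thread attribute object.
The sketch below uses the standard pthread_attr_setstacksize() call; the
256K figure, the helper name start_with_stack() and the touch_buffer()
thread body are assumptions for illustration, not measured rsync
requirements.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Assumption: 256K is a guess, not a measured requirement for rsync. */
#define RSYNC_THREAD_STACK (256 * 1024)

/* Hypothetical helper: create a thread with an explicit stack size.
 * Returns 0 on success or a pthread error number. */
static int start_with_stack(pthread_t *tid, void *(*fn)(void *),
                            void *arg, size_t stack_bytes)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return rc;
}

/* Example thread body: uses more stack than an 8K default would allow. */
static void *touch_buffer(void *arg)
{
    volatile char buf[32 * 1024];
    size_t i;

    for (i = 0; i < sizeof buf; i++)
        buf[i] = (char)i;
    return arg;
}
```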
Without that macro defined, it should build on a UNIX/Linux system and
produce the same binary as the snapshot it was taken from would.
Currently all my changes to existing routines are done by writing VMS
specific editor macros. The rsync.mms is a VMS specific type of
makefile that has been set up to use them. I have not included the
*.tpu files, as I mainly wanted to make these difference files available
for your inspection.
I will probably remove the files from that server in a few weeks because
of quota limitations on that volume, and I do not know how long it will
be before they are out of date.
My current broadband ISP prohibits me setting up my own public server,
and they have no competition.
-John
wb8tyw at qsl.net
Personal Opinion Only