RAM speedup

Rupert Gallagher ruga at protonmail.com
Tue Jun 30 11:08:57 UTC 2020


------- Original Message -------
On Sunday 28 June 2020 21:09, Matthias Schniedermeyer <ms at citd.de> wrote:

> > > Complete rsync commandline?
> >
> > /usr/local/bin/rsync --recursive --links --times --modify-window=1 \
> >     --devices --specials --update --owner --group --perms \
> >     --delete --delete-before --delete-excluded \
> >     --exclude-from=/etc/excluded_from_backup.conf \
> >     --numeric-ids --outbuf=Block --inplace \
> >     --link-dest=/backup/latest/ /archive /backup

> Linkdest means "more metadata-operations".

More metadata operations, yes, but the end result is still a clear advantage over brute copying.

> This is a hardlinked backup-store?

Yes

> With or without deletion of older backups?

When the backup disk is full, we either re-format and start fresh (if there is a single unit), or the backup continues on a second disk with a fresh first copy of the archive.

Your thought that we might be deleting an older backup while still reading through a now-broken hard link shows proper debugging spirit.

> What is the age of that backup store?

This is a fresh unit.

> Hardlink-farms age a filesystem pretty severely, IOW after some time the free space gets heavily fragmented, and the HDD has to seek like hell to piece the metadata & file content into many small holes.

The idea with a large disk is to keep adding to it and read from it only occasionally. I do not understand what you mean by fragmentation of free space; the drive does its job and serves its purpose. Reading a file directly is easier on the drive than following bread crumbs first, but that is the price of fast incremental backups: the first full copy is hell, every later run is fast. Compare that to copying everything each time, or to copying only the new files and rebuilding the full image by hand. I prefer hard links; they make life easier.
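
For the record, the rotation is essentially the textbook --link-dest pattern; a minimal sketch, with illustrative directory names rather than my exact layout. Files unchanged since the previous snapshot become hard links instead of copies, so each run only writes the delta:

    today=$(date +%Y-%m-%d)
    # unchanged files are hardlinked against the previous snapshot
    rsync -a --link-dest=/backup/latest /archive/ /backup/$today/ \
        && ln -sfn /backup/$today /backup/latest

Each dated directory then looks like a complete copy, while costing only the changed files in space and write time.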

> Personally I only use hardlink-farms on SSDs nowadays; HDDs "don't really like" hardlink-farms.

For backups and LTANS, the cost-benefit analysis is still in favour of spinning disks.

> > > What hardware? (From the numbers it is only clear that you seem to talk about HDDs.)
> > > What HDDs?
> >
> > source:
> > ST2000NX0403 sata hdd
> > Writing speed : 117 MB/s
> > Reading speed : 99 MB/s
> > destination:
> > ST5000LM000-2AN1 sata hdd
> > Writing speed : 74 MB/s
> > Reading speed : 89 MB/s

I forgot to mention: those are measured speeds, writing a single file larger than RAM, so 16+ GB in this case. Reading and copying smaller files is much faster (except with rsync, which scores 25 MB/s).
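
The measurement itself is nothing fancy; roughly the following, with illustrative paths, and a test file deliberately larger than RAM so the buffer cache cannot flatter the numbers:

    # write test: 16 GiB of zeros, enough to exceed RAM here
    dd if=/dev/zero of=/archive/speedtest bs=1m count=16384
    # read test: stream the same file back
    dd if=/archive/speedtest of=/dev/null bs=1m
    rm /archive/speedtest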

> > > What computer? (Laptop? Desktop? Server? Raspberry Pi? Age?)
> >
> > Supermicro A2SDi-4C-HLN4F, newish
>
> That mainboard has an Intel Atom C3558 soldered to it. That's a 2017 Atom at 2.2 GHz.

Yeah, you tell me. Supermicro uses what Intel provides, and there is no series 4 Atom yet.

> I have no personal experience with Atom CPUs, so I can only say generically: "not exactly built for speed".

It is built for its purpose, and it does it very well. There is no better workhorse at 16 W.

> > > What "Buses"? ( a) Any modern "bus" is NOT saturated by those numbers. b) All modern "buses" (Except USB, to some degree) are P2P, you can't even connect 2 devices to the same bus. (Except USB, but there are usually several controllers so you don't have to use same bus).)
> >
> > Supermicro CSE-M14TQC 4xSAS/SATA bay, connected with a CBL-SAST-0616(50cm) Mini-SAS HD to 4 SATA cable. The CSE receives the 4 sata cables, the mini-sas end is plugged on the main board.
>
> AFAICT each HDD is in effect connected to a separate channel, so no contention there.

There is something odd happening there. If I run rsync across the network, I get full-speed reads from the donor, a saturated 1 Gbps LAN link (800-900 Mbps), full-speed writes on the recipient, and the job done in 2 hours. If I rsync from one local disk to another on the hardware above, the same job takes 8 hours.
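
A crude way to rule out contention between the two local drives is to hammer both at once and see whether either side drops below its solo speed; a sketch, with placeholder paths (/archive/bigfile stands for any large existing file):

    # read the source while writing the destination, in parallel
    dd if=/archive/bigfile of=/dev/null bs=1m &
    dd if=/dev/zero of=/backup/ddtest bs=1m count=8192
    wait
    rm /backup/ddtest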

> > > With or without networking involved?
> >
> > no network involved

> > > What Filesystem? What mount-options?
> >
> > FFS2
>
> This means either you are using a flash filesystem on a HDD, which would be "odd",
> or you are using a BSD-type OS. I would guess FreeBSD?

This is BSD's Fast File System version 2.

> In both cases: No personal experience.

> I mainly use Linux systems with XFS as the filesystem. I haven't had a problem saturating most storage for more than a decade.
> But I also use separate storage types for different content: "low AVG filesize" files go on SSDs, and HDDs only get files with a largish AVG filesize, mostly more than 10 MB per file.
> And I also use "rsync --preallocate", so large files are stored as contiguously as possible.

Interesting.
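
If FFS2 supports the underlying posix_fallocate, trying that here should just be a matter of adding the flag to the command quoted above; a sketch, untested on my side:

    /usr/local/bin/rsync --preallocate [remaining flags as above] /archive /backup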

> > > AVG Filesize? Directory structure? Fragmentation?
> >
> > mixed
>
> That is what average means:
> Total filesize divided by the total number of files. You have already determined the total file size.
> Now you only need to: $(find /source -type f | wc -l)
>
> Any given set of files has an AVG.

number of files: 2156330
total file size: 2750298032
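
So the average works out to:

    2750298032 / 2156330 ≈ 1275

per file, in whatever unit that total is in. If the total is bytes, that is roughly 1.3 KB per file, i.e. exactly the "low AVG filesize" case you reserve for SSDs; if it is kilobytes, roughly 1.3 MB per file.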

More information about the rsync mailing list