[Samba] Spotlight configuration trouble

Perttu Aaltonen perttu.aaltonen at mac.com
Fri Oct 2 19:59:55 UTC 2020


Hi Ralph!

He’s aware of the issue, the first one on this page:
https://fscrawler.readthedocs.io/en/latest/user/tips.html

And he is slowly developing a solution:
https://github.com/dadoonet/fscrawler/issues/399

Are your customers running regular full reindexes and, if so, what kind of performance are they seeing? My concern is that with millions of files and >10 TB of data, a full index will take many hours. If the index has to stay up to date and a full reindex runs every night, it might end up running almost constantly, slowing down and taxing the system.

For example, a user gets a file as an email attachment and saves the file on the server. The file won’t get indexed if its creation/modification time is older than the last “incremental” indexing run. Or a user restructures the directory hierarchy and moves files and folders around. All moved files that have a modification time older than the last fscrawler run will not be indexed again at their new location, and are actually removed from the database, unless a full reindex is done.
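To make the failure mode concrete, here is a minimal sketch in Python. It is not fscrawler code and makes no claim about how fscrawler works internally; it only illustrates the timestamp comparison described above. The function name is hypothetical, and it assumes a Linux-style filesystem where st_ctime is updated when a file is created, renamed or has its timestamps changed, while st_mtime can stay old.

```python
import os
from pathlib import Path


def files_missed_by_mtime_scan(root, last_run):
    """List files a purely mtime-based incremental scan would skip.

    A file counts as "missed" when its modification time is older than
    the last run, but its inode change time shows that it appeared or
    moved after that run. Hypothetical helper, for illustration only.
    """
    missed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            st = path.stat()
            # Old mtime, but the inode changed at or after the last run.
            if st.st_mtime < last_run <= st.st_ctime:
                missed.append(path)
    return sorted(missed)
```

Calling it with the share root and the epoch timestamp of the last crawl returns the paths that would need a touch or a full reindex to be picked up.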

Any idea what issue I might have with the Tracker implementation? My sense is that even if I can’t use inotify watchers due to memory and file descriptor constraints, I can still do regular, genuinely “incremental” runs over the data set with tracker-miner-fs.

Cheers,
Perttu

> On 2 Oct 2020, at 19.35, Ralph Boehme via samba <samba at lists.samba.org> wrote:
> 
> Howdy!
> 
> On 10/2/20 at 6:02 PM, Perttu Aaltonen via samba wrote:
>> I used your excellent packages but also tried building from source. I’m trying to get Tracker working. Elasticsearch was simpler to get going but it doesn’t really suit the use case due to limitations in fscrawler.
> 
> hm, a customer is using this with good results on laaaaaaarge datasets.
> I had seen you mentioning problems in an earlier mail but couldn't
> comment due to a lack of time.
> 
> If you think you're running into genuine fscrawler limitations I
> recommend talking to the main fscrawler developer. He was very helpful
> when I reached out to him some time ago with questions and a feature
> request. :)
> 
> Cheerio!
> -slow


