[Samba] Searching Samba share file contents
Noel Power
nopower at suse.de
Thu Jun 15 14:07:59 UTC 2023
On 15/06/2023 14:26, Nick Couchman via samba wrote:
>>> Hey, Noel,
>>> Is this ready to be tested out? Is the process simply to check out the
>>> npower_wsp_norecurse_client branch, build Samba from that, and then
>>> give it a go?
>> yes, simply add --enable-wsp to your configure line and you should be
>> good to go
>>
>> please note: this is only the client part and the client will only work
>> against windows machines where the WSP service is enabled and configured
>>
> Ah, bummer - I don't think I have any Windows file servers with search
> enabled, or enough data to make a difference - my need is the
> opposite, I need the WSP service on Samba so that my Windows Clients
> can search more quickly, especially over higher-latency (VPN) links.
I understood from your message that you were interested in the client
side, since you mentioned the merge request associated with
npower_wsp_norecurse_client. The good news is that I also have
experimental server-side code if you would like to try it out.
The easiest way to try out the server code is to clone it from
https://git.samba.org/npower/samba.git
The branch is current_wsp_417_wip
As the name suggests, it is very much a work in progress
Building
--------------
1. clone https://git.samba.org/npower/samba.git
2. checkout current_wsp_417_wip
3. ./configure.developer ${YOUR_OWN_CONF_SWITCHES} --enable-wsp && make -j
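The three steps above could be scripted roughly as follows. This is only a sketch: the function name build_samba_wsp and the EXTRA_CONF_SWITCHES variable are made up for illustration (the latter stands in for ${YOUR_OWN_CONF_SWITCHES}), and the job count is capped with nproc instead of a bare 'make -j'.

```shell
#!/bin/sh
# Sketch of the build steps. build_samba_wsp and EXTRA_CONF_SWITCHES are
# illustrative names, not part of samba.
REPO_URL="https://git.samba.org/npower/samba.git"
BRANCH="current_wsp_417_wip"

build_samba_wsp() {
    # 1. clone the repository (needs network access)
    git clone "$REPO_URL" samba || return 1
    cd samba || return 1
    # 2. check out the work-in-progress WSP branch
    git checkout "$BRANCH" || return 1
    # 3. configure with WSP enabled, then build;
    #    the switches are deliberately left unquoted so they word-split
    ./configure.developer ${EXTRA_CONF_SWITCHES:-} --enable-wsp || return 1
    make -j"$(nproc)"
}

# Nothing runs until the function is called, e.g. in a subshell so the
# caller's working directory is left alone:
#   ( build_samba_wsp )
```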
Indexing a share
---------------------------
1. install fscrawler (I used version fscrawler-es7-2.7-SNAPSHOT) [1]
https://fscrawler.readthedocs.io/en/latest/installation.html
2. install and start a version of elasticsearch (I used
elasticsearch-7.11.2) [2]
3. configure fscrawler (I attach my conf '_setting.yaml' file here)
4. start fscrawler 'bin/fscrawler job_name'
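Before moving on to samba it can help to confirm that elasticsearch is up and that fscrawler has actually indexed something. A rough sketch, assuming elasticsearch listens on localhost port 9200 (its usual HTTP default) without tls, and assuming fscrawler named its index after the job (believed to be its default, adjust if your settings say otherwise). es_url, check_index, ES_HOST and ES_PORT are made-up helper names.

```shell
#!/bin/sh
# Assumed defaults; override via the environment if your setup differs.
ES_HOST="${ES_HOST:-localhost}"
ES_PORT="${ES_PORT:-9200}"

# Build an elasticsearch URL for the given path (hypothetical helper).
es_url() {
    printf 'http://%s:%s/%s\n' "$ES_HOST" "$ES_PORT" "$1"
}

# Show what elasticsearch knows about (hypothetical helper).
check_index() {
    # list all indices known to elasticsearch
    curl -s "$(es_url '_cat/indices?v')"
    # count the documents in the index assumed to belong to the job
    curl -s "$(es_url "$1/_count?pretty")"
}

# e.g.  check_index job_name
```

A non-zero document count for the job's index suggests the crawl worked; an empty index list points at fscrawler or elasticsearch rather than samba.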
configure samba
---------------------------
[global]
wsp backend = elasticsearch
elasticsearch:wsp_mappings=${PATH_TO}/elasticsearch_mappings.json
[share_to_search]
wsp = true
an initial version of elasticsearch_mappings.json is available from
'source3/rpc_server/wsp/elasticsearch_mappings.json' in the build tree
other global smb.conf settings
'elasticsearch:address' ip address of elasticsearch (defaults to
localhost)
'elasticsearch:port' port num to connect to
There are also some settings relevant to encrypting the elasticsearch
connection with tls (I don't mention them here for now, as it's best to
use clear text communication while setting things up)
'elasticsearch:wspindex' name of the index to search (defaults to '_all')
'elasticsearch:acl_filtering' enables acl filtering of results based
on the authenticated user. By default acl_filtering is turned off and
all results from elasticsearch are used; enable it if the normal
elasticsearch security features are not enough for you.
Note: it is possible to set up elasticsearch for document and index
security based on user/gid, but there are instances where this might not
be appropriate or enough to satisfy a particular use case. At this time
access to elasticsearch is anonymous; even when we support accessing
elasticsearch as the current user there still might be reasons why
elasticsearch document or index security is not enough, and in that case
it is best to set acl_filtering. The downside of setting acl_filtering
is that only a limited number of results are available (as the server
needs to cache the results that it acl filters itself). The default
number of results returned with acl_filtering enabled is 200 (this can
be modified with the global param 'wsp results limit').
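Pulling the settings above together, a fragment for testing might look like the one generated below. Treat the values as assumptions: 9200 is elasticsearch's usual HTTP port and not necessarily what samba defaults to, 'job_name' assumes the fscrawler index is named after the job, the 'true' value for acl_filtering simply mirrors the 'wsp = true' syntax, and PATH_TO and CONF_FRAGMENT are placeholder variables.

```shell
#!/bin/sh
# Write an illustrative smb.conf fragment to a scratch file.
# CONF_FRAGMENT and PATH_TO are placeholders, not samba settings.
CONF_FRAGMENT="${CONF_FRAGMENT:-$(mktemp)}"
PATH_TO="${PATH_TO:-/path/to}"

cat > "$CONF_FRAGMENT" <<EOF
[global]
    wsp backend = elasticsearch
    elasticsearch:wsp_mappings = ${PATH_TO}/elasticsearch_mappings.json
    # assumed values below, adjust to your elasticsearch instance
    elasticsearch:address = localhost
    elasticsearch:port = 9200
    elasticsearch:wspindex = job_name
    # optional: filter results by the authenticated user's acls
    elasticsearch:acl_filtering = true
    wsp results limit = 200

[share_to_search]
    # the rest of the existing share definition (path etc.) is omitted
    wsp = true
EOF

echo "fragment written to $CONF_FRAGMENT"
```

The fragment is meant to be merged by hand into an existing smb.conf, not used on its own.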
start samba
-------------------
depending on how you installed samba start as appropriate 😄
test search
------------------
a) with cmdline tool
wspsearch -U${USER}%${PASS} --kind documents //${SERVER}/${SHARE}
where 'kind' is one of
"Calendar|Communication|Contact|Document|Email|Feed|Folder|Game|InstantMessage|Journal|Link|Movie|Music|Note|Picture|Program|RecordedTV|SearchFolder|Task|Video|WebHistory"
see wspsearch --help for some more details
Note: when searching against a samba share only a subset of the
categories above is supported. Search is based on the mimetype
associations set up in elasticsearch_mappings.json, so please have a look
in there to see what types are supported. Supported kinds include the
obvious ones: Music, Video, Pictures
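To see quickly which kinds return anything from a samba share, one option is to try a few of them in turn. The sketch below only prints the command lines so they can be inspected before running. wsp_cmd is a made-up helper, SERVER and SHARE are placeholders, and the kind spellings are taken from the list above (the mappings file decides what is really supported).

```shell
#!/bin/sh
# Print (not run) a wspsearch command line for one kind.
# $1 = kind, $2 = server, $3 = share. Hypothetical helper.
wsp_cmd() {
    # the credentials placeholder is printed literally on purpose
    printf 'wspsearch -U%s --kind %s //%s/%s\n' '${USER}%${PASS}' "$1" "$2" "$3"
}

# Kinds chosen for illustration only.
for kind in Music Video Picture; do
    wsp_cmd "$kind" "${SERVER:-server}" "${SHARE:-share}"
done
```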
b) with windows client
In windows explorer, navigate to the share contents and click on the
'Search' tab, which should give you access to the search ribbon. From
there you can select the various 'kinds' from a dropdown, and you can
refine the searches, for example by size, date etc.
I've probably left out vital details, and I should probably create
something on the wiki at some stage. If you want to try it out then
please feel free to mail with problems/questions etc.
Noel
[1] probably quite old now, I downloaded quite some time ago and didn't
update it (I include the version just for information as to the setup
that currently is working for me)
[2] again this is probably now an 'old' version, I don't recall when I
last (re)downloaded the rpm, again the version info is just for
completeness as this is what I am using for testing