CTDB

Fri Aug 17 03:54:37 GMT 2007

Andrew Bartlett wrote:
>>>> Given a 4 node cluster, with file A residing on node 2.  When a windows 
>>>> client requests file A, how does the data get to it?
>>> CTDB is designed for setups where the file resides not on node 2 but on
>>> a SAN, with access via any node.  If your 'SAN' actually
>>> happens to be a cluster of servers with local disk, then perhaps you are
>>> really looking for an MSDFS-based redirect setup (where your virtual
>>> cluster simply redirects clients to the correct node to find their
>>> file). 
>>>
>>>> Will the client go to any random node in the cluster and request the 
>>>> file A and then be directed to node 2 to pull the file, or will the file 
>>>> be pulled to the random node from node 2 and then sent to the client?  
>>> To avoid that double hop, you probably want the MSDFS design.
>>>
>>> If your kernel-level cluster FS hides the detail about where the file
>>> actually is, then CTDB doesn't care about the details, as long as open()
>>> gets the file. 
>>>
>> My cluster will be made up of disks connected to the nodes, not on a 
>> SAN, so perhaps that is the route I need to take.  I have spent a bit of 
>> time reading about DFS and I know Samba is able to do that as well.  The 
>> only question I am left with is managing disk space.  It seems that as a 
>> portion of the DFS fills, I will have to augment that portion manually.  
>> I still need to learn how to do that.
>
> I'm sure someone has created complex management applications to handle
> this kind of thing, but it very much depends on your needs.  
>
> BTW, how are you handling data redundancy?  
>
> Remember that any two systems that actively share the same directory
> must be linked with CTDB.  If your system is not active/active for a
> single directory, then you don't need CTDB.
>
The clusters themselves will be replicated, so that will provide one 
level of redundancy.  Within a DFS, I don't know yet.  I haven't 
researched DFS enough yet.

So, if I set up two DFS shares in Samba that are configured as replicas 
of the same data, will I need to run CTDB on the two nodes holding the 
replicated DFS data?
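
To make the replica setup concrete, here is a rough sketch of how I 
understand an MSDFS link with two targets would be created.  As far as I 
can tell from the Samba documentation, a DFS link is just a symlink 
inside the DFS root whose target has the form 
msdfs:serverA\shareA,serverB\shareB.  The host names (node1, node2), the 
share name (data) and the helper functions are made up for illustration, 
and the DFS root here is only a temporary directory.

```python
import os
import tempfile


def msdfs_target(replicas):
    """Build the symlink target string that Samba reads as a DFS referral.

    replicas is a list of (server, share) tuples; several entries mean
    several alternative targets for the same link.
    """
    if not replicas:
        raise ValueError("at least one replica is required")
    return "msdfs:" + ",".join(
        f"{server}\\{share}" for server, share in replicas
    )


def create_dfs_link(dfs_root, link_name, replicas):
    """Create the referral symlink inside the DFS root directory."""
    link_path = os.path.join(dfs_root, link_name)
    # The target does not need to exist locally; Samba parses the string.
    os.symlink(msdfs_target(replicas), link_path)
    return link_path


if __name__ == "__main__":
    # Hypothetical layout: both nodes export a share called "data".
    with tempfile.TemporaryDirectory() as root:
        path = create_dfs_link(
            root, "data", [("node1", "data"), ("node2", "data")]
        )
        print(os.readlink(path))
```

This only covers the referral side; it does nothing to keep the two 
copies of the data in sync, which is the part I am asking about.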