[clug] Substring repetition detection

David Price david.price at anu.edu.au
Wed Jun 25 16:37:21 EST 2003


I haven't tried this, so I don't know if it would work, but...

Record beside each letter, the number of characters since each
previous occurrence of that letter.  eg:

a  b  c  d  e  f  g  h  d  e  f  g  h  d  e  f  g  h  i  j  k
                        5  5  5  5  5  5  5  5  5  5
                                       10 10 10 10 10

Then look for continuous sequences of the same number, where the
length of the sequence is >= the number.  So here, the 10s can't
count, since there are only 5 of them, but the 5s can count because
there are 10 of them.  The period of repetition is the number of the
sequence.  Presumably, you would want the longest sequences possible,
so you'd start with the highest number, check if it works, then work
your way down.

I have no idea if this would work for all possible problems, but I
can't think of anything it wouldn't work for at the moment.

-- 
David.Price at anu.edu.au
uni phone: +61-2-612-58608  home phone: +61-2-6251-1597 




More information about the linux mailing list