A lot of questions comes on string or sets.
But if we now the basics , then we can solve most of the problems.
For strings, we should know how to find SUBSTRINGS.
For Sets, we should know how to find SUBSETS, PERMUTATIONS, COMBINATIONS.
Lets see one problem on STRING.
Given a string, find all the possible palindrome in it.
Now if you know how you can find all the substring, then you can check each for the palindrome test.
Now you can find all the substring using 3 loops.
Code is here : http://ideone.com/3W9mCl
Now we can use this code to work on variety of question, but if someone ask to provide 6th substring in hexa order, then first finding total substring and then sorting them, will consume space and increase complexity.....a better way is SUFFIX TREE.
The funda behind SUFFIX TREE is that , for a given string you find all the suffixes and then arrange them in ascending order. Now you store this order only.
Ex : For string "tree", we have
0 1 2 3
t r e e
suffixes :
0 tree
1 ree
2 ee
3 e
On arranging as per lexicographical order, we have
3 e
2 ee
1 ree
0 tree
So suffix array will be
0 1 2 3
3 2 1 0
How we read suffix array is like this : Suppose we have to find suffix whose order is 3rd in lexicographic order, in suffix array what index is at 3rd position, it is 0.
No in given string start from 0th index and go till the end of string...we have tree :)
Now suppose you have suffix array and you want to find 5th SUBSTRING.
Now what are possible substring of tree
tree,tre,ree,tr,re,ee,t,r,e,e
Now the suffix we have are tree,ree,ee,e ,
Now to get substring from suffix array, we find prefixes of each suffix , so
from tree we have tree,tre,tr,t
from ree ree,re,r
from ee ee,e
from e e
if you see we have all SUBSTRING.
Further if you NOTE , all substring derived from REE will have order greater than TREE.
So I know that my suffix order is
e from this I can have 1 substring
ee from this 2 substring
ree from this 3 substring
tree from this 4 substring
Now for the 5th substring we will have it in 1+2+3...so in 3rd suffix substrings
We have to also pay attention to DUPLICATE, for that we have LCS matrix. It value is derived by checking how many characters of two adjacent suffix in suffix array matches from start.
Ex for
LCS
0 3 e 0
1 2 ee 1 // this means 1 substring will be common in e and ee
2 1 ree 0
3 0 tree 0
Other example for "chtch"
LCS SUBSTRING
0 ch 0 ch,c
1 chtch 2 chtch,chtc,cht,ch,c See TWO common SUBSTRING
2 h 0 h
3 htch 1 htch,htc,ht,h SEE ONE common SUBSTRING
4 tch 0 tch,tc,t
The funda behind SUFFIX TREE is that , for a given string you find all the suffixes and then arrange them in ascending order. Now you store this order only.
Ex : For string "tree", we have
0 1 2 3
t r e e
suffixes :
0 tree
1 ree
2 ee
3 e
On arranging as per lexicographical order, we have
3 e
2 ee
1 ree
0 tree
So suffix array will be
0 1 2 3
3 2 1 0
How we read suffix array is like this : Suppose we have to find suffix whose order is 3rd in lexicographic order, in suffix array what index is at 3rd position, it is 0.
No in given string start from 0th index and go till the end of string...we have tree :)
Now suppose you have suffix array and you want to find 5th SUBSTRING.
Now what are possible substring of tree
tree,tre,ree,tr,re,ee,t,r,e,e
Now the suffix we have are tree,ree,ee,e ,
Now to get substring from suffix array, we find prefixes of each suffix , so
from tree we have tree,tre,tr,t
from ree ree,re,r
from ee ee,e
from e e
if you see we have all SUBSTRING.
Further if you NOTE , all substring derived from REE will have order greater than TREE.
So I know that my suffix order is
e from this I can have 1 substring
ee from this 2 substring
ree from this 3 substring
tree from this 4 substring
Now for the 5th substring we will have it in 1+2+3...so in 3rd suffix substrings
We have to also pay attention to DUPLICATE, for that we have LCS matrix. It value is derived by checking how many characters of two adjacent suffix in suffix array matches from start.
Ex for
LCS
0 3 e 0
1 2 ee 1 // this means 1 substring will be common in e and ee
2 1 ree 0
3 0 tree 0
Other example for "chtch"
LCS SUBSTRING
0 ch 0 ch,c
1 chtch 2 chtch,chtc,cht,
2 h 0 h
3 htch 1 htch,htc,ht,
4 tch 0 tch,tc,t
chtch : chtch , htch, tch, ch, h
So substring in lexicographic order is
c,ch,cht,chtc,chtch,h,ht,htc,htch,t,tc,tch 12 SUBSTRING
possible are 5*6/2 : since STRING LENGTH is 5 , so n(n+1)/2 ...but as you see 3 are duplicates
No comments:
Post a Comment