Author |
Topic: Strings permutation (Read 2284 times) |
|
priyank
Newbie
Posts: 5
|
|
Strings permutation
« on: May 14th, 2004, 7:00am » |
Quote Modify
|
This one goes like this: In this problem, we seek to locate permutations of one string which appear as sub-strings of another string. A permutation of a string is a string of the same length which can be obtained by rearranging the characters of the original string. Of course, I don’thave to tell you that the number of permutations of a long string can be enormous, and that there is a time limit on programs being judged. link:: www.csupomona.edu/~carich/programming_contests/199601/restrung.pdf I tried to search if it already exists on the forum, but couldn't find it.
|
« Last Edit: May 14th, 2004, 7:02am by priyank » |
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Strings permutation
« Reply #1 on: May 14th, 2004, 7:37am » |
Quote Modify
|
I don't think this problem was on the forum yet, but I do remember it from my string algorithm course a few years back where we had to do it as homework. hint/comment ::If both strings are from a fixed/known size alphabet (which is generally the case), and the search string has length m and the text string length n, then you can solve it in O(n + m) time, and O(1) space Importantly, the number of permutations the search string can have doesn't factor into it It would be marginally slower if the size of the alphabet isn't fixed (and even then, only the alphabet size of the search string really matters) note, if you want to output the substring (instead of only its position), complexity naturally increases by a factor m in the worst case.::
|
« Last Edit: May 14th, 2004, 7:44am by towr » |
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
TenaliRaman
Uberpuzzler
I am no special. I am only passionately curious.
Gender:
Posts: 1001
|
|
Re: Strings permutation
« Reply #2 on: May 14th, 2004, 11:00am » |
Quote Modify
|
towr, i have an algorithm in my mind which would take O(mn) time and O(m) space, but it can print the permutation substring. So i think what i have in my mind is a totally different idea than yours. Any pointers to your method?
|
|
IP Logged |
Self discovery comes when a man measures himself against an obstacle - Antoine de Saint Exupery
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Strings permutation
« Reply #3 on: May 14th, 2004, 11:33am » |
Quote Modify
|
It's hard to think of another hint that wouldn't give the whole thing away. ::Consider how you can check if one string is a permutation of another of the same length n, in O(n) time and O(|alphabet|) space ::
|
« Last Edit: May 14th, 2004, 11:33am by towr » |
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
TenaliRaman
Uberpuzzler
I am no special. I am only passionately curious.
Gender:
Posts: 1001
|
|
Re: Strings permutation
« Reply #4 on: May 14th, 2004, 1:25pm » |
Quote Modify
|
i think i got it! Just tell me if i am right :: for simplicity, i will consider my initial set to be A={1,2,3,4,5,....,n} let B = some permutation of set A then flag = 1; counter=0; while(counter<#A) { if(A[counter]==B[counter] || A[B[counter]-1]==B[counter]) counter++; else { flag=0;break; } } if(flag) print("yes permutation"); else print("no not a permutation"); /*note to readers : this is not the actual program for the given question but yes there is a way to convert this code to get the required result.*/ ::
|
« Last Edit: May 14th, 2004, 1:26pm by TenaliRaman » |
IP Logged |
Self discovery comes when a man measures himself against an obstacle - Antoine de Saint Exupery
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Strings permutation
« Reply #5 on: May 14th, 2004, 2:11pm » |
Quote Modify
|
I don't think that would work, if B = {1,1,1,...,1} it would still say it's a permutation of {1,2,3,..,n} (An aside, it is inacurate to talk about sets here, as order doesn't matter to sets in the first place. Which is actually a sort of hint.)
|
« Last Edit: May 14th, 2004, 2:12pm by towr » |
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
priyank
Newbie
Posts: 5
|
|
Re: Strings permutation
« Reply #6 on: May 15th, 2004, 3:08am » |
Quote Modify
|
One method would be to make all the permutations of the search string and then search for it in the another string.But, this doesn't seems to be an efficient one.Is there any another method to do it?Can someone provide some algo for it? -thanks
|
|
IP Logged |
|
|
|
TenaliRaman
Uberpuzzler
I am no special. I am only passionately curious.
Gender:
Posts: 1001
|
|
Re: Strings permutation
« Reply #8 on: May 15th, 2004, 5:49am » |
Quote Modify
|
towr, thx for pointing that out. but i don't think i would mind a few spurious results if this method works efficiently enough. I am still thinking on your method though. priyank, the sorting method pointed to in your link is exactly the method i had in my mind.(Note that max(mn,mnlogm)=mn so time complexity is around O(mn) and to store the sorted value the space taken is O(m)).Obviously this is the most easiest method to be approached. i haven't checked the hashing formula they have proposed.I will check into it later, and if it works then its pretty darn neat too). but i am still overawed by the complexity proposed by towr O(m+n) time and O(1) space
|
« Last Edit: May 15th, 2004, 5:51am by TenaliRaman » |
IP Logged |
Self discovery comes when a man measures himself against an obstacle - Antoine de Saint Exupery
|
|
|
Barukh
Uberpuzzler
Gender:
Posts: 2276
|
|
Re: Strings permutation
« Reply #9 on: May 15th, 2004, 9:30am » |
Quote Modify
|
Nice problem, priyank! I think the following code meets the towr’s requirements (for the fixed size alphabet). I’ve made an assumption that ‘?’ sign is is not a part of the alphabet. I am not going to comment on this algorithm for a while… Code: #include <stdio.h> #include <stdlib.h> #define symbol(s, i) (((s[i] == ex) || (count[s[i]] == m+1)) ? ex : s[i]) char c, ex = '?'; main(int argc, char** argv) { char* s = argv[1]; char* S = argv[2]; int m = strlen(s); int n = strlen(S); int i, C = 0; int count[256]; /* O(1) space */ /* O(m) time */ for (i = 0; i < 256; i++) count[i] = m+1; for (i = 0; i < m; i++) { if (count[s[i]] == m+1) { C++; count[s[i]] = 0; } count[s[i]]++; } /* O(n) time */ for (i = 0; i < n; i++) { if (i - m >= 0) { c = symbol(S, i-m); if (! count[c]++) C++; if (! count[c]) C--; } c = symbol(S, i); if (! count[c]--) C++; if (! count[c]) C--; if (C == 0) printf("%d\n", i-m+1); } } |
|
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Strings permutation
« Reply #10 on: May 15th, 2004, 1:27pm » |
Quote Modify
|
yep, I think that's it.. (It differs on a few details from what I had, but the essence seems to be the same)
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
TenaliRaman
Uberpuzzler
I am no special. I am only passionately curious.
Gender:
Posts: 1001
|
|
Re: Strings permutation
« Reply #11 on: May 16th, 2004, 2:03am » |
Quote Modify
|
Wonderful!! it works!! and i don't know why it works!!! i will have to work out the details step by step now so that i can see what is happening!!
|
|
IP Logged |
Self discovery comes when a man measures himself against an obstacle - Antoine de Saint Exupery
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Strings permutation
« Reply #12 on: May 16th, 2004, 4:51am » |
Quote Modify
|
isn't #define symbol(s, i) (((s[i] == ex) || (count[s[i]] == m+1)) ? ex : s[i]) equivalent to #define symbol(s, i) ((count[s[i]] == m+1) ? ex : s[i]) that seems a bit simpler I'm still trying to figure out what ex is used for in the first place.. I'm not sure if my code is any easier to understand though. (And it doesn't exactly help that it's two years old, I'm not quite sure why this specific variation of the algorithm works anymore..) Code:#include <iostream> int main(int argc, char **argv) { char const *search_string = argc > 1 ? argv[1] : "aabc"; char const *text = argc > 2 ? argv[2] : "abahgacabababahcacabahaab"; int length, count[256] = {0}; for(length=0; search_string[length]; length++) count[search_string[length]]--; // amount of each character we need to find unsigned left=0, right=0; for(count[text[right]]++; text[right]; count[text[++right]]++) while (count[text[right]] > 0) { if (right-left == length) std::cout << (left+1) << std::endl; count[text[left++]]--; } // despite apparantly two loops, still only O(n) if (right-left >= length) std::cout << (right - length+1) << std::endl; } |
|
|
« Last Edit: May 16th, 2004, 4:59am by towr » |
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
TenaliRaman
Uberpuzzler
I am no special. I am only passionately curious.
Gender:
Posts: 1001
|
|
Re: Strings permutation
« Reply #13 on: May 16th, 2004, 5:38am » |
Quote Modify
|
That is just great barukh and towr!! i just got the logic of the entire code.... if i had implemented it i would have still done it in O(mn) since i am sure i would have reset the count array manually instead of using the look-back technique used by u guys!!
|
|
IP Logged |
Self discovery comes when a man measures himself against an obstacle - Antoine de Saint Exupery
|
|
|
Barukh
Uberpuzzler
Gender:
Posts: 2276
|
|
Re: Strings permutation
« Reply #14 on: May 17th, 2004, 8:58am » |
Quote Modify
|
on May 16th, 2004, 4:51am, towr wrote:isn't #define symbol(s, i) (((s[i] == ex) || (count[s[i]] == m+1)) ? ex : s[i]) equivalent to #define symbol(s, i) ((count[s[i]] == m+1) ? ex : s[i]) that seems a bit simpler |
| Yes, of course! Moreover, I believe my code may be improved a lot more… Quote:I'm still trying to figure out what ex is used for in the first place.. |
| I use ex as a single representative of all the symbols that are in the S ("search string"), but not in s ("text"). For the fixed size alphabet it's not really helpful, but in the general case it would (IMHO). As there was an interest in explaining the algorithm, here's it goes: The first part (time O(m)) forms the array count[], where count[c] is the number of occurrences of symbol c in the string s. It also calculates C, the number of different symbols in s. If c is not in s, its count is set to m+1. The second part (time O(n)) scans the string S. Once the symbol enters the "window" of m symbols, its count is decreased. When it leaves the "window" (i.e. is m symbols away from the current symbol), its count is increased. Every time count[c] changes from [pm]1 to 0, C is decreased; every time count[c] changes from 0 to [pm]1, C is increased. When C becomes 0, that means the current window is a permutation of s.
|
|
IP Logged |
|
|
|
priyank
Newbie
Posts: 5
|
|
Re: Strings permutation
« Reply #15 on: May 18th, 2004, 11:59pm » |
Quote Modify
|
I think we can implement a much simpler algo: -get the sum S of the ascii code of all the character of s(the first string,length m) -compute the sum-M[i] of ascii of the first m chars of second search string. -this summing loop(i) would go from M[0] till M[n-m](n-length of the second string) -just check if S==M[i] -Also, we need not do summation M[i] in a loop as M[i+1]=M[i]+B[i+m]-B[i], where B[i] is the ith element of second string.Only for first M[0], we need the loop. This algo can be found at :http://homepage.mac.com/hteric/LT/SearchAlg/
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Strings permutation
« Reply #16 on: May 19th, 2004, 1:12am » |
Quote Modify
|
on May 18th, 2004, 11:59pm, priyank wrote: -get the sum S of the ascii code of all the character of s(the first string,length m) |
| I don't think this will work 'b'+'b' = 'a'+'c', you loose critical information. If you use hashing, you will have to check wether you have an actual substring-match each time you have a hash-match, and in the worst case this will be O(n2). Only when the searchstring is known to be very infrequent may this be a good idea. And even then it probably isn't [e]hmm the algorithm on that site is worse than linear even.. Their 'CheckPerm()' function is allready O(m log(m) ), and then they still have to multiply it by n when they go through the text[/e]
|
« Last Edit: May 19th, 2004, 1:25am by towr » |
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
priyank
Newbie
Posts: 5
|
|
Re: Strings permutation
« Reply #17 on: May 19th, 2004, 1:25am » |
Quote Modify
|
oh!! i missed that..but, this points to another question..can we have a hashing that takes care of 2b!=a+c, so that we dont have to actually check for the substring?i mean some sort of hashing algo.. P.S:i'm not using any checkPerm() method in my last update.
|
« Last Edit: May 19th, 2004, 1:41am by priyank » |
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Strings permutation
« Reply #18 on: May 19th, 2004, 1:53am » |
Quote Modify
|
hashes are usually lossy of course an integer is much larger than a character, so perhaps we have enough space. There is another problem though. The hash of a string must be the equal to the hash of any permutation of it, so it must be some commutative function on the characters (or functions depending on one character)
|
« Last Edit: May 19th, 2004, 1:56am by towr » |
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Strings permutation
« Reply #19 on: May 19th, 2004, 2:06am » |
Quote Modify
|
How about this, we associate a prime number with each character, and multiply them together. Of course there's a problem with integer division, so for the textstring we'd put the character-hash in another integer, and simply compare them each step. If we only look at letters and ignore case we can in the worst case use 4 letters in our searchstring 3 if we also have case and numbers Anymore might work but is no longer guaranteed to give a unique hash hmm.. why is this worse than just putting the characters (1 byte each) into the integer (4 bytes)?
|
« Last Edit: May 19th, 2004, 2:08am by towr » |
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
|