|
||
Title: Seperate Characters Post by shallow on Apr 28th, 2006, 8:26pm There is a sequence of charaters. each charater can be either one-byte charater or two-byte character. For the one-byte charater, its value fall in [0,127]; for a two-byte character, the value of the first byte must be larger than 127 and the second byte can take any value. Now give you a position along the sequence. It can be a one-byte character, or any one byte of the two-byte charater. Now you are required to return the character right before the position. |
||
Title: Re: Seperate Characters Post by cwolves on Apr 29th, 2006, 8:30am um....you need a bit more than that for it to even be a puzzle. All your question says is: "Here's a book. What's before the letter n?" even if the sequence is linear it's still impossible as any value can at some point have any other value before it. |
||
Title: Re: Seperate Characters Post by towr on Apr 29th, 2006, 9:25am Seems like you have to search back to the last <=127 char (or the start) before you can determine where characters begin/end |
||
Title: Re: Seperate Characters Post by Neelesh on May 1st, 2006, 6:24am I guess you will need to search for last two consecutive <=127 characters. This is because, supposing "L" represents such a character, then "LL" is the only sequence which gurantees that the previous character terminated on the first L whereas the new character started (and terminated) on the second L. Once such sequence is obtained, we could traverse in forward direction. Here is an untested code. [hideb] variables : gp = given position cp = current position pp = previous position p2pp = previous-to-previous position I assume that these are the pointers-like entities. Dereferencing them simply gives the values "L" or "M" depending on whether the character pointed by them in >127 or <= 127. cp = gp - 1 while ( (cp > 1) and ( (*cp == M) or (*cp -1 ) == M) ) cp--; pp = cp; while(gp >= cp) { p2pp = pp; pp = cp; cp += ( ( cp == M )? 2 : 1); // not tested for end of sequence, but that can be added easily) } return the character pointed by p2pp. This will be the necessary character. [/hideb] The code is untested, but I guess the underlying logic is flawless. |
||
Title: Re: Seperate Characters Post by SMQ on May 1st, 2006, 9:32am Neelesh: I first thought along those lines too, but then I realized that towr was right -- any byte <= 127 is sufficient, because you know it must be the last byte of a char. It could be a single-byte char or it could be the second byte of a two-byte char, but not the first byte of a two-byte char, so either way there is a char break after a byte <= 127. --SMQ |
||
Title: Re: Seperate Characters Post by Neelesh on May 1st, 2006, 10:00am on 05/01/06 at 09:32:58, SMQ wrote:
Yeah, I see what you are saying, one byte <= 127 is sufficient. Of course, we need a byte<=127 before (gp-1)'th position, (and not before gp'th position) where gp represents the given pointer. This is obvious since we are interested in the previous character, not the character at the gp. |
||
Title: Re: Seperate Characters Post by cwolves on May 4th, 2006, 7:00am If you're able to freely scan the entire array looking for a char <= 127 then why not just look at char n-1? And if you can't then the problem is impossible. |
||
Title: Re: Seperate Characters Post by towr on May 4th, 2006, 7:43am Well, a character isn't necessarily a char (i.e. 1 byte). You want the previous character, which may be either 1 or 2 bytes in size. |
||
Title: Re: Seperate Characters Post by Grimbal on May 8th, 2006, 9:27am Here is my take. This Java code returns the position of the start of the previous character. It doesn't return the character itself because it is not specified how 2 bytes combine into one char. The procedure returns -1 if there is no previous char. [hideb] // returns the char before the char at position curr // return -1 if curr is on the 1st char. static int getPrev(byte[]ch, int curr) { // align current char int i = curr; while( i>0 && (ch[i-1]&0x80) != 0 ) i--; if( (curr-i)%2!=0 ) curr--; // find and align previous char int prev = curr-1; if( i>prev ){ i = prev; while( i>0 && (ch[i-1]&0x80) != 0 ) i--; } if( (prev-i)%2!=0 ) prev--; // that's it return prev; } [/hideb] |
||
Title: Re: Seperate Characters Post by DewiMorgan on Jun 6th, 2006, 2:28pm You can optimise slightly: if the preceding byte is high, then the cursor needs to move back 2, assuming the file is valid. low high = n/a high high = 2 |
||
Powered by YaBB 1 Gold - SP 1.4! Forum software copyright © 2000-2004 Yet another Bulletin Board |