Author |
Topic: Seperate Characters (Read 1611 times) |
|
shallow
Newbie
Posts: 2
|
|
Seperate Characters
« on: Apr 28th, 2006, 8:26pm » |
Quote Modify
|
There is a sequence of charaters. each charater can be either one-byte charater or two-byte character. For the one-byte charater, its value fall in [0,127]; for a two-byte character, the value of the first byte must be larger than 127 and the second byte can take any value. Now give you a position along the sequence. It can be a one-byte character, or any one byte of the two-byte charater. Now you are required to return the character right before the position.
|
|
IP Logged |
|
|
|
cwolves
Newbie
Posts: 41
|
|
Re: Seperate Characters
« Reply #1 on: Apr 29th, 2006, 8:30am » |
Quote Modify
|
um....you need a bit more than that for it to even be a puzzle. All your question says is: "Here's a book. What's before the letter n?" even if the sequence is linear it's still impossible as any value can at some point have any other value before it.
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Seperate Characters
« Reply #2 on: Apr 29th, 2006, 9:25am » |
Quote Modify
|
Seems like you have to search back to the last <=127 char (or the start) before you can determine where characters begin/end
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
Neelesh
Junior Member
Gender:
Posts: 147
|
|
Re: Seperate Characters
« Reply #3 on: May 1st, 2006, 6:24am » |
Quote Modify
|
I guess you will need to search for last two consecutive <=127 characters. This is because, supposing "L" represents such a character, then "LL" is the only sequence which gurantees that the previous character terminated on the first L whereas the new character started (and terminated) on the second L. Once such sequence is obtained, we could traverse in forward direction. Here is an untested code. hidden: | variables : gp = given position cp = current position pp = previous position p2pp = previous-to-previous position I assume that these are the pointers-like entities. Dereferencing them simply gives the values "L" or "M" depending on whether the character pointed by them in >127 or <= 127. cp = gp - 1 while ( (cp > 1) and ( (*cp == M) or (*cp -1 ) == M) ) cp--; pp = cp; while(gp >= cp) { p2pp = pp; pp = cp; cp += ( ( cp == M )? 2 : 1); // not tested for end of sequence, but that can be added easily) } return the character pointed by p2pp. This will be the necessary character. | The code is untested, but I guess the underlying logic is flawless.
|
« Last Edit: May 1st, 2006, 6:25am by Neelesh » |
IP Logged |
|
|
|
SMQ
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 2084
|
|
Re: Seperate Characters
« Reply #4 on: May 1st, 2006, 9:32am » |
Quote Modify
|
Neelesh: I first thought along those lines too, but then I realized that towr was right -- any byte <= 127 is sufficient, because you know it must be the last byte of a char. It could be a single-byte char or it could be the second byte of a two-byte char, but not the first byte of a two-byte char, so either way there is a char break after a byte <= 127. --SMQ
|
|
IP Logged |
--SMQ
|
|
|
Neelesh
Junior Member
Gender:
Posts: 147
|
|
Re: Seperate Characters
« Reply #5 on: May 1st, 2006, 10:00am » |
Quote Modify
|
on May 1st, 2006, 9:32am, SMQ wrote:any byte <= 127 is sufficient, because you know it must be the last byte of a char. |
| Yeah, I see what you are saying, one byte <= 127 is sufficient. Of course, we need a byte<=127 before (gp-1)'th position, (and not before gp'th position) where gp represents the given pointer. This is obvious since we are interested in the previous character, not the character at the gp.
|
|
IP Logged |
|
|
|
cwolves
Newbie
Posts: 41
|
|
Re: Seperate Characters
« Reply #6 on: May 4th, 2006, 7:00am » |
Quote Modify
|
If you're able to freely scan the entire array looking for a char <= 127 then why not just look at char n-1? And if you can't then the problem is impossible.
|
|
IP Logged |
|
|
|
towr
wu::riddles Moderator Uberpuzzler
Some people are average, some are just mean.
Gender:
Posts: 13730
|
|
Re: Seperate Characters
« Reply #7 on: May 4th, 2006, 7:43am » |
Quote Modify
|
Well, a character isn't necessarily a char (i.e. 1 byte). You want the previous character, which may be either 1 or 2 bytes in size.
|
|
IP Logged |
Wikipedia, Google, Mathworld, Integer sequence DB
|
|
|
Grimbal
wu::riddles Moderator Uberpuzzler
Gender:
Posts: 7527
|
|
Re: Seperate Characters
« Reply #8 on: May 8th, 2006, 9:27am » |
Quote Modify
|
Here is my take. This Java code returns the position of the start of the previous character. It doesn't return the character itself because it is not specified how 2 bytes combine into one char. The procedure returns -1 if there is no previous char. hidden: | // returns the char before the char at position curr // return -1 if curr is on the 1st char. static int getPrev(byte[]ch, int curr) { // align current char int i = curr; while( i>0 && (ch[i-1]&0x80) != 0 ) i--; if( (curr-i)%2!=0 ) curr--; // find and align previous char int prev = curr-1; if( i>prev ){ i = prev; while( i>0 && (ch[i-1]&0x80) != 0 ) i--; } if( (prev-i)%2!=0 ) prev--; // that's it return prev; } |
|
|
IP Logged |
|
|
|
DewiMorgan
Guest
|
You can optimise slightly: if the preceding byte is high, then the cursor needs to move back 2, assuming the file is valid. low high = n/a high high = 2
|
|
IP Logged |
|
|
|
|