wu :: forums - infectious disease

wu :: forums « wu :: forums - infectious disease » Welcome, Guest. Please Login or Register. Aug 20^th, 2025, 6:40am
RIDDLES SITE WRITE MATH! Home Help Search Members Login Register

   wu :: forums
   riddles
   hard (Moderators: ThudnBlunder, Icarus, SMQ, towr, william wu, Grimbal, Eigenray)
   infectious disease

« Previous topic | Next topic »

Pages: 1

Notify of replies

Send Topic

Author

Topic: infectious disease (Read 1866 times)

tomas
Newbie

Posts: 8

infectious disease
« on: Dec 5^th, 2008, 8:26am »

Quote

Modify

In the town with the population of n people, there is an infectious disease.So far there has been 1% of the residents being infected.
Provided that there is a testing centre which can test up to 10^4 samples per day.

the centre can test either a combination of blood samples or separate ones

however , if there is more than 1 infectious sample in a m-sized sample (n>m>1), the result will be wrong.else, if there is 0 or 1 infectious sample in a m-sized sample, the result will be correct.

How many days does it take to find out all the infected people

« Last Edit: Dec 5^th, 2008, 8:32pm by tomas »

IP Logged

teekyman
Full Member

Gender:

Posts: 199

Re: infectious disease
« Reply #1 on: Dec 5^th, 2008, 8:42am »

Quote

Modify

Did you mean at least one infected sample makes the test detect the infection, or did you really mean more than 1?

Also, can the center process 10^4 people, or samples?

(I'm also assuming that n >> 10^4 because we're probably not looking for a naive solution.)

IP Logged

tomas
Newbie

Posts: 8

Re: infectious disease
« Reply #2 on: Dec 5^th, 2008, 8:52am »

Quote

Modify

if there is only 1 infectious sample in the m-sized sample , then the result will be correct.
else , there are > 1 infectious samples , it will be wrong
the centre can process a m-sized sample (combination of m people's blood sample, here 1<=m<=10^4)
people's blood sample= people (the same )
of course , n >10^4
let's say ,n --> infinity
but the strategy is unchanged for a random value of n

« Last Edit: Dec 5^th, 2008, 8:53am by tomas »

IP Logged

teekyman
Full Member

Gender:

Posts: 199

Re: infectious disease
« Reply #3 on: Dec 5^th, 2008, 10:08am »

Quote

Modify

Well if having two or more infected people donate samples causes the test to give a false negative infection, then that will complicate things indeed. In this situation since we know that a disease-positive test tells us that there is exactly one person in the group with the disease, but a disease negative test tells that either 0 or more than 2 people have the disease, we really only get knowledge out of disease positive tests. Say we test with groups of size k. (assume for now that k = 100 so that some calculations become much easier)

Then if we get a positive result, which happens with probability .01k*e^-.01k = e ^ -1 we can use binary search to isolate the single person with the disease in that group of size k which will take a total of log(100) (base 2) + 1 (about Cool

steps including the initial test. So doing this once through the entire population (dividing them into batches of size k), will take 8 * n / 100 steps, or 2/25 * n. We will successfully determine whether e^-1 * n are infected or not on average, and the rest we won't know anything about.

So after doing this, we are left with (on average) n * (1 - e^-1) people who tested negative for the disease, but still may be infected. Among the people that we were able to verify, 1/100 people are diseased, so the same ratio of infected people are left among the general population. So we do the exact same thing, choosing 100 random people per sample again. This time, we can reduce to n*(1-e^-1)^2 people left after an additional 8 * (1-e^-1) * n / 100 steps.

We quickly see that the total number of samples (if we choose k people at each step) needed is an infinite geometric series, which converges to n * 2e / 25, or about .217n. Now I don't think that choosing k = 100 is the best sample size per group. However, choosing different sample sizes will alter the proportion of the remaining population with the disease, and I am too lazy to do all that calculation right now, especially since the sample size k will probably need to change as the percentage of the population that is diseased changes.

I'm not sure im understanding the operating practices of the hospital though; I am right in saying that the hospital takes the same effort to process one sample of any size (even if it's just one person)?

« Last Edit: Dec 5^th, 2008, 10:09am by teekyman »

IP Logged

tomas
Newbie

Posts: 8

Re: infectious disease
« Reply #4 on: Dec 5^th, 2008, 5:01pm »

Quote

Modify

sorry for my ambiguous expression.
but the result is positive for either 1 infected sample or 0 infected sample in a k-sized sample ( in the first reply i assumed that u will understand that 0 infected sample always make the result correct).
Quote:

I am right in saying that the hospital takes the same effort to process one sample of any size (even if it's just one person)?

---> yes, u are right
one hint :even n --> infinity or a random large value of n gives us the same strategy and result , i mean the solution is independent of n .
anyways, nice try Roll Eyes

« Last Edit: Dec 5^th, 2008, 5:06pm by tomas »

IP Logged

tomas
Newbie

Posts: 8

Re: infectious disease
« Reply #5 on: Dec 6^th, 2008, 7:14am »

Quote

Modify

noone help me ? Huh

IP Logged

rmsgrey
Uberpuzzler

Gender:

Posts: 2874

Re: infectious disease
« Reply #6 on: Dec 6^th, 2008, 10:13am »

Quote

Modify

For clarification, what is the result of the test if:

a) there are 0 infected people
b) there is exactly 1 infected person
c) there are 2 or more infected people

in the sample?

IP Logged

tomas
Newbie

Posts: 8

Re: infectious disease
« Reply #7 on: Dec 6^th, 2008, 5:58pm »

Quote

Modify

a)the sample is positive
b)the sample is negative
here, if it is a sample of size 1 , u will know the infected person
if it is a sample of size m (m>=1), it will not inform the name of the infected person, only say "negative"
c) the sample is negative

IP Logged

rmsgrey
Uberpuzzler

Gender:

Posts: 2874

Re: infectious disease
« Reply #8 on: Dec 7^th, 2008, 9:09am »

Quote

Modify

In other words, the test tells you whether there is at least one infected person in the group tested rather than telling you whether there is exactly one (which is how people were reading the earlier statements)

IP Logged

tomas
Newbie

Posts: 8

Re: infectious disease
« Reply #9 on: Dec 7^th, 2008, 7:29pm »

Quote

Modify

sorry for my ambiguous expression in early posts.
in short ,
what u can get from the result is :
a/ positive : no infected person in the sample
b/ negative : at least 1 infected person in the sample

IP Logged

hawkxor
Newbie

Gender:

Posts: 5

Re: infectious disease
« Reply #10 on: Dec 7^th, 2008, 9:48pm »

Quote

Modify

tomas, you are still ambiguous about the abilities of the testing center.

is it correct that the center can process 10^4 samples each day, even if each of those samples consists of the blood of 1000 unique people?

IP Logged

teekyman
Full Member

Gender:

Posts: 199

Re: infectious disease
« Reply #11 on: Dec 8^th, 2008, 5:05am »

Quote

Modify

Just dealing with the total number of samples needed, (assuming nothing too funny happens with the processing of samples in the hospital once that gets cleared up), then we only gain information about the infected status of an individual person when either: that person's sampled in a batch of people that are tested positive OR that person is tested individually negative. To avoid testing everyone individually, our only hope to be faster is to perform batch tests of people and hope to clear large groups of people as free of disease at once before we resort to individual testing among a more concentrated pool.

If we test a batch of k people when the concentration of infected people is p, then the probability that they are cleared is (1-p)^k. the expected number cleared is k * (1-p) ^ k if k > 1. With a little calculus and algebra, we see this is maximized when k = -1/ln(1-p) ~ 1/p for small p. k as a function of p is a decreasing, concave up function until p = 1 - e^-2 (about .864), which tells us that greedily trying to maximize k until this point is probably is the right thing to do. After p = .3 or so, that expected value drops to below 1, and we want to start testing people individually anyways.

So the idea is to repeatedly:

1) Calculate the proportion p of remaining people in the population that are infected.
2) if this is below p = 1 - e^(-e^-1) (about .3) then test a population of size about k = -1/ln(1-p). If its above that value, then test single individuals.
3) If the group has at least one infected person, throw them back in the pool. If they are clear, set them aside.
3) repeat until everyone is tested.

I'm sure optimizations can be made with respect to rounding, and perhaps even with what to do with the people who test positive.

As for the total number of samples needed for this method: a lot. This will be a little annoying to calculate at best, and I will do it later, provided I am able to.

« Last Edit: Dec 8^th, 2008, 5:07am by teekyman »

IP Logged

rmsgrey
Uberpuzzler

Gender:

Posts: 2874

Re: infectious disease
« Reply #12 on: Dec 8^th, 2008, 12:16pm »

Quote

Modify

A naive solution (which I'm not even going to bother working out expected runtimes for):

For any infected sample, divide into two equal samples and test each separately. This will find k infected people in an initial sample of n in less than 2k log₂n tests

IP Logged

River Phoenix
Junior Member

Gender:

Posts: 125

Re: infectious disease
« Reply #13 on: Dec 8^th, 2008, 3:08pm »

Quote

Modify

on Dec 8^th, 2008, 12:16pm, rmsgrey wrote:

Suppose that we know that at least 1 from a sample of n people is infected:

Split in half, test each group. What is the probability that one of the groups is clean? I believe it is simply 2*0.99^(n/2). Then, the problem reduces to 1 instance of the the n/2 case. Otherwise, it reduces to 2 instances.

Let f(n) be the number of trials required for a sample of size n. Then f(2n) = (1 - 0.99^n) * 2 f(n)

Clearly f(2) = 2. It follows that:

f(2^n) = 2^n [1-0.99^2][1-0.99^4]...[1-0.99^(2^(n-1))]
= 2^n (1-0.99)^(n-1) [1+0.99][1+0.99^2]...[1+0.99^(2^(n-2))]
I believe that (1+a)(1+a^2)... converges, hence:

f(2^n) < C (0.02)^n

f(n) < C (0.02)^log₂n

So the probabilistic order of growth seems like it is actually very good.

« Last Edit: Dec 8^th, 2008, 4:49pm by River Phoenix »

IP Logged

tomas
Newbie

Posts: 8

Re: infectious disease
« Reply #14 on: Dec 8^th, 2008, 10:19pm »

Quote

Modify

rmsgrey and Phoenix are on the right track .However, so far no answer is correct.
You all make this one seem more complicated than it must be...
Let solve this extracted version then you can find the solution for the original one:
there are 100 people in the town and k infected people among them (1<=k<=100)
prove that : the number of tests required <=7k
(sorry if i made a mistake again, cause this is the first time i created a puzzle >_<)

IP Logged

River Phoenix
Junior Member

Gender:

Posts: 125

Re: infectious disease
« Reply #15 on: Dec 8^th, 2008, 11:52pm »

Quote

Modify

My calculation must be wrong by the way, the end equation is decreasingb

IP Logged

towr
wu::riddles Moderator
Uberpuzzler

Some people are average, some are just mean.

Gender: male

Posts: 13730

Re: infectious disease
« Reply #16 on: Dec 9^th, 2008, 1:34am »

Quote

Modify

on Dec 8^th, 2008, 10:19pm, tomas wrote:

Let solve this extracted version then you can find the solution for the original one:
there are 100 people in the town and k infected people among them (1<=k<=100)
prove that : the number of tests required <=7k

Do you know k to start with?

log₂ C(100,k)

« Last Edit: Dec 9^th, 2008, 1:34am by towr »

IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB

tomas
Newbie

Posts: 8

Re: infectious disease
« Reply #17 on: Dec 9^th, 2008, 2:04am »

Quote

Modify

Towr, of course ur inequation is correct
but let's try that with k=1
[log₂C(1,100)]=6
pls tell me a strategy that work with 6 times to find out that 1 infected person ?

hidden:

( i think the optimal solution is 7 here )

denote the times required =f
7k is just a general, loose sup value
ex :
k=1, f=7
k=2, f=13(not 14)

IP Logged

towr
wu::riddles Moderator
Uberpuzzler

Some people are average, some are just mean.

Gender: male

Posts: 13730

Re: infectious disease
« Reply #18 on: Dec 9^th, 2008, 2:12am »

Quote

Modify

on Dec 9^th, 2008, 2:04am, tomas wrote:

Towr, of course ur inequation is correct
but let's try that with k=1
[log₂C(1,100)]=6

I get 7.

means you round x up. ~6.6 rounds up to 7.

Quote:

hidden:

( i think the optimal solution is 7 here )

indeed it is.

However, the key question is, do we know k beforehand, or do we need to find it, also?

« Last Edit: Dec 9^th, 2008, 2:13am by towr »

IP Logged

Wikipedia, Google, Mathworld, Integer sequence DB

River Phoenix
Junior Member

Gender:

Posts: 125

Re: infectious disease
« Reply #19 on: Dec 9^th, 2008, 3:59am »

Quote

Modify

on Dec 8^th, 2008, 3:08pm, River Phoenix wrote:

Actually (1-0.99^n) goes to 1, so if you just take it to be the case that f(2n) < 2 f(n), then f(n) < n. Which is obvious.. so I think this whole strategy is a deadend. I think the reason is that in the steady state as n-> infinity and you start subdividing, both halves always have at least one infected person, so it's no better order of growth wise than checking all people separately.

As towr notes, you can do it in log₂C(n,k) given that you know k (via binary search over all possible sets of k).

There is an obvious duality between this and the strategy I analyzed which ends up taking n steps: the sum of log₂C(n,k) over all k equals n. Since they both use binary search, they are essentially the same strategy. The cleverness is to separate out the instances by k, but in sum we still haven't really beaten the naive strategy of checking each person separately.

This is not that insightful but I think interesting.

« Last Edit: Dec 9^th, 2008, 4:01am by River Phoenix »

IP Logged

Pages: 1

Notify of replies

Send Topic


« Previous topic \| Next topic »