Content-Type: text/shitpost


Subject: More reservoir sampling
Path: you​!your-host​!walldrug​!prime-radiant​!computer​!hal9000​!plovergw​!shitpost​!mjd
Date: 2018-02-14T12:18:42
Newsgroup: misc.reservoir-sampling
Message-ID: <27121b51754ded85@shitpost.plover.com>
Content-Type: text/shitpost

Here's yet another variation on reservoir sampling that I haven't seen before.

You have a stream that will produce !!2N!! items, one at a time as they are requested, but you don't know ahead of time what !!N!! is. You want to select exactly !!N!! of them at random, and you want each of the !! 2N\choose N!! possible subsets chosen equiprobably.

But you must do this with exactly !!N + O(1)!! memory. (Doing it with !!O(N)!! memory is easy: just suck the whole stream into a buffer.) Is this possible? I'm really not sure. If not, is it possible with !!N + O(f(N))!! for any sublinear !!f(N)!!?

If it is possible, probably it is not hard to generalize the method to selecting !!N!! of !!kN!! items for any positive !!k!!.

[ Thanks to Dfan Schmidt for pointing out that I did not originally ask for what I wanted. ]

[ Answer: Nope. ]