一個文件中有40億個整數,每個整數為四個字節,內存為1GB,寫出一個算法:求出這個文件里的整數里不包含的一個整數
下面的代碼片段僅僅是一個樣例。
4個字節的整數最大可表示為2^32=4294967296, 一個數一個數的讀入內存,建立一個bit map,共需要4294967296個bits(也就是0.5G字節的內存,并沒有超過1G內存的限制),讀入每一個數,置相應的bit為1。
1 int N = 20; // # of number
2 int M = 1000; // number range
3 std::vector<int> a(N); // can be imported from external file number by number
4 for (int i = 0; i < N; i++)
5 a[i] = (int)rand()%M;
6 std::copy(a.begin(), a.end(), std::ostream_iterator<int>(std::cout, " "));
7 std::cout << "\n";
8 // bit map setup for existence of each number
9 unsigned int nbytes = M%8 ? (M/8+1) : (M/8);
10 std::cout << "nbytes = " << nbytes << "\n";
11
12 char* p = new char [nbytes];
13 memset(p, 0, sizeof(char)*nbytes);
14
15 for (int i = 0; i < N; i++) {
16 unsigned int index = a[i]/8;
17 unsigned int bitpos = a[i]%8;
18 char* tmp = p+index;
19 *tmp |= 1 << bitpos;
20 //std::cout << "bit pos set to 1 : " << 8*index+bitpos << "\n";
21 }
22 for (int i = nbytes-1; i >= 0; i--) {
23 printf("%02X ", (char)*(p+i)&0xFF);
24 }
25 std::cout << "\n";
26 delete [] p;
27