• <ins id="pjuwb"></ins>
    <blockquote id="pjuwb"><pre id="pjuwb"></pre></blockquote>
    <noscript id="pjuwb"></noscript>
          <sup id="pjuwb"><pre id="pjuwb"></pre></sup>
            <dd id="pjuwb"></dd>
            <abbr id="pjuwb"></abbr>

            chaogu ---大寫的人!

            LOG-2011-04

             

            //============================================================

            //============================================================

            DATE:2011-4-12

            TIME:01:18

            ICBC.pdf –finish

            //============================================================

            //============================================================

            DATE:2011-4-15

            TIME:00:00

            Reading the “NoSQL Datebase”

               Reason for use NoSQL

            1. Avoidance of Unneeded Complexity

            2. High Throughput

            3. Horizontal Scalability and Running on Commodity Hardware

            4. Avoidance of Expensive Object-Relational Mapping

            5. Complexity and Cost of Setting up Database Clusters

            6. Compromising Reliability for Better Performance

            7. The Current “One size fit’s it all” Databases Thinking Was and Is Wrong

            8. The Myth of Effortless Distribution and Partitioning of Centralized Data Models

            9. Movements in Programming Languages and Development Frameworks

            10. Requirements of Cloud Computing

            11. The RDBMS plus Caching-Layer Pattern/Workaround vs. Systems Built from Scratch with Scalability in Mind

            12. Yesterday’s vs. Today’s Needs

            Nosqldbs.pdf ----page19

             

            //============================================================
            //============================================================
            DATE:2011-4-16

            TIME:00:24

            Reading the cudaArticle—05

            A multiprocessor takes four clock cycles to issue one memory instruction for a "warp"

            Accessing local or global memory incurs an additional 400 to 600 clock cycles of memory latency

            -----------------------------------

            Cuda Memory

            Registers:

            The fastest form of memory on the multi-processor.

            Is only accessible by the thread.

            Has the lifetime of the thread.

            Shared Memory:

            Can be as fast as a register when there are no bank conflicts or when reading from the same address.

            Accessible by any thread of the block from which it was created.

            Has the lifetime of the block.

            Global memory:

            Potentially 150x slower than register or shared memory -- watch out for uncoalesced reads and writes which will be discussed in the next column.

            Accessible from either the host or device.

            Has the lifetime of the application.

            Local memory:

            A potential performance gotcha, it resides in global memory and can be 150x slower than register or shared memory.

            Is only accessible by the thread.

            Has the lifetime of the thread.

             

            // includes, system
            #include <stdio.h>
            #include <assert.h>
             
            // Simple utility function to check for CUDA runtime errors
            void checkCUDAError(const char* msg);
             
            // Part 2 of 2: implement the fast kernel using shared memory
            __global__ void reverseArrayBlock(int *d_out, int *d_in)
            {
                extern __shared__ int s_data[];
             
                int inOffset = blockDim.x * blockIdx.x;
                int in = inOffset + threadIdx.x;
             
                // Load one element per thread from device memory and store it 
                // *in reversed order* into temporary shared memory
                s_data[blockDim.x - 1 - threadIdx.x] = d_in[in];
             
            // Block until all threads in the block have written 
            //their data to shared mem
                __syncthreads();
             
                // write the data from shared memory in forward order, 
                // but to the reversed block offset as before
             
                int outOffset = blockDim.x * (gridDim.x - 1 - blockIdx.x);
             
                int out = outOffset + threadIdx.x;
                d_out[out] = s_data[threadIdx.x];
            }
             
            ////////////////////////////////////////////////////////////////////
            // Program main
            ////////////////////////////////////////////////////////////////////
            int main( int argc, char** argv) 
            {
                // pointer for host memory and size
                int *h_a;
                int dimA = 256 * 1024; // 256K elements (1MB total)
             
                // pointer for device memory
                int *d_b, *d_a;
             
                // define grid and block size
                int numThreadsPerBlock = 256;
             
            // Compute number of blocks needed based on array size 
            //and desired block size
                int numBlocks = dimA / numThreadsPerBlock; 
             
                // Part 1 of 2: Compute the number of bytes of shared memory needed
                // This is used in the kernel invocation below
                int sharedMemSize = numThreadsPerBlock * sizeof(int);
             
                // allocate host and device memory
                size_t memSize = numBlocks * numThreadsPerBlock * sizeof(int);
                h_a = (int *) malloc(memSize);
                cudaMalloc( (void **) &d_a, memSize );
                cudaMalloc( (void **) &d_b, memSize );
             
                // Initialize input array on host
                for (int i = 0; i < dimA; ++i) {
                    h_a[i] = i;
                }
             
                // Copy host array to device array
                cudaMemcpy( d_a, h_a, memSize, cudaMemcpyHostToDevice );
             
                // launch kernel
                dim3 dimGrid(numBlocks);
                dim3 dimBlock(numThreadsPerBlock);
            reverseArrayBlock<<< dimGrid, dimBlock, sharedMemSize >>>( d_b, d_a );
             
                // block until the device has completed
                cudaThreadSynchronize();
             
                // check if kernel execution generated an error
                // Check for any CUDA errors
                checkCUDAError("kernel invocation");
             
                // device to host copy
                cudaMemcpy( h_a, d_b, memSize, cudaMemcpyDeviceToHost );
             
                // Check for any CUDA errors
                checkCUDAError("memcpy");
             
                // verify the data returned to the host is correct
                for (int i = 0; i < dimA; i++){
                    assert(h_a[i] == dimA - 1 - i );
                }
             
                // free device memory
                cudaFree(d_a);
                cudaFree(d_b);
             
                // free host memory
                free(h_a);
             
            // If the program makes it this far, 
            //then the results are correct and
                // there are no run-time errors. Good work!
                printf("Correct!\n");
             
                return 0;
            }
             
            void checkCUDAError(const char *msg)
            {
                cudaError_t err = cudaGetLastError();
                if( cudaSuccess != err) 
                {
                    fprintf(stderr, "Cuda error: %s: %s.\n", msg, 
                                      cudaGetErrorString( err) );
                    exit(EXIT_FAILURE);
                }                         
            }

             

            //============================================================

            TIME:01:16

            Finsh reading the cudaArticle 06

             

            //============================================================

            DATE:2011-4-23

            TIME:09:31

            Reading berkeley view on cloud computing

               Page 10 classes of utility computing

             

            //============================================================

            DATE:2011-4-24

            TIME:00:16

            Reading Makefile.pdf

             

            --------------------------------------------------------------

            List macros specified by defalut(Makefile)

               Using : make –p

            $@ name of target

            $? List of dependents

            $^ gives all dependencies,whether more recent than the target

            $+ same as above,but keep the duplicate names

            $< the first dependencies

             

            --------------------------------------------------------------

            Reading berkeley view on cloud computing

               Page 19 Number 5 Obstacle: Performance Unpredictability

             

            //============================================================

            //============================================================

            DATE:2011-4-25

            TIME:01:40

            Finish reading Berkeley view on cloud computing

             

            //============================================================

            //============================================================

            DATE:2011-4-28

            TIME:21:22

            Coding the motion project

            The Visual Studio 2005 return an error that stack overflow

            “Unhandled exception at 0x00439a57 in motion.exe: 0xC00000FD: Stack overflow.”

             

            --------------------------------------------------------------

            'motion.exe': Unloaded 'C:\WINDOWS\WinSxS\x86_Microsoft.VC80.CRT_1fc8b3b9a1e18e3b_8.0.50727.4053_x-ww_e6967989\msvcr80.dll'

            'motion.exe': Unloaded 'C:\WINDOWS\system32\psapi.dll'

            'motion.exe': Unloaded 'C:\WINDOWS\system32\shimeng.dll'

            First-chance exception at 0x00439a57 in motion.exe: 0xC00000FD: Stack overflow.

            Unhandled exception at 0x00439a57 in motion.exe: 0xC00000FD: Stack overflow.

            The program '[2388] motion.exe: Native' has exited with code 0 (0x0).

            --------------------------------------------------------------

            Problem: using huge big objet

             

            //============================================================

            //============================================================

            DATE:2011-4-30

            TIME:01:40

            Coding CSE332 project 2

               Adding other data-counter Implementations

             

            posted on 2011-05-03 21:57 chaogu 閱讀(648) 評論(0)  編輯 收藏 引用

            導航

            <2011年5月>
            24252627282930
            1234567
            891011121314
            15161718192021
            22232425262728
            2930311234

            統計

            常用鏈接

            留言簿(1)

            隨筆檔案

            搜索

            最新評論

            閱讀排行榜

            評論排行榜

            久久精品无码专区免费| 久久国产成人精品国产成人亚洲| 囯产极品美女高潮无套久久久 | 国内精品久久九九国产精品| 久久精品一区二区三区不卡| 久久免费99精品国产自在现线| 日本精品久久久久影院日本| av无码久久久久不卡免费网站| 久久精品无码一区二区三区免费| 日本久久久久亚洲中字幕| 久久久艹| 国内精品欧美久久精品| 久久久久国产精品熟女影院| 亚洲国产日韩欧美久久| 国产福利电影一区二区三区久久老子无码午夜伦不 | 大美女久久久久久j久久| 色欲久久久天天天综合网精品| 久久久久免费视频| 久久se精品一区精品二区| 亚洲AV无码久久精品色欲| 亚洲国产成人精品91久久久| 大蕉久久伊人中文字幕| 国产99久久精品一区二区| 久久精品蜜芽亚洲国产AV| 亚洲精品无码成人片久久| 中文字幕乱码久久午夜| 国产精品99久久久久久宅男小说| 精品国产乱码久久久久久浪潮| 久久精品国产秦先生| 一本久久久久久久| 93精91精品国产综合久久香蕉| 久久亚洲高清观看| 国产成人香蕉久久久久| 国产亚洲美女精品久久久| 国产激情久久久久影院老熟女| 国产精品永久久久久久久久久 | 久久91精品国产91久久麻豆| 久久久青草久久久青草| 国产毛片久久久久久国产毛片| 精品久久人人做人人爽综合| 日本久久中文字幕|