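I have recently been fighting it out in BF2142 over vLan and got thoroughly hooked, so I started reading up on the engine behind the BF series; all I knew was that its scripting layer is written in Python. I came across this short interview on a foreign site and have translated it here for reference.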

Continuing our series of occasional interviews with game developers about current and upcoming hardware and game graphics engines, we chat with Marko Kylmamaa, senior graphics programmer for Digital Illusions' Canadian studio.

This issue's interviewee is Marko Kylmamaa, senior graphics programmer at DICE.

FiringSquad: First, Intel and AMD are pushing dual core processors and within the next year four core processors are due to be released. How will DICE support this kind of tech in the Battlefield 2/2142 engine and will there be any need for special programming to fully support multi core CPUs in PCs?

Question: Intel and AMD are both pushing dual-core CPUs, and quad-core CPUs are due within the next year. How will DICE support this technology in the BF2 engine, and does doing so require any special programming techniques?

Marko Kylmamaa: While a program geared towards a single-core machine may run fine, with some exceptions, and perhaps even somewhat faster on a multi-core machine, in order to realize the real performance benefits careful attention has to be paid to structuring the code with the correct granularity in mind, to make it suitable for multi-core execution. With the introduction of the next generation consoles and the PC hardware, the whole industry is in a learning phase for understanding the differences between the traditional multi-threading approaches, and multi-threading for multiple cores. DICE is working closely with hardware vendors in making sure that all of the future titles make the maximum use of the available multi-core architecture.

Answer: A program written for a single-core machine may run fine on a multi-core machine, perhaps even somewhat faster. The real gains, however, only come when the code is structured at the right granularity and synchronization is handled correctly, which is harder than single-core work (much like the pains of multithreading). With the next generation of hardware arriving, the whole industry is learning how multithreading for multiple cores differs from traditional multithreading. DICE is working closely with hardware vendors to get the most out of multi-core architectures.

FiringSquad: The 64-bit CPU has taken longer to really appear in mainstream PCs than some people expected. Do you think 64-bit CPUs will become more popular and how does DICE support it in their Battlefield 2/2142 engine?

Question: 64-bit CPUs have taken longer to reach mainstream PCs than people expected. Do you think 64-bit CPUs will catch on, and how does DICE support them in the BF2 engine?

Marko Kylmamaa: One of the problems with harnessing the full power of 64-bit CPUs is the lack of adoption of 64-bit operating systems. Due to this it's difficult for the game developers to make full use of the 64-bit execution potential without providing a separate set of executables compiled for the different operating systems. The current Battlefield 2 technology has been thoroughly tested on the 64-bit architecture for guaranteeing a solid performance, and optimizations have been made where possible with such architectures in mind.

Answer: Because 64-bit operating systems have not been widely adopted, the performance of 64-bit CPUs cannot yet be fully exploited. The difficulty is that this potential cannot be used without shipping separate executables for the different operating systems. BF2 has been tested on 64-bit platforms and optimized for them where possible.

FiringSquad: Game physics are getting more and more attention as well with more attention being put into destructible objects and better collisions. Where does DICE stand on this kind of support for its engine and what solution is best; having a dedicated card (AGEIA) using a graphics card (ATI/Havok) or using a CPU to handle it?

Question: Game physics is getting more and more attention. Where does DICE stand on it, and which solution do you think is best: a dedicated AGEIA physics card, a graphics card (ATI/Havok), or the CPU?

Marko Kylmamaa: Especially with multiplayer games in mind, it is difficult to make use of scaleable physics, since especially from the gameplay perspective all of the players must experience the same end result in simulation regardless of their hardware. This leads to a lot of the scalability of the physics being used for visual effects such as richer particle effects or fluid simulation. The GPU can of course be used for offloading the physics simulation from the CPU, but this will compete with the remaining processing time for graphics. Therefore in most cases it is necessary to strike the right balance between the CPU and GPU usage with the needs of the particular game in mind. The next generation technology at DICE is being built on the bleeding edge and will make use of very comprehensive physical modeling.

Answer: Scalable physics is hard to use in multiplayer games, because from the gameplay point of view every player must see the same simulation result regardless of the hardware they run. As a result, much of the scalability goes into visual effects such as richer particle systems or fluid simulation. The GPU can take some physics simulation off the CPU, but that competes with graphics for precious processing time, so the load between CPU and GPU has to be balanced for each game. DICE's next-generation technology will make use of very comprehensive physical modeling.

FiringSquad: HDR lighting is also getting a lot of attention in more PC games. How does the Battlefield 2/2142 engine support those features and how will that help the graphics in games that use it?

Question: HDR lighting is also being mentioned more and more. How does the BF2/2142 engine support it, and how will it improve the look of games that use it?

Marko Kylmamaa: HDR lighting can add significantly to the perceived realism in the modern graphics engines. It is becoming an increasingly common feature as the new hardware supports full floating point surfaces and has the required processing power for supporting a multitude of such high end features.
Some aspects of the HDR lighting were simulated especially in the Battlefield 2 Expansion Pack: Special Forces, for adding a degree of realism to the night-time look. The effect is fairly subtle and was used mainly for fine tuning the overall look. Battlefield 2142 does not have night-time levels, so the same technology was not applicable to it, however there are a great number of special lighting effects for enhancing the desired futuristic look of the game.

Answer: HDR lighting adds noticeably to the perceived realism of a modern graphics engine. It is becoming common now that new hardware fully supports floating-point surfaces and has the processing power such high-end features require. Some aspects of HDR were simulated in the BF2 expansion Special Forces to make the night-time look more realistic. BF2142 has no night-time levels, so that technique (presumably HDR) was not used, but many other special lighting effects enhance the game's futuristic look.

FiringSquad: More and more games are using extensive pixel and vertex shading for visual and art effects. How does the Battlefield 2/2142 engine support these features currently and how will pixel and vertex shaders be used in the future, particularly with Windows Vista and DirectX10 support?

Question: More and more games make extensive use of pixel and vertex shaders to improve their visuals. How does the BF2/2142 engine support these features, and how will pixel and vertex shaders be used in the future, especially with Vista and DX10 arriving?

Marko Kylmamaa: The Battlefield 2 engine has been built on the DirectX9 architecture and is a fully shader based model. This allowed for a great flexibility during the development, and not supporting the older fixed function pipeline model allowed us to concentrate solely on the high end features. Battlefield 2142 is based on the improved Battlefield 2 technology and will be released later this year, so considering that the DirectX10 hardware won't be widely available just yet, it hasn't been beneficial to re-architect the engine into a DirectX10 based model for this release. This allowed the available time to be used for adding a number of new special effects and polishing the overall look of the existing engine.

Answer: The BF2 engine is built entirely on the DX9 architecture and is a fully shader-based model. That gave us great flexibility during development, and dropping the fixed-function pipeline let us concentrate on high-end features. BF2142 is based on improved BF2 technology and will ship soon; since DX10 hardware will not be widespread that quickly, re-architecting the engine for DX10 was not worthwhile for this release. The time went instead into adding new special effects and polishing the existing engine.

FiringSquad: What other advanced hardware and graphical features do you think will be supported in upcoming Battlefield 2/2142 engine games and in future graphics engine?

Question: Which other advanced hardware and graphics features do you think the BF2/2142 engine will support, and what about future engines?

Marko Kylmamaa: Battlefield 2142 will support a large range of high end special effects geared towards creating the desired futuristic look. These involve for example new atmospheric effects for creating a unique look that is quite different from Battlefield 2.

Answer: BF2142 supports many special effects aimed at the futuristic look we want. For example, its new atmospheric effects give it a look quite different from BF2.

FiringSquad: Finally, Mark Rein from Epic has said that Intel is hurting the PC gaming industry through its use of integrated graphics in PCs. Is this a real threat and if so what can be done about this from the game developer's side?

Question: Finally, Mark Rein of Epic (don't tell me you haven't heard of them; UT2007 is coming soon) has said that Intel is hurting the PC games industry with its integrated graphics. How do you see this from a game developer's point of view?

Marko Kylmamaa: Intel produces what you could call the ultra-low end graphics cards for a market segment that typically doesn't wish to invest the money into a higher end, gaming geared hardware. Clearly there is a demand for this type of hardware as Intel's graphics cards boast a large user base. However, this does impose challenges for the games industry in our attempts at reaching especially for the casual gamer market. Hardware requirements for the next generation games keep growing faster than what is needed for running general applications, which increases the rift between the casual and hardcore hardware markets. I believe that we as an industry will also have to recognize the different requirements these markets impose.
From the perspective of a developer, it can be difficult or in some cases practically impossible to make the high-end game run on the ultra-low end hardware. Supporting such scalability range in performance could be prohibitive with the required development time and cost in mind. It is ultimately up to each developer to find the correct range of hardware which allows for the desired market penetration.

Answer: Intel makes what you could call ultra-low-end graphics hardware for buyers who do not want to spend money on a gaming machine, and that hardware clearly has a large user base. It does make the casual-gamer market harder for us to reach. Games' hardware requirements keep growing faster than those of general applications, which widens the gap between the casual and hardcore hardware markets, and I believe the industry will have to recognize this. From a developer's point of view, running a high-end game on ultra-low-end hardware is very hard and sometimes practically impossible, because supporting such a wide performance range costs too much development time and money. In the end each developer has to choose the range of hardware that fits the market they want to reach.

posted @ 2006-11-10 11:44 周波

Gnawing Through Books

I recently found a few good books, read them closely, and liked them a great deal, so I am sharing them here.

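八十年代訪談錄 (Interviews on the Eighties) (download)

I was born in the late eighties and have no standing to judge that era. Most of us are still shaping our own future in our own time, even though few people know that we live worse than pigs and dogs amid lies and brute power, without even the bearing of a human being. Every sigh asks me the same thing: lie down, or press on? Mastering technology may be important, but one old line closes off the next excuse this defense could reach for: "茍利國家生死以，豈因禍福避趨之" (if it benefits the country I will give my life to it; how could I shrink from it for fear of misfortune or chase it for gain). Whenever I see today's schoolchildren growing up in such a filthy environment, I console myself that each generation has its own work to do.
Face our predecessors squarely, face ourselves, face our society, this small tribe.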

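歷史的終結 (The End of History) (download)

Is history just the same play staged over and over? Planes and artillery have replaced spears and broadswords, but whatever new clothes massacre and tyranny wear, that is only the surface; people keep repeating the miserable fate of "興，百姓苦；亡，百姓苦" (when a dynasty rises the common people suffer; when it falls the common people suffer). Can ease, peace and prosperity be secured once and for all? Things now seem to be heading that way: nobody is willing to start a war or a revolution any more, because the cost is too high and the danger too great. But that also means glorious history may come to an end, and the romance of campaigning that men long for may be buried under an endless cycle of going to work and coming home, becoming pure fantasy. From then on we would enter an age of democracy and freedom, of ever-growing capital and ever-richer material life, as long as someone still does science and technology. Yet I believe that moment may also be when darkness and upheaval arrive.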

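思想的歷險，與大師對話 (Adventures of Thought: Conversations with the Masters) (online at Sina)

Our country lacks masters. Few economists can face adversity without fear, few musicians command the whole historical repertoire, and writers no longer have a distinctive view of life. What we lack is masters; what we lack even more is the perseverance and courage it takes to become one.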

posted @ 2006-11-10 10:02 周波

A Small Talk about a General Computing Platform

When Brook Meets ICE
Bosch Chou (zhoubo22@hotmail.com)

As we have seen, distributed communication technologies such as CORBA, DCOM, and even Java have been widely used in some corners of the world. All of them can be used for RPC, distributed computing, and other business and scientific applications.
Let's look at how PC hardware has developed. CPUs are becoming much faster and much cheaper than ever before. The same is true of the GPU, or more generally the card we call the display adapter. Since NVIDIA released its new generation of graphics cards, the GeForce series, in 1999, performance has kept being pushed upward, and next year DX10 cards will be on the market. The graphics card can now do vertex transform and lighting in place of the CPU. This is great progress for both CPU and GPU. How can we use these rich SIMD resources? It is easy to see why we will focus on the GPU.
Calm down: what is the platform we want?

Cross operating system
Cross network
Cross hardware: this is the key problem I am trying to solve.

Of the properties listed here, all except the last have already been solved by existing techniques. So how do we solve the last one? I found two treasures. ICE, the Internet Communications Engine, is similar to classic CORBA but much easier to use. Brook, from Stanford University, has been developed for years and is designed for GPU stream computing. Both are used the same way: a front-end compiler translates the source definitions into C++. We then add the generated .h and .cpp files to our project and code against the interface.
The way a client passes its call to the server is shown below.

The client passes the data to be computed to the interface declared on both sides.
The server receives the data, computes on it, and passes the results back to the client.
The client receives the results and continues with its own work.
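The three steps above can be sketched in a single process. This is only a minimal sketch: the Request and Response types and the functions server_add and client_sum_of_add are hypothetical names invented for illustration, and a plain function call stands in for the network hop that ICE would provide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical message types; in a real system an IDL compiler would generate them.
struct Request  { std::vector<float> a, b; };
struct Response { std::vector<float> c; };

// Step 2: the "server" receives the data, computes on it, and returns the result.
Response server_add(const Request& req) {
    assert(req.a.size() == req.b.size());
    Response res;
    res.c.reserve(req.a.size());
    for (std::size_t i = 0; i < req.a.size(); ++i)
        res.c.push_back(req.a[i] + req.b[i]);
    return res;
}

// Steps 1 and 3: the client hands its data to the shared interface,
// then carries on with its own work using the returned result.
float client_sum_of_add(const std::vector<float>& a, const std::vector<float>& b) {
    Response res = server_add(Request{a, b});
    float total = 0.0f;
    for (float v : res.c) total += v;
    return total;
}
```

Calling client_sum_of_add with two equally sized vectors returns the sum of their element-wise additions; swapping server_add for a remote call would not change the client code.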

The problem is that these are two kinds of IDL: one for Internet applications, the other for local GPU stream computing. Moreover, ICE has no stream data type. This sounds like C++ metaprogramming, but the two are quite different. So does this mean we must define a new IDL? Let's check the tools we already have.
In fact, the most important thing is the base model. ICE supports a type called "sequence", which is mapped to an STL container in C++. It can be considered the base data type of the language we would have to invent. When a client sends a request and the server accepts it, the client sends the data wrapped in this container, which is rebuilt in the server's memory as a texture structure. After the server has prepared all the textures containing the data to be computed, it calls the graphics API and uses shaders to compute on the data. The whole process is illustrated as follows.
For example, suppose we write these IDL declarations.

GPU Interface Foo
{
    Add([in] float a<>, [in] float b<>, [out] float c<>) {
        /* some stuff */
    }
}

CPU Interface Bar
{
    Add([in] float a[], [in] float b[], [out] float c[]) {
        /* some stuff */
    }
}

We declared two interfaces. Note that "GPU" and "CPU" are keywords here: they mark where the interface is meant to run, one on the traditional CPU and the other on the GPU.

// On server side
// verify the validity of data
vector<float> tex1;
vector<float> tex2;
vector<float> result;

// pass the containers by reference to avoid copying them on the stack
void Add(const vector<float>& tex1, const vector<float>& tex2, vector<float>& result)
{
    GLfloat* Tex1Ptr = new GLfloat[tex1.size()];
    /* some stuff as above, convert container to texture structure */
    GLuint hTex1; glGenTextures(1, &hTex1);
    glTexImage2D(/**/, Tex1Ptr); // upload the data into memory as texture
    glUseProgram(g_hArithmetic);
    /* Draw something to get all the data out, a rectangle etc. */
    delete[] Tex1Ptr; // the texture upload has copied the data
}
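The "convert container to texture structure" step can be sketched as follows. This is only a minimal sketch under stated assumptions: the helper names texture_side and pack_for_texture are hypothetical, the texture is assumed to be square with a fixed number of float channels per texel, and no OpenGL call is made; the padded buffer is simply what would later be handed to the upload call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Smallest side length such that side * side texels, each holding
// `channels` floats, can store `count` values.
std::size_t texture_side(std::size_t count, std::size_t channels) {
    assert(channels > 0);
    std::size_t texels = (count + channels - 1) / channels; // round up
    std::size_t side = 0;
    while (side * side < texels) ++side;
    return side;
}

// Copy the values into a zero-padded buffer laid out for a square texture.
std::vector<float> pack_for_texture(const std::vector<float>& values,
                                    std::size_t channels) {
    std::size_t side = texture_side(values.size(), channels);
    std::vector<float> buffer(side * side * channels, 0.0f);
    std::copy(values.begin(), values.end(), buffer.begin());
    return buffer;
}
```

Padding with zeros keeps the upload rectangular; the receiver has to remember the original element count so it can ignore the padding when reading results back.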

If you are familiar with GL programming, you will ask, "Why not add glFlush and a buffer swap above?" In fact, that is the key to this whole article. If we only needed 1 + 1, we would not need the GPU at all. But people are always greedy. What should we do if we want the GPU to compute π for us? Suppose we want to compute π to 16 million digits, but the GPU's texture unit can only hold a floating-point texture of 4096x4096. When the GPU swaps buffers, we must move all the data from the framebuffer to disk, save it, and then let the GPU continue computing. But how? I checked the OpenGL and D3D manuals and found nothing useful. So I thought of several ways to tackle this key problem.

Next-generation hardware architecture in which the CPU integrates the GPU; I think AMD and ATI will do this.
Improve the current APIs and drivers so that they support operating on the SIMD registers directly.
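The 4096x4096 limit mentioned above can be made concrete with a little arithmetic. The sketch below is an illustration only: texture_capacity and passes_needed are hypothetical helper names, and it assumes each texel stores a fixed number of float channels and that data exceeding one texture is processed in successive passes.

```cpp
#include <cassert>
#include <cstdint>

// Number of float values one max_side x max_side texture can hold.
std::uint64_t texture_capacity(std::uint64_t max_side, std::uint64_t channels) {
    return max_side * max_side * channels;
}

// Number of texture-sized passes needed to process `count` values.
std::uint64_t passes_needed(std::uint64_t count, std::uint64_t max_side,
                            std::uint64_t channels) {
    std::uint64_t capacity = texture_capacity(max_side, channels);
    assert(capacity > 0);
    return (count + capacity - 1) / capacity; // round up
}
```

With one channel, a 4096x4096 texture holds 16,777,216 values, so 16 million values just fit in one pass; anything larger needs intermediate results to be read back and saved between passes, which is exactly the step the article says current APIs make awkward.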

That is all I wanted to say about one particular aspect of distributed computing: how to use the GPU for computation the way we use the CPU. If this can be implemented one day, I think modern science will benefit greatly from it.

References:
ICE, Internet Communications Engine, ZeroC, Inc., http://www.zeroc.com/
Brook, Stanford University, http://sf.net/projects/brook
NVIDIA Developer Zone, http://developer.nvidia.com/
OpenGL official site, http://www.opengl.org/

posted @ 2006-10-28 11:58 周波

GPU Gems 3 Is Coming

Today I went looking for the WGL specification and unexpectedly came across the call for contributions to GPU Gems 3.

http://developer.nvidia.com/object/gpu-gems-3-call-for-participation.html

GPU Gems 3 Call for Participation

Following the success of GPU Gems and GPU Gems 2, NVIDIA has decided to produce a third GPU Gems volume to showcase the best new ideas and techniques for the latest programmable GPUs. We were honored that GPU Gems won the 2004 Game Developer Front Line Award and that GPU Gems 2 was a Finalist in the 2005 Game Developer Front Line Awards. What’s more, GPU Gems and GPU Gems 2 were the best-selling books at the Game Developer Conference and SIGGRAPH in their respective years.

This latest GPU Gems will, like previous volumes, be hardbound and in full color. Tentatively titled GPU Gems 3, it will be edited by Hubert Nguyen, Manager of Developer Education at NVIDIA. Nguyen contributed to previous GPU Gems volumes and brings to this role vast experience in the field of computer graphics. Section editors include a team of expert NVIDIA engineers: Cyril Zeller, Evan Hart, Ignacio Castaño, Kevin Bjorke, Kevin Myers, and Nolan Goodnight.

NVIDIA is looking for innovative ideas from developers who are using GPUs in new ways to create stunning graphics and cutting-edge applications. GPU Gems 3 will present techniques and ideas that are broadly useful to GPU programmers and that can be integrated into their applications. And, it will continue the tradition of featuring chapters exploring non-graphics applications of the computational capabilities of GPU hardware (learn more at www.GPGPU.org). Because our goal is to provide a comprehensive set of authoritative and practical chapters, we strongly suggest submitting ideas about techniques that you have already developed and tested.

If you would like to contribute to the GPU Gems series, please read the following submission guidelines. The deadline for proposal submissions is Monday, December 11, 2006. If your proposal is accepted, you will receive additional time to complete the chapter.

Guidelines for Chapter Proposal

Each chapter proposal should meet the following qualifications:

Subject. Your chapter can be about any topic related to applying GPUs in useful and compelling ways. For example, you may choose to write about a specific shader or technique for rendering an interesting effect, or you could write about a strategy for integrating shaders into a game engine. Or, you might discuss an interesting way to apply the GPU’s horsepower in a non-graphics area. The main requirement is that your subject has practical value for the community and that you are committed to writing a clear, concise, and informative chapter.

Submission. Send an e-mail to articlesubmissions@nvidia.com with your proposed chapter title as the subject line, and a concise chapter description in the e-mail body (preferably no more than 300 words). To increase your chances of acceptance, we recommend that the description include screenshots or movies that demonstrate the technique in action. Ultimately, you must be able to provide a working program that demonstrates your technique. Complete source code is not necessarily required, though a self-contained example will be a plus.

Deadline. We will be working on an aggressive schedule, so you must submit your proposal by Monday, December 11, 2006.

Notifications will be sent out by the end of the year. If your proposal is accepted, we will contact you via e-mail and discuss our expectations for the full chapter, as well as the next steps in the process. To assist you in finalizing your chapter, we will create your figures and provide copyediting services.

Final Chapter Information

Length. The final chapters should range from five to twenty pages of formatted book pages. This requirement accounts for figures, code samples, and page layout, so there would be approximately 200 to 300 words per page. In some cases, we may accept chapters that are shorter or longer than the suggested length, depending on the content. A chapter does not have to be long or complicated to be accepted. In fact, an idea that is simple and compelling is more likely to be accepted.

Rights. You must have the right to publish your work, code and images (diagrams and screenshots).

We look forward to reading your submissions.

Thinking of China's book market, I can only sigh. When will books of this quality appear here? Being abroad really is nice.

posted @ 2006-10-28 11:41 周波

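What Else Can a GPU Do: Brook for GPUs, Stream Computing on GPUs

I have been studying GPGPU for a while now; this time last year I was learning GLSL. Some time ago I posted a suggestion on opengl.org that GLSL should learn from the architecture of Cg and CgFX instead of being used as scattered pairs of shaders. You can of course write a class to wrap them yourself, but once there are many shaders, managing them becomes a real headache. GLSL should follow the HLSL and Cg approach of rendering by selecting a technique and a pass, which also fits the concept of multi-pass rendering.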

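The GPU's SIMD performance is extremely strong, far beyond the CPU's, and this brings exceptional floating-point throughput.

[Figure: floating-point performance of GPUs versus CPUs, from SIGGRAPH 2004]

Aside: I wonder where my 6200A would rank, haha.

The chart is actually somewhat out of date: it is excerpted from SIGGRAPH 2004, and the SIMD performance (as opposed to gaming performance) of today's ATI X1800 XT is already well beyond the 6800's. Still, it shows that GPU floating-point performance being several times that of the CPU is an indisputable fact. But how do we make use of it?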

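The arrival of programmable hardware has given us a good start, and perhaps the future trend of computer hardware is generic computing (GC, a term I coined; not garbage collection). Graphics cards have always dealt with pixels: reading texels, processing primitives, writing to the framebuffer, which lays a solid foundation for SIMD applications. Graphics chips were designed to be parallel from the start, since only then is reading texels from the texture units efficient. The TNT in the famous Riva TNT2 actually stands for TwiN Texel, meaning dual texturing, not the explosive. The GeForce3 implemented vertex programming by adding a few expensive registers. NVIDIA acquired 3dfx, shaders arrived on the PC along with DX8, and the NV30 series followed, opening the way to a leap in PC image quality. Most popular games today use programmable shading to achieve effects that used to be possible only on workstations, which is why real-time game footage now looks better than the FF8 cinematics that Square rendered on clusters of SGI workstations back in the day. In fact, advanced computer graphics theory was already quite mature by the 1980s, for example shadow mapping from 1978 and Whitted's ray tracing. I will introduce those techniques bit by bit later. In the meantime, you may want to download an SDK from NVIDIA and study it; the MS DX SDK is also a must.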

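First, the limitations of today's programmable hardware for general-purpose computing. In my view this limitation may remain unsolved even after Vista and DX10 become widespread: it is the API problem. The drivers provided by graphics vendors are, without exception, built entirely for display, not for advertising the card as a GPGPU. They do have their own native compilers (mainly for compiling GLSL source strings; HLSL can be precompiled and then loaded and executed by the driver), but these still do not serve computation on non-graphics data. So I found Sh. Sh is an interesting thing: it uses metaprogramming to express shading-language algorithms, which are translated into the corresponding low-level assembly instructions when compiled. Many graphics slide decks use Sh when presenting core algorithms. Those interested can take a look at the libSh project. I strongly suggest that graphics vendors release drivers that can compute directly, without involving the framebuffer, writing results to memory straight over the bus. Technically this is not hard; it may be a business problem. At critical moments it is always business that steers technology, and the wishful thinking of engineers does not steer the world; this is no longer the age of the Industrial Revolution.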

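Let me introduce Brook from Stanford University (this sounds like an advertisement, but in the shading-language world the Stanford Shading Language has its place). Brook can be thought of as a C compiler, except that what it produces is not a binary but C++ source containing arrays of shader program strings. For example, here is a piece of Brook code; a simple alpha blend, or rather, not quite, but it will do:

kernel void saxpy(float alpha, float4 x<>, float4 y<>,
                  out float4 result<>) {
    result = (alpha * x) + y;
}

Compiled into the final C++ code, it becomes: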

static const char* __saxpy_fp30[] = {
"!!FP1.0\n"
"DECLARE alpha;\n"
"TEX R0, f[TEX0].xyxx, TEX0, RECT;\n"
"TEX R1, f[TEX1].xyxx, TEX1, RECT;\n"
"MADR o[COLR], alpha.x, R0, R1;\n"
"END \n"
"##!!BRCC\n"
"##narg:4\n"
"##c:1:alpha\n"
"##s:4:x\n"
"##s:4:y\n"
"##o:4:result\n"
"##workspace:1024\n"
"##!!multipleOutputInfo:0:1:\n"
"",NULL};
void saxpy (const float alpha,const ::brook::stream& x,const ::brook::stream& y,
::brook::stream& result) {
    static const void *__saxpy_fp[] = {"fp30", __saxpy_fp30, "ps20", __saxpy_ps20,
                    "cpu", (void *) __saxpy_cpu, NULL, NULL };
    static __BRTKernel k(__saxpy_fp);
    k->PushConstant(alpha);
    k->PushStream(x);
    k->PushStream(y);
    k->PushOutput(result);
    k->Map();
}

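Isn't this just a pure shading language? What is worth noting, though, is that Brook wraps everything in a runtime library and treats the GPU as a streaming processor that is controlled by the CPU, computes on the data, and outputs the results. For now it seems to be limited to graphics-style computation, with demos such as FFT and ray tracing, and has not reached the point of being able to compute pi.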
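For comparison, here is what the saxpy kernel computes, written as an ordinary CPU loop. This is only a reference sketch, not Brook's generated CPU fallback: the float4 alias built on std::array and the vector-based signature are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for Brook's float4: four floats per stream element.
using float4 = std::array<float, 4>;

// result[i] = alpha * x[i] + y[i], applied to every element of the streams.
std::vector<float4> saxpy(float alpha,
                          const std::vector<float4>& x,
                          const std::vector<float4>& y) {
    assert(x.size() == y.size());
    std::vector<float4> result(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        for (std::size_t c = 0; c < 4; ++c)
            result[i][c] = alpha * x[i][c] + y[i][c];
    return result;
}
```

Every output element depends only on the inputs at the same index, which is why the kernel maps so directly onto a fragment program: each fragment plays the role of one loop iteration.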

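I thought about this a little. The precision problem needs solving: FP16 has only just come into wide use, and FP32 does not yet support hardware filtering. FP32 is merely IEEE 754 single (float) precision, nowhere near double precision, so it may not be suitable where higher precision is needed. Computing pi to several million digits on the GPU, as I imagined, is unlikely for now. First of all, shading languages have never offered address operations, which means a shader cannot choose the pixel position it writes to and so cannot address the framebuffer precisely. If this problem were solved, truly general-purpose computing would be possible, and the framebuffer would then be nothing more than a temporary buffer.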
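The precision gap between FP32 and double can be shown with a small CPU-side check. This is an illustrative sketch that assumes the host's float and double are IEEE 754 single and double precision; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// True if the integer n survives a round trip through float unchanged.
bool exact_in_float(long long n) {
    return static_cast<long long>(static_cast<float>(n)) == n;
}

// True if the integer n survives a round trip through double unchanged.
bool exact_in_double(long long n) {
    return static_cast<long long>(static_cast<double>(n)) == n;
}

// Absolute error introduced by storing a double value in a float.
double float_roundtrip_error(double value) {
    return std::fabs(static_cast<double>(static_cast<float>(value)) - value);
}
```

A float has a 24-bit significand, so 16777216 (2^24) is stored exactly while 16777217 is not, and pi stored as a float is already off in about the eighth significant digit. Multi-million-digit results therefore need an arbitrary-precision scheme layered on top of whatever the hardware offers.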

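SIMD physics computation can be quite powerful. Physics calculations emphasize simultaneity, and the GPU computes in parallel, which plays to its strengths; no wonder NVIDIA is cooperating with Havok. I remember being truly amazed by a physics engine written by a gentleman on cnblogs (博客園), and I would suggest he look into this area. The concept of streams will be fully realized in DX10; have a look at the DX10 article I translated earlier, where the Geometry Shader is especially interesting.

I look forward to the next generation of APIs and a brand-new combination of software and hardware, which could bring a real revolution to that old thing called the display adapter. It is worth noting that AMD has already acquired ATI, while Intel is reportedly still weighing a price of 10 billion US dollars for acquiring NVIDIA. Perhaps the next wave of change has already begun. Let's wait and see.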

The things mentioned above can be found here:
Brook http://sourceforge.net/projects/brook
libSh http://sourceforge.net/projects/libsh

posted @ 2006-10-14 22:21 周波

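Endurance and Helplessness at University

Years of hardship, an age of emptiness, a decadent life.

That is the phrase most often on my lips lately.

Sophomore year really is different from freshman year; so much has changed, the places the same but the people not. I have lost touch with many old friends, on the fine-sounding grounds that they have new friends of their own. At the start of term I brought an outdated laptop to school, planning to practice coding properly in my spare time, but it turned into a game console everyone fights over to play Red Alert 2. Yesterday, walking down the corridor, I saw four people from one dorm room, two inside the room and two squatting in the hallway with their laptops on a cardboard box, one Acer and one ASUS A8H, happily playing QQ Dou Dizhu against each other. I was speechless. How I wish my machine could run GL 2.0, but it is just a P3 933 with an 830M chipset that only supports 1.3, while people here play games on luxury goods. That is how it is; it takes all kinds. My family always says I am stingy and hard on myself, but I feel it is better to spend carefully. No matter what, I will solder my MX350 back together rather than buy the Beyerdynamic I have long been coveting.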

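Cha-cha is really fun, and for the first time I found some so-called self-confidence through exercise; pretty girls everywhere, and the teacher has a fantastic figure. Unfortunately I have always been a pessimist, fond of conjunctions like "even if" and "it's just that". Nobody ever seems to ask me about computers or anything else; there are not even phone calls, and my phone has truly lived up to PHILIPS's reputation as the king of standby time. Watching myself grow older day by day without much progress, will I really have to help Germans sell woodworking tools after graduation? Or go into the timber business myself? In that case I might as well learn Russian well now, raise funds, and go develop the forest resources of the Russian Far East; making a few million a year would be no problem. But is that path really feasible in the end? The odds are too small and the challenge too great; easy to say, hard to do. Blizzard is hiring every day and Epic has set up its Shanghai company, but sadly they are not for me, even if I were ready. For now, I am reasonably proficient in C++, understand WIN32 development, and am good at design and coding, though not well suited to algorithm research. I have tried a great many things, from J2EE to NetGrid, COM to ICE, Maya to Photoshop; I can pick them all up quickly and am confident of mastering them in a short time. I hope to find a place to practice; Nanjing would do.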

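Our daily life goes like this: eat, attend class, sleep, go online, and repeat. Finally I could no longer stand the dullness of the physics textbook and shouted, "Damn it, I'd rather go back to the 19th century!" I pulled up the quilt and went to sleep, not forgetting to think about design patterns. Never mind what comes next; the life I want is actually simple: doing what I like. If I could, I would rather be a journalist or a lawyer and not have to deal with these tedious problems, which are simply a waste of my life. Rock star would do too: form a dark folk band and stand up to the Europeans as equals. Unfortunately it is all nonsense. Step out the door and there are docked wages and high prices, high-rises and shacks, poverty and discrimination, and one's own bitter fate.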

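Texting with a girl, she said, "Don't you find it boring?" I replied with three full stops, speechless, and went to sleep. I am a person with a mild mental disorder; I believe that. In class I can lecture to myself as if nobody were around, holding forth on American history, the Napoleonic Code, the development of China's middle class and the women's liberation movement, and then, when it is time to speak on stage, find myself stammering and unable to say anything. People say Welch was like that too, but unfortunately I do not have his family circumstances or upbringing. Most people are the same. I wanted to be in a relationship, and for a while I could not go three sentences without mentioning women for the sake of it, but in the end I found out I was an idiot. Only now that my reputation as a love maniac has spread across the class of '05 in my college do I realize that the feeling I want has not arrived, is still far from arriving, although I do miss a few girls. Admiring quietly from a distance like this feels fine too. My buddies say my standards keep dropping, though, which may have to do with my growing gloom. Without a woman my wallet gets fatter and my time more plentiful, but without a woman my view of relationships gets more skewed and my self-confidence lower. You win some, you lose some.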

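If I were to advise younger students, I would say: make the most of your time and squeeze gaming time down as far as you can. Don't think you can learn anything by fishing for three days and drying the nets for two (三天打魚兩天曬網); sometimes the learning curve should be a derivative curve. Do date, and preferably early: if you start early, you will not waste a lot of time later at critical moments. That is the concept of being in step with the world, haha.

Just venting a little; the beautiful and the ugly are both in here.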

posted @ 2006-10-14 17:13 周波

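WoW Server Analysis (Part 1)

I recently found time to study the structure of the WoW server, and along the way reviewed, from those projects, how MaNGOS uses a template-based Singleton. There is something I do not quite understand, though: with usage like Singleton<Master>, if different types are passed in, can the static that comes out really be the same one? Surely not. What if I print the this pointer and look? I will try it when I get the chance. Singleton is a very important design pattern in game design, and everyone should study it well.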
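The question about the static can be answered with a small experiment. The sketch below is a generic function-local-static singleton, not the MaNGOS implementation; the Master and World classes and their members are hypothetical examples. Because each template instantiation is a distinct class, each one gets its own static object.

```cpp
#include <cassert>

// Generic singleton base: every instantiation Singleton<T> owns
// its own function-local static object of type T.
template <typename T>
class Singleton {
public:
    static T& instance() {
        static T object; // constructed once, on first use
        return object;
    }
    Singleton(const Singleton&) = delete;
    Singleton& operator=(const Singleton&) = delete;
protected:
    Singleton() = default;
    ~Singleton() = default;
};

// Two hypothetical singleton classes built on the same template.
class Master : public Singleton<Master> {
    friend class Singleton<Master>;
    Master() = default;
public:
    int logins = 0;
};

class World : public Singleton<World> {
    friend class Singleton<World>;
    World() = default;
public:
    int ticks = 0;
};
```

Calling Master::instance() twice yields the same object, while Master::instance() and World::instance() are two separate objects at different addresses, so changing one does not affect the other.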

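Authentication process

The WoW server consists of two parts: the Logon Server (LS below) and the Realm Server (RS below). The LS accepts connections from the WoW client and goes through the following steps:

Check the client's version, region and similar information, and check the account and password

Start or resume sending a patch (if there is one)

Run an SRP6 encrypted session with the client and write the generated key to the database

Send the list of realms at the client's request

Once the client has picked a realm, it disconnects from the LS and connects to the RS:

Authenticate, using the client key generated a moment ago

If that succeeds, enter the game loop

The RS and LS share the same database: after the LS generates the SRP6 key and writes it to the DB, the RS reads it back out for the next authentication step.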

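Logon Server in detail

The basic connection sequence is as follows:

The client prepares to connect and sends a CMD_AUTH_LOGON_CHALLENGE packet containing the data needed to log in, such as the username (under SRP6 the password itself is never sent)

The server replies with a CMD_AUTH_LOGON_CHALLENGE packet whose fields include the result of the validity check and the server's computed SRP6 data

If the check passed, the client sends a CMD_AUTH_LOGON_PROOF packet filled with the SRP6 data it has computed

The server verifies it and sends back CMD_AUTH_LOGON_PROOF containing the result of the SRP6 verification

If all is well, the client sends a CMD_REALM_LIST packet requesting the available realms

The server replies with a CMD_REALM_LIST packet filled with the realm data the client needs

The client's realm list is refreshed from the server every 3 to 4 seconds.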
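The handshake order described above can be modelled as a small state machine. This is a sketch for illustration only: the enum and function names are hypothetical, the packet kinds mirror the CMD_AUTH_LOGON_CHALLENGE, CMD_AUTH_LOGON_PROOF and CMD_REALM_LIST steps, and the actual SRP6 checks are reduced to a boolean.

```cpp
#include <cassert>

// Client packets seen by the logon server, in protocol order.
enum class Packet { LogonChallenge, LogonProof, RealmList };

// Where a connection stands in the handshake.
enum class State { Connected, Challenged, Authenticated, Closed };

// Advance the handshake; any out-of-order packet or failed check
// closes the connection.
State on_client_packet(State state, Packet packet, bool check_passed) {
    switch (state) {
    case State::Connected:
        if (packet == Packet::LogonChallenge && check_passed)
            return State::Challenged;
        break;
    case State::Challenged:
        if (packet == Packet::LogonProof && check_passed)
            return State::Authenticated;
        break;
    case State::Authenticated:
        if (packet == Packet::RealmList)
            return State::Authenticated; // the realm list may be requested repeatedly
        break;
    case State::Closed:
        break;
    }
    return State::Closed;
}
```

Keeping the realm-list request as a self-loop on the authenticated state matches the periodic refresh of the list by the client.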

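What kind of encryption scheme is this SRP6? I had never used it before; what I had seen most were hash algorithms such as MD5 and SHA. SRP takes the strengths of EKE-style algorithms and improves on them, which makes it well suited to authentication over a network; if I remember correctly, J2EE includes an implementation of it. Below is a brief introduction to how SRP6a works, based on the original SRP documentation.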

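N      a large prime with N = 2q + 1, where q is also prime; all arithmetic below is done modulo N
g      a generator modulo N
k      multiplier parameter: k = H(N, g) in SRP6a, k = 3 in the older SRP6
s      the user's salt
I      username
p      cleartext password
H()    one-way hash function
^      (modular) exponentiation
u      random scrambling parameter
a, b   secret ephemeral values
A, B   public ephemeral values
x      private key (derived from p and s)
v      password verifier

Here x = H(s, p) and v = g^x; s is chosen at random, and v is used later to verify the password.

The host stores { I, s, v } in its database. Authentication proceeds as follows:

The client sends the host I and A = g^a (a is a random number)

The host sends the client s and B = kv + g^b (the salt, with b a random number)

Both sides compute u = H(A, B)

The client computes x = H(s, p) (hashing the password), S = (B - kg^x) ^ (a + ux), and K = H(S) (the session key)

The host computes S = (Av^u) ^ b and K = H(S), which yields the same session key

To complete authentication, each side must prove to the other that its key matches:

The client sends the host M = H(H(N) xor H(g), H(I), s, A, B, K)

The host checks M against the value it computes itself and, if they match, replies with H(A, M, K), which the client verifies in turn. This completes the authentication.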
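Why both sides end up with the same S can be checked numerically with toy values. This sketch is for illustration only and is not secure: it uses a tiny prime instead of a large one, takes x and u as given numbers instead of hash outputs, and the function names are hypothetical. It relies only on the formulas above: v = g^x, A = g^a, B = kv + g^b, client S = (B - kg^x)^(a + ux), host S = (Av^u)^b.

```cpp
#include <cassert>
#include <cstdint>

using u64 = std::uint64_t;

// Modular exponentiation by repeated squaring (safe here because mod is tiny).
u64 mod_pow(u64 base, u64 exp, u64 mod) {
    u64 result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = result * base % mod;
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

// Client side: S = (B - k * g^x) ^ (a + u * x) mod N, with B already reduced mod N.
u64 client_secret(u64 N, u64 g, u64 k, u64 B, u64 a, u64 u, u64 x) {
    u64 kgx = k * mod_pow(g, x, N) % N;
    u64 base = (B + N - kgx) % N; // keep the subtraction non-negative
    return mod_pow(base, a + u * x, N);
}

// Host side: S = (A * v^u) ^ b mod N.
u64 server_secret(u64 N, u64 A, u64 v, u64 u, u64 b) {
    return mod_pow(A * mod_pow(v, u, N) % N, b, N);
}
```

Since v = g^x, the client's base B - kg^x reduces to g^b, so the client computes g^(b(a + ux)); the host computes (g^a * g^(ux))^b, which is the same power. Each side then hashes S to get the session key K.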

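Realm Server in detail

After disconnecting from the LS, the client authenticates with the RS:

The client connects to the RS, and the server sends an SMSG_AUTH_CHALLENGE packet containing a random seed

The client generates its own random seed, builds a SHA1 digest from the server's seed and the SRP6 data, and sends these to the server in a CMSG_AUTH_SESSION packet

Note that this exchange is not encrypted. When the server receives the authentication reply, it uses the client's seed to build its own SHA1 digest and compares it with the one from the client; if they are the same, everything is OK.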

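Next, let's look at operations on the characters created under an account. I believe one account can hold at most 50 characters; I have not played the game yet and have only looked at the manual.

The client sends a CMSG_CHAR_ENUM packet to request its characters

The server sends back an SMSG_CHAR_ENUM packet containing the information for all characters

The client can now operate on these characters: CMSG_CHAR_CREATE, CMSG_CHAR_DELETE, CMSG_CHAR_PLAYER_LOGIN

Once the character has logged in, the server sends back an SMSG_CHAR_DATA packet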

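How does it work inside the game loop?

If the player quits the game immediately, the client sends CMSG_PLAYER_LOGOUT and the server replies with SMSG_LOGOUT_COMPLETE

If the player chooses to quit after a delay, the client sends CMSG_LOGOUT_REQUEST and the server replies with SMSG_LOGOUT_RESPONSE. If the player quits during the countdown, the client sends CMSG_PLAYER_LOGOUT, but the character still stays in the world until the countdown finishes.

If the player cancels the logout and keeps playing, the client sends CMSG_LOGOUT_CANCEL and the server replies with SMSG_LOGOUT_CANCEL_ACK.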
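The logout rules above can be summarized as a transition function. This is a sketch for illustration only: the enums, the Step struct and the function names are hypothetical, and the message enumerators simply mirror the CMSG_ and SMSG_ opcodes named in the text.

```cpp
#include <cassert>

enum class LogoutState { InGame, CountingDown, LoggedOut };
enum class ClientMsg { PlayerLogout, LogoutRequest, LogoutCancel };
enum class ServerMsg { None, LogoutComplete, LogoutResponse, LogoutCancelAck };

// Result of handling one client message: the new state and the server's reply.
struct Step {
    LogoutState next;
    ServerMsg reply;
};

Step on_logout_message(LogoutState state, ClientMsg msg) {
    if (state == LogoutState::InGame && msg == ClientMsg::PlayerLogout)
        return {LogoutState::LoggedOut, ServerMsg::LogoutComplete};
    if (state == LogoutState::InGame && msg == ClientMsg::LogoutRequest)
        return {LogoutState::CountingDown, ServerMsg::LogoutResponse};
    if (state == LogoutState::CountingDown && msg == ClientMsg::LogoutCancel)
        return {LogoutState::InGame, ServerMsg::LogoutCancelAck};
    // Includes PlayerLogout during the countdown: the character stays
    // in the world until the countdown itself finishes.
    return {state, ServerMsg::None};
}

// Called when the logout timer expires.
LogoutState on_countdown_finished(LogoutState state) {
    return state == LogoutState::CountingDown ? LogoutState::LoggedOut : state;
}
```

Modelling the countdown expiry as a separate event keeps the rule that quitting during the countdown does not remove the character early.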

posted @ 2006-10-14 16:27 周波

World Of Warcraft Server Source Topic

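Disclaimer: Ownership of the source code of World of Warcraft and related programs belongs to Blizzard. WowWow is only an emulator of the WoW server side, obtained by Russian hackers through reverse engineering. It is offered here solely for learning about online game servers and for discussion, and contains no source code from Blizzard or its Chinese operator The9 (九城). Any individual or organization that uses this source code to run activities that may break the law does so without any connection to me. Hereby declared.

I can't stand these lousy Chinese websites that make you pay for some lousy VIP membership just to download source code, unaware that sf.net is full of good code.

I dragged this back from a foreign forum. My hard disk cannot fit the WoW client, so I have not tested it; those who have the means can give it a try.

I plan to spend some time rewriting it in C++. There is already the similar MaNGOS, but I really do not like virtual-machine languages such as C# and Java. .NET people, please do not jump out and argue with me that C# is not virtual-machine software and so on; I can't be bothered. The compiled code is small, but the program starts up painfully slowly and needs the .NET Framework, which is a hassle.

The earliest was WowEmu, which is what many single-player WoW bundles ship with, so I will not list an address; BT is full of it.

Then came Wowwow, but its core code is not public; you will see decompiler output and so on.
Download link
Here is a Wowwow Alpha v8.3 that comes with some code.
Download link

What I am analyzing at the moment is MaNGOS. Its home turns out to be on sf.net; the description does not mention World of Warcraft at all, but that is in fact what it runs.
Go here.

Discussion is welcome. If you find this useful, please leave me a reply, thanks~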

posted @ 2006-10-05 13:59 周波

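To You Whom I Once Loved Deeply, Far Away: I Wish You Happiness

Please allow me to call you "dear" once more, because I am afraid that once I turn away I will forget you for good.
We have not been in touch since the last time, when I sent a message by mistake and let you hear the darker side of my true feelings. I know you must have thought then that I had been deceiving you all these years, and you are someone who will not be made a fool of. So we never spoke again, and you never once replied. I do not want to say much more; you know your own choice, and I am only an ordinary person. I no longer have any feelings for you, a choice made out of utter helplessness. With someone else, perhaps I would still have new hope. I hope you will be very, very well. These wishes come from someone who once loved you.

    Moonstone

In the season when ice and snow melt,
grass seeds wake and cannot stop crying.

The choice between loving and silence
was never shouldered together.

The aged grieve as before,
cold takes root in autumn's desolation.

A lifetime of sorrow only for others,
the song ends, and the blossoms fall without a sound.

For whom do your tears fall tonight,
with only the bright moon to carry longing a thousand miles.

For all the lonely people in the world: may your families be happy, and may all lovers be united in the end.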

posted @ 2006-10-02 13:54 周波

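A Glimpse of Elite Education in America

I am really sorry about the recent silence. The start of term has been too busy, my articles were all sitting on my laptop, going online at school is very inconvenient, and converting them via PocketPC is a lot of trouble, so today I am offering my humble writings all in one go. I still have to go out over the National Day holiday and of course it is raining, which is no fun.

Chinese parents are forever yearning for American-style family education: push the child out the door at 18 and never have to worry again, and ideally, a few decades on, the child turns out to be someone like Carnegie or Gates. In a word, a son like that is the kind worth having. Or else: study hard from childhood, go to university and get as far as a postdoc, then go abroad; without hoping for wealth to rival a nation, at least a house, a car, and a middle-class life.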

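But is that how things really are? In high school I envied the American SAT system: you can sit it several times a year and keep your best score. It sounds as if chances are plentiful and competition is open, but it is not so. America too is full of top performers, and high school students outstanding enough to put China's top scorers ("狀元") to shame are everywhere. Among the students who apply each year to the Ivy League, world-class schools such as Harvard, Yale and Princeton, 30% have frighteningly high SAT scores; out of a full 1600, some reach 1560, close to perfect. Moreover, to apply to these schools every student needs a flawless interview and plenty of material demonstrating personal ability and experience. What they must prepare for university exceeds our imagination. It is not just something like our three years of high school struggle, memorizing countless exam questions and finally fighting past a vast army across the single-plank bridge into university. Rather, they gain abilities beyond those of others from the educational environment of their childhood, because for the middle class the only real way into the upper class today is education: burning money from an early age on piano, guitar, dance and so on, shaping from the inside, in mind and sentiment, an all-rounder qualified to compete with first-rate rivals. Most students applying to these world-class universities come from well-off upper-middle-class families. Unlike the hereditary "nobility" of Europe, in America this "heroic father, valiant son" succession is realized through education. The Bushes, father and son, are an example even bin Laden knows: once the father has made it, he can send his son to the best university for the best education and acquaintance with society's top families. As said above, this is achieved through family influence from childhood and high-intensity training beyond what the child's age can comfortably bear. On this point Chinese and Western culture agree surprisingly well: no success without hard work.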

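Think of it this way. A Chinese student says: I have read this many C++ books, written this much software, and won such-and-such awards, and that is all. An American student says: I interned at AT&T over the summer, exchanged ideas with a number of world-class experts, and took the money I earned from part-time work to Peru to do charity work, teaching computer classes at a school as a volunteer and receiving a local commendation. If you were in HR at Microsoft, which student would you pick? You can imagine the gap. Put another way, we are training ordinary workers for the world's factory, while their classrooms instill ideas about how to lead the world and how to gain extraordinary ability and outstanding creativity. Even so, this way of cultivating an elite also has a backlash on the psychology of the broad middle and lower classes, and a quite obvious one. Ordinary American state universities do not lack first-rate teaching facilities or accomplished professors, but most of their students come from lower-class families and care more about how to earn money, like a McDonald's cashier; whether by robbing a bank or by working honestly as a car mechanic, they have not thought much about it. I think most Chinese university students at this stage, myself included, think the same way: envying the pocket money classmates earn from part-time jobs, buying what we like, and settling for a good job after graduation. The prospects of these students can be imagined. Some may eventually break through every obstacle into the upper levels of society, but the vast majority will go on living at the very bottom. For students born into upper-middle-class families, the educational advantage is established beyond doubt from birth: they can afford all kinds of tutoring, classes, exchange trips abroad and so on.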

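There is another quite interesting question. In our impression, America's top universities such as Harvard and Yale always seem to achieve a great deal in the humanities and social sciences, such as economics and law, and to be less outstanding in basic disciplines such as mathematics and physics. The differing nature of the schools is one reason: the old institutes of technology such as Caltech and MIT were always very strong. Another important reason is that they train different kinds of people; that is, such a school exists to cultivate an elite rather than pure technicians. A very interesting argument comes from 《國家的興衰探源》 (The Rise and Decline of Nations): for ordinary citizens, studying social problems is a valueless waste of time, because they lack sources of specialist knowledge and other individuals who provide a forum for exchange; in other words, studying such things brings them no benefit. Put differently, this is a group that stands apart from society and above it, whose duty is to cultivate the people who will lead that society. Our own Peking University is likewise famous for the humanities, while specialist science and engineering schools are less prominent. Is this the so-called "those who labor with their minds govern others" (勞心者治人)? There may be some of that tendency, but what economics, law and the like teach is a way of thinking and an attitude, and that is what matters most. Seen this way, it is surely no coincidence that Gates's family background was in law rather than in science and engineering, which shaped his skill at negotiation and social dealings and his rational way of thinking about problems. It is like parents who make their children learn the piano rather than computing: what needs to build up is a bearing and temperament of character, rather than producing from the start a pedant who knows only technology and cannot see the big picture.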

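Before I knew it I had entered university, heading toward thirty, and been talked into wasting a good deal of my youth. In my heart I want to measure myself against these first-rate white people, but in the end I have neither the chance nor the strength. Humanity has been evolving for many tens of thousands of years and intelligence has leveled out (setting aside what is determined at birth or by the standard of medical care); what really determines one's strength and how high one can rise is already fixed in its broad outlines. Perhaps what we need is to think, for the sake of our descendants, about how they should grow up and live, and what one truly needs. Our competitors are the descendants of those Protestant immigrants on the other side of the Pacific, not merely the bookish neighbor's kid who got into Nanjing University.

For me right now, technology may be important and love may be urgent, but I am beginning to feel that paying attention to the real world is what we most urgently need to think about carefully. What do you think, young guys?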

posted @ 2006-09-30 13:52 周波


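周波, born in 1987. Nanjing Forestry University, Class 05421, P.O. Box 242. Major: Wood Science and Engineering (Industrial Equipment and Process Automation). Moved to jedimaster(dot)cnblogs(dot)com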
