1. Metadata: maintains the information about files and directories in the HDFS file system. It exists in two forms: in-memory metadata and metadata files. The NameNode maintains all of the metadata.
Rather than exporting the metadata periodically, HDFS uses a backup mechanism that combines a metadata image file (FSImage) with a log file (edits).
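To make the image-plus-log idea concrete, here is a minimal sketch. It is not the real FSImage or FSEditLog code: the class name, the `Edit` type, and the two operation names are hypothetical, and the namespace is reduced to a set of paths. It only shows that the current state can be rebuilt by loading the last image and replaying the logged operations in order.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified model; not the real FSImage/FSEditLog classes.
class ImagePlusEditsSketch {
    // One logged namespace operation.
    static final class Edit {
        final String op;   // "MKDIR" or "DELETE" (made-up op names)
        final String path;
        Edit(String op, String path) { this.op = op; this.path = path; }
    }

    // Rebuild the namespace: start from the image, then replay the log in order.
    static Set<String> recover(Set<String> image, List<Edit> edits) {
        Set<String> namespace = new TreeSet<>(image);
        for (Edit e : edits) {
            if (e.op.equals("MKDIR")) {
                namespace.add(e.path);
            } else if (e.op.equals("DELETE")) {
                namespace.remove(e.path);
            } else {
                throw new IllegalArgumentException("unknown op: " + e.op);
            }
        }
        return namespace;
    }

    // Checkpoint: fold the log into a new image, after which the log can be emptied.
    static Set<String> checkpoint(Set<String> image, List<Edit> edits) {
        Set<String> merged = recover(image, edits);
        edits.clear();
        return merged;
    }
}
```

The point of the design is that every update only appends a small record to the log, while the expensive full dump happens only at checkpoint time.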
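2. Block: represents file content.
Lookup path:
        path info          blocks[]             triplets[]
Client -----------> INode ----------> BlockInfo ------------> DataNode
INode: the basic element of the file system, either a file or a directory.
BlockInfo: the object that represents file content.
DatanodeDescriptor: the object that represents the actual storage location.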
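The lookup chain above can be sketched with a few toy classes. The type names mirror the ones in the post, but the fields and methods are assumptions for illustration only: a plain `List` stands in for the `triplets[]` array, and the namespace is a flat map instead of a directory tree.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of path -> INode -> BlockInfo -> DataNode; not the real HDFS classes.
class LookupPathSketch {
    static final class DatanodeDescriptor {
        final String host;
        DatanodeDescriptor(String host) { this.host = host; }
    }

    static final class BlockInfo {
        final long blockId;
        // Stand-in for triplets[]: the datanodes holding a replica of this block.
        final List<DatanodeDescriptor> replicas = new ArrayList<>();
        BlockInfo(long blockId) { this.blockId = blockId; }
    }

    static final class INodeFile {
        // Stand-in for blocks[]: the blocks that make up the file, in order.
        final List<BlockInfo> blocks = new ArrayList<>();
    }

    // Flat path table; the real namespace is a tree rooted at the root directory.
    private final Map<String, INodeFile> namespace = new HashMap<>();

    void addFile(String path, INodeFile file) {
        namespace.put(path, file);
    }

    // Resolve a path to "blockId@host" entries, one per replica.
    List<String> locate(String path) {
        INodeFile file = namespace.get(path);
        if (file == null) {
            throw new IllegalArgumentException("no such file: " + path);
        }
        List<String> locations = new ArrayList<>();
        for (BlockInfo block : file.blocks) {
            for (DatanodeDescriptor node : block.replicas) {
                locations.add(block.blockId + "@" + node.host);
            }
        }
        return locations;
    }
}
```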
3. Checkpointing FSImage and edits. FSImage has two states, FsImage and FsImage.ckpt. The latter means a checkpoint is in progress; once the upload completes it is renamed to the FSImage file. Likewise, edits has two states, edits and edits.new.
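The renaming step at the end of a checkpoint can be sketched as below. This is an assumption-laden illustration, not the NameNode's actual code: the class and method names are made up, the lower-case file names `fsimage`, `fsimage.ckpt`, `edits` and `edits.new` are assumed to be the on-disk names, and the demo works in a throwaway temporary directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical sketch of promoting in-progress checkpoint files to their final names.
class CheckpointRenameSketch {
    // fsimage.ckpt -> fsimage, edits.new -> edits
    static void finishCheckpoint(Path dir) throws IOException {
        promote(dir.resolve("fsimage.ckpt"), dir.resolve("fsimage"));
        promote(dir.resolve("edits.new"), dir.resolve("edits"));
    }

    private static void promote(Path from, Path to) throws IOException {
        if (Files.exists(from)) {
            Files.move(from, to, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    // Builds a directory in the "checkpoint in progress" state, finishes the
    // checkpoint, and returns the sorted names of the files that remain.
    static List<String> demo() {
        try {
            Path dir = Files.createTempDirectory("ckpt-sketch");
            Files.write(dir.resolve("fsimage"), "old image".getBytes());
            Files.write(dir.resolve("edits"), "old edits".getBytes());
            Files.write(dir.resolve("fsimage.ckpt"), "merged image".getBytes());
            Files.write(dir.resolve("edits.new"), "new edits".getBytes());
            finishCheckpoint(dir);
            List<String> names = new ArrayList<>();
            try (Stream<Path> files = Files.list(dir)) {
                files.forEach(p -> names.add(p.getFileName().toString()));
            }
            Collections.sort(names);
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```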
4. NameNode format walkthrough:
- Iterate over the metadata storage directories and ask the user whether to format each one (the format function in NameNode.java).
    private static boolean format(Configuration conf,
                                  boolean isConfirmationNeeded)
            throws IOException {
        Collection<URI> dirsToFormat = FSNamesystem.getNamespaceDirs(conf);
        Collection<URI> editDirsToFormat =
                FSNamesystem.getNamespaceEditsDirs(conf);
        // Ask for confirmation for every image directory that already exists.
        for (Iterator<URI> it = dirsToFormat.iterator(); it.hasNext();) {
            File curDir = new File(it.next().getPath());
            if (!curDir.exists())
                continue;
            if (isConfirmationNeeded) {
                System.err.print("Re-format filesystem in " + curDir + " ? (Y or N) ");
                if (!(System.in.read() == 'Y')) {
                    System.err.println("Format aborted in " + curDir);
                    return true; // true means the format was aborted
                }
                while (System.in.read() != '\n'); // discard the enter-key
            }
        }

        FSNamesystem nsys = new FSNamesystem(new FSImage(dirsToFormat,
                editDirsToFormat), conf);
        nsys.dir.fsImage.format();
        return false; // false means the format ran to completion
    }
- Create the in-memory metadata image, which means instantiating FSNamesystem, FSDirectory, FSImage, and Edits objects. Constructing the FSNamesystem object mainly creates the BlockManager and FSDirectory objects and initializes member variables. Constructing the FSImage object mainly sets layoutVersion, namespaceID, and CTime to 0 and instantiates FSEditLog. FSDirectory creates the HDFS root directory node, rootDir.
    FSNamesystem(FSImage fsImage, Configuration conf) throws IOException {
        this.blockManager = new BlockManager(this, conf);
        setConfigurationParameters(conf);
        this.dir = new FSDirectory(fsImage, this, conf);
        dtSecretManager = createDelegationTokenSecretManager(conf);
    }

    FSImage(Collection<URI> fsDirs, Collection<URI> fsEditsDirs)
            throws IOException {
        this();
        setStorageDirectories(fsDirs, fsEditsDirs);
    }

    void setStorageDirectories(Collection<URI> fsNameDirs,
                               Collection<URI> fsEditsDirs) throws IOException {
        this.storageDirs = new ArrayList<StorageDirectory>();
        this.removedStorageDirs = new ArrayList<StorageDirectory>();

        // Add all name dirs with appropriate NameNodeDirType
        for (URI dirName : fsNameDirs) {
            checkSchemeConsistency(dirName);
            boolean isAlsoEdits = false;
            for (URI editsDirName : fsEditsDirs) {
                if (editsDirName.compareTo(dirName) == 0) {
                    isAlsoEdits = true;
                    fsEditsDirs.remove(editsDirName);
                    break;
                }
            }
            NameNodeDirType dirType = (isAlsoEdits) ?
                    NameNodeDirType.IMAGE_AND_EDITS :
                    NameNodeDirType.IMAGE;
            // Add to the list of storage directories, only if the
            // URI is of type file://
            if (dirName.getScheme().compareTo(JournalType.FILE.name().toLowerCase())
                    == 0) {
                this.addStorageDir(new StorageDirectory(new File(dirName.getPath()),
                        dirType));
            }
        }

        // Add edits dirs if they are different from name dirs
        for (URI dirName : fsEditsDirs) {
            checkSchemeConsistency(dirName);
            // Add to the list of storage directories, only if the
            // URI is of type file://
            if (dirName.getScheme().compareTo(JournalType.FILE.name().toLowerCase())
                    == 0)
                this.addStorageDir(new StorageDirectory(new File(dirName.getPath()),
                        NameNodeDirType.EDITS));
        }
    }
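The directory classification done by setStorageDirectories can be reduced to the sketch below. It is a simplified stand-in, assuming only what the listing shows: a directory configured for both purposes becomes IMAGE_AND_EDITS, a name-only directory becomes IMAGE, and an edits-only directory becomes EDITS. The class name and the `classify` method are hypothetical, and unlike the original this version does not modify its input collections or check the URI scheme.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of how storage directories get their type.
class StorageDirTypeSketch {
    enum DirType { IMAGE, EDITS, IMAGE_AND_EDITS }

    static Map<URI, DirType> classify(List<URI> nameDirs, List<URI> editsDirs) {
        Map<URI, DirType> result = new LinkedHashMap<>();
        // Name dirs first: shared dirs hold both the image and the edit log.
        for (URI dir : nameDirs) {
            result.put(dir, editsDirs.contains(dir)
                    ? DirType.IMAGE_AND_EDITS
                    : DirType.IMAGE);
        }
        // Remaining edits dirs are edits-only.
        for (URI dir : editsDirs) {
            result.putIfAbsent(dir, DirType.EDITS);
        }
        return result;
    }
}
```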
- Initialize the data structures of the in-memory image. This is done mainly by FSImage's format function. layoutVersion: the version of the software. namespaceID: generated at format time; when a DataNode registers with the NameNode it obtains the NameNode's namespaceID and uses it as its identity in all later communication. The NameNode refuses to talk to DataNodes whose identity it does not recognize. CTime: the time the FSImage was created. checkpointTime: the time of the namespace's first checkpoint (set to the current time here).
    public void format() throws IOException {
        this.layoutVersion = FSConstants.LAYOUT_VERSION;
        this.namespaceID = newNamespaceID();
        this.cTime = 0L;
        this.checkpointTime = FSNamesystem.now();
        for (Iterator<StorageDirectory> it =
                dirIterator(); it.hasNext();) {
            StorageDirectory sd = it.next();
            format(sd);
        }
    }
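The role of namespaceID as an identity check can be illustrated with the following sketch. It models only the behavior described in the text, and everything in it is an assumption: the class name, the `register` method, and the convention that an ID of 0 means "not yet assigned" are all made up for illustration and do not reproduce the real registration protocol.

```java
// Hypothetical sketch of a namespaceID check at DataNode registration.
class NamespaceIdCheckSketch {
    private final int namespaceID;

    NamespaceIdCheckSketch(int namespaceID) {
        this.namespaceID = namespaceID;
    }

    // Returns the ID the DataNode must use from now on.
    // Assumption: 0 means the DataNode has never joined a namespace.
    int register(int datanodeNamespaceID) {
        if (datanodeNamespaceID == 0) {
            return namespaceID; // a fresh DataNode adopts the NameNode's ID
        }
        if (datanodeNamespaceID != namespaceID) {
            throw new IllegalStateException("namespaceID mismatch: namenode="
                    + namespaceID + ", datanode=" + datanodeNamespaceID);
        }
        return namespaceID;
    }
}
```

This also shows why reformatting a NameNode matters to existing DataNodes: format generates a new namespaceID, so a DataNode that still carries the old one no longer matches.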
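- Write the in-memory image to the metadata backup directories. FSImage's format method iterates over every storage directory. For an FSImage directory it calls saveFSImage to save the image; for an Edits directory it calls editLog.createEditLogFile; finally it calls sd.write to create the fstime and VERSION files. The VERSION file is normally written last.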
    void format(StorageDirectory sd) throws IOException {
        sd.clearDirectory(); // create current dir
        sd.lock();
        try {
            saveCurrent(sd);
        } finally {
            sd.unlock();
        }
        LOG.info("Storage directory " + sd.getRoot()
                + " has been successfully formatted.");
    }
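As a rough picture of what ends up in the VERSION file, the sketch below builds it as a Java properties file from the fields that format initializes. The key names and the `NAME_NODE` storage type are written from memory of this Hadoop generation and should be treated as assumptions, as should the class and method names; this is not the code behind sd.write.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch of the VERSION file contents as key=value properties.
class VersionFileSketch {
    static Properties versionProperties(int layoutVersion, int namespaceID, long cTime) {
        Properties props = new Properties();
        props.setProperty("layoutVersion", String.valueOf(layoutVersion));
        props.setProperty("storageType", "NAME_NODE"); // assumed value
        props.setProperty("namespaceID", String.valueOf(namespaceID));
        props.setProperty("cTime", String.valueOf(cTime));
        return props;
    }

    // Render the properties the way Properties.store would write them to disk.
    static String render(Properties props) {
        StringWriter out = new StringWriter();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

Writing this file last is a reasonable safeguard: a directory that has a readable VERSION file can be taken as fully formatted, while a crash partway through leaves it visibly incomplete.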
Finally, the scenarios in which the metadata is used:
1. When formatting.
2. When Hadoop starts.
3. During metadata update operations.
posted on 2013-05-24 15:18
王海光