
eSearch, Hybrid global index

(2005-01-18 18:39:38)

Index partition structures:
(local index) partition-by-document: each host maintains a local inverted index of the documents it is responsible for. A query must be flooded to all peers, e.g. Gnutella.
(global index) partition-by-keyword: each keyword is assigned to a single node, and each node maintains the inverted lists of some keywords. A k-keyword query needs k lookups, one per keyword, and the answer is the intersection of the returned inverted lists.
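
To make the two partitioning schemes concrete, here is a minimal Python sketch. It is only an illustration: the corpus, the node numbering, and the function names are made up for this note, and the contents of documents Y and Z are assumed (only X->a,c and a->X,Z are given in the figure caption). Each search function also returns how many nodes or lookups it needed, to show the cost difference.

```python
from collections import defaultdict

# Toy corpus. X -> a,c comes from the figure caption; Y and Z are assumed.
DOCS = {"X": {"a", "c"}, "Y": {"b", "c"}, "Z": {"a", "b"}}


def build_local(docs_by_node):
    """Partition-by-document: each node indexes only its own documents."""
    index = {}
    for node, doc_ids in docs_by_node.items():
        inverted = defaultdict(set)
        for doc_id in doc_ids:
            for term in DOCS[doc_id]:
                inverted[term].add(doc_id)
        index[node] = inverted
    return index


def search_local(index, query):
    """Flood the query to every node and union the per-node matches."""
    hits, contacted = set(), 0
    for inverted in index.values():
        contacted += 1
        lists = [inverted.get(term, set()) for term in query]
        if lists:
            hits |= set.intersection(*lists)
    return hits, contacted


def build_global(term_to_node):
    """Partition-by-keyword: each term's inverted list lives on one node."""
    index = {node: {} for node in set(term_to_node.values())}
    for doc_id, terms in DOCS.items():
        for term in terms:
            index[term_to_node[term]].setdefault(term, set()).add(doc_id)
    return index


def search_global(index, term_to_node, query):
    """One lookup per keyword, then intersect the returned inverted lists."""
    lists = [index[term_to_node[term]].get(term, set()) for term in query]
    hits = set.intersection(*lists) if lists else set()
    return hits, len(lists)


LOCAL = build_local({1: ["X"], 2: ["Y"], 3: ["Z"]})
TERM_TO_NODE = {"a": 1, "b": 2, "c": 3}
GLOBAL = build_global(TERM_TO_NODE)
```

With this setup, `search_local(LOCAL, ["a", "c"])` contacts all three nodes, while `search_global(GLOBAL, TERM_TO_NODE, ["a", "c"])` does two lookups and intersects two inverted lists.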

Hybrid: distribute metadata based on terms. Each node j is responsible for the inverted list of some term t. In addition, for each document D in the inverted list for term t, node j also stores the complete term list of document D. Because the term lists sit next to the inverted list, node j can check a document against the other query terms locally. Routing is DHT-style: for example, a query is routed with Chord to the node responsible for a query term.
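
A rough Python sketch of the hybrid layout follows. It rests on several assumptions: `node_for` is a plain hash-modulo placement standing in for a real Chord lookup, the query is sent to the node of just one of its terms, and the corpus is the same toy example as in Figure 1 with assumed contents for Y and Z. All names here are hypothetical, not taken from eSearch itself.

```python
import hashlib

# Toy corpus. X -> a,c comes from the figure caption; Y and Z are assumed.
DOCS = {"X": {"a", "c"}, "Y": {"b", "c"}, "Z": {"a", "b"}}


def node_for(term, num_nodes):
    """Stand-in for a DHT lookup: hash the term onto one of num_nodes nodes."""
    digest = hashlib.sha1(term.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_nodes


def build_hybrid(docs, num_nodes):
    """Each node keeps inverted lists for its terms, plus the complete
    term list of every document that appears in those inverted lists."""
    nodes = [{"inverted": {}, "term_lists": {}} for _ in range(num_nodes)]
    for doc_id, terms in docs.items():
        for term in terms:
            node = nodes[node_for(term, num_nodes)]
            node["inverted"].setdefault(term, set()).add(doc_id)
            node["term_lists"][doc_id] = set(terms)
    return nodes


def search_hybrid(nodes, query):
    """Route to the node of one query term, then filter locally
    using the stored term lists (no intersection across nodes)."""
    query = set(query)
    if not query:
        return set()
    first = sorted(query)[0]  # arbitrary but deterministic choice of term
    node = nodes[node_for(first, len(nodes))]
    candidates = node["inverted"].get(first, set())
    return {d for d in candidates if query <= node["term_lists"][d]}


NODES = build_hybrid(DOCS, 3)
```

The price of answering at a single node is storage: a document's term list is replicated on every node that is responsible for one of its terms.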

Figure 1: Comparison of distributed indexing structures. (i) Gnutella-like local indexing. (ii) Global indexing. (iii) Hybrid indexing. (iv) Optimized hybrid indexing. a, b, and c are terms. X, Y, and Z are documents. This example distributes metadata for three documents (X-Z) that contain terms from a small vocabulary (a-c) to three computers (1-3). Term list X->a,c means that document X contains terms a and c. Inverted list a->X,Z indicates that term a appears in documents X and Z.

 
