By Victor Lavrenko
A modern information retrieval system must be able to find, organize and present very diverse manifestations of information – such as text, pictures, videos or database records – any of which may be relevant to the user. However, the concept of relevance, while seemingly intuitive, is actually hard to define, and it is even harder to model in a formal way.
Lavrenko does not attempt to bring forth a new definition of relevance, nor does he argue that any particular definition is theoretically superior or more complete. Instead, he takes a widely accepted, albeit somewhat conservative, definition, makes several assumptions, and from them develops a new probabilistic model that explicitly captures that notion of relevance. With this book, he makes two major contributions to the field of information retrieval: first, a new way to look at topical relevance, which complements the two dominant models (the classical probabilistic model and the language modeling approach) and explicitly combines documents, queries, and relevance in a single formalism; second, a new method for modeling exchangeable sequences of discrete random variables, which makes no structural assumptions about the data and can also handle rare events.
His book is therefore of major interest to researchers and graduate students in information retrieval who specialize in relevance modeling, ranking algorithms, and language modeling.
Read or Download A Generative Theory of Relevance PDF
Similar structured design books
Every company wants to improve the way it does business, to produce goods and services more efficiently, and to increase profits. Nonprofit organizations are also concerned with efficiency, productivity, and with achieving the goals they set for themselves. Every manager understands that achieving these goals is part of his or her job.
This book constitutes the refereed proceedings of the Third International Conference on Unconventional Models of Computation, UMC 2002, held in Kobe, Japan in October 2002. The 18 revised full papers presented together with eight invited full papers were carefully reviewed and selected from 36 submissions.
From the preface: "In their initial contact with computer programming, many students have been exposed to only one programming language. This book is designed to take such students further into the subject of programming by emphasizing the structures of programming languages. The book introduces the reader to five important programming languages, Algol, Fortran, Lisp, Snobol, and Pascal, and develops an appreciation of fundamental similarities and differences among these languages."
Extra resources for A Generative Theory of Relevance
Middle: language modeling framework according to . Right: the generative model proposed in this book. Shaded circles represent observable variables.
3 A Generative View of Relevance
In this chapter we will introduce the central idea of our work – the idea that relevance can be viewed as a stochastic process underlying both information items and user requests. We will start our discussion at a relatively high level and gradually add specifics, as we develop the model in a top-down fashion.
As a conclusion to this section, we would like to stress the following: Contrary to popular belief, word independence is not a necessary assumption in the classical probabilistic model of IR. A necessary and sufficient condition is proportional interdependence, which we believe holds in most retrieval settings. 2).
3 The Language Modeling Framework
We will now turn our attention to a very different approach to relevance – one based on statistical models of natural language. Statistical language modeling is a mature field with a wide range of successful applications, such as discourse generation, automatic speech recognition and statistical machine translation.
All our probability estimates have to be based on Q and on the collection as a whole, without knowing relevance and non-relevance of individual documents. To complicate matters further, Q is not even present in the original definition of the model (eqs. 3); it becomes necessary only when we have no way of observing the relevance variable R. Faced with these difficulties, Robertson and Sparck Jones make the following assumptions:
1. p_v = q_v if v ∉ Q. When a word is not present in the query, it has an equal probability of occurring in the relevant and non-relevant documents.
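The practical effect of the first assumption can be illustrated with a minimal sketch. It assumes, since the excerpt does not define them, that p_v and q_v are the probabilities of word v occurring in relevant and non-relevant documents, and that documents are scored by the usual log-odds sum over the words they contain. The function names, the toy vocabulary, and all probability values are hypothetical and chosen only for illustration.

```python
import math

def word_weight(p_v, q_v):
    # Log-odds weight of a word, assuming p_v = P(v occurs | relevant)
    # and q_v = P(v occurs | non-relevant).
    return math.log(p_v * (1 - q_v) / (q_v * (1 - p_v)))

def score(doc_words, p, q):
    # Sum the weights of the distinct words that occur in the document.
    return sum(word_weight(p[v], q[v]) for v in set(doc_words))

def apply_assumption(query, p, q):
    # Assumption 1: set p_v = q_v for every word that is not in the query.
    return {v: (p[v] if v in query else q[v]) for v in p}

# Hypothetical toy values, not taken from the book.
q = {"relevance": 0.10, "model": 0.20, "retrieval": 0.30, "the": 0.90}
p = {"relevance": 0.60, "model": 0.50, "retrieval": 0.40, "the": 0.95}
query = {"relevance", "model"}
doc = ["the", "relevance", "retrieval", "the"]

p_assumed = apply_assumption(query, p, q)
full_score = score(doc, p_assumed, q)

# Only words shared by the document and the query contribute.
query_only_score = sum(word_weight(p[v], q[v]) for v in set(doc) & query)
```

Under these assumptions every non-query word has a weight of log 1 = 0, so `full_score` and `query_only_score` agree, and the score can be computed from the query words alone.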