Processing continuous text queries featuring non-homogeneous scoring functions

Abstract : In this work, we study the scalable processing of content filtering queries over streams of text items. In particular, we aim to generalize state-of-the-art solutions to non-homogeneous scoring functions that combine query-independent item importance with query-dependent content relevance. While such complex ranking functions are widely used in web search engines, this is, to our knowledge, the first scientific work studying their use in a continuous query scenario. Our main contribution is the definition and evaluation of new, efficient in-memory data structures for indexing continuous top-k queries, based on an original two-dimensional representation of text queries. We explore locally optimal score bounds and heuristics that efficiently prune the search space of candidate top-k query results that have to be updated when new stream items arrive. Finally, we experimentally evaluate the memory/matching-time trade-offs of these index structures and, in particular, illustrate their linear scaling behavior with respect to the number of indexed queries.
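To make the problem setting concrete, the following is a minimal sketch of a naive baseline for continuous top-k matching. It is not the indexing method of the paper. The linear mix controlled by `alpha`, the dot-product relevance, the rule that an item must share at least one term with a query, and the names `ContinuousQuery`, `score`, and `process_item` are all assumptions made for illustration only. The baseline scans every query for each arriving item, which is the cost that index structures with score bounds are meant to avoid.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class ContinuousQuery:
    """A standing keyword query that keeps its k best-scoring stream items."""
    terms: dict  # term -> query weight (hypothetical weighting scheme)
    k: int
    heap: list = field(default_factory=list)  # min-heap of (score, item_id)

    def threshold(self):
        # Score a new item must exceed to enter the list once it holds k items.
        return self.heap[0][0] if len(self.heap) >= self.k else float("-inf")


def score(query, item_terms, importance, alpha=0.5):
    """Assumed linear mix of query-independent importance and
    query-dependent relevance (dot product over shared terms)."""
    relevance = sum(w * item_terms.get(t, 0.0) for t, w in query.terms.items())
    return alpha * importance + (1 - alpha) * relevance


def process_item(queries, item_id, item_terms, importance, alpha=0.5):
    """Naive baseline: test every query and return those whose top-k changed."""
    updated = []
    for q in queries:
        # Assumption: an item qualifies only if it shares a term with the query.
        if not any(t in item_terms for t in q.terms):
            continue
        s = score(q, item_terms, importance, alpha)
        if s > q.threshold():
            if len(q.heap) >= q.k:
                # Replace the current weakest result with the new item.
                heapq.heapreplace(q.heap, (s, item_id))
            else:
                heapq.heappush(q.heap, (s, item_id))
            updated.append(q)
    return updated
```

Each call to `process_item` costs time proportional to the number of registered queries. Pruning with per-query thresholds, as the abstract describes, aims to skip queries whose current k-th score the new item cannot beat.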
Document type :
Conference papers

https://hal.sorbonne-universite.fr/hal-01359505
Contributor : Bernd Amann
Submitted on : Friday, September 2, 2016 - 2:50:37 PM
Last modification on : Wednesday, May 15, 2019 - 3:35:29 AM

Citation

Nelly Vouzoukidou, Bernd Amann, Vassilis Christophides. Processing continuous text queries featuring non-homogeneous scoring functions. 21st ACM International Conference on Information and Knowledge Management, Oct 2012, Maui, Hawaii, United States. pp.1065-1074 ⟨10.1145/2396761.2398404⟩. ⟨hal-01359505⟩
