Running slow queries with Lucene

06/12/2017 - 16:30 to 17:10
Palais Atelier
long talk (40 min)

Session abstract: 

Search engines like Lucene are designed to run full-text queries as fast as possible. You can search for combinations of keywords using boolean operators, and Lucene will give you results in milliseconds. This is possible thanks to the inverted index, which stores a sorted list of document ids for every term. Boolean queries then only have to compute the intersection or union of these sorted lists, which is a cheap operation. However, in the real world, users often want to run more complicated queries, such as phrase queries, range queries or script queries, for which a sorted list of ids is not readily available. In this session, we will dive into how Lucene executes queries, and in particular into recent improvements to the execution of slow queries. No prior knowledge of Lucene is required, although attendees who have been exposed to Lucene, Solr or Elasticsearch in the past are more likely to enjoy this session.
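To make the idea of sorted id lists concrete, here is a minimal sketch in Python of a toy inverted index with boolean AND and OR computed by merging sorted lists. It is only an illustration of the principle described above, not Lucene's implementation; the function names (build_inverted_index, intersect, union) and the sample documents are assumptions made up for this example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of ids of the documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def intersect(a, b):
    """Boolean AND: walk both sorted lists once, keeping ids present in both."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def union(a, b):
    """Boolean OR: merge both sorted lists once, dropping duplicate ids."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


# Hypothetical sample documents; the list position is the document id.
docs = ["fast full text search", "slow range queries", "fast range search"]
index = build_inverted_index(docs)

fast_and_range = intersect(index["fast"], index["range"])
fast_or_range = union(index["fast"], index["range"])
```

Both operations visit each id at most once, which is why boolean queries over terms are cheap. A phrase query such as "range search" cannot be answered from these lists alone: the intersection only yields candidate documents, and each candidate still has to be checked for the terms appearing next to each other.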