Sunday, 28 July 2013

Bangalore Apache Solr / Lucene Meetup No. 2: A Report

The second Bangalore Apache Solr / Lucene Meetup, held yesterday, drew an energetic crowd of Lucene and Solr developers and practitioners with varying levels of expertise. The community has grown rapidly over the last couple of months to more than 180 members, and both meetups have been well attended. This one took place at Flipkart's Koramangala office, and Flipkart was kind enough to host the meet and arrange the morning refreshments.

Solr committer and LucidWorks engineer Shalin Shekhar Mangar started off with the first session of the day. After a show of hands by attendees revealed quite a few newbies to Solr and Lucene, Shalin gave a brief introduction and basic demonstration of Solr 4.4, which got released about a week back.

My session on "Knowledge Search at Infosys" was next, where I highlighted the use of Nutch and Solr for enterprise search within Infosys. The talk covered how we use and customize Nutch to perform federated search over a million documents across multiple sources, and a Solr setup that powers people and expertise search as well as near-real-time search over microblogging discussions.

Flipkart's deep dive session followed, where Umesh Prasad and Thejus presented their specific architecture, the inability of TF-IDF to measure up to e-commerce search requirements, their high latency cache, external fields and relevance tuning.

The final talk of the day was from Jaideep Dhok of InMobi, who highlighted the possibility of building a percolator with Lucene, with applications in log debugging and streaming, to name a couple.

Overall, another great day of insightful interactions and an opportunity to connect with practitioners in the Search and Information Retrieval domain.

5 comments:

Ajay V said...

Very very glad to hear that you had the opportunity to present and also a bit sorry that I could not see it live.

By the way, what is this percolator that is talked about - a search for application logs?

Prasanna said...

Hi Ajay, thanks a lot. I'll see what can be shared. Met Aswathi on the day too, totally unexpected :)

Ok, coming to the percolator, it is a very interesting concept, a kind of "reverse search". Instead of indexing the documents, you "register" the queries you want to keep track of and then feed the documents to this registered store. You get back all the queries that matched each document. Very intelligent stuff, isn't it?

This finds usage in handling huge logs for example, where you know what to track and get back a response for the log files you feed in.
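
To make the "reverse search" idea concrete, here is a toy sketch in Python. It is not how Lucene or Elasticsearch implement percolation, and it is not what was presented in the talk. The class `ToyPercolator`, its methods and the sample log line are all made up for illustration, and a query is simplified to "every term must appear in the document".

```python
import re


class ToyPercolator:
    """Stores queries and reports which of them match an incoming document.

    Hypothetical illustration only: a query here is a set of terms that
    must all be present in the document (a plain AND query).
    """

    def __init__(self):
        self._queries = {}  # query id -> set of required terms

    @staticmethod
    def _tokenize(text):
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def register(self, query_id, query_text):
        """Register a query instead of indexing a document."""
        terms = set(self._tokenize(query_text))
        if not terms:
            raise ValueError("query must contain at least one term")
        self._queries[query_id] = terms

    def percolate(self, document):
        """Return the ids of all registered queries that match the document."""
        doc_terms = set(self._tokenize(document))
        return sorted(
            query_id
            for query_id, required in self._queries.items()
            if required <= doc_terms  # every query term occurs in the document
        )


percolator = ToyPercolator()
percolator.register("disk-full", "disk full")
percolator.register("timeout", "connection timeout")

# Feed a (made-up) log line and see which registered queries it triggers.
matches = percolator.percolate("2013-07-28 ERROR Disk is full on node 7")
print(matches)
```

A real percolator would not loop over every registered query for each document. The loop above only shows the direction of the lookup: the queries are stored, and the documents are what you search with.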

Elasticsearch, a Solr competitor that is also built on top of Lucene, supports this feature: http://www.elasticsearch.org/guide/reference/api/percolate/

Prasanna said...

Photos from the event here, credits to Anshum Gupta, one of the organizers: http://www.flickr.com/photos/38935700@N07/sets/72157634843778260/

shrikanth kondupalli said...

Hi Prasanna, I attended this Solr meetup and it was great to see your presentation. I got up to date on some Solr internals. If possible, please share the presentation. - Shrikanth

Prasanna said...

Hi Shrikanth, thanks. Will try to share it once I get clearances from the organization. It is taking a while...