Session abstract:
Your CEO runs up to you looking scared. Your competitors are recommending related articles based on context and machine learning, and your current ML system keeps crashing.
Our embedded iframe sits inside popular news sites with millions of articles and thousands of concurrent visitors, so the system's uptime should at least match that of these well-established companies. You have to fix it, now.
What do you do? Run? Convince the CEO that Machine Learning and Natural Language Processing are passing trends? Or do you reach for open source tools and set out to do something better than your competitors in just a few days?
We went for the third option, using Elasticsearch as the heart of the system.
We used Elasticsearch dynamic templates for the mappings, which support specific types such as geo-points and dates while still letting users add fields and events dynamically.
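To make the idea concrete, here is a minimal sketch of what such a mapping could look like, written as a Python dict that serializes to the JSON body of an index-creation request. The field-name patterns (`*_location`, `*_at`), the template names and the `title` field are illustrative assumptions, not the mappings from our system. The layout follows the typeless mapping format of recent Elasticsearch versions; older versions nest the mapping under a type name.

```python
import json


def build_index_body() -> dict:
    """Return an index body whose dynamic templates type new fields by name.

    All patterns and names below are hypothetical examples.
    """
    return {
        "mappings": {
            # Templates are checked in order; the first match wins.
            "dynamic_templates": [
                # Any new field ending in "_location" becomes a geo_point.
                {"geo_points": {
                    "match": "*_location",
                    "mapping": {"type": "geo_point"},
                }},
                # Any new field ending in "_at" becomes a date.
                {"dates": {
                    "match": "*_at",
                    "mapping": {"type": "date"},
                }},
                # Every other new string field is indexed as an exact keyword.
                {"strings_as_keywords": {
                    "match_mapping_type": "string",
                    "mapping": {"type": "keyword"},
                }},
            ],
            # Fields known up front can still be declared explicitly.
            "properties": {"title": {"type": "text"}},
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_index_body(), indent=2))
```

With templates like these, a user-supplied event containing a new `signup_location` or `clicked_at` field would get the intended type without anyone editing the mapping by hand.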
We wanted simplicity and reliability in an embarrassingly parallel system, so we built it on reactive streams. This gave us an asynchronous recommendation engine that caches recommendation results in the background so they can be served promptly when the frontend asks for them. The design has proven resilient enough to let us sleep, simple enough to be maintainable, and flexible enough to serve millions of users while keeping costs low.
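The background-caching pattern can be sketched in a few lines. This is not our production code and it does not use a reactive streams library; it stands in Python's `asyncio` with a bounded queue, which gives a similar backpressure effect. `compute_recommendations` is a hypothetical placeholder for the real Elasticsearch query, and all other names are assumptions made for illustration.

```python
import asyncio


async def compute_recommendations(article_id: str) -> list:
    """Hypothetical stand-in for the real Elasticsearch query."""
    await asyncio.sleep(0)  # yield control, as real I/O would
    return [f"{article_id}-related-{i}" for i in range(3)]


async def worker(queue: asyncio.Queue, cache: dict) -> None:
    """Pull article ids off the queue and fill the cache in the background."""
    while True:
        article_id = await queue.get()
        try:
            cache[article_id] = await compute_recommendations(article_id)
        finally:
            queue.task_done()


def serve(cache: dict, article_id: str) -> list:
    """Frontend path: answer from the cache only, never compute inline."""
    return cache.get(article_id, [])


async def run_pipeline(article_ids, workers: int = 4, max_pending: int = 100) -> dict:
    # A bounded queue makes the producer wait when workers fall behind.
    queue = asyncio.Queue(maxsize=max_pending)
    cache = {}
    tasks = [asyncio.create_task(worker(queue, cache)) for _ in range(workers)]
    for article_id in article_ids:
        await queue.put(article_id)  # suspends here while the queue is full
    await queue.join()  # wait until every queued id has been processed
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return cache


if __name__ == "__main__":
    filled = asyncio.run(run_pipeline(["a1", "a2"]))
    print(serve(filled, "a1"))
```

Because each article is processed independently, adding workers scales the pipeline without any coordination between them, and a slow or failed computation only means the frontend gets an empty answer for that article rather than a stalled request.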
These kinds of scenarios happen on a daily basis. I will demonstrate how the right design decisions got the product out the door on time, kept management happy, and kept us engineers sane despite the time pressure. If you are tired of those nightly dinner "treats", here's a solution.