How to improve on-site search on your e-commerce site with Elasticsearch

If you still use MySQL or another relational database for search on your e-commerce site, then you definitely need to read this blog post. I'm going to describe what Elasticsearch is and how it can dramatically improve customer satisfaction and, potentially, increase sales. Intrigued? Then continue reading.

This blog post is not about how Elasticsearch works under the hood. It describes the high-level features that can help you make the right decision and migrate away from a poor MySQL-based solution.

Scenario

Imagine that you have an e-commerce project and you use MySQL as a full-text search engine. Your customers want to search for products by category, price, brand, title and description. Moreover, you spend money on advertising and have a lot of visitors but, for some reason, you are not earning as much as you would like. What is the problem? Maybe customers can't find what they're looking for, even if you have it. What?!

Let's analyse where MySQL is good and what it offers to satisfy our customers.

MySQL as a full-text search solution

Even though MySQL has full-text search features, it was designed to perform structured queries. If you use or plan to use it, you should be aware of its pros and cons.

Pros

- Good structured search

By structured search people usually mean search on data with a precise structure. For example, price, date, age, size and so on. If the data has a precise structure, then we can apply logical operations to it. For example:

Show me laptops with the Dell brand, priced between 1000 and 3000 that were produced in 2015

Another example:

Show me monitors with the display size of 23 inches or more

Structured search uses boolean logic. It answers one question: "Does a product match the request or not?" The data you search through must match all your criteria or it will be filtered out.
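To make the first request above concrete, here is a minimal sketch. It uses Python's built-in SQLite as a stand-in for MySQL so that it runs anywhere; the `products` table, its columns and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table, columns and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products "
    "(title TEXT, category TEXT, brand TEXT, price REAL, year INTEGER)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?, ?)",
    [
        ("Laptop A", "laptop", "Dell", 1500, 2015),    # matches every condition
        ("Laptop B", "laptop", "Dell", 700, 2015),     # too cheap
        ("Laptop C", "laptop", "Dell", 1200, 2014),    # wrong year
        ("Laptop D", "laptop", "Lenovo", 1300, 2015),  # wrong brand
    ],
)

# Every condition must hold: a row either matches or is filtered out.
rows = conn.execute(
    """
    SELECT title FROM products
    WHERE category = ? AND brand = ? AND price BETWEEN ? AND ? AND year = ?
    ORDER BY title
    """,
    ("laptop", "Dell", 1000, 3000, 2015),
).fetchall()

matching_titles = [title for (title,) in rows]
print(matching_titles)
```

There is no notion of "almost matching" here: a Dell laptop from 2014 is simply not in the result.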

- Full-text (words) search with relevance

Word searches (as opposed to phrase searches) are possible and quite easy. For example, products in our database have the following titles:

- dining table and chairs

- Ikea Wooden Dining Table With 4 Chairs Including Cushions

- Small Dining Table Square Kitchen Budget 2 People Furniture Solid Wooden Oak

- Designer Black Dining Table With 2 Chairs Wooden Canterbury Breakfast Set Room

And our query looks like:

wooden dining table

If we run a full-text search using `match against`, we will get the following results:

- Ikea Wooden Dining Table With 4 Chairs Including Cushions

- Small Dining Table Square Kitchen Budget 2 People Furniture Solid Wooden Oak

- Designer Black Dining Table With 2 Chairs Wooden Canterbury Breakfast Set Room

- dining table and chairs

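`Match against` treats our phrase as three separate words and looks for them regardless of their order; a product doesn't have to contain all of them to match. It scores each product, and this score is called relevance. This type of search answers the question "How relevant is each product to the user's request?" In the results above, the three titles that contain all three words come first, and `dining table and chairs`, which lacks `wooden`, comes last.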
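The idea of scoring by words rather than by phrase can be sketched in a few lines of Python. This is a toy model, not MySQL's real ranking formula: the `score` function below simply counts how many distinct query words appear in a title.

```python
titles = [
    "dining table and chairs",
    "Ikea Wooden Dining Table With 4 Chairs Including Cushions",
    "Small Dining Table Square Kitchen Budget 2 People Furniture Solid Wooden Oak",
    "Designer Black Dining Table With 2 Chairs Wooden Canterbury Breakfast Set Room",
]


def score(title: str, query: str) -> int:
    """Toy relevance: number of distinct query words found in the title, in any order."""
    title_words = set(title.lower().split())
    query_words = set(query.lower().split())
    return len(query_words & title_words)


def search(query: str) -> list[str]:
    """Return titles that match at least one query word, most relevant first."""
    hits = [t for t in titles if score(t, query) > 0]
    # sorted() is stable, so titles with equal scores keep their original order.
    return sorted(hits, key=lambda t: score(t, query), reverse=True)


for title in search("wooden dining table"):
    print(score(title, "wooden dining table"), title)
```

Word order plays no role in this toy score, which is exactly why it can't tell a title containing the phrase `wooden dining table` from one that merely contains the same three words.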

- Free

MySQL Community Edition is free, so you can use it in your projects without spending extra money.

Cons

- Limited phrase search

As we have learnt above, MySQL supports full-text word searches with relevance. What if we need to match products by the whole phrase? MySQL gives you few tools for this out of the box: boolean mode can match an exact phrase in double quotes, but that is about it. Of course, you can use a workaround with `regexp` queries, but it's a very limited solution.

- No highlighting, search suggestions, fuzzy search, typo correction, autocompletion or powerful full-text search filters

Hmmm, it seems that MySQL doesn't support a lot of interesting features. What should we do? The answer is "Use Elasticsearch".

Elasticsearch to the rescue

Elasticsearch provides some of the most powerful full-text search capabilities available in any open source product. Moreover, it's used by really big players in the market: Facebook, Microsoft, Cisco, Uber, The New York Times, Mozilla, eBay, Goldman Sachs and many more. If so many famous players rely on it, it's a reasonable bet that we can trust Elasticsearch and should try it, too.

Let's go through the cons of MySQL and check whether and how Elasticsearch solves them.

Pros

- Phrase search

It works out of the box in Elasticsearch. It also supports modifiers like Google does. By modifiers I mean `+` and `-` (a result must or must not contain a word) and so on. And again, after such a search you get results sorted by relevance by default. In many cases that's exactly what we need.
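As a rough illustration, here is how the request bodies for these two kinds of search might look when built in Python. `match_phrase` and `simple_query_string` are Elasticsearch query types; the `title` field and the helper function names are assumptions made for this sketch, and nothing is sent to a server.

```python
import json


def phrase_query(field: str, phrase: str) -> dict:
    """Body for a search where the words of `phrase` must appear together, in order."""
    return {"query": {"match_phrase": {field: phrase}}}


def modifier_query(fields: list[str], text: str) -> dict:
    """Body for a search with Google-like syntax: `-word` excludes, quotes mark a phrase."""
    return {
        "query": {
            "simple_query_string": {
                "query": text,
                "fields": fields,
                # With "and", every term is required unless it is negated with `-`.
                "default_operator": "and",
            }
        }
    }


print(json.dumps(phrase_query("title", "wooden dining table"), indent=2))
print(json.dumps(modifier_query(["title"], '"dining table" wooden -oak'), indent=2))
```

Either dictionary would be sent as the JSON body of a search request against your product index.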

- Powerful filters

Human language is quite complex. What if we could understand our customers better? Every language has its stop words - small words that carry little meaning on their own. In English, for example, these include the articles `a`, `an` and `the`. If we think a little more, we will remember that there are synonyms, too. We can go even further: words have roots, so `table` and `tables` can be treated as the same term. All of these features can be configured and used in Elasticsearch. Just a little example:

User searches for:

tables

and we have the same products in our database:

- dining table and chairs

- Ikea Wooden Dining Table With 4 Chairs Including Cushions

- Small Dining Table Square Kitchen Budget 2 People Furniture Solid Wooden Oak

- Designer Black Dining Table With 2 Chairs Wooden Canterbury Breakfast Set Room

If we used old-school `match against` in MySQL, it would return zero results. That may seem unclear, maybe even confusing. Why has MySQL returned zero results? Because, for MySQL, the word `tables` is not the same as `table`. Here Elasticsearch shines again. Both words have the same root - `table`. So if we use a stemming filter in Elasticsearch, the user will see four products instead of zero. Great, this is exactly what we need! Elasticsearch has many useful filters; I have shown you only the tip of the iceberg.
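Here is a sketch of what such a configuration could look like, followed by a toy imitation of its effect. `lowercase`, `stop` and `porter_stem` are built-in Elasticsearch token filters; the analyzer name `product_text` and the `title` field are assumptions. The `toy_analyze` function is a deliberately naive stand-in (it only strips a plural `s`), not the real stemmer.

```python
# Index settings: a custom analyzer that lowercases, drops stop words, then stems.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "product_text": {  # hypothetical analyzer name
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "stop", "porter_stem"],
                }
            }
        }
    },
    "mappings": {
        "properties": {"title": {"type": "text", "analyzer": "product_text"}}
    },
}

STOP_WORDS = {"a", "an", "the", "and", "with"}  # tiny illustrative list


def toy_analyze(text: str) -> set[str]:
    """Naive imitation of the analyzer: lowercase, drop stop words, strip a plural 's'."""
    tokens = set()
    for word in text.lower().split():
        if word in STOP_WORDS:
            continue
        if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
            word = word[:-1]
        tokens.add(word)
    return tokens


titles = [
    "dining table and chairs",
    "Ikea Wooden Dining Table With 4 Chairs Including Cushions",
    "Small Dining Table Square Kitchen Budget 2 People Furniture Solid Wooden Oak",
    "Designer Black Dining Table With 2 Chairs Wooden Canterbury Breakfast Set Room",
]

# The query and the titles go through the same analysis, so `tables` meets `table`.
query_tokens = toy_analyze("tables")
matches = [t for t in titles if query_tokens & toy_analyze(t)]
print(len(matches), "products found")
```

The key point is that the same analysis runs on both the indexed titles and the query, so both sides end up with the token `table`.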

- Highlighting

Users like to see highlighted results, especially if they search through tons of products, documents and their long descriptions. Here are good examples of this feature in use:

The New York Times:

[Image: search results highlighting on The New York Times]

Google:

[Image: search results highlighting on Google]

Facebook:

[Image: search results highlighting on Facebook]

As you can see, highlighting makes it easy for users to quickly locate the parts of the content that interest them.
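In Elasticsearch, highlighting is requested as part of the search body. The sketch below builds such a body in Python; the `description` field, the `<mark>` tag and the function name are assumptions chosen for illustration.

```python
import json


def highlighted_search(field: str, text: str, tag: str = "mark") -> dict:
    """Search body that also asks Elasticsearch to wrap matched words in an HTML tag."""
    return {
        "query": {"match": {field: text}},
        "highlight": {
            "pre_tags": [f"<{tag}>"],    # inserted before each matched word
            "post_tags": [f"</{tag}>"],  # inserted after each matched word
            "fields": {field: {}},       # highlight this field with default settings
        },
    }


print(json.dumps(highlighted_search("description", "wooden table"), indent=2))
```

Each hit in the response then carries the highlighted fragments next to the original document, ready to be rendered on the results page.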

- Suggestions

One more important feature, also known as "Did you mean". From time to time users make typos or don't know exactly how to spell a product's title. With MySQL we would show them an empty page, but with Elasticsearch it's possible to let the user know that something went wrong and suggest the right option. Again, here are a few good examples from real-world projects:

Amazon:

[Image: "Did you mean" example on Amazon]

Google:

[Image: "Did you mean" example on Google]

YouTube:

[Image: "Did you mean" example on YouTube]

Why do they do this? Because it's another chance to get a satisfied customer. For example, on Amazon a user may notice the typo, click on the "did you mean" link and see the results they are interested in. If users see what they are looking for, your chances of selling it to them increase. With this simple feature we help users.
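One way to get "did you mean" candidates from Elasticsearch is its term suggester. The sketch below builds the request body and picks the best option for each word from a response. The suggestion name `did_you_mean`, the `title` field and the helper names are assumptions, and `sample_response` is a hand-written example of the response shape, not real server output.

```python
def did_you_mean_body(field: str, text: str) -> dict:
    """Request body asking the term suggester for corrections of each word in `text`."""
    return {"suggest": {"did_you_mean": {"text": text, "term": {"field": field}}}}


def corrected_text(response: dict) -> str:
    """Rebuild the query, replacing each word with its first suggested option, if any."""
    words = []
    for entry in response["suggest"]["did_you_mean"]:
        options = entry["options"]
        words.append(options[0]["text"] if options else entry["text"])
    return " ".join(words)


# Hand-written illustration of the response shape (values are invented).
sample_response = {
    "suggest": {
        "did_you_mean": [
            {"text": "laptop", "offset": 0, "length": 6, "options": []},
            {
                "text": "lenovp",
                "offset": 7,
                "length": 6,
                "options": [{"text": "lenovo", "score": 0.83, "freq": 12}],
            },
        ]
    }
}

print(did_you_mean_body("title", "laptop lenovp"))
print("Did you mean:", corrected_text(sample_response))
```

Words that are spelled correctly come back with an empty `options` list, so they are kept as typed.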

- Fuzzy search

This type of search measures similarity using the Levenshtein edit distance: the minimum number of single-character changes (insertions, deletions or substitutions) needed to turn one word into another. For example, the user is looking for

laptop lenovo

but makes a typo:

laptop lenovp

If we used MySQL search, we would get zero results. Yes, it's sad, because we've just lost a client. With Elasticsearch and fuzziness we can help the user even in such cases. For example, if we set the Elasticsearch fuzziness to 1 (every term that is at most one character change away from the query term still matches), the user will see results for `laptop lenovo` and everything is fine again. Nice? Sure!
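To see why `lenovp` is within reach of `lenovo`, here is a small, plain-Python implementation of the Levenshtein distance, followed by a sketch of a fuzzy `match` query body. The `title` field and the function names are assumptions; the distance function only illustrates the concept and is not how Elasticsearch computes it internally.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ch_a
                    current[j - 1] + 1,      # insert ch_b
                    previous[j - 1] + cost,  # substitute, or keep if equal
                )
            )
        previous = current
    return previous[-1]


def fuzzy_query(field: str, text: str, fuzziness: int = 1) -> dict:
    """Search body that tolerates up to `fuzziness` edits in each query term."""
    return {"query": {"match": {field: {"query": text, "fuzziness": fuzziness}}}}


print(levenshtein("lenovp", "lenovo"))
print(fuzzy_query("title", "laptop lenovp"))
```

Since the typo is a single substitution (`p` for `o`), a fuzziness of 1 is enough for the term to match.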

- Autocompletion

OK, we're now familiar with many useful features of Elasticsearch. We know how to help users if they make mistakes and how to show them the right path. Not so bad, right? Yes, but we can do better. Instead of correcting a failure, we can prevent it altogether. For this approach Elasticsearch has a secret weapon - autocompletion. Just imagine: the user starts typing and, after a few characters, sees possible options. There is no need to keep typing; the user can pick one of the proposed variants. And a user who picks one of our own options can't make a typo. This way we achieve two important things:

1) We remove the unpleasant "Sorry, you made a mistake. Did you mean ..." step - an even better user experience

2) The user gets to the product they want one step faster - this one is critical

Who is using this feature? Plenty of sites. Here are a few of them:

eBay:

[Image: autocompletion on eBay]

Amazon:

[Image: autocompletion on Amazon]

Google:

[Image: autocompletion on Google]
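One way to build this in Elasticsearch is the completion suggester, which needs a dedicated field of type `completion`. The sketch below shows a possible mapping and query body, plus a toy in-memory version of prefix matching. The `title_suggest` field, the suggestion name `product_suggest` and the function names are assumptions made for illustration.

```python
# Mapping with a dedicated field for autocompletion (field name is hypothetical).
suggest_mapping = {
    "mappings": {"properties": {"title_suggest": {"type": "completion"}}}
}


def autocomplete_body(prefix: str, size: int = 5) -> dict:
    """Request body asking for up to `size` completions of what the user has typed."""
    return {
        "suggest": {
            "product_suggest": {
                "prefix": prefix,
                "completion": {"field": "title_suggest", "size": size},
            }
        }
    }


def toy_complete(prefix: str, titles: list[str], size: int = 5) -> list[str]:
    """Toy imitation: case-insensitive prefix match over an in-memory list of titles."""
    wanted = prefix.lower()
    return [t for t in sorted(titles) if t.lower().startswith(wanted)][:size]


print(autocomplete_body("lap"))
print(toy_complete("lap", ["laptop lenovo", "laptop dell", "lamp", "table"]))
```

On the front end you would send this request as the user types and render the returned options under the search box.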

- Free

Man, you're lucky today: Elasticsearch won't cost you a penny. You can download it and run it yourself for free; just check the license terms of the version you pick.

Conclusion

If you have an e-commerce site and need only structured search by category, price, brand, year and so on, then MySQL will be OK for you. Otherwise, if you have tons of products with text titles and descriptions, your on-site search must be more powerful and flexible, and Elasticsearch is the right tool for the job. All the Elasticsearch features described in this blog post aim to help users find what they are looking for. The more easily users can find your products using full-text search, the more products they will buy. I think it's exactly what you need.