Technology

A picture is worth a thousand words

What if, in the future, we only use photos when searching for things online? Image recognition is already improving many services dramatically, including some at Schibsted’s marketplaces.

Long before we could read or write, visuals and sounds were our only means of communication: gestures, body language and simple drawings in the sand, accompanied by some sweet “Uh ga Chaka”. In fact, writing systems with letters are merely abstractions of the visual communication that preceded them. One could argue that written language has been a temporary means of communication while we waited for technology that can process and derive meaning from the incredible amount of information that visual formats (e.g. photo or video) carry compared to written words. That technology is here, and it’s here to stay. Its name is image recognition (or computer vision), and Schibsted is already experimenting with it.

Until recently, computers were only able to “see” our world. With recent advances in artificial intelligence, however, computers are now able to “interpret” and “understand” what they “see”, sometimes even better than we do! Computers used to be like newborn babies: most have eyes that give them a glimpse of the world around them, yet it is not until they learn to focus their vision that they start to truly comprehend what is around them.

Speed and scale impossible for humans

Video is just a fast sequence of images. Images are data, and these data are processed by algorithms that are trained to recognize patterns in them. Different models return different results depending on the desired use case. With cloud computing and high-resolution imagery, computers can now perform tasks at speeds and scales that are impossible for humans: detecting nanometer deviations in a production line; tracking and predicting hurricane impacts; guiding autonomous vehicles through traffic; translating text in real time on your trips around the world (ref. Google Translate); and finding an individual among hours of video in a matter of minutes.

This will inevitably alter how we go about our daily lives as products and services leverage the huge potential in computer vision.

This is particularly interesting for marketplaces that are about matchmaking (a.k.a. liquidity) – bringing buyers and sellers together and facilitating a transaction between them. Amongst other things, image recognition can improve the quality of data by enriching ads with metadata, and the user experience by enhancing the search and discovery part of the customer journey.

There is a multitude of computer vision applications from a host of providers. Companies like Google, Amazon, eBay and (in particular) Pinterest have been working on their own solutions for image recognition, and all of them have lately scaled up their investments in the technology. Schibsted also has some features that are already live, and more are in the pipeline. Below are some of our favorite and most promising use cases within marketplaces.

Visual search

Our culture is already dominated by visual stimuli. Half of the human brain is (directly or indirectly) devoted to the processing of visual information (MIT, 1996). It seems only natural that search and discovery start with an image. Visual search, i.e. search with imagery as input (as opposed to text), enables quicker search and more accurate results, making for a better user experience in the search and discovery phases. This drives better matchmaking. Notably, Pinterest recently integrated Shoppable Pins directly with their visual search (Pinterest Lens), which could disrupt the traditional customer journey as we know it. Finn Torget has an image search (beta) on web. Try it: finn.no/bildebeta

Category suggestions

Sometimes the sellers on our marketplaces place their ads in the first category that comes to mind, and that category is not always the correct one. Although one can always improve the category taxonomy, the problem can also be solved with smart suggestions based on image recognition. The category suggestion service from the Cognition team helps people select the right category while they submit a classified ad. The user snaps a picture and the service suggests which categories the ad could belong in. This removes friction, shortens ad insertion time and ensures that ads are categorized correctly so they are easy to find. This feature is in production on Blocket.
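
To make the idea concrete, here is a minimal sketch of the last step of such a service: turning a classifier's scores into a short list of suggestions. It is not the Cognition team's implementation. The function name, the threshold and the example scores are all hypothetical and chosen purely for illustration.

```python
def suggest_categories(scores, top_k=3, min_score=0.1):
    """Return up to top_k (category, score) pairs, best first.

    `scores` maps category names to the confidence an image classifier
    assigned to the uploaded photo. Suggestions below `min_score` are
    dropped so the seller is not shown unlikely categories.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(category, score) for category, score in ranked[:top_k]
            if score >= min_score]


# Hypothetical classifier output for one uploaded photo of a sofa.
example_scores = {"Sofas": 0.72, "Armchairs": 0.18, "Beds": 0.06, "Lamps": 0.04}

print(suggest_categories(example_scores))
# [('Sofas', 0.72), ('Armchairs', 0.18)]
```

Showing a few ranked suggestions rather than picking a single category leaves the final choice with the seller, which is one plausible way to keep the service helpful even when the model is unsure.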

Recommend visually similar ads

Image recognition can capture far more facets of an image than a human is able to articulate in a text search, or than collaborative filtering (a common recommendation method) is able to account for. Recommendations based on visual similarity can therefore be more relevant, which makes it easier for buyers to find exactly what they are looking for.

Another useful distinction between recommendations based on collaborative filtering and those based on visual similarity is that the former tend to favor recently published ads (because of users’ behavior), while the latter are often indifferent to the age of the ads (as similarity trumps age). This means recommendations based on visual similarity can give new life to older ads and hopefully find them a new owner. This feature is in production for all categories on FINN Torget.
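
A common way to measure visual similarity is to have a model turn each image into a list of numbers (a feature vector) and then compare those vectors, for instance with cosine similarity. The sketch below assumes that approach. It is not a description of how FINN's feature is built, and the ad names and three-number vectors are made up; real vectors typically have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector carries no visual information
    return dot / (norm_a * norm_b)


def most_similar_ads(query_vector, ad_vectors, top_k=2):
    """Return the ids of the top_k ads whose image vectors best match the query."""
    scored = [(ad_id, cosine_similarity(query_vector, vector))
              for ad_id, vector in ad_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [ad_id for ad_id, _ in scored[:top_k]]


# Hypothetical feature vectors for three ads.
ad_vectors = {
    "old_retro_sofa": [0.9, 0.1, 0.0],
    "new_office_chair": [0.1, 0.9, 0.2],
    "vintage_couch": [0.8, 0.2, 0.1],
}
query = [1.0, 0.1, 0.0]  # vector for the ad the buyer is currently viewing

print(most_similar_ads(query, ad_vectors))
```

Note that the publication date never enters the calculation: an old ad ranks as high as a new one if it looks the part, which is exactly why this kind of recommendation can revive older ads.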

Metadata enrichment

We know that some sellers “forget” to include significant information about their object in the ad. Meanwhile, some buyers struggle to find exactly what they are looking for even though it might exist on the marketplace. We can cut the humans out of the equation and jump straight to automation. Image recognition allows us to populate an ad with additional metadata based on the uploaded image(s), which makes the ad more “searchable” and consequently easier to find. That way we boost liquidity, and both the seller and the buyer are happy! What’s more, it could even help estimate an object’s value and auto-generate relevant alt text for images, which is great for accessibility and SEO. A working prototype was built using Google Cloud Vision’s API during FINN hack days in May 2019.
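
As a rough illustration of the enrichment step, the sketch below merges labels detected in an image into an ad's tags and derives a simple alt text from them. It does not reproduce the hack days prototype or call any real vision API: the detected labels are hard-coded stand-ins for whatever such a service returns, and the function name, field names and confidence threshold are assumptions.

```python
def enrich_ad(ad, detected_labels, min_confidence=0.8):
    """Return a copy of `ad` with confident image labels added as tags.

    `detected_labels` is a list of (label, confidence) pairs, standing in
    for the output of an image recognition service. Labels the seller
    already supplied are not duplicated.
    """
    enriched = dict(ad)
    tags = list(ad.get("tags", []))
    seen = {tag.lower() for tag in tags}
    for label, confidence in detected_labels:
        key = label.lower()
        if confidence >= min_confidence and key not in seen:
            tags.append(key)
            seen.add(key)
    enriched["tags"] = tags
    # Only generate alt text when the seller has not written one.
    if not enriched.get("alt_text"):
        enriched["alt_text"] = ", ".join(tags)
    return enriched


# Hypothetical ad and hypothetical labels for its uploaded photo.
ad = {"title": "Sofa for sale", "tags": ["sofa"]}
labels = [("Sofa", 0.97), ("Leather", 0.91), ("Brown", 0.85), ("Table", 0.40)]

print(enrich_ad(ad, labels))
# {'title': 'Sofa for sale', 'tags': ['sofa', 'leather', 'brown'], 'alt_text': 'sofa, leather, brown'}
```

The confidence threshold matters here: adding low-confidence labels such as the "Table" above would make ads show up in searches where they do not belong.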

Cropped visual search

Over the past few years, we have seen the rise of inspirational platforms like social media and Pinterest. Instagram and Pinterest are incredibly good at monetizing the search for inspiration. So why is Schibsted not doing the same? We know that many of our users hang out on our real estate ads for inspiration. What if we could let them crop out the things they like in those images (e.g. that retro sofa or designer lamp) and help them find those things on our generalist marketplace or at Prisjakt? Food for thought. Bon appétit!

NAME: Arber Zagragja

TITLE: Product Manager, Tech Experiments

YEARS IN SCHIBSTED: 3

MY DREAM JOB AS A CHILD: Architect for Snøhetta