Serendipity – In search of the human algorithm

The very same Internet revolution that created the global village has made bumpkins of us all. Systems that organize, curate and recommend information make each of our views of the Internet more and more insular. This lack of diversity makes us lose sight of the benefits of serendipity, which often leads to creativity and new ideas, says Professor R. Ravi, who is examining how to design for coincidence.

Imagine the pleasure of browsing in a large musty bookstore with racks of undiscovered treasures. Now imagine that all the books were of the category that you particularly liked. What if the bookstore itself were full of just your favorite authors and the styles you enjoyed? This is the intrinsic promise of the Internet and its vast collection of curated and classified information. In fact, a precursor to the Internet, conceived by Vannevar Bush in his article “As We May Think”, specifically calls out the ability to navigate this maze of information by following links that lead to related topics.

The sheer volume and scale of information available on the Internet have made categorization, indexing and searching the most important tasks, as attested by the influence and market power of the leading search engine company. In an attempt to make the information presented to a searcher more and more relevant, an important new feature was introduced by Google in 2009: search personalization.

Different results for different people

This means that the search results for the same keyword will differ from person to person. The results for “whaling” will be different for an investment banker and for an environmental activist, and they will also differ between environmental activists in Norway and in the US. Mostly, the difference makes the results more relevant and more convenient to sift through. Similar personalized recommendations exist in most large repositories, such as YouTube, Amazon and news websites. One of the promises of Schibsted’s SPiD system is the development of similar personalization for logged-in customers.

Why this becomes problematic is articulated fully in Eli Pariser’s book “The Filter Bubble”. In short, personalization removes the standard frame of reference that is necessary for common dialogue among an informed populace. Providing that frame is an important responsibility of journalism, media and information sites in a well-functioning democracy.

Removing the common frame of reference puts each of us in our own filter bubble, creating echo chambers where we predominantly hear only from like-minded individuals. Even less noticed is the loss of opportunities to form serendipitous connections to new information, which are important for cultivating a diversity of thought. Such a diverse frame of mind is an important precondition for creative invention, which could also be stifled by excessive personalization.

Myopic recommendation systems

If there are so many problems with personalization, why is it so prevalent? The answer lies in the design decisions made in crafting the clever software systems that allow us to navigate through information. A common metric of effectiveness for such navigation systems (search engines, recommendation systems and relevance engines) is how much responsiveness (click-through) the automatic suggestions receive.

These automated machine learning systems optimize for presenting recommendations that have high responsiveness. While this is a very clear and actionable objective for the design of these systems, it is myopic in scope. The designers of these systems have little understanding of the long-term benefits of serendipity and the eventual harms of filter bubbles. Business managers who look for data-driven ways of managing such systems all too often focus on the same myopic objectives: just look at the effectiveness metrics reported by the various online companies, which are filled with numbers like click-through rates, unique users and engagement time (all of which reward recommendations that users click on and that keep them on the site).

In this age of globalization and rapid cross-cultural mixing, it is even more imperative for a global citizen to be exposed to differing points of view. Not being exposed to diverse opinions and opposing arguments leads to very insular societies; even purely as a business skill, it leaves us unprepared for other reactions and counterarguments to our own thinking. Living in an unduly personalized information world would deny us this important advantage.

Serendipity

Serendipity means a “fortunate happenstance” or “pleasant surprise”. The word was coined by the author Horace Walpole in 1754. In a letter he explained an unexpected discovery he had made by reference to a Persian fairy tale, The Three Princes of Serendip. The princes, he told his correspondent, were “always making discoveries, by accidents and sagacity, of things which they were not in quest of”. (Source: Wikipedia)

There are additional reasons for encouraging and planning for serendipity: letting one’s attention wander is an important precondition for free association and for discovery through the synthesis of ideas from unrelated domains. Indeed, bright flashes of creativity are often preceded by such wanderings among seemingly unrelated concepts.

To counter these negative effects of personalization, you can take some action right away. You can make a personal choice to maximize the chance of serendipitous discovery in your information-seeking habits: don’t use the same search engine every time, and even though you may have a set of favorite bookmarked sites for news, weather, entertainment, sports and fashion, make a habit of visiting alternatives from time to time. Experiment consciously and deliberately look for broader information when using these tools, so that you expand the scope of the filters they impose on you.

Designing for coincidence

Technologists can contribute to this endeavor too: it is possible to design recommendation systems that optimize not only for relevance but also for serendipity. As an example, the basic notion of the popularity of a web page is its PageRank, which can be approximated by the number of other pages that link to it. An information repository can be said to be maximally democratic if all pages have the same PageRank, that is, the same number of pages recommending them. However, through sheer variability in quality as well as personal preferences, these recommendations tend over time to point to a few very popular pages. This winner-take-all syndrome is the result of the self-fulfilling nature of growing popularity as well as the human desire to have a set of common pages to share. My own research examines various ways in which such recommendation engines can be redesigned to maximize some combination of relevance and democratic equality: in addition to recommending relevant and personalized items, the system also pays attention to ensuring that most pages have roughly the same popularity (measured by the number of recommendations they receive).
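The in-link approximation of popularity described above can be illustrated with a small sketch. It is not the actual PageRank computation and not the author's research code: the link graph, the function names inlink_counts and top_share, and the use of the top pages' share as a sign of winner-take-all concentration are all assumptions made for illustration.

```python
from collections import Counter


def inlink_counts(links):
    """Count, for every page, how many distinct other pages link to it.

    `links` maps a page to the list of pages it links to (hypothetical data).
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    counts = Counter({page: 0 for page in pages})
    for source, targets in links.items():
        for target in set(targets):
            if target != source:  # ignore self-links
                counts[target] += 1
    return dict(counts)


def top_share(counts, k=1):
    """Fraction of all in-links captured by the k most linked-to pages."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(sorted(counts.values(), reverse=True)[:k]) / total


# Toy graph in which most pages point at a single hub.
links = {"a": ["hub"], "b": ["hub"], "c": ["hub", "a"], "hub": ["a"]}
counts = inlink_counts(links)
hub_share = top_share(counts, k=1)
```

In a maximally democratic repository of this kind every page would receive the same count; here the hub alone collects three of the five links.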


More democratic systems

To design a system that also encourages equality, it is important to define mathematically the amount of inequality in an existing design. This is complicated by the fact that not all pages are equal in the quality of their content: we may therefore want the resulting popularity of a page in a recommendation system to be proportional to its informative quality. A second complication is the level of interest in the topic of the page: a high-quality page on a fringe topic could still be deemed less popular or less globally relevant than a medium-quality page on a topic of wide interest. This can be modeled by looking at the distribution of interest across the topics of the pages. Putting these ingredients together, one can try to come up with an ideal measure of how popular a page or item should be in the most globally democratic recommendation system.
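One way to make these ingredients concrete is sketched below. The article does not give a formula, so everything here is an assumption: the target popularity of a page is taken to be proportional to quality times topic interest, and inequality is measured as the total variation distance between the observed and the target shares of recommendations. The names target_shares and calibrated_inequality are hypothetical.

```python
def target_shares(quality, interest):
    """Ideal share of recommendations per page.

    Assumption: the ideal share is proportional to quality * topic interest.
    """
    weights = {page: quality[page] * interest[page] for page in quality}
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one page needs a positive weight")
    return {page: weight / total for page, weight in weights.items()}


def calibrated_inequality(recommendation_counts, targets):
    """Total variation distance between observed and ideal shares.

    0.0 means the system matches the ideal exactly; values near 1.0 mean
    the recommendations go almost entirely to the wrong pages.
    """
    total = sum(recommendation_counts.values())
    if total <= 0:
        raise ValueError("no recommendations observed")
    return 0.5 * sum(
        abs(recommendation_counts.get(page, 0) / total - share)
        for page, share in targets.items()
    )


# Two pages of equal quality; topic "x" attracts three times the interest.
targets = target_shares({"x": 1.0, "y": 1.0}, {"x": 3.0, "y": 1.0})
balanced = calibrated_inequality({"x": 75, "y": 25}, targets)
winner_take_all = calibrated_inequality({"x": 100, "y": 0}, targets)
```

Other choices, such as a Gini coefficient over quality-adjusted popularity, would serve the same purpose; the point is only that the measure corrects for quality and interest before calling a distribution unequal.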

Once we have established how to quantify equality after correcting for quality and interest, the task of designing a system that achieves these target popularities remains. Algorithms for suggesting recommendations can be evaluated by how close their results come to the ideal, and new methods can be devised that deliberately take into account this final goal of increasing the calibrated equality of pages. More practical methods can combine existing notions of recommendation effectiveness (such as user responsiveness measured in clicks) with these new notions of equality in their objective, and thus achieve a smooth transition to more democratic suggestions while maintaining a good portion of their current useful behavior.
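A minimal sketch of such a combined objective follows. It assumes a simple linear blend, which the article does not specify: each candidate is scored by its predicted click responsiveness plus a bonus for pages that currently receive less than their target share. The function name rerank, the weight parameter and all numbers are hypothetical.

```python
def rerank(candidates, click_score, current_share, target_share, weight=0.5, k=3):
    """Return the top-k candidates under a blended objective.

    weight=0.0 reproduces a pure click-through ranking; larger weights
    favor pages that are under-recommended relative to their target share.
    """
    def combined(page):
        # Only pages below their target share receive an equality bonus.
        deficit = max(0.0, target_share[page] - current_share[page])
        return (1.0 - weight) * click_score[page] + weight * deficit

    return sorted(candidates, key=combined, reverse=True)[:k]


candidates = ["popular", "niche"]
click_score = {"popular": 0.6, "niche": 0.5}
current_share = {"popular": 0.9, "niche": 0.1}
target_share = {"popular": 0.5, "niche": 0.5}

clicks_only = rerank(candidates, click_score, current_share, target_share, weight=0.0, k=2)
blended = rerank(candidates, click_score, current_share, target_share, weight=0.5, k=2)
```

Raising the weight gradually from zero is one way to obtain the smooth transition mentioned above: the ranking starts as the existing click-driven one and shifts toward under-exposed pages only as far as the operator allows.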

By incorporating such aspects of equality and coincidental discovery in their objectives, technologists can help alleviate the filter bubble problem. Business managers can also help by trying to quantify the value of planning for such serendipitous results in the software systems they are responsible for. One possible source of this value already comes from reducing the “creepy” factor associated with many modern personalization systems, which make the user feel that the software knows too much about her private preferences. When Target sent a coupon book of pregnancy products to a teenager in Minneapolis, it alarmed her father, who learned that she was pregnant only after seeing the book. One antidote employed by such campaigns is to mix in random coupons for non-pregnancy products that are still relevant. This solution reduces the creepiness of the personal targeting while making space to show a more diverse set of products that may result in interesting discoveries.
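The mixing idea can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any retailer's system: the function name mix_in_serendipity, the explore_fraction parameter and the item names are invented, and the extra items are drawn uniformly at random from a broader pool.

```python
import random


def mix_in_serendipity(targeted, broader_pool, n_items, explore_fraction=0.3, rng=None):
    """Blend targeted items with randomly chosen items from a broader pool.

    Roughly `explore_fraction` of the slots go to items outside the targeted
    list; the result is shuffled so the random items are not grouped together.
    """
    if rng is None:
        rng = random.Random()
    extras_pool = [item for item in broader_pool if item not in targeted]
    n_explore = min(round(n_items * explore_fraction), n_items, len(extras_pool))
    extras = rng.sample(extras_pool, n_explore)
    chosen = list(targeted[: n_items - n_explore]) + extras
    rng.shuffle(chosen)
    return chosen


targeted = ["t1", "t2", "t3", "t4", "t5"]
broader_pool = ["t1", "p1", "p2", "p3"]
mixed = mix_in_serendipity(
    targeted, broader_pool, n_items=4, explore_fraction=0.5, rng=random.Random(0)
)
```

With these inputs half of the four slots are filled from the targeted list and half from the rest of the pool; which of the pool items appear depends on the random generator.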

Establishing order for controlled chaos

There is an even broader role for social and political bodies to play in this correction: all bodies that oversee the collection, curation and presentation of information, as well as those that set technological standards, must be mindful of this trade-off between the benefits and harms of personalization. Journalists, sociologists and media executives must weigh in on the debate to highlight the importance of diversity, by exposing the tangible benefits of avoiding excessive personalization and of maintaining a broad and common point of reference. Let us carefully understand the implications of the bargain we are making for better organization of and access to our information, and ask ourselves the following question: are the convenience, speed and quick gratification offered by personalization worth the loss of a more deliberate trawl through the information, one that holds the possibility of finding a golden nugget of coincidental, view-altering perspective along the way? Keep this in mind in your next session of web surfing!