I love libraries and bookstores. I love the tactile, olfactory and social experiences these physical spaces allow. Clearly the Internet has given us ample and exciting new opportunities to engage with information resources, but the digital realm is still a ways off from satisfying many of our real-world needs.
Putting aside the physical niceties of brick and mortar information repositories, one thing the Internet has yet to reproduce is the ability to easily and pleasantly browse its vast reaches. Browsing is a crucial component of information discovery; it allows an information seeker to expand organically upon an initial vague, often unarticulated need.
Imagine heading to the stacks at your local library to browse through the cookbooks. As your eye traverses the shelves, you spot a book on kimchi. This book is exactly what you wanted to read, even if you couldn’t have initially articulated that desire.
Experiences like these sit at the heart of browsing — aimless navigation by subject or genre that brings you to something unexpected, yet ultimately rewarding. Browsing is a common manner of information resource discovery. However, the practice is not well-supported by the search-based or social methods of information discovery that dominate the web today.
How We Lost Browsing to Searching
Search assumes a direct path between the seeker and the sought. Ironically, “search” works best when you have a pretty good sense of what you are looking for. But most people, most of the time, do not have concrete ideas of what they really want.
Netflix knows this. Amazon does too. The two offer users the ability to browse collections by subject or genre. Think of when and why you’ve used the search box, versus when you’ve chosen to simply browse. Search is used to locate known resources (I want to watch The Last Picture Show), and browsing to encounter unknown resources in an organized and meaningful way (I want to watch something funny).
Incidentally, search as we know it was birthed alongside the sort of subject indexing that supports browsing (as is often pointed out, Yahoo! is an acronym for Yet Another Hierarchical Officious Oracle). But because subjects tend to be higher level than keywords, they cannot yet be accurately assigned by computers. In other words, a document that uses the word “cats” 100 times might be considered by search to be more about felines than a document that mentions the word 10 times — even though we know this is not necessarily so. Subject cataloging needs human eyes, and the web grew too big and too fast for such a project to remain feasible.
This was and is unfortunate, because the larger the collection, the more useful browsing becomes.
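The keyword-counting problem described above can be made concrete with a minimal sketch. It assumes the crudest possible ranking, a raw count of how often a term appears, which is far simpler than what any real search engine does. The two documents and the helper names (`term_count`, `rank_by_keyword`) are invented for illustration.

```python
import re
from collections import Counter

def term_count(text: str, term: str) -> int:
    """Count whole-word, case-insensitive occurrences of `term` in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)[term.lower()]

def rank_by_keyword(docs: dict, term: str) -> list:
    """Order document names by raw keyword count, highest first."""
    return sorted(docs, key=lambda name: term_count(docs[name], term), reverse=True)

# Hypothetical documents, invented for illustration.
documents = {
    "feline_care_guide": "A guide to feline health, diet and behavior. Cats need routine vet visits.",
    "keyword_stuffed_page": "cats cats cats buy now cats cats deals cats cats",
}

ranking = rank_by_keyword(documents, "cats")
print(ranking)
```

Under this naive scheme the keyword-stuffed page outranks the care guide, even though a human cataloger would file only the guide under the subject of cats.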
The Bookshelf Analogy
Now, imagine if I had divided my books into fiction and non-fiction. A stranger who knew he was more interested in non-fiction would only have to look at a fraction of my books. If I were to further subdivide and group by narrowing subject matter, I would save him even more time, and present the opportunity to find increasingly relevant books without relying on happenstance. Organizing a collection also provides information about the collection as a whole, and makes for better browsing.
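The effect of subdividing a collection can be sketched in a few lines. This is only an illustration: the shelf contents, the category names and the `books_under` helper are all invented, and a real catalog would use a far richer classification scheme.

```python
# Hypothetical bookshelf, invented for illustration: category path -> titles.
shelf = {
    ("non-fiction", "cooking"): ["Kimchi Basics", "Bread at Home"],
    ("non-fiction", "history"): ["A Short History of Salt"],
    ("fiction", "mystery"): ["The Quiet Harbor", "Night Ferry"],
    ("fiction", "science fiction"): ["Orbit Nine"],
}

def books_under(shelf: dict, *path: str) -> list:
    """Return every title whose category path starts with `path`."""
    return [
        title
        for categories, titles in shelf.items()
        if categories[:len(path)] == path
        for title in titles
    ]

everything = books_under(shelf)                          # no subdivision at all
non_fiction = books_under(shelf, "non-fiction")          # one level of organization
cooking = books_under(shelf, "non-fiction", "cooking")   # two levels

print(len(everything), len(non_fiction), len(cooking))
```

Each added level of organization shrinks the set of spines a browser has to scan, while the category names themselves describe what the collection holds.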
Browsing can also be a visually pleasant experience. This isn’t a trivial enjoyment. As I skim a bookshelf, I receive visual clues that tell me something about the contents of the books (genre, price, credibility). List form search results offer their own clues, but oftentimes only a clickthrough informs you whether the page is worth reviewing (Google recognizes this, and has added a convenient preview function to their list results).
The Internet is like a monster version of my randomly organized bookcase. Even though your computer helps you to manage all of its information, it’s hard to get a sense of how much exists on any given topic. And searching for “history” is an entirely different experience than browsing the history section at the bookstore.
Search is a daunting entry point to discovering what the web has to offer on a given topic. Most searches return vast results full of outdated, duplicated and dubious content. Users rarely push past the first page of results. The real problem on the web is that search requires direction. As librarian Barbara Fister notes, “When faced with an information need, a primary criterion most searchers consider is convenience. A good answer is valuable, but not if it’s too hard to find.”
From Search to Social
The emergent discovery model today is social media. The explosion of social networking sites like Facebook, Twitter and Google+ allows users to share information on a peer-to-peer basis. This form of distribution hinges on human recommendation rather than mysterious and sometimes problematic algorithmic search rankings.
In the social model, you encounter my overwhelmingly unorganized bookshelf, but I reply with a suggestion. This saves you the time and work of searching. My selection probably satisfies you, particularly because, in this instance, it was made for you as an individual, rather than for a wider social network audience. However, it also limits your knowledge of what my collection could potentially offer.
Discovering information sources via the social model allows seekers to bypass the initiative required by search. Instead the “best” contents of the web bubble up to the user via news feeds and trend lists. It also hosts engaging dialogue around content, which adds novelty to the discovery process. But to seek information this way requires people to be diligent listeners — something many are unwilling or unable to do, especially as our social networks become noisier.
It also tends to obscure as it illuminates. The social model highlights popular content, but ignores the niche. Social discovery does a disservice to individual information seekers who have little incentive to dig deeper.
But the real problem for undirected and overwhelmed information seekers (and I argue we are in the majority) is that the structure of a social network is shaped by social rules, and not by the beautiful subject hierarchies or systems of classification that, while painstakingly and artificially constructed, can allow for effortless and organic navigation.
Even Twitter, where hashtagging allows for folksonomic cataloging of links, is no place for the lighthearted browser. Following niche topics requires seeking out and following subject experts. On the other hand, many Twitter lists are highly reminiscent of library resource guides. Certainly, high quality subject experts are tweeting their hearts out, but users must also be cognizant of their individual agendas. Furthermore, their connections are a sort of sprawling metropolis, not the neat, navigable shelves and sections of a library.
Why We Need to Bring Back Browsing
Browsing gives information seekers a high-level sense of what exists within a collection, while presenting easy entry points to explore the unknown. It also allows for lesser-known works to stand alongside — and compete with — the more canonical ones they resemble.
However, the web continues to grow enormously, making human indexing — of the sort that libraries have been doing for centuries — impossible.
Projects dedicated to pure human indexing of the web still exist, the Open Directory Project being the largest. But given their reliance on crowdsourcing, the rapid proliferation of digital information, and the lack of bibliographic control on the web, such projects begin to look like trying to move the ocean with a bucket.
Perhaps a truly human indexed open web is too lofty a goal, and those who pursue it are blindly clinging to old world practices. Regardless, we have to find better ways of giving structure to big data, because the problem of information overload is only becoming more dire, and the shadow cast by sources gone viral is only growing larger.
The advent of machine learning, and specifically the ability to use human feedback to forge distinct and organized information, offers the potential for new tools to better organize big data with minimal human effort. At the startup where I work, we’re developing techniques to more easily group like content. By responding to user feedback, the system can home in on a keyword-identified concept with a surprising and increasing degree of accuracy. And once you can algorithmically surface content relevant to “hotdogs” and “BBQ,” it is relatively easy for a human to place those into the more ephemeral and larger bucket of “summer foods.” This is the sort of work that can make browsing possible.
Similarly, Like.com, an acquisition of Google, is an application that browses the commercial sector of the web. It allows users to navigate their online shopping using visual cues like material, color and shape (all things you’d use in a real store). Like.com boasts, “stop guessing at keywords,” which is music to the ears of would-be browsers.
The beauty of digital information is that we are no longer restricted by physical space. There exists the opportunity for simultaneous structures of organization. The more ways we find to slice and dice content, the more opportunities for discovery become available. We can have it all — search for seekers who know what they want, social for seekers who want to listen and engage, and browsing for seekers who wish to meander and let the collection speak for itself.