Why Surf Alone?: Exploring the Web with Reconnaissance Agents

Henry Lieberman, MIT Media Lab
Christopher Fry, Bowstreet Software
Louis Weitzman, IBM


Introduction

Every click on a Web link is a leap of faith. When you click on the blue underlined text or on a picture on a Web page, there is always a (sometimes much too long) moment of suspense when you are waiting for the page to load. Until you actually see what is behind the link, you don't know whether it will lead to the reward of another interesting page, to the disappointment of a "junk" page, or worse, to "404 not found".


But what if you had an assistant that was always looking ahead of you, clicking on the Web links and checking out the page behind the link before you got to it? An assistant that, like a good secretary, had a good (although not perfect) idea of what you might like? The assistant could warn you if the page was irrelevant or alert you if that link or some other link particularly merited your attention. The assistant could save you time and certainly save you frustration. The function of such an assistant represents a new category of computer agents that will soon become as common as search engines in assisting browsing on the Web and in other large databases and hypermedia networks.


Reconnaissance agents

These agents are called reconnaissance agents. Reconnaissance agents are programs that look ahead in the user's browsing activities and act as an "advance scout" to save the user needless searching and recommend the best paths to follow. Reconnaissance agents are also among the first representatives of a new class of computer applications: learning agents that infer user preferences and interests by tracking interactions between the user and the machine over the long term.


We'll provide two examples of reconnaissance agents: Letizia and PowerScout. The main difference is that Letizia uses "local reconnaissance", searching the neighborhood of the current page, while PowerScout uses "global reconnaissance", making use of a traditional search engine to search the Web in general. Both learn user preferences from watching the user's browsing, and both provide continuous, real-time display of recommendations. Reconnaissance agents treat Web browsing as a cooperative search activity between the human user and the computer agent, providing a middle ground between narrowly targeted retrieval such as provided by search engines, and completely unconstrained manual browsing.


One description of this landscape of systems organizes them around two axes, one characterizing reconnaissance connectivity (local vs. global) and one characterizing user effort (active vs. passive). Local reconnaissance, such as that performed by Letizia, traces links locally, while global reconnaissance uses global repositories such as search engines. Figure 1 plots a number of tools and agents against these attributes. Typical file browsing is in the lower left quadrant while standard search engines are located in the upper left quadrant. See the "Related Work" section for more discussion about how our projects fit in with other work.


Figure 1. The two dimensions of using agents based on amount of user effort and the connectivity of the data


The One-Input Interface

Newcomers to the Web often complain that they have trouble using search engines because they feel that the interface to a search engine is too complicated. The first time I heard this complaint, I was astonished.


What could possibly be simpler than the interface to a search engine? All you get is a simple box for text entry, you type anything you want, and then say "go". (Of course, today's search engines aren't really as simple as that, since they tend to contain advertisements, subject catalogs, and many other features. But the essential functionality lies in the simple query box.) How could you simplify this interface any further?


However, when you think about the task that the beginning user is actually faced with, the complexity becomes apparent. In the user's mind is a complex blend of considerations. They may be looking for something very specific, or they may just be interested in learning about a given subject in general. How specific should they be in describing their interests? Should they use a single word, or is it better or worse to type in multiple words? Should they bother learning the "advanced query" syntax? How should they choose between the myriad search engines, and compare their often inconsistent syntax and behavior [8]? Would they be better off tracing through the subject catalog rather than the search engines, since most portal sites now offer both? And, regardless of what they choose to type in the search box, they are deluged with a torrent of (aptly named) "hits", and are then faced with the problem of how to determine which, if any, of them is actually of interest. No wonder they get confused.


The Zero-Input Interface

The only thing simpler than an interface with only one input is an interface that takes no input at all: the zero-input interface. In fact, even before you type that word into that search engine, the computer already could know a great deal about what you're interested in. You've already given it that information; the problem is that the computer threw that information away. Why do you need to keep telling a system about your interests when it should already know?


Present-day computers throw away valuable history information. Every time you click on a link in a browser, that's an expression of interest in the subject of the link, and hopefully, the subject of the page that the link points to. If a person were watching your browsing activity, they would soon have a good idea of what subjects you were interested in. Unfortunately, the only use that the browser currently makes of that expression of interest is to fetch the next page. What if the computer kept that information, and, over time, used it to learn what you might or might not be interested in? Browsing history, after all, is a rich source of input about your interests.


Systems that track past user behavior and use it to predict future interest are a new form of "zero-input" interface. Even though they need no input in the sense that they do not require explicit interaction themselves, they repurpose input that you supply to the computer for other reasons. Such interfaces are rare now, because each application has its own self-contained interface and applications cannot reuse input from other applications or keep any significant histories of interaction.

Letizia

If I were with a friend and watching him or her browsing on the Web, I'd soon learn which pages were likely to attract my friend's interest. Given enough time, I might even become pretty good at predicting which link my friend would be likely to choose next. If it happened that my friend was viewing a site for the first time that I had previously explored thoroughly, I might even be in a position to give a better recommendation to my friend as to what might satisfy their interest than they might have chosen on their own. And all this could occur without my friend explicitly telling me what their interests were.


Letizia [5, 6] is a "zero-input" software agent that automates this kind of interaction. It learns a profile of the user's interests by recording and analyzing the user's browsing activity in real time, and provides a continuous stream of recommendations of Web pages.
Browsing with Letizia is a cooperative activity between the human user and the Letizia software agent. The task of exploring the Web is divided between the human, who is good at looking at pages and deciding what to view next, and Letizia, which uses the computer's search power to automate exploration.


Letizia runs simultaneously with Netscape, and for each page the user views it determines a list of weighted keywords that represents the subject of the page [7], similar to the kind of analysis done by search engines. Over time, it learns a profile of the user's interests. Letizia simply uses Netscape as its interface, with one window dedicated to user browsing, and one or more additional windows continuously showing recommendations.
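
As a rough illustration of what "a list of weighted keywords" might look like, the following Python sketch weights the words of a page by their frequency relative to the most frequent word. The stop-word list, the tokenizer and the name page_keywords are assumptions made for this example; they are not taken from Letizia itself, whose analysis is not detailed in this article.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a much larger one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "by"}

def page_keywords(text, top_n=5):
    """Return up to top_n (word, weight) pairs for a page's text.

    Each weight is the word's count divided by the count of the most
    frequent word, so the strongest keyword always has weight 1.0.
    """
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]
    counts = Counter(words)
    if not counts:
        return []
    highest = max(counts.values())
    return [(word, count / highest) for word, count in counts.most_common(top_n)]

keywords = page_keywords("Programming by example is programming for the user, by the user.")
```

A fuller treatment would also discount words that are common across all pages, in the manner of the term-weighting schemes described in [7].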


When the user is actively browsing and attention is focused on the current page, the user need not pay any attention to the agent. However, if the user is unsure where to go next, or dissatisfied with the current offerings, he or she can glance over to the recommendation window to look at Letizia's suggestions. Netscape's usual history mechanism allows easy access to past recommendations, just in case the user missed an interesting page when it first appeared.

 

Figure 2. Letizia "spirals out" from the user's current page, filtering pages through the user's interest profile

During the time that the user spends looking at a page, Letizia conducts a search in the "local neighborhood" surrounding that page. It traces links from the original page, spiraling out to pages one link away, then two links away, and so on, as long as the user stays on the same page. The moment the user switches pages, Letizia drops the current search, and initiates a new search starting from the page the user is now viewing. We call this process reconnaissance, because, like military reconnaissance, Letizia "scouts out" new territory before the user commits to entering it.
This leads to a higher degree of relevance than search engines. Current search engines are just big bags of pages, taking no account of the connectivity between pages. Because two pages are connected on the Web only when some author thought that a user who was viewing one might want to view the other, that connection is a good indication of relevance. Thus the local neighborhood of a page, obtained by tracing a small number of links from the originating page, is a good approximation to the "semantic neighborhood" of the page.
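
The "spiraling out" described above is essentially a breadth-first search that is abandoned as soon as the user moves on. The Python sketch below illustrates the idea under stated assumptions: get_links, score and still_on_page are hypothetical callbacks standing in for page fetching, the user's interest profile and the browser's state, and the toy link graph replaces the live Web.

```python
from collections import deque

def reconnaissance(start, get_links, score, still_on_page, max_depth=2):
    """Breadth-first search outward from start.

    Visits pages one link away, then two links away, and so on up to
    max_depth, stopping early if still_on_page() becomes false.
    Returns (score, page) pairs, best first.
    """
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue and still_on_page():
        page, depth = queue.popleft()
        if depth > 0:  # the starting page itself is not a recommendation
            found.append((score(page), page))
        if depth < max_depth:
            for link in get_links(page):
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return sorted(found, reverse=True)

# Toy link graph and interest scores (both invented for this example).
LINKS = {"home": ["a", "b"], "a": ["c"], "b": ["c", "home"], "c": ["d"]}
INTEREST = {"a": 0.2, "b": 0.9, "c": 0.5, "d": 1.0}

results = reconnaissance(
    "home",
    lambda page: LINKS.get(page, []),
    lambda page: INTEREST.get(page, 0.0),
    lambda: True,
)
```

With max_depth=2 the page "d", three links away, is never examined even though it scores highest, which is exactly the limitation of local reconnaissance that motivates a global approach.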


If I'm looking for a place to eat lunch and I like Indian food, it's not a very good idea to type "Indian food" into a search engine. I'm likely to get wonderful Indian restaurants, but they might be in New Delhi or Mumbai. What I want is the intersection of my interests and "what's in the neighborhood". If the idea of geographic neighborhood is replaced by the idea of semantic neighborhood on the Web, that intersection is what Letizia provides.


Further work on Letizia is now concerned with tracking and understanding users' patterns of Web browsing, in a project underway with student Sybil Shearin. In one experiment, student Aileen Tang added to Letizia a real-time display of the agent's tracking of the user's interests. We observed that as users delve deeper into a subject, the interest function steadily increases, only to dive abruptly as the user changes topics. Watching such a display lets the user know when they're getting "off track". Letizia can segment the history into topic-coherent "subsessions", and we can detect browsing patterns such as tentative explorations that don't work out, or switching back and forth between two related topics.
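
One simple way to picture this segmentation is to start a new subsession whenever a page shares too few keywords with the topic accumulated so far. The Python sketch below does this with plain keyword sets. It is only an illustration: the overlap measure, the threshold and the name segment_history are assumptions made here, not Letizia's actual interest function.

```python
def segment_history(pages, threshold=0.2):
    """Split a browsing history into topic-coherent subsessions.

    pages is a list of (url, keyword_set) pairs in visiting order. A page
    joins the current subsession when at least `threshold` of its keywords
    already belong to that subsession's topic; otherwise it starts a new one.
    """
    sessions = []
    topic = set()
    for url, keywords in pages:
        overlap = len(keywords & topic) / len(keywords) if keywords else 0.0
        if sessions and overlap >= threshold:
            sessions[-1].append(url)
            topic |= keywords
        else:
            sessions.append([url])
            topic = set(keywords)
    return sessions

# Invented history: three pages on agents, then an abrupt switch to lunch.
history = [
    ("p1", {"agents", "learning"}),
    ("p2", {"agents", "browsing"}),
    ("p3", {"browsing", "letizia"}),
    ("p4", {"curry", "restaurant"}),
    ("p5", {"restaurant", "menu"}),
]
subsessions = segment_history(history)
```

The drop in overlap at "p4" plays the role of the abrupt dive in the interest display when the user changes topics.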

 

Figure 3. Letizia's real-time display of user interests. The graph on the lower right climbs while the user keeps browsing on a single topic (here "Programming by Example"); it abruptly dips when the user switches topics.


Letizia's local reconnaissance is great when the pages most relevant to your interests are nearby in link space. But what if the best pages are really far off?


PowerScout

Letizia takes advantage of the user's behavior to create a zero-input program that finds pages interesting to the user. But there are a number of zero-input resources that Letizia does not exploit. Letizia only scouts pages close to the current page and can only hope to examine tens of pages, while there exist hundreds of millions of pages on the Web, any one of which might be relevant. Users have many interests over time and Letizia only focuses on the current page that the user is looking at. Also, users have a rich browsing history over potentially many months that can be exploited to better understand their true interests. PowerScout is another zero-input reconnaissance agent whose vision addresses some of these more general issues.


PowerScout is a different kind of reconnaissance agent. While it, too, watches you browse and provides a continuous display of recommendations, it uses a different strategy to obtain and display those recommendations. Like people, agents needn't necessarily surf the Web alone, and PowerScout uses a familiar companion, a conventional search engine, to support its global reconnaissance technique. PowerScout uses its model of the user's interests to compose a complex query to a search engine, and sends that query while the user continues to browse. If we think of a search engine as being a very simple sort of agent itself (and the more complex among the search engines do have some agent-like features), PowerScout represents an example of how agents with different capabilities can cooperate. By using global search engines to find documents in the semantic neighborhood of the user's current document, the system is performing a different class of browsing: concept browsing.

Concept Browsing

PowerScout introduced the term concept browsing to emphasize the idea of browsing links that were not specified by a document's author but are nonetheless semantically relevant to the document being viewed. This auxiliary set of links may have been overlooked or unknown to the author, or might not have even existed when the page was created.


The concepts are formulated by the PowerScout reconnaissance engine by extracting keywords from the current page. These concepts can be used directly or influenced by user-declared long-term interests, which we call profiles. Figure 4 shows the user viewing a page in the browser on the left while the results of a search are displayed in the PowerScout window on the right. The results are grouped by the concepts used to find them. In this example, the recommendations are organized under the concept of "proposal writing".


Each concept represents a slightly different slant on what is important on the page. PowerScout does the best it can, but it can't read the user's mind. However, the user can quickly look at the concepts and decide which, if any, are important to them. Beneath each concept is a short list of page hits with summaries. Clicking on the title of a page brings the full page into the browser.

 


Figure 4. PowerScout's screen. The list of recommendations is grouped by concept and dynamically updated. PowerScout's search dialog provides fine control.

Profiles

A profile represents a user's interest in a particular area. The user may create as many profiles as needed to characterize their interests. Each profile is made up of a set of ordered terms and a research notebook. The terms are words extracted from a number of different sources including web pages, explicit searches and text from documents and email messages. The research notebook, discussed later, provides the user a way to access and organize the source of the terms in the profile, i.e., the original URLs, searches and text.


PowerScout uses a variety of heuristics for extracting and scoring keywords, such as word position, utilization of HTML markup, emphasis of terms used in explicit searches, word frequency and multi-word terms. The details are beyond the scope of this article but these techniques are the foundation for accurately characterizing a page.


At any time, only one of the profiles is the current profile. By clicking on the "Add to Profile" button, the significant terms of the current page are merged into the term set of the current profile. This process extends the profile with one click. We like to think of it as focusing the profile to more closely match the user's real interest. Both the user's long term interest, modeled by the current profile, and short term interest, modeled by the current page, affect what pages PowerScout recommends.
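
To make the interplay of long-term and short-term interest concrete, here is a minimal Python sketch. It assumes, purely for illustration, that a profile and a page are each a dictionary mapping terms to weights; the function names, the page_bias parameter and the blending formula are hypothetical and are not PowerScout's actual data model.

```python
def add_to_profile(profile, page_terms):
    """Merge a page's term weights into the profile, in place.

    Weights of terms that appear in both are summed, so repeated
    additions on one topic strengthen that topic's terms.
    """
    for term, weight in page_terms.items():
        profile[term] = profile.get(term, 0.0) + weight
    return profile

def combined_terms(profile, page_terms, page_bias=0.5, top_n=3):
    """Blend long-term (profile) and short-term (page) interest.

    Profile weights are normalized to sum to 1 before blending.
    Returns the top_n terms, ties broken alphabetically.
    """
    total = sum(profile.values()) or 1.0
    blended = {}
    for term in set(profile) | set(page_terms):
        long_term = profile.get(term, 0.0) / total
        short_term = page_terms.get(term, 0.0)
        blended[term] = (1 - page_bias) * long_term + page_bias * short_term
    return sorted(blended, key=lambda t: (-blended[t], t))[:top_n]

# Invented example: a "Computer Music" profile extended with one click.
profile = {"music": 1.0, "computer": 0.5}
add_to_profile(profile, {"music": 0.5, "synthesis": 1.0})
query_terms = combined_terms(profile, {"midi": 1.0, "music": 0.5})
```

Raising page_bias makes the recommendations follow the current page more closely; lowering it keeps them anchored to the profile.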


By keeping areas of interest separated into distinct profiles, one area of interest doesn't "pollute" another. You may, for example, be interested in Computer Music and New Orleans Architecture, but not care about Computer Architecture or New Orleans Music. When you're browsing for Computer Music, you don't want to be distracted by articles on New Orleans Architecture even though later that day, the converse may be true.

PowerScout's Search

Unlike Letizia, PowerScout does not follow links on the page to get candidates for recommendations. Rather it presents complex queries to a traditional search engine, such as AltaVista. By automatically constructing queries, PowerScout employs the zero-input principle to perform multi-term, complex queries to search engines without bothering the user with the query language's syntax.

Figure 5. While the user views a page in the browser, PowerScout does global reconnaissance in the background by performing complex queries to a search engine.

Concept Browsing is actually an iterative process of multiple queries based on the current page. Initially, the queries are very specific and exacting. If few results are returned, successive iterations relax the constraints on the search. These later searches are guaranteed to retrieve some results, but the results will be less relevant than those of the initial queries. Each query uses several terms, combined in various patterns of ANDs and ORs. There are significant design tradeoffs between the quality of recommendations and the time required to return those recommendations.
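
The iterative relaxation can be sketched as follows in Python. The three query patterns, the AND/OR syntax, the min_hits cutoff and the toy_search stand-in are all assumptions chosen for this example; they are not PowerScout's actual query patterns or the syntax of any particular search engine.

```python
def build_queries(terms):
    """Yield query strings from most to least restrictive."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    # 1. Every term is required.
    yield " AND ".join(quoted)
    # 2. The top term is required, plus any one of the others.
    if len(quoted) > 2:
        yield f"{quoted[0]} AND ({' OR '.join(quoted[1:])})"
    # 3. Any term at all.
    if len(quoted) > 1:
        yield " OR ".join(quoted)

def concept_search(terms, search, min_hits=3):
    """Run progressively looser queries until enough hits are collected."""
    hits = []
    for query in build_queries(terms):
        for hit in search(query):
            if hit not in hits:
                hits.append(hit)
        if len(hits) >= min_hits:
            break
    return hits

# Stand-in for a real search engine, returning canned results.
CANNED = {
    "a AND b AND c": ["x"],
    "a AND (b OR c)": ["x", "y"],
    "a OR b OR c": ["x", "y", "z", "w"],
}

def toy_search(query):
    return CANNED.get(query, [])

queries = list(build_queries(["proposal writing", "grant", "funding"]))
hits = concept_search(["a", "b", "c"], toy_search)
```

Because hits from the stricter queries are collected first, they stay at the front of the list, which keeps the most relevant results on top. Each extra round costs another trip to the search engine, which is one face of the quality-versus-time tradeoff mentioned above.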


PowerScout's Search window complements its automatic reconnaissance. The search window allows the user to manually refine automatically constructed queries or easily construct a new query. This search dialog box, illustrated in Figure 4, contains a text-edit field for each of seven terms. When the user requests that the dialog box be displayed, it is "pre-loaded" with the seven most significant terms from the page being reviewed. The user may then edit the terms as well as indicate where the search engine should look for each of them (in the page's title, URL, or body, or temporarily ignored, as shown in the expanded popup menu on the left). When the user clicks the search button, the complex query is generated and sent to the search engine of choice.


Concept summaries

Unlike Letizia, PowerScout does not automatically display the top recommended page. In fact there is no single top recommended page. Rather, PowerScout displays a dynamically changing list of recommendations, so the user can get an overview of what's recommended. This is more appropriate to search engines' all-at-once interaction, rather than Letizia's incremental search.
PowerScout's list of concepts and the related page summaries are just like an auxiliary set of related links that the original author of the current page might or might not have included on the page. The recommendations are clustered in related groups by concept. Details of a recommendation can be hidden or expanded while the user reviews the search results. Figure 4 shows the expanded details in the recommendations list.


Often the top recommendation of any given algorithm is not the best for the user's needs. Sorting, categorizing and summarizing a set of recommended pages is often needed to find the most relevant recommendations.

The Research Notebook

There's valuable information saved in the Research Notebook of a PowerScout profile that can become a tool in the larger task of collecting and storing information. The notebook holds three data types: URLs, searches and notes. When the user adds a page to the current profile, not only are the significant keywords merged into the profile's terms, but the URL is recorded as well. The user can see these URLs and use them like a folder of bookmarks.
Refining an automatic query results in an explicit search, which can also be saved in a profile. PowerScout's search dialog box assists the user in making complex queries without having to know the syntax of the search engine. The user can re-run saved searches, such as a periodic query to see what's new on a given topic.


The user may also create notes of text to add to a profile. The full text of a note is kept for later viewing, and the significant keywords in the text are merged into the profile's terms. This is particularly useful when the user wishes to add just a paragraph of a page or perhaps the text of an email message from a colleague on the subject.


Profiles are useful as bookmarks for URLs, searches and notes, but they can also be valuable when shared with others. A profile is stored as a file, so it can be easily shared with colleagues. By inspecting someone else's research notebook on a topic, you have access to their URLs, searches and notes. In addition, you can effectively concept browse the Web with the interests of the profile's author.

Related Work

The space of search engines and other tools for navigation of the Web is rapidly expanding, and it isn't possible to cover them all here. While search engines could be considered "first generation" Web browsing agents, second generation agents are becoming more sophisticated in their analysis and classification of documents. Some now use popularity of pages, such as ParaSite [9], and/or analysis of link structure patterns, such as IBM Clever [3]. But most still rely on the conventional paradigm of the user explicitly querying a centralized repository which indexes and ranks documents independent of the individual user. We see the future trend towards navigation assistants that are more personalized toward individual users' interests and dynamic user behavior. That's where reconnaissance agents come in.


Perhaps the most widespread example of a reconnaissance agent at present is Alexa [1], the service behind Netscape's "What's Related" option. Alexa does perform reconnaissance in tracking user browsing history, and uses collaborative filtering to recommend new pages. Collaborative filtering matches the behavior of each user with other users whose browsing pattern fits most closely, and returns what they looked at subsequently as its predictions. Alexa thus doesn't have to understand the content of pages, which is both its strength and its weakness.


Northwestern University's Watson [2] also tracks user behavior and sends automatically generated queries to a search engine to recommend pages, as does PowerScout. It also has some other interesting capabilities, such as searching for pictures or searching for contrasting information rather than the most similar information.

 

Conclusion

Both the Letizia style of agent and the PowerScout style of agent have their advantages and disadvantages. Letizia finds pages on a site of interest that the user might overlook because of the complexity of the site's navigation, while PowerScout's global reconnaissance is better at retrieving faraway pages. PowerScout may also work better over a low-bandwidth line, since most of the bandwidth requirements have effectively been absorbed in advance by the search engine.
Letizia's interface design goal was to keep the interaction as minimal as possible, and so it does not show the user directly its profile of the user's interests or its analysis of Web pages. It does not provide any means for the user to tell it directly what he or she is interested in. PowerScout, on the other hand, does provide an extensive interface for its user model and lets the user specify with precision what he or she is or isn't interested in.


As of this writing, the World Wide Web consists of an estimated 800 million pages [4]. This wealth of information becomes useful only if we can find what we need. Web users are currently forced to choose between browsing and searching. Both are great tools, yet each limits the process of finding relevant information. A new generation of tools, reconnaissance agents, automatically search while you browse. This permits users to maintain a focus on their quest for information while decreasing the time and frustration of finding material of interest. This blending of browsing and searching empowers the user to focus on their task while engaging the system to do what it does best: analyzing, storing and retrieving relevant information.

 

Acknowledgments


The authors would like to thank Mike Plusch, Robert Young, Dorothy Woglom, and Allen Michaels for their work on PowerScout.


Lieberman's research was supported by the Digital Life Consortium and the News in the Future Consortium, and other sponsors of the MIT Media Laboratory.

References


1. Alexa, http://www.alexa.com/
2. Budzik, Jay; Hammond, Kristian J.; Marlow, Cameron A.; and Scheinkman, Andrei. Anticipating Information Needs: Everyday Applications as Interfaces to Internet Information Sources. In Proceedings of the 1998 World Conference on the WWW, Internet, and Intranet.
3. S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, P. Raghavan, and S. Rajagopalan. Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text. Proceedings of the 7th World-Wide Web conference, 1998.
4. Lawrence, Steve and Lee Giles, Accessibility and Distribution of Information on the Web, Nature, Vol. 400, pp. 107-109, 1999.
5. Lieberman, H. Letizia: An Agent That Assists Web Browsing, International Joint Conference on Artificial Intelligence IJCAI-95, Montréal, August 1995.
6. Lieberman, H., Autonomous Interface Agents, ACM Conference on Computers and Human Interface (CHI-97), Atlanta, May 1997.
7. Salton, G., Automatic Text Processing: The Transformation, Analysis and Retrieval of Information by Computer, Addison Wesley, 1989.
8. Shneiderman, B., Byrd, D., Croft, W. B., Clarifying Search: A User-Interface Framework for Text Searches, D-Lib Digital Libraries Magazine, January 1997.
9. Spertus, Ellen, ParaSite: Mining Structural Information on the Web, Sixth World Wide Web Conference, Santa Clara, CA, 1997.
----------------------------------------------------------------------------------------------------------------

HENRY LIEBERMAN is a Research Scientist in the Software Agents group at the Massachusetts Institute of Technology Media Lab in Cambridge, Mass. lieber@media.mit.edu

CHRISTOPHER FRY currently works at Bowstreet Software in Portsmouth, NH on tools for developing complex web sites. cfry@bowstreet.com

LOUIS WEITZMAN is a Senior Software Engineer at the IBM Advanced Internet Technology Group in Cambridge, Mass. louisw@us.ibm.com