Ann Eagan's Notes From “How Users Search Online”
A workshop sponsored by the American Society of Indexers
February 24, 2001


Web searching causes a lot of frustration.  According to surveys by the Netsmart Research Corporation, 83% of users fail to find what they were specifically looking for.  When they are searching within a particular web site, the failure rate goes down to 70%.  In addition, online help often doesn’t.  If users cannot find the help they need in 2 or 3 clicks, they generally give up.

Some introductory information architecture terminology:

Precision and recall – precision is the share of the items you retrieve that are actually what you want; recall is the share of all the relevant items that you manage to retrieve
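
A minimal sketch of how these two measures are usually computed, assuming the standard information retrieval definitions (precision = relevant items retrieved / all items retrieved; recall = relevant items retrieved / all relevant items).  The function name and the page identifiers are made up for illustration and are not from the workshop.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single search.

    retrieved: items the search returned
    relevant:  items that would actually have answered the question
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant items that were actually found
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 4 pages returned, 5 pages truly relevant, 2 in common.
retrieved_pages = ["p1", "p2", "p3", "p4"]
relevant_pages = ["p2", "p4", "p7", "p8", "p9"]
precision, recall = precision_recall(retrieved_pages, relevant_pages)
print(precision, recall)
```

A search that returns everything has perfect recall but poor precision, which is the trade-off noted for full-text searching and search engines later in these notes.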

Breadth and depth – breadth is the number of options at each level of a hierarchy, depth is how deep the hierarchy goes.  In general you want no more than 7 items in a list.
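
One way to check these two measures on a site map is sketched below.  It assumes the hierarchy is held as nested dictionaries (a representation chosen here for illustration); the sample tree borrows the Yahoo path quoted in these notes, and the other top-level categories are invented.

```python
def depth(tree):
    """Number of levels in the hierarchy (an empty tree has depth 0)."""
    if not tree:
        return 0
    return 1 + max(depth(child) for child in tree.values())


def max_breadth(tree):
    """Largest number of options offered at any single point in the hierarchy."""
    if not tree:
        return 0
    return max(len(tree), max(max_breadth(child) for child in tree.values()))


# Hypothetical site map: each key is a menu option, each value its sub-menu.
site = {
    "Regional": {"U.S. States": {"Minnesota": {"Cities": {"Duluth": {}}}}},
    "Arts": {},
    "Business": {},
}

print(depth(site))        # levels from the top category down to Duluth
print(max_breadth(site))  # compare against the rule of thumb of 7 items
```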

Granularity – how deep you index.  Example: Adobe’s help page indexes only at the top level; Quicken’s indexes everything and has figured out how to make the index dynamic so that when help pages change, the index changes with them.

Visibility and context – how much of the information on a page you can see and, if earlier information scrolls off the page, whether you still know where you are.  For example, Yahoo provides ‘breadcrumbs’ to follow: Regional > U.S. States > Minnesota > Cities > Duluth.  Context is often provided by consistent labeling across pages.

Meaning and aboutness – something computers don’t do well.  If you only enter monarch, a computer cannot tell whether you mean monarch as in ruler of a country or monarch as in butterfly.

Scent of information – a feeling that you are close to finding the information you seek, that it should only be a click or two away.  As long as people feel they are getting closer to what they seek, they will continue to click.

Current tools we provide

Navigation bars and link lists

Tables of contents – these are often not used

Menu paths – invite a user down a path to the information they seek.  Voicemail is one example (and sometimes a bad one!)

Cascading menus – JC Penney’s web site was an example of the use of cascading menus.  Cornell Law School still uses a form of cascading menus. These don’t work well for people who have mobility problems.

Classification schemes – these provide breadth, depth, and visibility but can get confusing

Hyperbolic Trees – a little hard to describe with words.  Go to: http://www.inxight.com/products_wb/tree_studio/tree_studio_demos.html to see a few in action.

Full-text searching – provides total recall and no analysis.  Useful when you know something exists or have a known item search

Keyword searches – can offer high precision

Search Engines – can offer high frustration, high recall, low precision.  There is only a 30% success rate in the use of a search engine specific to a site.

Natural Language Search Engines – can incorporate meaning and aboutness by entering relationships between elements.  They cost a lot to build.  AnswerWorks and AnswerWizards are two examples.

Types of online indexes

HTML
HTMLHelp/WebHelp
Applets and XML
PDF

HTML pluses and minuses
- lack of type ahead capability
- no find tab
- single topic display – one to one index
- returning to index is difficult
- cross-references can be coded in
- size/patience issues – they are good for smaller indexes
- lowest common denominator
- updating can be hard if no connection between index and content is coded in

Microsoft HTMLHelp / eHelp WebHelp
- slow to load on Web
- easy to compile
- has type ahead capability
- one to many index
- has scrolling capability
- has a find tab
- provides side by side index/topic display – return to index is easy
- cross-references are interactive
- older browsers cannot use them
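
For readers unfamiliar with type ahead: as the user types, the index jumps to the first entry that begins with those letters.  The sketch below is one possible way to do this with a binary search over a sorted entry list; it is not how HTMLHelp or WebHelp implement it, and the function name and sample entries are hypothetical.

```python
import bisect


def type_ahead(sorted_entries, typed, limit=5):
    """Return up to `limit` index entries that start with what was typed.

    Assumes sorted_entries is already sorted case-insensitively.
    """
    keys = [entry.lower() for entry in sorted_entries]
    prefix = typed.lower()
    start = bisect.bisect_left(keys, prefix)  # first entry >= the typed prefix
    matches = []
    for entry, key in zip(sorted_entries[start:], keys[start:]):
        if not key.startswith(prefix):
            break  # entries are sorted, so no later entry can match
        matches.append(entry)
        if len(matches) == limit:
            break
    return matches


# Hypothetical index entries, sorted without regard to case.
entries = sorted(
    ["Screening", "Printing", "Monochrome", "Printing, four color"],
    key=str.lower,
)
print(type_ahead(entries, "pri"))
```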

Applets and XML
- can load slowly
- custom programmed
- customized display and sorting
- one to many index
- cross-references are interactive
- older browsers cannot display them
- user may need to install specific software

PDF
- no type ahead capability
- cross-references are not interactive
- no side by side index/topic display
- size/patience limitations
- visibility problems
- imprecise orientation on found page
- has find capability
- can be printed
- there are many tools available to help build PDF indexes

Issues to think about when creating an online index
- type ahead
- scrolling
- multiple panes
- breadth and depth
- visibility
- granularity
- stubheads – headings that serve as the overall heading for a topic but have no link of their own.  This is not a big deal in print indexes, but online indexes often cannot accommodate them.  Example:

  Printing    (in many online index creation tools you have to link this)
    Four color
    Monochrome
    Screening

- cross-references – how will see references and see also references work?  Can there be generic cross-references?
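
Several of these issues come down to how an index entry is stored.  The sketch below is one hypothetical data structure, not taken from any of the tools discussed: an entry may have zero links (a stubhead), one link, or many links (a one to many index), plus subentries and see also references.  The class name, field names, and file names are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class IndexEntry:
    heading: str
    targets: list = field(default_factory=list)     # topic links; empty for a stubhead
    subentries: list = field(default_factory=list)  # second-level entries
    see_also: list = field(default_factory=list)    # headings of related entries

    def is_stubhead(self):
        """A stubhead groups subentries but has no link of its own."""
        return not self.targets and bool(self.subentries)


def render(entry, indent=0):
    """Return the entry and its subentries as indented text lines."""
    pad = "  " * indent
    if entry.targets:
        lines = [f"{pad}{entry.heading} -> {', '.join(entry.targets)}"]
    else:
        lines = [f"{pad}{entry.heading}"]
    for sub in entry.subentries:
        lines.extend(render(sub, indent + 1))
    if entry.see_also:
        lines.append(f"{pad}  See also: {', '.join(entry.see_also)}")
    return lines


# The Printing example above, with made-up topic file names.
printing = IndexEntry(
    "Printing",
    subentries=[
        IndexEntry("Four color", ["color.htm"]),
        IndexEntry("Monochrome", ["mono.htm", "toner.htm"]),  # one to many
        IndexEntry("Screening", ["screen.htm"]),
    ],
    see_also=["Prepress"],
)
print("\n".join(render(printing)))
```

A tool that requires every heading to carry a link would reject the Printing entry here, which is the stubhead problem described above.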

Things to do that can help

- Back to top links or an alphabetical list at the end of each letter
- Side by side frames
- Back buttons or index buttons in PDF
- Simplify complexity by using push-pull techniques – you usually only have 2 levels of indexing available online

What’s to come (well, some of it is here, kind of)

Database/XML compiled indexes
Intelligent agents (that actually work)
Natural Language Processing based technologies
Searching multiple sources simultaneously
Change monitoring (changes in the content will be immediately reflected in the index)
Filtering, monitoring or alerting
Machine-aided and automated indexing
Cross language retrieval
Speech recognition
Entity and relationship extraction (be able to understand that ex-president and Bill Clinton could be related)
Text mining
Question answering systems (like the Wizards in Microsoft Office)
Concept extraction and mapping
Automatic summarization
Visualization
Data analysis and interaction tools
Evidence combination