ABSTRACT

Title of Document: RELEVANCE JUDGMENTS AND QUERY REFORMULATION BY USERS INTERACTING WITH A SPEECH RETRIEVAL SYSTEM

Jinmook Kim, Doctor of Philosophy, 2006

Directed By: Professor Dagobert Soergel
Associate Professor Douglas W. Oard
College of Information Studies

This dissertation presents a framework for searcher behavior that can be used as a basis for designing future speech retrieval systems. It reports on an exploratory study that examines: the criteria searchers of oral history interviews use when judging the relevance of a recording or a passage; the attributes on which those judgments are based; the moves searchers adopt for information need refinements (INR); and the types of query reformulation by which those moves are realized.

Eight participants, including faculty, Holocaust scholars, a film producer, and a high school teacher, searched the Survivors of the Shoah Visual History Foundation's collection, which consists of 116,000 hours of 52,000 testimonies in 32 different languages from the survivors, liberators, rescuers, and witnesses of the Holocaust. Each participant performed a series of searches based on his/her own interests over a period of three to nine days. Data were collected through observation and screen capture, think aloud, semi-structured interviews, and focus group discussions; coded; and analyzed by looking for patterns.

The cognitive processes of relevance judgment and query reformulation occurred interactively during a search. As a result, some relevance criteria (topicality, comprehensibility, novelty of content, and acquaintance) and INR moves (clarification alone, specialization, restriction, and note for later) were observed during both processes. Some criteria, such as richness and emotion, were medium (i.e., speech) and domain (i.e., oral history) specific.

The findings identified four types of attributes of a recording or a passage: spoken-content attributes (person, place, event/experience, organization/group, ...), audio and/or visual attributes (facial expression, voice, gesture, displayed artifact, ...), non-content attributes (cache, digitization, language, ...), and biographical attributes (name of interviewee, date of birth, gender, occupation, ...). Searchers used different query reformulation types, such as adding a condition, narrowing a condition, new term, broadening a condition, removing a condition, and modifying a condition, in order to achieve different INR moves.

Some important implications for indexing and metadata assignment, support for search and browsing, and task-oriented system and interface design are drawn from the findings. The dissertation concludes with a discussion of limitations and ideas for future work.

RELEVANCE JUDGMENTS AND QUERY REFORMULATION BY USERS INTERACTING WITH A SPEECH RETRIEVAL SYSTEM

By Jinmook Kim

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2006

Advisory Committee:
Professor Dagobert Soergel, Chair
Associate Professor Douglas W. Oard, Co-chair
Associate Professor Eileen G. Abels
Associate Professor Marilyn D. White
Professor Andrew D. Wolvin

© Copyright by Jinmook Kim 2006

Acknowledgements

I deeply thank and respect my advisor, Dr. Dagobert Soergel, for continuously encouraging and intellectually guiding me throughout the dissertation process. I truly believe that this dissertation would not have been completed without his tremendous support. Another person to whom I would like to extend special thanks is my co-advisor, Dr. Douglas W. Oard, who challenged and stimulated me with insightful questions and advice during the study. I am also very thankful to my other committee members, Dr. Marilyn D. White, Dr. Eileen G.
Abels, and Dr. Andrew Wolvin, for their support and invaluable contribution to this dissertation. Thanks to Rae Kammerer for her editing work and my colleagues at the University of South Carolina for their generous support. A special thank you goes to the Survivors of the Shoah Visual History Foundation for hosting the two user workshops and providing the research facilities. Finally and most gratefully, I am thankful to the study participants and the intermediaries, and others who contributed to the data collection process.

This work has been supported in part by NSF IIS Award 0122466. Any opinions, findings, and conclusions or recommendations expressed in this dissertation are those of the author and do not necessarily reflect the views of the NSF.

Table of Contents

Acknowledgements
Table of Contents
List of Tables
List of Figures
Chapter 1: Introduction
1.1 Background
1.2 Objectives and Description of the Study
1.3 Study Context
1.4 Contributions
1.5 Organization of the Dissertation
Chapter 2: Literature Review
2.1 Relevance Judgment
2.2 Query Formulation and Reformulation
2.3 Speech Retrieval
Chapter 3: Methodology
3.1 Conceptual Framework
3.2 Study Context and Participants
3.2.1 Study Context: Cataloging and Search Systems
3.2.2 Participants and User Workshops
3.3 Data Collection Methods
3.3.1 Data Collection: Pilot Study
3.3.2 Data Collection: User Workshop 1
3.3.3 Data Collection: User Workshop 2
3.3.4 Data Collection Protocols
3.4 Data Analysis
3.4.1 Initial Phase: Developing an Initial Coding Scheme and Coding
3.4.2 Final Analysis: Finding Patterns and Relationships
3.5 Validity Issues
3.6 Confidentiality and Privacy
Chapter 4: Findings
4.1 Relevance Judgment: Relevance Criteria and Associated Attributes
4.1.1 Topicality
4.1.2 Accessibility
4.1.3 Richness
4.1.4 Emotion
4.1.5 Comprehensibility
4.1.6 Duration
4.1.7 Novelty of Content
4.1.8 Acquaintance
4.1.9 Access to the Interviewee
4.1.10 Miscellaneous
4.2 Relevance Judgment: Characteristics, Usage Patterns, and External Factors
4.2.1 Content and Non-Content Attributes
4.2.2 Proxy Use of Attributes
4.2.3 Granularity of Units Judged
4.2.4 Searcher Expectation and the Availability of Attributes
4.2.5 Factors Affecting Relevance Judgment
4.3 Query Reformulation: INR Moves and Query Reformulation Types
4.3.1 Clarification
4.3.2 Specialization
4.3.3 Restriction
4.3.4 Generalization
4.3.5 Parallel Movement
4.3.6 Note for Later
4.4 Query Reformulation: Sources of Information for INR Moves and Intermediary Effect
4.4.1 Sources of Information for INR Moves
4.4.2 Intermediary Effect
Chapter 5: Conclusions and Implications
5.1 Overview of Findings and Discussion
5.1.1 Relevance Judgment
5.1.2 Information Need Refinement and Query Reformulation
5.1.3 Interaction between Relevance Judgment and Query Reformulation
5.2 Implications
5.2.1 Indexing and Metadata Assignment
5.2.2 Support for Search and Browsing
5.2.3 Task-Oriented Retrieval System and Interface Design
5.3 Limitations
5.4 Future Work
5.5 Finale
APPENDIX A - PRE-QUESTIONNAIRE
APPENDIX B - FORMS RELATED TO PERMISSION FROM IRB
APPENDIX C - OBSERVATION GUIDELINES DURING USER WORKSHOP 2
APPENDIX D - EXAMPLE QUESTIONS FOR THE SEMI-STRUCTURED INTERVIEW DURING USER WORKSHOP 1
APPENDIX E - EXAMPLE QUESTIONS FOR THE SEMI-STRUCTURED INTERVIEW 1 DURING USER WORKSHOP 2
APPENDIX F - EXAMPLE QUESTIONS FOR THE SEMI-STRUCTURED INTERVIEW 2 DURING USER WORKSHOP 2
APPENDIX G - EXAMPLE QUESTIONS FOR THE FOCUS GROUP DISCUSSION 1 DURING USER WORKSHOP 2
APPENDIX H - FOCUS GROUP DISCUSSION 2 DURING USER WORKSHOP 2
APPENDIX I - CODING SCHEME
APPENDIX J - MATRICES USED FOR ANALYZING INR MOVE AND QUERY REFORMULATION TYPE (QTT)
Glossary

List of Tables

Table 1. Relevance criteria and attributes for journal articles
Table 2. Relevance criteria and associated attributes for radio programs
Table 3. Information needs refinement moves and types of query reformulation
Table 4. A comparison of the two VHF systems: the old system vs. the new system
Table 5. Participants and their information needs
Table 6. Final Coding Categories
Table 7. Mentions of relevance criteria by searchers
Table 8. Mentions of relevance criteria and associated attributes
Table 9. Sample quotes of the attribute for judging topicality
Table 10. Mentions of associated attributes: Topicality
Table 11. Mentions of associated attributes: Accessibility
Table 12. Mentions of associated attributes: Richness
Table 13. Mentions of associated attributes: Emotion
Table 14. Mentions of associated attributes: Comprehensibility
Table 15. Mentions of associated attributes: Duration
Table 16. Mentions of associated attributes: Novelty of content
Table 17. Mentions of associated attributes: Acquaintance
Table 18. Mentions of associated attributes: Access to the interviewee
Table 19. Mentions of associated attributes: Miscellaneous
Table 20. Content and non-content attributes
Table 21. Examples of proxy use of attributes
Table 22. Granularity of units judged by attributes
Table 23. Associated attributes by the granularity of units judged
Table 24. Searcher expectation and the availability of attributes
Table 25. Individual differences: relevance criteria, mention counts
Table 26. Individual differences: relevance criteria
Table 27. Individual differences: associated attributes
Table 28. Individual differences: the importance of associated attributes
Table 29. Medium and domain differences of relevance criteria
Table 30. Observations of information need refinement (INR) moves
Table 31. Observations of INR moves: with or without clarification
Table 32. Observations of INR moves and subsequent query reformulation types
Table 33. Think-aloud transcript from P12
Table 34. Sources of information for INR moves for query formulation and reformulation

List of Figures

Figure 1. Information system and search model
Figure 2. The sense-making triangle of situation-gap-bridge
Figure 3. Taylor's levels of question formation
Figure 4. A discrete model of relevance judgment
Figure 5. A discrete model of query reformulation
Figure 6. A conceptual model for relevance judgment (Dervin + Taylor)
Figure 7. A conceptual model for query reformulation
Figure 8. VHF system: People Search
Figure 9. VHF system: Experience Search - Initial screen
Figure 10. VHF system: Experience Search - Optional criteria
Figure 11. VHF system: Broad Categories Search - Initial search screen
Figure 12. VHF system: Keywords Search - Initial search screen
Figure 13. VHF system: Global Keywords Search - Search constraints screen
Figure 14. VHF system: Global Keywords Search - Display screen
Figure 15. VHF system: Global Keywords Search - Viewing screen
Figure 16. A revised conceptual model for relevance judgment
Figure 17. A revised conceptual model for query reformulation

Chapter 1: Introduction

This dissertation contributes to our knowledge of searcher behavior in speech retrieval systems with a focus on relevance judgment and query reformulation. Speech retrieval systems are now beginning to appear as a means to access spoken word collections (news, voicemail, meetings, lectures, interviews, etc.). We know quite a lot about how people interact with text retrieval systems to specify their information needs and select relevant documents, but we do not yet understand well how those behaviors carry over to searching collections of speech recordings.

1.1 Background

Far more is spoken each day than is written, and technology to acquire, store, and replay spoken content is now ubiquitous. Searching spoken word content is therefore a challenge of significant importance. Much of the research on speech retrieval to date has focused on the development of algorithms for automating the search process based on emerging speech technology, such as automatic transcription (speech recognition) and topic segmentation (Allan, 2002; Voorhees and Harman, 2000). As this technology has matured, end-to-end systems that support interactive searching have started to appear. Research systems such as Informedia (Christel et al., 1995), the Audio Notebook (Stifelman et al., 2001), and SCAN (Whittaker et al., 2002) have explored access issues to spoken word collections of television news, lectures, meetings, and voicemail. Commercial systems (e.g., Virage and Fast-Talk) are now starting to appear. The emergence of complete and scalable systems has in turn made it possible for researchers to augment earlier component-oriented user studies (e.g., Arons, 1997) with studies of situated users performing typical tasks (e.g., Whittaker et al., 1998). However, there are still far more unknowns than knowns in user behavior with speech retrieval systems.

An important process of user behavior in searching for information is judging the relevance of information objects. Searchers bring to these judgments a set of relevance criteria they apply when selecting a document (such as topicality, authority, and availability), and they evaluate these criteria based on one or more attributes that can represent the document (such as title, abstract, and keywords). Since speech and text have different characteristics, searchers of speech retrieval systems may use different criteria and attributes when making relevance judgments (Kim et al., 2003). For example, recorded speech carries additional attributes that are not present in written text (e.g., pitch and intonation). In contrast, some useful attributes that are generally available in text are often not available in speech recordings (e.g., title and abstract).

The cognitive process of making a relevance judgment is closely coupled with that of query reformulation in that both occur simultaneously over the course of an information search. Searchers may refine their information needs and reformulate their queries while and/or after examining retrieved documents. Researchers have characterized the tactics/moves searchers adopt for information need refinement (INR), called INR moves in this dissertation, such as generalization, specification, and parallel movement, and have examined the types of query reformulation between query terms, such as broader term, narrower term, and related term (Bates, 1990; Chen and Dhar, 1990; Fidel, 1991; Bruza and Dennis, 1997; Lau and Horvitz, 1999; Rieh and Xie, 2001).

Most existing studies have examined relevance judgment and query reformulation behavior in the context of searching text. A few studies have explored how searchers select stories from radio programs (Kim et al., 2003) and voicemail messages (Whittaker et al., 1998). No studies were found that have examined how searchers reformulate their queries when searching speech recordings.
The lack of knowledge about this important area of searcher behavior has imposed some limitations on system/interface designers in developing effective speech retrieval systems. For instance, Kim et al. (2003) found that story type (e.g., interview, special report, commentary, debate, announcement, call-in program, etc.) was an important criterion for searchers when selecting stories from radio programs, but, in many cases, audio replay was the only way of getting that information. Without knowing the type of attributes searchers use for a task, it is hard for system/interface designers to decide what attributes (metadata) to present to support the task in which searchers are engaged. This dissertation examines both relevance judgment and query reformulation behavior in the context of searching recorded oral history interviews.

1.2 Objectives and Description of the Study

The objective of this dissertation is to explore the behavior of searchers using a retrieval system that provides access to recorded interviews. In order to achieve the objective, this dissertation investigates the search process of searchers as it occurs. The main research question that guided the investigation is:

How do searchers interact with speech retrieval systems to specify their information needs, select relevant recordings or passages, and reformulate their queries?

In particular, this dissertation focuses on relevance judgment, query reformulation, and their interaction. This leads to the following research questions:

1. What criteria and processes do searchers use in making relevance judgments?
1.1 What relevance criteria do searchers apply when choosing a recording or a passage of a recording?
1.2 How do searchers map available attributes of speech recordings to their relevance criteria?
2. What processes do searchers use to refine their information needs and reformulate their queries?
2.1 What information need refinement (INR) moves do searchers adopt?
2.2 What types of query reformulation do searchers use as a means to reflect each adopted INR move?

Due to the exploratory nature of elucidating human cognitive processes, this dissertation uses a case study approach with qualitative data collection and analysis methods. Qualitative research methods are well suited to a study that focuses on examining the cognitive processes underlying relevance judgment and query reformulation, not merely the outcome of those processes (Creswell, 1994; Maxwell, 1996; Denzin and Lincoln, 2000).

1.3 Study Context

Existing studies have examined some aspects of searcher behavior in the context of searching radio programs (Kim et al., 2003) and voicemail (Whittaker et al., 1998), but no studies were found that have explored searcher behavior when seeking recorded interviews. Like news and voicemail, interviews constitute one of the speech genres for which the potential need for access is great. Thus, examining searcher behavior in the context of searching recorded interviews brings a new contribution to the field.

Interviews exist in many subjects, such as politics, economics, entertainment, and oral history. The speech collection used in this dissertation consists of oral history interviews; it was chosen opportunistically. The University of Maryland is participating in the MALACH (Multilingual Access to Large Spoken Archives) project, a collaborative research effort between the Survivors of the Shoah Visual History Foundation (VHF) and other participating institutions. VHF assembled 52,000 videotaped testimonies (116,000 hours) of speech in 32 languages (including English, Russian, French, Spanish, Hungarian, Polish, Czech, Slovak, ...) from survivors, liberators, rescuers, and witnesses of the Holocaust (Gustman et al., 2002). Effective access to the VHF archive is necessary for teachers, students, historians, and others to use these valuable collections on the Holocaust.
Manual cataloging has been in progress to provide access to these collections. However, the enormous scale of the VHF collections and the tremendous expense of manually cataloging audiovisual materials make it impractical to rely on manual techniques alone. The purpose of the MALACH project is to improve access to large multilingual collections of recorded speech in oral history archives by advancing the state of the art in speech and retrieval technologies. This fits with the ultimate goal of this dissertation, to improve access to recorded interviews by better understanding searcher behavior. Key activities in the project include automatic speech recognition, computer-assisted translation of domain-specific multilingual thesauri, natural language processing techniques for automated creation of metadata, and developing search interfaces, all to support search and browsing.

It is important to understand searcher behavior in designing effective access systems, but little is known about searchers of recorded interviews. The Survivors of the Shoah Visual History Foundation hosted two user workshops to better understand searcher behavior and user requirements when accessing the VHF collection; these workshops provided a good opportunity for data collection. The searcher behaviors of relevance judgment and query reformulation were observed during the two user workshops, with eight searchers (four searchers for each user workshop) who had genuine information needs.

1.4 Contributions

The research in this dissertation is unique in two ways:

a. It examines searcher behavior in the context of searching recorded interviews (in particular, oral history interviews). No existing studies have examined searcher behavior in this context.

b. It explores the cognitive processes of relevance judgment and query reformulation together. These two processes occur interactively during the iterative process of information seeking.
Searchers refine their information needs not only while formulating or reformulating their queries, but also while examining retrieved documents. Searchers reveal their relevance criteria while formulating or reformulating their queries, as well as while making relevance judgments. This dissertation examines the relevance criteria applied and the INR moves exploited both during relevance judgment and during query formulation/reformulation. No existing studies have examined the two processes together.

As a result, this dissertation brings several contributions that are related to cataloging and metadata assignment, conceptual framework, and system/interface design:

(1) The findings provide grounded theoretical knowledge about relevance criteria and attributes that searchers use when searching oral history interviews. Many of the existing studies have examined relevance criteria and attributes searchers use in the application of text retrieval. A few studies (e.g., Whittaker et al., 1998) have looked into the problem in the context of searching speech recordings (e.g., voicemail). Searchers of oral history interviews are likely to select testimonies using a different set of relevance criteria and attributes. This dissertation will provide catalogers with a basis for suggesting what attributes (metadata) to index.

(2) The framework that examines the cognitive processes of relevance judgment and query reformulation together makes it possible for researchers to understand searcher behavior more comprehensively than studying either one alone. For example, searchers use overlapping and interacting sets of relevance criteria for the tasks of selecting documents and reformulating queries. Examining the two processes together will lead to a better understanding of searcher behavior.

(3) The findings suggest task-oriented recommendations for system/interface designers.
This dissertation examines searcher behavior thoroughly, including the initial query formulation, relevance judgments, and query reformulation. This makes it possible to understand user requirements for different tasks during a search. System and interface designers can improve system performance by providing functions suited to the user task being performed.

1.5 Organization of the Dissertation

The next chapter reviews previous studies of relevance judgment, query formulation and reformulation, and speech retrieval. The review presents the trends and the findings of previous studies, including different models of information seeking, such as Dervin's sense-making model (1992) and Taylor's four levels of question formation (1962). It then discusses what studies have been done in the context of speech retrieval. The conceptual framework (presented in Chapter 3) that guided the research was derived from the literature review and made it possible to examine the searcher behavior of relevance judgment and query reformulation more systematically.

In addition to the conceptual framework, Chapter 3 lays out the methodology, including research questions, search systems and collections, detailed descriptions of the two user workshops mentioned above and the participants, data collection methods, procedures for data analysis, and some validity issues involved in this study. Chapter 4 presents research findings on the searcher behavior of relevance judgment and query reformulation. It first discusses the relevance criteria and attributes participants used during their searches, and then elucidates the INR moves and query reformulation types they adopted. The final chapter (Chapter 5) summarizes the dissertation, discusses the findings to derive useful patterns and trends, and presents implications of the findings that inform the design of future speech retrieval systems. The chapter concludes with limitations and some ideas for future work.
Chapter 2: Literature Review

Although the different nature of speech and text may result in different user behavior, the overall structure of the information search model of speech retrieval systems may not be different from that of text retrieval systems. Therefore, examining the well-known user behavior in text retrieval systems could provide us with insight into envisioning the user behavior in speech retrieval systems. An information system and search model consists of four processes: document processing, query formulation, matching, and examination. Figure 1 illustrates what tasks are performed within each process and how these tasks interact with each other.

Figure 1. Information system and search model (Documents, Document Processing, Document Representation; Info. Need, Query Formulation, Query Representation; Matching, Document List; Examination: relevance judgment and info. need refinement; Selected Documents, Refined Info. Need)

Query formulation refers to defining the information needs of users in a way the system can interpret and involves the cognitive process of both the initial query formulation and reformulation. An information seeking process begins with a user who has an information need. The information need must be transformed into a query. Once the query is formed, a representation of the query is required so that the system can find relevant documents. In Figure 1, the information need of the user is defined through query formulation and presented to the system in the form of a query representation. Meanwhile, each document coming into the system is represented through document processing. Once both the user need and the documents are represented, the system can find a set of documents best matching the user's information need by comparing the query representation with the document representation.
Boolean matching, the vector space method, and probabilistic matching are the three main techniques that can be used for matching (Turtle and Croft, 1992; Korfhage, 1997; Baeza-Yates and Ribeiro-Neto, 1999). Finally, the set of retrieved documents is examined and evaluated to select the ones that meet the information need, which involves the cognitive process of human relevance judgment. This process is iterative in that the user may continue his/her search until satisfied, which requires refining the information need and reformulating the query. In an interactive search system, examination is thus closely coupled with query formulation; users may reformulate their queries based on the outcome of their evaluation. This dissertation examines the user behavior that occurs both during examination (i.e., relevance judgment) and during query formulation, with a focus on the cognitive processes of relevance judgment and query reformulation. The following three sections review literature in the areas of relevance judgment, query reformulation, and speech retrieval.

2.1 Relevance Judgment

The concept of relevance is widely used as a basis for evaluating the effectiveness of information retrieval systems (Saracevic, 1976; Schamber et al., 1990). However, there is no consensus among researchers on a concise definition of the term relevance; people use it in different ways. Researchers have sought to define relevance from two perspectives that are often referred to as system-oriented and user-oriented. The system-oriented perspective focuses on topical relevance and concerns finding documents that address a concept-based information need. Under this paradigm, recall and precision are fundamental concepts for evaluating the performance of an information retrieval system (Cuadra & Katter, 1967; Cooper, 1971; Saracevic, 1976).
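The two system-oriented measures named above have standard set-based definitions: precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that are retrieved. The following is a minimal sketch of that computation; the function name and the document identifiers are illustrative assumptions, not part of any system discussed in this dissertation.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: iterable of document ids returned by the system
    relevant:  iterable of document ids judged relevant (topically)
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    # Guard against empty sets so that the ratios are always defined.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four documents retrieved, three relevant overall,
# two of which appear in the retrieved set.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d2", "d4", "d7"])
print(p, r)
```

Both measures depend on a binary, topical notion of relevance, which is the assumption that the user-oriented perspective discussed next relaxes.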
The user-oriented perspective on relevance is somewhat broader, seeking to characterize the relationship between information and the user's problem situation and attempting to account for the various aspects of human cognitive processes used in making relevance judgments. In this paradigm, common terms that refer to relevance are utility, pertinence, satisfaction, and situational relevance (Wilson, 1973; Schamber, 1994). The user-oriented view does not reject topical relevance; rather, it sees it as one of many factors that affect the behavior of searchers (Wang and Soergel, 1998). In this dissertation, relevance is defined as the degree to which a speech recording (or a portion of a recording) meets a searcher's need, which incorporates the user-oriented view of relevance. This broad statement is meant to be inclusive, subsuming topical relevance, situational factors (e.g., purpose of information use), and other factors related to the nature of the recording (e.g., emotional expression presented in the audio).

The cognitive processes underlying judgments of relevance have been widely studied, often with the goal of identifying criteria that influence relevance judgments (Park, 1993; Barry, 1994; Schamber, 1994; Mizzaro, 1997; Wang and Soergel, 1998; Wang and White, 1999; Tang and Solomon, 2001). Table 1 summarizes the most widely observed criteria from those studies, all of which focused on selection of journal articles using bibliographic databases. Relevance criteria are, however, abstract concepts, and searchers must ground their interpretation of each criterion in some set of observable attributes (Table 1). For instance, a searcher might assess the topicality of a journal article based on the title and abstract of the article and any thesaurus descriptors that have been assigned to the article.

Table 1.
Relevance criteria and attributes for journal articles

Relevance Criteria    Associated Attributes
Topicality            Title, abstract, descriptors
Novelty               Title and author
Quality               Journal, author, citation status
Availability          Journal and document type, owning library
Accessibility         Media, language
Recency               Publication date
Authority             Author and author's affiliation
Reading time          Number of pages, level of difficulty
Relation/Origin       Author

The set of criteria and their mapping to associated attributes that are shown in Table 1 may not be directly applicable to searching recorded speech, both because of the different characteristics of speech and because of the differences in the set of available attributes. For example, skimming, which involves rapid browsing, looking ahead, and looking back, is much more difficult for speech than for text (Arons, 1993, 1997; Whittaker et al., 1998; Shneiderman, 2000; Kim et al., 2003). Units of retrieval, such as fixed-length chunks or meaningful segments isolated through automatic segmentation, may lack a meaningful title. Genre is also an important factor. The set of available attributes may differ from one genre to another, such as voicemail, lecture, radio program, and interview. In voicemail, the attributes users mentioned as important were caller name, caller number, reason for calling, important dates/times, action items, and intonation (Hirschberg and Whittaker, 1997; Whittaker et al., 2000). In recorded classroom lectures, one might expect associated visual materials (e.g., PowerPoint slides) to be useful attributes when assessing topical relevance. The relevance criteria and attributes searchers exploited when selecting stories of radio programs (shown in Table 2) appear to be somewhat different from others (Kim et al., 2003). For instance, story type was an important criterion, and audio replay was the only means of assessing it for searchers using SpeechBot.
Searchers may use a somewhat different set of criteria and attributes when accessing the VHF collection of recorded oral history interviews.

Table 2. Relevance criteria and associated attributes for radio programs

Topicality
  NPR Online (1): Story title; Brief summary; Audio replay; Detailed summary; Speaker name; Speaker affiliation
  SpeechBot (2): Short extract from transcript; Audio replay; Longer extract from transcript; Highlighted terms in transcript
Story type
  NPR Online (1): Detailed summary; Brief summary; Audio replay; Story title
  SpeechBot (2): Audio replay
Time frame
  NPR Online (1): Broadcast date; Brief summary; Audio replay
  SpeechBot (2): Broadcast date
Recency
  NPR Online (1): Broadcast date
  SpeechBot (2): Broadcast date
Listening time
  NPR Online (1): Story length
  SpeechBot (2): Program title
Authority
  NPR Online (1): Speaker name; Speaker affiliation
  SpeechBot (2): Program title

(1) NPR Online offers access to many National Public Radio programs and supports searching based on human-prepared metadata (available at http://www.npr.org/).
(2) SpeechBot, developed by Compaq, is a search engine for audio (and video) content and relies entirely on automatic processing, using speech recognition as a basis for automatic indexing (available at http://speechbot.research.compaq.com/).

2.2 Query Formulation and Reformulation

Query formulation and reformulation involve two processes: (1) refinement of the information need, which involves human cognition; and (2) formulation (and reformulation) of a search query, which is the outcome of the human cognitive process. Researchers have studied both problems. Studies examining information need refinement were done mainly in the context of exploring how searchers understand their information need (Taylor, 1962). Studies examining query formulation and reformulation focused on elucidating how searchers communicate the understanding of their information needs to the system (Salton and Buckley, 1990). One of the goals of studies examining query reformulation was to develop techniques for automatic query reformulation or query expansion.
Typically, techniques for automatic query reformulation or query expansion are studied in the context of relevance feedback provided by searchers (Rocchio, 1971; Salton and Buckley, 1990; Spink and Losee, 1996; Hearst, 1999). Queries can be automatically expanded either by exploiting positive and/or negative examples (Losee, 1994; French et al., 1997; Belkin et al., 2001) or by using a thesaurus (Gauch and Smith, 1991; Xu and Croft, 1996; Mandala et al., 2000; Greenberg, 2001). However, these techniques for query reformulation have their limits in that searchers exercise little control. Interactive query reformulation allows searchers to select search terms from the list of terms/phrases suggested by the system (Williams, 1984; Efthimiadis, 1996, 2000; Belkin et al., 2001). These techniques, including interactive query reformulation, give searchers tools to improve their queries based on relevance feedback, but the studies put little emphasis on understanding the cognitive process of how searchers refine their information needs.

Figure 2. The sense-making triangle of situation-gap-bridge (Situation, Gap, Bridge)

An effort to understand the cognitive processes of refining information needs has been made in the context of modeling information seeking behavior (Belkin, 1980; Dervin, 1992; Kuhlthau, 1992). Dervin (1992) presented the sense-making triangle of situation-gap-bridge, as shown in Figure 2. According to this model, searchers take iterative steps throughout their information seeking processes. They first establish the context of their information needs (the situation), find a gap between what they understand and what they need to know in order to make sense of the world (the gap), and then ask questions and look for answers to bridge the gap (the bridge). Dervin's sense-making model focuses on how the searcher defines the situation, finds the gap, bridges the gap, and continues the search after bridging the gap.
Another effort to explore the issue of information need refinement has been made within the framework of question asking (Taylor, 1962; White, 1983; Lang and Dumais, 1992; Graesser et al., 1994). Taylor (1962) defined the cognitive processes of asking questions and presented a typology of information needs for representing four levels of question formation. In Figure 3, the visceral need (Q1) of a searcher, which is "the actual but unexpressed need for information," gives rise to a conscious need (Q2), which is "the recognized within-brain description of the need" (p.392). In an intermediated search, the searcher may need to formally articulate his/her conscious need either in written or in oral format: the formalized need (Q3). The intermediary and the searcher then may work together to come up with the actual query to be presented to the system: the compromised need (Q4). In an end-user search, the formalized need (Q3) may not appear in the search process.

Figure 3. Taylor's levels of question formation (Q1. Visceral Need; Q2. Conscious Need; Q3. Formalized Need; Q4. Compromised Need (Query); paths shown for end-user search and intermediated search)

Taylor (1962) further points out that the task of question formation is often not easy for searchers because information needs are difficult to articulate. To make the problem worse, searchers often fail to come up with search terms that properly represent their information needs (Gauch and Smith, 1991; Efthimiadis, 1996; Spink and Saracevic, 1997; Belkin et al., 2001; Spink and Ozmutlu, 2002). In the early 1990s, several studies examined the tactics searchers exploited for query reformulation and/or the type of query modifications searchers made during an information search (Bates, 1990; Chen and Dhar, 1990; Fidel, 1991).
Bates (1990) analyzed the levels of system involvement in designing information retrieval systems and discussed how much and in what type of activity the user should be able to direct the system when searching for information. In the process, she introduced the concept of information stratagem and described five tactics that searchers could take to further a search: "monitoring tactics," "file structure tactics," "search formulation tactics," "term tactics," and "idea tactics" (p.579). The tactics relevant to this dissertation are search formulation tactics and term tactics. Bates categorized search formulation tactics into SPECIFY, EXHAUST, REDUCE, PARALLEL, and PINPOINT and classified term tactics into SUPER, SUB, RELATE, REARRANGE, CONTRARY, RESPELL, and RESPACE. Bates then used these tactics to present a list of potential system reactions in responding to a searcher's request. For example, the system could SPECIFY, EXHAUST, PINPOINT, or SUB in responding to a retrieval result with too many hits.

Chen and Dhar (1990) examined how searchers of thesaurus-based online catalogs refine their queries and proposed a process model to facilitate the query refinement process. Using a semantic network and a Problem Behavior Graph, they represented the query refinement process and identified five semantic relationships: synonymous term (ST), broader term (BT), narrower term (NT), adjacent term (AT), and disjointed term (DT). They further observed that searchers refined their queries either by browsing the semantic network of concepts or by extracting terms from citations.

Fidel (1991) studied the search behavior of 47 experienced online searchers to examine how they select/modify search terms and defined two types of moves, "operational moves" and "conceptual moves" (p.515), which searchers used for modifying their search strategies to improve search results.
She further divided these moves into three groups: moves to reduce the size of a set (increase precision), moves to enlarge the size of a set (increase recall), and moves to increase both precision and recall. According to Fidel, searchers might limit, intersect, eliminate, cut, or narrow search terms in order to reduce the size of a set. They might use add (synonyms, variant spellings, and/or terms), include, cancel, and expand to enlarge the size of a set, and refine and probe to increase both precision and recall.

More recently, researchers have begun to explore the trends in Web queries (Jensen et al., 2000; Spink et al., 2001, 2002) and the types of query reformulation between/among the Web queries used (Bruza and Dennis, 1997; Lau and Horvitz, 1999; Rieh and Xie, 2001). Bruza and Dennis (1997) analyzed 1,040 Web queries and categorized them into eleven query reformulation types. They found that the primary reformulation type was repetition of the previous query and observed that searchers frequently used such reformulation types as term substitution, addition, and deletion, in that order. Other reformulation types, used less frequently, were spelling correction, punctuation change, derived forms of words, case changing, abbreviation expansion or contraction, and term splitting or joining. Lau and Horvitz (1999) constructed user models of query refinement by analyzing a large corpus of Web queries submitted to the Excite Internet search engine and presented Bayesian networks that could predict the progression of queries over time. They partitioned queries into seven refinement classes in which each class represented a user's intent relative to his/her prior query. These classes include new, generalization, specialization, reformulation, interruption, request for additional results, and blank queries. Rieh and Xie (2001) analyzed 183 search session logs generated by a Web search engine (Excite) and reported the patterns and sequences of query reformulation.
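Several of the term-level reformulation types reported in these log studies (repetition, addition, deletion, substitution, and an entirely new query) can be distinguished by comparing the term sets of two consecutive queries. The sketch below illustrates that idea only; it is a deliberately simplified assumption and not the coding procedure used by Bruza and Dennis (1997), Lau and Horvitz (1999), or Rieh and Xie (2001). The function name and the example queries are hypothetical.

```python
def classify_reformulation(previous, current):
    """Label the change from one query to the next by comparing term sets.

    Simplified illustration: queries are lowercased and split on whitespace,
    so spelling variants, operators, and word order are ignored.
    """
    prev_terms = set(previous.lower().split())
    curr_terms = set(current.lower().split())
    if curr_terms == prev_terms:
        return "repetition"        # same terms as the previous query
    if not (prev_terms & curr_terms):
        return "new"               # no shared terms at all
    if prev_terms < curr_terms:
        return "addition"          # terms added, none removed
    if curr_terms < prev_terms:
        return "deletion"          # terms removed, none added
    return "substitution"          # some terms kept, some replaced


# Hypothetical pairs of consecutive queries.
print(classify_reformulation("ghetto uprising", "ghetto uprising warsaw"))
print(classify_reformulation("camp liberation", "camp rescue"))
```

In the vocabulary of the studies above, an addition typically narrows a query (specialization) and a deletion typically broadens it (generalization), although that mapping depends on how the search system combines terms.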
Rieh and Xie presented three facets of Web query reformulation (content, format, and resource); each facet consists of sub-facets of query patterns used. For example, the sub-facets of content were specification, generalization, replacement with synonym, and parallel movement. Term variations, operator usage, and error correction were sub-facets of format, and general resource, special resource, and site URL were sub-facets of resource. They found that most query reformulation involves content changes and that about 15% of reformulation is related to format modifications.

Table 3 summarizes the information needs refinement tactics/moves and the types of query reformulation identified by those studies that examined search queries used by searchers of online databases (Bates, 1990; Chen and Dhar, 1990; Fidel, 1991) and the Web (Bruza and Dennis, 1997; Lau and Horvitz, 1999; Rieh and Xie, 2001). In Table 3, terms that convey the same/similar meaning in different studies are grouped together, and the relationships between INR moves and the types of query reformulation are presented. For example, a searcher may generalize his/her information need to enlarge the size of a retrieved set and consequently reformulate his/her query by using a broader term or a related term, adding term(s), or canceling restrictions.

Table 3.
Information needs refinement moves and types of query reformulation (Each box contains the terms used by different authors for the same concept)

INR Moves: New
Types of Query Reformulation:
  Disjointed term

INR Moves: Generalization; Moves to enlarge the size of a set
Types of Query Reformulation:
  Broader term, super, expand
  Related term, adjacent term, relate
  Addition, add, include
  Cancel restrictions

INR Moves: Parallel movement; Parallel

INR Moves: Reformulation
Types of Query Reformulation:
  Term variation, abbreviation
  Spelling correction, respell, error correction
  Rearrange, respace, term splitting or joining
  Case change, punctuation change
  Synonym, synonymous term
  Contrary
  Substitution, Derived forms of words

INR Moves: Specification; Specialization; Specify; Exhaust; Reduce; Moves to reduce the size of a set
Types of Query Reformulation:
  Sub, narrower term, narrow
  Limit, intersect
  Deletion, eliminate, cut
  Pinpoint

INR Moves: Moves to increase precision & recall
Types of Query Reformulation:
  Refine
  Probe

In summary, previous studies on query formulation and reformulation have primarily examined three problems: (1) the techniques used for query reformulation or expansion based on relevance feedback, (2) the tactics/moves adopted for information need refinement, and (3) the types of query reformulation made by searchers. This dissertation expands previous work on query formulation and reformulation by further exploring both INR tactics (called INR moves in this dissertation) and query reformulation types and by mapping them to one another according to the degree to which they are associated.

2.3 Speech Retrieval

As with written text, the dominant paradigm for providing access to collections of recorded speech is based on closely coupling search and browsing. Due to the differences between speech and text, searching and browsing speech collections present some unique challenges. For instance, it is difficult to directly browse speech because of the transient and linear nature of audio (Arons, 1993, 1997; Whittaker et al., 1998; Shneiderman, 2000).
Content-based search is limited without reliable transcripts, and these are often not available; for example, recordings of news, voicemail, meetings, lectures, and interviews often go without transcripts. In order to support effective search and browsing, a substantial investment in algorithms for automatic speech recognition (ASR) and automatic topic segmentation has been made (TREC 6-9; TDT 1-4). Usable transcripts (even with errors) can enhance content-based search. In the 1970s, several studies reported the groundbreaking theory of hidden Markov models (HMMs) and its application to ASR (Baker, 1975; Jelinek et al., 1975; Jelinek, 1976). Since then, many engineers and researchers have engaged in developing algorithms for speech recognition based on HMMs (Rabiner et al., 1985; Young, 2001), in the context of transcribing broadcast news programs (Pallett et al., 1996; Mohri et al., 2000) and conversational speech (Godfrey et al., 1992; Bacchiani, 1999; Stolcke et al., 2000; Byrne et al., 2004). As ASR technology has matured, a variety of speech recognizers have become available for the purposes of research (e.g., Sphinx) and commerce (e.g., Dragon NaturallySpeaking and ViaVoice).

The conversion of recorded speech to written text through ASR has, in effect, turned the less-studied speech retrieval problem into the well-studied text retrieval problem. Yet providing easy access to the transcript is another matter, and new challenges have been raised. For instance, spoken documents often lack titles and abstracts. Techniques for document summarization have been applied to automatic title generation (Witbrock and Mittal, 1999; Kennedy and Hauptmann, 2000; Jin and Hauptmann, 2001) and to spoken document summarization (Zechner, 2001), although these techniques must cope with the imperfections of text generated through automatic speech recognition. Information retrieval is robust even with errorful transcripts:
up to 40% word error rate is tolerable (Allan, 2002), but ASR errors do affect the selection behavior of searchers (Stark et al., 2000; Kim et al., 2003).

Another important characteristic of speech recordings is that a recording may include multiple stories with different topics. For example, a news program can cover many different stories with different topics on any given day. Searchers may want to directly access individual stories. For this reason, researchers have studied techniques for automatic topic segmentation. Story boundaries can be detected based on topic changes (Reynar, 1998), speaker changes (Hauptmann, 1995), and prosodic cues (Shriberg et al., 2000). Word similarity is measured in order to detect topic changes (Reynar, 1994; Yamron et al., 1998; Takao et al., 2000). Speaker identification using such acoustic cues as pitch and intonation has been studied (Fischer and Effelsberg, 1995; Hauptmann, 1995). Such prosodic cues as pitch (Kreiman, 1982; Hirschberg and Grosz, 1992; Chen and Withgott, 1992; Arons, 1994; Shriberg et al., 2000), intonation (Wightman and Ostendorf, 1992; Stifelman, 1995; Hirschberg and Nakatani, 1998), and prosody (O'Shaughnessy, 1992; Arons, 1997; Shriberg et al., 2000) are found to be useful in detecting story boundaries.

Advances in ASR and automatic topic segmentation have made it possible for system designers to build speech retrieval systems that can provide access to large collections of speech and/or video archives (e.g., Informedia and SpeechBot). With the emergence of speech retrieval systems, researchers have begun to study how these systems can be used more effectively. Various user interface issues have been studied (Oard, 2000; Whittaker et al., 1998b, 2000; Thong et al., 2001). Tools for supporting search and/or browsing have been developed, such as SpeechSkimmer, the Audio Notebook, and SCANMail.
SpeechSkimmer (Arons, 1997) supports audio browsing by time compression, pause removal, automatic emphasis detection, and non-speech audio feedback. The Audio Notebook (Stifelman et al., 2001) enhances audio browsing by enabling a listener to quickly access any portion of a recording using such acoustic cues as pitch, pause, and energy. SCANMail (Whittaker et al., 2002) supports both search and browsing by enabling content-based search, visual scanning, and information extraction.

A few user studies that examined searcher behavior have been done in the context of retrieving voicemail (Hirschberg and Whittaker, 1997) and radio programs (Kim et al., 2003). Hirschberg and Whittaker (1997) observed frequent users of a commercial voicemail system in order to examine how they use, store, search, and process large amounts of audio data. They found that voicemail users often made notes on such attributes as sender name, sender telephone number, important dates/times, and a few keywords to identify the relevant action. They also found that intonation could be used as a vital clue for determining the urgency of a message. Using the attributes voicemail users valued could enhance search and browsing. Kim et al. (2003) reported on an exploratory study of the criteria searchers used when judging the relevance of recorded speech from radio programs and the attributes of a recording on which those judgments were based. They found that the relevance criteria used as a basis for selection were similar to those observed in relevance studies with printed materials, but the attributes used as a basis for assessing those criteria were modality-specific. A complete list of relevance criteria and the associated attributes used by searchers of radio programs is shown in Table 2. This study expands the work by Kim et al. (2003) by examining the searcher behaviors of relevance judgment and query reformulation together in the context of searching oral history interviews.
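Section 2.3 refers to the word error rate (WER) of ASR transcripts. WER is conventionally computed as the minimum number of word substitutions, deletions, and insertions needed to turn the recognizer output into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that standard calculation using word-level edit distance; the function name and the example sentences are illustrative assumptions and are not taken from the studies cited above.

```python
def word_error_rate(reference, hypothesis):
    """Word error rate of an ASR hypothesis against a reference transcript.

    WER = (substitutions + deletions + insertions) / words in reference,
    computed with word-level Levenshtein (edit) distance.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words with an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of a hypothesis word
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical five-word reference with one misrecognized word.
print(word_error_rate("we hid in the forest", "we hid in a forest"))
```

Because insertions are counted, WER can exceed 1.0 for very poor transcripts, so it is an error ratio rather than a true percentage of words that are wrong.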
Chapter 3: Methodology

The goal of this dissertation is to explore the behavior of searchers using a speech retrieval system that provides access to recorded interviews. In particular, it focuses on characterizing how searchers map available attributes of speech recordings to their selection criteria and on exploring how searchers attempt to match their query reformulations to their information needs. Due to the exploratory nature of elucidating human cognitive processes, we used a case study approach to examine the relationships between: (1) relevance criteria and attributes and (2) information needs refinement moves and query reformulation types.

Qualitative research methods are well suited to a study of this type (Creswell, 1994; Maxwell, 1996; Denzin and Lincoln, 2000) in that they focus on examining the cognitive processes underlying relevance judgment and query reformulation, not merely the outcome of those processes. Individuals are not alike. Thus, it may not be realistic to statistically generalize the process of how individual searchers perceive their decision-making processes during information seeking. Naturalistic inquiry (Guba, 1981; Guba and Lincoln, 1982; Lincoln and Guba, 1985; Guba and Lincoln, 1998) is often grounded in a conceptual framework that can provide the theoretical basis of the study. Building on the previous work on relevance judgment and query reformulation, we developed a conceptual framework (Figures 6 and 7) that combines Dervin's sense-making model and Taylor's question formation model. This framework guided the procedures for data collection and data analysis throughout the study.

3.1 Conceptual Framework

Query reformulation is closely coupled with relevance judgment, as discussed in Chapter 1. How then does the cognitive process of relevance judgment affect that of query reformulation, or vice versa?
What relevance criteria do searchers apply when choosing recordings or passages, and what attributes do they base those judgments on? How do searchers refine their information needs and reformulate their queries? What information should the system provide to enable searchers to find information effectively? What tools should the system provide to assist searchers in reformulating a query? The conceptual framework presented in Figures 6 and 7 provides a guide to answering these questions systematically.

A model is a representation of real-world phenomena in abstract terms that can be applied to novel cases (Rogers, 1981). Figures 6 and 7 are conceptual models that represent the cognitive processes of human relevance judgment and query reformulation, respectively. They combine Dervin's sense-making model of information seeking and Taylor's model of question formation. They present the cognitive processes associated with relevance judgment and query reformulation as a holistic process of sense making, which involves the reformulation of the information needs of individual searchers from the initial visceral need (Q1) to the compromised query (Q4).

The information system and search model (Figure 1) presented in Chapter 2 suggests that an information search involves the human cognitive processes of information need refinement and relevance judgment during examination. Figures 4 and 5 present extended models of examination that offer frameworks for studying relevance judgment and query reformulation, respectively.

Figure 4. A discrete model of relevance judgment

Figure 5. A discrete model of query reformulation

Figures 4 and 5 assume that the cognitive process of relevance judgment (query reformulation) occurs separately from the cognitive process of query reformulation (relevance judgment), which is the approach most previous studies have taken, and consequently each process was examined individually. However, the cognitive processes of relevance judgment and query reformulation may occur interactively during a search, as discussed in the previous chapter. This dissertation takes the latter approach and examines both processes as parts of a whole.

Figure 6. A conceptual model for relevance judgment (Dervin + Taylor)

Figures 6 and 7 combine the models presented in Figures 4 and 5 with Dervin's sense-making model of information seeking (1992) and Taylor's model of question formation (1962). Figure 6 illustrates the idea we want to explore, namely that searchers apply their relevance criteria both in selecting documents and in reformulating queries. The relevance criteria inserted for illustration are from the literature review.

In Figure 6, a searcher comes to the system with an unexpressed, actual information need (Q1) and types in the initial search query (Q4(1)) to find a set of documents, which constitutes Search 1 (S1). The searcher then selects promising documents that meet the relevance criteria he/she possesses (e.g., topicality, novelty, quality, availability, ...) by examining the available document attributes listed in Table 1 (e.g., title, summary, descriptors, publication date, ...). The searcher may end his/her search if the initial search was successful, but more likely will at least initially encounter a gap between what he/she wanted to find and what the system actually found,
i.e., Gap 1 (G1). The searcher may try to bridge the gap by taking such actions as refining his/her information need and reformulating the search query, i.e., Action 1 (A1). The searcher then performs another search using the new query (Q4(2)) to reach Search 2 (S2) and may continue his/her search until satisfied, i.e., Search n (Sn).

We use the model shown in Figure 6 as the conceptual framework for examining the relevance criteria and attributes searchers exploit. Given a set of retrieved documents, previous studies observed the selection behavior of searchers (within the framework of Figure 4). This dissertation expands the work of previous studies by observing the relevance criteria and attributes searchers expressed not only during relevance judgment but also during query formulation and reformulation. Taking this approach offers some new benefits for system/interface designers. For instance, the relevance criteria mentioned during query reformulation might be different from the ones mentioned during relevance judgment. These differences could suggest to system/interface designers what information to present to support searchers in performing each task.

Figure 7. A conceptual model for query reformulation

Figure 7 presents how searchers refine their information needs and reformulate their queries in order to fill the gap they encounter during their sense-making process of seeking information. During/after relevance judgment, the searcher may need to refine his/her conscious need (Q2(n+1)), formally articulate it (Q3(n+1)) if an intermediary is present, and reformulate the search query (Q4(n+1)) to perform another search in the iterative processes of sense making.

Searchers may adopt different moves for refining their information needs (e.g., generalization), and each move can be achieved by using any of certain types of query reformulation shown in Table 3 (e.g., broader term). Searchers may refine their information needs not only during query reformulation but also during relevance judgment. Thus, examining INR moves during both processes may provide system designers with a more realistic representation of the cognitive process of query reformulation. We use the model shown in Figure 7 as the conceptual framework for exploring the INR moves and the types of query reformulation searchers adopt.

3.2 Study Context and Participants

Providing a detailed description (thick description) of the study context increases the transferability of an exploratory study. This section discusses the study context, such as the cataloging status of the VHF collection and the search system being used, and the participants and their information needs.

3.2.1 Study Context: Cataloging and Search Systems

The participants in this study searched the Survivors of the Shoah Visual History Foundation's collection of 52,000 videotaped testimonies (116,000 hours) in 32 different languages from survivors, liberators, rescuers, and witnesses of the Holocaust. Access to these testimonies was supported both by biographical search that enabled testimony-level access and by content-based search that allowed passage-level access. This section discusses the cataloging methods VHF has adopted and the search systems and interfaces used by the participants in this study.

3.2.1.1 Cataloging

A pre-interview (pre-testimony) questionnaire (PIQ) was administered to each interviewee prior to the interview in order to gather detailed biographical information about the interviewee, such as demographic data, prewar life, wartime life, postwar life, family background, and others. VHF cataloged all 52,000 testimonies coarsely using the collected PIQ data, and fully augmented PIQ data were available for about 4,000 testimonies at the time the study was conducted (summer 2002). The PIQ data were used for biographical search.

Manual cataloging has been in progress to support content-based search using two different methods: detailed cataloging and simplified cataloging. Using the detailed method, human catalogers viewed each interview and divided it into topical segments, typically 2-5 minutes in length; for each segment they produced a three-sentence segment summary and assigned segment descriptors. They also produced a summary of the whole interview. Catalogers sometimes made notes using a scratchpad and later used these notes in composing segment summaries. The segment descriptors covered subjects, such as food and hiding, and names of places, persons, groups, and organizations. About 4,000 interviews out of 24,947 in English were cataloged using this method.

The detailed cataloging method described above has been resource-intensive in both time and cost. VHF has adopted a new cataloging method to simplify the passage-level cataloging. Based on an assumption of individual differences in the way in which people perceive topic boundaries, VHF has eliminated the procedure for manual segmentation. Instead, human catalogers assigned descriptors as they viewed an interview; descriptors are time-stamped with one-minute granularity. Searchers could then locate relevant spots in an interview based on the time-stamped descriptors and then determine their own segment boundaries around a relevant spot.
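
The simplified cataloging scheme can be pictured as a small data structure. The following is only an illustrative sketch, not the VHF implementation: the class name `TimedDescriptor`, the functions `find_spots` and `user_segment`, and the sample data are all hypothetical. It assumes that each descriptor carries a minute offset into the interview and that a searcher chooses how many minutes to include before and after a relevant spot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedDescriptor:
    """One cataloger-assigned descriptor (hypothetical representation)."""
    minute: int  # minute offset from the start of the interview
    term: str    # thesaurus term, e.g. "hiding"


def find_spots(descriptors, term):
    """Return the sorted minute offsets at which `term` was assigned."""
    return sorted(d.minute for d in descriptors if d.term == term)


def user_segment(spot, before, after, interview_length):
    """Return searcher-chosen (start, end) minutes around a relevant spot,
    clamped to the bounds of the interview."""
    start = max(0, spot - before)
    end = min(interview_length, spot + after)
    return start, end


# Hypothetical descriptors for one 120-minute interview.
descriptors = [
    TimedDescriptor(12, "hiding"),
    TimedDescriptor(13, "food"),
    TimedDescriptor(47, "hiding"),
]

spots = find_spots(descriptors, "hiding")
print(spots)  # [12, 47]
print(user_segment(spots[0], before=2, after=3, interview_length=120))  # (10, 15)
```

The point of the sketch is the division of labor: the cataloger supplies only time-stamped terms, and the segment boundaries are decided by the searcher at viewing time rather than fixed in advance.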
Approximately 8,000 additional testimonies in English had been cataloged using this simplified method at the time of this study. Indexing was completed in December 2005.

3.2.1.2 Search Systems

In conjunction with the change in cataloging method, VHF has developed two different search systems: the old VHF system (VHF Query Engine 3.5), which is designed to work with interviews cataloged by the detailed method, and the new VHF system. The old query engine works as a stand-alone application and supports both biographical search, which enables searchers to access the augmented PIQ data, and thesaurus-based content search, which allows segment-level retrieval. The new VHF system is Web-based and supports search by people, experience type, and topic. Biographical search is not feasible in the new system, and searchers can only browse short PIQ data of the interviewees for those testimonies retrieved by the system. Approximately 12,000 testimonies with cataloging records were available in the new system, including the 4,000 testimonies cataloged by the detailed method. The new system disabled both the interview summaries and the passage summaries for those 4,000 testimonies in order to be consistent with the other 8,000 testimonies, which have neither. Table 4 summarizes the differences between the two VHF systems described in this section.

Table 4. A comparison of the two VHF systems: the old system vs. the new system

| | The old VHF system | The new VHF system |
|---|---|---|
| Application | Stand-alone | Web-based |
| Supported search functions | Biographical (PIQ) search; Thesaurus-based content search | People Search; Experience Search; Thesaurus-based content search |
| Number of testimonies cataloged | About 4,000 testimonies | About 12,000 testimonies |
| Available information | Augmented PIQ data; Interview code; Interviewee name; Interviewer name; Videographer name; Interviewee picture; Interview city; Interview date; Language; Segment descriptors; Scratchpad notes (spotty); Interview summary; Passage summary; Length of interview; Length of segment; Segment replay (system defined); Interview replay | Short PIQ data; Interview code; Interviewee name; Interviewer name; Videographer name; Interviewee picture; Interview city; Interview date; Language; Gender; Segment descriptors; Length of interview; Loading time; Segment replay (user defined); Interview replay |

In this study, the new system was given to all participants as the default system to begin their search. The new VHF system has three different search modes which provide different search functions: People Search, Experience Search, and Global Keywords Search. People Search enables searchers to find a testimony of a specific person by the first and/or last name (Figure 8). Experience Search is a category search that finds testimonies within one of the nine specific types of experiences of interviewees (Figure 9), namely:

- Jewish survivors
- homosexual survivors
- political prisoners
- Sinti and Roma survivors
- war crimes trials participants
- Jehovah's Witness survivors
- liberators and liberation witnesses
- rescuers and aid providers
- survivors of eugenic policies

Searchers can then further narrow their search by using a variety of optional criteria such as names, city of birth, country of birth, religious affiliation, ghettos, concentration camps, and other detailed experiences (Figure 10).

Figure 8. VHF system: People Search

Figure 9. VHF system: Experience Search - Initial screen

Figure 10. VHF system: Experience Search - Optional criteria

Global Keywords Search supports thesaurus-based content search and has two different search modes: Broad Categories Search and Keywords Search. The only difference between the two modes is the way in which they visualize thesaurus terms to help searchers formulate their queries. Broad Categories Search presents thesaurus terms with their hierarchical relationships so that searchers can find a search term by browsing terms following the hierarchy (Figure 11). Thesaurus terms in the Keywords Search mode are listed in alphabetical order without the hierarchy among terms (Figure 12).

Figure 11. VHF system: Broad Categories Search - Initial search screen

Figure 12. VHF system: Keywords Search - Initial search screen

In Figures 11 and 12, a searcher can find a thesaurus term either by typing it into the search box or by browsing the thesaurus. The scope note for the highlighted term will appear in the Keywords & Definition box in the upper right corner of the screen. The searcher may proceed with the search by adding the term to the My List box in the lower right corner of the screen and then clicking the next button below the box. The system will then ask the searcher whether he/she wants to limit the search by the language of testimonies and/or gender of interviewees (Figure 13). The Search History in the left column of the screen in Figure 13 indicates the number of search results (4,577 segments, in this case), which may affect the searcher's decision whether to limit the search or modify the query.

Figure 13. VHF system: Global Keywords Search - Search constraints screen

The new VHF search system maintains individualized user profiles (My Project) where searchers can save their search results for later review. A project in My Project is a folder that contains the search result of a single search by a searcher.
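
Because a project stores the result of one search, using a saved project to restrict a later search behaves like intersecting two result sets. The sketch below is only an illustration under that assumption; the index contents, the segment identifiers, and the function names `search` and `search_within_project` are hypothetical and are not taken from the VHF system.

```python
def search(index, term):
    """Return the set of segment ids indexed under `term` (hypothetical index)."""
    return set(index.get(term, ()))


def search_within_project(index, term, project):
    """Run a search and keep only the hits that are already in `project`."""
    return search(index, term) & project


# Hypothetical thesaurus-term index: term -> segment ids.
index = {
    "hiding": {"s1", "s2", "s5"},
    "food": {"s2", "s3", "s5"},
}

saved_project = search(index, "hiding")                      # step 1: save a search
both = search_within_project(index, "food", saved_project)   # step 2: limit a new search
print(sorted(both))  # ['s2', 's5']
```

Read this way, two saved-and-limited searches give the same result as a single conjunctive query over both terms, at the cost of an extra save step for each additional term.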
The searcher can later use a project to limit his/her search, along with language and gender. For example, the searcher in Figure 13 may want to save the 4,577 retrieved segments in a project within My Project and later use it to limit his/her search. The system then treats the saved project as a category and performs a within-category search. The participants in this study used My Project intensively, mainly because the system did not support the Boolean operator "AND" in formulating queries. Instead, they had to save a search in My Project, run another search, and limit the search by the previously saved project in order to perform an "AND" search.

The system displays the search result on the next screen (Figure 14), which presents a list of cataloger-defined segments (for those 4,000 testimonies cataloged by the detailed method) and/or one-minute chunks (for those 8,000 testimonies cataloged by the simplified method).

Figure 14. VHF system: Global Keywords Search - Display screen

An entry on the display screen includes a picture and name of the interviewee, experience type, language, and the segment number. Searchers can re-sort these entries by last name, first name, icon (picture), experience type, or language. Names and faces of interviewees in Figure 14 were removed in order to protect their privacy. Entries that say "data only" without a picture are not yet digitized. Online access to these non-digitized testimonies was limited; searchers needed to make a special request to obtain the video tape for later viewing. Clicking on either the picture or the name of an interviewee takes the searcher to the viewing screen (Figure 15), where he/she can review more detailed information about the segment and the interviewee.

On the viewing screen in Figure 15, the searcher can further examine the selected segment by browsing the descriptors assigned to that segment and/or adjacent segments by scrolling the "Keywords for" box at the bottom center of the screen. The searcher can also examine a brief biographical profile of the interviewee that includes the name of the interviewee, date of birth, country of birth, religion, political affiliation, socio-economic status of the interviewee and his/her parents, immigration history, and others. Clicking the "Biographical Profile" link in the left column of the bottom portion of the screen will replace the "Keywords for" box with the "Biographical Profile" of the interviewee. Both the "Keywords for" and "Biographical Profile" boxes can be enlarged by clicking the "Maximize/Minimize Data" button in the right column of the bottom portion of the screen.

Searchers can view the tape by clicking either the segment number or the play button on the player, if the tape has already been digitized (a picture of the interviewee indicates that it has). The download time next to the player indicates whether the tape was cached or not: it reads "less than 1 minute" for cached tapes and "less than 15 minutes" for those not cached.

Figure 15. VHF system: Global Keywords Search - Viewing screen

3.2.2 Participants and User Workshops

The Survivors of the Shoah Visual History Foundation (VHF) organized and hosted two user workshops in 2002 and invited eight participants to their On-Site Research and Training Center in Los Angeles. These user workshops were held in the context of the MALACH project, in which the primary goal was to understand user requirements in accessing Holocaust testimonies. VHF purposively selected a total of eight participants (four in each workshop) on the basis of mutual benefit, from among those who had submitted the Advanced Access Form for the Foundation's testimonies. All eight participants had serious information needs in relation to their professional interests and were expecting to fulfill their information needs by participating in one of the workshops.
VHF selected a broad range of user groups that included a film producer, Holocaust-related scholars, educators, and graduate students, in order to understand user requirements from different groups of users. Table 5 lists each participant with a brief description of his/her information need.

Table 5. Participants and their information needs

| Participant | Profession (Type) | Information Need |
|---|---|---|
| User Workshop 1 | | |
| P11 | Film producer | The experiences of Jewish artists in Nazi Germany |
| P12 | Ph.D. candidate in history | Jewish rehabilitation and non-rehabilitation after the Holocaust |
| P13 | Master's student in history | Jewish emigration to the US before the war and related US government immigration policies and actions |
| P14 | Ph.D. in history | Genocide events and related resistance movements |
| User Workshop 2 | | |
| P21 | Ph.D. in ethnography | The interactions of history and social memory of the Holocaust |
| P22 | High school history teacher | Hiding and rescuer behaviors |
| P23 | Sociologist | The life history of survivors born in Russia in a given time span |
| P24 | Ph.D. candidate in German | The life history of ordinary women from different countries |

3.2.2.1 User Workshop 1

User Workshop 1 had four participants (P11, P12, P13, and P14), as shown in Table 5. P11 was gathering information for a documentary film about the experiences of Jewish artists and the participation of Jews in the cultural life in Nazi Germany. P11 had participated in an earlier study, held in May 2002 at the University of Maryland, in which P11 searched about 1,000 VHF testimonies with audio only. VHF and the investigator invited P11 to the workshop in order to further explore P11's search behavior, providing more comprehensive access to VHF testimonies with video. P11 participated in the workshop for three days. P12 was working on a Ph.D. dissertation that examined the issues of Jewish rehabilitation and non-rehabilitation after the devastation of the Holocaust.
The focus of P12's search at the Foundation was on finding testimonies that described life in Displaced Persons (DP) camps in Germany. P12 was interested in analyzing the factors that influenced the DP experience. P12 participated in the workshop for five days. P13 was working on a Master's thesis that explored Jewish emigration to the US before the war or in the early days of the war and related US government immigration policies and actions. P13 participated in the workshop for nine days. P14 was a professor interested in the comparative study of genocide events and related resistance movements. P14 participated in the workshop for four days.

3.2.2.2 User Workshop 2

User Workshop 2 had another four participants (P21, P22, P23, and P24). P21 was working on post-doctoral research that examined the intersections of history and social memory of the Holocaust. P21's research at VHF focused on people's memories of death camps. P22 had taught a course about the Holocaust in a high school for many years. Examination of survivor stories was a component of the course, intended to stimulate students' own thinking about the dangers of intolerance. The focus of P22's research at VHF was on those who survived by hiding and on rescuer behaviors. P23's research at VHF focused on survivors born in Russia in a given time span. P23 was interested in examining the life history of those survivors by statistically analyzing demographic information. P24 was working on a dissertation that studied life histories of ordinary women and was especially interested in comparing women from different countries.

3.3 Data Collection Methods

In keeping with the principle of triangulation, data were collected through the overlapping techniques of observation and screen captures, think-aloud, interviews, and focus group discussions. Triangulation is a process that can enhance the trustworthiness of an exploratory study by gathering evidence from multiple sources (Lincoln and Guba, 1985).
This study used different procedures of data collection for the two user workshops hosted by the Survivors of the Shoah Visual History Foundation.

3.3.1 Data Collection: Pilot Study

A pilot study was conducted with two participants in order to validate the data collection procedures. The first participant was a Ph.D. student in a library school who held a Master's degree in History and searched testimonies using the old VHF system. The second participant was a staff member at the Foundation who searched testimonies using the new VHF system. An intermediary was present during the pilot study, which consisted of a 30-minute introduction and training, a 30-minute search, and a 15-minute interview. Special consideration was given during the pilot study to the procedures and the equipment used in this study in order to ensure that the adopted procedures could collect the required information using the equipment. In addition to validating the data collection procedure, the pilot study served as training for the intermediary on what roles she should and should not play during the search session.

3.3.2 Data Collection: User Workshop 1

The focus of User Workshop 1 was on observing:
1) what criteria searchers used in selecting a recording or a passage (research question 1.1 in Chapter 1),
2) what attributes they used in doing so (research question 1.2),
3) what INR moves they exploited to improve their queries (research question 2.1), and
4) what query reformulation types they used to make those moves (research question 2.2).

The study during User Workshop 1 used observation and screen captures, think-aloud, and semi-structured interviews in order to gather evidence from different sources. The procedure for data collection was somewhat longitudinal in that participants were sequentially observed for as few as three days and as many as nine days during June 2002, as discussed in Section 3.2.2.1.
Before the workshop, the investigator sent out a pre-questionnaire to each participant in order to gather such information as the information need, educational background, professional affiliation(s), and system experiences of the participant (Appendix A). VHF gathered more detailed information about each participant, his/her information need, and the purposes of his/her information use. VHF shared the collected information from each participant with the investigator.

Upon the arrival of each participant, VHF gave a 2-hour tour of the Foundation that provided the participant with an idea of how VHF gathered, stored, and cataloged testimonies. The participant was then brought to the On-Site Research and Training Center, where he/she worked for the rest of his/her stay to find testimonies that met his/her information need. The center was equipped with five workstations that supported online search and access to the 52,000 Holocaust testimonies assembled by VHF.

None of the participants in this study had previous experience with the VHF system. A staff member from the Education Department of VHF gave each participant comprehensive training on how to use the VHF system, followed by an unaided demonstration session using a topic given by the investigator. On average, the training, including the demonstration session, took about two hours. The investigator then gave a brief introduction to the study and explained the tasks and procedures involved in data collection. The participants read the Informed Consent Form (Appendix B), signed it, and submitted it to the investigator. The investigator explained the procedures for think-aloud and demonstrated how to think aloud. Participants then practiced thinking aloud during their search. This introduction, including the training for think-aloud, took about 30 to 40 minutes.

Each participant performed two search sessions of two hours each: one with and one without an intermediary.
The role of the intermediary in this study was mainly to help searchers formulate their queries and operate the search system. All four participants had the same intermediary, who was formerly a cataloger at the Foundation. Thus, the intermediary was familiar with the collection, the terms used in the VHF thesaurus, the context in which each term was used, and the procedure for cataloging. The intermediary had also worked for several years answering mail requests but was not trained in reference interview techniques. The intermediary reviewed the Advanced Access Form each participant submitted prior to his/her participation in the workshop. Thus, the intermediary had some knowledge of the educational/professional background of each participant and his/her information need.

The intermediary answered any technical questions participants had during the search but was instructed not to interrupt them while they were making relevance judgments. Thus, participants made relevance judgments independently, without any help or interruptions from the intermediary or the investigator. Participants reformulated their queries with help from the intermediary, who encouraged them to refine their information needs and suggested new or alternative search terms to be used for the next search. The presence of the intermediary during the search was well appreciated by all participants.

During the search, participants were asked to think aloud about how they selected or discarded an interview or a passage and reformulated their queries. The investigator made observations during the search on the search behavior being examined in this study. Participants often made notes on paper during their search, and the investigator collected and copied these notes. On-screen activities, including all search queries each participant used, were captured along with the synchronized audio, using a video screen-recording program (Camtasia).
The observation lasted about two hours and was immediately followed by a 30-minute semi-structured interview. The second search session, without an intermediary, followed exactly the same procedure used in the first session.

In addition to the two search sessions using the new VHF system, two of the four participants (P11, P14) performed a third search using the old VHF system, where testimony summaries and segment summaries were available for those 4,000 testimonies cataloged using the detailed method. The third search session lasted about 20 minutes and was followed by a 10-minute interview. The investigator conducted a 30-minute final interview at the end of each participant's stay in order to gather his/her insight into the foreshadowing questions of the study as he/she gained more experience with the system and viewed several testimonies. All search sessions and interviews were audiotaped and later transcribed for data analysis.

3.3.3 Data Collection: User Workshop 2

During User Workshop 2 (August 5 to 8, 2002), data were collected through observation, think-aloud, semi-structured interviews, and focus group discussions. Each participant was paired with an observer who monitored the participant's information-seeking behavior throughout the workshop. Observational notes were made daily by each observer and later transcribed for data analysis. Each participant performed a 2-hour search with an intermediary on day 1 of the workshop and was asked to think aloud during the search, followed by a 30-minute semi-structured interview. On day 3, an additional semi-structured interview was completed by each participant. There were two intermediaries available during Workshop 2 who helped searchers formulate their queries and operate the search system. Both intermediaries were former catalogers at the Foundation.
Thus, they were familiar with the collection, the terms used in the VHF thesaurus, the context in which each term was used, and the procedure for cataloging. The intermediaries had also worked for several years answering mail requests but were not trained in reference interview techniques.

In addition to individual search observations and interviews, two moderated focus group discussions were held with all four participants (day 1 for one hour, day 4 for two hours), in order to discuss their experience with the system and collect ideas for improvement. Another focus group discussion with all intermediaries was held on day 4 in order to discuss their experience. All sessions, including the search, interviews, and focus group discussions, were audiotaped and later transcribed for data analysis.

3.3.4 Data Collection Protocols

3.3.4.1 Observation Protocol

An observer (the author, in this case) was present during each search of the first user workshop. The focus of the observer's activity during the search was on understanding the cognitive processes underlying user behaviors, primarily relevance judgment and query reformulation. Therefore, the observer paid special attention to the processes participants used in selecting or rejecting an interview or an interview passage, how the information need or interest of the participant changed in the course of sense making, and what search terms the participant used in reformulating queries. Any unexpected behavior was noted during the search and later used to guide clarification questions during the interview. The observer tried not to interrupt the participant or the intermediary during a search unless necessary. Participants were asked to consult the intermediary (or the observer, if no intermediary was available) for all questions during the search session. The interaction between the participant and the intermediary was observed as well.

Four observers were available during the second workshop.
Three of the four observers were investigators of the MALACH project (two faculty members in an information studies school and one faculty member in the Center for Language and Speech Processing), and one observer was a staff member from VHF who had served as the intermediary during the first workshop. A short briefing on how and what to observe was given before the workshop, and written instructions for observation were provided (Appendix C). Each observer was paired with a participant and observed the participant's information-seeking behavior throughout the workshop, following instructions similar to those used in the first workshop. In addition to the user behavior of relevance judgment and query reformulation, observers paid additional attention to user requirements that included user interface issues, thesaurus navigation, and other issues of access to the testimonies. Observational notes from each observer were gathered by the author and transcribed for data analysis.

In Workshop 1, on-screen activities, including all queries used during the search, were saved on the hard drive using a video recording program. A low-resolution (NTSC) videotape of on-screen activities was also made using a video capture card. In Workshop 2, screen captures were not technically feasible.

3.3.4.2 Think-aloud Protocol

Think-aloud methods have both advantages and disadvantages as a component of qualitative study designs. One important concern is that verbalizing thoughts can inspire introspection, which in turn might alter the behavior that we wish to study. On the positive side, verbal protocols have been reported not to change people's cognitive processes (Ericsson and Simon, 1993; Wilson, 1994) and can provide insights into the cognitive processes of a searcher that may not be available in any other way. A rich set of data on the cognitive processes underlying searcher behavior was collected in this study using the think-aloud protocol.
All participants in this study were asked to think aloud during the search about why they selected or discarded an interview or a passage, what attributes they used to make relevance judgments, what differences they found between what they expected to get and what the system retrieved, and how they refined their information needs and reformulated their queries. Each participant received brief training on how and what to articulate in think-aloud before his/her first search session and was asked to articulate his/her thoughts in his/her own words. With the consent of each participant, the think-aloud was taped and subsequently transcribed with time stamps.

3.3.4.3 Interview Protocol

A 30-minute semi-structured interview was conducted immediately following each 2-hour search session during User Workshop 1. The goal of the interview was to obtain additional information about the processes by which a participant made relevance judgments, about the associated attributes on which the decision was based, about the tactics which a participant used in refining his/her information need, and about the types of query reformulation used in achieving each tactic. Appendix D summarizes the topics that were explored during the interview and the questions that were used to initiate discussion on each topic. A 30-minute final interview was conducted at the end of each participant's stay at the Foundation to gather further insight into the foreshadowing questions of this study as the participant gained more experience with the system.

A 30-minute semi-structured interview was conducted after the search session during User Workshop 2 as well. In addition to the above-mentioned goals, the interview focused on gathering information about system/interface issues and user requirements. Appendix E outlines guidelines for the interview. A 45-minute final interview was conducted on day 3 of the workshop, using the sample questions listed in Appendix F.
With the consent of the participant, all interviews were taped and subsequently transcribed.

3.3.4.4 Focus Group Discussion Protocol

Two focus group discussions with participants were conducted during User Workshop 2. The first one was done on day 2 of the workshop for 1 hour, and the second one on day 4 of the workshop for 2 hours. Discussion guidelines were prepared in advance (Appendix G and Appendix H, respectively) to facilitate the discussion during the session. A moderator was present to coordinate the discussion. An additional group discussion with intermediaries was conducted at the end of the workshop for 1 hour, in order to learn about their experiences in helping searchers. However, the focus of these group discussions was on system/interface issues, including search experiences, use of transcripts, and other user requirements. Data gathered from these group discussions were not used in this dissertation unless they were directly related to the topics being examined. All group discussions were taped and subsequently transcribed, with the consent of the participants.

3.4 Data Analysis

Qualitative research is best thought of as inductive rather than deductive, building concepts and theories from details of phenomena (Creswell, 1994). Therefore, data analysis in this dissertation was done in a way that can effectively support the process of inductive reasoning by providing rich context for each incident of searcher behavior gathered from different sources and by organizing individual incidents into similar categories. The primary objective of the data analysis was to find the major patterns of searcher behavior that would lead us to an understanding of searcher behavior when using speech retrieval systems.

Data analysis is divided into two phases. Phase one consists of developing an initial coding scheme and coding while simultaneously revising the coding scheme. Phase two consists of looking for patterns and relationships in searcher behavior.
3.4.1 Initial Phase: Developing an Initial Coding Scheme and Coding

Data analysis began with the development of codes. An initial coding scheme was developed using both the relevance criteria and attributes that were identified from previous studies (Tables 1 and 2) and the available information/attributes from the VHF system (Table 4). The speech attributes available in the VHF system were significantly different from those of text and radio programs. Thus, the initial coding scheme was developed by mapping the available speech attributes onto each identified criterion. The initial coding scheme was then further modified as new criteria and attributes were found through the pilot study.

A similar procedure was adopted when developing codes for INR moves and query reformulation types. INR moves and query reformulation types were identified from previous studies (Table 3) and then modified as new categories evolved later on.

Coding began as soon as the first search session with the first participant concluded. QSR NVivo, a data analysis system that provides extensive support for qualitative analysis of coded datasets, was used as the tool for data analysis. Observation notes, think-aloud transcripts, semi-structured interview transcripts, and focus group discussion transcripts from User Workshops 1 and 2 were entered into the system and categorized using the initial coding scheme mentioned above.

New categories were created as the analysis revealed additional criteria, attributes, INR moves, and query reformulation types. For example, emotion was added to the category of relevance criteria because six participants mentioned that it affected their selection of testimonies. As a result, facial expression, an associated attribute for emotion, was added to the initial coding scheme.

Meanwhile, some categories were discarded from the initial coding scheme, since no incident was found during either workshop.
For example, no participant mentioned authority, a relevance criterion searchers often used when selecting text. As a result, the name of interviewees, the associated attribute for authority, was removed from the initial coding scheme. The final coding categories used in this dissertation are shown in Table 6.

Table 6. Final Coding Categories

A. Relevance Criteria
  A1 Topicality
  A2 Accessibility
  A3 Richness
  A4 Emotion
  A5 Comprehensibility
  A6 Duration
  A7 Novelty of content
  A8 Acquaintance
  A9 Access to the interviewee
  A10 Miscellaneous

B. Associated Attributes
  B1 Content
    B1.1 Time Period
      B1.1.1 Date
    B1.2 Event and Experience
      B1.2.1 Personal Event
        B1.2.1.1 Hiding
        B1.2.1.2 Escaping
        B1.2.1.3 Deportation
        B1.2.1.4 Life
        B1.2.1.5 Abandonment
        B1.2.1.6 Immigration
        B1.2.1.7 Incarceration
        B1.2.1.8 Forced Labor
        B1.2.1.9 Liberation
        B1.2.1.10 Suicide
        B1.2.1.11 Abortion
        B1.2.1.12 Wedding
        B1.2.1.13 Murder
        B1.2.1.14 Adaptation
      B1.2.2 Historic Event
      B1.2.3 Experience
        B1.2.3.1 Jewish Survivors
        B1.2.3.2 Homosexual Survivors
        B1.2.3.3 Political Prisoners
        B1.2.3.4 Sinti and Roma Survivors
        B1.2.3.5 War Crimes Trials Participants
        B1.2.3.6 Jehovah's Witness Survivors
        B1.2.3.7 Liberators and Liberation Witnesses
        B1.2.3.8 Rescuers and Aid Providers
        B1.2.3.9 Survivors of Eugenic Policies
    B1.3 Place
      B1.3.1 City
      B1.3.2 Country
      B1.3.3 Region
      B1.3.4 Ghetto
      B1.3.5 Camp
    B1.4 Person
      B1.4.1 Name
      B1.4.2 Date of Birth
      B1.4.3 Gender
      B1.4.4 Nationality
      B1.4.5 Country of Birth
      B1.4.6 Occupation, Interviewee
      B1.4.7 Occupation, Parents
      B1.4.8 Religion
      B1.4.9 Immigration History
      B1.4.10 Social Status, Interviewee
      B1.4.11 Social Status, Parents
      B1.4.12 Level of Education
      B1.4.13 Marital Status
      B1.4.14 Family Status
      B1.4.15 Address
    B1.5 Object
      B1.5.1 Specific Object
        B1.5.1.1 Ship
      B1.5.2 Type of Object
        B1.5.2.1 Weapon
        B1.5.2.2 Geographical Objects
    B1.6 Organization/Group
      B1.6.1 Specific Organization/Group
        B1.6.1.1 Cultural Organization/Group
      B1.6.2 Type of Organization/Group
        B1.6.2.1 Resistance Organization/Group
        B1.6.2.2 Cultural Organization/Group
    B1.7 Other Topics
    B1.8 Non-Textual Attributes
      B1.8.1 Emotion
        B1.8.1.1 Crying
      B1.8.2 Visual Features
        B1.8.2.1 Facial Expression
          B1.8.2.1.1 Humiliation
        B1.8.2.2 Gesture
        B1.8.2.3 Visual Display
        B1.8.2.4 Picture
      B1.8.3 Audio Features
        B1.8.3.1 Whispering
        B1.8.3.2 Yelling
        B1.8.3.3 Singing
        B1.8.3.4 Voice Tone
        B1.8.3.5 Accent
  B2 Format
    B2.1 Length
    B2.2 Response Time
    B2.3 Language
    B2.4 Clearness of Speech
    B2.5 Catalogued/Not Catalogued
    B2.6 Presentation
    B2.7 Amount of time/percentage
    B2.8 Not Applicable

C. Actions Taken (to Bridge a Gap)
  C1 Refine Information Need
    C1.1 Formalized Information Need
      C1.1.1 Formalized Information Need 1 (Q3(1))
      C1.1.2 Formalized Information Need 2 (Q3(2))
      C1.1.3 Formalized Information Need 3 (Q3(3))
      ...
    C1.2 Refinement Strategy (INR Move)
      C1.2.1 Clarification
      C1.2.2 Specialization
      C1.2.3 Specialization by Elimination
      C1.2.4 Restriction
      C1.2.5 Generalization
      C1.2.6 Parallel Movement
      C1.2.7 Note for Later
    C2.3 Refinement Sources
      C2.3.1 Previous Knowledge
      C2.3.2 Intermediary
      C2.3.3 Pre-Interview Questionnaire (PIQ)
      C2.3.4 Thesaurus
      C2.3.4 Assigned Descriptors
      C2.3.4 Viewing
      C2.3.4 Number of a Search Result
  C2 Formulate or Reformulate Query
    C2.1 Compromised Query
      C2.1.1 Compromised Query 1 (Q4(1)) - Initial Query
      C2.1.2 Compromised Query 2 (Q4(2))
      C2.1.3 Compromised Query 3 (Q4(3))
      ...
    C2.2 Query Language Component (Reformulation Type)
      C2.2.1 Adding/Removing a Condition
        C2.2.1.1 Adding a Condition
        C2.2.1.1 Removing a Condition
      C2.2.2 Modifying a Condition
        C2.2.1 Narrowing a Condition
          C2.2.1.1 Narrower Term
          C2.2.1.2 Removing ORed terms
        C2.2.2 Broadening a Condition
          C2.2.2.1 Broader Term
          C2.2.2.2 Adding Terms with OR
        C2.2.3 Other Modification
          C2.2.3.1 Replacing a Term with a Spelling Variation
          C2.2.3.2 Replacing a Term with a Synonym
          C2.2.3.3 Replacing a Term with a Related Term
      C2.2.5 New Query

In addition to the coding categories listed in Table 6, other categories related to different issues, such as user interface, functionality, thesaurus, and other MALACH-related issues, were developed (see Appendix I for a complete list of coding categories) and subsequently analyzed. Although these issues were not directly related to the topic of this dissertation, some analytical findings are mentioned where appropriate. For example, one of the objectives of this dissertation was to examine the attributes/metadata searchers used during their search. This informs system/interface designers about what metadata to present, which was sufficient for the purpose of this dissertation, but not about how to present them. User interface issues were examined, especially during User Workshop 2, in order to answer the "how" question, and some findings are mentioned in the implications section of this dissertation.

3.4.2 Final Analysis: Finding Patterns and Relationships

Upon completion of coding all datasets, the investigator examined the resulting coded data to identify patterns and trends in the application of relevance criteria and attributes. Matrices resembling Table 1 that listed each relevance criterion and its associated attributes were created. Mentions of each incident were counted both by participant and by source.
When counting, overlapping mentions of the same incident gathered from different sources were cross-referenced and counted only once. This cross-referencing, a process known as triangulation, was done in order to gain confidence in the reliability of data interpretations.

A similar procedure was applied for examining INR moves and query reformulation types. Matrices were created within which each INR move and the corresponding reformulation types were listed (see Appendix K). Mentions were counted by participant and by source. Transition diagrams using the pattern from Figure 7 were used to depict the interaction of INR moves during query reformulation and relevance judgment.

In order to examine the relationship between INR moves and query reformulation types, it was critical to have all search queries participants used and the number of retrieved documents for each query. These data were available through the screen capture used during User Workshop 1, as mentioned in Section 3.3.2. Unfortunately, no such data were available from User Workshop 2, since screen capture was not employed during that workshop. For this reason, the data analysis for INR moves and query reformulation types used only the data gathered from the four participants in User Workshop 1.

In writing the report on the analysis, the investigator first created an overall scheme for presenting relevance criteria and INR moves. Each criterion and INR move was then expanded by including associated attributes and query reformulation types, respectively. This approach provided the investigator with sufficient examples of evidence to help reveal patterns and trends in searcher behavior with speech retrieval systems.

The next chapter addresses the findings from data analysis. Due to the nature of qualitative research, the next chapter is written principally in narrative form.
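The counting procedure described in Section 3.4.2 (mentions tallied by participant and by source, with overlapping mentions of the same incident counted only once) can be illustrated with a minimal Python sketch. It is illustrative only and does not reproduce how the counts were actually produced in QSR NVivo. The record fields (participant, source, incident_id, code), the function name tally, and the rule that a duplicated incident is attributed to the first source in which it appears are all assumptions introduced here, not details given in this dissertation.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """One coded mention (all field names are hypothetical)."""
    participant: str   # e.g. "P11"
    source: str        # e.g. "think-aloud" or "interview"
    incident_id: str   # assumed identifier shared by mentions of the same incident
    code: str          # coding category, e.g. "Topicality"


def tally(mentions):
    """Count each (participant, incident, code) once across sources."""
    unique = {}
    for m in mentions:
        key = (m.participant, m.incident_id, m.code)
        # Assumption: a duplicated incident is credited to the first source seen.
        unique.setdefault(key, m)

    by_code = Counter(m.code for m in unique.values())
    by_source = Counter((m.code, m.source) for m in unique.values())

    # Number of distinct participants who mentioned each code.
    participants = {}
    for m in unique.values():
        participants.setdefault(m.code, set()).add(m.participant)
    participants_per_code = {c: len(p) for c, p in participants.items()}

    return by_code, by_source, participants_per_code


# Made-up sample data: the second record repeats incident "i1" in another source.
sample = [
    Mention("P11", "think-aloud", "i1", "Topicality"),
    Mention("P11", "interview", "i1", "Topicality"),
    Mention("P12", "think-aloud", "i2", "Topicality"),
    Mention("P12", "interview", "i3", "Emotion"),
]
by_code, by_source, participants_per_code = tally(sample)
print(by_code)
print(participants_per_code)
```

Keying the deduplication on the participant, incident, and code together is one possible reading of "counted only once"; a study that credits an incident to every source in which it appears would need a different rule.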
Quotes from think-aloud, interview, and group discussion transcripts that illustrate each incident of searcher behavior are included in order to support the process of inductive reasoning.

3.5 Validity Issues

The trustworthiness of a qualitative study based on naturalistic inquiry can be judged by the credibility, dependability, and transferability of the results (Guba, 1981; Lincoln and Guba, 1985; Marshall and Rossman, 1989).

Naturalistic inquiry deals with multiple realities that are in the minds of people. Thus, it is important for inquirers to ensure that the interpretation of data sources is credible. This study used persistent observation, referential adequacy, and peer debriefing to ensure credibility. An observer worked consistently with a participant during the two workshops and had somewhat extended interactions with each participant. This consistent observation helped the investigator understand certain user behaviors better by providing contextual information about the behavior.

Referential adequacy was achieved by collecting an extensive array of materials, as described in the Data Collection and Data Analysis sections. For example, the screen capture of on-screen activities was used to check certain elements (e.g., queries used) that were referred to in think-aloud transcripts.

Peer debriefing was done throughout the study. Two of the committee members of this dissertation (chair and co-chair) were active participants in both user workshops (as observers) and thus had a comprehensive understanding of the study context. They both provided insights into study design, data collection, data analysis, and reporting of findings. Other dissertation committee members conducted peer debriefing at different stages of study progress.

The techniques that were used to ensure dependability included triangulation and corroborating the findings with the findings of existing literature.
Triangulation, as already noted in the above section, was done by collecting data from a variety of perspectives using different methods, such as observation and screen capture, think-aloud, interviews, and focus group discussions. Confirming indications obtained from one data source using another helps to ensure the reliability of data interpretations. Research findings were compared with the findings of previous studies (Wilson, 1973; Bates, 1990; Chen and Dhar, 1990; Schamber et al., 1990; Fidel, 1991; Hirschberg and Whittaker, 1997; Wang and Soergel, 1998; Kim et al., 2003), when applicable, to determine whether they agreed or disagreed with each other.

Finally, transferability was fostered by using purposive sampling as described in the previous section, collecting extensive descriptive data, and developing a thick descriptive context that would permit comparison of this context to other contexts to which transfer might be contemplated (Guba, 1981). Quotations were given, when appropriate, to reveal the context in as much detail as possible through the comments of the selected participants. These techniques allow others with similar research questions to derive implications for their own contexts.

3.6 Confidentiality and Privacy

Due to the nature of the testimonies, this study involved some serious issues of confidentiality and privacy. Some testimonies included incidents that could be politically sensitive. Other testimonies revealed memories that were very private. For these reasons, all parties involved with this study agreed to uphold confidentiality and privacy.

Study participants were pre-approved by VHF and informed of the confidentiality and privacy issues involved in this study. Each participant was asked to sign an agreement that he/she would not use any contents he/she acquired by participating in this study, other than for the purposes pre-approved by VHF.
The same policy was applied to the investigator, the intermediaries, and the observers: any specific content information they acquired by conducting this study was not to be used other than as pre-approved by VHF. An informed consent form was created for transcribers of think-aloud sessions, interviews, and focus group discussions to address these issues of confidentiality and privacy (Appendix C: Informed Consent Form for Transcribing Tapes). Careful consideration was given to the confidentiality and privacy issues when reporting the research findings.

Chapter 4: Findings

Information seeking involves thinking, problem solving, and cognition (Mayer, 1991). Searchers may first think in order to define their information needs, recognize problems in searching for information, try to solve these problems during the cognitive processes of relevance judgment and query reformulation, and iterate the whole process until they solve their problems. This chapter discusses the thinking and cognition processes of searchers during relevance judgment and query reformulation and examines how searchers solve the problems they encounter when seeking information.

Relevance judgment is a decision-making process: searchers select or discard information objects among those the system retrieved. Searchers may apply one or more selection (relevance) criteria when they make a decision. Each relevance criterion is associated with one or more attributes that searchers use as the basis for assessing the criterion. Likewise, query reformulation is a problem-solving process: searchers reformulate their queries to fill the gap (i.e., a problem) between what they expected to find and what the system actually found. In order to solve the problem, searchers may need to refine their information need and/or find an alternative query that better represents their information needs.
These cognitive processes of relevance judgment and query reformulation may occur interactively during the iterative process of information seeking. Searchers may reveal their relevance criteria during query formulation and reformulation and refine their information needs during relevance judgment. This dissertation examines:

1.1 the relevance criteria that searchers apply during relevance judgment and the criteria searchers reveal during query formulation and reformulation,
1.2 the associated attributes of recordings searchers use as the basis for judging each relevance criterion,
2.1 the types of moves searchers make in refining their information need, and
2.2 the types of query reformulation searchers use in achieving each move.

4.1 Relevance Judgment: Relevance Criteria and Associated Attributes

Unlike previous studies on relevance that examined only the criteria and attributes searchers used during relevance judgment, this study further explored the relevance criteria and attributes searchers revealed during query formulation and reformulation.

Table 7 summarizes the relevance criteria searchers mentioned, listed in order of decreasing number of mentions from the think-aloud during search sessions and interviews from both user workshops (percentages are shown in parentheses under the "All" column). The number of participants who mentioned each criterion is listed in the last column. In Table 7, mentions were first identified by the source (think-aloud during search and interviews) and then further divided by the search behavior (relevance judgment and query formulation [3]) participants were engaged in. Relevance criteria mentioned during focus group discussions are included in the "Interviews" column; these mentions were few, as participants mainly discussed how they perceived the VHF system during their search. All criteria mentioned during search sessions were re-mentioned during interviews and/or focus group discussions. Thus, gathering evidence from multiple sources established greater confidence in the inferred trends and patterns of the relevance judgment behavior examined in this study.

[3] The term "query formulation" in Tables 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and 23 refers to both the initial query formulation and query reformulation.

Table 7. Mentions of relevance criteria by searchers (see Glossary for the definition of each criterion listed in this table)

                                          Number of Mentions
                                          Think-Aloud    Think-Aloud                Participants
Relevance                   All           Rel. Judgment  Query Form.   Interviews   Who
Criteria                    (N=703)       (N=292)        (N=248)       (N=163)      Mentioned
Topicality                  535 (76%)     219            234           82           8
Accessibility               43 (6.0%)     28             0             15           8
Richness                    39 (5.5%)     14             0             25           6
Emotion                     24 (3.4%)     7              0             17           6
Comprehensibility           14 (2.0%)     1              10            3            7
Duration                    11 (1.6%)     9              0             2            4
Novelty of content          10 (1.4%)     4              2             4            3
Acquaintance                8 (1.0%)      3              2             3            4
Access to the interviewee   3 (0.4%)      2              0             1            2
Miscellaneous               16 (2.3%)     5              0             11           4

Topicality was mentioned far more often than any other criterion, and by all eight participants, which corresponds with previous findings both for document retrieval (Wang and Soergel, 1998) and for broadcast news retrieval (Kim et al., 2003).

Within the think-aloud data, some criteria were mentioned more frequently during query formulation than during relevance judgment (e.g., comprehensibility), and some were mentioned only during relevance judgment (e.g., accessibility, richness, emotion, ...). Partially, this is due to which attributes are available for which process and to the searchers' knowledge of these attributes. For example, some participants wanted to find testimonies only in certain languages, like English; in this case, language was the attribute participants used as the basis for assessing comprehensibility.
Once participants limited their search by language during the initial query formulation, which was an available feature on the current VHF system, they did not need to apply it during relevance judgment. Conversely, no attribute was available for emotion during query formulation. Participants were able to assess emotion only during or after viewing the testimony. As a result, emotion was mentioned only during relevance judgment and during interviews.

Table 8 provides a more detailed view of Table 7; criteria are paired with associated attributes (with mention counts for each association). For example, searchers judged comprehensibility by the language spoken in the testimony (10 mentions) during query formulation and by the clarity of speech (1 mention) during relevance judgment. Numbers in the last column indicate the number of participants who mentioned the corresponding attribute listed in each row.

Table 8. Mentions of relevance criteria and associated attributes

                                    Number of Mentions
Relevance Criteria                  Think-Aloud    Think-Aloud              Participants
  Associated Attributes       All   Rel. Judgment  Query Form.  Interviews  Who Mentioned
Topicality                    535   219            234          82
  Person                      124   33             67           24          8
  Place                       120   48             57           15          8
  Event/Experience            116   54             46           16          7
  Organization/Group          44    26             11           7           4
  Time frame                  32    14             11           7           5
  Object                      26    13             6            7           2
  Other topics                73    31             36           6           6
Accessibility                 43    28             0            15
  Cache                       24    17             0            7           8
  Digitization                19    11             0            8           7
Richness                      39    14             0            25
  Amount of info.             31    12             0            19          6
  Presentation skill          8     2              0            6           4
Emotion                       24    7              0            17
  Facial expression           19    5              0            14          6
  Voice                       4     2              0            2           3
  Gesture                     1     0              0            1           1
Comprehensibility             14    1              10           3
  Language                    13    0              10           3           7
  Clarity of speech           1     1              0            0           1
Duration
  Length                      11    9              0            2           4
Novelty of content
  Topic                       10    4              2            4           3
Acquaintance
  Interviewee, name           8     3              2            3           4
Access to the interviewee     3     2              0            1
  Interviewee, name           2     1              0            1           2
  Interviewee, address        1     1              0            0           1
Miscellaneous                 16    5              0            11
  Visual features             10    4              0            6           3
  Audio features              6     1              0            5           3

4.1.1 Topicality

Topicality, the most frequently mentioned criterion and one mentioned by all eight participants, indicates whether the information object being examined is topically relevant to the searcher's information needs. Person (23%), place (22%), and event/experience (22%) were the three most frequently mentioned attributes (Table 8). Table 9 presents sample quotes that were used as incidents of each attribute for judging topicality (each incident within a quote is underlined).

Table 9. Sample quotes for each attribute used in judging topicality

Person
  "... Should we type in some names and then select friends of, ... Because there are a few like Kurt Singer, Rudolph Schwartz, and Julian Babb."
  "... After the sex, of course, the female or male. And after that, citizenship for these kids."
Place
  "... Netherlands. That's interesting. He talks about the Netherlands, he talks about South Africa."
Event/Experience
  "... He is definitely of primary interest for me and he talks about the suicide attempts on the ship."
Organization/Group
  "... Red Cross Danish, and it will see if somebody brought up the Danish Red Cross in combination with..."
Time frame
  "... I'm looking at the time from 1933 to 1945."
Object
  "... I am working on the S.S. St. Louis, German ship of Jewish refugees."
Other topics
  "... This was the details they told me about their flight preparations."

Each attribute consists of one or more sub-attributes that can represent and/or be part of the attribute.
For example, city and country can be part of place, and date can represent time frame. Person refers either to a specific person or to personal characteristics of an interviewee. Table 10 expands the section for topicality by presenting the list of attributes and sub-attributes participants used when judging a testimony or a passage [4] within a testimony (with mention counts for each association). Other criteria in Table 8 are further discussed in Tables 11 through 19.

[4] A passage in this dissertation refers to a portion of a recording that individual searchers define; it can consist of one or more system-defined segment(s).

Table 10. Mentions of associated attributes: Topicality

                                        Number of Mentions
Topicality                              Think-Aloud    Think-Aloud
  Associated Attributes           All   Rel. Judgment  Query Form.  Interviews  Participants
Person
  Specific person
    Name                          34    10             17           7           7
  Characteristics of person
    Date of birth                 30    13             4            13          6
    Gender                        20    4              12           4           5
    Occupation, interviewee       15    3              9            3           6
    Country of birth              14    4              8            2           4
    Religion                      8     4              2            2           2
    Social status, interviewee    8     0              4            4           2
    Social status, parents        7     0              2            5           2
    Nationality                   7     0              7            0           4
    Family status                 5     3              1            1           2
    Address                       5     4              0            1           1
    Occupation, parents           4     1              0            3           2
    Immigration history           3     0              2            1           2
    Level of education            2     0              1            1           2
    Political affiliation         1     0              1            0           1
    Marital status                1     0              1            0           1
Place
  Camp                            64    16             37           11          6
  Country                         47    22             19           6           6
  Ghetto                          20    8              10           2           5
  City                            20    7              10           3           6
  Region                          6     3              2            1           3
Event/Exp.
  Personal event/experience
    Resistance                    39    24             9            6           1
    Deportation                   14    2              11           1           2
    Suicide                       11    7              0            4           2
    Immigration                   8     3              3            2           2
    Liberation                    8     4              3            1           3
    Escaping                      6     3              2            1           3
    Forced labor                  4     0              3            1           2
    Hiding                        3     1              1            1           2
    Abortion                      2     1              1            0           2
    Wedding                       2     2              0            0           2
    Murder/Death                  2     0              0            2           2
    Adaptation                    2     2              0            0           1
    Abandonment                   1     0              0            1           1
    Incarceration                 1     0              1            0           1
  Historical event/experience
    November Pogrom               11    8              3            0           2
    Beautification                4     2              2            0           1
    Polish Pogrom                 1     0              1            0           1
    Olympic games                 1     0              1            0           1
Org./Group
  Specific organization/group     44    26             11           7           7
Time Frame
  Date                            32    14             11           7           5
Object
  Physical object
    Ship                          13    4              5            4           1
    Weapon                        8     7              0            1           1
  Geographical object             5     4              1            0           1
Other topics
  Subject                         73    31             36           6           8

Person

All eight participants mentioned person and were looking either for a specific person (34 mentions) or for some characteristics of a person (90 mentions), e.g., date of birth, occupation of interviewees, or nationality of interviewees.

Participants used name when searching for a specific person who gave a testimony or who was mentioned in a testimony. Name, which was the most frequently mentioned sub-attribute of person, was used both for testimony-level access (e.g., People Search using the name of an interviewee) and for passage-level access (e.g., Keyword Search using the name of a person mentioned in a testimony). Names of notable individuals, such as camp commanders, leaders of organizations/groups, and captains of ships, were considered important when making relevance judgments and formulating queries.

Participants often wanted to find testimonies by some personal characteristics of the interviewees. Table 10 lists example characteristics of person that participants used when judging the topical relevance of a testimony or a passage. Six of the eight participants (P11, P12, P13, P14, P21, and P23) indicated that date of birth was important for their searches in that they could infer the age of interviewees at the time of the event or experience being searched. For example, participant P13 remarked:

"... In this way, together with the date of birth, if the person was very small, they didn't have such a, you know, detailed memory... Ok, so here I have her. They all seem to be in their sixties, so that means they were all a child. And I am just curious to see somebody who was older at the time."

Although participants were not able to see the date of birth of interviewees because of a system limitation at the time of the study, they consistently mentioned date of birth both during relevance judgments and during interviews.

Gender was the second most mentioned characteristic of person. In Table 10, among think-aloud mentions, gender was mentioned mostly during query reformulation (12 of 16 mentions; 75%), whereas date of birth was mentioned mostly during relevance judgment (13 of 17 mentions; 76%). This seems to be due to which limiting features the system offered: participants were able to limit their search by gender on the current VHF system but not by date of birth. It therefore could have been useful for participants if the system had provided the capability of limiting a search by date of birth in addition to gender.

Five participants (P12, P14, P22, P23, and P24) wanted to limit their search by gender in relation to their search topics and/or to the purpose of their information use, as stated in the following quote [5] from P12:

"... I want to incorporate gender analysis into my dissertation and I have more testimonies and more information about residents of DP (Displaced Persons) camps that I do of female so I want to bolster that. Also, because of the conference I'm going to I wanted to find some specific segments of women to show at the conference."

Other personal characteristics participants mentioned were occupation, social status, country of birth, nationality, and so on. Participants P11, P12, P14, P21, P23, and P24 mentioned occupation of the interviewee, which they used for two different purposes.
In some cases (10 of 15 mentions), participants stated that searching by the occupation of the interviewee (and of people mentioned in the testimony) would have been useful to them, as P11 mentioned:

"... Alright, musicians, let's do that one. Doesn't work. Hmm, it would have been much better if it worked."

[5] All quotes in this dissertation are copied from the transcripts of recorded tapes of think-aloud, interviews, or focus-group discussions. Original transcripts are quoted without correcting grammatical errors.

Occupation of the interviewee was also used as a proxy for the social status of the interviewee (5 of 15 mentions). Participants P23 and P24 used occupation of parents for the same purpose, together with social status of parents, especially when the interviewee was a child at the time of the event or experience being examined. For instance, P24 said:

"... There was a category 'socioeconomic status' so I was hoping that under this category there would subcategories and I could, you know, just click on that and find out exactly what the socioeconomic status of the testifier was. Kind of occupation? I mainly looked at the mother's and father's occupation, the interviewee's occupation, pre-war, during the war and post-war, and mostly pre-war and then I looked at what it said in the summary and PIQ about their occupations. And, so I could exclude testimonies when I looked at them, I could exclude summaries when I found keywords like 'well to do' or they had domestic maids or the father owned factories, so I could exclude testimonies based on this information."

Both country of birth and nationality were mentioned by the same four participants (P12, P21, P23, and P24), who used them interchangeably, as expressed in the following quote from P21:

"... Most directly I'm interested in finding survivors of Pechora, so I've certainly narrowed that down to two central, particular towns or ghettos to Pechora, but in addition to the Ukrainian ghetto, I'm trying to find Romanian Jews. So, we're doing country of birth."

One participant (P24) used country of birth as the basis for inferring the first language of the interviewee, saying:

"... And I'm looking to do a comparative analysis and between German women and Polish women, so I mean, my main focus are German women and German-speaking women because I'm in the Germany part of the dissertation... German speaking places. Yes, Austria, pre-war. So, we can narrow it down by country of birth very easily."

In summary, person is one of the three most frequently mentioned attributes of topicality, the most heavily used criterion. Participants searched either for a specific person by name or for characteristics of the person. Date of birth, gender, occupation, country of birth, and social status were characteristics of person that participants mentioned frequently.

In some cases, participants used one characteristic as a proxy for another, e.g., the occupation of the interviewee and/or parents as a proxy for the social status of the interviewee. Occupation was thus used not only to find people by their occupation but also to infer the social status of the interviewee. Therefore, it might be useful for catalogers and system designers to give special attention to those characteristics that were used as proxies for others.

Finally, some participants wanted to find testimonies using certain characteristics of person, although that feature often was not available in the current VHF system. As a result, person was mentioned noticeably more often during query reformulation than during relevance judgment, as shown in Table 8. It therefore seemed that biographical search and browsing, when used in combination with content-based search, could help to provide effective access to the VHF archive.

Place

Place was mentioned by all eight participants in association with their search topics. Example sub-attributes of place were camp, country, ghetto, city, and region.
As Table 10 shows, camp, which was the most frequently mentioned sub-attribute of place, was mentioned more often during query reformulation (37 of 53 think-aloud mentions; 70%) than during relevance judgment. It was observed that some participants searched testimonies primarily by names of camps, which was an available feature of the current VHF system. For example, participants P12 and P21 chose to focus on testimonies of survivors from specific camps. As a result, they mentioned camp as a sub-attribute of place more often than other participants. It therefore appeared that the use of camp was somewhat topic-dependent. It is quite possible that other searchers with different search topics would focus more on ghettos than on camps. This implies that an enhanced search by names of camps and ghettos might be useful for some searchers.

Both country and city were mentioned by the same six participants (P11, P12, P13, P21, P22, and P24). Unlike camp and ghetto, participants rarely chose testimonies/passages primarily by country and/or city. In most cases, country and city were used in conjunction with other attributes, such as event/experience, time frame, and organization/group. For instance, P11 was looking for testimonies that had stories about the Kulturbund, the Jewish cultural organization that was active in various cities in Germany until 1939. In this case, the primary attribute used by P11 was organization (Kulturbund), and country and city were used as secondary attributes in making relevance judgments and in formulating queries.

Finally, region was mentioned far less often than the others and only by three participants (P14, P23, and P24).

Event/Experience

Event/experience, along with person and place, is one of the three most frequently mentioned attributes used as the basis for judging topicality. Two different types of sub-attributes of event/experience were observed: 1) Personal event/experience and 2) Historical event/experience.
Personal event/experience refers to types of events or experiences individuals had gone through during/after the Holocaust. Participants seemed to pay attention to extreme personal events or experiences when they made relevance judgments. For example, participants did not mention anything about suicide, weddings, or murder/death when they formulated their queries but mentioned them during relevance judgment and/or interviews.

Historical event/experience refers to specific events/experiences during/after the Holocaust that are historically remarkable. Examples of observed historical event/experience were November pogrom, beautification, Polish pogroms, and Olympic games.

Both personal event/experience and historical event/experience were used for passage-level access rather than for testimony-level access. It is unlikely that a whole interview was about a specific event/experience, such as suicide. Therefore, participants looked for passages in a testimony that discussed the specific event or experience they were interested in.

Organization/Group

Organization/group was mentioned by four participants (P11, P12, P14, and P22). It refers to all types of organizations or groups, including regional/international, cultural, political, and religious organizations or groups. Examples of organization/group mentioned by participants were Kulturbund, JDC (Joint Distribution Committee), Red Cross, and resistance groups.

Some participants used organization/group as the primary interest in finding testimonies or passages (e.g., Kulturbund for participant P11 and specific resistance groups for participant P14), while others considered it a secondary interest. As a result, the majority of mentions of organization/group were made by P11 (59%) and P14 (20%), who chose to focus on testimonies from survivors of specific organizations or groups. It therefore seemed the importance of organization/group depended heavily on the topic.
As Table 8 shows, participants mentioned organization/group noticeably more often during relevance judgment than during query reformulation. This was somewhat surprising, considering that participants were able to search by names of organizations or groups. It was partly due to participant P11, who was able to find only one testimony using Kulturbund as a query. Participant P11 then found about 20 more potentially relevant testimonies by examining and viewing tapes that were retrieved using other related queries, such as music, camp orchestra, and performing arts.

Time Frame

Five participants (P11, P12, P13, P23, and P24) mentioned time frame, referring to a span of dates associated with other attributes, such as place, event/experience, topic, organization/group, and object. In the current VHF thesaurus, time frame is pre-combined with place (e.g., Germany 1900-1914), and participants could examine stories mainly by three time frames: pre-war, war time, and post-war. However, participants often wanted to associate time frame with other attributes in addition to place. For example, when searching for a specific event, participants might want to limit their search to a certain date or a range of dates, as P23 stated:

"... I will tell you about my latest research. I look at how much people are to participant in the Olympic games together, 1952. After that, I look in the next Olympics Games. 1956 in Melbourne and so and so and so."

Object

Object, which refers both to physical objects (e.g., ship, airplane, weapon, ...) and to geographical objects (e.g., mountain, river, forest, ...), was mentioned by two participants (P13 and P14). As with organization/group, object was used by some participants as their primary interest in finding testimonies or passages (e.g., passengers of the SS St. Louis for participant P13), while it was used as a secondary interest by other participants (e.g., weapons used during resistance for participant P14).
Interestingly, weapon was mentioned only during relevance judgment. This was partly because weapon was not the primary interest of participant P14 and partly because P14 did not think of the connection between weapon and resistance. As a result, participant P14 did not mention weapon during query formulation at all.

Other topics

Other topics refer to other subjects participants mentioned that do not belong to any of the other categories in Table 10. All eight participants mentioned other topics both during query formulation and during relevance judgment. Example subjects mentioned by participants were food, flight preparation, and music. These subjects were used in association with other attributes, such as place, event/experience, and organization/group, when judging the topical relevance of a testimony or a passage. For example, P14 said:

"... Let's do psychological reactions to resistance... Well, I'm hoping to find the evolution of the decision-making process, how do they begin to think from, let's say, if there was a shift from thinking, uh, they're telling us to march out of the city, for example, from that kind of a government order to, how does it shift to thinking? No, we are not going to march out of the city. Are we going to stay or are we going to fight?"

In this quote, the primary attribute P14 used was event/experience (resistance), and other topics (psychological reactions; decision-making process) were used to find detailed stories about the event/experience. Other topics therefore were used for passage-level access.

Participants were able to bring up more topics as they continued their search. A learning effect was observed both during relevance judgment and during query formulation. Participants refined their information need in further detail during/after relevance judgment, which made it possible for them to form a new query that could represent their information need better.
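The attribute combinations discussed in this section, such as restricting a search by a biographical attribute (country of birth) or attaching a time frame to an event rather than only to a place, can be illustrated with a minimal sketch. It is not the VHF system's data model: the `Segment` and `Testimony` records, their field names, the `find_passages` function, and the sample data are all hypothetical and serve only to show how such conditions might be combined at the passage level.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

@dataclass
class Segment:
    descriptors: Set[str]                        # hypothetical indexing terms for one segment
    year_range: Optional[Tuple[int, int]] = None  # hypothetical time frame attached to the segment

@dataclass
class Testimony:
    interviewee: str        # placeholder label, not a real name
    country_of_birth: str   # biographical attribute
    segments: List[Segment]

def overlaps(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True if two inclusive year ranges share at least one year."""
    return a[0] <= b[1] and b[0] <= a[1]

def find_passages(testimonies, descriptor, years=None, country_of_birth=None):
    """Return (interviewee, segment index) pairs that satisfy all given conditions."""
    hits = []
    for t in testimonies:
        # Biographical condition applies to the whole testimony.
        if country_of_birth is not None and t.country_of_birth != country_of_birth:
            continue
        for i, seg in enumerate(t.segments):
            # Content condition applies to the individual segment.
            if descriptor not in seg.descriptors:
                continue
            # Time-frame condition is attached to the event, not to a place.
            if years is not None and (seg.year_range is None or not overlaps(seg.year_range, years)):
                continue
            hits.append((t.interviewee, i))
    return hits

# Invented sample data for illustration only.
testimonies = [
    Testimony("Interviewee A", "Germany", [
        Segment({"Olympic Games"}, (1952, 1952)),
        Segment({"Olympic Games"}, (1956, 1956)),
        Segment({"music"}),
    ]),
    Testimony("Interviewee B", "Poland", [
        Segment({"Olympic Games"}, (1952, 1952)),
    ]),
]
hits = find_passages(testimonies, "Olympic Games", years=(1950, 1953), country_of_birth="Germany")
```

In this sketch the biographical condition filters whole testimonies while the descriptor and time-frame conditions filter individual segments, mirroring the distinction between testimony-level and passage-level access.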
4.1.2 Accessibility

Accessibility indicates whether rapid access to the retrieved testimonies/passages for viewing was available. All eight participants mentioned accessibility and seemed to consider it an important criterion when making relevance judgments. Recording format (analog vs. digital) and access type (loading by a robot vs. loading from cache) were the two factors that affected accessibility.

VHF testimonies were originally recorded on tape in analog format, and digitizing these tapes was in progress at the time of the study. Testimonies that were not digitized could not be accessed rapidly; in this case, the tape was usually available for viewing on the next day, after a special request was made. Once a tape was digitized, it was stored at a remote site where a robot could locate the tape and load it to play when a request was made; loading a tape by the robot took around 5 minutes. In order to reduce the loading time, VHF conducted a pre-search using the search topic each participant had and cached as many potentially relevant testimonies as it could; for cached testimonies, loading was almost immediate.

Table 11. Mentions of associated attributes: Accessibility

| Associated Attribute | All | Think-Aloud: Relevance Judgment | Think-Aloud: Query Form. | Interviews | Number of Participants Who Mentioned |
|---|---|---|---|---|---|
| Cache | 24 | 17 | 0 | 7 | 8 |
| Digitization | 19 | 11 | 0 | 8 | 7 |

Table 11 summarizes how participants perceived the loading time during their search. Cache and digitization in Table 11 refer to whether the retrieved testimony was cached and digitized or not, respectively. Not surprisingly, the 5 minutes of loading time for tapes that were digitized but not cached significantly interrupted participants while they were making relevance judgments, which made them avoid viewing such tapes. To make the problem worse, the system did not allow searchers to do any other tasks while loading.
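The three access paths described above (not digitized, digitized but not cached, and cached) can be summarized in a short sketch. This is an illustration only: the `Testimony` record, its field names, and the idea of ordering results by access tier are assumptions made here, not features of the VHF system.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Testimony:
    testimony_id: str  # hypothetical identifier
    digitized: bool
    cached: bool

def access_tier(t: Testimony) -> str:
    """Classify a testimony by the access path described in the text."""
    if not t.digitized:
        return "next-day request"     # analog tape; viewing requires a special request
    if t.cached:
        return "immediate"            # pre-searched and cached before the session
    return "robot load (~5 min)"      # digitized, but must be fetched by the robot

# Assumed ordering: fastest access first.
TIER_ORDER = {"immediate": 0, "robot load (~5 min)": 1, "next-day request": 2}

def sort_by_access(results):
    """Order results so that rapidly accessible testimonies come first (stable sort)."""
    return sorted(results, key=lambda t: TIER_ORDER[access_tier(t)])
```

Exposing the tier in a result list before a tape is requested would address the observation that participants could learn about digitization and caching only after they had found a testimony.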
Although participants could tell whether a testimony/passage they wanted to view was digitized (and cached), they could do so only after they had found it. Therefore, neither cache nor digitization was mentioned during query formulation. All eight participants stated that rapid access to testimonies was desirable, as P22 mentioned:

"... The other thing that I thought that young people, and even myself, we're all so results oriented. We want to see results quicker- and in this process, it seemed like you had to do a lot of groundwork. You had to lay a pathway and say, 'okay, I'm going to lay down this pathway. I'm not going to get to work on it today, but I've got to lay this down; I've got to prepare the foundation, and then tomorrow I can start taking some steps.' ... The accessibility would be somewhat frustrating to people sometimes-- that they are not able to get more results."

The associated attributes for accessibility can be differentiated from those for topicality by their origin. Unlike person, place, or any other associated attributes of topicality, cache and digitization do not originate from what the interviewee said in a testimony (content). Rather, they originate from how and where the testimony was recorded and stored (format). It may be useful for system designers to pay attention to these attributes, since catalogers generally do not index such non-content-related attributes.

4.1.3 Richness

Richness refers to how much detail of a subject the retrieved testimony/passage covers (amount of information) and/or how well the interviewee presents his/her experience in the testimony/passage (presentation skill). Table 12 shows mention counts for each associated attribute of richness made by participants.

Table 12. Mentions of associated attributes: Richness

| Associated Attribute | All | Think-Aloud: Relevance Judgment | Think-Aloud: Query Form. | Interviews | Number of Participants Who Mentioned |
|---|---|---|---|---|---|
| Amount of information | 31 | 12 | 0 | 19 | 6 |
| Presentation skill | 8 | 2 | 0 | 6 | 4 |

Six participants (P11, P12, P13, P14, P21, and P24) mentioned that amount of information was important when making their relevance judgments. It seemed amount of information was not a concern for the remaining two participants (P22 and P23), mainly because of the nature of their information needs. For example, P23 was looking for a demographic distribution of interviewees who would meet P23's own criteria. What the interviewee said in his/her testimony was not the primary concern of P23 at the time P23 was conducting the search. It therefore seems the importance of amount of information depends on the information needs of searchers.

Presentation skill was mentioned by four participants (P12, P14, P21, and P24) and seemed to be considered less important than amount of information (as the quote below suggests). Some interviewees presented their experiences more realistically than others, which drew more attention from participants. In many cases, participants judged the presentation skill of an interviewee in association with the way the interviewee presented his/her emotional changes when describing his/her experiences in the testimony/passage, as expressed in the following quote from P14:

"How their personal, I mean some of them were more emotional than others. And I should also mention that since I'm not watching the entire testimony, I paid very close attention to that emotional aspect of that segment and tried to see if the interview was a good quality interview. ... But, that's, again, not really all that important. It's important but probably that won't be the most critical factor. What I'm ultimately going to decide which interview to include is what's in the interview."

The VHF system provided little means for searchers to infer the richness of a testimony or a passage.
As a result, participants were able to assess it mainly by viewing the tape. In order to avoid viewing the tape (especially tapes that were not cached), some participants tried to gauge the amount of information presented in a passage by counting the number of consecutive segments assigned to a descriptor. However, that measure was neither accurate nor explicit. Two participants mentioned it would have been useful for them if time-stamped descriptors had been available.

4.1.4 Emotion

Emotion refers to the emotional expression presented in a testimony or a passage. VHF testimonies often included stories that were highly emotional, as interviewees discussed their experiences during the Holocaust. Interviewees often expressed their emotion while describing a story, using their voice in combination with facial expressions and gestures, as summarized in Table 13.

Table 13. Mentions of associated attributes: Emotion

| Associated Attribute | All | Think-Aloud: Relevance Judgment | Think-Aloud: Query Form. | Interviews | Number of Participants Who Mentioned |
|---|---|---|---|---|---|
| Facial expression | 19 | 3 | 0 | 16 | 6 |
| Voice | 4 | 2 | 0 | 2 | 3 |
| Gesture | 1 | 0 | 0 | 1 | 1 |

Participants often paid more attention to stories with emotional expressions, and perceiving emotion was greatly enhanced by audio and/or visual features presented in the testimony. The audio feature observed was voice; for instance, yelling could represent anger and/or frustration. Examples of visual features were facial expression (mentioned by P12, P13, P14, P21, P22, and P23) and gesture (mentioned by P12). Participants might have perceived emotion from what the interviewee said, but they paid much more attention to a story with such emotional audio and/or visual features. For example, some participants selected a passage they would not have chosen had no facial expression been present. The following quote from P12 illustrates how the audio and visual features presented in the testimony affected P12's selection decision:

"... One of the biggest advantages of having a video testimony is that you can see facial expressions. See physical movements. You know it's much more nuanced and you really get a feel for the information and how it's being presented. So, if I just had a piece of paper or on the computer screen just give me a summary, I might say I don't need that. Whereas if I had viewed it... I mean, one of the segments I'm going to be showing at the conference I'm going to, this woman is talking about post-war anti-Semitism, this boy chasing her back in Poland after the war and calling her 'dirty Jew.' And, so she's talking about, you know this was happening after the war and she just could not accept that, so she beat up the kid, this boy. And, you know, which was very interesting to hear, but while she's doing that, she's punching and you can see her almost as this little girl how she's beating this kid up and you can't get that anywhere else. And, I might have lost that if the summary had just said, you know, 'Survivor talks about post-war anti-Semitism and verbal anti-Semitism' or something."

It seemed the importance of emotion was somewhat topic-dependent. P13 clearly stated that the audio/visual features that expressed emotion did not affect P13's selection decision, although they provided additional context for the story. P13 was mostly interested in what was said in the story. P11 and P24 did not mention emotion at all, mainly due to the purpose of their searches. For example, participant P11 wanted to quickly locate Kulturbund survivors so that P11 could conduct P11's own interviews with them. As a result, P11 hardly used emotion as a selection criterion.

Not surprisingly, the attributes associated with emotion were used for passage-level access. Using these attributes, system designers can help searchers access potentially relevant passages within a testimony effectively.
P12 and P22, who were teachers, mentioned that emotional stories with audio and visual expressions were good candidates for their classroom presentations. Providing the capability of browsing such emotional stories within a testimony may be useful for some user groups, such as teachers. Techniques for automatic detection of emotional speech (Ang et al., 2002; Lee and Narayanan, 2002) and classification schemes for labeling emotions (Liscombe et al., 2003) may have potential for providing such a capability.

4.1.5 Comprehensibility

Comprehensibility refers to the degree to which searchers can understand what the speaker says in a testimony/passage. VHF testimonies were given in 32 different languages. Interviewees who gave their testimonies in English were not native English speakers in most cases. Participants mentioned that language and clarity of speech affected their query formulation and relevance judgment, as summarized in Table 14.

Table 14. Mentions of associated attributes: Comprehensibility

| Associated Attribute | All | Think-Aloud: Relevance Judgment | Think-Aloud: Query Form. | Interviews | Number of Participants Who Mentioned |
|---|---|---|---|---|---|
| Language | 13 | 0 | 10 | 3 | 7 |
| Clarity of speech | 1 | 1 | 0 | 0 | 1 |

Seven participants (P12, P13, P14, P21, P22, P23, and P24) mentioned language, mainly during query formulation or reformulation. Language was the primary concern of those seven participants (they were looking for English interviews), and limiting a search by language was a prominent feature of the VHF system: participants were guided through the system interface to make a selection of languages (as shown in Figure 13), such as all (default), English, German, and other languages. Thus, participants did not need to mention language again during their relevance judgments once they had limited their search by language during query formulation/reformulation.
Language was not important to participant P11, who was a documentary film producer, since P11 planned to have translators for non-English testimonies.

One participant (P23) mentioned the audio of a tape was hard to hear. In this case, clarity of speech was a concern for P23 when making relevance judgments. P23 might not have chosen to view the tape had the clarity of speech been known earlier during the search.

4.1.6 Duration

Duration refers to the playing time of a testimony or a passage. The average length of a testimony is 2.5 hours, but some testimonies were longer (e.g., more than 4 hours) or shorter (e.g., less than 1 hour). Four participants (P11, P12, P13, and P22) paid attention to the length of a testimony (or a passage) while making relevance judgments, as shown in Table 15.

Table 15. Mentions of associated attributes: Duration

| Associated Attribute | All | Think-Aloud: Relevance Judgment | Think-Aloud: Query Form. | Interviews | Number of Participants Who Mentioned |
|---|---|---|---|---|---|
| Length | 11 | 9 | 0 | 2 | 4 |

It seemed duration was not the primary criterion participants used in their selection processes. Participants indicated they would watch a testimony no matter how long it was if the testimony was highly relevant to their topic. In many cases, participants wanted to know how much time they were going to spend watching a testimony. Length was more important to teachers than to other participants. They were concerned about the length of a testimony when planning their classroom presentations, as P12 mentioned:

"It's important because it tells me, I know that the average length is about two hours or two and a half. I know that there are some that are shorter and longer. If I ever wanted to use certain tapes in the classroom, it gives me some idea. You know, is this a six-hour tape or is this going to be an hour and a half tape? And, will I be able to show all of it in one session of the class or two sessions or will I just show segments?"
4.1.7 Novelty of Content

Novelty in text retrieval normally refers to whether the retrieved document was new to the searcher (newness). None of the participants had had a chance to access VHF testimonies before participating in the two user workshops VHF hosted. All testimonies therefore were new to participants. Novelty of content in this dissertation refers to passage-level novelty rather than testimony-level novelty and covers three different types of cases: (1) participants found new facts about a known event or phenomenon, (2) participants found new examples/incidents of a known event or phenomenon, and (3) participants found an event or phenomenon that was new to them.

Table 16 indicates that three participants (P12, P13, and P14) judged novelty mainly by topic. Five of the ten mentions in Table 16 were cases in which participants found new examples or incidents of a known event, while three mentions were about new facts of a known event and two mentions were about a new event. In association with finding new examples or incidents of a known event, P12 found testimonies of a kind P12 could rarely find in any other Holocaust documents or oral history archives. This indicates the uniqueness of some of the testimonies VHF collected.

Table 16. Mentions of associated attributes: Novelty of content

| Associated Attribute | All | Think-Aloud: Relevance Judgment | Think-Aloud: Query Form. | Interviews | Number of Participants Who Mentioned |
|---|---|---|---|---|---|
| Topic | 10 | 4 | 2 | 4 | 3 |

P12 spent markedly more time watching testimonies during the search process than any other participant. As a result, novelty was mentioned mostly by P12 (8 of 10 mentions), since watching the tape was the main means available for identifying the novelty of content. P13 and P14 also spent significantly more time watching testimonies than other participants, but each mentioned novelty only once. Thus, it seemed the importance of novelty was somewhat topic-dependent.
Participants who did not mention novelty at all might have mentioned it had they had more time to watch testimonies.

Once novelty of content was mentioned, the testimony/passage under examination immediately became highly relevant to the searcher. Participants valued novelty of content, if found, very highly during their search, as shown in the following two quotes from P12:

"Suicide after liberation. This is really amazing because this is something that no one has talked about in my reading of the material, of documents and archives or interviewing people, no one has ever mentioned this suicide."

"... This one camp, Aglasterhausen... I had almost decided to discard it from my dissertation because I just didn't have enough material... The most amazing thing, in this past week, you have quite a few testimonies... And, because of the information I now have, not only is it definitely going to be my dissertation, but it's going to be quite a large, hold quite a large weight."

It seems to be a hard task for system designers to support a function that enables searchers to identify the novelty of a testimony or a passage effectively. One participant serendipitously found a keyword while browsing thesaurus terms; the participant indicated the keyword itself was novel. However, in most cases, participants required more information than just keywords to judge the novelty of a testimony or a passage. Providing testimony summaries and/or passage summaries may enhance the process of finding novelty for searchers. Producing such summaries manually is a very expensive procedure. Techniques such as automatic summarization by detecting new sentences (Harman, 2002) and first story detection (Allan et al., 2000) may be useful for supporting the task of tracking novelty.

4.1.8 Acquaintance

Acquaintance refers to the previous relationship of the searcher with the interviewee.
Examples of such relationships were relatives, friends, and figures previously known to the searcher. In most cases, participants chose to view testimonies from people they knew or knew about, regardless of the topical relevance of the testimony, as P22 described in the following quote:

"... This man is a survivor that lives in Israel, and he has some friends in the States that are friends of mine. When he was in the States for the first time about three years ago, he came and spoke to about 300 of my former or present students."

Table 17 indicates participants acknowledged acquaintance mainly by the name of interviewees. Four participants (P11, P12, P21, and P22) mentioned that acquaintance affected their selection decisions. One of the eight mentions in Table 17 was personal acquaintance, while seven of the eight mentions were professional acquaintance.

Table 17. Mentions of associated attributes: Acquaintance

| Associated Attribute | All | Think-Aloud: Relevance Judgment | Think-Aloud: Query Form. | Interviews | Number of Participants Who Mentioned |
|---|---|---|---|---|---|
| Interviewee, name | 8 | 3 | 2 | 3 | 4 |

4.1.9 Access to the Interviewee

Access to the interviewee indicates the physical distance between the searcher and the interviewees and was mentioned by two participants (P11 and P12). Participants sometimes found interviewees who were in their neighborhood and mentioned they might contact them to get more information, as stated in the following quote from P12:

"... There is one survivor here that I know that lives fairly close to me and I've been meaning to go and get her testimony so this will be excellent for me to see what she has recorded and that will really prepare me for when I go to actually interview her at her home."

Interviewee name and interviewee address were the two associated attributes of geographical proximity, as shown in Table 18.

Table 18. Mentions of associated attributes: Access to the interviewee

| Associated Attribute | All | Think-Aloud: Relevance Judgment | Think-Aloud: Query Form. | Interviews | Number of Participants Who Mentioned |
|---|---|---|---|---|---|
| Interviewee, name | 2 | 1 | 0 | 1 | 2 |
| Interviewee, address | 1 | 1 | 0 | 0 | 1 |

4.1.10 Miscellaneous

In addition to the relevance criteria discussed in the sections above, participants used some other miscellaneous criteria that were hard to define. These criteria were mainly associated with attributes that originated from audiovisual features of VHF testimonies, as shown in Table 19. Besides the audiovisual cues participants used for judging emotion (e.g., facial expression, gesture, and voice), VHF testimonies contained some exceptional audio/visual features that were non-emotional and that participants used when finding a testimony or a passage.

Table 19. Mentions of associated attributes: Miscellaneous

| Feature Type | Associated Attribute | All | Think-Aloud: Relevance Judgment | Think-Aloud: Query Form. | Interviews | Number of Participants Who Mentioned |
|---|---|---|---|---|---|---|
| Visual features | Gesture | 5 | 1 | 0 | 4 | 2 |
| Visual features | Displayed artifact | 3 | 2 | 0 | 1 | 1 |
| Visual features | Still image | 2 | 1 | 0 | 1 | 2 |
| Audio features | Tone | 3 | 1 | 0 | 2 | 3 |
| Audio features | Whispering | 2 | 0 | 0 | 2 | 1 |
| Audio features | Singing | 1 | 0 | 0 | 1 | 1 |

Participants paid attention both to visual features and to audio features. Example sub-attributes of visual features were gesture, displayed artifact, and still image. Two participants (P12 and P14) mentioned gesture and indicated it provided some additional context to what the speaker said. Displayed artifact was mentioned by one participant (P13), who used it as a basis for selecting testimonies. For example, P13 noted that the names of the interviewees found sometimes did not match the SS Saint Louis passengers' list P13 had, because some survivors had changed their names (mainly the last name after a marriage). Most VHF testimonies began with the interviewee holding a tag that displayed his/her name(s), including his/her original name.
Participant P13 repeatedly viewed the beginning of every testimony in order to get that information.

Two participants (P12 and P13), in some cases, were able to get additional information from the still image of interviewees. For instance, P12 inferred the sect (or denomination) of an interviewee's Judaism by noticing him wearing a black kippah. Another example is that P13 sometimes tried to guess the age of interviewees during the Holocaust by looking at their pictures.

Several participants stated that some audio features affected their decisions in selecting a passage. Examples of these audio features were tone (P12, P14, and P22), whispering (P12), and singing (P12). Interestingly, audio features were mentioned mostly by P12 (4 of 6 mentions) and P22 (1 of 6 mentions), who both were teachers. It seemed they preferred to select passages with some exceptional audio features for the purpose of classroom presentation.

Some attributes (displayed artifact and still image) were used for testimony-level access and others (gesture, tone, whispering, and singing) for passage-level access. Since the VHF system did not provide a ranked list of search results, participants had little means to identify which segments were more relevant than others in a testimony (and across testimonies). As with emotional expressions, it might be useful for searchers (especially teachers) if they could locate passages within a testimony by the exceptional audio/visual features presented in Table 19.

4.2 Relevance Judgment: Characteristics, Usage Patterns, and External Factors

The first two research questions are (1.1) to characterize the relevance criteria searchers applied during their search and (1.2) to examine the attributes associated with each criterion. Section 4.1 discussed each criterion and its associated attributes individually.
This section takes a holistic view of relevance criteria and associated attributes rather than discussing each criterion and its associated attribute(s) individually: it examines the type of attributes by their origin (Section 4.2.1), the usage patterns of criteria/attributes (Sections 4.2.2, 4.2.3, and 4.2.4), and factors that affect relevance judgment (Section 4.2.5).

4.2.1 Content and Non-Content Attributes

Each associated attribute discussed in Section 4.1 can be classified into one of two types by its origin: content and non-content. Some attributes were derived from the content of either the pre-interview questionnaire (e.g., person) or the testimony (e.g., place, facial expression), while others did not directly come from the content of the testimony (e.g., cache, digitization). Table 20 summarizes the two different types of associated attributes and presents examples of attributes that were mentioned 10 or more times by at least 2 participants during the two user workshops.

Table 20. Content and non-content attributes

| Type | Associated Attribute | All | Think-Aloud: Rel. Judgment | Think-Aloud: Query Form. | Interview | Number of Participants Who Mentioned |
|---|---|---|---|---|---|---|
| Specific content | Person | 124 | 33 | 67 | 24 | 8 |
| Specific content | Place | 120 | 48 | 57 | 15 | 8 |
| Specific content | Event/experience | 116 | 54 | 46 | 16 | 7 |
| Specific content | Organization/group | 44 | 26 | 11 | 7 | 4 |
| Specific content | Time period | 32 | 14 | 11 | 7 | 5 |
| Specific content | Object | 26 | 13 | 6 | 7 | 2 |
| Specific content | Facial expression | 19 | 5 | 0 | 14 | 6 |
| Specific content | Topic | 10 | 4 | 2 | 4 | 3 |
| Specific content | Visual features | 10 | 4 | 0 | 6 | 3 |
| Specific content | Voice | 4 | 2 | 0 | 2 | 3 |
| Non-content | Amount of information | 31 | 12 | 0 | 19 | 6 |
| Non-content | Cache | 24 | 17 | 0 | 7 | 8 |
| Non-content | Digitization | 19 | 11 | 0 | 8 | 7 |
| Non-content | Language | 13 | 0 | 10 | 3 | 7 |
| Non-content | Length | 11 | 9 | 0 | 2 | 4 |

In Table 20, person, place, event/experience, organization/group, time period, object, topic, and voice originated from what was spoken in the testimony (spoken-content attributes), while facial expression and visual features were derived from what was shown in the testimony.
Therefore, they all originated from the specific content of the testimonies (or the content of the pre-interview questionnaire for most sub-attributes of person). On the other hand, amount of information, cache, digitization, language, and length are not derived from specific content elements. Interestingly, searchers did not expect these attributes (except language and length) to be among their search criteria at the beginning of their search (nor to be available for search in the system). But the attributes immediately became a primary concern once searchers noticed them. Therefore, it would be useful for catalogers and system designers to pay attention to such non-content attributes as amount of information and digitization.

4.2.2 Proxy Use of Attributes

Proxy use of attributes was observed mainly among the sub-attributes of person. Participants sometimes used one attribute as a proxy for another in order to infer some information about a person (the interviewee, in this case). Table 21 presents examples of the proxy use of attributes that were observed during the two user workshops. For instance, participants used country of birth as a proxy for nationality during language acquisition to infer the first language of the interviewee.

Table 21. Examples of proxy use of attributes

| Looked-for Attribute | Proxy Attribute | Inferred Information |
|---|---|---|
| Date of birth | Still image (picture icon) | Age |
| Social status, interviewee | Occupation, interviewee; Occupation, parents; Religion, interviewee; Religion, parents | Social status, interviewee |
| Nationality | Country of birth | First language |
| Religious affiliation | Color of skullcap (in still image) | Religious affiliation |

Participants used these proxy attributes when no data were available for the looked-for attribute.
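The fallback pattern implied by Table 21 (use the looked-for attribute when data exist, otherwise fall back to a proxy) can be sketched as follows. The field names, the `evidence_for` function, and the order in which proxies are tried are assumptions made for illustration; Table 21 lists the proxies but does not rank them.

```python
from typing import Optional, Tuple

# Looked-for attribute -> proxy attributes, in the order listed in Table 21.
# All keys and values are hypothetical field names.
PROXY_ATTRIBUTES = {
    "date_of_birth": ["still_image"],
    "social_status_interviewee": [
        "occupation_interviewee",
        "occupation_parents",
        "religion_interviewee",
        "religion_parents",
    ],
    "nationality": ["country_of_birth"],
    "religious_affiliation": ["skullcap_color"],
}

def evidence_for(record: dict, looked_for: str) -> Optional[Tuple[str, object, bool]]:
    """Return (attribute, value, is_proxy) for the first available source, or None."""
    # Prefer direct data for the looked-for attribute when it exists.
    if record.get(looked_for) is not None:
        return looked_for, record[looked_for], False
    # Otherwise fall back to the first proxy that has data (assumed ordering).
    for proxy in PROXY_ATTRIBUTES.get(looked_for, []):
        if record.get(proxy) is not None:
            return proxy, record[proxy], True
    return None
```

Returning the `is_proxy` flag keeps the distinction between recorded and inferred information visible, which matters because an inference such as first language from country of birth may not hold for every interviewee.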
For example, some participants used country of birth to infer the first language of the interviewee, the occupation of an interviewee to infer his or her social status, or the occupation of a parent to infer the social status of an interviewee who was a child during the Holocaust. One participant used religious affiliation to infer the social status of the interviewee, since Hasidic Jews were mostly poor. The color of the skullcap a Jewish man wears indicates the religious affiliation of the interviewee, as discussed in Section 4.1.10. Level of education could have been used for inferring the social status of interviewees, but no such cases were observed during the two user workshops.

4.2.3 Granularity of Units Judged

The granularity of the units judged differed somewhat from one attribute to another. Table 22 lists attributes that were mentioned 10 or more times during the two user workshops and by at least two participants. Some attributes in Table 22 were used mainly for testimony-level access (e.g., gender), and others mainly for passage-level access (e.g., event/experience). In some cases the granularity of the units judged by an attribute was not clear, or an attribute was mentioned without any indication of whether the searcher intended to select testimonies or passages. In such cases, mention counts were added to the Not Clear column in Table 22.

Participants used name, date of birth, gender, country of birth, cache, digitization, and language mainly for testimony-level access. Participants used names of interviewees to search testimonies (27 of 34 mentions) and names of figures that were mentioned in a testimony to search passages (5 of 34 mentions). No passage-level access was observed for date of birth, gender, country of birth, cache, digitization, or language.
Although participants used the occupation of interviewees, camp, and ghetto both for testimony-level access and for passage-level access, they used them noticeably more often (more than twice as often) for testimony-level access.

Table 22. Granularity of units judged by attributes

| Relevance Criteria | Associated Attributes | All | Testimony | Passage | Not Clear | Number of Participants Mentioned |
|---|---|---|---|---|---|---|
| Topicality | Person: Name | 34 | 27 | 5 | 2 | 7 |
| Topicality | Person: Date of birth | 30 | 27 | 0 | 3 | 6 |
| Topicality | Person: Gender | 20 | 20 | 0 | 0 | 5 |
| Topicality | Person: Occupation, interviewee | 15 | 7 | 2 | 6 | 6 |
| Topicality | Person: Country of birth | 14 | 9 | 0 | 5 | 4 |
| Topicality | Place: Camp | 64 | 30 | 12 | 22 | 6 |
| Topicality | Place: Country | 47 | 15 | 9 | 23 | 4 |
| Topicality | Place: Ghetto | 20 | 8 | 3 | 9 | 5 |
| Topicality | Place: City | 20 | 5 | 5 | 10 | 6 |
| Topicality | Event/experience | 116 | 0 | 101 | 15 | 7 |
| Topicality | Organization/group | 44 | 13 | 11 | 20 | 7 |
| Topicality | Time frame | 32 | 12 | 7 | 13 | 5 |
| Topicality | Other topics | 73 | 0 | 69 | 4 | 8 |
| Accessibility | Cache | 24 | 24 | 0 | 0 | 8 |
| Accessibility | Digitization | 19 | 19 | 0 | 0 | 7 |
| Richness | Amount of information | 31 | 0 | 31 | 0 | 6 |
| Emotion | Facial expression | 19 | 0 | 19 | 0 | 3 |
| Comprehensibility | Language | 13 | 13 | 0 | 0 | 7 |
| Duration | Length | 11 | 0 | 2 | 9 | 4 |
| Novelty of content | Topic | 10 | 0 | 10 | 0 | 3 |
| Miscellaneous | Visual features | 10 | 0 | 10 | 0 | 3 |

Event/experience, other topics (for judging topicality), amount of information, facial expression, topic (for judging novelty), and visual features were used mainly for passage-level access. Interestingly, no testimony-level access was directly observed for these attributes. For example, both personal event/experience and historical event/experience were used for passage-level access rather than for testimony-level access. It was unlikely that a whole testimony was about a specific event/experience, such as suicide. Therefore, participants looked for passages in a testimony that discussed the specific event or experience they were interested in. Table 23 summarizes the associated attributes by the granularity of units judged.

Table 23. Associated attributes by the granularity of units judged

| Mainly Testimony-Level | Mainly Passage-Level |
|---|---|
| Person: Name; Date of birth; Gender; Occupation, interviewee; Country of birth | Person: Occupation, interviewee |
| Place: Camp; Country; Ghetto; City | Place: Camp; Country; Ghetto; City |
| Event/experience | Event/experience |
| Organization/group | Organization/group |
| Time frame | Time frame |
| Cache | Other topics (for judging topicality) |
| Digitization | Amount of information |
| Language | Facial expression |
| | Topic (for judging novelty) |
| | Visual features |

4.2.4 Searcher Expectation and the Availability of Attributes

An interesting observation was made on the relationship between searcher expectation and the availability of an attribute. In Table 24, Searcher Expectation refers to whether the searcher expected the attribute to be available for search from the system, and Availability indicates whether the system allows the searcher to retrieve testimonies or passages using the attribute. Level of Interest refers to whether the searcher used the attribute as a primary or a secondary criterion for search/selection.

Table 24. Searcher expectation and the availability of attributes

| Searcher Expectation | Availability | Level of Interest | Associated Attributes | No. of Mentions: Relevance Judgment | No. of Mentions: Query Form. | No. of Participants |
|---|---|---|---|---|---|---|
| Yes | Yes | Primary | Gender | 4 | 12 | 5 |
| Yes | Yes | Primary | Language | 0 | 10 | 7 |
| Yes | Yes | Secondary | Country | 22 | 19 | 6 |
| Yes | Yes | Secondary | Time frame | 14 | 11 | 5 |
| Yes | Yes | Secondary | City | 7 | 10 | 6 |
| Yes | Yes | Both | Event/experience | 54 | 46 | 7 |
| Yes | Yes | Both | Camp | 16 | 37 | 6 |
| Yes | Yes | Both | Organization/group | 26 | 11 | 7 |
| Yes | Yes | Both | Name | 10 | 17 | 7 |
| Yes | Yes | Both | Object | 13 | 6 | 2 |
| Yes | Yes | Both | Ghetto | 8 | 10 | 5 |
| Yes | No | Primary | Date of birth | 13 | 4 | 6 |
| Yes | No | Both | Country of birth | 4 | 8 | 4 |
| Yes | No | Both | Occupation, interviewee | 3 | 9 | 6 |
| No | No | Primary | Cache | 17 | 0 | 8 |
| No | No | Primary | Amount of information | 12 | 0 | 6 |
| No | No | Primary | Digitization | 11 | 0 | 7 |

Table 24 summarizes the relationship among searcher expectation, the availability of attributes, and the level of searcher interest.
For example, when an attribute was expected by the searcher and available from the system as a search criterion, the attribute was mentioned mainly during query formulation, but only if it was the primary interest of the searcher. If an attribute was expected but not available, it was mentioned mostly during relevance judgment, again only when the attribute was the primary interest of the searcher. Primary under the Level of Interest column in Table 24 indicates that searchers used the attribute as a predominant criterion in most cases. Secondary refers to cases where searchers used the attribute as one of the selection criteria but not as the primary attribute. Both indicates that searchers used the attribute as the primary interest at one time and as the secondary interest at another time. Attributes listed in Table 24 include only those that were mentioned 10 or more times during think-aloud (excluding mentions made during interviews) and by at least two participants.

Some attributes in Table 24 demonstrated the above-mentioned patterns. For example, most searchers wanted to limit their search by gender and/or language (i.e., expected), which was a feature available from the system (i.e., available). Gender and language were both the primary interest of those searchers in most cases. Therefore, searchers mentioned them mainly during query formulation. Likewise, six participants wanted to use date of birth as the primary criterion for their search, which was not feasible due to system limitations during the two user workshops. As a result, date of birth was mentioned mainly during relevance judgments. None of the above-mentioned patterns was obvious for attributes that were not the primary interest of the searcher.

Another pattern observed in Table 24 concerned attributes that were neither expected by searchers nor available from the system (e.g., cache, amount of information, digitization).
Not surprisingly, in such cases searchers mentioned the attribute only during their relevance judgments. It may be useful for system/interface designers to make these attributes available at search time. No case was observed during the two user workshops in which an attribute was available from the system but not expected by searchers.

4.2.5 Factors Affecting Relevance Judgment

The previous sections discussed the types of attributes by their origin, the proxy use of attributes, and the relationship between searcher expectation and the availability of attributes. These sections examined the attributes themselves, i.e., what characteristics they had, how they were used, and what relationships existed among them. The following two sections discuss external factors that potentially affect the selection process of searchers.

4.2.5.1 Individual Differences

Individuals with different search topics may use different relevance criteria and attributes. In addition to the search topic, the problem situations of individual searchers may differ from one searcher to another. Previous studies of relevance judgment have found that such individual differences affect the cognitive process of human relevance judgments (Wilson, 1973; Wang and Soergel, 1998). Table 25 presents the number of mentions each searcher made of each relevance criterion and shows how differently participants used the criteria during their searches. Mention counts with more than 5% of total mentions by a participant are bolded in Table 25 (and Table 26).

In Table 25, the notable difference between the two user workshop groups in the total number of mentions per participant may have been due to the different settings of each user workshop (as described in Section 3.2.1). The difference among participants within the same user workshop group was partly due to their ability to think aloud and partly due to topic differences.

Table 25. Individual differences: relevance criteria, mention counts
(Number of mentions. User Workshop 1: P11 to P14, N=499. User Workshop 2: P21 to P24, N=204.)

| Relevance Criteria | All (N=703) | P11 (N=126) | P12 (N=125) | P13 (N=105) | P14 (N=143) | P21 (N=71) | P22 (N=58) | P23 (N=46) | P24 (N=29) |
|---|---|---|---|---|---|---|---|---|---|
| Topicality | 535 | 106 | 87 | 88 | 97 | 59 | 36 | 40 | 22 |
| Accessibility | 43 | 6 | 7 | 3 | 10 | 5 | 8 | 2 | 2 |
| Richness | 39 | 4 | 2 | 4 | 23 | 3 | 0 | 0 | 3 |
| Emotion | 24 | 0 | 5 | 3 | 5 | 2 | 8 | 1 | 0 |
| Comprehensibility | 14 | 0 | 3 | 1 | 3 | 1 | 1 | 3 | 2 |
| Duration | 11 | 4 | 3 | 1 | 0 | 0 | 3 | 0 | 0 |
| Novelty of content | 10 | 0 | 8 | 1 | 1 | 0 | 0 | 0 | 0 |
| Acquaintance | 8 | 4 | 2 | 0 | 0 | 1 | 1 | 0 | 0 |
| Access to the interviewee | 3 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| Miscellaneous | 16 | 0 | 7 | 4 | 4 | 0 | 1 | 0 | 0 |

In order to reduce the effect of variables that are not directly related to individual differences, further analysis was conducted using percentages rather than mention counts. Figures after a criterion in Table 26 indicate the mention count (No.) and the percentage of the criterion (%), respectively. The data shown in Table 26 suggest that the relevance criteria individual searchers used were somewhat different. Some searchers mentioned more criteria than others (e.g., P12 mentioned ten criteria, while P23 and P24 both mentioned only four). Some searchers valued certain criteria more than others did (e.g., P14 used richness (16%) more than anyone else).

Table 26. Individual differences: relevance criteria

| User Workshop | Participant | Relevance Criteria: No. (%) |
|---|---|---|
| 1 | P11 | Topicality 106 (84); Accessibility 6 (5); Richness 4 (3); Duration 4 (3); Acquaintance 4 (3); Access to the interviewee 2 (2) |
| 1 | P12 | Topicality 87 (70); Novelty of content 8 (6); Accessibility 7 (6); Miscellaneous 7 (6); Emotion 5 (4); Comprehensibility 3 (2); Duration 3 (2); Richness 2 (2); Acquaintance 2 (2); Access to the interviewee 1 (1) |
| 1 | P13 | Topicality 88 (84); Richness 4 (4); Miscellaneous 4 (4); Accessibility 4 (4); Emotion 3 (3); Comprehensibility 1 (1); Duration 1 (1); Novelty of content 1 (1) |
| 1 | P14 | Topicality 97 (68); Richness 23 (16); Accessibility 10 (7); Emotion 5 (3); Miscellaneous 4 (3); Comprehensibility 3 (2); Novelty of content 1 (1) |
| 2 | P21 | Topicality 59 (86); Accessibility 5 (7); Richness 3 (4); Emotion 2 (3); Comprehensibility 1 (1); Acquaintance 1 (1) |
| 2 | P22 | Topicality 36 (62); Accessibility 8 (14); Emotion 8 (14); Duration 3 (5); Comprehensibility 1 (2); Acquaintance 1 (2); Miscellaneous 1 (2) |
| 2 | P23 | Topicality 40 (87); Comprehensibility 3 (7); Accessibility 2 (4); Emotion 1 (1) |
| 2 | P24 | Topicality 22 (76); Richness 3 (10); Accessibility 2 (7); Comprehensibility 2 (7) |

All eight participants used topicality most frequently. However, the usage percentage of topicality ranged from as little as 62% (P22) to as much as 87% (P23). Although the percentage range for topicality was approximately in line with what most previous studies found (Park, 1993; Green, 1995; Wang and Soergel, 1998), the difference between the higher group (P11, P13, P21, and P23) and the lower group (P14 and P22) was quite large (about 20 percentage points). Interestingly, P14 used richness remarkably more often than others, and a similar pattern was observed for P22 with accessibility and emotion. This implied that the individual difference resulted partly from the purpose of information seeking. For example, P14 was at the beginning stage of P14's research and wanted to find testimonies that contain the most information. Thus, amount of information was important for P14, as well as topical relevance.
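The percentages in Table 26 are derived from the mention counts in Table 25 by dividing each count by the participant's total number of mentions. The following Python sketch illustrates that calculation. The names `PARTICIPANTS`, `COUNTS`, and `percent_shares` are illustrative and are not part of the study's tooling, and rounding to whole percentages is an assumption. A recomputation of this kind is a consistency check only and need not reproduce every printed cell.

```python
# Mention counts per relevance criterion, transcribed from Table 25.
# Column order follows PARTICIPANTS.
PARTICIPANTS = ["P11", "P12", "P13", "P14", "P21", "P22", "P23", "P24"]
COUNTS = {
    "Topicality": [106, 87, 88, 97, 59, 36, 40, 22],
    "Accessibility": [6, 7, 3, 10, 5, 8, 2, 2],
    "Richness": [4, 2, 4, 23, 3, 0, 0, 3],
    "Emotion": [0, 5, 3, 5, 2, 8, 1, 0],
    "Comprehensibility": [0, 3, 1, 3, 1, 1, 3, 2],
    "Duration": [4, 3, 1, 0, 0, 3, 0, 0],
    "Novelty of content": [0, 8, 1, 1, 0, 0, 0, 0],
    "Acquaintance": [4, 2, 0, 0, 1, 1, 0, 0],
    "Access to the interviewee": [2, 1, 0, 0, 0, 0, 0, 0],
    "Miscellaneous": [0, 7, 4, 4, 0, 1, 0, 0],
}


def percent_shares(participant):
    """Return {criterion: whole-number percentage} for one participant.

    Criteria the participant never mentioned are omitted, so the length
    of the result is the number of distinct criteria that were used.
    """
    i = PARTICIPANTS.index(participant)
    total = sum(row[i] for row in COUNTS.values())
    return {
        criterion: round(100 * row[i] / total)
        for criterion, row in COUNTS.items()
        if row[i] > 0
    }


if __name__ == "__main__":
    for p in PARTICIPANTS:
        print(p, percent_shares(p))
```

Omitting zero-count criteria makes the number of criteria each participant used fall out of the same function, which is the comparison drawn above between P12 and P23 or P24.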
In order to further discuss individual differences, Table 27 extends the previous table by presenting all attributes that are associated with each relevance criterion. It thus summarizes which associated attributes participants used to judge each relevance criterion they applied during the search. Mention counts with more than 5% of total mentions by a participant are bolded in Table 27. No. under the column for each participant in Table 27 represents mention counts.

Table 27 indicates that some attributes appeared to be more important to one participant than they were to another. For instance, P22 used facial expression noticeably more frequently (14%) than any other participant. P22, who was a high school teacher, often mentioned that passages with facial expressions would convey strong messages to the classroom (as discussed in Section 4.1.4). In this case, the use to which P22 intended to put the information was what made P22's search different from that of other participants.

Table 27. Individual differences: associated attributes
(Participant cells give No. (%).)

| Relevance Criteria | Associated Attributes | All | P11 (N=126) | P12 (N=125) | P13 (N=105) | P14 (N=143) | P21 (N=71) | P22 (N=58) | P23 (N=46) | P24 (N=29) |
|---|---|---|---|---|---|---|---|---|---|---|
| Topicality | Person | 124 | 22 (17) | 18 (14) | 26 (25) | 9 (6) | 17 (24) | 2 (3) | 18 (39) | 12 (41) |
| Topicality | Place | 120 | 21 (17) | 26 (21) | 14 (13) | 14 (10) | 24 (34) | 11 (19) | 6 (13) | 4 (14) |
| Topicality | Event/experience | 116 | 3 (2) | 20 (16) | 20 (19) | 38 (27) | 15 (21) | 12 (21) | 8 (17) | 0 (0) |
| Topicality | Organization/group | 44 | 26 (21) | 4 (3) | 0 (0) | 9 (6) | 0 (0) | 5 (9) | 0 (0) | 0 (0) |
| Topicality | Time frame | 32 | 14 (11) | 7 (6) | 5 (5) | 0 (0) | 0 (0) | 0 (0) | 3 (7) | 3 (10) |
| Topicality | Object | 26 | 0 (0) | 0 (0) | 13 (12) | 13 (9) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Topicality | Other topics | 73 | 20 (16) | 12 (10) | 10 (10) | 14 (10) | 3 (4) | 6 (10) | 5 (11) | 3 (10) |
| Accessibility | Cache | 24 | 1 (1) | 4 (3) | 2 (2) | 6 (4) | 3 (4) | 5 (9) | 2 (4) | 1 (3) |
| Accessibility | Digitization | 19 | 5 (4) | 3 (2) | 1 (1) | 4 (3) | 2 (3) | 3 (5) | 0 (0) | 1 (3) |
| Richness | Amount of information | 31 | 4 (3) | 1 (1) | 4 (4) | 19 (13) | 1 (1) | 0 (0) | 0 (0) | 2 (7) |
| Richness | Presentation skill | 8 | 0 (0) | 1 (1) | 0 (0) | 4 (3) | 2 (3) | 0 (0) | 0 (0) | 1 (3) |
| Emotion | Facial expression | 19 | 0 (0) | 3 (2) | 2 (2) | 4 (3) | 2 (3) | 8 (14) | 1 (2) | 0 (0) |
| Emotion | Voice | 4 | 0 (0) | 2 (2) | 1 (1) | 1 (1) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Emotion | Gesture | 1 | 0 (0) | 1 (1) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Comprehensibility | Language | 13 | 0 (0) | 3 (2) | 1 (1) | 3 (2) | 1 (1) | 1 (2) | 2 (4) | 2 (7) |
| Comprehensibility | Clarity of speech | 1 | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (2) | 0 (0) |
| Duration | Length | 11 | 4 (3) | 3 (2) | 1 (1) | 0 (0) | 0 (0) | 3 (5) | 0 (0) | 0 (0) |
| Novelty of content | Topic | 10 | 0 (0) | 8 (6) | 1 (1) | 1 (1) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Acquaintance | Interviewee, name | 8 | 4 (3) | 2 (2) | 0 (0) | 0 (0) | 1 (1) | 1 (2) | 0 (0) | 0 (0) |
| Access to the interviewee | Interviewee, name | 2 | 1 (1) | 1 (1) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Access to the interviewee | Interviewee, address | 1 | 1 (1) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Miscellaneous | Visual features | 10 | 0 (0) | 3 (2) | 4 (1) | 3 (2) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Miscellaneous | Audio features | 6 | 0 (0) | 4 (3) | 0 (0) | 1 (1) | 0 (0) | 1 (2) | 0 (0) | 0 (0) |

Table 28. Individual differences: the importance of associated attributes

Table 28 is based on Table 27. It lists for each participant the most often used attributes (more than 5%) in order of frequency.

| User Workshop | Participant | Relevance Criteria | Associated Attributes: No. (%) |
|---|---|---|---|
| 1 | P11 | Topicality | Organization/group 26 (21); Person 22 (17); Place 21 (17); Other topics 20 (16); Time frame 14 (11) |
| 1 | P12 | Topicality | Place 26 (21); Event/experience 20 (16); Person 18 (14); Other topics 12 (10); Time frame 7 (6) |
| 1 | P12 | Novelty of content | Topic 8 (6) |
| 1 | P13 | Topicality | Person 26 (25); Event/experience 20 (19); Place 14 (13); Object 13 (12); Other topics 10 (10); Time frame 5 (5) |
| 1 | P14 | Topicality | Event/experience 38 (27); Place 14 (10); Other topics 14 (10); Object 13 (9); Person 9 (6); Organization/group 9 (6) |
| 1 | P14 | Richness | Amount of information 19 (13) |
| 2 | P21 | Topicality | Place 24 (34); Person 17 (24); Event/experience 15 (21) |
| 2 | P22 | Topicality | Event/experience 12 (21); Place 11 (19); Other topics 6 (10); Organization/group 5 (9) |
| 2 | P22 | Accessibility | Cache 5 (9); Digitization 3 (5) |
| 2 | P22 | Emotion | Facial expression 8 (14) |
| 2 | P22 | Duration | Length 3 (5) |
| 2 | P23 | Topicality | Person 18 (39); Event/experience 8 (17); Place 6 (13); Other topics 5 (11); Time frame 3 (7) |
| 2 | P24 | Topicality | Person 12 (41); Place 4 (14); Time frame 3 (10); Other topics 3 (10) |
| 2 | P24 | Richness | Amount of information 2 (7) |
| 2 | P24 | Comprehensibility | Language 2 (7) |

As mentioned above, all eight participants used topicality most heavily, but the types of attributes used in judging topicality differed somewhat among participants. Table 28 presents only those attributes that showed at least 5% usage by each participant. P11 used organization/group heavily (21%), while the others used it between 0% and 9% of the time. Place was used by all eight participants, with a minimum usage of 10% (P14) and a maximum usage of 34% (P21). Interestingly, 7 of 8 participants had a usage rate of at least 14% for person; P22 had a usage rate of only 3%. As a result, P22 happened to be the one who had the lowest usage rate for topicality (62%) among all participants. Instead, P22 used other criteria (e.g., accessibility, emotion, duration) more frequently than any other participant.

In addition to individual differences, there may be group differences in the use of relevance criteria and attributes.
For example, the high school teacher (P22) used a somewhat different set of relevance criteria than other participants. It would be interesting to examine the search behavior of different user groups, including teachers. Unfortunately, no meaningful analysis of group differences could be made in this dissertation, mainly because there were too few participants in each user group; for example, there was only one school teacher (P22) among the participants.

4.2.5.2 Medium and Domain Differences

Another factor that may affect how searchers judge relevance is the medium and domain of the information objects. The types of available attributes may differ both from one medium (e.g., speech) to another (e.g., text) and from one domain (e.g., oral history interview) to another (e.g., radio news). Thus, searchers may adopt a different set of relevance criteria when selecting speech recordings of oral history. Table 29 combines Tables 1 and 2 and compares them with the findings of this dissertation.

Table 29. Medium and domain differences of relevance criteria

| | Speech: Oral History Interview | Speech: Radio News and Opinion | Text (Journal Article) |
|---|---|---|---|
| Relevance Criteria | Topicality; Novelty of content; Accessibility; Duration; Richness; Emotion; Comprehensibility; Acquaintance; Access to the interviewee | Topicality; Recency; Authority; Listening time; Story type | Topicality; Novelty; Quality; Availability; Accessibility; Recency; Authority; Reading time |

In Table 29, both the medium difference between speech and text and the domain difference between oral history interviews and radio news and opinion were observed in the relevance criteria searchers adopted. Some criteria that were not mentioned in oral history were observed in text (e.g., quality, availability, recency, and authority) and in radio news and opinion (e.g., recency, authority, story type).
More interestingly, there were criteria that were observed only in oral history, such as richness, emotion, comprehensibility, acquaintance, and access to the interviewee. These differences were partly due to the distinct information needs of oral history users. For example, recency was observed both in radio news and opinion and in journal articles but not in oral history interviews. Recency is unlikely to matter to searchers of oral history interviews.

Another factor that affected the medium and domain differences was the previous exposure of participants to the VHF collection. Some criteria that were observed in radio news and opinion and in journal articles could have been mentioned in oral history interviews. For instance, previous studies found authority to be an important criterion in selecting radio program stories (Kim et al., 2003) and journal articles (Barry, 1994; Wang and Soergel, 1998). However, the searchers of oral history interviews did not mention authority, mainly because the VHF collection was new to participants (and to the general public). Thus, it was hard for participants to assess the authority of interviewees or testimonies. Users of oral history interviews might well care about the credibility and the authority of interviewees, but the participants of this study did not explicitly express such concerns.

Finally and most importantly, the medium and domain differences also resulted from the different sets of available attributes (metadata). For example, searchers used certain attributes, such as facial expression, voice, and gesture, to judge emotion. These attributes were available in oral history interviews but not in journal articles (i.e., a medium difference). Voice was an available attribute in radio news and opinion, but it was unlikely that searchers would select a news story based on the emotion speakers expressed in the story (i.e., a domain difference).
Thus, it would be important to understand what metadata searchers would value when selecting oral history interviews, in order to design an effective retrieval system.

4.3 Query Reformulation: INR Moves and Query Reformulation Types

One of the objectives of this dissertation is to characterize the types of strategies searchers adopted when refining their information needs (INR moves) and to explore how they realized each move when reformulating their queries (query reformulation types). Searchers often find a gap between their expectations and the actual set of retrieved documents. This may result from defining their information needs poorly, from using inappropriate queries, or from both. This section discusses how participants refined their information needs and reformulated their queries. The results in this section are based only on participants P11 through P14, since the collection of detailed process data was not possible in Workshop 2.

INR moves refer to the refinement of information needs described by the user. An INR move is coded based on what the user says during searching, as recorded in the think-aloud protocol. The INR description may refer to the thesaurus descriptors. Query reformulation occurs when the user enters or modifies descriptors. Query reformulation types are coded from the screen capture protocols. INR moves are often followed by query reformulations; these associations are reported in Table 32.

Table 30 summarizes the INR moves searchers mentioned during their search. Searchers may need to:
• elaborate different aspects of a topic (clarification),
• narrow the topic (specialization),
• narrow the topic by eliminating a subject (specialization by elimination),
• restrict a search by the characteristics of the interviewee (restriction),
• expand the search (generalization),
• find an alternative query without narrowing or broadening the topic (parallel movement), and
• make a note for future reference (note for later).
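The coding scheme listed above can be written down as a small data structure, which may help when tallying coded think-aloud segments. The Python sketch below is illustrative only: the class names, the `tally` function, and the two sample segments are hypothetical and do not come from the study's data or tooling. It counts every move in a segment separately and does not reproduce the rule the dissertation applies to moves that co-occur with clarification.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class INRMove(Enum):
    """The seven INR moves, with the short glosses from the list above."""
    CLARIFICATION = "elaborate different aspects of a topic"
    SPECIALIZATION = "narrow the topic"
    SPECIALIZATION_BY_ELIMINATION = "narrow the topic by eliminating a subject"
    RESTRICTION = "restrict a search by the characteristics of the interviewee"
    GENERALIZATION = "expand the search"
    PARALLEL_MOVEMENT = "find an alternative query without narrowing or broadening the topic"
    NOTE_FOR_LATER = "make a note for future reference"


class Phase(Enum):
    """Searcher behavior during which a move was observed."""
    RELEVANCE_JUDGMENT = "relevance judgment"
    QUERY_REFORMULATION = "query reformulation"


@dataclass(frozen=True)
class CodedSegment:
    """One coded think-aloud segment (hypothetical record layout)."""
    participant: str
    phase: Phase
    moves: tuple  # one or more INRMove values


def tally(segments):
    """Count observations per (move, phase) pair."""
    counts = Counter()
    for segment in segments:
        for move in segment.moves:
            counts[(move, segment.phase)] += 1
    return counts


# Hypothetical sample, for illustration only.
SAMPLE = [
    CodedSegment("P12", Phase.QUERY_REFORMULATION,
                 (INRMove.CLARIFICATION, INRMove.SPECIALIZATION)),
    CodedSegment("P12", Phase.RELEVANCE_JUDGMENT,
                 (INRMove.CLARIFICATION,)),
]

if __name__ == "__main__":
    for (move, phase), n in tally(SAMPLE).items():
        print(move.name, phase.value, n)
```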
Table 30 shows how often each type of INR move was observed, broken down by the searcher behavior (relevance judgment or query reformulation) in which participants were engaged, followed by the number of participants who used each INR move. The searcher behavior of query reformulation was best captured during the direct observation of search sessions. Thus, observation counts were obtained from the think-aloud during a sample of search sessions in User Workshop 1, as described in Section 3.4.2.

Table 30. Observations of information need refinement (INR) moves
(See Glossary for the definition of each criterion listed in this table)

| INR Moves | All (N=58) | Think-Aloud: Relevance Judgment (N=28) | Think-Aloud: Query Reform. (N=30) | Number of Participants |
|---|---|---|---|---|
| Clarification, alone | 16 (28%) | 13 | 3 | 3 |
| Specialization | 25 (43%) | 7 | 18 | 4 |
| Specialization: Specialization by elimination | 8 | 6 | 3 | 3 |
| Restriction | 9 (16%) | 3 | 6 | 3 |
| Generalization | 1 (1.7%) | 0 | 1 | 1 |
| Parallel movement | 0 (0.0%) | 0 | 0 | 0 |
| Note for later | 7 (12%) | 5 | 2 | 2 |

Participants used clarification frequently. In many cases, clarification occurred in conjunction with other INR moves, such as specialization, restriction, and note for later. In these cases, the observation was counted only once, under the other INR move that accompanied the clarification. The observation count for clarification alone in Table 30 (16 of 58 observations, 28%) represents only the cases in which clarification occurred without any other INR move. Table 31 shows observations of all clarification (both alone and with other moves) and of other INR moves without clarification.

Table 31. Observations of INR moves: with or without clarification

| INR Moves | All (N=58) | Think-Aloud: Relevance Judgment (N=28) | Think-Aloud: Query Reformulation (N=30) |
|---|---|---|---|
| Clarification, all | 40 (69%) | 21 | 19 |
| Clarification, alone | 16 | 13 | 3 |
| Clarification, with others | 24 | 8 | 16 |
| Clarification, with others: specialization | 11 | 1 | 10 |
| Clarification, with others: spec. by elimination | 8 | 5 | 3 |
| Clarification, with others: restriction | 4 | 2 | 2 |
| Clarification, with others: note for later | 1 | 0 | 1 |
| Other moves without clarification | 18 (31%) | 7 | 11 |
| Without clarification: specialization | 5 | 0 | 5 |
| Without clarification: spec. by elimination | 1 | 1 | 0 |
| Without clarification: restriction | 5 | 1 | 4 |
| Without clarification: generalization | 1 | 0 | 1 |
| Without clarification: note for later | 6 | 5 | 1 |

As indicated in Tables 30 and 31, participants refined their information needs not only during query reformulation but also during relevance judgment. In Table 30, participants used INR moves almost evenly during relevance judgment (28 of 58 observations, 48%) and during query reformulation (30 of 58 observations, 52%). Some INR moves were used notably more frequently during relevance judgment (e.g., clarification alone, note for later), and others during query reformulation (e.g., specialization, restriction). Participants were able to clarify their information needs better while viewing testimonies (and making relevance judgments), since they were learning more about their topic while interacting with the VHF collection and its content. Thus, clarification occurred more often during relevance judgment. As a result of clarification, participants were able to develop a clearer idea of what to include in and what to eliminate from a search result. This explained the many observations of specialization with clarification. In many cases, specialization occurred during query reformulation.

Table 32 expands Table 30 by pairing each INR move with the types of query reformulation that participants used to realize the move. For example, participants may use a narrower term to specialize the previous query. Numbers in the table indicate observation counts for each query reformulation type. Numbers in the last column represent the number of participants who used the corresponding query reformulation type listed in each row.
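To make the distinction among query reformulation types concrete, the following Python sketch labels the change between two successive queries. It is a simplified illustration under stated assumptions, not the coding procedure used in this study: a query is modeled as a set of ANDed descriptors, the function name `classify_reformulation` is hypothetical, and the one-entry `BROADER_OF` mapping stands in for a thesaurus hierarchy and is assumed purely for the example.

```python
# Hypothetical broader-term mapping (narrower descriptor -> broader descriptor).
# The relationship shown is assumed for illustration, not taken from the
# actual thesaurus.
BROADER_OF = {
    "Living conditions in refugee camps": "Refugee experience",
}


def classify_reformulation(old, new, broader_of):
    """Label the change from query `old` to query `new`.

    Both queries are sets of descriptors combined with AND.
    """
    added = new - old
    removed = old - new
    if not added and not removed:
        return "no change"
    if added and not removed:
        return "adding a condition"
    if removed and not added:
        return "removing a condition"
    if len(added) == 1 and len(removed) == 1:
        (new_term,) = added
        (old_term,) = removed
        if broader_of.get(new_term) == old_term:
            return "narrowing a condition (narrower term)"
        if broader_of.get(old_term) == new_term:
            return "broadening a condition (broader term)"
        return "modifying a condition"
    return "other"


if __name__ == "__main__":
    q1 = frozenset({"Foehrenwald"})
    q2 = q1 | {"Living conditions in refugee camps"}
    q3 = frozenset({"Foehrenwald", "Refugee experience"})
    print(classify_reformulation(q1, q2, BROADER_OF))
    print(classify_reformulation(q2, q3, BROADER_OF))
```

Note that replacing a term with a broader one can still leave the query narrower than the query without that condition, which is why a broadened condition can realize a specialization move.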
Table 33 is a complete think-aloud transcript from P12 and shows the patterns in the INR moves P12 adopted, the sources of each adopted INR move, and the query reformulation types (see Appendix J for transcripts from all participants).

Table 32. Observations of INR moves and subsequent query reformulation types

| INR Moves | Query Reformulation Types | Number of Observations **: All | Think-Aloud: Relevance Judgment | Think-Aloud: Query Reformulation | Number of Participants |
|---|---|---|---|---|---|
| Clarification, alone | Adding a condition | 1 | 1 | 0 | 1 |
| Clarification, alone | Narrowing a condition, narrower term | 2 | 0 | 2 | 1 |
| Clarification, alone | New term | 2 | 1 | 1 | 2 |
| Specialization | Adding a condition | 4 | 0 | 4 | 3 |
| Specialization | Narrowing a condition, narrower term | 4 | 0 | 4 | 3 |
| Specialization | Broadening a condition, broader term * | 1 | 0 | 1 | 1 |
| Specialization | Modifying a condition, related term * | 1 | 0 | 1 | 1 |
| Specialization by elimination | Narrowing a condition, removing ORed terms | 1 | 0 | 1 | 1 |
| Restriction | Adding a condition | 2 | 0 | 2 | 2 |
| Restriction | Adding a language restriction | 1 | 0 | 1 | 1 |
| Generalization | Removing a condition | 1 | 0 | 1 | 1 |
| Generalization | Removing a language restriction | 1 | 0 | 1 | 1 |

* Searchers may broaden or modify a condition while still leaving the query narrower than the query without the condition.
** Not all INR moves in Table 30 resulted in reformulating queries. As a result, the number of observations in Table 32 does not match the number of observations in Table 30.

Table 33. Think-aloud transcript from P12

Columns: Compromised Information Need (Q3) | INR Move & Source | Query (Q4) | QRT * | Result

Q3(1)
P12: And I know that this camp can be spelled a few different ways so. Can I do German letters?
INT: No. This, I have to say I'm surprised it's not finding your search that way. I know that the preferred term is just spelled f-o-h.
P12: F-o-h?
INT: F-o-h, without the e. It's actually with the umlaut.
P12: With the umlaut, that's right, that's why I put an e in there. That's why I was asking will this take into account different spellings, like with the Holocaust Museum? I mean I've seen it spelled f-e-r-n, like Fernfaltd. You know crazy stuff.
INT: Right. Yeah, Foehrenwald.
I know exactly what you mean. I'm just not sure that we have the umlaut.
P12: You do. Well I mean in here.
INT: Yeah, I know it's here. I don't know that we can make it on the keyboard. That's all I meant.
INR Move & Source: Clarification (PK)
Query (Q4): Q4(1) Foehrenwald (limited by language English); Q4(2) Föhrenwald
QRT: Spelling variation
Result: No match; 177 matches

Q3(2)
P12: Okay. Well she mentioned that the Americans took them to Foehrenwald and I hope that she talks more about that because, the Americans to be taking her, I'd like to hear about that process. Why Foehrenwald? Who exactly took her? Where was she? I mean she was coming from Bergen-Belsen, but how did Bergen-Belsen, Bergen-Belsen was in the British, so how did that happen that the Americans got a hold of her and took her to Foehrenwald? Okay so to go backwards.
(Refined during relevance judgment)
INR Move & Source: Clarification, alone (VW)

Q3(3)
P12: Okay for right now I'm more interested in the Jewish experience because I do have a lot from external agencies and I really would like to hear more from survivors.
INT (185): Okay. So there are keywords that will help you with that, one of which is living conditions in the refugee camps. And if we look in the refugee life container, you will see other terms there that may be better suited to your search.
P12: Okay. Then that's what I'll do. I'll start over at advanced.
INR Move & Source: Clarification (PK); Specialization (INT)
Query (Q4): Q4(3) Föhrenwald AND Living conditions in refugee camps; Q4(4) Föhrenwald AND Refugee experience
QRT: Adding a condition; Broadening a condition
Result: No match; 69 matches

Q3(4)
P12: Okay. Why did I say wow? Because the first entry is abortions in the refugee camps, which is something that I've been very interested in gleaning more information about this when I've conducted my oral histories.
And some women have been very forthcoming and others have said no it never happened, whereas I know that it happened quite frequently so it's interesting that Dafka, of all things that this is the first one. In the Joint, this is interesting, anti-refugee experiences. Barter in the refugee camps, to me that's black market stuff which is a huge issue in my research or bribery, brutal treatment in the refugee camps. Very crucial to what I'm doing. Childcare, children in the camps, clothing. Now I'm just concerned that refugee camps aren't exactly the same as displaced person camps.
INR Move & Source: Specialization (TH, PK); Clarification (TH)
Query (Q4): Q4(5) Föhrenwald AND Abortions in the refuge camps
QRT: Narrowing a condition (Narrower term)

Q3(5)
P12: It's been, it's very difficult. I mean not every one, but every other one are ones that I would like to add to my list. The social relations, I'm looking at things specifically regarding social life or about experiences that haven't really been written or talked about in detail, such as abortion or sexual activities or rape or sexual molestation. Some of the more personal experiences. Living conditions.
OB: Can you give why you were.
P12: These form some of the basic questions that I have. What were the living conditions like? How did they begin to rehabilitate or how was, or the instances where rehabilitation was not possible? Orphanages and children's homes. I'm very interested in not only the issues of gender, but also the issues of age.
INR Move & Source: Clarification (TH); Specialization (PK)

Q3(6)
P12: Okay. Things that have to do specifically with gender or age issues I'm going to click on, like menstruation, malnutrition in the camps also address the living conditions. As does living. This is going to be very difficult to. Oh justice and law enforcement. And kids.
INR Move & Source: Clarification (TH); Specification (TH)

Q3(7)
P12: Right. Interaction with family members would be very important.
Injuries would talk about hospital care, living conditions. The three main issues that I'm looking at are gender, age, and health conditions because that had so much to do with who was picked or prioritized for immigration and who wasn't.
INR move & source: Clarification (TH); Specialization (PK, TH)
Query (Q4): Q4(6) Föhrenwald AND (Rape and sexual molestation OR Means of adaptation & survival OR Killings in the refugee camps OR Relationships between refugees and local populations OR Social relations OR Justice and law enforcement OR Menstruation, malnutrition in the camps OR Housing conditions OR Food in refugee camps OR Disease OR Deaths OR Education in the refugee camps OR Customs and observation)
QRT: Adding a condition (ORed terms)
Result: Missing

Q3(8) P12: Of cultural, oh, of customs and observances. Because I assume that will address issues such as faith, religion, the continuity of Judaism as a religion, being of different beliefs. I'm very interested in children in the camps. Bloomsberry House, I don't know what that is so I'm going to look. Okay, that doesn't have anything to do with my research. Association of Jewish refugees also isn't. Anti-refugee experiences would be helpful because there were Polish Pogroms that forced a lot of Jews to flee from the east back into Germany. And aid. And obviously abortions are a gender issue. Okay so now I have to figure out how to minimize this list, which is really difficult.
INR move & source: Clarification (TH); Elimination (PK, TH)

Q3(9) INT: So just in looking at your list you probably don't need my point of view on this but it looks like you have really two different kinds of questions. One of which looks like it deals with a lot of social aspects of life in the refugee camps and others that deal with more, what's the word that I'm looking for.
Well, I guess they really all deal with social aspects, some of them deal more with human interaction than others. P12: Right, right. INT: So maybe that's a way to break it down. The terms that deal with human interactions versus the terms that deal with conditions. P12: Okay. Well I'm going to go, both of them are important in that I want to get back, I want to get back to the one that I don't first concentrate on, but my first interest is in the human relationships and interpersonal relationships. So I'll try and delete. Now can I, the ones instead of just deleting, can I put them somewhere else? In another file to save or should I just delete them and then I'll go back and do all the ones that have conditions. INT: I think you would have to delete them from this list right now and then make a new search within project. P12: Okay. Which ones am I deleting? I'm trying to focus it more on interpersonal relationships or the human condition, so I'm trying to take out ones that are a little bit more general. Even though a lot of these could probably be taken out. Justice and law enforcement, I'll delete and save until later. P12: Housing conditions because that's more general, a general theme, it's not as human I guess. Take out food for the same reason. Epidemics, education, diseases, deaths, aid. Now is this too many to form a search? INT: I don't know. I think it's worth a shot. I would take out Foehrenwald at this point because you've already divined that set, so it's not going to look for those all over again. P12: Okay. I think we'll include all. Wow, I actually got three hits. Three very specific ones. Only one female, interesting.
INR move & source: Clarification (PK); Specialization (TH); Note for later (TH); Elimination (TH); Clarification (PK, TH); Elimination (TH)
Query (Q4): Q4(7) Deleted the following from Q4(6): Justice and law enforcement; Housing conditions; Food; Epidemics; Education; Diseases; Aid
QRT: Removing ORed terms
Result: 3 matches (1 female)

Q3(10) P12: This is really amazing because this is something that no one has talked about in my reading of the material, of documents and archives or interviewing people, no one has ever mentioned this suicide. INT: Suicide after liberation? Does there appear to be a keyword in the segment dealing with that. P12: No, not in the segment. It's not one of the keywords. I think it should be. I mean I really wouldn't have thought to check for that. This is very, very important for me. Because it's given me a completely new issue to try and find out more material on. INT: I would be interested to hear what he's talking about because we do have terms regarding suicide and it may be that it's not. Well obviously it's not discussing his own suicide. P12: Well he's saying when he was in Feldafing, there weren't enough supervisors. He's talking about the kids who survived the concentration camps, there weren't enough people to care for them, watch out for them. He's saying if somebody couldn't take care of themselves, couldn't adapt or fend for themselves, they were in really big trouble and that's why there were a lot of suicides. And this is a brand new thing I'm hearing. INT: I see. So he's discussing the causes of suicide as opposed to. P12: Well the causes and also just the event of it.
INR move & source: (Refined during relevance judgment) Clarification, alone (VW)
Query (Q4): Q4(8) (from Q4(1)) Foehrenwald AND Suicide
QRT: Adding a condition
Result: No match

Q3(11) P12 (397): Right and then I felt really bad for saying for saying that.
Because I originally had wanted to look at her segment on the JDC but then she's talking about immigrating to Canada so and that come much further down the list so I realized this is when she got help after she was already here in North America which does not pertain to my research. I wanted to know about her experiences with the JDC in the DP camps.
INR move & source: (Refined during relevance judgment) Elimination (VW); Clarification (VW)

Q3(12) P12: Something he just said is very interesting and I haven't really, I mean I've thought about it but I need to think about it some more and the whole issue of identity and for the people that, he's talking about his friend of his who even though Poland had been liberated by the Russians, his friend was not yet able to acknowledge that he was Jewish. He had taken on the identity of being a Christian for so long and so deeply that he just, his Jewishness, his Jewish identity was gone and it's just given me really something to think about, looking at identity, post-war identity and what the process might have been like or how difficult of moving from this Christian identity to Jewish identity. I haven't quite solidified my thoughts on that but just something that he said kind of delineated it for me. OB: It might be relevant to your topic? P12: Yes, very relevant. This just made me think about it in a different way. Listens to tape. I don't understand why he just stopped.
INR move & source: (Refined during relevance judgment) Clarification, alone (VW)

* QRT: Query reformulation type

4.3.1 Clarification

Clarification refers to the cognitive process whereby searchers refine their information needs by further elaborating on a search topic and by developing detailed aspects of the topic. During the iterative process of query formulation and reformulation, participants often needed to clarify their needs to move to the next step,
i.e., another search using a reformulated query that better represents their information needs. As discussed above, clarification often prepared participants for another INR move, especially specialization (11 of 24 observations of clarification were associated with other INR moves in Table 31). For example, P12 elaborated an aspect of P12's topic (i.e., interpersonal relationships or the human condition) by further clarifying the topic, which later led P12 to eliminate a search term (i.e., justice and law enforcement) from an OR combination in the previous query, as remarked in the following quote:

"... I am trying to focus it more on interpersonal relationships or the human condition. So, I am trying to take out ones that are a little bit more general. Even though a lot of these could probably be taken out, justice and law enforcement... I will delete it."

From time to time, clarification occurred alone, without any other INR move immediately following. In most of these cases, no immediate action was taken to modify the previous search query. For example, P14 elaborated several different aspects of P14's topic while watching a testimony and later used them as a basis for reformulating search queries. This explains why participants used clarification alone remarkably more often during relevance judgment (13 of 16 observations; 81%) than during query reformulation (3 of 16 observations; 19%) in Table 30.

Of the 16 observations of clarification alone, only 5 (31%) were immediately followed by a new search, as indicated in Table 32. This reaffirmed that clarification alone occurred, in many cases, to further elaborate different aspects of a topic rather than to take an immediate action. In Table 32, the types of query reformulation that participants used for realizing clarification were adding a condition, narrowing a condition (using a narrower term), or new term. For example, P12 stated: "...
This is really amazing because this is something that no one has talked about in my reading of the material, of documents and archives or interviewing people, no one has ever mentioned this suicide."

As a result of the above clarification, P12 added a condition (Föhrenwald AND suicide) to the previous query P12 used (Föhrenwald). Clarification alone could have been associated with any of the other types of query reformulation in Table 32. However, no such incident was observed during User Workshop 1.

4.3.2 Specialization

Specialization indicates the process whereby searchers narrow their search by specifying their information needs. Searchers may specialize their topic by place, event/experience, organization/group, time, and/or subject/theme. For instance, P13 specialized P13's topic by an event (November Pogrom, in this case) in the following statement:

"... His flight attempt is very interesting to me, how people made the choice to leave Germany. In particular what sent... because just of the November Pogrom when people decided to leave."

Specialization was the most frequently used INR move (Table 30). Having too many retrieved documents is a common problem in Boolean retrieval systems. Thus, specialization was used to reduce the size of a search result, which comported with the findings of previous studies (Bates, 1990; Chen and Dhar, 1990; Fidel, 1991).

As mentioned in the previous section, specialization often occurred with clarification to reflect the refined topic in the query. Of the 25 observations of specialization in Table 30, nineteen (76%, including specialization by elimination) occurred with clarification, while six (24%) were used without clarification. Three of the 6 (50%) specializations without clarification resulted from getting too many results and, thus, were followed by an immediate new search, as remarked by P11:

"... It's 59 testimonies to look at. So you can either look at each individual testimony or we could try to put an additional search.
I mean an additional term into this search on the cultural and social activities. My hunch is that it's probably going to minimize it tremendously."

This implied that specialization was more likely than clarification alone to result in an immediate query reformulation. Unlike clarification alone, specialization was used more often during query reformulation (18 of 25 observations; 72%) than during relevance judgment (7 of 25 observations; 28%). Of those 25 observations, ten (43%) were immediately followed by a new search in Table 32.

Participants achieved specialization by adding a condition or narrowing a condition (using a narrower term). For example, the initial query P13 used was Saint Louis. P13 later added a condition using AND (Saint Louis AND decision regarding flight) to the initial query, in order to specialize it. In order to achieve specialization, searchers might broaden or modify a condition after adding it, while still leaving the query narrower than the query without the condition. In the same example above, P13 got no match, and then modified the added condition using a related term (Saint Louis AND flight preparation). In one case, P12 first added a condition (Föhrenwald AND living conditions in refugee camps), got no result (or too few), and subsequently used a broader term (Föhrenwald AND refugee experience) without any further indication of another INR move other than specialization.

Specialization by elimination is a special case of specialization: searchers can narrow a topic by eliminating a concept from an OR combination that represents a certain aspect of a topic. For instance, P12 specialized P12's topic by eliminating a concept (justice and law enforcement) in the following statement:

"... I am trying to focus more on interpersonal relationships or the human condition, so I'm trying to take out ones that are a little bit more general.
Even though a lot of these could probably be taken out, justice and law enforcement. I will delete it."

Three participants (P12, P13, and P14) used specialization by elimination, removing ORed terms from the previous query. Participants were learning more about the VHF collection and its content while examining retrieved testimonies or passages. At the beginning of their search, participants were able to state mainly what they wanted to find. As they continued their search, they began to mention that some aspects of the subjects discussed in a testimony (or a passage) were not relevant to their topic.

4.3.3 Restriction

Restriction is defined as narrowing a search by the characteristics of an interviewee, not by a subject. Thus, restriction differs from specialization, although both are used with an intention to reduce the size of a search result. Searchers may restrict their search by age, gender, occupation, experience, education, language, place of birth, and other characteristics of interviewees. For example, P13 wanted to restrict a search by the experience of interviewees, as stated in the following quote:

"... Another thing that I would be interested in is, I see that you have to experience, your categories by true survivors. But I would also be interested, for example, now I have these 129 testimonies or records, and I would love to know of 129 how many were in concentration camps before they went on the Saint Louis and look specifically on the say 60 person who have been in concentration camps before they departed."

Restriction was observed more often during query reformulation (6 of 9 observations, 67%) than during relevance judgment (3 of 9 observations, 33%) in Table 30. Of those 9 cases, five (55%) were used without clarification, which was somewhat different from specialization.
In most cases, participants wanted to restrict their search by age, gender, occupation, experience, or language of interviewees, which required little clarification. Of the remaining four cases, three were used with clarification, and another with note for later. Three participants (P11, P13, and P14) adopted restriction during their search and realized it by adding a condition or by adding a language restriction.

4.3.4 Generalization

Generalization, which is the opposite move to specialization, refers to the process whereby searchers expand their search by broadening a topic; it is used with an intention to increase the size of a search result. There was only one incident in which a participant (P14) tried to generalize the previous query due to the small size of the search result (no match, in this case), as indicated in the following statement:

"... Well, at least not any in English? That is surprising. So then the way I would approach this is to get rid of the living conditions portion, and to look at just the resistance and forced labor battalions."

In this particular incident, P14 realized generalization by removing a condition (living conditions) from the previous query and at the same time by removing a language restriction (English). This explains the observation count of 2 for generalization in Table 31, although it was observed only once in Table 30. Generalization could have been achieved by using other query reformulation types, like broadening a condition using a broader term or adding terms with OR. However, no such incident was observed during User Workshop 1.

This observation was interesting in that participants received no match from the system in many cases. Logically, the next move for participants to take might be to generalize the query to increase the number of returned documents. Instead, participants adopted other INR moves (clarification, in many cases) rather than generalization.
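The query reformulation types discussed in Sections 4.3.1 through 4.3.4 (adding, removing, narrowing, or broadening a condition, and removing ORed terms) can be pictured as operations on a Boolean query. The following is a minimal sketch for illustration only, not a description of the VHF system: the `Query` class and its method names are hypothetical, and the three terms in the generalization example paraphrase P14's quote rather than reproduce actual thesaurus terms.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Query:
    """Hypothetical Boolean query: conditions are ANDed, terms in one condition are ORed."""

    conditions: Tuple[FrozenSet[str], ...]
    language: Optional[str] = None  # optional language restriction

    def add_condition(self, *terms: str) -> "Query":
        """Adding a condition: AND one more group of ORed terms onto the query."""
        return replace(self, conditions=self.conditions + (frozenset(terms),))

    def remove_condition(self, term: str) -> "Query":
        """Removing a condition: drop each ANDed group that contains `term`."""
        kept = tuple(c for c in self.conditions if term not in c)
        return replace(self, conditions=kept)

    def swap_term(self, old: str, new: str) -> "Query":
        """Narrowing, broadening, or modifying a condition by substituting a term."""
        swapped = tuple(
            frozenset(new if t == old else t for t in c) for c in self.conditions
        )
        return replace(self, conditions=swapped)

    def remove_or_terms(self, *terms: str) -> "Query":
        """Removing ORed terms; a group left empty is dropped altogether."""
        trimmed = (c - frozenset(terms) for c in self.conditions)
        return replace(self, conditions=tuple(c for c in trimmed if c))

    def __str__(self) -> str:
        groups = []
        for c in self.conditions:
            joined = " OR ".join(sorted(c))
            groups.append(f"({joined})" if len(c) > 1 else joined)
        text = " AND ".join(groups)
        return f"{text} [language: {self.language}]" if self.language else text


# Specialization (P12): add a condition, then broaden it after getting no match.
q = Query((frozenset({"Föhrenwald"}),))
q = q.add_condition("living conditions in refugee camps")
q = q.swap_term("living conditions in refugee camps", "refugee experience")
print(q)

# Generalization (P14): remove a condition and lift the language restriction.
# The terms below paraphrase P14's quote; the exact thesaurus terms are assumed.
g = Query(
    (
        frozenset({"resistance"}),
        frozenset({"forced labor battalions"}),
        frozenset({"living conditions"}),
    ),
    language="English",
)
g = replace(g.remove_condition("living conditions"), language=None)
print(g)
```

Modeling each condition as a set of ORed terms keeps specialization by elimination (`remove_or_terms`) distinct from removing a whole ANDed condition (`remove_condition`), which mirrors the distinction drawn between the two moves in the text.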
One explanation for the rarity of generalization could be participants' unfamiliarity with the VHF thesaurus. Due to their unfamiliarity, participants often failed to come up with a search term that was present in the VHF thesaurus. In most cases, finding the right term was what led participants to a solution, and the system returned a good number of testimonies (or passages) once the term was entered. This strongly affected the cognitive process of participants when reformulating queries. For this reason, generalization occurred rarely, even though there may have been cases in which it would have been useful. This implies that experienced searchers might behave differently.

4.3.5 Parallel Movement

Parallel movement refers to a case in which searchers refine their information needs without making them broader or narrower. Participants could have achieved parallel movement mainly by using related terms, but no such case was observed during User Workshop 2. There were a couple of cases in which participants used a spelling variation and a synonym immediately following the previous query. This implied that such query reformulation types as spelling variation and synonym might be valuable for searchers, despite the small number of observations. A few cases were observed where participants searched for an interviewee using a different name. These cases could have been considered a parallel movement but were excluded from observation counts, since no INR move was involved in such cases.

4.3.6 Note for Later

Note for later refers to the case in which searchers indicate that certain aspects of a topic or the characteristics of an interviewee are potentially useful at a later stage of their research but not at the current stage.
P14, who was at the beginning stage of P14's research, mentioned that gender might become important at a later stage of P14's research, as remarked in the following statement:

"... Initially, gender isn't important, but as the analysis becomes more sophisticated and complicated, it would be very important to separate the two."

Previous studies that examined successive searches have found that searchers adopt different relevance criteria at different stages of their research (Bateman, 1998; Wang and White, 1999). Two participants (P12 and P14) accounted for seven incidents of note for later, which comprised 12% of all observations of INR moves. This implied that participants might behave differently at a later stage of their research, if successive searches were to be done. Providing a tool for note taking during a search could be useful.

4.4 Query Reformulation: Sources of Information for INR Moves and Intermediary Effect

Section 4.3 discussed each INR move and its associated query reformulation types that participants used during their search. This section discusses the sources that provided participants with a clue for making an INR move (Section 4.4.1) and the effect of having an intermediary, one of the sources of INR moves, on query reformulation (Section 4.4.2).

4.4.1 Sources of Information for INR Moves

During query formulation and reformulation, participants looked for information that could provide clues for refining their information needs. Typically, searchers in document retrieval systems might get clues from such sources as the title, keywords, abstract, and the full text of a document (if available). Participants used several different sources when making each INR move discussed in Section 4.3:
• the previous knowledge of participants on the topic (PK),
• the intermediary (INT),
• pre-interview (pre-testimony) questionnaire (PIQ),
• viewing a testimony or a passage (VW),
• the VHF thesaurus (TH),
• assigned descriptors to a testimony or to a segment (DS), and
• the number of items retrieved (NIR).

Table 34 summarizes the types of INR sources listed above. Numbers in the table represent the frequency of observations of each source, both identified by the searcher behavior (RJ for relevance judgment and QR for query formulation and reformulation) and mapped to the corresponding INR move. Participants often used multiple sources of information for an INR move. In these cases, an observation was counted for each source participants used for the INR move.

As shown in Table 34, participants most often depended on their previous knowledge (26 of 85 observations, 31%) about their topics when making INR moves. All four participants had a comprehensive understanding of their topics, since they were engaged in serious research on the topic they developed. Thus, participants were able to refine their information needs by referring to what they already knew about their topics. This corresponds with the findings of previous studies that examined the effect of previous knowledge on relevance judgment (Barry, 1994; Mizzaro, 1997) and of domain knowledge on query formulation and reformulation (Efthimiadis, 1996; Wildemuth, 2003; Zhang et al., 2005). Considering the fact that many INR moves occurred during relevance judgment in Table 30, the findings of the previous studies on relevance judgment might be directly applicable to help explain the many uses of previous knowledge by participants.

Table 34.
Sources of information for INR moves for query formulation and reformulation

Sources of INR Moves: PK, INT, PIQ, TH, DS, VW, NIR (RJ and QR columns under each source)
INR Moves:
Clarification, alone: 3 1 2 2 11 1
Specialization: 3 12 1 6 1 10 5 2
Restriction: 3 2 3 1 1 3
Generalization: 1 0
Note for later: 2 2 1 1 1 2 2
Total: 8 18 2 11 5 11 10 16 1 3
All (85): PK 26 (31%); INT 13 (15%); PIQ 5 (5.8%); TH 11 (13%); DS 10 (12%); VW 17 (20%); NIR 3 (3.5%)

RJ: Observed during relevance judgment
QR: Observed during query formulation and reformulation
PK: Previous knowledge
INT: Intermediary
PIQ: Pre-interview (pre-testimony) questionnaire
TH: Thesaurus
DS: Assigned descriptor
VW: Viewing
NIR: Number of items retrieved

Participants often refined their information needs while viewing a testimony or a passage (17 of 85 observations, 20%). This implied that some of the information that affected the query formulation and reformulation of participants was not available from the VHF system until viewing the tape, as affirmed by the following dialog between P12 and the intermediary:

P12: "... This is really amazing because this is something that no one has talked about in my reading of the material, of documents and archives or interviewing people. No one has ever mentioned this suicide."
Intermediary: "Suicide after liberation? Does it appear to be a keyword in the segment dealing with that?"
P12: "No, not in the segment. It's not one of the keywords. I think it should be. I mean I really wouldn't have thought to check for that. This is very very important for me. Because it's given me a completely new issue to try and find out more material on."

A similar trend was found for the searcher behavior of relevance judgment, i.e., participants had little means to infer some of the relevance criteria they adopted, such as richness (Section 4.1.3) and emotion (Section 4.1.4), other than watching the tape.
A recent study (Kim et al., 2003) observed that searchers of stories of radio programs often needed to listen to a story when making relevance judgments. These findings suggested that insufficient indexing may be a common problem with current speech retrieval systems.

The intermediary who assisted with formulating and reformulating queries was another source of INR moves that was frequently used (13 of 85 observations, 15%) by participants. Due to their lack of familiarity with the VHF thesaurus, participants often depended on the intermediary to find the correct search term in the thesaurus. The intermediary often brought in their knowledge of the collection and the thesaurus, which implied that indirect use of the thesaurus occurred during the interaction with the intermediary. This effect of the intermediary on query formulation and reformulation will be discussed in the next section in greater detail.

In addition to the above-mentioned sources, participants used the VHF thesaurus (11 of 85 observations, 13%) and assigned descriptors (10 of 85 observations, 12%) when making INR moves. While browsing the VHF thesaurus, participants from time to time found other related subject terms that they had not thought of before. As a result, participants adopted one or more INR moves that could explain the selection of subject terms found in the thesaurus. A similar behavior was observed when participants examined the descriptors assigned to a retrieved testimony or a segment.

Other sources participants used for making INR moves were pre-interview questionnaires (5 of 85 observations, 5.8%) and the number of items retrieved (3 of 85 observations, 3.5%). Participants often examined the biographical information of interviewees that was available from the pre-interview (pre-testimony) questionnaire (PIQ) and sometimes referred to the PIQ to make an INR move, like clarification, restriction, and note for later.
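The shares quoted above for each source can be recomputed from the "All (85)" row of Table 34. The snippet below is a small arithmetic check rather than part of the study's method; the dictionary and function names are illustrative, and the counts are copied from the table.

```python
# Observation counts per source, copied from the "All (85)" row of Table 34.
SOURCE_COUNTS = {"PK": 26, "INT": 13, "PIQ": 5, "TH": 11, "DS": 10, "VW": 17, "NIR": 3}


def source_shares(counts):
    """Return each source's share of all observations, in percent."""
    total = sum(counts.values())
    return {source: 100 * n / total for source, n in counts.items()}


total = sum(SOURCE_COUNTS.values())
shares = source_shares(SOURCE_COUNTS)

# List the sources from most to least frequently used.
for source, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{source:>3}: {SOURCE_COUNTS[source]:>2} of {total} ({share:.1f}%)")
```

The counts sum to 85, matching the table. Rounded to one decimal place, the PIQ share comes out at 5.9% where the table reports 5.8%, which suggests the reported figure was truncated rather than rounded.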
In three incidents, participants made an INR move (restriction, in these cases) immediately after looking at the number of items retrieved.

The findings discussed in this section could be useful for system designers, especially in the context of indexing. For example, the PIQ was used as a source for making INR moves only during relevance judgment (Table 34). Of the 5 observations of PIQ, three were associated with restriction. This implied that it would be useful for searchers to have a feature that could restrict their search using certain biographical information about interviewees. The current VHF system offered such a feature, but only for gender and language.

4.4.2 Intermediary Effect

Due to participants' inexperience with the system, assistance from an intermediary was available during both user workshops. As discussed in Section 3.3.2, all intermediaries (one intermediary in Workshop 1 and two intermediaries in Workshop 2) were former catalogers who were familiar with the collection, the thesaurus, and how descriptors from the thesaurus were applied in indexing. They had also worked for several years answering mail requests, but they were not trained in reference interview techniques.

Participants found it difficult to find correct search terms without any knowledge of how the VHF collection was indexed. Thus, the presence of an intermediary was appreciated by all participants. Although the role of the intermediary was limited to assisting participants with finding the correct search terms from the VHF thesaurus, it was often inevitable for the intermediary to interact with participants during query formulation and reformulation. As a result, the intermediary contributed a great deal to the cognitive process of query formulation and reformulation of participants, as indicated in the following dialog between the intermediary and P12:

P12: "...
Okay for right now, I'm more interested in the Jewish experience because I do have a lot from external agencies and I really would like to hear more from survivors."
Intermediary: "Okay. So, there are keywords that will help you with that, one of which is living conditions in the refugee camps. And if we look in the refugee life container, you will see other terms there that may be better suited to your search."
P12: "Okay, then that's what I'll do."

P12 added a condition (living conditions in refugee camps) to the previous query, immediately following the dialog above. This effect of the intermediary was observed throughout the search, especially during query formulation and reformulation; the intermediary effect was minimal during relevance judgments, since the intermediary and participants were both strictly instructed not to interact with each other while examining retrieved testimonies or passages.

As discussed in Section 4.4.1, clarification was one of the most frequently used INR moves and, in many cases, was not followed by an immediate query reformulation. This might have resulted partly from having the intermediary present during the search. Participants often needed to explain their topics in greater detail while interacting with the intermediary.

After the initial search with the intermediary, each participant conducted another search session without the intermediary, which was observed by the investigator. During the independent search, participants mostly focused on examining and viewing the testimonies they had found in the initial search. Moreover, it was observed that the effect of the intermediary still remained while users searched independently.
Several studies have found that the domain and thesaurus knowledge of searchers (Efthimiadis, 1996; Wildemuth, 2003; Zhang et al., 2005) and intermediaries (Belkin et al., 1983; Ingwersen, 1984; Efthimiadis, 1990) affects the cognitive process of human query formulation and reformulation. It was observed that the thesaurus knowledge of the intermediary was effectively merging with the domain knowledge of participants during the interaction between them, which comported with the findings of previous studies. It would have been interesting to examine this merging process between the intermediary with thesaurus knowledge and participants (searchers) with domain knowledge. It was, however, well beyond the topic this dissertation intended to cover and remains a topic for future studies.

Chapter 5: Conclusions and Implications

This dissertation examines the cognitive processes underlying human relevance judgment and query reformulation by searchers of oral history interviews. Designing an access system that can effectively search spoken word content is challenging, due to gaps in our understanding of searcher behavior. A few studies have examined this problem (Hirschberg and Whittaker, 1997; Kim et al., 2003), but much of the searcher behavior in speech retrieval systems still remains unknown. The focus of this dissertation is to explore how searchers interact with speech retrieval systems to specify their information needs, select recordings or passages, and reformulate their queries. The dissertation is focused on finding patterns and trends by examining:
1.1 what relevance criteria searchers applied when selecting/discarding a recording or a passage,
1.2 what attributes searchers used when judging each relevance criterion,
2.1 what information need refinement (INR) moves searchers adopted for query reformulation, and
2.2 what query reformulation types searchers used in achieving each INR move.
Using qualitative research methods made it possible both to gather rich descriptive data and to provide sufficient evidence to discover patterns and trends in the searcher behavior being examined. The data collection and analysis were grounded in the conceptual framework shown in Figures 6 and 7.

This chapter revisits and discusses the major findings presented in Chapter 4 and draws implications for the issues of indexing and metadata assignment, support for search and browsing, and task-oriented retrieval system and interface design from the reported findings and discussions. It then discusses limitations and concludes with ideas for future work.

5.1 Overview of Findings and Discussion

This dissertation makes the following three main contributions, as discussed in Chapter 1. The first contribution is related to the collection examined, namely oral history interviews. The types of relevance criteria and available speech attributes may differ from those of other media (e.g., text) and domains (e.g., news, voicemail, lecture). Participants demonstrated these differences during their search, as discussed in Section 4.2.4.2. The findings explored the relevance criteria and the associated attributes searchers used when seeking oral history interviews.

The second contribution is a conceptual framework that integrates relevance judgment and query reformulation, as shown in Figures 6 and 7. This framework assumes that the cognitive processes underlying relevance judgment and query reformulation occur interactively during a search. Some of the findings discussed in Chapter 4 support this assumption by elaborating patterns whereby participants often refine their information needs during relevance judgments and reveal their selection criteria during query formulation and reformulation. This dissertation studied the two searcher behaviors together during a search rather than focusing on either one alone.

The last contribution deals with system and interface design.
It is important to provide proper tools for different tasks to support effective search and browsing. Observations were made throughout the search, from the initial query formulation to examination, selection, and query reformulation. The findings have implications for designing a future retrieval system for oral history interviews.

With these contributions in mind, the following sections summarize and discuss major findings and implications.

5.1.1 Relevance Judgment

This study adopted the user-oriented view of relevance, i.e., that relevance is dynamic and situational (Wilson, 1973; Schamber et al., 1990; Wang and Soergel, 1998). It defines relevance as the degree to which a speech recording (or a passage) meets a searcher's need. Although topicality was the main criterion for selection, searchers did use other criteria, beyond topicality, that could be specific to their individual preferences and requirements.

The findings correspond with the user-oriented view of relevance. Participants used various relevance criteria and often used one or more criteria in conjunction with topicality. Participants were concerned with utility (Saracevic, 1975), pertinence (Foskett, 1972), satisfaction (Bookstein, 1979), and/or situation (Wilson, 1973), as well as topical relevance, when selecting recordings or passages.

Some of the findings on relevance criteria were consistent with the findings of previous studies. Participants used topicality most frequently, as expected (Park, 1993; Schamber, 1994; Wang and Soergel, 1998; Tang and Solomon, 2001; Kim et al., 2003). Accessibility, richness, and comprehensibility (Wang, 1994; Bary, 1998; Bateman, 1998) were considered to be important in that they often served as filters after topical relevance. Duration (Kim et al., 2003) affected the decision-making process of searchers, especially for K-12 teachers. Novelty of content, often referred to as newness in previous studies (Tang and Solomon, 2001; Kim et al., 2003), was also important.
On the other hand, some relevance criteria found here were seldom mentioned by previous studies, in particular emotion, acquaintance, and access to the interviewee. This reflects some of the unique characteristics of an oral history collection. For example, most testimonies contained emotional expressions as interviewees discussed events they experienced during the Holocaust. These emotional expressions often added tremendous value to the spoken content. Participants, especially K-12 teachers, considered emotion to be an important criterion when making their relevance judgments. Acquaintance with the interviewee and access to the interviewee were considered to be important regardless of the topical relevance of the testimony being examined.

Not surprisingly, one of the major differences between the findings of this study and the findings of previous studies was the type of attributes. Person, place, and event/experience were the three attributes participants used most frequently to judge the topical relevance of a recording or a passage, which is consistent with the general interest of historians (Tibbo, 2002). In comparison, for voicemail, the most important attributes were found to be sender name, sender telephone number, and important dates/times (Hirschberg and Whittaker, 1997). Differences in the types of attributes were also observed for other criteria. Among these attributes, some originated directly from the content, while others did not (as shown in Table 19). Often these non-content-originated attributes (e.g., amount of information, cache, digitization) were the primary concern of participants.

Several distinct usage patterns of attributes were observed. Sometimes participants used one attribute (e.g., still image of the interviewee) as a proxy to infer another attribute that was not available (e.g., age/date of birth). Another example is using religion (Hasidism) as a proxy for social status (poor).
This implied that the looked-for attributes (shown in Table 21) were important to participants when making relevance judgments. Another usage pattern was that some attributes (e.g., event/experience) were used mainly to judge the relevance of a passage, and others (e.g., gender) mainly to judge the relevance of a whole interview. Some useful implications for system design can be drawn from these findings, as discussed in Section 5.2.

Some attributes (e.g., date of birth, cache, amount of information, digitization) were found to be important for judging relevance regardless of their availability from the system, as shown in Table 24. These attributes were mentioned even when they were not explicitly available from the system. Amount of information was used only for passage-level access (Table 22). By their nature, date of birth, cache, and digitization refer to whole testimonies and were used only for testimony-level access.

We found individual differences in the use of relevance criteria and attributes among participants with different needs and preferences. This confirms the notion that relevance is dynamic and situational (Wilson, 1973; Schamber et al., 1990; Wang and Soergel, 1998).

Comparison with other media and domains reveals commonalities and differences, as shown in Table 29. Topicality, novelty, and duration are common to oral history interviews, radio news, and journal articles. Quality, recency, and authority are common to radio news and journal articles. Richness, emotion, and comprehensibility are, for the purposes of this study, unique to spoken oral history interviews in many languages. Comprehensibility (with the attributes of interview language and clarity of speech) was not observed as a criterion in radio news because the collection studied was only in English, and clarity of speech was not an issue.
These medium and domain differences were due to the distinct information needs of oral history users (for example, recency is not an issue in oral history) and the types of available attributes (metadata) of oral history interviews.

5.1.2 Information Need Refinement and Query Reformulation

The process of query reformulation involves refining information needs and/or reformulating queries. An INR move was defined as the cognitive process searchers adopt to refine their information needs, and a query transformation type as the means to achieve the INR move. Searchers may refine their information needs during a search and reformulate a query by using a query transformation type.

INR moves adopted as a basis for query reformulation were found to be similar to those observed in previous studies (Bates, 1990; Chen and Dhar, 1990; Fidel, 1991). Clarification was used either alone or together with other INR moves, such as specialization and restriction. All eight participants had a stable topic related to their own interests and, thus, elaborated different aspects of the topic rather than moving to a new topic. Clarification alone had a somewhat similar meaning to what previous studies referred to as pinpoint (Bates, 1990) and moves to increase both precision and recall (Fidel, 1991).

Specialization, the most frequently adopted INR move, was used to reduce the size of a search result and corresponds to specification, specify, exhaust, reduce, and moves to reduce the size of a set in Table 3 (Bates, 1990; Fidel, 1991; Lau and Horvitz, 1999). Specialization by elimination was observed as a special case of specialization in which participants narrow a search by eliminating a certain aspect of a topic.

Restriction is similar to specialization in that it reduces the size of a search result but differs from specialization in that it narrows a search by the characteristics of an interviewee.
Although restriction could be observed in other domains (e.g., text in history, documentary film) and genres (e.g., voicemail), searchers of oral history interviews frequently used restriction during their search. For this reason, restriction was separated from specialization.

Other INR moves searchers adopted were generalization and note for later. Generalization was observed only once, which is somewhat surprising. Participants were focused more on finding thesaurus terms than on broadening the query to increase the size of a search result. Note for later was seldom mentioned by previous studies. Although no immediate action was taken as a result of this INR move, participants often mentioned that certain aspects of the topic would be important at a later stage of their research. This validates the findings of previous studies that searchers use different selection criteria at different stages of their research (Bateman, 1998; Wang and White, 1999).

From time to time, participants performed a search using a spelling variation, a synonym, or an alternative name of a person immediately after the previous query. Although these cases involved no INR move (thus, they were not considered parallel movement), they resulted in an immediate action (i.e., query reformulation).

A couple of factors that affected the cognitive process of query formulation and reformulation were discussed. Participants used various sources when refining their information needs, as shown in Table 34 (Section 4.4.1). The intermediary significantly affected the cognitive process of query formulation and reformulation of participants, as discussed in Section 4.4.2.

The previous knowledge of participants was an important source for making an INR move, as observed by previous studies (Efthimiadis, 1996; Wildemuth, 2003; Zhang et al., 2005). Participants often refined their information needs while viewing a testimony or a passage.
Other sources participants used for making INR moves were the VHF thesaurus, assigned descriptors, the PIQ data, and the number of search results.

Another source that participants heavily used during query formulation and reformulation was the intermediary. This was mainly due to the unfamiliarity of participants with the thesaurus terms. It was observed that the presence of an intermediary during the search greatly affected the cognitive process of participants, especially query formulation and reformulation. This intermediary effect was observed by previous studies (Belkin et al., 1983; Ingwersen, 1984; Efthimiadis, 1990).

5.1.3 Interaction between Relevance Judgment and Query Reformulation

A conceptual framework that combines Dervin's sense-making model (Figure 4) and Taylor's question formation model (Figure 5) was developed to explore the interaction between the cognitive processes of relevance judgment and query reformulation. Figure 6 presents the framework for examining relevance judgments, and Figure 7 the framework for query reformulation.

Figures 16 and 17 are revised versions of Figures 6 and 7, respectively, incorporating the findings of the study. Figure 16 represents the findings on the cognitive process of relevance judgment by searchers of oral history interviews, and Figure 17 those on the cognitive process of query reformulation. Relevance criteria (Figure 16) and INR moves (Figure 17) are listed under the process where they occurred. Relevance criteria or INR moves in the intersection of the two circles occurred during both processes. Numbers within parentheses indicate the frequency of occurrence. The numbers next to a criterion or an INR move within the intersection area represent the observation counts during relevance judgment and during query reformulation, respectively.

As seen in Figure 16, topicality, comprehensibility, novelty of content, and acquaintance were used during both processes.
Comprehensibility has two associated attributes, language and clarity of speech. Language was mentioned significantly more often during query reformulation; language can be used as a search criterion, and this feature is prominent in the search system. Clarity of speech, on the other hand, was mentioned during relevance judgment; it was not a searchable criterion. It was interesting to see that topicality, the most heavily mentioned criterion, was used evenly during both processes. Tables 8 and 10 summarize the associated attributes of topicality, such as person, place, event/experience, and others. As Table 10 shows, some of the attributes (e.g., gender, occupation, country of birth, nationality, camp, country, ghetto, city, etc.) were used frequently during query reformulation.

Figure 16. A revised conceptual model for relevance judgment
[Diagram: two overlapping circles, Relevance Judgment (RJ) and Query Reformulation (QF), listing the criteria observed in each process.]
Relevance Criteria (RJ): Accessibility (28), Richness (14), Duration (9), Emotion (7), Access to the interviewee (3), Miscellaneous (5)
Relevance Criteria (both): Topicality (219, 234), Comprehensibility (1, 10), Novelty of content (4, 2), Acquaintance (3, 2)
Relevance Criteria (QF): none listed

Four of five INR moves (clarification alone, specialization, restriction, and note for later) were mentioned during both processes, as shown in Figure 17. Clarification alone was observed remarkably more often during relevance judgment than query reformulation, since searchers often needed to elaborate different aspects of their topic during examination. Interestingly, generalization was the only INR move (with a low occurrence) observed during query reformulation and not during relevance judgment. Table 32 summarized what query transformation types searchers used to achieve each adopted INR move.

Figure 17. A revised conceptual model for query reformulation
[Diagram: two overlapping circles, Relevance Judgment (RJ) and Query Reformulation (QF), listing the information need refinement (INR) moves observed in each process.]
INR Moves (RJ): none listed
INR Moves (both): Clarification, alone (13, 3), Specialization (7, 18), Restriction (3, 6), Note for later (5, 2)
INR Moves (QF): Generalization (1)

Figures 16 and 17 both indicate that the cognitive processes of relevance judgment and query reformulation occur interactively during a search. Searchers reveal their relevance criteria while reformulating their queries and refine their information needs while making relevance judgments. The relevance criteria found during query reformulation cannot be observed with the discrete model shown in Figure 4. Likewise, the INR moves observed during relevance judgment cannot be found using the discrete model in Figure 5. The frameworks presented in Figures 16 and 17 make it possible to observe the searcher behavior of relevance judgment and query reformulation more realistically than the discrete models.

5.2 Implications

The findings have illuminated several issues that have consequences for the design of future speech retrieval systems for oral history collections and, in some cases, speech retrieval systems or retrieval systems in general. These implications concern indexing and metadata assignment (Section 5.2.1), support for search and browsing (Section 5.2.2), and task-oriented retrieval system and interface design (Section 5.2.3).

5.2.1 Indexing and Metadata Assignment

The findings point to requirements regarding the attributes that should be considered in indexing and metadata assignment to best support effective search and browsing. These requirements are independent of the method of indexing, manual or automatic, but they pose more of a challenge for automatic indexing.
A list of relevance criteria and associated attributes searchers used when selecting a speech recording or a passage was presented in Tables 7, 8 and 10 (Section 4.1). The associated attributes in Tables 8 and 10 would serve designers as a basis for deciding what information to index. The findings suggest the following attributes to be indexed:

• Spoken-content attributes. These originate from what was spoken, as discussed in Section 4.2.1, and are associated with topicality (Tables 8 and 10) and novelty of content (Table 16 in Section 4.1.7). Examples of spoken-content attributes are:
  ◦ Person (see Table 10 for sub-attributes of person)
  ◦ Place (see Table 10 for sub-attributes of place)
  ◦ Event/experience (see Table 10 for sub-attributes of event/experience)
  ◦ Organization/group
  ◦ Time frame
  ◦ Object
  ◦ Other topics
• Audio and/or visual attributes. These refer to acoustic and/or visual features presented in a testimony or a passage and are associated with emotion (Table 13 in Section 4.1.4) and miscellaneous (Table 19 in Section 4.1.10). The following are some examples of audio and/or visual attributes:
  ◦ Facial expression
  ◦ Voice (tone)
  ◦ Gesture
  ◦ Displayed artifact
  ◦ Whispering
  ◦ Singing
• Non-content attributes. These originate neither from the spoken content nor from the audio and/or visual features and are associated with accessibility (Table 11 in Section 4.1.2), richness (Table 12 in Section 4.1.3), comprehensibility (Table 14 in Section 4.1.5), and duration (Table 15 in Section 4.1.6). Examples of non-content attributes are:
  ◦ Cache
  ◦ Digitization
  ◦ Language
  ◦ Clarity of speech
  ◦ Length
• Biographical attributes. These refer to the characteristics of interviewees and are associated with person under topicality (Table 10 in Section 4.1.1), acquaintance (Table 17 in Section 4.1.8), and access to the interviewee (Table 18 in Section 4.1.9).
The following are some examples of biographical attributes:
  ◦ Name of interviewee
  ◦ Date of birth
  ◦ Gender
  ◦ Occupation of interviewee
  ◦ Occupation of interviewee's parents
  ◦ Country of birth
  ◦ Religion
  ◦ Social status of interviewee
  ◦ Social status of interviewee's parents
  ◦ Nationality
  ◦ Family status
  ◦ Address
  ◦ Immigration history
  ◦ Level of education
  ◦ Marital status
  ◦ Address of interviewee

In addition to individual relevance criteria, catalogers need to consider the usage patterns of criteria/attributes and external factors that affect the selection behavior of searchers when indexing. The following implications are drawn from the findings:

• Proxy use of attributes (Table 21 in Section 4.2.2). Suggests that cross-referencing the proxy attributes with the looked-for attributes would help searchers find the same (or related) information. Including the looked-for attributes in the thesaurus with a cross-reference to the corresponding proxy attributes that can actually be searched would assist searchers and partially obviate the need for an intermediary.
• Granularity of units judged by attributes (Tables 22 and 23 in Section 4.2.3). Indicates that passage-level indexing would increase the browsability of oral history interviews.

5.2.2 Support for Search and Browsing

Search and browsing are the two main methods people use when finding information. The findings indicate that different tools are needed to support effective search and browsing and suggest what tools to provide.

• Content-based search. Supports finding testimonies or passages by topic (Section 4.1.1) and, thus, works well for finding the spoken content.
• Biographical search. Finds information by the characteristics of interviewees (person in Section 4.1.1) and is, therefore, effective for searching by biographical attributes. Supports restriction (Section 4.3.3).
• Combining content-based search with biographical search.
Supports specialization (Section 4.3.2) and restriction (Section 4.3.3).
• Browsing. An alternative to search that sometimes leads searchers to a serendipitous discovery. The findings suggest that the following two aids are useful:
  ◦ Browsing by topic. Uses some of the frequently mentioned attributes for judging topical relevance, shown in Table 10 (Section 4.1.1), such as place, event/experience, and organization/group.
  ◦ Browsing by audio and/or visual attributes (Sections 4.1.4 and 4.1.10). These attributes may be more suitable for browsing than for searching.
• Within-category search and browsing. Enables clarification (Section 4.3.1), specialization (Section 4.3.2), and restriction (Section 4.3.3). This capability supports the multiple query transformations used to achieve an INR move, as discussed in Section 4.3.2.
• Ranked retrieval. Presents retrieved items in order of topical relevance (Section 4.1.1) and thus helps searchers make relevance judgments efficiently.

5.2.3 Task-Oriented Retrieval System and Interface Design

It is important to provide searchers with proper tools that can perform different tasks more effectively. The findings yield some suggestions for task-oriented system and interface design.

• Query formulation and reformulation.
  ◦ Interactive query formulation and reformulation. Integrates some of the sources of INR moves, such as the previous knowledge of the searcher, intermediary, thesaurus, and assigned descriptors, in Table 34 (Section 4.4.1).
  ◦ Visualizing index (thesaurus) terms. Provides searchers with an aid for finding search terms (Section 4.4.1) and, thus, enhances their ability to formulate and reformulate the query. Useful especially for novice users (Section 4.3.4).
  ◦ Search history. Presents the number of search results (Section 4.4.1).
  ◦ Capability of limiting a search by some of the attributes that are used mainly for testimony-level access (Section 4.2.3), such as:
    - date of birth (age)
    - gender
    - country of birth
    - cache
    - digitization
    - language
    - occupation
    - camp
    - ghetto
    - time frame
• Relevance judgment
  ◦ Sufficient metadata for supporting examination and selection. Resolves the common problem of insufficient indexing (Section 4.4.1).

In addition to the above suggestions, the findings imply that the following capabilities may be useful:

• Notepad. Supports note for later (Section 4.3.6) and integrates viewing as a source for INR moves (Section 4.4.1).
• User modeling. Takes individual differences into account in system design (Section 4.2.5.1).

5.3 Limitations

This study has several limitations that are related to the research methods, the study participants, the collection, and the search system.

• It is important to bear in mind the limitations of naturalistic inquiry; the research context cannot be completely controlled. The goals of this research were exploratory rather than comparative, and the case study approach was found to be effective for gathering a rich data set. Although the findings of the study can be transferred to other cases with a similar context, they cannot be generalized due to the exploratory nature of the study.
• Although we were able to collect a rich set of data on the cognitive processes underlying searcher behavior using the think-aloud protocol (as discussed in Section 3.3.4.2), individual participants differed in their ability to verbalize their thoughts. As a result, the mention counts in Chapter 4 are somewhat heavily influenced by one or two participants (e.g., richness in Table 25).
• Topicality (Section 4.1.1) may be overrepresented, since there are many ways to mention topicality and far fewer ways to mention other criteria.
The preponderance of topicality may also be influenced by the sample, which consists mostly of scholars working on publications on a topic.
• Another limitation of the study comes from the limited number of study participants. Soergel et al. (2002) identified potential user groups of VHF collections, such as historians, educators, students, scholars, film producers, and others. We intended to recruit participants from as many user groups as possible. As a result, the study was conducted with one or two participants for some user groups. For instance, P11 was the only participant from the film producer group. Inferences that were drawn from such a small number of participants may not correctly represent the searcher behavior of a user group.
• In addition to the limited number of study participants, a relatively small number of INR moves and query reformulations were observed. Coding was done by the author and reviewed by the chair of the dissertation committee, who actively participated in both User Workshops (as an observer). However, no dual independent coding was conducted by a second coder.
• In relation to the intermediary effect that was discussed in Section 4.4.2, searcher behavior without an intermediary may differ from searcher behavior with an intermediary.
• The VHF collection was not open for public access, and no participant had previous exposure to the collection or to the search system. Due to this lack of experience with the collection and the system, it was hard for participants to perform certain tasks, like finding descriptors that were present in the VHF thesaurus, during their search. The behavior of inexperienced users may not be representative of experienced users.
• Medium and domain differences were observed, as discussed in Section 4.2.5.2. System designers must bear these medium and domain differences in mind when designing a speech retrieval system.
• The findings are influenced by the nature and capabilities of this particular search system and interface. For example, the results of this study suggest enrichments to the thesaurus structure and changes to the presentation of the thesaurus to support richer and more intuitive interaction. With a system implementing these suggestions, there may be less reliance on intermediaries.

5.4 Future Work

Despite the limitations, this dissertation has made several distinct contributions. Searcher behavior in speech retrieval systems is less well understood than searcher behavior in text retrieval systems. This dissertation is an effort to examine the less-known behavior of searchers and has explored the cognitive processes underlying relevance judgment and query reformulation. More studies need to be done in this area in order to build the foundation for supporting effective speech retrieval.

One of the main contributions of this dissertation is the observation of the relevance criteria and associated attributes searchers use. Understanding what criteria and attributes searchers use when selecting a recording or a passage would provide catalogers with a basis for indexing and metadata assignment. However, medium and domain differences exist in the types of available speech attributes, as discussed in Section 4.2.5.2.

Further research related to the medium of speech could explore whether there are differences between speech and printed text in cognitive processing during relevance judgment and, if so, the nature of such differences. For the genre of interviews, it would be interesting to study the relative contribution of the questions and the answers to the user's grasp of a passage for the purpose of making a relevance judgment and, further, to the listener's understanding of a passage. More studies with different domains, such as lecture, meeting, folklore, news, phone call, sports and entertainment, are needed.
In addition to medium and domain differences, individual and group differences affect searcher behavior, as discussed in Section 4.2.5.1. The behavior of expert (experienced) users may differ significantly from the behavior of novice users. More interestingly, different user groups with distinct needs and preferences may behave differently. Further studies with different user groups, especially teachers, would be useful.

One example of domain differences in the type of available attributes from oral history interviews is the set of emotional attributes that includes facial expression, voice (tone), and gesture. However, these attributes are hard to index, due to the expensive nature of manual indexing. Previous studies have found that some prosodic cues, such as pitch (Shriberg et al., 2000), intonation (Hirschberg and Nakatani, 1998), and pause (Arons, 1997), provide useful clues for detecting story boundaries. This idea of prosodic cues might be extended to detecting emotional attributes. For example, facial expression may correlate with some acoustic cues, such as pitch.

Another contribution is the conceptual framework that examines the cognitive processes of relevance judgment and query reformulation together. Examining the two processes together enables researchers to examine searcher behavior more realistically and makes it possible to connect a specific behavior with a specific task throughout a search. Applying the conceptual framework to studies with other media (e.g., text, image, film) and domains (e.g., news, voicemail, lecture) would be useful.

Using the conceptual framework, it has been demonstrated that searchers refine their information needs while making relevance judgments and reveal their relevance criteria while refining their information needs. Many studies, including this dissertation, have focused either on characterizing INR moves or on examining the types of query transformation (or both).
It would be interesting to examine which attributes (or metadata) are associated with which INR move.

Finally, it is important to perform system evaluation that can be used as a basis for system implementation. Developing useful measures is critical for ensuring the validity of a system evaluation. Although the relevance criteria and attributes can be used as measures for system evaluation, more studies need to be done on this important topic.

5.5 Finale

Some useful recommendations for indexing and metadata assignment have been made based on the findings on the relevance criteria and attributes searchers utilize during their search. Section 5.2.1 discusses different types of attributes that are suitable for testimony-level access and/or passage-level access.

The cognitive processes underlying relevance judgments and query reformulation occur interactively during the iterative process of information seeking. As a result, searchers reveal relevance criteria and refine their information needs during both processes. Examining the searcher behavior of relevance judgments and query reformulation together has made it possible to make task-oriented recommendations for supporting effective search and browsing.

The findings have provided a basis for designing future speech retrieval systems by examining how searchers of oral history interviews select a recording or a passage, refine their information needs, and reformulate their queries. However, more studies need to be done in this area in order to build the foundation for supporting effective speech retrieval.

APPENDIX A:
PRE-QUESTIONNAIRE

Participant ID: _________________________
Occupation: __________________________________________________
Organization/Department: _________________________________________________
Job title (Grade, if a student): ______________________________________________
Highest Degree Completed: ___ HS/ ___ Bachelor's/ ___ Master's/ ___ Doctorate/ ___ Post Doctorate

Describe your project: ____________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________
______________________________________________________________________

Past Experience:
Have you ever used an automated system to find recorded speech (e.g., NPR Online, SpeechBot, etc.)? ____ yes/ ____ no
If "yes," which one(s)?
___________________________________________________________________
___________________________________________________________________
How frequently do you use such systems? ___ daily/ ___ weekly/ ___ monthly/ ___ rarely/ ___ never

Thank you very much!

APPENDIX B: FORMS RELATED TO PERMISSION FROM IRB

INFORMED CONSENT FORM FOR DISTRIBUTION OF VIDEOTAPES AND TRANSCRIPTS

Project Title: User Interaction in Speech-Based Information Retrieval: Relevance Judgments and Query Reformulation

I hereby agree that a copy of all videotapes, audiotapes, and transcripts generated during the study may be given to the Survivors of the Shoah Foundation. I understand the Survivors of the Shoah Foundation will take full responsibility for safeguarding these materials and will use them only for the purpose of improving the testimony information system and for use in documentation or training materials for the system, without referencing my name.

Dagobert Soergel, Ph.D.
Douglas Oard, Ph.D.
Jinmook Kim
College of Information Studies
Room 4101, Hornbake
University of Maryland
College Park, MD 20742-4345
Telephone: (301) 405-2037
E-Mail: ds52@umail.umd.edu, oard@glue.umd.edu, jinmook@glue.umd.edu

Print Name of Participant ______________________
Signature of Participant _________________
Date __________________

INFORMED CONSENT FORM FOR TRANSCRIBING TAPES

Project Title: User Interaction in Speech-Based Information Retrieval: Relevance Judgments and Query Reformulation

I understand all information collected in this study is confidential and hereby agree that I will take full responsibility for safeguarding of audiotapes and transcripts generated during my work. I will use the audiotapes only for the purpose of transcribing them and understand any forms of transcripts I will generate during my work should be destroyed upon the completion of my job as the transcriber for this study.

Dagobert Soergel, Ph.D. Douglas Oard, Ph.D. Jinmook Kim
College of Information Studies
Room 4101, Hornbake
University of Maryland
College Park, MD 20742-4345
Telephone: (301) 405-2037
E-Mail: ds52@umail.umd.edu, oard@glue.umd.edu, jinmook@glue.umd.edu

Print Name of Transcriber _____________________
Signature of Transcriber ________________
Date __________________

APPENDIX C - OBSERVATION GUIDELINES DURING USER WORKSHOP 2

Processes people engage in as they search:
(1) Relevance judgment
(2) Segment selection within testimony
(3) Query reformulation
(4) Strategy development

Four things searchers need to learn about:
(1) The capabilities of the interface
(2) The contents of the collection
(3) The nature of the indexing process
(4) What it is that they are really looking for

Detailed notes: Observers should record their observations of the participants'
actions, focusing particularly on:
Noting each search or search step
Noting each testimony examined (use the interview code as ID)

Within this framework, the observer should add comments addressing any of the questions listed below, or any other observations that they believe to be salient. The following factors may serve as a useful guide for making observations and/or structuring written comments:

Focus on interface:
(1) Instances when the user does not understand the interface
(2) Mistakes/backtracking events
(3) Response time issues
(4) Thesaurus navigation

Focus on content:
(1) Descriptors used (intermediary will focus on this)
(2) Difficulty in finding descriptors
(3) Topic not represented in thesaurus
(4) How does the user select the testimonies to look at from the testimonies found?
(5) How does the user interact with the search system on a content level, for example, how does she come up with new search terms during the search?
(6) How does she locate specific passages within a testimony?
(7) What role do different elements play in the user's relevance judgments (biographical information, keywords, audio, video)?
(8) How does what she finds relate to her project? What did it contribute?
(9) What would the user have liked to do that she could not do?
(10) Any comments on what might be helpful for the user.

Comments on improvements:

APPENDIX D - EXAMPLE QUESTIONS FOR THE SEMI-STRUCTURED INTERVIEW DURING USER WORKSHOP 1

1. What relevance criteria were applied to select or discard a recording or a passage? (Suggested question: Why did you select or discard [some specific] recording or segment?)
2. How were the criteria used together to reach a decision? (Suggested question: Were some criteria more important than others?)
3. What information was used as a basis for assessing each relevance criterion? (Suggested questions: What information did you use in determining [some relevance criterion you mentioned]?
What features of the system were most helpful for making your decision? Why did you listen to [some specific recording or passage]? Did you get any additional information for selecting or discarding a recording or a passage by listening to the audio?)
4. What differences were there between what the searcher wanted and what he/she found? (Suggested questions: What information did you expect to get when you typed in [some specific query]? Did the information returned by the system meet your expectation? In what way?)
5. What actions did the searcher take, if a search result was not satisfactory? (Suggested questions: What did you do to find information you wanted? Did you refine what you were looking for? To what? Did you try to find alternative search terms? If yes, why?)
6. What information did the searcher use as a clue for formulating an alternative query? (Suggested questions: How did you create [some specific query] you used to perform another search? What features of the system were most helpful for reformulating your query? Did listening to [some specific recording or passage] help in reformulating your query? How?)
7. What capabilities were not present in the system that would have been desirable? (Suggested question: Were there any system features that you had expected to see in a speech search system that were not present in the system you used? Any other comments?)

APPENDIX E - EXAMPLE QUESTIONS FOR THE SEMI-STRUCTURED INTERVIEW 1 DURING USER WORKSHOP 2

Focus:
(a) Understand the user's project, information need, and approach to finding information in order to provide context for future interaction between researcher and participant.
(b) Understand learnability issues in the present interface.
(c) Understand what kind of support may be needed to help users refine their understanding of their own information needs.

Sample questions (* important for the Shoah Foundation, may ask verbatim if it fits):
(1) Ask about computer experience. Mac or Windows?
(2) Ask about experience in searching databases.
(3) What information or materials would have been helpful to receive beforehand so you could prepare your visit better? Prompt with question about thesaurus if necessary.
(4) What other information would have been helpful as part of the introductory demo?
(5) Ask the participant how he/she feels about the process. Assure him or her that we are testing the system, not the participants.
(6) *Before using the system, what were your expectations?
(7) *Once you began using the UI, where did it meet or not meet your expectations?
(8) *How would you describe your experience with the UI in one word?
(9) There are presently three modes of Advanced Search: People Search, Experience Search, and Global Keywords Search. Would you find other modes useful (for example, by date, place, event)?
(10) Did you use the onscreen instructions, and if so, please rate their utility on a scale of 1 (lowest) to 10 (highest).
(11) Would you find a user manual and/or help screens helpful?
(12) Imagine that this UI is the first contact you have had with the Shoah Foundation, apart from your general knowledge about the organization before coming here. What words would you use to describe the Shoah Foundation based on this experience with the UI?
(13) As you conducted your initial searches, how did your understanding of what you were looking for and what you might find in this collection evolve?
(14) Can you think of any capabilities that would be particularly helpful to include in the system? (To be continued in moderated discussion) Would you like any help in crossing the language barrier?

APPENDIX F - EXAMPLE QUESTIONS FOR THE SEMI-STRUCTURED INTERVIEW 2 DURING USER WORKSHOP 2

Focus:
(a) Understanding the basis for observed actions that were not clear to the observer at the time.
(b) Understanding the evolution of the participant's search strategy over time.
(c) Obtaining an individual assessment of the present process and system based on the most extensive practical search experience during the workshop.
(d) Obtaining individual reactions to potential new capabilities.

Sample questions:
(15) Describe how you plan to use the results of your search (and possibly further searches) in your work. Some examples of types of uses:
- Elements of interview to appear in end product
- Illustrative quotes
- Data in testimonies or PIQs to provide primary basis for end product
- Data for analysis
- Eyewitness accounts as primary basis for a history
- Data in testimonies or PIQs to provide secondary support for end product
- Source of specific supplemental information
- Checking historical accuracy
(16) How did your search strategy evolve over the last few days? What about the way you used the video?
(17) What system capabilities did you use? What did you find missing? Did you change search systems? If so, why? What features of the other system did you find particularly helpful?
(18) Comment on the usefulness of each of the following elements of user assistance resources (some of these would have to be created):
- Training video
- Written introduction
- Quick reference guide
- Help screens
- Manual
- Printed thesaurus
- Online thesaurus similar to print
(19) If access to the Shoah Foundation's archive through this tool or a tool like this could be made available in libraries or online / on the Web, how useful would it be to users in your situation under each of the following scenarios? Would they use it? If the only way to provide access were to charge for it (for example, through a license or per testimony), would users or their institutions be willing to pay for access? Consider the following scenarios:
- Scenario 1: The only result of an online search is a list of testimonies (or individual passages within testimonies) which the user could then request for delivery in about two weeks. What information would need to be given about the testimonies found to make this useful?
- Scenario 2: Audio is available within less than a minute.
- Scenario 3: Audio and low-quality video within a minute.
- Scenario 4: Audio within less than a minute, high-quality video within 10 minutes.
(20) Where do you think this system should be made available? What kind of audience do you think would access the system there?
(21) What do you think would be the ideal environment to search the collection? What things could we do to change the environment of the Onsite Research and Testing Center to improve the user experience?
(22) Would you be willing to share with others some or all of your work products? (List of video passages found relevant for a given topic, transcripts you make to use as quotations, any notes/comments, etc.)
(23) If your work were too personal to be shared, would you be willing to share transcripts of video passages with the Foundation so they could be used to improve automatic speech recognition? (Assuming you could type transcripts inside the user interface, and the user interface would insert time codes.)

APPENDIX G - EXAMPLE QUESTIONS FOR THE FOCUS GROUP DISCUSSION 1 DURING USER WORKSHOP 2

Focus:
(a) Sharing information among participants that can improve their search effectiveness.
(b) Eliciting search strategies that might guide further development efforts.
(c) Capturing ideas that might influence process development.

Sample questions:
(1) What helpful "tricks" have you learned that were not obvious when you first started searching with this system?
(2) What has been the biggest surprise so far?
(3) What did you need the intermediary to do? Has the intermediary done these things? How useful was the intermediary in your search?
What could you have done without the intermediary? What could you not have done without the intermediary?
(4) What strategies do you use for discovering the best thesaurus terms to search with?
(5) How do you decide which segments are worth viewing?
(6) What have you found to be the most useful ways of getting started when exploring a new direction in your search?
(7) How is this UI (or an improved version) useful for people in your field?

APPENDIX H - FOCUS GROUP DISCUSSION 2 DURING USER WORKSHOP 2

Focus: Reacting to potential new capabilities based on their experience during the workshop.

Sample questions: In the discussion, please respond to these questions first from your individual point of view, and then try to think more generally from the perspective of users from your field.
(1) What is this kind of a collection useful for? (i.e., what do you find most useful in such an archive?)
(2) What kind of content did you find in this archive that you could not have found by other means (e.g., the Internet, other Holocaust archives, books)?
(3) Could you comment on searching for testimonies in all languages, several selected languages, or a single language? What kind of support would you (or other users like you) need to deal with testimonies in multiple languages?
(4) Do you have any suggestions on how the system could provide better support for the search task?
(5) Would you find it useful to have testimony summaries and/or short summaries of the events or topics discussed in individual passages within a testimony?
(6) Once you have selected a testimony or the portion of a testimony (based on information such as keywords), would you prefer to see a transcript first and then highlight the portions you are interested in listening to or viewing?
(7) What did you get out of the video that you could not have gotten out of the audio itself?
For those collecting their own oral history testimonies, comment on this question for your own use of testimonies you collected yourself.
(8) If you found a testimony particularly useful, would you be interested in a transcript of the testimony? Only if the transcript were prepared by a person and is reasonably error-free or also if the transcript is the product of an automatic speech recognition system and contains many errors? What about video (still frames or video segments)?
(9) How could the interface assist you in using what you found? For example, definition of projects (groups of video clips), direct Web search, being able to type notes in the user interface and link these notes to specific passages.
(10) Would you find links to other resources helpful (example: a journal article based on the testimony you are viewing, a book providing background, information about a place or a map of the area)?
(11) Would you be interested in having an expanded system that would facilitate sharing and collaboration with others? What would be reasons for people to not want to use such a system?
(12) How could this system be useful for people in different audiences? Please describe.

APPENDIX I - CODING SCHEME

A. Relevance Criteria
A1 Topicality
A2 Accessibility
A3 Richness
A4 Emotion
A5 Comprehensibility
A6 Duration
A7 Novelty of content
A8 Acquaintance
A9 Access to the interviewee
A10 Miscellaneous

B. Associated Attributes
B1 Content
B1.1 Time Period
B1.1.1 Date
B1.2 Event and Experience
B1.2.1 Personal Event
B1.2.1.1 Hiding
B1.2.1.2 Escaping
B1.2.1.3 Deportation
B1.2.1.4 Life
B1.2.1.5 Abandonment
B1.2.1.6 Immigration
B1.2.1.7 Incarceration
B1.2.1.8 Forced Labor
B1.2.1.9 Liberation
B1.2.1.10 Suicide
B1.2.1.11 Abortion
B1.2.1.12 Wedding
B1.2.1.13 Murder
B1.2.1.14 Adaptation
B1.2.2 Historic Event
B1.2.3 Experience
B1.2.3.1 Jewish Survivors
B1.2.3.2 Homosexual Survivors
B1.2.3.3 Political Prisoners
B1.2.3.4 Sinti and Roma Survivors
B1.2.3.5 War Crimes Trials Participants
B1.2.3.6 Jehovah's Witness Survivors
B1.2.3.7 Liberators and Liberation Witnesses
B1.2.3.8 Rescuers and Aid Providers
B1.2.3.9 Survivors of Eugenic Policies
B1.3 Place
B1.3.1 City
B1.3.2 Country
B1.3.3 Region
B1.3.4 Ghetto
B1.3.5 Camp
B1.4 Person
B1.4.1 Name
B1.4.2 Date of Birth
B1.4.3 Gender
B1.4.4 Nationality
B1.4.5 Country of Birth
B1.4.6 Occupation, Interviewee
B1.4.7 Occupation, Parents
B1.4.8 Religion
B1.4.9 Immigration History
B1.4.10 Social Status, Interviewee
B1.4.11 Social Status, Parents
B1.4.12 Level of Education
B1.4.13 Marital Status
B1.4.14 Family Status
B1.4.15 Address
B1.5 Object
B1.5.1 Specific Object
B1.5.1.1 Ship
B1.5.2 Type of Object
B1.5.2.1 Weapon
B1.5.2.2 Geographical Objects
B1.6 Organization/Group
B1.6.1 Specific Organization/Group
B1.6.1.1 Cultural Organization/Group
B1.6.2 Type of Organization/Group
B1.6.2.1 Resistance Organization/Group
B1.6.2.2 Cultural Organization/Group
B1.7 Other Topics
B1.8 Non-Textual Attributes
B1.8.1 Emotion
B1.8.1.1 Crying
B1.8.2 Visual Features
B1.8.2.1 Facial Expression
B1.8.2.1.1 Humiliation
B1.8.2.2 Gesture
B1.8.2.3 Visual Display
B1.8.2.4 Picture
B1.8.3 Audio Features
B1.8.3.1 Whispering
B1.8.3.2 Yelling
B1.8.3.3 Singing
B1.8.3.4 Voice Tone
B1.8.3.5 Accent
B2 Format
B2.1 Length
B2.2 Response Time
B2.3 Language
B2.4 Clearness of Speech
B2.5 Catalogued/Not Catalogued
B2.6 Presentation
B2.7 Amount of time/percentage
B2.8 Not Applicable

C. Actions Taken (to Bridge a Gap)
C1 Refine Information Need
C1.1 Formalized Information Need
C1.1.1 Formalized Information Need 1 (Q3(1))
C1.1.2 Formalized Information Need 2 (Q3(2))
C1.1.3 Formalized Information Need 3 (Q3(3))
...
C1.2 Refinement Strategy (INR Move)
C1.2.1 Clarification
C1.2.2 Specialization
C1.2.3 Specialization by Elimination
C1.2.4 Restriction
C1.2.5 Generalization
C1.2.6 Parallel Movement
C1.2.7 Note for Later
C2.3 Refinement Sources
C2.3.1 Previous Knowledge
C2.3.2 Intermediary
C2.3.3 Pre-Interview (pre-testimony) Questionnaire (PIQ)
C2.3.4 Thesaurus
C2.3.4 Assigned Descriptors
C2.3.4 Viewing
C2.3.4 Number of a Search Result
C2 Formulate or Reformulate Query
C2.1 Compromised Query
C2.1.1 Compromised Query 1 (Q4(1)) - Initial Query
C2.1.2 Compromised Query 2 (Q4(2))
C2.1.3 Compromised Query 3 (Q4(3))
...
C2.2 Query Language Component (Reformulation Type)
C2.2.1 Adding/Removing a Condition
C2.2.1.1 Adding a Condition
C2.2.1.1 Removing a Condition
C2.2.2 Modifying a Condition
C2.2.1 Narrowing a Condition
C2.2.1.1 Narrower Term
C2.2.1.2 Removing ORed terms
C2.2.2 Broadening a Condition
C2.2.2.1 Broader Term
C2.2.2.2 Adding Terms with OR
C2.2.3 Other Modification
C2.2.3.1 Replacing a Term with a Spelling Variation
C2.2.3.2 Replacing a Term with a Synonym
C2.2.3.3 Replacing a Term with a Related Term
C2.2.5 New Query

D. System/User Interface Issues
D1 Overall Reaction
D1.1 Easy to Use
D1.2 Hard to Use
D2 Screen Design
D2.1 Presentation
D2.2 Sequence of screens
D2.3 Terminology
D3 Functionality
D3.1 Functions/Features Needed
D3.1.1 Boolean Search
D3.1.2 PIQ Search/Viewer
D3.1.3 Related/Alternative Search Terms
D3.1.4 Adding Multiple Search Terms at Once
D3.1.5 Searching Testimonies by Places under Experience Search
D3.1.6 Extensive Editing within My Project
D3.1.7 Time Stamped Keywords
D3.1.8 Testimony/Segment Summary
D3.1.9 Multi-tasking
D3.1.10 Rapid Access
D3.1.11 Other Functions/Features Needed
D3.2 Functions/Features Desired
D3.2.1 Introductory Video of System Tutorial
D3.2.2 Help
D3.2.3 Map Presentation
D3.2.4 Integrated User Tools for Note Taking
D3.2.5 Temporal saving of selected testimonies
D3.2.6 Reference Tools
D3.2.7 Remote Access
D3.2.8 More Repositories
D3.2.9 Support for Cross-Language Retrieval
D3.2.10 Video Browsing Tool
D3.2.10 Other Suggestions
D3.2.10.1 Things to be Improved
D4 Thesaurus
D4.1 Searching Thesaurus
D4.1.1 Typing Search Terms
D4.1.2 Browsing Thesaurus
D4.2 User Reaction
D4.2.1 Comprehensiveness
D4.2.2 Clearness of terms
D4.2.3 Organization/Navigation
D4.2.4 Scope Note
D4.2.5 Spelling Variation
D4.2.6 Related Term
D5 Prerequisite User Knowledge
D6 Other MALACH Related Issues
D6.1 Audio/Visual Presentation
D6.2 Usefulness of Transcripts
D6.3 Sharing Search Procedure and Resources
D6.4 Photographs
D6.5 Paying Fees for Access/View/Use

E. System Component
E1 Login Screen
E2 Registration Screen
E3 Initial Search Screen
E4 My Project
E5 Search History
E6 Initial Result Screen
E7 Viewing Screen

F. Usefulness/Uniqueness of the VHF Archive
F1 Education
F2 Research
F3 Size of Collection
F4 Comprehensiveness

G. Access Level
G1 Recording Level Access
G2 Passage Level Access
G3 Both Recording Level and Passage Level Access

H. Participant
H1 Participant 1 (P11)
H2 Participant 2 (P12)
H3 Participant 3 (P13)
H4 Participant 4 (P14)
H5 Workshop 2 Participant 1 (P21)
H6 Workshop 2 Participant 2 (P22)
H7 Workshop 2 Participant 3 (P23)
H8 Workshop 2 Participant 4 (P24)

I. Workshop Number
I1 Workshop 1
I2 Workshop 2

APPENDIX J - MATRICES USED FOR ANALYZING INR MOVE AND QUERY REFORMULATION TYPE (QRT)

Participant 1 (P11)
Compromised Information Need (Q3) | INR Move & Source | Query (Q4) | QRT | Result
Q3(1) P11: So now I'm, the things I'd be interested in is to find out if there's any talk of Germans and Jews who played together in say orchestras or operas or any of those things before 1933 and then when the Jews were pushed out of cultural groups and they started the Kulturbund, how did the Germans feel about losing. And this may not be the database for that, but that's one thing, how did they feel about losing their colleagues.
Clarification (PK)
Specialization (PK)
Q4(1) Jewish-gentile relations
5106 matches
Q3(2) INT: Okay so what we'll do is save this so we can search within this set. Okay so now we'll add the Germany time container and we need Germany, it would be 1914-19 or 1900-14, 14-18, and then where's 1918-1933? There it is. Okay so now it's going to look for any of those time periods within our set. P11: Right. INT: So the Kulturbund was established in? P11: 33. INT: Wow, only 59.
Specialization (INT, TH)
Q4(2) Jewish-gentile relations AND Germany 1900-1914 OR Germany 1914-1918 OR Germany 1918-1933
Adding a condition
59 matches
Q3-3 INT: Still it's only 59. So now within this set. You know 59 segments really isn't so many to go through. It's 59 testimonies to look at. So you can either look at each individual testimony or we could try to put an additional search, I mean an additional term into this search on the cultural and social activities. My hunch is that it's probably going to minimize it tremendously.
I just want to look here and see what the context is of the Jewish Gentile relations. And this was catalogued in the new system so there are fewer keywords. I think the likelihood that we're going to find the testimonies, we can certainly do the search, but that we're going to find testimonies for cultural and social activities are discussed. Well actually the next step that we could take as I think more about this is now to look for the occupations within this set. P11: Oh really. Okay. INT: And then people will be discussing Jewish Gentile relations with respect to occupation. And maybe there's someone who's a musician whose testimony we've catalogued that has something.
Specialization (INT)
Clarification (INT)
Restriction (INT)
Q4(3) Jewish-gentile relations AND Occupation, father's OR Occupation, mother's OR Occupation, interviewee's OR Occupation, spouse's OR
Q4(4) Cultural and social activities AND Germany 1933 OR Germany 1934 OR Germany 1935 OR Germany 1936 OR Germany 1937 OR Germany 1938 OR Germany 1939 OR Germany 1940 OR Germany 1941
Adding a condition
New terms
Adding a condition
74 matches
237 matches
Q3-4 P11: Well I can tell you that. No I don't have a complete list of who was in the orchestra. And they did it by city and no I don't have that. I mean, I have found 20 people on my own and none of them have mentioned doing that, a testimony with you all. That's why I was thinking that if we typed in some of the really well known ones, others might speak about them. And I think that would be the way to do it.
Specialization (PK)
Q4(5) Q4(4) AND Berlin (Prussia, Germany) OR Frankfurt am Main (Grmn)
Narrowing a condition, narrower term
Q3-5 (P11: But besides that, another, I'm trying to think of other ways to get to the Kulturbund, whether they performed or whether they attended. And so I was thinking maybe we could type in some individual names.
And I brought the book and I, but you know type in [name, interviewee] and type in [name, interviewee]. They were significant conductors, leaders of the Kulturbund. So it seems like that might be something that people might talk about if they don't mention the Kulturbund.) P11: Great, that'll be a good thing and we could work on that. Should we just quickly do an individual so you guys? INT: Yeah. P11: Let's do that, maybe we'll feel better. INT: Yeah, it's really becoming a bit of a dead end. P11: Let's type in [name, mentioned], oops.
Q4(6) [name], interviewee OR , husband OR , other relationships
Q4(7) [name], interviewee , father OR , uncles OR , brothers, biological
Q4(8) [name], interviewee
Q4(9) [name], interviewee
4 matches
4 matches
no match
12 matches

Participant 2 (P12)
Compromised Information Need (Q3) | INR Move & Source | Query (Q4) | QRT | Result
Q3(1) P12: And I know that this camp can be spelled a few different ways so. Can I do German letters? INT: No. This, I have to say I'm surprised it's not finding your search that way. I know that the preferred term is just spelled f-o-h. P12: F-o-h? INT: F-o-h, without the e. It's actually with the umlaut. P12: With the umlaut, that's right, that's why I put an e in there. That's why I was asking will this take into account different spellings, like with the Holocaust Museum? I mean I've seen it spelled f-e-r-n, like Fernfaltd. You know crazy stuff. INT: Right. Yeah, foehrenwald. I know exactly what you mean. I'm just not sure that we have the umlaut. P12: You do. Well I mean in here. INT: Yeah, I know it's here. I don't know that we can make it on the keyboard. That's all I meant.
Clarification (PK)
Q4(1) Foehrenwald (limited by language English)
Q4(2) Föhrenwald
Spelling variation
No match
177 matches
Q3(2) P12: Okay. Well she mentioned that the Americans took them to Foehrenwald and I hope that she talks more about that because, the Americans to be taking her, I'd like to hear about that process.
Why Foehrenwald? Who exactly took her? Where was she? I mean she was coming from Bergen-Belsen, but how did Bergen-Belsen, Bergen-Belsen was in the British, so how did that happen that the Americans got a hold of her and took her to Foehrenwald? Okay so to go backwards.
(Refined during relevance judgment)
Clarification, alone (VW)
Q3(3) P12: Okay for right now I'm more interested in the Jewish experience because I do have a lot from external agencies and I really would like to hear more from survivors. INT (185): Okay. So there are keywords that will help you with that, one of which is living conditions in the refugee camps. And if we look in the refugee life container, you will see other terms there that may be better suited to your search. P12: Okay. Then that's what I'll do. I'll start over at advanced.
Clarification (PK)
Specialization (INT)
Q4(3) Föhrenwald AND Living conditions in refugee camps
Q4(4) Föhrenwald AND Refugee experience
Adding a condition
Broadening a condition
No match
69 matches
Q3(4) P12: Okay. Why did I say wow? Because the first entry is abortions in the refugee camps, which is something that I've been very interested in gleaning more information about this when I've conducted my oral histories. And some women have been very forthcoming and others have said no it never happened, whereas I know that it happened quite frequently so it's interesting that Dafka, of all things that this is the first one. In the Joint, this is interesting, anti-refugee experiences. Barter in the refugee camps, to me that's black market stuff which is a huge issue in my research or bribery, brutal treatment in the refugee camps. Very crucial to what I'm doing. Childcare, children in the camps, clothing. Now I'm just concerned that refugee camps aren't exactly the same as displaced person camps.
Specialization (TH, PK)
Clarification (TH)
Q4(5) Föhrenwald AND Abortions in the refuge camps
Narrowing a condition (Narrower term)
Q3(5) P12: It's been, it's very difficult. I mean not every one, but every other one are ones that I would like to add to my list. The social relations, I'm looking at things specifically regarding social life or about experiences that haven't really been written or talked about in detail, such as abortion or sexual activities or rape or sexual molestation. Some of the more personal experiences. Living conditions. OB: Can you give why you were. P12: These form some of the basic questions that I have. What were the living conditions like? How did they begin to rehabilitate or how was, or the instances where rehabilitation was not possible? Orphanages and children's homes. I'm very interested in not only the issues of gender, but also the issues of age.
Clarification (TH)
Specialization (PK)
Q3(6) P12: Okay. Things that have to do specifically with gender or age issues I'm going to click on, like menstruation, malnutrition in the camps also address the living conditions. As does living. This is going to be very difficult to. Oh justice and law enforcement. And kids.
Clarification (TH)
Specialization (TH)
Q3(7) P12: Right. Interaction with family members would be very important. Injuries would talk about hospital care, living conditions. The three main issues that I'm looking at are gender, age, and health conditions because that had so much to do with who was picked or prioritized for immigration and who wasn't.
Clarification (TH)
Specialization (PK, TH)
Q4(6) Föhrenwald AND (Rape and sexual molestation OR Means of adaptation & survival OR Killings in the refugee camps OR Relationships between refugees and local populations OR Social relations OR Justice and law enforcement OR Menstruation, malnutrition in the camps OR Housing conditions OR Food in refuge camps OR Disease OR Deaths OR Education in the refugee camps OR Customs and observation)
Adding a condition (ORed terms)
Missing
Q3(8) P12: Of cultural, oh, of customs and observances. Because I assume that will address issues such as faith, religion, the continuity of Judaism as a religion, being of different beliefs. I'm very interested in children in the camps. Bloomsberry House, I don't know what that is so I'm going to look. Okay, that doesn't have anything to do with my research. Association of Jewish refugees also isn't. Anti-refugee experiences would be helpful because there were Polish Pogroms that forced a lot of Jews to flee from the east back into Germany. And aid. And obviously abortions are a gender issue. Okay so now I have to figure out how to minimize this list, which is really difficult.
Clarification (TH)
Elimination (PK, TH)
Q3(9) INT: So just in looking at your list you probably don't need my point of view on this but it looks like you have really two different kinds of questions. One of which looks like it deals with a lot of social aspects of life in the refugee camps and others that deal with more, what's the word that I'm looking for. Well, I guess they really all deal with social aspects, some of them deal more with human interaction than others. P12: Right, right. INT: So maybe that's a way to break it down. The terms that deal with human interactions versus the terms that deal with conditions. P12: Okay.
Well I'm going to go, both of them are important in that I want to get back, I want to get back to the one that I don't first concentrate on, but my first interest is in the human relationships and interpersonal relationships. So I'll try and delete. Now can I, the ones instead of just deleting, can I put them somewhere else? In another file to save or should I just delete them and then I'll go back and do all the ones that have conditions. INT: I think you would have to delete them from this list right now and then make a new search within project. P12: Okay. Which ones am I deleting? I'm trying to focus it more on interpersonal relationships or the human condition, so I'm trying to take out ones that are a little bit more general. Even though a lot of these could probably be taken out. Justice and law enforcement, I'll delete and save until later. P12: Housing conditions because that's more general, a general theme, it's not as human I guess. Take out food for the same reason. Epidemics, education, diseases, deaths, aid. Now is this too many to form a search? INT: I don't know. I think it's worth a shot. I would take out Foehrenwald at this point because you've already divined that set, so it's not going to look for those all over again. P12: Okay. I think we'll include all. Wow, I actually got three hits. Three very specific ones. Only one female, interesting.
Clarification (PK)
Specialization (TH)
Note for later (TH)
Elimination (TH)
Clarification (PK, TH)
Elimination (TH)
Q4(7) Deleted the following from Q4(6): Justice and law enforcement; Housing conditions; Food; Epidemics; Education; Diseases; Aid
Removing ORed terms
3 matches (1 female)
Q3(10) P12: This is really amazing because this is something that no one has talked about in my reading of the material, of documents and archives or interviewing people, no one has ever mentioned this suicide. INT: Suicide after liberation?
Does there appear to be a keyword in the segment dealing with that.
P12: No, not in the segment. It's not one of the keywords. I think it should be. I mean I really wouldn't have thought to check for that. This is very, very important for me. Because it's given me a completely new issue to try and find out more material on.
INT: I would be interested to hear what he's talking about because we do have terms regarding suicide and it may be that it's not. Well obviously it's not discussing his own suicide.
P12: Well he's saying when he was in Feldafing, there weren't enough supervisors. He's talking about the kids who survived the concentration camps, there weren't enough people to care for them, watch out for them. He's saying if somebody couldn't take care of themselves, couldn't adapt or fend for themselves, they were in really big trouble and that's why there were a lot of suicides. And this is a brand new thing I'm hearing.
INT: I see. So he's discussing the causes of suicide as opposed to.
P12: Well the causes and also just the event of it.
INR move & source: (Refined during relevance judgment) Clarification, alone (VW)
Query: Q4(8), from Q4(1): Foehrenwald AND Suicide
QRT: Adding a condition
Result: No match

Q3(11)
P12 (397): Right and then I felt really bad for saying for saying that. Because I originally had wanted to look at her segment on the JDC but then she's talking about immigrating to Canada so and that come much further down the list so I realized this is when she got help after she was already here in North America which does not pertain to my research. I wanted to know about her experiences with the JDC in the DP camps.
INR move & source: (Refined during relevance judgment) Elimination (VW); Clarification (VW)

Q3(12)
P12: Something he just said is very interesting and I haven't really, I mean I've thought about it but I need to think about it some more and the whole issue of identity and for the people that, he's talking about his friend of his who even though Poland had been liberated by the Russians, his friend was not yet able to acknowledge that he was Jewish. He had taken on the identity of being a Christian for so long and so deeply that he just, his Jewishness, his Jewish identity was gone and it's just given me really something to think about, looking at identity, post-war identity and what the process might have been like or how difficult of moving from this Christian identity to Jewish identity. I haven't quite solidified my thoughts on that but just something that he said kind of delineated it for me.
OB: It might be relevant to your topic?
P12: Yes, very relevant. This just made me think about it in a different way. Listens to tape. I don't understand why he just stopped.
INR move & source: (Refined during relevance judgment) Clarification, alone (VW)

Participant 3 (P13)

Compromised Information Need (Q3) | INR Move & Source | Query (Q4) | QRT | Result

Q3(1)
P13: Yes, I am working on my S.S. Saint Louis, German ship of Jewish refugees leaving Hamburg in May '39, to Cuba, and on toward Cuba where they were rejected and [crinos]? By the neighboring countries like the US and the Dominican Republic?
INT: So in terms of looking for that topic, what kind of subjects do you need to find in the archive?
P13: I am particularly looking for survivors of the Saint Louis.
INR move & source: Clarification, alone (PK)
Query: Q4(1) Saint Louis
Result: 129 matches

Q3(2)
P13: Ok, so it means for me it is very arbitrary because I don't know, I am not interested so much, in particular experiences of the ship, I am more interested in the general experience of the ship.
So my focus is not only when they have been in Cuba the departure until they came back. So I am mean, just for interest I click on him. For example, see what segment five contain. And I see that it gives me information about his background. That he came from, segment five I am talking about by the way. His flight attempt is very interesting to me, how people made the choice to leave Germany. In particular what sent[?], one interest because just of the November pogrom when people decided to leave. I will continue with segment six and I will see [mentioned name?] strange in immigration from Belgium to the United Kingdom, Cuba. This I don't know what to do with this information because if he has been in Cuba from '33 to '39, I don't know why he was on the Saint Louis.
INR move & source: (Refined during relevance judgment) Clarification (DS, PK, PIQ); Specification (DS)

Q3(3)
P13: German Refugees. Ok, so he's talking about German refugees, refugees labor experience [kashun?]. So this is not primary interest for me, it's more of secondary interest, Because my thesis of course about immigration, immigration policy. This is the bigger framework and I explain it on the case of Saint Louis. So I would be interested also to look at immigration experience, but just on certain instance. I go to look down, just for interest. What his experience was. I stop here when I see when it comes [from omat mau?] so this is just far enough, so I am not looking for my thesis anymore, and I look far up. And so if would I listen to it I may probably start with his family life and segment 3, and then listening to the following segments until 7 or so. And I don't listen to it right now. So do I have to save it?
INR move & source: (Refined during relevance judgment) Elimination (DS); Clarification (DS)

Q3(4)
P13: So I should pay attention to it. And now we come to new survivors. And, I mean my I just ask him...
I would like to know for example if I could structure more precisely also the age of the family members, because if she marries her family name would not be [put down? or actual name?] any more, for example. Also the age is coming from bigger interest than gender because after all were there children were she married, mid twenties mid forties when he or she was on the ship. But I see that I apparently cannot do this.
INR move & source: (Refined during relevance judgment) Restriction (PIQ)

Q3(5)
P13: Ok, this is good, another thing that I would be interested in is, I see that you have to experience, your categories by true survivors, but I would also be interested for example, now I have these 129 testimonies or records, and I would love to know of 129 how many were in concentration camps before they went on the Saint Louis and look specifically on the say 60 person [?] who have been in concentration camps before they departed.
INR move & source: Restriction (NIR)

Q3(6)
INT: Ok, so the way that we do that is that we will save to project your 129 results, this 129 should include all testimonies catalogued that have the Saint Louis recorded in them. So we will save to project and we will call your project the Saint Louis. Ok, so we have 129. To do this then we will choose this project, search within the project and keyword search. Is there a particular concentration camp that you are looking for or any concentration camps?
P13: Yeah, I am particularly interested in people who were put in concentration camp after the November pogrom. There were several names for November pogrom, like Kristallnacht. Do you only use November pogrom?
INR move & source: Specialization (PK)
Query: Q4(2) Saint Louis AND November Pogrom; Q4(3) Saint Louis AND Kristallnacht
QRT: Adding a condition; Modifying a condition (Synonym)
Result: 1 match; No match

Q3(7)
P13: Ok, this really, yeah, because a lot of refugees died, especially in the Anschluss of Germany, right?
Some passengers on the Saint Louis decided to leave Austria because of the Anschluss of the persecution of Jews particularly in Vienna. So for me it would also be interesting to look amongst the 129, how many came from Austria, Vienna in particular.
INR move & source: Clarification (PK); Restriction (PK, NIR)

Q3(8)
INT: So, to back up for a moment. You were interested in people from Vienna or otherwise in Austria who departed on the Saint Louis and then the decision regarding flight, regardless of where they were from. Which of these would you like to pursue?
P13: Their decision.
INT: So now we are going to start working on the decisions about why people left. So let's choose your SAINT LOUIS project and we will search within that group we will go back to the keyword search and we will choose the keyword decisions regarding flight, which should... here we go, so if you want to look at the definition here you can see how to use that. So it applies to Austria after the Anschluss and Czechoslovakia after the annexation.
INR move & source: Clarification (INT); Specialization (INT)
Query: Q4(4), from Q4(1): Saint Louis AND Decisions regarding flight; Q4(5), from Q4(1): Saint Louis AND Flight preparation
QRT: Adding a condition; Modifying a condition (Related term)
Result: No match

Q3(9)
P13: Ok, now I am making it all done with segment 11. And she talked about her whole immigration/emigration policies procedures and how she got on the Saint Louis and how the possibility came up. In addition she mentioned that she had a baby that she took with her, so this is just some marginalized information that I pay attention to and make of a note of - interview code 33498.
Then another thing was interesting to me, I do not have her name on the passenger list, so it may be that she married, but since she had a baby and she talked about a husband she has to married before she got on the SAINT LOUIS, maybe she got married a second time, but I do not have her on the list.
INT: In the biographical profile, it does not give any additional names for her. But there might be a maiden name or other names that she used, but if that is additional information that you then we can get that. So... [name, interviewee]. And maybe she married again later in life, if her husband passed away.
P13: Yeah, maybe [name, interviewee]... Maybe she was [alternate name, interviewee] or [alternate name, interviewee], [name, interviewee] is her daughter...
INT: That could very well be her family right there.
P13: Yeah, right, because they are three and she talks about a husband and a baby.
Query: Searching for an interviewee: [name, interviewee] (first name); [name, interviewee] (last name); [alternate name, interviewee] (Additional name for [name, interviewee])
Result: 1 match; 1 match; 1 match

Q3(10)
P13: And now we come to 20. That is great. Ok think I remember [name, interviewee] and [name, interviewee] which I listened to, and I think also [name, interviewee]. And so I go to [name, interviewee], he has two segments about his experience. Oh he talks about visa, which is very interesting for me so I make a note of it. And so the next keyword I may enter is visas and I also make a note that I found him talking about it and he is [name, interviewee]... and just for later... I go to just for my notes, I write on it the interview code 21385... and then I go back to look... I think it was
INR move & source: (Refined during relevance judgment) Clarification, alone (VW)
Query: Q4(6)
from Q4(1): Saint Louis AND Immigration and emigration; Q4(7) Saint Louis AND Immigration and emigration policies and procedures
QRT: Adding a condition; Narrowing a condition (narrower term)
Result: 20 matches; 20 matches

Q3(11)
P13: Ok, so we found this one and he talks about the arresting of Jews, which is interesting to me again because of the topic of my thesis about why people chose to leave Germany on the Saint Louis. And then he talks about... yeah he was in a concentration/. And that is very interesting to me and probably he was only released because he could present visa to Gestapo and then he could go on the Saint Louis. And this is how the story went, and he is definitely of primary interest for me and he talks about the suicide attempts on the ship, I don't know what it means, if he actually attempted suicide. And so if I click on here I listen to the segment... and I will do it because I am just curious to listen about this. And it is segment 13.
INR move & source: (Refined during relevance judgment) Clarification, alone (DS)

Q3(12)
P13: Ok, so present searches size is and I get twenty, I listened to [name, interviewee] who talked about the suicide attempts. And I look further and I get a couple of new survivors and in this they look in their sixties and I would be interested in older people. So I just keep on searching. Ok, see now, I will check how old she is. If I go to biographical profile I may find when she was born hopefully. Date of birth, way old. I can see through all of this... Ok she goes comes from Nuremberg Ok, I screen through her testimony, this data. And I see that she came from Nuremberg and she also talks about the arrest of Jews and her decision regarding flight is of interest to me, and she talks about the November Pogrom too, but it interesting because we didn't find her among the others.
INR move & source: (Refined during relevance judgment) Restriction (VW, PIQ); Clarification (VW)

Q3(13)
P13: Advance? Ok, so here I have her. They all seem to be in their sixties, so that means they were all a child. And I am just curious to see somebody who was older at the time. He still looks very young... Ok, I cannot go next, I already have 20. Oh, interesting, Ok, so I just may check him. [name, interviewee] [?] and he also talks about suicide attempts. [name, interviewee]... Ok, so maybe I make a note here that maybe later I will also search for [name, interviewee], who was the captain of the ship. And I understand that I cannot listen to him right, now but I look further on what he is talking about besides, I go up. Ok he comes from Fryeburg, Flight attempts he talks about extended family members, what might be of interest too. oh he is all over the place. Ok, I see that in segment 8 he talks about the Saint Louis, the flight preparations both are of interest to me. And [name, interviewee]. So I.., oh I cannot listen to it.
INR move & source: (Refined during relevance judgment)
Query: Q4(8) Saint Louis AND [name, interviewee] (First name of the captain); Q4(9) Saint Louis AND [name, interviewee] (last name of the captain); Q4(10) [name, interviewee]
Result: No match; No match; 3 matches

Q3(14)
P13: Up to fifteen, I hope it will take less... It is interview code 11375. I just want to check when he was born. No he also didn't give us a date of birth, which is a pain. It is still loading; while it is loading I just want to look further what his experiences were. I see that he talks about living conditions in segment 12, he talks about living conditions, I don't know yet if this is about the boat, the ship, or about general living conditions, but I will see if I want to listen to it. And it seems that he came to Belgium, where he was separated from his loved one, which is not of primary interest of me.
Ok, so he was an orphan, he was a child on the Saint Louis. Yeah, I would be more interested in adult survivors because they probably recall better the political background of the story. Ok, so he also came to the States, ok so he went home to Germany for some time, he went to visit the sites of persecution. Oh, I am [?] ending it[?]. Ok it is still loading, so I continue waiting for it, I go back to biography profile. His name is ...[mumbled]
INR move & source: (Refined during relevance judgment) Clarification (DS); Restriction (DS, PIQ)

Participant 4 (P14)

Compromised Information Need (Q3) | INR Move & Source | Query (Q4) | QRT | Result

Q3(1)
INT: Okay, so, um, at this point it's probably best for you to tell me a little bit about what you, um, want to try to search for first.
P14: Well, (inaudible) what I'm looking, (inaudible), my research is in the preliminary stages, so for now I'm looking at, trying to find information, testimonies, about the resistance movement, people who were part of it or knew of the events taking place at the time.
INT: Are there specific geographic limitations?
P14: No, probably it would be better if we start from the general, the most general.
...
P14: Let's start with the decisions regarding resistance?
INT: You can choose multiple key words at this point and it will search on an "or" basis: decisions regarding resistance OR camp uprisings OR... but if you just want to keep it at one term at a time, that's fine.
...
P14: Let's, actually let's do decisions and discussions... let's do psychological reactions to resistance, resistance (inaudible)... okay let me try it as one category here, resistance to deportation, that's very useful, okay how about if I try now, let's see...
...
P14: So I'm looking, I'm looking at those categories because it would enable me to look at the decision, I'm hoping to find information that will say whether or not to resist in the first place, then try to figure out, uh, at what location and what type of resistance decisions were being made depending on the location where they are, so I figure, this is probably the best way of going about this for now.
P14: Let's say English.
INT: Yes, I would filter this, because we're starting to do more non-English testimonies, you will probably come across something in Russian and French.
P14: Okay. Gender? It's not important for now. Probably later on that will be important, but for now that's not important.
INR move & source: Clarification (INT, TH, PK); Specialization (PK); Restriction (PK); Note for later (PK)
Query: Q4(1) Decision regarding resistance OR Discussions of resistance OR Resistance during deportation (limited by English)
Result: 152 matches

Q3(2)
P14: Okay. Now there isn't really, this probably would be useful later on, segment number 3, but for now there isn't... segment number 16, um, won't be useful, so for now... living conditions most likely will be useful later on... forced labor, construction, that won't be useful yet... living conditions won't be useful yet... okay, this I really want to see, this one here, segment number 64.
INR move & source: (Refined during relevance judgment) Elimination (DS); Note for later (DS)

Q3(3)
P14: I'm hoping to find the evolution of the decision-making process, how do they begin to think from, uh, they're telling us to march out of the city, for example, from that kind of a government order to, how does it shift to thinking, no, we are not going to march out of the city, are we going to stay or are we going to fight?
P14: And see if there is in the decision-making process, whether it's group dynamics of leadership, the role of the leadership, probably the type of personality that becomes the leaders, and if it makes any difference, if for example if you have the same quality of individual leaders.
P14: In a different region, probably, would it make any difference? Would the same group of leaders, could, encourage others in another region that were just marching around the city, that leadership would have been able to shift, those as well, if they had been in other places. First, it's interesting to see what the leadership, the quality of the leadership, and what the leadership were trying to, or how were they trying to reverse that process, from going to death to stopping and say, no, we're not going to die, we're going to protect ourselves and if we fail and then you kill us, okay fine, at least we tried to protect ourselves first. That's, at least for me, that is probably the most critical issue here.
P14: How does that reformulation take place, from, we got the orders from the government to walk, to thinking, no, we're not going to walk. How is that, what happens there in that decision? Obviously this is not just one individual, we're talking about group of people, how do you, in addition to do leadership circles, the close circle of the leaders, 10, 15 of them, how do you actually expand that, and then include an entire community to say, no we're not going, we are going to stay here. Uh, to some extent that will be spontaneous, and to some extent that will be very much organized.
INR move & source: (Refined during relevance judgment) Clarification, alone (PK, VW); Clarification, alone (VW); Clarification, alone (VW)

Q3(4)
P14: Okay, now we're talking about serious information, yeah, so now they're talking about knowledge of the geographic region, the mountains and the forests, uh... okay... huh... see, he's talking about where and how to organize the meetings for the res-, for the decision to organize the resistance... ah, exactly what we're looking for.
P14: See, he's also talking about the question of trust and question of loyalty in the organization. When you're going to bring 5 people with you to the location, how do you actually trust them? That's going to be kept secret... wow... it's so different than reading, it's so different than just reading... wow. Huh, that's really interesting, wow.
INR move & source: (Refined during relevance judgment) Clarification, alone (VW); Clarification, alone (VW)

Q3(5)
P14: Okay let me just look at this information here for a second. This biographical information is very very useful, very useful. I mean for now this is not, I mean this, at this stage in my research at least there is nothing I need, there is nothing for me to do with this information. But I imagine if, if I expand the research to go beyond just the interviews themselves, to look at, I mean if I see some kind of a pattern, for example, that at this camp the way things are being organized, the resistance is being organized, is really different, then, let's say the way it's being organized in other camps, then obviously it'd be important to figure out what were the conditions or the leadership ingredients that made the resistance movement at this camp different than the other camps. Because in the Armenian case the mountains and the forests and the rivers were very very critical, very very critical, and he's more or less saying the exact same thing here.
P14: Well let's see, I mean, the more, the more research you do, the more you begin to realize how important geography is, well, the other interesting point is, uh, geography is not only important for the purposes of resistance, but geography is also important for the perpetrators of genocide.
INR move & source: Clarification, alone (VW, PK); (Refined during relevance judgment) Note for later (PIQ); Clarification (VW, PIQ, PK)

Q3(6)
P14: Oh, I think it involved something with their family background if I remember, oh that's right, it said something about family relations and so on, and very much like (inaudible) information it's possible that later on that would be useful.
...
P14: But for now, at least for my purposes that doesn't seem useful for now.
...
P14: For now I'm, for me top priority now is to go directly to the discussion of the resistance and decisions and so on, and then move later on you can do more, but for now. Actually, what I'm thinking now is as far as I know there hasn't been a serious publication just on the resistance during the Holocaust, based on this material.
INR move & source: (Refined during relevance judgment) Clarification (VW); Note for later (VW); Elimination (VW); Clarification (VW, PK)

Q3(8)
P14: He's talking about, um, the divisions between the resistance groups, one group being supported by the British, one group being supported by De Gaulle, and how much they hated each other! (laughter) So he's giving information about, um, the resistance movement and their relations with the outside world, and how their connections with the outside world helped them to get more weapons, to organize themselves... but at the same time it's interesting that while they were getting weapons from the outside, the problem, the dilemma they had was, here they were getting weapons from the outside but that procurement itself was causing conflict within the group. This is very good, this is very good, very useful. yeah.
And he just said that, um, he just said that uh, this is so important, he is saying that uh, here you have this outside support coming in, but the problem we have is the organization, the leadership, the organizational structure is not really well prepared to make the decisions, whose, which group is going to have, how many weapons. So first they have this antagonism in the first place, and then in the process of distribution of weapons, that in itself is causing problems precisely because the organization isn't really geared for that, for that purpose.
INR move & source: (Refined during relevance judgment) Clarification, alone (VW)

Q3(9)
INT: So, based on these two testimonies that you listened to, there was clearly some, there were some topics that you were particularly interested in, and we may have keywords about those specific topics, so I was thinking based on what you were talking about in your search, maybe there are some more specific topics, um, more specific than discussions of resistance, for example, that you might want to try to look at.
P14: Yeah. Well one of them was, uh, that external connections aspect of it, what type of connections did they have with the outside world that enabled them to procure more weapons.
INR move & source: Clarification, alone (VW, INT)
Query: [Q4(2) Resistance groups] (not a real query but a common theme for following queries); Q4(3), from Q4(2): Resistance groups: Arms procurement
QRT: Narrowing a condition (narrower term)
Result: 117 matches

Q3(11)
P14: Okay, we're going back to search.
INT: No, uh, back to search results, and it'll get you back to your list, and then (inaudible) recognize any that are on the cache that will download more quickly. Oh, a woman.
P14: Actually that's another category that would be interesting to do. Initially gender isn't important, but as the analysis becomes more sophisticated, more complicated, it would be very important to separate the two.
...
OB: Since you brought the topic up, so gender, so does it do anything to you?
P14: My sense is, just based on what I've read, my sense is when women rebelled in the concentration camps, it was much more spontaneous as opposed to organized. Usually, I see, like the way it started was some order comes telling them to do whatever, to go some place. One of them will just say, no I will not, and another one would support that first one, and then more would support, and then once the women realized that, in terms of numbers they're far greater than the couple of soldiers, the guards standing there. Then they rebelled. Maybe that's, maybe that is really a pattern, it's not just my impression of what I've read so far, maybe that was a pattern where in the case of the women it was much more spontaneous, even though they knew this could lead to their death, but nevertheless the eruption took place anyway. Yeah, so I don't really know exactly how they fit into the context of what I'm doing.
INR move & source: (Refined during relevance judgment) Clarification (INT); Note for later (PK); Clarification (PK)

Q3(12)
INT: Well, um, something that might be interesting is living conditions and resistance groups for you, which gets to things that maybe aren't so much about, um, the weapons and things like that, but, you know, how people got food, and where they slept, and things like that. I was thinking based on your interests that you've expressed here and before that that might be...
P14: Oh sure sure sure, that would be living conditions right there.
INT: Yeah, or we could just type it in up here, if you'll just click in up there, I can just get, um, yeah, whoops. Uh, do, just say okay. Here, I'm sorry. Uh oh. What happened?
INT: Something's... uh... okay, let's see, I'm sorry, um, I was thinking that we had the keyword living conditions in resistance groups, but we don't.
We have living conditions, which we would have to pair with resistance groups.
INR move & source: Clarification, alone (INT)
Query: Q4(3), from Q4(2): Resistance groups AND Living conditions
QRT: Adding a condition
Result: No keyword

Q3(13)
P14: Okay, so how about, maybe, to some extent in forced labor battalions.
INT: Oh, that's useful?
P14: Because some of the rebellions took place there...
...
INT: This is a case where we can do a combined search; you see our results here are very large. Okay so what I would like to do is save this group, and then we can conduct a further search on this group of testimonies.
P14: Should I do English?
...
INT: There it is, so let's add resistance and forced labor battalions. Add, and then delete the first one. And now say next; what this will do, it should narrow it down, I hope. We are having some... oh, it didn't give us any. Look at that. So there aren't any in which...
P14: Well at least not any in English.
INT: That is surprising. So then the way that I would approach this is then to get rid of the living conditions portion, and to look at just the resistance and forced labor battalions.
INR move & source: Specialization (TH); Clarification (PK); Restriction (INT, NIR); Generalization (INT)
Query: Q4(5) Living conditions in forced labor battalions; Q4(6) Living conditions in forced labor battalions AND Resistance in forced battalions (limited by English); Q4(7) from Q4(4) Q4(3), all languages; Q4(6) from Q4(6) Resistance in forced battalions
QRT: Narrower term; Adding a condition; Adding a language restriction; Removal of the language restriction; Removing a condition
Result: 320; No match; No match; 4 matches

Q3(14)
INT: This was the resistance groups container, because you and I had in the past talked about resistance groups and I noticed that the "Franc Tieurs" were two that you had looked at today, and those were very interesting to you, so I was thinking maybe to look at other testimonies from that group. So you can choose... it is grabbing it...
Try clicking next to it. There we go.
P14: I'm choosing English because that's the one I know.
INR move & source: Specialization (INT)
Query: Q4(7), from Q4(2): Resistance groups: Franc-Tieurs et Partisans (limited by English)
QRT: Narrower term; Limited by a language
Result: 24 matches

Q3(15)
P14: Uh, well, I'm starting from, let me just go to the top. Okay, uh, I'm just starting from here, I mean these are just the obvious one: interruption of education by war... it would be interesting to, well, it would be interesting to see, I mean this is sort of later on in the research, but whether and to what extent this in itself was a factor in relying on resistance as opposed to just following the orders, just by itself. Imagine, let's say, I'm working on my bachelor's degree and then suddenly we get an order from the government that we have to march out of the city, and you've spent two three years going to college, and then suddenly that has to stop because (inaudible); you have to walk, then I imagine that would be a serious problem. I'm sure that would be a very serious problem, yeah. Especially, I mean in this case, awareness of political or military events, and I'm sure this is very much related to here, education, living under forced (mumbled words read from screen) survival, so this would be very important, means of adaptation and survival. It would be interesting to see what he has to say about that kind of adaptation.
INR move & source: (Refined during relevance judgment) Clarification (DS); Note for later (DS)

Q3(16)
INT: Let me tell you a little bit about this keyword. Means of adaptation and survival is normally never used on its own; it is usually, it is almost always 99 percent of the time used in conjunction with another activity that one did in order to, to improve one's circumstance. So in this case, living under false identity may have been the means of adaptation and survival.
And then here, black market activities, may have been the means of adaptation and survival.
P14: I see.
INT: Or stealing may have been. So that's a keyword that on its own may not be as useful as when, uh, you know the context, the means of adaptation and survival.
P14: Plus at the same time though, I mean, if you're, I mean, the type of, uh, resistance I'm looking at is actually organized armed resistance. But if you look at it from the passive resistance point of view, if you include that in your analysis, which I'm not, then some of this stuff could very much be part of that. Changing your identification, that could be passive resistance. And ultimately it depends how it's being done; you can't call all of them resistance, passive resistance...
INR move & source: (Refined during relevance judgment) Clarification (PK, INT, DS); Elimination (PK, DS)

Q3(17)
OB: You just mentioned about the concentration camp which is Auschwitz. So, the name of the concentration camp, is that important to you in terms of finding relevant testimony?
P14: Oh sure yeah, it's possible. It's possible that, in different camps the resistance movements had different characteristics. Uh, so you, eventually I will have to identify what those different characteristics were. And, at least at this stage, uh, I think it would be important to at least keep in mind that there are different camps and need to identify where, which camp she's coming from. So when she's describing the events, and her experiences, then to see if somebody else from a different camp would more or less have a similar view, or probably totally different view of what was going on and what they were doing.
(Refined during relevance judgment)
INR move & source: Clarification, alone (DS, PK)

Q3(18)
P14: Okay now I'm looking at the different segments to see if there is anything about resistance, uprising, but there doesn't seem to be much of - see, very much like with some of the other ones, here too you have living under false identity, that could be seen as a form of passive resistance, but it's not really what I'm looking for.
(Refined during relevance judgment)
INR move & source: Clarification (DS); Elimination (DS)

Q4(8)
Query: Camp uprisings (limited by English)
QRT: New query
Result: 31 matches

Glossary

Relevance criteria
• Topicality: Indicates whether the information object being examined is topically relevant to the information needs searchers have.
• Accessibility: Indicates whether rapid access to the retrieved testimonies/passages for viewing was available.
• Richness: Refers to how much detail of a subject the retrieved testimony/passage covers (amount of information) and/or how well the interviewee presents his/her experience in the testimony/passage (presentation skill).
• Emotion: Refers to the emotional expression presented in a testimony or a passage.
• Comprehensibility: Refers to the degree to which searchers can understand what the speaker says in a testimony/passage.
• Novelty of content: Refers to passage-level novelty rather than testimony-level novelty and covers three different types of cases: (1) participants found new facts about a known event or phenomenon, (2) participants found new examples/incidents of a known event or phenomenon, and (3) participants found an event or phenomenon that was new to them.
• Duration: Refers to the playing time of a testimony or a passage.
• Acquaintance: Refers to the previous relationship of the searcher with the interviewee.
• Access to the interviewee: Indicates the physical distance between the searcher and the interviewees.
• Miscellaneous: Other miscellaneous criteria that are not defined above.

Information need refinement moves
• Clarification: Refers to the cognitive process whereby searchers refine their information needs by further elaborating a search topic and by developing detailed aspects of the topic.
• Specialization: Indicates the process whereby searchers narrow their search by specifying their information needs.
• Specialization by elimination: A special case of specialization. Searchers can narrow a topic by eliminating a concept from an OR combination that represents a certain aspect of a topic.
• Restriction: Defined as narrowing a search by the characteristics of an interviewee, not by a subject.
• Generalization: The opposite move to specialization. It refers to the process whereby searchers expand their search by broadening a topic, and it is used with the intention of increasing the number of retrieved documents.
• Parallel movement: Refers to the case in which searchers refine their information needs without making them broader or narrower.
• Note for later: Refers to the case in which searchers indicate that certain aspects of a topic or the characteristics of an interviewee are potentially useful at a later stage of their research but not at the current stage.
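As an illustration only, three of the information need refinement moves defined in the glossary (specialization by elimination, generalization, and restriction) can be sketched as operations on a simple Boolean query structure. The sketch assumes a query is an AND of aspects, each aspect being an OR combination of terms, plus a set of interviewee characteristics. The `Query` class, its field names, the function names, and the "gender" restriction value are hypothetical and are not part of the system used in the study; the search terms are borrowed from the transcript excerpts.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class Query:
    # Hypothetical structure: each aspect maps to an OR combination of
    # terms (a frozenset); the aspects themselves are ANDed together.
    aspects: dict = field(default_factory=dict)
    # Characteristics of the interviewee (not subject terms).
    restrictions: dict = field(default_factory=dict)


def specialize_by_elimination(query, aspect, term):
    """Narrow a topic by dropping one concept from an OR combination."""
    terms = query.aspects[aspect] - {term}
    if not terms:
        raise ValueError("eliminating the last term would remove the aspect")
    return replace(query, aspects={**query.aspects, aspect: terms})


def generalize(query, aspect, term):
    """Broaden a topic by adding a concept to an OR combination."""
    terms = query.aspects.get(aspect, frozenset()) | {term}
    return replace(query, aspects={**query.aspects, aspect: terms})


def restrict(query, attribute, value):
    """Narrow by a characteristic of the interviewee, not by a subject."""
    return replace(query, restrictions={**query.restrictions, attribute: value})


# Example values only; they do not reproduce any participant's search.
start = Query(aspects={
    "resistance": frozenset({"camp uprisings", "living under false identity"}),
})
narrowed = specialize_by_elimination(
    start, "resistance", "living under false identity")
restricted = restrict(narrowed, "gender", "female")
broadened = generalize(narrowed, "resistance", "resistance groups")
```

Each function returns a new `Query` and leaves its input unchanged, so a sequence of moves can be kept as a history of successive queries, much as the appendix tables trace each Q4 query back to the query it was derived from.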