Three views of Information Retrieval (IR) are:
- User view - the user wants to fill a gap in their knowledge by searching for meaningful results from queries or by browsing data on topics likely to be relevant
- System view - IT systems for reliable storage and retrieval of data supporting user needs
- Sources view - capturing and presenting data for third parties, e.g. Jane's collating aircraft data relevant to the aerospace industry
To support IR from systems, content is structured to speed up searches using indexing techniques:
- Identify data fields and describe them with metadata
- Find words used in the database
- "Stop word" removal reduces the word list by taking out the most commonly used words, e.g. the, and, to
- Stemming indexes related words under a common root by removing suffixes, e.g. generate from generator, generated, generates. Snowball is an example of a language for writing stemming algorithms
- Synonym generation helps to create an index that finds more terms related to the original search, e.g. table tennis and ping pong
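The indexing steps above can be sketched in a few lines of Python. This is only a rough illustration, not how any particular search engine works: the stop-word list, the suffix list, the synonym table and the function names (`stem`, `analyze`, `build_index`, `search`) are all made up for this example, and the suffix stripping is a crude stand-in for a real stemmer such as one written in Snowball.

```python
import re
from collections import defaultdict

# Assumed, tiny illustrative lists; a real system would use much larger ones.
STOP_WORDS = {"the", "and", "to", "a", "of", "is", "in"}
SUFFIXES = ("ors", "or", "ed", "es", "e", "s")   # checked in this order
SYNONYMS = {"ping pong": "table tennis"}          # phrase -> canonical phrase


def stem(word):
    """Crude suffix stripping: remove the first matching suffix, keeping at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def analyze(text):
    """Lowercase, map synonyms to one canonical phrase, drop stop words, then stem."""
    text = text.lower()
    for phrase, canonical in SYNONYMS.items():
        text = text.replace(phrase, canonical)
    tokens = re.findall(r"[a-z]+", text)
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def build_index(docs):
    """Build an inverted index: stemmed term -> set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = analyze(query)
    if not terms:
        return set()
    return set.intersection(*(index.get(t, set()) for t in terms))


docs = {
    1: "The generator generates power",
    2: "Ping pong rules and tips",
    3: "Table tennis generated interest",
}
index = build_index(docs)

print(sorted(search(index, "generate")))   # [1, 3]
print(sorted(search(index, "ping pong")))  # [2, 3]
```

Because the query goes through the same `analyze` step as the documents, a search for "generate" matches "generator" and "generated", and a search for "ping pong" also finds the document that only mentions "table tennis".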
When looking for information on interests, such as guitars, I adopt "indiscriminate driftnetting": searching everything that mentions the esoteric interest and saving links to the results which closely match the topic, e.g. "which replacement pickup best delivers a classic jazz tone". No single search will find the answer, so related searches are employed, with a painstaking review for relevant listings which become new search terms to find further information.