01-28-2008, 12:42 AM   #9
glocorpweb
Contributing Member
 
Join Date: 01-22-08
Posts: 55
Quote:
Originally Posted by coolguy27 View Post
LSI is a methodology involving statistical probability and correlation that helps deduce the semantic distance between words. It’s obviously a complex methodology, but it can be easily applied to understand the relation between certain words in a paragraph or in a document. This methodology is used when indexing a page in the search engine’s database.



Latent semantic indexing adds an important step to the document indexing process. In addition to recording which keywords a document contains, the method examines the document collection as a whole, to see which other documents contain some of those same words. LSI considers documents that have many words in common to be semantically close, and ones with few words in common to be semantically distant. This simple method correlates surprisingly well with how a human being, looking at content, might classify a document collection. Although the LSI algorithm doesn't understand anything about what the words mean, the patterns it notices can make it seem astonishingly intelligent.

Quoted from source...
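
For anyone who wants to see the idea in action, here is a rough sketch in Python. It is not from the quoted source and says nothing about how any search engine actually does it. It assumes the usual textbook formulation of LSI (a truncated SVD of a term-document count matrix) and uses NumPy. The function names, the four toy documents and the choice of k=2 are all made up for illustration.

```python
import numpy as np

def term_document_matrix(docs):
    """Count matrix with one row per word and one column per document."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return A, vocab

def lsi_document_vectors(A, k):
    """Project each document into a k-dimensional space via truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Keep only the k largest singular values; one row per document.
    return (np.diag(s[:k]) @ Vt[:k, :]).T

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical toy collection: two pages about search, two about fruit.
docs = [
    "search engine index page",
    "search engine rank page",
    "apple banana fruit salad",
    "banana fruit smoothie",
]

A, vocab = term_document_matrix(docs)
vecs = lsi_document_vectors(A, k=2)

sim_close = cosine(vecs[0], vecs[1])  # two docs with many words in common
sim_far = cosine(vecs[0], vecs[2])    # two docs with no words in common

print("search vs search:", round(sim_close, 3))
print("search vs fruit: ", round(sim_far, 3))
```

The two search pages should come out with a similarity near 1 and the search/fruit pair near 0, which is the "many words in common means semantically close" behaviour the quote describes. Real collections are much bigger and usually weight the counts (tf-idf, for example) before the SVD, so take this as a toy only.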
Nice digging dude...