Link analysis
In a previous post (Web-based Document Search challenge), I talked about the issue of retrieving documents with simple similarity-based searches.
One solution, according to the book "Text Mining: Predictive Methods for Analyzing Unstructured Information", is to perform link analysis. Google uses the PageRank algorithm for this. The rank of a document is determined by the ranks of the documents that link to it: a document should be ranked highly if it is cited by another highly-ranked document.
Academic documents can also be ranked with this kind of citation analysis. If a document is cited by highly-ranked documents, then it should be highly ranked as well.
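To make the idea concrete, here is a minimal sketch of a PageRank-style calculation in Python. It is a simplified power iteration, not Google's actual implementation and not code from the book. The function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the small citation graph are all assumptions I picked for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each document to the list of documents it links to (cites).
    damping and iterations are illustrative defaults, not tuned values.
    """
    # Collect every document that appears as a source or a target.
    docs = set(links)
    for targets in links.values():
        docs.update(targets)
    n = len(docs)

    # Start with every document equally ranked.
    rank = {d: 1.0 / n for d in docs}

    for _ in range(iterations):
        # Every document gets a small baseline share of rank.
        new_rank = {d: (1.0 - damping) / n for d in docs}
        for d in docs:
            targets = links.get(d, [])
            if targets:
                # A document splits its rank evenly among the documents it cites.
                share = damping * rank[d] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A document with no outgoing links spreads its rank over all documents.
                share = damping * rank[d] / n
                for t in docs:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical citation graph: A and B cite C, C cites D, E cites F.
links = {
    "A": ["C"],
    "B": ["C"],
    "C": ["D"],
    "E": ["F"],
}
ranks = pagerank(links)
for doc in sorted(ranks, key=ranks.get, reverse=True):
    print(doc, round(ranks[doc], 3))
```

In this made-up graph, D and F are each cited exactly once, but D ends up ranked above F because its single citation comes from C, which is itself cited by two documents, while F is cited only by E, which nobody cites.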