Thursday, March 16, 2006

Issue: tf-idf calculation

The problem with the tf-idf metric is that if I modify a single query (say, to fix a spelling error), I need to regenerate the tf-idf weights for the entire data set, because the document frequencies behind the idf term change. After that, everything has to be normalized again!
This is a very expensive operation. I wonder whether it would be faster if I used Berkeley DB instead of the MySQL database?
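
To make the cost concrete, here is a minimal sketch in Python of how one edit ripples through the weights. It is not the actual code behind this data set: the function name `tfidf` and the sample queries are made up for illustration, and it assumes whitespace tokenization, tf as a relative count, idf = log(N / df), and L2 normalization.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one L2-normalized tf-idf vector (a dict) per document.

    Hypothetical helper for illustration: tf is the relative term count,
    idf is log(N / df), and each vector is scaled to unit length.
    """
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vec = {t: (c / len(tokens)) * math.log(n / df[t]) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        # Leave an all-zero vector as it is to avoid dividing by zero.
        vectors.append({t: w / norm for t, w in vec.items()} if norm else vec)
    return vectors

# Made-up sample data: the third query contains a spelling error.
queries = ["cheap flights", "cheap hotels", "cheep flights"]
before = tfidf(queries)

queries[2] = "cheap flights"  # fix the spelling error in one query
after = tfidf(queries)

# The first query was never edited, yet its weights differ, because the
# document frequency of "cheap" went from 2 to 3.
print(before[0])
print(after[0])
```

Fixing the third query changes the weights of the first two as well, which is why the whole set has to be recomputed and renormalized. Under these assumptions, only the vectors containing a term whose document frequency changed would strictly need updating, but finding them still means keeping the per-term counts around somewhere.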
