Thursday, March 16, 2006

Determining prototypical vectors

Given a cluster, one way to determine its representative vectors (prototypes) is:
- compute the cluster's mean vector from the normalized tf-idf features
- take as prototypes the vectors most similar to this mean vector, as measured by cosine similarity
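
A minimal sketch of the selection step in Python, taking the mean vector as given. The function names `cosine_similarity` and `prototypes`, the parameter `k` (how many prototypes to keep), and the sample data are all assumptions made for illustration; the post does not say how many prototypes to pick.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen here: an all-zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def prototypes(vectors, mean, k=1):
    """Indices of the k vectors most similar to the mean, best first."""
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: cosine_similarity(vectors[i], mean),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical cluster of three 2-feature vectors and a hypothetical mean.
cluster = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
mean = [0.6, 0.6]
best = prototypes(cluster, mean, k=1)
```

Returning indices rather than the vectors themselves makes it easy to map each prototype back to the document it came from.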

Calculation of a mean vector:
- Take the values of the features for all the vectors in a cluster
- Calculate the mean for each feature
- Normalize the resulting mean vector
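
The steps above can be sketched in Python as follows. The function name `mean_vector` and the sample cluster are hypothetical, and the sketch assumes that "normalize" means scaling to unit Euclidean (L2) length, which the post does not state explicitly.

```python
import math


def mean_vector(vectors):
    """Feature-wise mean of equal-length vectors, scaled to unit L2 length."""
    if not vectors:
        raise ValueError("cluster is empty")
    n = len(vectors)
    # zip(*vectors) groups the values of each feature across all vectors.
    mean = [sum(column) / n for column in zip(*vectors)]
    # Assumed normalization: divide by the Euclidean length of the mean.
    norm = math.sqrt(sum(x * x for x in mean))
    if norm == 0.0:
        return mean  # an all-zero mean cannot be scaled
    return [x / norm for x in mean]


# Hypothetical cluster of two 3-feature tf-idf vectors.
cluster = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]
centroid = mean_vector(cluster)
```

Since cosine similarity ignores vector length, the normalization does not change which vectors rank closest to the mean; it mainly keeps the mean on the same scale as the normalized input vectors.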
