Wednesday, March 15, 2006

Issue: Training data set size

So my current focus is on increasing the size of the data set.
How large should this labeled dataset be?

* 0.1% of 800K is 800
* 0.5% -> 4000
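
A quick sanity check of the arithmetic, as a minimal sketch. It assumes "800K" means a full set of 800,000 items; the name `labeled_fraction` is made up for illustration. It reports what share of the full set a given number of labeled items represents.

```python
# Assumption: the full data set has 800,000 items ("800K" in the post).
CORPUS_SIZE = 800_000


def labeled_fraction(n_labeled, corpus_size=CORPUS_SIZE):
    """Return the labeled share of the full set, as a percentage."""
    return 100.0 * n_labeled / corpus_size


# Candidate labeled-set sizes from the list above.
for n in (800, 4000):
    print(f"{n} labeled items = {labeled_fraction(n):.1f}% of {CORPUS_SIZE}")
```

Working from the counts rather than the percentages keeps the target (about 4000 hand-labeled items) as the fixed quantity.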

If I can increase it to about 4000 (the labeling task is manual and subjective), I might be able to improve the classification results.

