Hi,
I'm having trouble understanding the term frequencies computed in the VSM example on slide 29 of lecture 4.
The slide states $\mathrm{tf}(\text{"Frodo"}, d_2) = 1$ and $\mathrm{tf}(\text{"stab"}, d_2) = 2$. When I calculate it myself, I get $\mathrm{tf}(\text{"Frodo"}, d_2) = \frac{1+\log_{10}(1)}{1+\log_{10}(2)} \approx 0.77 \neq 1$ and $\mathrm{tf}(\text{"stab"}, d_2) = \frac{1+\log_{10}(2)}{1+\log_{10}(2)} = 1$, because "Frodo" occurs once and "stab" occurs twice in $d_3$. Also, "stab" (or "orc") is the most frequent term in $d_3$ and is therefore used for normalization, hence the denominator.
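For reference, here is a minimal Python sketch of how I arrived at these numbers. It assumes the logarithmic, max-normalized variant tf(t, d) = (1 + log10(count)) / (1 + log10(max count)). The function name `log_normalized_tf` and the `counts` dictionary are my own for illustration and not taken from the slide. The count of 2 for "orc" is my assumption, since it ties with "stab" as the most frequent term.

```python
import math


def log_normalized_tf(count, max_count):
    """Logarithmic term frequency, normalized by the most frequent term.

    Returns 0.0 for a term that does not occur in the document.
    """
    if count == 0:
        return 0.0
    return (1 + math.log10(count)) / (1 + math.log10(max_count))


# Raw counts as I read them from the example (the "orc" count is an assumption).
counts = {"Frodo": 1, "stab": 2, "orc": 2}

# The most frequent term in the document supplies the denominator.
max_count = max(counts.values())

tf = {term: log_normalized_tf(c, max_count) for term, c in counts.items()}

for term, value in tf.items():
    print(f"{term}: raw = {counts[term]}, log-normalized = {value:.3f}")
```

With this variant, "Frodo" comes out below 1 and "stab" comes out at exactly 1, whereas the values on the slide match the raw counts.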
It appears as if the absolute term frequency is used in this example rather than the logarithmic and normalized term frequency. Is this correct? And which variant should we use in the exam?
Thanks for your help :)
Best,
Lennart