Hi,
I am having trouble understanding the computed term frequencies in the VSM example on slide 29 of lecture 4.
The slide states \( tf("Frodo", d_2)=1 \) and \( tf("stab", d_2)=2 \). When I calculate the values myself, I get \( tf("Frodo",d_2)=\frac{1+\log_{10}(1)}{1+\log_{10}(2)}\approx 0.77\neq 1 \) and \( tf("stab",d_2)=\frac{1+\log_{10}(2)}{1+\log_{10}(2)}=1 \), because "Frodo" occurs once and "stab" occurs twice in \( d_3 \). Also, "stab" (or "orc") is the most frequent term in \( d_3 \) and is therefore used for normalization, which gives the denominator \( 1+\log_{10}(2) \).
It appears as if the absolute term frequency is used in this example rather than the logarithmic and normalized term frequency. Is this correct? And which variant should we use in the exam?
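To make the comparison concrete, here is a minimal Python sketch of the calculation above. The function name `log_normalized_tf` and the `counts` dictionary are my own for illustration and are not taken from the slides. I only include the two terms in question and assume no other term in the document occurs more than twice. Returning 0 for a term that does not occur is also my assumption.

```python
import math

def log_normalized_tf(term, counts):
    """(1 + log10(count of term)) / (1 + log10(count of the most frequent term))."""
    count = counts.get(term, 0)
    if count == 0:
        return 0.0  # assumed convention for terms that do not occur
    max_count = max(counts.values())
    return (1 + math.log10(count)) / (1 + math.log10(max_count))

# Raw counts as described above: "Frodo" once, "stab" twice (other terms omitted).
counts = {"Frodo": 1, "stab": 2}

for term, raw in counts.items():
    print(term, "absolute:", raw, "log-normalized:", round(log_normalized_tf(term, counts), 3))
# Frodo absolute: 1 log-normalized: 0.769
# stab absolute: 2 log-normalized: 1.0
```

The absolute counts (1 and 2) match the values on the slide, while the log-normalized values (about 0.77 and 1) do not.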
Thanks for your help :)
Best,
Lennart