Questions about the exercise sheets

by Daniel Weller -
Number of replies: 10

Hello all,

while working on the exercise sheets, two questions came up that maybe someone can help me with.

In Exercise02 Probabilistic Information Retrieval Task3:

It is not clear to me how the line Σwt is calculated, e.g. the value -0.405.

 

In Exercise03 Semantic Retrieval - Latent Semantic Analysis Task4:

How do I calculate cos(q, d1) = 0.6325?

 

Thanks a lot for your input. :)


In reply to Daniel Weller

Re: Questions about the exercise sheets

by Deleted user -
Hello Daniel,

Exercise02 Probabilistic Information Retrieval Task3:
-0.405 = ln(0.5 * (4/3))
The 0.5 comes from P(Dt|q,r). The (4/3) comes from (N/Nt): N = 4 documents, Nt = 3 documents containing the term. Doc 3 doesn't include the term, so Nt = 0 and the weight can't be calculated.
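The arithmetic above can be checked with a minimal sketch. The variable names are chosen here for illustration and do not come from the exercise sheet; the numbers are the ones quoted in this reply.

```python
import math

# Values quoted in the reply: P(Dt|q,r) = 0.5, N = 4, Nt = 3.
p_term_given_relevant = 0.5
n_docs = 4
n_docs_with_term = 3

# Weight as the natural logarithm of P(Dt|q,r) * (N / Nt).
weight = math.log(p_term_given_relevant * (n_docs / n_docs_with_term))

print(round(weight, 3))
```

Rounded to three decimals, this gives the -0.405 from the solution.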
In reply to Deleted user

Re: Questions about the exercise sheets

by Daniel Weller -
Hi Alexander,

ok, I had used "log" instead of "ln", now it is reproducible. Thank you!
In reply to Deleted user

Re: Questions about the exercise sheets

by Philipp Treutlein -
Hello Alexander,

why is "ln" used here instead of "log"? Is there a reason for that? I can't find anything in the slides about using "ln" in the Binary Independence Model.

Thanks in advance!
In reply to Philipp Treutlein

Re: Questions about the exercise sheets

by Deleted user -
Hello Philipp,
I don't know, it's just the formula.
In any case, state in the exam which logarithm (log or ln) you use, so that it can be taken into account.
The base of the logarithm changes the scaling of the values, but not their relative relationship.
In reply to Deleted user

Re: Questions about the exercise sheets

by Philipp Treutlein -
Hi Alexander,

thanks for the answer!
I was just wondering whether there is a reason for that, since it deviates from the slides. But okay, then I'll follow your advice and note which one I used in the exam.
In reply to Philipp Treutlein

Re: Questions about the exercise sheets

by Benedikt Ebing -

Hi Philipp,

as Alexander stated correctly, the choice of the logarithm just changes the scaling, so it doesn't really matter which one is used. In this particular case, I assume that the solution was calculated with numpy, where "numpy.log" returns the natural logarithm.
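To illustrate the scaling argument, here is a small sketch that assumes only the two ratios mentioned in this thread, 0.5 * (4/3) and 0.5 * (4/2). It uses Python's standard `math` module, where `math.log` is the natural logarithm, just like `numpy.log`.

```python
import math

# Ratios taken from this thread: 0.5 * (4/3) and 0.5 * (4/2).
ratios = [0.5 * 4 / 3, 0.5 * 4 / 2]

ln_weights = [math.log(r) for r in ratios]       # natural logarithm
log10_weights = [math.log10(r) for r in ratios]  # base-10 logarithm

# Switching from ln to log10 divides every weight by the same
# positive constant ln(10), so the ordering of the weights is unchanged.
scale = math.log(10)
rescaled = [w / scale for w in ln_weights]

same_up_to_scale = all(
    math.isclose(a, b, abs_tol=1e-12) for a, b in zip(rescaled, log10_weights)
)
print(same_up_to_scale)
```

Because the factor is the same for every term, sums of weights are rescaled by that factor too, which is why the relative relationship between documents stays the same.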

Concerning the exam, either stick to the base given in the task or use the base from the lecture slides. When in doubt, just leave us a note which base you used.

Best,
Benedikt

In reply to Deleted user

Re: Questions about the exercise sheets

by Nicolas Wild -
Hey Alexander, following that logic, I don't get why the (N/Nt) for "shears" is taken the other way around. Shouldn't it be 4/2 as well?
In reply to Nicolas Wild

Re: Questions about the exercise sheets

by Deleted user -
Hi Nicolas,
in the table we write down (Nt/N), so it's 2/4, but for the weight we use N/Nt.
-> wt = ln(0.5 * 4/2) = ln(1) = 0.
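The difference between the table entry and the weight can be sketched as follows. The helper `term_weight` is a hypothetical name introduced here for illustration; the values for "shears" are the ones given in this reply.

```python
import math

def term_weight(p_term_given_relevant, n_docs, n_docs_with_term):
    """Hypothetical helper: ln(P(Dt|q,r) * N / Nt), as used in this thread."""
    return math.log(p_term_given_relevant * n_docs / n_docs_with_term)

n_docs = 4              # N
n_docs_with_shears = 2  # Nt for "shears"

# The table lists Nt/N, while the weight uses the inverse N/Nt.
table_fraction = n_docs_with_shears / n_docs
weight_shears = term_weight(0.5, n_docs, n_docs_with_shears)

print(table_fraction)
print(weight_shears)
```

With these numbers the argument of the logarithm is exactly 1, so the weight for "shears" is 0.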
In reply to Daniel Weller

Re: Questions about the exercise sheets

by Deleted user -
Hello Daniel,

How do I calculate the cos(q, d1) = 0.6325:

cos(q, d1) = (q * d1) / (||q|| * ||d1||)

q * d1 = 0.9441 * 0.1059 + (-0.4355) * 0.0557 + 0.7959 * 0.2485 + (-0.5344) * 0.0443
||q|| = sqrt(0.9441² + (-0.4355)² + 0.7959² + (-0.5344)²)
||d1|| = sqrt(0.1059² + 0.0557² + 0.2485² + 0.0443²)

I hope this is clear.
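The same calculation as a minimal sketch in plain Python, using the rounded vector components quoted above. Because the inputs are already rounded to four decimals, the result may differ from the sheet's 0.6325 in the last digit.

```python
import math

# Rounded components of q and d1 as quoted in this reply.
q = [0.9441, -0.4355, 0.7959, -0.5344]
d1 = [0.1059, 0.0557, 0.2485, 0.0443]

# Dot product and Euclidean norms.
dot = sum(a * b for a, b in zip(q, d1))
norm_q = math.sqrt(sum(a * a for a in q))
norm_d1 = math.sqrt(sum(b * b for b in d1))

cos_q_d1 = dot / (norm_q * norm_d1)
print(round(cos_q_d1, 4))
```

The printed value is about 0.632, which agrees with the solution up to the rounding of the inputs.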
In reply to Deleted user

Re: Questions about the exercise sheets

by Daniel Weller -
Hi Alexander, Thanks for your solution and the detailed explanation, I have now found my mistake!