Exercise Sheet 11

by Raphael Teller
Number of replies: 2

Hi,

I've got a question regarding exercises 2.1 and 2.2 from the newest exercise sheet.

As mentioned in the task description, every word should be treated as a distinct feature, but how does that affect the calculation of the probabilities? For example, with P(Macao | Yes), is the count of "Macao" 4 (because the actual word count is 4) or 3 (because there are 3 features with "Macao")? And what would the corresponding denominator be: 7, because there are 7 words in total for "Yes", or 3, because the actual count of "Yes" is 3?

Thank you for your help.

Kind regards,

Raphael Teller

In reply to Raphael Teller

Re: Exercise Sheet 11

by Fabian Schmidt
Hi Raphael,

I now see that the clarification does not entirely resolve the ambiguity.

> or 3 (because there are 3 features with "Macao")

Since each word is an independent feature, the numerator is 4 and the denominator is 7. The denominator is 7 because we are modelling the distribution over words (the features X, given as word counts) conditional on the label: we observe 7 words in total in the instances labelled "Yes", and 4 of those words are "Macao". Therefore, P(Macao | Yes) = 4/7.
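
If it helps, here is a minimal Python sketch of that counting rule. The documents in `docs` are made up for illustration and are not the rows from the exercise sheet; they are only chosen so that the "Yes" class has 3 instances, 7 words in total, and 4 occurrences of "Macao". The helper name `word_likelihood` is likewise hypothetical, and no smoothing is applied.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy data, NOT the rows from the exercise sheet.
# The "Yes" class has 3 instances, 7 words in total, 4 of them "Macao".
docs = [
    ("Macao Lisbon Macao", "Yes"),
    ("Macao Porto", "Yes"),
    ("Macao Faro", "Yes"),
    ("Tokyo Kyoto Macao", "No"),
]


def word_likelihood(word, label, docs):
    """Unsmoothed estimate of P(word | label) based on word occurrences."""
    counts = Counter()
    for text, y in docs:
        if y == label:
            # Every occurrence counts, so a repeated word is counted repeatedly.
            counts.update(text.split())
    total_words = sum(counts.values())  # all words seen under this label
    return Fraction(counts[word], total_words)


p = word_likelihood("Macao", "Yes", docs)
print(p)  # 4/7
```

Counting instances instead (3 "Yes" rows that contain "Macao", out of 3 "Yes" rows) would give a different quantity, which is not what is being modelled here.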

Each row is an instance (X_i, y_i), where X_i is the feature vector and y_i is the label. More broadly, if you frame the input as vectors, each X_i holds the word counts for instance i, and the length of the vector equals the number of distinct words.
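
The vector view can be sketched in the same way. Again, the three "Yes" documents below are invented for illustration and are not taken from the sheet, and `to_count_vector` is a hypothetical helper. Summing the "Macao" column and dividing by the sum of all entries reproduces the 4 and the 7 from above.

```python
from collections import Counter

# Hypothetical "Yes" instances, NOT the rows from the exercise sheet.
yes_docs = ["Macao Lisbon Macao", "Macao Porto", "Macao Faro"]

# One vector position per distinct word, in a fixed (sorted) order.
vocab = sorted({word for text in yes_docs for word in text.split()})


def to_count_vector(text, vocab):
    """Map a document to its word-count vector over the vocabulary."""
    counts = Counter(text.split())
    return [counts[word] for word in vocab]


vectors = [to_count_vector(text, vocab) for text in yes_docs]

macao_index = vocab.index("Macao")
macao_count = sum(vec[macao_index] for vec in vectors)  # numerator
total_count = sum(sum(vec) for vec in vectors)          # denominator

print(vocab)
print(vectors)
print(macao_count, total_count)
```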

Best,
Fabian