Section summary

    • Lecture content: Corpus linguistics: frequency distributions and Zipf’s law, lexical association measures, collocation, and terminology extraction; Lexico-semantic resources (WordNet); Basics of computational linguistics: morphological normalization (inflectional and derivational morphology), part-of-speech tagging, syntactic parsing; Topic modeling.

      Tutorial content: Extracting collocations (Python libraries: nltk and spacy); Analyzing lexico-semantic knowledge in WordNet (Python library: wordnet interface from nltk); Topic modeling with latent Dirichlet allocation (Python library: gensim).

      Homework: Usage scenario – Topical exploration of American “Lost Generation” literature (distant reading).
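
      Illustration: the lecture topics on frequency distributions and lexical association measures can be sketched in a few lines of plain Python. This is a minimal sketch only, not the tutorial's nltk or spacy code. The function names `rank_frequencies` and `bigram_pmi` and the toy token list are hypothetical, and pointwise mutual information (PMI) is used here as one possible association measure.

```python
import math
from collections import Counter


def rank_frequencies(tokens):
    """Return (rank, word, count) triples, most frequent first.

    Plotting count against rank on log-log axes is the usual way
    to inspect a corpus for Zipf-like behavior.
    """
    counts = Counter(tokens)
    # Sort by descending count, breaking ties alphabetically.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count)
            for rank, (word, count) in enumerate(ordered, start=1)]


def bigram_pmi(tokens, min_count=1):
    """Score adjacent word pairs by pointwise mutual information (base 2)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy token list (hypothetical); a real corpus would be tokenized first.
tokens = "the sun also rises and the sun also sets".split()
table = rank_frequencies(tokens)
pmi = bigram_pmi(tokens)
```

      On a corpus this small, PMI favors pairs of words that each occur only once, which is why a `min_count` threshold is commonly applied before ranking collocation candidates.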