    Praktikum Vision and Language (taught in English, Winter 25/26)

     

    (latest update: 2024/11/07)

    This page contains all relevant information on the Praktikum Vision and Language (Prof. Dr. Radu Timofte, Prof. Dr. Goran Glavaš) in the Winter Semester 2025/26.

    For WueStudy-related questions and other administrative matters, please check the video tutorials at:

    https://www.uni-wuerzburg.de/en/wuestudy/help/video-tutorials/

     

    Description

    The fields of Natural Language Processing and Computer Vision have both advanced greatly in recent years thanks to improvements in hardware and the huge amounts of data available on the internet. At the intersection of the two modalities, text and image, lies the multimodal vision+language field. Here, we tackle a wide range of problems, from text-based image generation to image search, caption generation, image understanding, and more.
     

    How can we tackle these tasks? What are the challenges? What are the solutions? What is the state of the art? Can we improve it further or reduce existing limitations?

    In this Praktikum Vision and Language, we work in groups of 1-3 participants on a vision+language project. Each group explores the current state of the art and devises new ways to use the models as tools, proposes improvements over the current approach, or uncovers and possibly reduces existing limitations.

    Each group is expected to prepare a written review report (10 pages) covering its project and research background, and a corresponding oral presentation (around 20-25 minutes; each member has to present a part).


    Objectives

    • Each participant will gain hands-on experience and teamwork skills, and will practice critical analysis, scientific discourse, and the preparation, writing, and presentation of work on a specific vision+language deep learning task.

     

    Prerequisites

    • Basic concepts of mathematical analysis and linear algebra.
    • Basic knowledge of machine learning and deep learning is very helpful.
    • Basic programming skills; most of the Praktikum will be in Python using the PyTorch framework.
    • The course language is English.
    • Prior attendance of the Computer Vision course and the Image Processing and Computational Photography course is optional: good to have but not mandatory.

     

    Read the course slides for the available projects and for information on what to do in the Praktikum!

     

    Locations and Dates

    • Kickoff Meeting (TBA!)
    • By the following week: group and project assignment finalized
    • Bi-weekly meetings with the supervisor to report progress and discuss problems
    • End of the lecture period: Presentations
    • By the end of the semester (based on group preferences): Deadline for Final Reports and Project Code
      • 1-2 weeks before the deadline: Presentations held in blocks (depending on course size)

     

    Contact:

    • Gregor Geigle (email: gregor.geigle@uni-wuerzburg.de)
    • Prof. Dr. Radu Timofte (email: radu.timofte@uni-wuerzburg.de)
    • Prof. Dr. Goran Glavaš (email: goran.glavas@uni-wuerzburg.de)

    For general questions related to the course, please use the Moodle forum "General"; use email only for individual problems or questions.