Michael Dorkenwald

I am a PhD student in the QUVA lab at the University of Amsterdam supervised by Yuki Asano and Cees Snoek. I am also part of the ELLIS PhD program in cooperation with Qualcomm.

Before that, I received my master's degree in physics from Heidelberg University, where I was part of Björn Ommer's research group. There, I worked on understanding human and object dynamics within generative frameworks, primarily for video synthesis. I also had the opportunity to spend a research visit at Kosta Derpanis's lab in Toronto. Furthermore, I completed an internship with the AWS Rekognition team, where I worked on self-supervised video representation learning.

Email / Google Scholar / Twitter / GitHub / LinkedIn

News

[Oct 2024] Two new preprints: one proposing a video-language benchmark, the other using LLMs as implicit optimizers for VLMs.
[Jul 2024] One paper accepted to ECCV'24 on masked video modeling.
[Jun 2024] Gave talks at TNO in The Hague and at the National Institute of Informatics in Tokyo.
[Apr 2024] Workshop organizer of "Self Supervised Learning: What is Next?" at ECCV'24.
[Apr 2024] Teaching Assistant for the Foundation Models (FoMo) course.
[Feb 2024] One paper accepted to CVPR'24 on enabling object localisation abilities in VLMs.
[Aug 2023] Started my PhD at the University of Amsterdam.
[Jul 2023] Attended the International Computer Vision Summer School in Sicily.

Research

My research focuses on the intersection of self-supervised learning, video understanding, and vision-language models.

	TVBench: Redesigning Video-Language Evaluation Daniel Cores, Michael Dorkenwald, Manuel Mucientes, Cees G. M. Snoek, Yuki M. Asano ArXiv 2024 ArXiv / Code / Hugging Face
	GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models M. Jehanzeb Mirza, Mengjie Zhao, Zhuoyuan Mao, Sivan Doveh, Wei Lin, Paul Gavrikov, Michael Dorkenwald, Shiqi Yang, Saurav Jha, Hiromi Wakaki, Yuki Mitsufuji, Horst Possegger, Rogerio Feris, Leonid Karlinsky, James Glass ArXiv 2024 ArXiv / Project Page / Code
	SIGMA: Sinkhorn-Guided Masked Video Modeling Mohammadreza Salehi, Michael Dorkenwald, Fida Mohammad Thoker, Efstratios Gavves, Cees Snoek, Yuki M. Asano ECCV 2024 ArXiv / Project Page / Code
	PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs Michael Dorkenwald, Nimrod Barazani, Cees Snoek, Yuki M. Asano CVPR 2024 ArXiv / Project Page / Code
	SCVRL: Shuffled Contrastive Video Representation Learning Michael Dorkenwald, Fanyi Xiao, Biagio Brattoli, Joseph Tighe, Davide Modolo CVPR 2022 I3D-IVU workshop ArXiv / Project Page
	iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Björn Ommer ICCV 2021 ArXiv / Project Page / Code
	Stochastic Image-to-Video Synthesis using cINNs Michael Dorkenwald, Timo Milbich, Andreas Blattmann, Robin Rombach, Konstantinos G. Derpanis, Björn Ommer CVPR 2021 ArXiv / Project Page / Code
	Behavior-Driven Synthesis of Human Dynamics Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Björn Ommer CVPR 2021 ArXiv / Project Page / Code
	Understanding Object Dynamics for Interactive Image-to-Video Synthesis Andreas Blattmann, Timo Milbich, Michael Dorkenwald, Björn Ommer CVPR 2021 ArXiv / Project Page / Code
	Unsupervised behaviour analysis and magnification (uBAM) using deep learning Biagio Brattoli, Uta Büchler, Michael Dorkenwald, Philipp Reiser, Linard Filli, Fritjof Helmchen, Anna-Sophia Wahl, Björn Ommer Nature Machine Intelligence Article / Project Page / Code
	Unsupervised Magnification of Posture Deviations across Subjects Michael Dorkenwald, Uta Büchler, Björn Ommer CVPR 2020 Article / Project Page