Scientific Image and Video Mining

This project ended in 2013; its work is now continued by Video Understanding.

Project Summary

We propose to focus on fundamental computer science research in computer vision and machine learning, and on its application to archaeology, cultural heritage preservation, and sociology. We validate our project through collaborations with researchers and practitioners in these fields.

Goals of the project

We address the following problems:

Mining historical collections of photographs and paintings with applications to archaeology and cultural heritage preservation. This includes, for example, the quantitative analysis of environmental damage to wall paintings or mosaics over time, and the cross-indexing of 19th-century paintings of Pompeii with modern photographs.

Mining TV broadcasts with applications to sociology. This includes, for example, automating the analysis and annotation of human actions and interactions in video segments to assist (and provide data for) studies of consumer trends in commercials, political event coverage in newscasts, and class- and gender-related behavior patterns in situation comedies.

For each of the problems we have in mind, indexing, searching, and analyzing photo and video collections are key issues. Recent advances in image analysis, computer vision, and machine learning offer an opportunity to automate these tasks, partly or completely (e.g., annotation of photos and videos), and to access information whose extraction from images is simply beyond human capabilities (e.g., indexing of very large image archives). To fulfil this promise, we propose to conduct fundamental research in object, scene, and activity modeling, learning, and recognition, and to validate it by developing computerized image and video mining tools in the service of the sciences and humanities.

  • Ivan Laptev
    I am affiliated with Inria Paris – Rocquencourt, which is a unit of the French National Institute for Research in Computer Science and ...

Former members:
  • Hélène Dessales Assistant professor in archaeology, Ecole Normale Supérieure
  • Adrien Gaidon Microsoft Research - Inria Joint Centre (PhD)
  • Warith Harchaoui Microsoft Research - Inria Joint Centre (PhD)
  • Bryan Russell University of Washington
  • Cordelia Schmid Inria Grenoble - Rhône-Alpes
  • Yves Ubelmann Expert Engineer

2016

Conference paper

title
Weakly-Supervised Semantic Segmentation using Motion Cues
authors
Pavel Tokmakov, Karteek Alahari, Cordelia Schmid
article
ECCV – European Conference on Computer Vision, Oct 2016, Amsterdam, Netherlands. pp.388-404, ⟨10.1007/978-3-319-46493-0_24⟩
Full text and BibTeX access
https://hal.archives-ouvertes.fr/hal-01292794/file/mcnn.pdf BibTex
title
ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization
authors
Vadim Kantorov, Maxime Oquab, Minsu Cho, Ivan Laptev
article
ECCV 2016, Oct 2016, Amsterdam, Netherlands. pp.350-365, ⟨10.1007/978-3-319-46454-1_22⟩
Full text and BibTeX access
https://hal.inria.fr/hal-01421772/file/contextlocnet_eccv2016.pdf BibTex
title
Unsupervised Learning from Narrated Instruction Videos
authors
Jean-Baptiste Alayrac, Piotr Bojanowski, Nishant Agrawal, Josef Sivic, Ivan Laptev, Simon Lacoste-Julien
article
CVPR2016 – 29th IEEE Conference on Computer Vision and Pattern Recognition, Jun 2016, Las Vegas, United States
BibTeX access
https://arxiv.org/pdf/1506.09215 BibTex

2015

Conference paper

title
P-CNN: Pose-based CNN Features for Action Recognition
authors
Guilhem Chéron, Ivan Laptev, Cordelia Schmid
article
ICCV – IEEE International Conference on Computer Vision, Dec 2015, Santiago, Chile. pp.3218-3226, ⟨10.1109/ICCV.2015.368⟩
Full text and BibTeX access
https://hal.inria.fr/hal-01187690/file/P-CNN_cheronICCV15.pdf BibTex
title
Online Object Tracking with Proposal Selection
authors
Yang Hua, Karteek Alahari, Cordelia Schmid
article
ICCV – IEEE International Conference on Computer Vision, Dec 2015, Santiago, Chile. pp.3092-3100, ⟨10.1109/ICCV.2015.354⟩
Full text and BibTeX access
https://hal.inria.fr/hal-01207196/file/paper.pdf BibTex
title
On Pairwise Cost for Multi-Object Network Flow Tracking
authors
Visesh Chari, Simon Lacoste-Julien, Ivan Laptev, Josef Sivic
article
CVPR 2015 – 28th IEEE Conference on Computer Vision and Pattern Recognition, Jun 2015, Boston, United States
BibTeX access
https://arxiv.org/pdf/1408.3304 BibTex
title
Is object localization for free? – Weakly-supervised learning with convolutional neural networks
authors
Maxime Oquab, Léon Bottou, Ivan Laptev, Josef Sivic
article
IEEE Conference on Computer Vision and Pattern Recognition, Jun 2015, Boston, United States
Full text and BibTeX access
https://hal.inria.fr/hal-01015140/file/Oquab15.pdf BibTex

2014

Journal article

title
Activity representation with motion hierarchies
authors
Adrien Gaidon, Zaid Harchaoui, Cordelia Schmid
article
International Journal of Computer Vision, Springer Verlag, 2014, 107 (3), pp.219-238. ⟨10.1007/s11263-013-0677-1⟩
Full text and BibTeX access
https://hal.inria.fr/hal-00908581/file/tracklets_journal.pdf BibTex

Conference paper

title
Category-specific video summarization
authors
Danila Potapov, Matthijs Douze, Zaid Harchaoui, Cordelia Schmid
article
ECCV – European Conference on Computer Vision, Sep 2014, Zurich, Switzerland. pp.540-555, ⟨10.1007/978-3-319-10599-4_35⟩
Full text and BibTeX access
https://hal.inria.fr/hal-01022967/file/video_summarization.pdf BibTex
title
Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks
authors
Maxime Oquab, Léon Bottou, Ivan Laptev, Josef Sivic
article
IEEE Conference on Computer Vision and Pattern Recognition, Jun 2014, Columbus, OH, United States
Full text and BibTeX access
https://hal.inria.fr/hal-00911179/file/oquab14.pdf BibTex
title
Occlusion and Motion Reasoning for Long-term Tracking
authors
Yang Hua, Karteek Alahari, Cordelia Schmid
article
ECCV – European Conference on Computer Vision, Sep 2014, Zurich, Switzerland. pp.172-187, ⟨10.1007/978-3-319-10599-4_12⟩
Full text and BibTeX access
https://hal.inria.fr/hal-01020149/file/tracking.pdf BibTex
title
Mixing Body-Part Sequences for Human Pose Estimation
authors
Anoop Cherian, Julien Mairal, Karteek Alahari, Cordelia Schmid
article
CVPR – IEEE Conference on Computer Vision & Pattern Recognition, Jun 2014, Columbus, OH, United States. pp.2361-2368, ⟨10.1109/CVPR.2014.302⟩
Full text and BibTeX access
https://hal.inria.fr/hal-00978643/file/posecvpr2014.pdf BibTex

2013

Journal article

title
Temporal Localization of Actions with Actoms
authors
Adrien Gaidon, Zaid Harchaoui, Cordelia Schmid
article
IEEE Transactions on Pattern Analysis and Machine Intelligence, Institute of Electrical and Electronics Engineers, 2013, 35 (11), pp.2782-2795. ⟨10.1109/TPAMI.2013.65⟩
Full text and BibTeX access
https://hal.inria.fr/hal-00804627/file/ASM_TPAMI_Gaidon.pdf BibTex

2012

Conference paper

title
Recognizing activities with cluster-trees of tracklets
authors
Adrien Gaidon, Zaid Harchaoui, Cordelia Schmid
article
BMVC 2012 – British Machine Vision Conference, Sep 2012, Guildford, United Kingdom. pp.30.1-30.13, ⟨10.5244/C.26.30⟩
Full text and BibTeX access
https://hal.inria.fr/hal-00722955/file/gaidon_tracklets_bmvc2012.pdf BibTex

Report

title
Temporal Localization of Actions with Actoms
authors
Adrien Gaidon, Zaid Harchaoui, Cordelia Schmid
article
[Research Report] RR-7930, INRIA. 2012
Full text and BibTeX access
https://hal.inria.fr/hal-00687312/file/RR-7930.pdf BibTex

2011

Conference paper

title
Actom Sequence Models for Efficient Action Detection
authors
Adrien Gaidon, Zaid Harchaoui, Cordelia Schmid
article
CVPR 2011 – IEEE Conference on Computer Vision & Pattern Recognition, Jun 2011, Colorado Springs, United States. pp.3201-3208, ⟨10.1109/CVPR.2011.5995646⟩
Full text and BibTeX access
https://hal.inria.fr/inria-00575217/file/1513.pdf BibTex
title
A time series kernel for action recognition
authors
Adrien Gaidon, Zaid Harchaoui, Cordelia Schmid
article
BMVC 2011 – British Machine Vision Conference, Aug 2011, Dundee, United Kingdom. pp.63.1-63.11, ⟨10.5244/C.25.63⟩
Full text and BibTeX access
https://hal.inria.fr/inria-00613089/file/kernel_time_series.pdf BibTex

2009

Conference paper

title
Mining visual actions from movies
authors
Adrien Gaidon, Marcin Marszalek, Cordelia Schmid
article
British Machine Vision Conference, British Machine Vision Association, Sep 2009, London, United Kingdom. pp.125.1-125.11, ⟨10.5244/C.23.125⟩
Full text and BibTeX access
https://hal.inria.fr/inria-00440973/file/gaidon_mining_actions_bmvc2009.pdf BibTex