{ "info": { "author": "Remi Cadene", "author_email": "remi.cadene@icloud.com", "bugtrack_url": null, "classifiers": [ "Development Status :: 3 - Alpha", "Intended Audience :: Developers", "License :: OSI Approved :: MIT License", "Programming Language :: Python :: 3.7", "Topic :: Software Development :: Build Tools" ], "description": "# MUREL: Multimodal Relational Reasoning for Visual Question Answering\n\nThe **MuRel network** is a Machine Learning model learned end-to-end to answer questions about images. First, it extracts a graphical representation of the scene where each node is an object or region. Secondly, it fuses the question representation multiple times with a MuRel cell to progressively refines visual and question interactions. Finally, it answers the question via an implicit attention mechanism and a bilinear model. Interestingly, the MuRel network doesn't include an explicit attention mechanism, usually at the core of state-of-the-art models. Its rich vectorial representation of the scene can even be leveraged to visualize the reasoning process at each step.\n\n

\n \n

\n\nThe **MuRel cell** is a novel reasoning primitive which models interactions between question and image regions. Its pairwise relational module enriches the multimodal representations of each node by taking their neighboorhood into account in the modeling.\n\n

\n \n

\n\nIn this repo, we make our datasets and models available via pip install. Also, we provide pretrained models and all the code needed to reproduce our experiments.\n\n#### Summary\n\n* [Installation](#installation)\n * [Python 3 & Anaconda](#1-python-3--anaconda)\n * [As a standalone project](#2-as-standalone-project)\n * [Download datasets](#3-download-datasets)\n * [As a python library](#2-as-a-python-library)\n* [Quick start](#quick-start)\n * [Train a model](#train-a-model)\n * [Evaluate a model](#evaluate-a-model)\n* [Reproduce results](#reproduce-results)\n * [VQA2](#vqa2-dataset)\n * [VQACP2](#vqacp2-dataset)\n * [TDIUC](#tdiuc-dataset)\n* [Pretrained models](#pretrained-models)\n* [Useful commands](#useful-commands)\n* [Citation](#citation)\n* [Poster](#poster)\n* [Authors](#authors)\n* [Acknowledgment](#acknowledgment)\n\n\n## Installation\n\n### 1. Python 3 & Anaconda\n\nWe don't provide support for python 2. We advise you to install python 3 with [Anaconda](https://www.continuum.io/downloads). Then, you can create an environment.\n\n### 2. As standalone project\n\n```\nconda create --name murel python=3.7\nsource activate murel\ngit clone --recursive https://github.com/Cadene/murel.bootstrap.pytorch.git\ncd murel.bootstrap.pytorch\npip install -r requirements.txt\n```\n\n### 3. Download datasets\n\nDownload annotations, images and features for VQA experiments:\n```\nbash murel/datasets/scripts/download_vqa2.sh\nbash murel/datasets/scripts/download_vgenome.sh\nbash murel/datasets/scripts/download_tdiuc.sh\nbash murel/datasets/scripts/download_vqacp2.sh\n```\n\n**Note:** The features have been extracted from a pretrained Faster-RCNN with caffe. We don't provide the code for pretraining or extracting features for now.\n\n### (2. As a python library)\n\nBy importing the `murel` python module, you can access datasets and models in a simple way:\n```python\nfrom murel.datasets.vqacp2 import VQACP2\nfrom murel.models.networks.murel_net import MurelNet\nfrom murel.models.networks.murel_cell import MurelCell\nfrom murel.models.networks.pairwise import Pairwise\n```\n\nTo be able to do so, you can use pip:\n```\npip install murel.bootstrap.pytorch\n```\n\nOr install from source:\n```\ngit clone https://github.com/Cadene/murel.bootstrap.pytorch.git\npython setup.py install\n```\n\n**Note:** This repo is built on top of [block.bootstrap.pytorch](https://github.com/Cadene/block.bootstrap.pytorch). We import VQA2, TDIUC, VGenome from the latter.\n\n\n## Quick start\n\n### Train a model\n\nThe [boostrap/run.py](https://github.com/Cadene/bootstrap.pytorch/blob/master/bootstrap/run.py) file load the options contained in a yaml file, create the corresponding experiment directory and start the training procedure. For instance, you can train our best model on VQA2 by running:\n```\npython -m bootstrap.run -o murel/options/vqa2/murel.yaml\n```\nThen, several files are going to be created in `logs/vqa2/murel`:\n- [options.yaml](https://github.com/Cadene/block.bootstrap.pytorch/blob/master/assets/logs/vrd/block/options.yaml) (copy of options)\n- [logs.txt](https://github.com/Cadene/block.bootstrap.pytorch/blob/master/assets/logs/vrd/block/logs.txt) (history of print)\n- [logs.json](https://github.com/Cadene/block.bootstrap.pytorch/blob/master/assets/logs/vrd/block/logs.json) (batchs and epochs statistics)\n- [view.html](http://htmlpreview.github.io/?https://raw.githubusercontent.com/Cadene/block.bootstrap.pytorch/master/assets/logs/vrd/block/view.html?token=AEdvLlDSYaSn3Hsr7gO5sDBxeyuKNQhEks5cTF6-wA%3D%3D) (learning curves)\n- ckpt_last_engine.pth.tar (checkpoints of last epoch)\n- ckpt_last_model.pth.tar\n- ckpt_last_optimizer.pth.tar\n- ckpt_best_eval_epoch.predicate.R_50_engine.pth.tar (checkpoints of best epoch)\n- ckpt_best_eval_epoch.predicate.R_50_model.pth.tar\n- ckpt_best_eval_epoch.predicate.R_50_optimizer.pth.tar\n\nMany options are available in the [options directory](https://github.com/Cadene/murel.bootstrap.pytorch/blob/master/murel/options).\n\n### Evaluate a model\n\nAt the end of the training procedure, you can evaluate your model on the testing set. In this example, [boostrap/run.py](https://github.com/Cadene/bootstrap.pytorch/blob/master/bootstrap/run.py) load the options from your experiment directory, resume the best checkpoint on the validation set and start an evaluation on the testing set instead of the validation set while skipping the training set (train_split is empty). Thanks to `--misc.logs_name`, the logs will be written in the new `logs_predicate.txt` and `logs_predicate.json` files, instead of being appended to the `logs.txt` and `logs.json` files.\n```\npython -m bootstrap.run \\\n-o logs/vqa2/murel/options.yaml \\\n--exp.resume best_accuracy_top1 \\\n--dataset.train_split \\\n--dataset.eval_split test \\\n--misc.logs_name predicate\n```\n\n## Reproduce results\n\n### VQA2 dataset\n\n#### Training and evaluation (train/val)\n\nWe use this simple setup to tune our hyperparameters on the valset.\n\n```\npython -m bootstrap.run \\\n-o murel/options/vqa2/murel.yaml \\\n--exp.dir logs/vqa2/murel\n```\n\n#### Training and evaluation (train+val/val/test)\n\nThis heavier setup allows us to train a model on 95% of the concatenation of train and val sets, and to evaluate it on the 5% rest. Then we extract the predictions of our best checkpoint on the testset. Finally, we submit a json file on the EvalAI web site.\n\n```\npython -m bootstrap.run \\\n-o murel/options/vqa2/murel.yaml \\\n--exp.dir logs/vqa2/murel_trainval \\\n--dataset.proc_split trainval\n\npython -m bootstrap.run \\\n-o logs/vqa2/murel_trainval/options.yaml \\\n--exp.resume best_eval_epoch.accuracy_top1 \\\n--dataset.train_split \\\n--dataset.eval_split test \\\n--misc.logs_name test\n```\n\n#### Training and evaluation (train+val+vg/val/test)\n\nSame, but we add pairs from the VisualGenome dataset.\n\n```\npython -m bootstrap.run \\\n-o murel/options/vqa2/murel.yaml \\\n--exp.dir logs/vqa2/murel_trainval_vg \\\n--dataset.proc_split trainval \\\n--dataset.vg True\n\npython -m bootstrap.run \\\n-o logs/vqa2/murel_trainval_vg/options.yaml \\\n--exp.resume best_eval_epoch.accuracy_top1 \\\n--dataset.train_split \\\n--dataset.eval_split test \\\n--misc.logs_name test\n```\n\n#### Compare experiments on valset\n\nYou can compare experiments by displaying their best metrics on the valset.\n\n```\npython -m murel.compare_vqa_val -d logs/vqa2/murel logs/vqa2/attention\n```\n\n#### Submit predictions on EvalAI\n\nIt is not possible to automaticaly compute the accuracies on the testset. You need to submit a json file on the [EvalAI platform](http://evalai.cloudcv.org/web/challenges/challenge-page/80/my-submission). The evaluation step on the testset creates the json file that contains the prediction of your model on the full testset. For instance: `logs/vqa2/murel_trainval_vg/results/test/epoch,19/OpenEnded_mscoco_test2015_model_results.json`. To get the accuracies on testdev or test sets, you must submit this file.\n\n\n### VQACP2 dataset\n\n#### Training and evaluation (train/val)\n\n```\npython -m bootstrap.run \\\n-o murel/options/vqacp2/murel.yaml \\\n--exp.dir logs/vqacp2/murel\n```\n\n#### Compare experiments on valset\n\n```\npython -m murel.compare_vqa_val -d logs/vqacp2/murel logs/vqacp2/attention\n```\n\n### TDIUC dataset\n\n#### Training and evaluation (train/val/test)\n\nThe full training set is split into a trainset and a valset. At the end of the training, we evaluate our best checkpoint on the testset. The TDIUC metrics are computed and displayed at the end of each epoch. They are also stored in `logs.json` and `logs_test.json`.\n\n```\npython -m bootstrap.run \\\n-o murel/options/tdiuc/murel.yaml \\\n--exp.dir logs/tdiuc/murel\n\npython -m bootstrap.run \\\n-o logs/tdiuc/murel/options.yaml \\\n--exp.resume best_eval_epoch.accuracy_top1 \\\n--dataset.train_split \\\n--dataset.eval_split test \\\n--misc.logs_name test\n```\n\n#### Compare experiments\n\nYou can compare experiments by displaying their best metrics on the valset or testset.\n\n```\npython -m murel.compare_tdiuc_val -d logs/tdiuc/murel logs/tdiuc/attention\npython -m murel.compare_tdiuc_test -d logs/tdiuc/murel logs/tdiuc/attention\n```\n\n## Pretrained models\n\n```\nTODO\n```\n\n\n## Useful commands\n\n### Use tensorboard instead of plotly\n\nInstead of creating a `view.html` file, a tensorboard file will be created:\n```\npython -m bootstrap.run -o murel/options/vqa2/murel.yaml \\\n--view.name tensorboard\n```\n\n```\ntensorboard --logdir=logs/vqa2\n```\n\nYou can use plotly and tensorboard at the same time by updating the yaml file like [this one](https://github.com/Cadene/bootstrap.pytorch/blob/master/bootstrap/options/mnist_plotly_tensorboard.yaml#L38).\n\n\n### Use a specific GPU\n\nFor a specific experiment:\n```\nCUDA_VISIBLE_DEVICES=0 python -m boostrap.run -o murel/options/vqa2/murel.yaml\n```\n\nFor the current terminal session:\n```\nexport CUDA_VISIBLE_DEVICES=0\n```\n\n### Overwrite an option\n\nThe boostrap.pytorch framework makes it easy to overwrite a hyperparameter. In this example, we run an experiment with a non-default learning rate. Thus, I also overwrite the experiment directory path:\n```\npython -m bootstrap.run -o murel/options/vqa2/murel.yaml \\\n--optimizer.lr 0.0003 \\\n--exp.dir logs/vqa2/murel_lr,0.0003\n```\n\n### Resume training\n\nIf a problem occurs, it is easy to resume the last epoch by specifying the options file from the experiment directory while overwritting the `exp.resume` option (default is None):\n```\npython -m bootstrap.run -o logs/vqa2/murel/options.yaml \\\n--exp.resume last\n```\n\n### Web API\n\n```\nTODO\n```\n\n### Extract your own image features\n\n```\nTODO\n```\n\n\n## Citation\n\n```\n@InProceedings{Cadene_2019_CVPR,\n author = {Cadene, Remi and Ben-Younes, Hedi and Thome, Nicolas and Cord, Matthieu},\n title = {MUREL: {M}ultimodal {R}elational {R}easoning for {V}isual {Q}uestion {A}nswering},\n booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition {CVPR}},\n year = {2019},\n url = {http://remicadene.com/pdfs/paper_cvpr2019.pdf}\n}\n```\n\n## Poster\n\n```\nTODO\n```\n\n## Authors\n\nThis code was made available by [Hedi Ben-Younes](https://twitter.com/labegne) (Sorbonne-Heuritech), [Remi Cadene](http://remicadene.com) (Sorbonne), [Matthieu Cord](http://webia.lip6.fr/~cord) (Sorbonne) and [Nicolas Thome](http://cedric.cnam.fr/~thomen/) (CNAM).\n\n## Acknowledgment\n\nSpecial thanks to the authors of [VQA2](TODO), [TDIUC](TODO), [VisualGenome](TODO) and [VQACP2](TODO), the datasets used in this research project.\n\n\n", "description_content_type": "", "docs_url": null, "download_url": "", "downloads": { "last_day": -1, "last_month": -1, "last_week": -1 }, "home_page": "https://github.com/cadene/murel.bootstrap.pytorch", "keywords": "pytorch murel block vqa tdiuc visual question answering visual relationship detection relation bootstrap deep learning aaai cvpr", "license": "", "maintainer": "", "maintainer_email": "", "name": "murel.bootstrap.pytorch", "package_url": "https://pypi.org/project/murel.bootstrap.pytorch/", "platform": "", "project_url": "https://pypi.org/project/murel.bootstrap.pytorch/", "project_urls": { "Homepage": "https://github.com/cadene/murel.bootstrap.pytorch" }, "release_url": "https://pypi.org/project/murel.bootstrap.pytorch/0.0.0/", "requires_dist": [ "bootstrap.pytorch", "skipthoughts", "pretrainedmodels", "opencv-python", "block.bootstrap.pytorch", "pytest; extra == 'test'" ], "requires_python": "", "summary": "MUREL: Multimodal Relational Reasoning for Visual Question Answering", "version": "0.0.0" }, "last_serial": 4868972, "releases": { "0.0.0": [ { "comment_text": "", "digests": { "md5": "9c005fd63b38edf640cf1d4777665139", "sha256": "fcf72221c0c420862af3b1aa9a13f1430632c26321232a4e142ba7c23fe0eceb" }, "downloads": -1, "filename": "murel.bootstrap.pytorch-0.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "9c005fd63b38edf640cf1d4777665139", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 97108, "upload_time": "2019-02-26T09:45:49", "url": "https://files.pythonhosted.org/packages/f5/af/3b27ded2accbad3b69dff4d46f9cde0ff3e74cde4af14ec29619ce7983df/murel.bootstrap.pytorch-0.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3a685ac327662f02da460e4352d19f84", "sha256": "0a81c58b51b70f4a06da29efb018cce50c4c5ce317c09271a0d49949ec29f677" }, "downloads": -1, "filename": "murel.bootstrap.pytorch-0.0.0.tar.gz", "has_sig": false, "md5_digest": "3a685ac327662f02da460e4352d19f84", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 71446, "upload_time": "2019-02-26T09:45:52", "url": "https://files.pythonhosted.org/packages/3c/19/d11d42d3fb330eff97a189348d3de4c9874dab588ef7556cdd8d1998140e/murel.bootstrap.pytorch-0.0.0.tar.gz" } ] }, "urls": [ { "comment_text": "", "digests": { "md5": "9c005fd63b38edf640cf1d4777665139", "sha256": "fcf72221c0c420862af3b1aa9a13f1430632c26321232a4e142ba7c23fe0eceb" }, "downloads": -1, "filename": "murel.bootstrap.pytorch-0.0.0-py3-none-any.whl", "has_sig": false, "md5_digest": "9c005fd63b38edf640cf1d4777665139", "packagetype": "bdist_wheel", "python_version": "py3", "requires_python": null, "size": 97108, "upload_time": "2019-02-26T09:45:49", "url": "https://files.pythonhosted.org/packages/f5/af/3b27ded2accbad3b69dff4d46f9cde0ff3e74cde4af14ec29619ce7983df/murel.bootstrap.pytorch-0.0.0-py3-none-any.whl" }, { "comment_text": "", "digests": { "md5": "3a685ac327662f02da460e4352d19f84", "sha256": "0a81c58b51b70f4a06da29efb018cce50c4c5ce317c09271a0d49949ec29f677" }, "downloads": -1, "filename": "murel.bootstrap.pytorch-0.0.0.tar.gz", "has_sig": false, "md5_digest": "3a685ac327662f02da460e4352d19f84", "packagetype": "sdist", "python_version": "source", "requires_python": null, "size": 71446, "upload_time": "2019-02-26T09:45:52", "url": "https://files.pythonhosted.org/packages/3c/19/d11d42d3fb330eff97a189348d3de4c9874dab588ef7556cdd8d1998140e/murel.bootstrap.pytorch-0.0.0.tar.gz" } ] }