Commit f3024690 authored by Jaeyoun Kim, committed by GitHub

Update README.md (#8337)

Created a table to organize models
Added paper names
Added maintainer information
parent 70a3d96e
# TensorFlow Research Models
This folder contains machine learning models implemented by researchers in This folder contains machine learning models implemented by researchers in [TensorFlow](https://tensorflow.org).
[TensorFlow](https://tensorflow.org). The models are maintained by their
respective authors. To propose a model for inclusion, please submit a pull
request.
**Note: some research models are stale and have not updated to the latest The research models are maintained by their respective authors.
TensorFlow yet. If users have trouble with TF 2.x for research models,
please consider TF 1.15.**
## Models **Note: Some research models are stale and have not updated to the latest TensorFlow 2 yet.**
- [adversarial_crypto](adversarial_crypto): protecting communications with ---
adversarial neural cryptography.
- [adversarial_text](adversarial_text): semi-supervised sequence learning with ## Frameworks / APIs with Models
adversarial training. | Folder | Framework | Description | Maintainer(s) |
- [attention_ocr](attention_ocr): a model for real-world image text |--------|-----------|-------------|---------------|
extraction. | [object_detection](object_detection) | TensorFlow Object Detection API | A framework that makes it easy to construct, train and deploy object detection models<br/> | jch1, tombstone, derekjchow, jesu9, dreamdragon, pkulzc |
- [audioset](audioset): Models and supporting code for use with | [slim](slim) | TensorFlow-Slim Image Classification Model Library | A lightweight high-level API of TensorFlow for defining, training and evaluating image classification models <br/>• Inception V1/V2/V3/V4<br/>• Inception-ResNet-v2<br/>• ResNet V1/V2<br/>• VGG 16/19<br/>• MobileNet V1/V2/V3<br/>• NASNet-A_Mobile/Large<br/>• PNASNet-5_Large/Mobile | sguada, nathansilberman |
[AudioSet](http://g.co/audioset).
- [autoencoder](autoencoder): various autoencoders. ---
- [brain_coder](brain_coder): Program synthesis with reinforcement learning.
- [cognitive_mapping_and_planning](cognitive_mapping_and_planning): ## Models / Implementations
implementation of a spatial memory based mapping and planning architecture
for visual navigation. | Folder | Paper(s) | Description | Maintainer(s) |
- [compression](compression): compressing and decompressing images using a |--------|----------|-------------|---------------|
pre-trained Residual GRU network. | [adv_imagenet<br />_models](adv_imagenet_models) | [1] [Adversarial Machine Learning at Scale](https://arxiv.org/abs/1611.01236)<br/>[2] [Ensemble Adversarial Training: Attacks and Defenses](https://arxiv.org/abs/1705.07204) | Adversarially trained ImageNet models | alexeykurakin |
- [cvt_text](cvt_text): semi-supervised sequence learning with cross-view | [adversarial_crypto](adversarial_crypto) | [Learning to Protect Communications with Adversarial Neural Cryptography](https://arxiv.org/abs/1610.06918) | Code to train encoder/decoder/adversary network triplets and evaluate their effectiveness on randomly generated input and key pairs | dave-andersen |
training. | [adversarial<br />_logit_pairing](adversarial_logit_pairing) | [Adversarial Logit Pairing](https://arxiv.org/abs/1803.06373) | Implementation of Adversarial logit pairing paper as well as few models pre-trained on ImageNet and Tiny ImageNet | alexeykurakin |
- [deep_contextual_bandits](deep_contextual_bandits): code for a variety of contextual bandits algorithms using deep neural networks and Thompson sampling. | [adversarial_text](adversarial_text) | [1] [Adversarial Training Methods for Semi-Supervised Text Classification](https://arxiv.org/abs/1605.07725)<br/>[2] [Semi-supervised Sequence Learning](https://arxiv.org/abs/1511.01432) | Adversarial Training Methods for Semi-Supervised Text Classification| rsepassi, a-dai |
- [deep_speech](deep_speech): automatic speech recognition. | [astronet](astronet) | AstroNet<br/>[1] [Identifying Exoplanets with Deep Learning: A Five Planet Resonant Chain around Kepler-80 and an Eighth Planet around Kepler-90](https://arxiv.org/abs/1712.05044) | A neural network for identifying exoplanets in light curves | cshallue|
- [deeplab](deeplab): deep labeling for semantic image segmentation. | [attention_ocr](attention_ocr) | [Attention-based Extraction of Structured Information from Street View Imagery](https://arxiv.org/abs/1704.03549) | | alexgorban |
- [delf](delf): deep local features for image matching and retrieval. | [audioset](audioset) | Models for AudioSet: A Large Scale Dataset of Audio Events | | plakal, dpwe|
- [domain_adaptation](domain_adaptation): domain separation networks. | [autoaugment](autoaugment) | [1] [AutoAugment](https://arxiv.org/abs/1805.09501)<br/>[2] [Wide Residual Networks](https://arxiv.org/abs/1605.07146)<br/>[3] [Shake-Shake regularization](https://arxiv.org/abs/1705.07485)<br/>[4] [ShakeDrop Regularization for Deep Residual Learning](https://arxiv.org/abs/1802.02375) | Train Wide-ResNet, Shake-Shake and ShakeDrop models on CIFAR-10 and CIFAR-100 dataset with AutoAugment | barretzoph|
- [fivo](fivo): filtering variational objectives for training generative | [autoencoder](autoencoder) | Various autoencoders | | snurkabill|
sequence models. | [brain_coder](brain_coder) | [Program synthesis with reinforcement learning](https://arxiv.org/abs/1801.03526) | Program synthesis with reinforcement learning | danabo |
- [im2txt](im2txt): image-to-text neural network for image captioning. | [cognitive_mapping<br />_and_planning](cognitive_mapping_and_planning) | [Cognitive Mapping and Planning for Visual Navigation](https://arxiv.org/abs/1702.03920) | Implementation of a spatial memory based mapping and planning architecture for visual navigation | s-gupta |
- [inception](inception): deep convolutional networks for computer vision. | [compression](compression) | [Full Resolution Image Compression with Recurrent Neural Networks](https://arxiv.org/abs/1608.05148) | | nmjohn |
- [keypointnet](keypointnet): discovery of latent 3D keypoints via end-to-end | [cvt_text](cvt_text) | [Semi-supervised sequence learning with cross-view training](https://arxiv.org/abs/1809.08370) | | clarkkev, lmthang|
geometric reasoning [[demo](https://keypointnet.github.io/)]. | [deep_contextual<br />_bandits](deep_contextual_bandits) | [Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling](https://arxiv.org/abs/1802.09127) | | rikel |
- [learning_to_remember_rare_events](learning_to_remember_rare_events): a | [deep_speech](deep_speech) | [Deep Speech 2](https://arxiv.org/abs/1512.02595) | End-to-End Speech Recognition in English and Mandarin | |
large-scale life-long memory module for use in deep learning. | [deeplab](deeplab) | [1] [DeepLabv1](https://arxiv.org/abs/1412.7062)<br/>[2] [DeepLabv2](https://arxiv.org/abs/1606.00915)<br/>[3] [DeepLabv3](https://arxiv.org/abs/1706.05587)<br/>[4] [DeepLabv3+](https://arxiv.org/abs/1802.02611) | DeepLab models for semantic image segmentation | aquariusjay, yknzhu, gpapan|
- [learning_unsupervised_learning](learning_unsupervised_learning): a | [delf](delf) | [1] [Large-Scale Image Retrieval with Attentive Deep Local Features](https://arxiv.org/abs/1612.06321) <br/>[2] [Detect-to-Retrieve](https://arxiv.org/abs/1812.01584) | DELF: DEep Local Features | andrefaraujo |
meta-learned unsupervised learning update rule. | [domain_adaptation](domain_adaptation) | [1] [Domain Separation Networks](https://arxiv.org/abs/1608.06019) <br/>[2] [Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks](https://arxiv.org/abs/1612.05424) | Code used for two domain adaptation papers| bousmalis, dmrd|
- [lexnet_nc](lexnet_nc): a distributed model for noun compound relationship | [efficient-hrl](efficient-hrl) | [1] [Data-Efficient Hierarchical Reinforcement Learning](https://arxiv.org/abs/1805.08296)<br/>[2] [Near-Optimal Representation Learning for Hierarchical Reinforcement Learning](https://arxiv.org/abs/1810.01257) | Code for performing hierarchical reinforcement learning| ofirnachum |
classification. | [feelvos](feelvos)| [FEELVOS](https://arxiv.org/abs/1902.09513) | Fast End-to-End Embedding Learning for Video Object Segmentation ||
- [lfads](lfads): sequential variational autoencoder for analyzing | [fivo](fivo)| [Filtering variational objectives for training generative sequence models](https://arxiv.org/abs/1705.09279) | | dieterichlawson|
neuroscience data. | [global_objectives](global_objectives) | [Scalable Learning of Non-Decomposable Objectives](https://arxiv.org/abs/1608.04802) | TensorFlow loss functions that optimize directly for a variety of objectives including AUC, recall at precision, and more  | mackeya-google|
- [lm_1b](lm_1b): language modeling on the one billion word benchmark. | [im2txt](im2txt) | [Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge](https://arxiv.org/abs/1609.06647) | Image-to-text neural network for image captioning| cshallue|
- [lm_commonsense](lm_commonsense): commonsense reasoning using language models. | [inception](inception) | [Rethinking the Inception Architecture for Computer Vision](https://arxiv.org/abs/1512.00567) | Deep convolutional networks for computer vision | shlens, vincentvanhoucke|
- [maskgan](maskgan): text generation with GANs. | [keypointnet](keypointnet) | [KeypointNet](https://arxiv.org/abs/1807.03146) | Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning | mnorouzi|
- [namignizer](namignizer): recognize and generate names. | [learned_optimizer](learned_optimizer) | [Learned Optimizers that Scale and Generalize](https://arxiv.org/abs/1703.04813) | | olganw, nirum |
- [neural_gpu](neural_gpu): highly parallel neural computer. | [learning_to<br />_remember<br />_rare_events](learning_to_remember_rare_events) | [Learning to Remember Rare Events](https://arxiv.org/abs/1703.03129) | A large-scale life-long memory module for use in deep learning | lukaszkaiser, ofirnachum|
- [neural_programmer](neural_programmer): neural network augmented with logic | [learning<br />_unsupervised<br />_learning](learning_unsupervised_learning) | [Meta-Learning Update Rules for Unsupervised Representation Learning](https://arxiv.org/abs/1804.00222) | A meta-learned unsupervised learning update rule| lukemetz, nirum|
and mathematic operations. | [lexnet_nc](lexnet_nc) | LexNET | Noun Compound Relation Classification | vered1986, waterson |
- [next_frame_prediction](next_frame_prediction): probabilistic future frame | [lfads](lfads) | [LFADS - Latent Factor Analysis via Dynamical Systems](https://doi.org/10.1101/152884) | Sequential variational autoencoder for analyzing neuroscience data| jazcollins, susillo |
synthesis via cross convolutional networks. | [lm_1b](lm_1b) | [Exploring the Limits of Language Modeling](https://arxiv.org/abs/1602.02410) | Language modeling on the one billion word benchmark | oriolvinyals, panyx0718 |
- [object_detection](object_detection): localizing and identifying multiple | [lm_commonsense](lm_commonsense) | [A Simple Method for Commonsense Reasoning](https://arxiv.org/abs/1806.02847) | Commonsense reasoning using language models | thtrieu |
objects in a single image. | [lstm_object_detection](lstm_object_detection) | [Mobile Video Object Detection with Temporally-Aware Feature Maps](https://arxiv.org/abs/1711.06368) | | dreamdragon, masonliuw, yinxiaoli, yongzhe2160 |
- [pcl_rl](pcl_rl): code for several reinforcement learning algorithms, | [marco](marco) | [Classification of crystallization outcomes using deep convolutional neural networks](https://arxiv.org/abs/1803.10342) | | vincentvanhoucke |
including Path Consistency Learning. | [maskgan](maskgan)| [MaskGAN: Better Text Generation via Filling in the______](https://arxiv.org/abs/1801.07736) | Text generation with GANs | a-dai|
- [ptn](ptn): perspective transformer nets for 3D object reconstruction. | [namignizer](namignizer)| Namignizer | Recognize and generate names | knathanieltucker |
- [marco](marco): automating the evaluation of crystallization experiments. | [neural_gpu](neural_gpu)| [Neural GPUs Learn Algorithms](https://arxiv.org/abs/1511.08228) | Highly parallel neural computer | lukaszkaiser |
- [qa_kg](qa_kg): module networks for question answering on knowledge graphs. | [neural_programmer](neural_programmer) | [Learning a Natural Language Interface with Neural Programmer](https://arxiv.org/abs/1611.08945) | Neural network augmented with logic and mathematic operations| arvind2505 |
- [real_nvp](real_nvp): density estimation using real-valued non-volume | [next_frame<br />_prediction](next_frame_prediction) | [Visual Dynamics](https://arxiv.org/abs/1607.02586) | Probabilistic Future Frame Synthesis via Cross Convolutional Networks| panyx0718 |
preserving (real NVP) transformations. | [pcl_rl](pcl_rl) | [1] [Improving Policy Gradient by Exploring Under-appreciated Rewards](https://arxiv.org/abs/1611.09321)<br/>[2] [Bridging the Gap Between Value and Policy Based Reinforcement Learning](https://arxiv.org/abs/1702.08892)<br/>[3] [Trust-PCL: An Off-Policy Trust Region Method for Continuous Control](https://arxiv.org/abs/1707.01891) | Code for several reinforcement learning algorithms, including Path Consistency Learning| ofirnachum |
- [rebar](rebar): low-variance, unbiased gradient estimates for discrete | [ptn](ptn) | [Perspective Transformer Nets](https://arxiv.org/abs/1612.00814) | Learning Single-View 3D Object Reconstruction without 3D Supervision | xcyan, arkanath, hellojas, honglaklee |
latent variable models. | [qa_kg](qa_kg) | [Learning to Reason](https://arxiv.org/abs/1704.05526) | End-to-End Module Networks for Visual Question Answering | yuyuz|
- [seq2species](seq2species): deep learning solution for read-level taxonomic | [real_nvp](real_nvp) | [Density estimation using Real NVP](https://arxiv.org/abs/1605.08803) | | laurent-dinh |
classification. | [rebar](rebar) | [REBAR](https://arxiv.org/abs/1703.07370) | Low-variance, unbiased gradient estimates for discrete latent variable models | gjtucker|
- [skip_thoughts](skip_thoughts): recurrent neural network sentence-to-vector | [sentiment<br />_analysis](sentiment_analysis)| [Effective Use of Word Order for Text Categorization with Convolutional Neural Networks](https://arxiv.org/abs/1412.1058) | ||
encoder. | [seq2species](seq2species) | [Seq2Species: A deep learning approach to pattern recognition for short DNA sequences](https://doi.org/10.1101/353474) | Neural Network Models for Species Classification| apbusia, depristo |
- [slim](slim): image classification models in TF-Slim. | [skip_thoughts](skip_thoughts) | [Skip-Thought Vectors](https://arxiv.org/abs/1506.06726) | Recurrent neural network sentence-to-vector encoder | cshallue|
- [street](street): identify the name of a street (in France) from an image | [steve](steve) | [Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion](https://arxiv.org/abs/1807.01675) | A hybrid model-based/model-free reinforcement learning algorithm for sample-efficient continuous control | buckman-google|
using a Deep RNN. | [street](street) | [End-to-End Interpretation of the French Street Name Signs Dataset](https://arxiv.org/abs/1702.03970) | Identify the name of a street (in France) from an image using a Deep RNN| theraysmith|
- [struct2depth](struct2depth): unsupervised learning of depth and ego-motion. | [struct2depth](struct2depth)| [Depth Prediction Without the Sensors: Leveraging Structure for Unsupervised Learning from Monocular Videos](https://arxiv.org/abs/1811.06152) | Unsupervised learning of depth and ego-motion| aneliaangelova|
- [swivel](swivel): the Swivel algorithm for generating word embeddings. | [swivel](swivel) | [Swivel: Improving Embeddings by Noticing What's Missing](https://arxiv.org/abs/1602.02215) | The Swivel algorithm for generating word embeddings | waterson|
- [tcn](tcn): Self-supervised representation learning from multi-view video. | [tcn](tcn) | [Time-Contrastive Networks: Self-Supervised Learning from Video](https://arxiv.org/abs/1704.06888) | Self-supervised representation learning from multi-view video| coreylynch, sermanet |
- [textsum](textsum): sequence-to-sequence with attention model for text | [textsum](textsum)| Sequence-to-sequence with attention model for text summarization | | panyx0718, peterjliu |
summarization. | [transformer](transformer) | [Spatial Transformer Network](https://arxiv.org/abs/1506.02025) | Spatial transformer network that allows the spatial manipulation of data within the network| daviddao|
- [transformer](transformer): spatial transformer network, which allows the | [vid2depth](vid2depth) | [Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints](https://arxiv.org/abs/1802.05522) | Learning depth and ego-motion unsupervised from raw monocular video | rezama |
spatial manipulation of data within the network. | [video<br />_prediction](video_prediction) | [Unsupervised Learning for Physical Interaction through Video Prediction](https://arxiv.org/abs/1605.07157) | Predicting future video frames with neural advection| cbfinn |
- [vid2depth](vid2depth): learning depth and ego-motion unsupervised from
raw monocular video. ---
- [video_prediction](video_prediction): predicting future video frames with
neural advection. ## Contributions
If you want to contribute a new model, please submit a pull request.