# Gated recurrent unit

**Gated recurrent units** (**GRUs**) are a gating mechanism in recurrent neural networks, introduced in 2014 by Kyunghyun Cho et al.^{[1]} Their performance on polyphonic music modeling and speech signal modeling was found to be similar to that of long short-term memory (LSTM), and GRUs have been reported to perform better on some smaller datasets.^{[2]}

They have fewer parameters than LSTM, as they lack an output gate.^{[3]}
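The difference in size can be illustrated with a rough count. The sketch below assumes that each gate or candidate block holds one input matrix, one recurrent matrix and one bias vector, that a fully gated GRU has three such blocks (update gate, reset gate, candidate) and that a standard LSTM has four (input, forget and output gates, plus the cell candidate). The function names are illustrative only, and library implementations that keep extra bias vectors will report slightly different totals.

```python
def block_param_count(input_size: int, hidden_size: int) -> int:
    """Parameters in one block: W (hidden x input), U (hidden x hidden), b (hidden)."""
    return hidden_size * input_size + hidden_size * hidden_size + hidden_size


def gru_param_count(input_size: int, hidden_size: int) -> int:
    """Fully gated GRU: update gate, reset gate and candidate (3 blocks)."""
    return 3 * block_param_count(input_size, hidden_size)


def lstm_param_count(input_size: int, hidden_size: int) -> int:
    """Standard LSTM: input, forget, output gates and cell candidate (4 blocks)."""
    return 4 * block_param_count(input_size, hidden_size)


if __name__ == "__main__":
    m, n = 64, 128  # example input and hidden sizes
    print("GRU :", gru_param_count(m, n))
    print("LSTM:", lstm_param_count(m, n))
```

Under these assumptions a GRU layer has three quarters as many parameters as an LSTM layer of the same input and hidden size.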

## Architecture

There are several variations on the full gated unit, with gating done using the previous hidden state and the bias in various combinations, as well as a simplified form called the minimal gated unit.

In the following, the operator $\odot$ denotes the Hadamard (element-wise) product.

### Fully gated unit

Initially, for $t = 0$, the output vector is $h_0 = 0$.
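For $t \geq 1$, the fully gated unit is commonly written as follows, with $x_t$ the input vector, $h_t$ the output vector, $z_t$ and $r_t$ the update and reset gate vectors, and $\sigma_g$, $\sigma_h$ the gate and candidate activation functions. This is the usual textbook formulation rather than a transcription of a particular source: the convention shown, in which $z_t$ weights the new candidate, matches Chung et al.,^{[2]} whereas Cho et al.^{[1]} write the same interpolation with the roles of $z_t$ and $1 - z_t$ exchanged.

$$
\begin{aligned}
z_t &= \sigma_g(W_z x_t + U_z h_{t-1} + b_z) \\
r_t &= \sigma_g(W_r x_t + U_r h_{t-1} + b_r) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \sigma_h\bigl(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\bigr)
\end{aligned}
$$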

Variables

- $x_t$: input vector
- $h_t$: output vector
- $z_t$: update gate vector
- $r_t$: reset gate vector
- $W$, $U$ and $b$: parameter matrices and vector

Activation functions

- $\sigma_g$: The original is a sigmoid function.
- $\sigma_h$: The original is a hyperbolic tangent.

Alternative activation functions are possible, provided that $\sigma_g(x) \in [0, 1]$.
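A single time step of the fully gated unit can be sketched in plain Python as below. This is a minimal illustration, not a reference implementation: it assumes the convention in which the update gate weights the new candidate, uses a sigmoid for the gates and a hyperbolic tangent for the candidate, and stores parameters in a dictionary whose key names (`W_z`, `U_z`, `b_z` and so on) are chosen here for readability.

```python
import math
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(v: Vector) -> Vector:
    """Element-wise logistic function, written via tanh to avoid overflow."""
    return [0.5 * (1.0 + math.tanh(0.5 * a)) for a in v]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Matrix-vector product."""
    return [sum(w * a for w, a in zip(row, v)) for row in m]


def add(*vs: Vector) -> Vector:
    """Element-wise sum of equally sized vectors."""
    return [sum(items) for items in zip(*vs)]


def hadamard(a: Vector, b: Vector) -> Vector:
    """Element-wise (Hadamard) product."""
    return [p * q for p, q in zip(a, b)]


def gru_step(x_t: Vector, h_prev: Vector, p: Dict[str, list]) -> Vector:
    """Compute h_t from x_t and h_{t-1} for a fully gated unit."""
    z = sigmoid(add(matvec(p["W_z"], x_t), matvec(p["U_z"], h_prev), p["b_z"]))
    r = sigmoid(add(matvec(p["W_r"], x_t), matvec(p["U_r"], h_prev), p["b_r"]))
    # Candidate state: the reset gate scales h_{t-1} before the recurrent matrix.
    cand = [math.tanh(a) for a in add(
        matvec(p["W_h"], x_t), matvec(p["U_h"], hadamard(r, h_prev)), p["b_h"])]
    # Interpolate between the previous state and the candidate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, cand)]


def run_gru(xs: List[Vector], p: Dict[str, list], hidden_size: int) -> Vector:
    """Process a sequence starting from h_0 = 0 and return the final state."""
    h = [0.0] * hidden_size
    for x_t in xs:
        h = gru_step(x_t, h, p)
    return h
```

Because the gate values lie in $[0, 1]$ and the candidate lies in $[-1, 1]$, each component of the state stays within $[-1, 1]$ when the sequence starts from the zero vector.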

Alternate forms can be created by changing how $z_t$ and $r_t$ are computed:^{[4]}

- Type 1, each gate depends only on the previous hidden state and the bias.
- Type 2, each gate depends only on the previous hidden state.
- Type 3, each gate is computed using only the bias.
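The three variants differ only in which terms enter a gate's pre-activation. The sketch below shows this for one gate, assuming a sigmoid gate activation. The function `gate` and its `variant` labels are hypothetical names introduced for illustration, and the same rule would apply to both the update and the reset gate.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(a: float) -> float:
    """Logistic function, written via tanh to avoid overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * a))


def matvec(m: Matrix, v: Vector) -> Vector:
    """Matrix-vector product."""
    return [sum(w * a for w, a in zip(row, v)) for row in m]


def gate(variant: str, x_t: Vector, h_prev: Vector,
         W: Matrix, U: Matrix, b: Vector) -> Vector:
    """Compute one gate vector.

    full : sigmoid(W x_t + U h_prev + b)
    type1: sigmoid(U h_prev + b)
    type2: sigmoid(U h_prev)
    type3: sigmoid(b)
    """
    if variant not in ("full", "type1", "type2", "type3"):
        raise ValueError(f"unknown variant: {variant}")
    pre = [0.0] * len(b)
    if variant == "full":       # only the full gate sees the input
        pre = [p + q for p, q in zip(pre, matvec(W, x_t))]
    if variant != "type3":      # type 3 ignores the hidden state
        pre = [p + q for p, q in zip(pre, matvec(U, h_prev))]
    if variant != "type2":      # type 2 has no bias term
        pre = [p + q for p, q in zip(pre, b)]
    return [sigmoid(a) for a in pre]
```

In this sketch a Type 3 gate is a learned constant, since it depends on neither the input nor the previous hidden state.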

### Minimal gated unit

The minimal gated unit is similar to the fully gated unit, except that the update and reset gate vectors are merged into a single forget gate. This also implies that the equation for the output vector must be changed.^{[5]}
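One common way to write the minimal gated unit is shown below, with $x_t$ the input vector, $h_t$ the output vector, $f_t$ the forget vector, $\odot$ the Hadamard product, and $\sigma_g$, $\sigma_h$ the gate and candidate activation functions. It is given as the usual formulation under the assumption that the single gate $f_t$ takes over both roles: it scales the previous state inside the candidate, and it weights the candidate in the output.

$$
\begin{aligned}
f_t &= \sigma_g(W_f x_t + U_f h_{t-1} + b_f) \\
h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \sigma_h\bigl(W_h x_t + U_h (f_t \odot h_{t-1}) + b_h\bigr)
\end{aligned}
$$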

Variables

- $x_t$: input vector
- $h_t$: output vector
- $f_t$: forget vector
- $W$, $U$ and $b$: parameter matrices and vector
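A single step of the minimal gated unit can be sketched as follows. As with any such illustration, the details are assumptions: a sigmoid gate, a hyperbolic tangent candidate, the forget vector weighting the candidate in the output, and parameter names chosen here for readability.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def affine(W: Matrix, x: Vector, U: Matrix, h: Vector, b: Vector) -> Vector:
    """Return W x + U h + b."""
    return [
        sum(w * a for w, a in zip(w_row, x))
        + sum(u * c for u, c in zip(u_row, h))
        + b_i
        for w_row, u_row, b_i in zip(W, U, b)
    ]


def mgu_step(x_t: Vector, h_prev: Vector,
             W_f: Matrix, U_f: Matrix, b_f: Vector,
             W_h: Matrix, U_h: Matrix, b_h: Vector) -> Vector:
    """Compute h_t from x_t and h_{t-1} using a single forget gate."""
    # Forget gate (sigmoid written via tanh to avoid overflow).
    f = [0.5 * (1.0 + math.tanh(0.5 * a))
         for a in affine(W_f, x_t, U_f, h_prev, b_f)]
    # The same gate scales the previous state inside the candidate...
    gated = [fi * hi for fi, hi in zip(f, h_prev)]
    cand = [math.tanh(a) for a in affine(W_h, x_t, U_h, gated, b_h)]
    # ...and weights the candidate against the previous state in the output.
    return [(1.0 - fi) * hi + fi * ci for fi, hi, ci in zip(f, h_prev, cand)]
```

Compared with the fully gated unit, this form needs two sets of parameters (forget gate and candidate) instead of three.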

## References

1. Cho, Kyunghyun; van Merrienboer, Bart; Gulcehre, Caglar; Bahdanau, Dzmitry; Bougares, Fethi; Schwenk, Holger; Bengio, Yoshua (2014). "Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation". arXiv:1406.1078 [cs.CL].
2. Chung, Junyoung; Gulcehre, Caglar; Cho, KyungHyun; Bengio, Yoshua (2014). "Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling". arXiv:1412.3555 [cs.NE].
3. "Recurrent Neural Network Tutorial, Part 4 – Implementing a GRU/LSTM RNN with Python and Theano – WildML". *Wildml.com*. Retrieved May 18, 2016.
4. Dey, Rahul; Salem, Fathi M. (2017-01-20). "Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks". arXiv:1701.05923 [cs.NE].
5. Heck, Joel; Salem, Fathi M. (2017-01-12). "Simplified Minimal Gated Unit Variations for Recurrent Neural Networks". arXiv:1701.03452 [cs.NE].