Talk:Long short-term memory

WikiProject Robotics (Rated Start-class, Mid-importance)
Long short-term memory is within the scope of WikiProject Robotics, which aims to build a comprehensive and detailed guide to Robotics on Wikipedia. If you would like to participate, you can choose to edit this article, or visit the project page (Talk), where you can join the project and see a list of open tasks.
Start: This article has been rated as Start-Class on the project's quality scale.
Mid: This article has been rated as Mid-importance on the project's importance scale.

Convolutional LSTM[edit]

In this section, new variables are used. They should be introduced and defined before they are first used.

194.39.218.10 (talk) 09:19, 29 November 2016 (UTC)
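
For reference, and to help whoever adds the missing definitions: the convolutional LSTM is usually presented with convolutions ($*$) in place of matrix products and Hadamard products ($\circ$) for the peephole terms. A rough sketch of the formulation in Shi et al. (2015), in my own notation rather than necessarily the article's, is

$i_t = \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i)$
$f_t = \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f)$
$C_t = f_t \circ C_{t-1} + i_t \circ \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c)$
$o_t = \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o)$
$H_t = o_t \circ \tanh(C_t)$

where $X_t$ is the input, $H_t$ the hidden state and $C_t$ the cell state, all 3-D tensors (channels, rows, columns). The article should say explicitly which symbols it keeps from the standard LSTM and which are new.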

Checking of equations needed[edit]

In the equations for the peephole LSTM, the last non-linearity is applied before multiplying by the gate. In the 2001 paper by Gers and Schmidhuber (LSTM Recurrent Networks Learn Simple Context-Free and Context-Sensitive Languages) it is applied after. I think someone who knows how it is implemented in practice should double-check this.
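
To make the two orderings under discussion concrete (a sketch in generic notation, not necessarily the article's exact symbols): with output gate $o_t$, cell state $c_t$ and output non-linearity $\sigma_h$, the two candidate output equations are

$h_t = o_t \circ \sigma_h(c_t)$   (non-linearity applied before gating)
$h_t = \sigma_h(o_t \circ c_t)$   (non-linearity applied after gating)

Which variant the 2001 paper actually uses should be confirmed against the paper itself before changing the article.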

Introduction for non-experts[edit]

The article is not very helpful for the average Wikipedia reader with limited expertise in recurrent ANNs. It should have an introductory section giving examples of typical time-series data and explaining roughly why standard recurrent networks run into difficulty. In particular, in what sort of situation do input vectors at a given time have strong relationships to vectors at much earlier times? And why do conventional recurrent ANNs fail in these circumstances? (It is not sufficient to merely state that error signals vanish.) I would improve accessibility for the interested general reader and sacrifice descriptions of more recent developments. Paulhummerman (talk) 14:26, 5 December 2016 (UTC)

Agree. I read this and understood nothing; after searching Google for a tutorial, I understood. Daniel.Cardenas (talk) 18:05, 30 July 2017 (UTC)
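
As context for the question above about why error signals vanish (a rough sketch of the standard argument, not a substitute for a proper introductory section): backpropagation through time scales a gradient that must travel $k$ steps by a product of $k$ step-to-step Jacobians,

$\frac{\partial E_t}{\partial h_{t-k}} = \frac{\partial E_t}{\partial h_t} \prod_{j=1}^{k} \frac{\partial h_{t-j+1}}{\partial h_{t-j}}$,

and if each Jacobian's norm is bounded by some $\gamma < 1$, the product shrinks like $\gamma^{k}$, which is the exponential decay with time lag that the article alludes to. Long-range dependencies arise, for example, in language (an opening quotation mark constrains a closing one many tokens later) or in sensor logs where an early event determines a much later reading.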

External links modified[edit]

Hello fellow Wikipedians,

I have just modified one external link on Long short-term memory. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FAQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

As of February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete the "External links modified" sections if they want, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{sourcecheck}} (last update: 15 July 2018).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.


Cheers.—InternetArchiveBot (Report bug) 21:35, 5 January 2018 (UTC)

Proposed change[edit]

Hello friends,

After reading this page, I propose that the following lines be removed:


"A problem with using gradient descent for standard RNNs is that error gradients vanish exponentially quickly with the size of the time lag between important events."

"This is due to {\displaystyle \lim _{n\to \infty }W^{n}=0} {\displaystyle \lim _{n\to \infty }W^{n}=0} if the spectral radius of {\displaystyle W} W is smaller than 1."


I make such a proposition for the following reasons:

- Regarding the first line: that certainly is a problem if your gradient descents aren't proportional to the time lag between important events. In other words, sure, that's a problem, but it's not difficult to fix, and for that reason merits no mention.

- Regarding the second line: this is overly complicated to an extent no less than silly, and is absolutely superfluous in this context. This one has to go if the first one goes anyway, but seriously friends, I'm calling you out on this one. To quote Albert Einstein's reply to Franz Kafka's draft of The Castle, "Life is not this hard."


TheLoneDeranger (talk) 05:38, 26 August 2018 (UTC)
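
A side note on the second quoted line: whatever the editorial outcome, the mathematical claim itself is easy to check numerically. A minimal sketch (assuming NumPy; the matrix size, seed, and spectral radius 0.9 are arbitrary illustrative choices):

import numpy as np

rng = np.random.default_rng(0)

# Random recurrent weight matrix, rescaled so its spectral radius is 0.9 (< 1).
W = rng.standard_normal((8, 8))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))

# The norm of W^n shrinks towards zero, mirroring the quoted claim that
# W^n -> 0 as n -> infinity when the spectral radius of W is below 1.
for n in (1, 10, 50, 100):
    print(n, np.linalg.norm(np.linalg.matrix_power(W, n)))

So the question raised above is editorial (whether the detail belongs in the article), not mathematical; the limit statement itself is correct for any matrix with spectral radius below 1.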

Section "Future" is difficult to read[edit]

The section "Future" should partially be rewritten imo., as it contains lots of repetitive words such as "system", "most" and "more":


"more and more complex and sophisticated, and most of the most advanced neural network frameworks" ... "mixing and matching" ... "Most will be the most advanced system LSTMs into the system, in order to make the system"... — Preceding unsigned comment added by MakeTheWorldALittleBetter (talkcontribs) 16:49, 26 January 2019 (UTC)