Talk:Long short-term memory
WikiProject Robotics (Rated Start-class, Mid-importance)

It is requested that an image or photograph be included in this article to improve its quality. Please replace this template with a more specific media request template where possible. The Free Image Search Tool may be able to locate suitable images on Flickr and other web sites.
Convolutional LSTM
In this section new variables are used. These variables should be introduced and described before they appear.
194.39.218.10 (talk) 09:19, 29 November 2016 (UTC)
Checking of equations needed
In the equations for the peephole LSTM, the last non-linearity is applied to the cell state before it is multiplied by the output gate. In the 2001 paper by Gers and Schmidhuber (LSTM Recurrent Networks Learn Simple Context-Free and Context-Sensitive Languages) it is applied after. I think someone who knows how it is implemented in practice should double-check this.
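To make the two orderings concrete, here is a sketch in the article's notation, with o_t the output gate, c_t the cell state, sigma_h the output non-linearity, and the circle denoting the element-wise product (which form the 2001 paper actually uses should still be checked against the original):

    % The two orderings under discussion, in the article's notation.
    \begin{align}
      h_t &= o_t \circ \sigma_h(c_t) && \text{(non-linearity applied before gating)}\\
      h_t &= \sigma_h(o_t \circ c_t) && \text{(non-linearity applied after gating)}
    \end{align}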
Introduction for non-experts
The article is not very helpful for the average Wikipedia reader with limited expertise in recurrent ANNs. It should have an introductory section giving examples of typical time-series data and explaining roughly why standard recurrent networks run into difficulty. In particular, in what sort of situation do input vectors at a given time have strong relationships to vectors at much earlier times? And why do conventional recurrent ANNs fail in these circumstances? (It is not sufficient to merely state that error signals vanish.) I would improve accessibility for the interested general reader and sacrifice descriptions of more recent developments. Paulhummerman (talk) 14:26, 5 December 2016 (UTC)
- Agree. I read this and understood nothing; I searched Google for a tutorial and understood that instead. Daniel.Cardenas (talk) 18:05, 30 July 2017 (UTC)
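As a starting point for such an introduction: a typical example of a long-range dependency is a sequence-prediction task where a bracket opened many steps earlier must eventually be closed, and the standard backpropagation-through-time argument can be stated compactly. A sketch, assuming hidden states h_k = sigma(a_k) with pre-activations a_k = W h_{k-1} + U x_k, W the recurrent weight matrix, and sigma the activation:

    % Gradient of the loss at step T with respect to a much earlier hidden state:
    \frac{\partial \mathcal{L}_T}{\partial h_t}
      = \frac{\partial \mathcal{L}_T}{\partial h_T}
        \prod_{k=t+1}^{T} \frac{\partial h_k}{\partial h_{k-1}},
    \qquad
    \frac{\partial h_k}{\partial h_{k-1}}
      = \operatorname{diag}\!\bigl(\sigma'(a_k)\bigr)\, W .
    % Each factor's norm is at most \max|\sigma'| \cdot \|W\|; if that bound is
    % below 1, the product, and hence the error signal, shrinks exponentially
    % in the lag T - t.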
External links modified
Hello fellow Wikipedians,
I have just modified one external link on Long short-term memory. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FAQ for additional information. I made the following changes:
- Added archive https://web.archive.org/web/20120522234026/http://etd.uwc.ac.za/usrfiles/modules/etd/docs/etd_init_3937_1174040706.pdf to http://etd.uwc.ac.za/usrfiles/modules/etd/docs/etd_init_3937_1174040706.pdf
When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.
As of February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete the "External links modified" sections if they want, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{sourcecheck}}
(last update: 15 July 2018).
- If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
- If you found an error with any archives or the URLs themselves, you can fix them with this tool.
Cheers.—InternetArchiveBot (Report bug) 21:35, 5 January 2018 (UTC)
Proposed change
Hello friends,
After reading this page, I propose that the following lines be removed:
"A problem with using gradient descent for standard RNNs is that error gradients vanish exponentially quickly with the size of the time lag between important events."
"This is due to {\displaystyle \lim _{n\to \infty }W^{n}=0} {\displaystyle \lim _{n\to \infty }W^{n}=0} if the spectral radius of {\displaystyle W} W is smaller than 1."
I make such a proposition for the following reasons:
- Regarding the first line: that is certainly a problem if your gradient-descent steps aren't proportional to the time lag between important events. In other words, yes, it is a problem, but it is not difficult to fix, and for that reason it merits no mention.
- Regarding the second line: this is overly complicated to the point of silliness, and absolutely superfluous in this context. It has to go if the first one goes anyway, but seriously, friends, I'm calling you out on this one. To quote Albert Einstein's reply to Franz Kafka's draft of The Castle, "Life is not this hard."
TheLoneDeranger (talk) 05:38, 26 August 2018 (UTC)
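For anyone weighing this proposal, here is a minimal numerical sketch of the claim in the second quoted line (matrix size, seed, and spectral radius are arbitrary choices for illustration):

    import numpy as np

    # Illustration of the quoted claim: if the spectral radius of W is below 1,
    # repeated multiplication by W (as happens to error signals propagated back
    # through time in a simple RNN) shrinks any vector toward zero.
    rng = np.random.default_rng(0)
    W = rng.standard_normal((10, 10))
    W *= 0.9 / max(abs(np.linalg.eigvals(W)))  # rescale so spectral radius = 0.9

    v = rng.standard_normal(10)
    for n in (1, 10, 50, 100):
        print(n, np.linalg.norm(np.linalg.matrix_power(W, n) @ v))
    # The printed norms decay roughly like 0.9**n, i.e. exponentially in n.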
Section "Future" is difficult to read[edit]
The section "Future" should partially be rewritten imo., as it contains lots of repetitive words such as "system", "most" and "more":
"more and more complex and sophisticated, and most of the most advanced neural network frameworks" ... "mixing and matching" ... "Most will be the most advanced system LSTMs into the system, in order to make the system"... — Preceding unsigned comment added by MakeTheWorldALittleBetter (talk • contribs) 16:49, 26 January 2019 (UTC)