Summary: VL LSTM and Recurrent Neural Nets

Read the summary and the most important questions on VL LSTM and Recurrent Neural Nets

  • 1 Simple recurrent networks

  • 1.5 Other RNN learning algorithms

  • Real-Time Recurrent Learning (RTRL)

    Computes all contributions to the gradients during the forward pass; the derivative of each unit with respect to each weight is tracked as the sequence is processed (see the sketch below).
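
A minimal NumPy sketch of the RTRL idea for a simple RNN h(t) = tanh(W x(t) + R h(t-1)); it tracks only the sensitivities with respect to the recurrent matrix R, and the toy squared-error loss and all variable names are illustrative assumptions, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_in, T = 4, 3, 5                       # hidden size, input size, sequence length
W = rng.normal(scale=0.5, size=(n, d_in))  # input weights (kept fixed here)
R = rng.normal(scale=0.5, size=(n, n))     # recurrent weights we differentiate w.r.t.

h = np.zeros(n)           # hidden state h(0) = 0
S = np.zeros((n, n, n))   # S[k, i, j] = dh_k / dR_ij, carried forward in time
dL_dR = np.zeros_like(R)  # gradient of the loss w.r.t. R, accumulated online

xs = rng.normal(size=(T, d_in))
ys = rng.normal(size=(T, n))  # toy targets; per-step loss 0.5 * ||h - y||^2

for x, y in zip(xs, ys):
    z = W @ x + R @ h              # pre-activation
    dtanh = 1.0 - np.tanh(z) ** 2  # tanh'(z), one value per unit
    # Forward-mode sensitivity update:
    #   dh_k/dR_ij = tanh'(z_k) * (1[k==i] * h_j + sum_m R_km * dh_m/dR_ij)
    prop = np.einsum('km,mij->kij', R, S)      # part propagated from t-1
    direct = np.zeros((n, n, n))
    direct[np.arange(n), np.arange(n), :] = h  # the 1[k==i] * h_j term
    S = dtanh[:, None, None] * (direct + prop)
    h = np.tanh(z)
    dL_dR += np.einsum('k,kij->ij', h - y, S)  # gradient, ready during forward pass

print(dL_dR)
```

The gradient is complete the moment the forward pass ends, with no backward sweep over time; the price is carrying the (n, n, n) sensitivity tensor S at every step.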
  • 4 Transformers and attention

  • 4.1.1 Temporal attention

  • Temporal attention (for sequences)

    Focuses on the relevant elements or intervals of a sequence while that sequence is being processed.
  • Applying "soft" attention

    It lies in the range (0, 1), since it can also choose to let through only part of the information, e.g., half of it (see the sketch below).
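
A minimal sketch of soft attention over a sequence, assuming the usual softmax formulation (the function and variable names are illustrative): every element receives a weight strictly between 0 and 1, so information can be let through partially rather than selected all-or-nothing.

```python
import numpy as np

def soft_attention(query, keys, values):
    """Soft attention: each time step gets a weight in (0, 1),
    and the output is the weight-averaged value vector."""
    scores = keys @ query                    # one relevance score per time step
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ values, weights

rng = np.random.default_rng(1)
T, d = 6, 4                              # sequence length, feature size
keys = values = rng.normal(size=(T, d))  # attend over one sequence
context, w = soft_attention(rng.normal(size=d), keys, values)
print(w)        # all weights lie strictly in (0, 1) and sum to 1
print(context)  # a soft mixture of the sequence elements
```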
  • 4.2 Attention in sequence-to-sequence models

  • Bahdanau attention mechanism

    Additive attention; it allows the model to focus on different parts of the input sequence at each time step.
  • Why the name additive attention

    Because of the sum inside the tanh function (see the sketch below).
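
A minimal sketch of the additive score, assuming the usual Bahdanau formulation e_j = v^T tanh(W s + U h_j) with decoder state s and encoder states h_j; the matrix names W, U, v are conventional, not taken from the flashcards. The sum W s + U h_j inside the tanh is what gives "additive" attention its name:

```python
import numpy as np

def additive_attention(s, H, W, U, v):
    """Bahdanau-style additive attention.
    'Additive' refers to the sum W @ s + U @ h_j inside tanh."""
    scores = np.tanh(H @ U.T + W @ s) @ v  # e_j = v^T tanh(W s + U h_j)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()               # softmax over encoder positions
    return weights @ H                     # context vector for this decoder step

rng = np.random.default_rng(2)
T, d_enc, d_dec, d_att = 5, 4, 4, 8
H = rng.normal(size=(T, d_enc))  # encoder hidden states h_1 .. h_T
s = rng.normal(size=d_dec)       # current decoder state
W = rng.normal(size=(d_att, d_dec))
U = rng.normal(size=(d_att, d_enc))
v = rng.normal(size=d_att)
print(additive_attention(s, H, W, U, v))
```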
  • 11 Script summary

  • 11.1 Simple recurrent networks

  • ŷ = g(x; w)

    A feedforward network is a function that maps an input to a prediction using the network parameters w (see the sketch below).
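
A minimal sketch of this view, with an assumed one-hidden-layer network (shapes and names are illustrative): g is nothing more than a parameterized function from input to prediction.

```python
import numpy as np

def g(x, w):
    """A feedforward network is a function mapping input x to a prediction,
    parameterized by the network weights w."""
    W1, b1, W2, b2 = w
    h = np.tanh(W1 @ x + b1)  # one hidden layer
    return W2 @ h + b2        # prediction y_hat = g(x; w)

rng = np.random.default_rng(3)
w = (rng.normal(size=(5, 3)), np.zeros(5),
     rng.normal(size=(2, 5)), np.zeros(2))
print(g(rng.normal(size=3), w))  # y_hat for one input
```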
  • The idea is Turing complete

    Every computer program can be represented by this idea, i.e., by a recurrent network.
  • Jordan network (idea)

    • One of the earliest recurrent neural architectures
    • One hidden layer
    • Keeps the last output as a form of context for processing the next input (see the sketch after this list)
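
A minimal sketch of one Jordan-network step, assuming tanh hidden activations and a linear output (all names illustrative): the previous output is fed back as context for the next input.

```python
import numpy as np

def jordan_step(x, y_prev, W_in, W_ctx, W_out):
    """One Jordan-network step: the previous *output* is fed back
    as context alongside the new input."""
    h = np.tanh(W_in @ x + W_ctx @ y_prev)  # hidden layer sees input + last output
    return W_out @ h                        # new output, reused as context next step

rng = np.random.default_rng(4)
d_in, d_h, d_out, T = 3, 5, 2, 4
W_in = rng.normal(size=(d_h, d_in))
W_ctx = rng.normal(size=(d_h, d_out))
W_out = rng.normal(size=(d_out, d_h))

y = np.zeros(d_out)  # no previous output before the first input
for x in rng.normal(size=(T, d_in)):
    y = jordan_step(x, y, W_in, W_ctx, W_out)
    print(y)
```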
  • Fully recurrent network

    • Elman network, but with the choice of recurrent parameters loosened
    • Add the term R^T * a(t-1)
    • Treat R as trainable
    • A hidden unit now depends on itself and on neighboring units
    • a(0) = 0 <- clean memory (see the sketch after this list)
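
A minimal sketch of the fully recurrent update a(t) = f(W^T x(t) + R^T a(t-1)), keeping the R^T a(t-1) term from the list above and assuming tanh as the activation f:

```python
import numpy as np

rng = np.random.default_rng(5)
d_in, n, T = 3, 4, 5
W = rng.normal(size=(d_in, n))  # input weights
R = rng.normal(size=(n, n))     # full, trainable recurrent matrix

a = np.zeros(n)  # a(0) = 0: clean memory
for x in rng.normal(size=(T, d_in)):
    a = np.tanh(W.T @ x + R.T @ a)  # each unit sees itself and its neighbors
    print(a)
```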
  • 11.2 Learning algorithms for RNNs

  • Empirical risk minimization (gradient descent)

    • Input sequence with T elements
    • Find a parameterization that minimizes the empirical risk: argmin_w R_emp(g(.; w))
    • Gradient descent: w_new = w_old - eta * grad R_emp(w_old)
    • Iterate this procedure until the only possible choice is eta = 0, i.e., until we have converged to a local minimum (see the sketch after this list)
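
A minimal sketch of the procedure on a toy quadratic risk, with the gradient made explicit in the update w_new = w_old - eta * grad R_emp(w_old); the risk, step size, and step count are illustrative assumptions:

```python
import numpy as np

def gradient_descent(grad_Remp, w, eta=0.1, steps=100):
    """Plain gradient descent on the empirical risk:
    w <- w - eta * grad R_emp(w), iterated until (near) convergence."""
    for _ in range(steps):
        w = w - eta * grad_Remp(w)
    return w

# Toy risk R_emp(w) = ||w - w_star||^2 with known minimizer w_star
w_star = np.array([1.0, -2.0])
w_hat = gradient_descent(lambda w: 2 * (w - w_star), w=np.zeros(2))
print(w_hat)  # converges to the (here: unique) minimum w_star
```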
