Understanding Transformers: A Step-by-Step Math Example — Part 1




The transformer architecture can look intimidating, and you may already have seen it explained on YouTube or in other blogs. In this post, I will try to make it clearer by working through a complete numerical example, one step at a time.

Shoutout to HeduAI for providing clear explanations that have helped clarify my own concepts!

Step 1 (Defining Our Dataset)
Our dataset consists of 3 sentences (dialogues) taken from the Game of Thrones TV show. Although this dataset is small, its size is an advantage: with so few words, the upcoming mathematical equations can be worked through by hand.

Step 2 (Finding the Vocab Size)
To determine the vocabulary size, we need to identify the total number of unique words in our dataset. This is crucial for encoding (i.e., converting the data into numbers).

Applying a set operation to the list of all words in the dataset removes duplicates, and counting what remains gives the vocabulary size. Our word list contains 23 unique words, so the vocabulary size is 23.
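
To make the set operation concrete, here is a minimal Python sketch. The three sentences in it are hypothetical placeholders, not the Game of Thrones dialogues from our dataset, so the count it prints is 9 rather than 23. Lowercasing and splitting on whitespace are also assumptions about how the words are separated, and the names `dataset`, `vocab`, and `word_to_id` are chosen only for illustration.

```python
# Placeholder sentences (hypothetical; NOT the dataset used in this post).
dataset = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat and a dog",
]

# Assumption: words are separated by whitespace and compared case-insensitively.
words = [word.lower() for sentence in dataset for word in sentence.split()]

# A set keeps only one copy of each word; sorting makes the order reproducible.
vocab = sorted(set(words))
vocab_size = len(vocab)

# Encoding: give every unique word its own integer id.
word_to_id = {word: idx for idx, word in enumerate(vocab)}

print("total words:", len(words))      # 17 words including duplicates
print("vocabulary size:", vocab_size)  # 9 unique words
print(word_to_id)
```

Running the same steps on the real dataset is how the vocabulary size of 23 is obtained, and the word-to-number mapping is one simple way to do the encoding mentioned above.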
