1. Intro
The Transformer: the architecture for language modeling behind GPT, BERT, and T5
2. What
A general-purpose architecture
- CNNs for vision
- RNNs for language: process words sequentially
- hard to train: sequential processing prevents parallelism
- tend to forget earlier context in long sequences
Initially designed for translation
- trains efficiently: all positions are processed in parallel
- can exploit huge parallel-text datasets
3. How it works
- Positional Encoding
adds position information to each word embedding, so word order is preserved while training stays parallel
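A minimal NumPy sketch of the sinusoidal positional encoding used in the original Transformer paper (the `10000` base and the sin/cos interleaving come from that paper; the toy sizes are my own):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding: each position gets a unique
    pattern of sines and cosines, so the model can recover word order
    even though it processes all positions in parallel."""
    pos = np.arange(seq_len)[:, None]        # (seq_len, 1) position index
    i = np.arange(d_model // 2)[None, :]     # (1, d_model/2) frequency index
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)             # even dimensions: sine
    pe[:, 1::2] = np.cos(angles)             # odd dimensions: cosine
    return pe

pe = positional_encoding(seq_len=50, d_model=64)
print(pe.shape)  # (50, 64)
```

The encoding is simply added to the word embeddings before the first layer; every value stays in [-1, 1], so it does not swamp the embeddings.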
- Attention
for translation, each output word looks at (attends to) the most relevant input words
- Self-Attention
how the model understands language: a word's meaning is determined by checking the surrounding words (e.g. "bank" in "river bank" vs. "bank account")
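The two attention bullets above can be sketched as single-head scaled dot-product self-attention in NumPy. This is a toy version, not the multi-head implementation from the paper, and the weight matrices here are random placeholders for learned parameters:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Each word's new representation is a weighted average of all
    words' values; the weights come from comparing that word's query
    against every word's key (i.e. "checking the surrounding words")."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # relatedness of every word pair
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
    return weights @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                  # 5 words, embedding dim 8 (toy sizes)
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (5, 8)
```

In self-attention the queries, keys, and values all come from the same sentence; in the translation (cross-attention) case, the queries would come from the output sentence and the keys/values from the input sentence.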