04 — Transformers

Attention-based architectures that replaced recurrent models as the dominant approach for sequence modelling and have become the foundation of large language models.
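The core operation these architectures are built on can be sketched as scaled dot-product attention, softmax(QKᵀ/√d_k)V. A minimal NumPy illustration (shapes and names are illustrative, not any particular library's API):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for 2-D query/key/value matrices."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # pairwise query-key similarities
    # numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                       # attention-weighted sum of values

# Toy example: 3 tokens, model dimension 4
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)
```

Each output row is a convex combination of the value rows, with mixing weights determined by how well that token's query matches every key.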

Notes
