Wow, an excellent textbook: https://d2l.ai/
From The Illustrated Transformer:
- Read the Attention Is All You Need paper, the Transformer blog post (Transformer: A Novel Neural Network Architecture for Language Understanding), and the Tensor2Tensor announcement.
- Watch Łukasz Kaiser’s talk walking through the model and its details
- Play with the Jupyter Notebook provided as part of the Tensor2Tensor repo
- Explore the Tensor2Tensor repo.
Follow-up works:
- Depthwise Separable Convolutions for Neural Machine Translation
- One Model To Learn Them All
- Discrete Autoencoders for Sequence Models
- Generating Wikipedia by Summarizing Long Sequences
- Image Transformer
- Training Tips for the Transformer Model
- Self-Attention with Relative Position Representations
- Fast Decoding in Sequence Models using Discrete Latent Variables
- Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
Andrej Karpathy
- https://github.com/karpathy
- https://cs.stanford.edu/people/karpathy/
- https://karpathy.ai/
- Deep Reinforcement Learning: Pong from Pixels: https://karpathy.github.io/2016/05/31/rl/
Hugging Face
- https://huggingface.co/docs/transformers/main_classes/model
- https://huggingface.co/docs/diffusers/using-diffusers/loading
Fundamental Tutorials
- https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html
- David Silver, UCL Course on RL: https://www.davidsilver.uk/teaching/
BERT
PyTorch
Self-instruct:
- https://github.com/yizhongw/self-instruct
- Self-Instruct: Aligning Language Models with Self-Generated Instructions https://arxiv.org/abs/2212.10560
- https://github.com/tatsu-lab/stanford_alpaca
- LLaMA: Open and Efficient Foundation Language Models (not actually open?)
- Not really open then: https://crfm.stanford.edu/2023/03/13/alpaca.html
EleutherAI
- https://en.wikipedia.org/wiki/EleutherAI
- https://www.eleuther.ai/releases
- Pythia, a suite of models designed to enable controlled scientific research on transparently trained LLMs: https://github.com/EleutherAI/pythia
Code
Datasets
- Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM