This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
| """ | |
| Transformer Architecture v7 | |
| ================================================================== | |
| """ | |
| import torch | |
| import torch.nn as nn | |
| import torch.nn.functional as F | |
| import math | |
| import os |
| """ | |
| Transformer Architecture v8 | |
| ================================================================== | |
| 1. **Architecture**: GPT-2 style with Learnable Positional Embeddings & aligned naming. | |
| 2. **Pre-training**: Train from scratch on TinyStories (restored from v7). | |
| 3. **Weight Loading**: Load OpenAI GPT-2 pretrained weights. | |
| 4. **Fine-tuning**: Instruction fine-tuning (Alpaca style). | |
| 5. **Persistence**: Save/Load checkpoints (restored from v7). | |
| Dependencies: torch, tiktoken, requests, tqdm, numpy, datasets (for training), tensorflow (for weight loading) |
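The "GPT-2 style with learnable positional embeddings" mentioned in item 1 can be sketched as a small embedding module. The class and parameter names below are illustrative, not taken from the file:

```python
import torch
import torch.nn as nn

class GPT2Embeddings(nn.Module):
    """Token + learnable positional embeddings, GPT-2 style (illustrative sketch)."""
    def __init__(self, vocab_size: int, context_length: int, emb_dim: int, dropout: float = 0.1):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, emb_dim)      # token embedding table
        self.pos_emb = nn.Embedding(context_length, emb_dim)  # learned, not sinusoidal
        self.drop = nn.Dropout(dropout)

    def forward(self, idx: torch.Tensor) -> torch.Tensor:
        # idx: (batch, seq_len) of token ids
        seq_len = idx.size(1)
        positions = torch.arange(seq_len, device=idx.device)  # (seq_len,)
        x = self.tok_emb(idx) + self.pos_emb(positions)       # broadcast over batch
        return self.drop(x)

emb = GPT2Embeddings(vocab_size=50257, context_length=1024, emb_dim=768)
out = emb(torch.randint(0, 50257, (2, 16)))
print(out.shape)  # torch.Size([2, 16, 768])
```

Because the positions are an `nn.Embedding` rather than a fixed sinusoid, they are trained along with the rest of the model, matching GPT-2's design (and its 1024-token context limit).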
| """ | |
| Transformer Architecture v6 (NLTK Tokenization) | |
| ================================================================== | |
| Improved version of v5 using NLTK for word-level tokenization. | |
| Includes interactive menu for Training, Saving, Loading, and Generation. | |
| """ | |
| import torch | |
| import torch.nn as nn | |
| import torch.nn.functional as F |
| """ | |
| Transformer Architecture v4 | |
| ================================================================== | |
| """ | |
| import torch | |
| import torch.nn as nn | |
| import torch.nn.functional as F | |
| import math | |
| import os |
| """ | |
| Transformer Architecture v3 | |
| ========================================================= | |
| Key addition from v3: | |
| ✓ Added causal masking to prevent attending to future tokens | |
| ✓ Added Pre-LN option for deeper networks | |
| ✓ Proper dropout placement in FFN | |
| ✓ Better temperature scaling | |
| """ |
import torch
import torch.nn as nn
import torch.nn.functional as F
import math
import os

# ============================================
# 1. TOKENIZATION (using simple mapping)
# ============================================
class SimpleTokenizer:
import torch
import torch.nn as nn
import torch.nn.functional as F
import math

# ============================================
# 1. TOKENIZATION (using simple mapping)
# ============================================
class SimpleTokenizer:
    def __init__(self, vocab):
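The `SimpleTokenizer` snippet is cut off after the `__init__` signature. A minimal word-to-id mapping consistent with that signature might look like the following; the method bodies are an assumption, not the file's own code:

```python
class SimpleTokenizer:
    """Minimal word-level tokenizer over a fixed vocabulary (illustrative sketch)."""
    def __init__(self, vocab):
        # vocab: iterable of words; build word <-> id mappings
        self.word_to_id = {word: i for i, word in enumerate(vocab)}
        self.id_to_word = {i: word for word, i in self.word_to_id.items()}

    def encode(self, text):
        # Whitespace split; unknown words fall back to id 0 (assumed <unk> slot)
        return [self.word_to_id.get(w, 0) for w in text.split()]

    def decode(self, ids):
        return " ".join(self.id_to_word.get(i, "<unk>") for i in ids)

tok = SimpleTokenizer(["<unk>", "the", "cat", "sat"])
print(tok.encode("the cat sat"))  # [1, 2, 3]
print(tok.decode([1, 2, 3]))      # the cat sat
```

A mapping this simple cannot handle punctuation or casing, which is presumably why later versions (v6) switch to NLTK word tokenization and v8 to `tiktoken`'s GPT-2 BPE.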