Gist by Dr. Amit Puri (amitpuri)
"""
Transformer Architecture v7
==================================================================
"""
import torch
import torch.nn as nn
import torch.nn.functional as F
import math
import os
"""
Transformer Architecture v8
==================================================================
1. **Architecture**: GPT-2 style with Learnable Positional Embeddings & aligned naming.
2. **Pre-training**: Train from scratch on TinyStories (restored from v7).
3. **Weight Loading**: Load OpenAI GPT-2 pretrained weights.
4. **Fine-tuning**: Instruction fine-tuning (Alpaca style).
5. **Persistence**: Save/Load checkpoints (restored from v7).
Dependencies: torch, tiktoken, requests, tqdm, numpy, datasets (for training), tensorflow (for weight loading)
"""
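Point 1 above describes a GPT-2-style architecture with learnable positional embeddings. A minimal sketch of how token and position embeddings combine (the `wte`/`wpe` names follow GPT-2 convention; the default dimensions are GPT-2-small values and purely illustrative):

```python
import torch
import torch.nn as nn

class GPT2Embeddings(nn.Module):
    """Token embeddings plus *learned* positional embeddings, GPT-2 style.

    Unlike fixed sinusoidal encodings, wpe is a trainable lookup table,
    one row per position up to block_size.
    """
    def __init__(self, vocab_size=50257, block_size=1024, n_embd=768):
        super().__init__()
        self.wte = nn.Embedding(vocab_size, n_embd)  # token embeddings
        self.wpe = nn.Embedding(block_size, n_embd)  # learned positions

    def forward(self, idx):
        # idx: (B, T) integer token ids
        B, T = idx.shape
        pos = torch.arange(T, device=idx.device)          # (T,)
        return self.wte(idx) + self.wpe(pos)              # (B, T, n_embd)
```

Keeping the `wte`/`wpe` naming aligned with OpenAI's released checkpoints is what makes point 3 (loading GPT-2 pretrained weights) a straightforward key-for-key copy.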
"""
Transformer Architecture v6 (NLTK Tokenization)
==================================================================
Improved version of v5 using NLTK for word-level tokenization.
Includes interactive menu for Training, Saving, Loading, and Generation.
"""
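The v6 notes above describe NLTK word-level tokenization. A minimal vocab-building sketch in that spirit (v6 most likely uses `nltk.word_tokenize`; `TreebankWordTokenizer` is used here only because it needs no `punkt` download, and the helper name is illustrative):

```python
from nltk.tokenize import TreebankWordTokenizer

def build_word_vocab(texts):
    """Build a word-level token -> id map from a corpus using NLTK's
    Treebank rules. Id 0 is reserved for an <unk> token (an assumption)."""
    tok = TreebankWordTokenizer()
    words = sorted({w for t in texts for w in tok.tokenize(t)})
    return {w: i for i, w in enumerate(words, start=1)}
```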
import torch
import torch.nn as nn
import torch.nn.functional as F
"""
Transformer Architecture v4
==================================================================
"""
import torch
import torch.nn as nn
import torch.nn.functional as F
import math
import os
"""
Transformer Architecture v3
=========================================================
Key additions in v3:
✓ Added causal masking to prevent attending to future tokens
✓ Added Pre-LN option for deeper networks
✓ Proper dropout placement in FFN
✓ Better temperature scaling
"""
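The first v3 addition above, causal masking, can be sketched in isolation. This is a minimal single-head version (the function name and shapes are illustrative, not lifted from the original script):

```python
import math
import torch
import torch.nn.functional as F

def causal_attention(q, k, v):
    """Scaled dot-product attention with a causal mask:
    position t may only attend to positions <= t."""
    T = q.size(-2)
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    # Upper-triangular True entries mark "future" positions to hide
    mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v
```

Because row 0 of the mask hides every position except 0, the first output token is exactly `v[..., 0, :]` — a quick sanity check that no future information leaks.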
import torch
import torch.nn as nn
import torch.nn.functional as F
import math
import os
# ============================================
# 1. TOKENIZATION (using simple mapping)
# ============================================
class SimpleTokenizer:
def __init__(self, vocab):
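The `SimpleTokenizer` above is cut off in this capture. A minimal self-contained version consistent with the `__init__(self, vocab)` signature might look like this (the `encode`/`decode` method names, whitespace splitting, and the id-0 unknown-token handling are assumptions, not recovered from the original):

```python
class SimpleTokenizer:
    def __init__(self, vocab):
        # Bidirectional token <-> id maps
        self.stoi = {tok: i for i, tok in enumerate(vocab)}
        self.itos = {i: tok for tok, i in self.stoi.items()}

    def encode(self, text):
        # Unknown words map to id 0 (an assumption)
        return [self.stoi.get(tok, 0) for tok in text.split()]

    def decode(self, ids):
        return " ".join(self.itos.get(i, "?") for i in ids)
```

Usage: with `vocab = ["<unk>", "hello", "world"]`, `encode("hello world")` yields `[1, 2]` and `decode([1, 2])` round-trips back to `"hello world"`.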