Projects
Vision Transformer (ViT) from Scratch in PyTorch
GitHub
100+
Built a Vision Transformer from scratch in PyTorch.
Key Features:
- Multi-head self-attention mechanism.
- Detailed implementation with step-by-step outputs.
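To make the multi-head self-attention feature concrete, here is a minimal sketch in dependency-free Python rather than PyTorch. The function names, the explicit weight-matrix arguments, and the list-of-rows matrix layout are illustrative assumptions, not the project's actual code.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """x: (tokens, dim); each weight matrix: (dim, dim). Returns (tokens, dim)."""
    dim = len(x[0])
    assert dim % num_heads == 0, "dim must be divisible by num_heads"
    head_dim = dim // num_heads
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)

    head_outputs = []
    for h in range(num_heads):
        # Each head works on its own slice of the feature axis.
        cols = slice(h * head_dim, (h + 1) * head_dim)
        q_h = [row[cols] for row in q]
        k_h = [row[cols] for row in k]
        v_h = [row[cols] for row in v]
        # Scaled dot-product scores with shape (tokens, tokens).
        k_t = [list(col) for col in zip(*k_h)]
        scores = matmul(q_h, k_t)
        weights = [softmax([s / math.sqrt(head_dim) for s in row]) for row in scores]
        head_outputs.append(matmul(weights, v_h))

    # Concatenate the heads along the feature axis, then apply the output projection.
    concat = [sum((head[t] for head in head_outputs), []) for t in range(len(x))]
    return matmul(concat, w_o)

# Hypothetical usage: 3 patch embeddings of width 4, identity projections, 2 heads.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
patches = [[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.1, 0.3], [0.5, 0.5, 0.0, 1.0]]
attended = multi_head_self_attention(patches, identity, identity, identity, identity, num_heads=2)
```

In a PyTorch version the four weight matrices would normally be learned linear layers, and the per-head loop would be replaced by a reshape into a heads dimension.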
PyTorch
Vision Transformers
Scratch
2D Positional Encodings for Vision Transformer
Implemented various positional encodings and adapted them to 2D for Vision Transformer.
Implemented Positional Encodings:
- Learnable
- Sinusoidal (Absolute)
- Relative
- Rotary Position Embedding (RoPE)
- No Position
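As an illustration of adapting a 1D encoding to 2D, the sketch below uses one common approach for the sinusoidal case: half of the embedding encodes the patch row and the other half encodes the patch column. This split, the function names, and the use of plain Python instead of PyTorch are assumptions; the project may adapt the encodings differently.

```python
import math

def sinusoidal_1d(position, dim):
    """Standard 1D sinusoidal encoding: sin on even indices, cos on odd indices."""
    assert dim % 2 == 0, "dim must be even"
    enc = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        enc.append(math.sin(position * freq))
        enc.append(math.cos(position * freq))
    return enc

def sinusoidal_2d(grid_h, grid_w, dim):
    """Return a (grid_h * grid_w, dim) table, one row per patch in row-major order."""
    assert dim % 4 == 0, "dim must be divisible by 4"
    half = dim // 2
    table = []
    for row in range(grid_h):
        for col in range(grid_w):
            # First half depends only on the row index, second half only on the column index.
            table.append(sinusoidal_1d(row, half) + sinusoidal_1d(col, half))
    return table

# Hypothetical usage: a 2x3 grid of patches with 8-dimensional embeddings.
table = sinusoidal_2d(2, 3, 8)
```

With this layout, patches in the same row share the first half of their encoding and patches in the same column share the second half, which is what lets the model recover both axes.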
PyTorch
2D Positional Encoding
Vision Transformer
Large Language Model (LLM) from Scratch in PyTorch
Developed LLMs from scratch in PyTorch with detailed implementation steps and advanced features.
Key Features:
- Byte-Pair Encoding (BPE) tokenizer
- Rotary Position Embeddings (RoPE)
- SwiGLU activation
- RMSNorm
- Mixture of Experts (MoE)
- Key-Value Cache
- Temperature, Top-p and Top-k sampling
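The sampling controls in the last item can be sketched as follows, in plain Python with no PyTorch dependency. The function name `sample_next_token`, its parameters, and the order in which the filters are applied (temperature, then top-k, then top-p) are assumptions for illustration; the project's implementation may differ.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, top_k=None, top_p=None, rng=random):
    """Sample a token id from raw logits after temperature, top-k and top-p filtering."""
    assert temperature > 0, "temperature must be positive"
    # Temperature rescales the logits: below 1 sharpens, above 1 flattens.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token ids from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    if top_k is not None:
        # Keep only the k most likely tokens (at least one).
        ranked = ranked[:max(1, top_k)]

    if top_p is not None:
        # Keep the smallest prefix whose cumulative probability reaches top_p.
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        ranked = kept

    # random.choices normalizes the weights of the surviving candidates.
    weights = [probs[i] for i in ranked]
    return rng.choices(ranked, weights=weights, k=1)[0]

# Hypothetical usage: sample from a 4-token vocabulary.
token_id = sample_next_token([1.2, 0.3, -0.5, 2.0], temperature=0.8, top_k=3, top_p=0.9)
```

Setting `top_k=1` reduces this to greedy decoding, which is a handy sanity check when comparing against a reference implementation.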
PyTorch
Large Language Model
LLM
Transformers
Scratch
Various Generative Adversarial Networks (GANs)
Implemented several GAN variants from scratch for image generation and image translation with easy-to-understand code.
Implemented Models:
- Vanilla GAN
- Deep Convolutional GAN (DCGAN)
- Least Squares GAN (LSGAN)
- Conditional GAN (cGAN)
- CycleGAN
- Wasserstein GAN (WGAN)
- Wasserstein GAN with Gradient Penalty (WGAN-GP)
- StarGAN
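Several of these variants differ mainly in the discriminator (or critic) objective. The sketch below contrasts three of them in plain Python on lists of scores; the function names and the scalar-list inputs are assumptions for illustration and are not taken from the project's code.

```python
import math

def vanilla_d_loss(real_probs, fake_probs):
    """Vanilla GAN: binary cross-entropy; inputs are sigmoid outputs in (0, 1)."""
    real = -sum(math.log(p) for p in real_probs) / len(real_probs)
    fake = -sum(math.log(1.0 - p) for p in fake_probs) / len(fake_probs)
    return real + fake

def lsgan_d_loss(real_scores, fake_scores):
    """LSGAN: least-squares loss with targets 1 for real and 0 for fake."""
    real = sum((s - 1.0) ** 2 for s in real_scores) / len(real_scores)
    fake = sum(s ** 2 for s in fake_scores) / len(fake_scores)
    return 0.5 * (real + fake)

def wgan_critic_loss(real_scores, fake_scores):
    """WGAN: mean critic score on fakes minus mean critic score on reals."""
    return sum(fake_scores) / len(fake_scores) - sum(real_scores) / len(real_scores)

# Hypothetical discriminator outputs for two real and two generated images.
print(vanilla_d_loss([0.9, 0.8], [0.2, 0.1]))
print(lsgan_d_loss([0.9, 0.8], [0.2, 0.1]))
print(wgan_critic_loss([1.5, 2.0], [-0.5, 0.0]))
```

The Wasserstein loss only behaves as intended when the critic is constrained, through weight clipping in WGAN or a gradient penalty term in WGAN-GP; neither constraint is shown here.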
PyTorch
Generative Adversarial Networks
GANs
Image Generation
Image Translation
Duplicate Photos and Video Finder
Created a high-speed Python tool that recursively scans directories to detect and delete duplicate photos and videos.
Features:
- Fast pixel-wise comparison for photos
- Option to keep largest/smallest file among duplicates
- Useful for cleaning shared media libraries and backups
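The overall flow (recursive scan, duplicate grouping, keep-largest or keep-smallest selection) can be sketched with the standard library alone. This is not the tool's code: all function names are hypothetical, and `byte_fingerprint` is a stand-in that hashes raw file bytes rather than performing the pixel-wise comparison described above. A pixel-based fingerprint, for example one computed from the decoded image, could be passed in its place.

```python
import hashlib
import os
from collections import defaultdict

def walk_files(root):
    """Yield every file path under root, recursing into subdirectories."""
    for folder, _dirs, files in os.walk(root):
        for name in files:
            yield os.path.join(folder, name)

def byte_fingerprint(path):
    """Stand-in fingerprint: SHA-256 of the raw bytes (not a pixel-wise comparison)."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()

def group_duplicates(paths, fingerprint=byte_fingerprint):
    """Group paths with equal fingerprints; return only groups of two or more."""
    groups = defaultdict(list)
    for path in paths:
        groups[fingerprint(path)].append(path)
    return [group for group in groups.values() if len(group) > 1]

def plan_deletions(groups, keep="largest"):
    """Return the paths to delete, keeping the largest or smallest file per group."""
    if keep not in ("largest", "smallest"):
        raise ValueError("keep must be 'largest' or 'smallest'")
    pick = max if keep == "largest" else min
    to_delete = []
    for group in groups:
        keeper = pick(group, key=os.path.getsize)
        to_delete.extend(path for path in group if path != keeper)
    return to_delete
```

A typical call would be `plan_deletions(group_duplicates(walk_files("photos")), keep="largest")`, where `"photos"` is a placeholder directory. Returning a deletion plan instead of deleting immediately leaves room for a dry run before any file is removed. Note that the keep-largest or keep-smallest choice only matters with a fingerprint that can match files of different sizes, such as one based on decoded pixels.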
Python
OpenCV
NumPy