Barrier Problem and Tiled Matrix Multiplication in CUDA

Barrier Problem

Simple Matrix Multiplication

Tiled Matrix Multiplication