
Posts In #pytorch

One example of memory optimization for KL-loss calculation in pytorch

Mar 30 2023 · 4 min read
#pytorch #memory #fairseq #nlp #machine-learning
Memory usage is a common issue for large ML models. In academia especially, we have to use resources wisely and make the most of what is available. While working on the KL objective of my mixture model, I had to make a less common optimization to reduce memory usage.

Setup

The decoder outputs a large tensor \(O\) with dimensions \((M \times B \times L \times D)\), where \(M\) is the number of clusters, \(B\) is the batch size, \(L\) is the sequence length, and \(D\) is the model output dimension.
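To get a feel for why a tensor of this shape becomes a memory concern, here is a rough back-of-the-envelope sketch. The sizes below are made up for illustration and are not the values from my model, and the helper name `tensor_bytes` is hypothetical.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Return the bytes needed to store a dense tensor of the given shape.

    bytes_per_element=4 corresponds to float32; use 2 for float16.
    """
    n_elements = 1
    for dim in shape:
        n_elements *= dim
    return n_elements * bytes_per_element


# Hypothetical sizes (assumptions for illustration only):
# M clusters, batch size B, sequence length L, output dimension D.
M, B, L, D = 4, 32, 128, 512

size = tensor_bytes((M, B, L, D))
print(f"{size / 2**20:.0f} MiB for O alone in float32")
```

For an existing PyTorch tensor `t`, the same number can be read off as `t.element_size() * t.nelement()`. The footprint grows linearly in \(M\), and autograd may keep intermediate tensors of similar size alive for the backward pass, so the real cost is typically higher than this single-tensor estimate.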