Aspiring Data Science
An economist's notes on programming, forecasting, decision making, and the scientific method of inquiry.
Contact: @fingoldo

I call myself a data scientist because I know just enough math, economics & programming to be dangerous.
#pytorch #lightning #raschka

An introduction to PyTorch Lightning by Sebastian Raschka (Units 1-10). Great guy, author of well-known ML books and of the mlxtend library.

https://www.youtube.com/@PyTorchLightning/playlists
#pytorch #lightning

I liked this Monte Carlo Dropout for predictions example from the docs:

import torch
from torch import nn
import lightning as L


class LitMCdropoutModel(L.LightningModule):
    def __init__(self, model, mc_iteration):
        super().__init__()
        self.model = model
        self.dropout = nn.Dropout()
        self.mc_iteration = mc_iteration

    def predict_step(self, batch, batch_idx):
        # enable Monte Carlo Dropout
        self.dropout.train()

        x = batch  # assuming the predict dataloader yields raw input tensors

        # take average of `self.mc_iteration` stochastic forward passes
        pred = [self.dropout(self.model(x)).unsqueeze(0) for _ in range(self.mc_iteration)]
        pred = torch.vstack(pred).mean(dim=0)
        return pred
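
A minimal usage sketch (base_model, predict_loader and mc_iteration=20 are placeholders of my own, not from the docs):

import lightning as L

lit_model = LitMCdropoutModel(base_model, mc_iteration=20)
trainer = L.Trainer(accelerator="auto", devices=1)
# each element of preds is the MC-averaged prediction for one batch
preds = trainer.predict(lit_model, dataloaders=predict_loader)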


The paper: Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning.
#pytorch #lightning

It turns out Lightning has a Trainer(benchmark=True) flag.

The docs vaguely hint that it enables some optimization of CUDA algorithms. I managed to find out which ones exactly:

static const algo_t algos[] = {
    CUDNN_CONVOLUTION_FWD_ALGO_GEMM,
    CUDNN_CONVOLUTION_FWD_ALGO_FFT,
    CUDNN_CONVOLUTION_FWD_ALGO_FFT_TILING,
    CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM,
    CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_PRECOMP_GEMM,
    CUDNN_CONVOLUTION_FWD_ALGO_DIRECT,
    CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD,
    CUDNN_CONVOLUTION_FWD_ALGO_WINOGRAD_NONFUSED,
};
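
For reference, a sketch of the two ways to turn this on (the Trainer flag is documented to set cuDNN benchmark mode; any other Trainer arguments are up to you):

import torch
import lightning as L

# Lightning: let the Trainer enable cuDNN benchmarking for you
trainer = L.Trainer(benchmark=True)

# Plain PyTorch equivalent
torch.backends.cudnn.benchmark = True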

Reportedly, it sometimes gives a good speedup:


Olof Harrysson, Apr 2019:

I find that torch.backends.cudnn.benchmark increases the speed for my YOLOv3 model by a lot, like 30-40%. Furthermore, it lowers the memory footprint after it completes the benchmark.

It even works when my input images vary in size between each batch, neat! I was thinking about having the network optimize on a few smaller torch.randn(...) to benchmark on, and then start the training. I hope that this could allow me to increase the batch size since the memory footprint is lower after the benchmark. What do you guys think?


https://github.com/pytorch/pytorch/blob/1848cad10802db9fa0aa066d9de195958120d863/aten/src/ATen/native/cudnn/Conv.cpp#L486-L494
#pytorch #lightning #swa

Has anyone tried SWA?

Stochastic Weight Averaging (SWA) can make your models generalize better at virtually no additional cost. This can be used with both non-trained and trained models. The SWA procedure smooths the loss landscape thus making it harder to end up in a local minimum during optimization.

For a more detailed explanation of SWA and how it works, read this post by the PyTorch team.
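
In Lightning, SWA comes as a built-in callback. A minimal sketch of wiring it up (assumes a LightningModule `model` and a `train_loader` already exist; the swa_lrs value is just an example):

import lightning as L
from lightning.pytorch.callbacks import StochasticWeightAveraging

trainer = L.Trainer(
    max_epochs=30,
    # averages weights over the tail of training using the given SWA learning rate
    callbacks=[StochasticWeightAveraging(swa_lrs=1e-2)],
)
trainer.fit(model, train_loader)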