
Statistics and Machine Learning in Python, Release 0.3 beta

### DEBUG: Shape of last convnet= torch.Size([16, 6, 6]) . FC size= 576

Set 576 neurons to the first FC layer
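The FC input size is simply the flattened size of the last convolutional feature map reported in the debug line above. A quick sanity check in pure Python:

```python
import math

# Flattened size of the last conv feature map: 16 channels of 6x6
last_conv_shape = (16, 6, 6)
fc_size = math.prod(last_conv_shape)
print(fc_size)  # 576
```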

SGD with momentum: lr=0.001, momentum=0.5

model = LeNet5((3, 6, 16, 576, 120, 84, D_out)).to(device)
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.5)
criterion = nn.NLLLoss()

# Explore the model
for parameter in model.parameters():
    print(parameter.shape)

print("Total number of parameters =",
      np.sum([np.prod(parameter.shape) for parameter in model.parameters()]))

model, losses, accuracies = train_val_model(model, criterion, optimizer, dataloaders,
    num_epochs=25, log_interval=5)

_ = plt.plot(losses['train'], '-b', losses['val'], '--r')

torch.Size([6, 3, 5, 5])
torch.Size([6])
torch.Size([16, 6, 5, 5])
torch.Size([16])
torch.Size([120, 576])
torch.Size([120])
torch.Size([84, 120])
torch.Size([84])
torch.Size([10, 84])
torch.Size([10])
Total number of parameters = 83126
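The printed total can be verified by hand from the parameter shapes listed above, which is a useful sanity check when designing a new architecture. A minimal sketch using the shapes reported for this LeNet5:

```python
import math

# (weight, bias) shapes printed above, one pair per conv/FC layer
shapes = [(6, 3, 5, 5), (6,), (16, 6, 5, 5), (16,),
          (120, 576), (120,), (84, 120), (84,),
          (10, 84), (10,)]
total = sum(math.prod(s) for s in shapes)
print(total)  # 83126, matching the printed total
```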
Epoch 0/24
----------
train Loss: 2.3047 Acc: 10.00%
val Loss: 2.3043 Acc: 10.00%

Epoch 5/24
----------
train Loss: 2.3019 Acc: 10.01%
val Loss: 2.3015 Acc: 10.36%

Epoch 10/24
----------
train Loss: 2.2989 Acc: 12.97%
val Loss: 2.2979 Acc: 11.93%

Epoch 15/24
----------
train Loss: 2.2854 Acc: 10.34%
val Loss: 2.2808 Acc: 10.26%

Epoch 20/24
----------
train Loss: 2.1966 Acc: 15.87%

val Loss: 2.1761 Acc: 17.29%
Training complete in 4m 40s
Best val Acc: 22.81%

Increase learning rate and momentum: lr=0.01, momentum=0.9

model = LeNet5((3, 6, 16, 576, 120, 84, D_out)).to(device)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.NLLLoss()

model, losses, accuracies = train_val_model(model, criterion, optimizer, dataloaders,
    num_epochs=25, log_interval=5)

_ = plt.plot(losses['train'], '-b', losses['val'], '--r')

Epoch 0/24
----------
train Loss: 2.1439 Acc: 19.81%
val Loss: 1.9415 Acc: 29.59%

Epoch 5/24
----------
train Loss: 1.3457 Acc: 51.62%
val Loss: 1.2294 Acc: 55.74%

Epoch 10/24
----------
train Loss: 1.1607 Acc: 58.39%
val Loss: 1.1031 Acc: 60.68%

Epoch 15/24
----------
train Loss: 1.0710 Acc: 62.08%

val Loss: 1.0167 Acc: 64.26%

Epoch 20/24
----------
train Loss: 1.0078 Acc: 64.25%
val Loss: 0.9505 Acc: 66.62%

Training complete in 4m 58s
Best val Acc: 67.30%

Adaptive learning rate: Adam

model = LeNet5((3, 6, 16, 576, 120, 84, D_out)).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.NLLLoss()

model, losses, accuracies = train_val_model(model, criterion, optimizer, dataloaders,
    num_epochs=25, log_interval=5)

_ = plt.plot(losses['train'], '-b', losses['val'], '--r')

Epoch 0/24
----------
train Loss: 1.8857 Acc: 29.70%
val Loss: 1.6223 Acc: 40.22%

Epoch 5/24
----------
train Loss: 1.3564 Acc: 50.88%
val Loss: 1.2271 Acc: 55.97%

Epoch 10/24
----------
train Loss: 1.2169 Acc: 56.35%

val Loss: 1.1393 Acc: 59.72%

Epoch 15/24
----------
train Loss: 1.1296 Acc: 59.67%
val Loss: 1.0458 Acc: 63.05%

Epoch 20/24
----------
train Loss: 1.0830 Acc: 61.16%
val Loss: 1.0047 Acc: 64.49%

Training complete in 4m 34s
Best val Acc: 65.76%

MiniVGGNet

model = MiniVGGNet(layers=(3, 16, 32, 1, 120, 84, D_out), debug=True)
print(model)
_ = model(data_example)

MiniVGGNet(
(conv11): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1))
(conv12): Conv2d(16, 16, kernel_size=(3, 3), stride=(1, 1))
(conv21): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1))
(conv22): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(fc1): Linear(in_features=1, out_features=120, bias=True)
(fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
### DEBUG: Shape of last convnet= torch.Size([32, 6, 6]) . FC size= 1152

Set 1152 neurons to the first FC layer

SGD with large momentum and learning rate

model = MiniVGGNet((3, 16, 32, 1152, 120, 84, D_out)).to(device)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
criterion = nn.NLLLoss()

model, losses, accuracies = train_val_model(model, criterion, optimizer, dataloaders,
    num_epochs=25, log_interval=5)

_ = plt.plot(losses['train'], '-b', losses['val'], '--r')

Epoch 0/24
----------
train Loss: 2.3027 Acc: 10.12%
val Loss: 2.3004 Acc: 10.31%

Epoch 5/24
----------
train Loss: 1.4790 Acc: 45.88%
val Loss: 1.3726 Acc: 50.32%

Epoch 10/24
----------
train Loss: 1.1115 Acc: 60.74%
val Loss: 1.0193 Acc: 64.00%

Epoch 15/24
----------
train Loss: 0.8937 Acc: 68.41%
val Loss: 0.8297 Acc: 71.18%

Epoch 20/24
----------
train Loss: 0.7848 Acc: 72.14%
val Loss: 0.7136 Acc: 75.42%

Training complete in 4m 27s
Best val Acc: 76.73%

Adam

model = MiniVGGNet((3, 16, 32, 1152, 120, 84, D_out)).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.NLLLoss()

model, losses, accuracies = train_val_model(model, criterion, optimizer, dataloaders,
    num_epochs=25, log_interval=5)

_ = plt.plot(losses['train'], '-b', losses['val'], '--r')

Epoch 0/24
----------
train Loss: 1.8366 Acc: 31.34%
val Loss: 1.5805 Acc: 41.62%

Epoch 5/24
----------
train Loss: 1.1755 Acc: 57.79%
val Loss: 1.1027 Acc: 60.83%

Epoch 10/24
----------
train Loss: 0.9741 Acc: 65.53%
val Loss: 0.8994 Acc: 68.29%

Epoch 15/24
----------
train Loss: 0.8611 Acc: 69.74%
val Loss: 0.8465 Acc: 70.90%

Epoch 20/24
----------
train Loss: 0.7916 Acc: 71.90%
val Loss: 0.7513 Acc: 74.03%

Training complete in 4m 23s
Best val Acc: 74.87%

ResNet

model = ResNet(ResidualBlock, [2, 2, 2], num_classes=D_out).to(device) # 195738 parameters
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = nn.NLLLoss()

model, losses, accuracies = train_val_model(model, criterion, optimizer, dataloaders,
    num_epochs=25, log_interval=5)

_ = plt.plot(losses['train'], '-b', losses['val'], '--r')

Epoch 0/24
----------
train Loss: 1.4402 Acc: 46.84%
val Loss: 1.7289 Acc: 38.18%

Epoch 5/24
----------
train Loss: 0.6337 Acc: 77.88%
val Loss: 0.8672 Acc: 71.34%

Epoch 10/24
----------
train Loss: 0.4851 Acc: 83.11%
val Loss: 0.5754 Acc: 80.47%

Epoch 15/24
----------

train Loss: 0.3998 Acc: 86.22%
val Loss: 0.6208 Acc: 80.16%

Epoch 20/24
----------
train Loss: 0.3470 Acc: 87.99%
val Loss: 0.4696 Acc: 84.20%

Training complete in 7m 5s
Best val Acc: 85.60%

6.4 Transfer Learning Tutorial

Sources:
• cs231n @ Stanford
• Sasank Chilamkurthy

Quote cs231n @ Stanford:
In practice, very few people train an entire Convolutional Network from scratch (with random
initialization), because it is relatively rare to have a dataset of sufficient size. Instead, it is
common to pretrain a ConvNet on a very large dataset (e.g. ImageNet, which contains 1.2
million images with 1000 categories), and then use the ConvNet either as an initialization or a
fixed feature extractor for the task of interest.
These two major transfer learning scenarios look as follows:

• ConvNet as fixed feature extractor:
– Take a ConvNet pretrained on ImageNet,

– Remove the last fully-connected layer (this layer’s outputs are the 1000 class scores
for a different task like ImageNet)

– Treat the rest of the ConvNet as a fixed feature extractor for the new dataset.

In practice:

– Freeze the weights of the whole network except those of the final fully
connected layer. Replace this last fully connected layer with a new one with
random weights and train only this layer.

• Finetuning the convnet:

Fine-tune the weights of the pretrained network by continuing the backpropagation. It
is possible to fine-tune all the layers of the ConvNet.

Instead of random initialization, we initialize the network with a pretrained network,
such as one trained on the ImageNet 1000-class dataset. The rest of the training looks
as usual.
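The two scenarios differ only in which parameters receive gradients. A minimal sketch of the freezing mechanism on a toy model (the tutorial below applies the same idea to a pretrained ResNet; the toy `nn.Sequential` here is only an illustration):

```python
import torch.nn as nn

# Toy stand-in for a pretrained backbone + classification head
model = nn.Sequential(nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2))

# Scenario 1: fixed feature extractor -- freeze everything...
for param in model.parameters():
    param.requires_grad = False
# ...then replace the head; newly constructed modules have
# requires_grad=True by default, so only the head will be trained
model[2] = nn.Linear(4, 2)

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)  # ['2.weight', '2.bias'] -- only the new head

# Scenario 2: finetuning -- keep requires_grad=True everywhere and simply
# continue backpropagation from the pretrained weights.
```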

%matplotlib inline

import os
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
import torchvision
import torchvision.transforms as transforms
from torchvision import models
#
from pathlib import Path
import matplotlib.pyplot as plt

# Device configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

6.4.1 Training function

Combine train and test/validation into a single function.

Now, let’s write a general function to train a model. Here, we will illustrate:

• Scheduling the learning rate

• Saving the best model

In the following, the parameter scheduler is an LR scheduler object from
torch.optim.lr_scheduler.
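The schedule used below, StepLR(step_size=7, gamma=0.1), multiplies the learning rate by gamma every step_size epochs. Its effect can be reproduced with simple arithmetic (a sketch, independent of torch):

```python
def steplr(base_lr, epoch, step_size=7, gamma=0.1):
    # Learning rate in effect at `epoch` under a StepLR schedule
    return base_lr * gamma ** (epoch // step_size)

print([steplr(0.001, e) for e in (0, 6, 7, 14)])
# lr stays at 0.001 for epochs 0-6, drops to 0.0001 at epoch 7,
# and to 1e-05 at epoch 14
```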

# %load train_val_model.py
import copy
import time

import numpy as np
import torch


def train_val_model(model, criterion, optimizer, dataloaders, num_epochs=25,
                    scheduler=None, log_interval=None):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    # Store losses and accuracies across epochs
    losses, accuracies = dict(train=[], val=[]), dict(train=[], val=[])

    for epoch in range(num_epochs):
        if log_interval is not None and epoch % log_interval == 0:
            print('Epoch {}/{}'.format(epoch, num_epochs - 1))
            print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            nsamples = 0
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)
                nsamples += inputs.shape[0]

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history only in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            if scheduler is not None and phase == 'train':
                scheduler.step()

            epoch_loss = running_loss / nsamples
            epoch_acc = running_corrects.double() / nsamples

            losses[phase].append(epoch_loss)
            accuracies[phase].append(epoch_acc)
            if log_interval is not None and epoch % log_interval == 0:
                print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                    phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        if log_interval is not None and epoch % log_interval == 0:
            print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)

    return model, losses, accuracies

6.4.2 CIFAR-10 dataset

Source Yunjey Choi

WD = os.path.join(Path.home(), "data", "pystatml", "dl_cifar10_pytorch")
os.makedirs(WD, exist_ok=True)
os.chdir(WD)
print("Working dir is:", os.getcwd())
os.makedirs("data", exist_ok=True)
os.makedirs("models", exist_ok=True)

# Image preprocessing modules
transform = transforms.Compose([
    transforms.Pad(4),
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32),
    transforms.ToTensor()])
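The Pad(4) + RandomCrop(32) pair implements the standard CIFAR-10 shift augmentation: each 32x32 image is padded to 40x40, then a random 32x32 window is cropped back out. The geometry can be checked with simple arithmetic:

```python
img_size, pad, crop = 32, 4, 32

padded = img_size + 2 * pad            # Pad(4) pads all four sides: 32 -> 40
n_positions = (padded - crop + 1) ** 2  # possible (row, col) crop offsets
print(padded, n_positions)  # 40 81
```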

# CIFAR-10 dataset
train_dataset = torchvision.datasets.CIFAR10(root='data/',
                                             train=True,
                                             transform=transform,
                                             download=True)

test_dataset = torchvision.datasets.CIFAR10(root='data/',
                                            train=False,
                                            transform=transforms.ToTensor())

# Data loader

train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
batch_size=100,
shuffle=True)

val_loader = torch.utils.data.DataLoader(dataset=test_dataset,
batch_size=100,
shuffle=False)

# Put together train and val
dataloaders = dict(train=train_loader, val=val_loader)

# Info about the dataset
data_shape = dataloaders["train"].dataset.data.shape[1:]
D_in = np.prod(data_shape)
D_out = len(set(dataloaders["train"].dataset.targets))
print("Datasets shape", {x: dataloaders[x].dataset.data.shape for x in ['train', 'val']})
print("N input features", D_in, "N output", D_out)

Working dir is: /home/edouard/data/pystatml/dl_cifar10_pytorch
Files already downloaded and verified
Datasets shape {'train': (50000, 32, 32, 3), 'val': (10000, 32, 32, 3)}
N input features 3072 N output 10

Finetuning the convnet

• Load a pretrained model and reset final fully connected layer.

• SGD optimizer.

model_ft = models.resnet18(pretrained=True)
num_ftrs = model_ft.fc.in_features
# Here the size of each output sample is set to 10.
model_ft.fc = nn.Linear(num_ftrs, D_out)

model_ft = model_ft.to(device)

criterion = nn.CrossEntropyLoss()

# Observe that all parameters are being optimized
optimizer_ft = optim.SGD(model_ft.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

model, losses, accuracies = train_val_model(model_ft, criterion, optimizer_ft,
dataloaders, scheduler=exp_lr_scheduler, num_epochs=25, log_interval=5)

epochs = np.arange(len(losses['train']))
_ = plt.plot(epochs, losses['train'], '-b', epochs, losses['val'], '--r')

Epoch 0/24
----------
train Loss: 1.2478 Acc: 0.5603
val Loss: 0.9084 Acc: 0.6866

Epoch 5/24
----------
train Loss: 0.5801 Acc: 0.7974
val Loss: 0.5918 Acc: 0.7951

Epoch 10/24
----------
train Loss: 0.4765 Acc: 0.8325
val Loss: 0.5257 Acc: 0.8178

Epoch 15/24
----------
train Loss: 0.4555 Acc: 0.8390
val Loss: 0.5205 Acc: 0.8201

Epoch 20/24
----------
train Loss: 0.4557 Acc: 0.8395
val Loss: 0.5183 Acc: 0.8212

Training complete in 277m 16s
Best val Acc: 0.822800

Adam optimizer

model_ft = models.resnet18(pretrained=True)
num_ftrs = model_ft.fc.in_features
# Here the size of each output sample is set to 10.
model_ft.fc = nn.Linear(num_ftrs, D_out)
model_ft = model_ft.to(device)
criterion = nn.CrossEntropyLoss()

# Observe that all parameters are being optimized
optimizer_ft = torch.optim.Adam(model_ft.parameters(), lr=0.001)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

model, losses, accuracies = train_val_model(model_ft, criterion, optimizer_ft,
    dataloaders, scheduler=exp_lr_scheduler, num_epochs=25, log_interval=5)

epochs = np.arange(len(losses['train']))
_ = plt.plot(epochs, losses['train'], '-b', epochs, losses['val'], '--r')

Epoch 0/24
----------
train Loss: 1.0491 Acc: 0.6407
val Loss: 0.8981 Acc: 0.6881
Epoch 5/24
----------
train Loss: 0.5495 Acc: 0.8135
val Loss: 0.9076 Acc: 0.7147
Epoch 10/24
----------
train Loss: 0.3352 Acc: 0.8834
val Loss: 0.4148 Acc: 0.8613
Epoch 15/24
----------
train Loss: 0.2819 Acc: 0.9017
val Loss: 0.4019 Acc: 0.8646
Epoch 20/24
----------
train Loss: 0.2719 Acc: 0.9050
val Loss: 0.4025 Acc: 0.8675
Training complete in 293m 37s
Best val Acc: 0.868800

ResNet as a feature extractor

Freeze the whole network except the final layer: set requires_grad = False
on the frozen parameters so that their gradients are not computed in backward().

model_conv = torchvision.models.resnet18(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False

# Parameters of newly constructed modules have requires_grad=True by default
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, D_out)

model_conv = model_conv.to(device)

criterion = nn.CrossEntropyLoss()

# Observe that only parameters of final layer are being optimized as
# opposed to before.
optimizer_conv = optim.SGD(model_conv.fc.parameters(), lr=0.001, momentum=0.9)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)

model, losses, accuracies = train_val_model(model_conv, criterion, optimizer_conv,
    dataloaders, scheduler=exp_lr_scheduler, num_epochs=25, log_interval=5)

epochs = np.arange(len(losses['train']))
_ = plt.plot(epochs, losses['train'], '-b', epochs, losses['val'], '--r')

Epoch 0/24
----------
train Loss: 1.9107 Acc: 0.3265
val Loss: 1.7982 Acc: 0.3798

Epoch 5/24
----------
train Loss: 1.6666 Acc: 0.4165
val Loss: 1.7067 Acc: 0.4097

Epoch 10/24
----------
train Loss: 1.6411 Acc: 0.4278
val Loss: 1.6737 Acc: 0.4269

Epoch 15/24
----------
train Loss: 1.6315 Acc: 0.4299
val Loss: 1.6724 Acc: 0.4218

Epoch 20/24
----------
train Loss: 1.6400 Acc: 0.4274
val Loss: 1.6755 Acc: 0.4250

Training complete in 61m 46s
Best val Acc: 0.430200

Adam optimizer

model_conv = torchvision.models.resnet18(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False

# Parameters of newly constructed modules have requires_grad=True by default
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, D_out)

model_conv = model_conv.to(device)

criterion = nn.CrossEntropyLoss()

# Observe that only parameters of final layer are being optimized as
# opposed to before.
optimizer_conv = optim.Adam(model_conv.fc.parameters(), lr=0.001)

# Decay LR by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)

model, losses, accuracies = train_val_model(model_conv, criterion, optimizer_conv,
    dataloaders, scheduler=exp_lr_scheduler, num_epochs=25, log_interval=5)

epochs = np.arange(len(losses['train']))
_ = plt.plot(epochs, losses['train'], '-b', epochs, losses['val'], '--r')

Note that the scheduler must be passed by keyword (scheduler=exp_lr_scheduler);
passing it positionally makes it collide with the dataloaders argument and raises
"TypeError: train_val_model() got multiple values for argument num_epochs".


CHAPTER

SEVEN

INDICES AND TABLES

• genindex
• modindex
• search
