【27】A simple implementation of Grad-CAM and a demonstration of its results



If you spot any errors, please point them out.


1. A simple implementation of Grad-CAM


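Grad-CAM backpropagates the final prediction score $y^c$ for class c and collects the gradient A' that flows back into the feature map A. A' is simply the partial derivative of $y^c$ with respect to A. The idea is that regions with larger gradients matter more for the class in question. Grad-CAM therefore averages A' over the spatial positions of each channel, which gives one importance weight per channel. Each channel of A is then multiplied by its weight and the channels are summed, i.e. a simple weighted sum, and a ReLU discards the negative part of the result. What remains is the Grad-CAM heatmap. Producing the final visualization also requires some post-processing, such as interpolating the heatmap up to the input resolution.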

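The whole procedure can be written as the following formulas, where $A^k$ is channel $k$ of the feature map and $Z$ is the number of spatial positions in that channel:

$$\alpha_k^c = \frac{1}{Z}\sum_i\sum_j \frac{\partial y^c}{\partial A^k_{ij}}, \qquad L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\Big(\sum_k \alpha_k^c A^k\Big)$$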
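The weighting described above (spatially averaged gradients as channel weights, a weighted sum over channels, then ReLU) can be sketched on plain tensors. This is only an illustrative sketch: the function name `grad_cam_from_tensors` and the toy values are assumptions made for this example and do not come from the original code.

```python
import torch
import torch.nn.functional as F


def grad_cam_from_tensors(activations: torch.Tensor, gradients: torch.Tensor) -> torch.Tensor:
    """activations and gradients have shape (B, C, H, W); returns a (B, H, W) heatmap."""
    # One weight per channel: average the gradient over the spatial positions.
    weights = gradients.mean(dim=(2, 3), keepdim=True)      # (B, C, 1, 1)
    # Weighted sum over channels, then ReLU to drop the negative part.
    return F.relu((weights * activations).sum(dim=1))       # (B, H, W)


# Toy example: 1 image, 2 channels, 2x2 feature map.
A = torch.tensor([[[[1., 2.], [3., 4.]],
                   [[1., 1.], [1., 1.]]]])
dA = torch.tensor([[[[1., 1.], [1., 1.]],        # channel weight  1
                    [[-2., -2.], [-2., -2.]]]])  # channel weight -2
cam = grad_cam_from_tensors(A, dA)
print(cam)  # relu(1 * A[0] - 2 * A[1]) = [[0, 0], [1, 2]]
```

The second channel receives a negative weight, so it lowers the map wherever it is active. The ReLU then clips the positions that end up below zero.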

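The implementation has to solve two main problems:

- 1) capturing the gradient information of an intermediate layer
- 2) extracting the feature map output by a chosen layer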

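For problem 1, a backward hook can be registered on a specific convolutional layer to capture the gradients flowing through it. The core code is: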

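```python
input_grad = []
output_grad = []


def save_gradient(module, grad_input, grad_output):
    # Backward hook: store the gradients w.r.t. the module's inputs and outputs.
    input_grad.append(grad_input)
    # print(f"{module.__class__.__name__} input grad:\n{grad_input}\n")
    output_grad.append(grad_output)
    # print(f"{module.__class__.__name__} output grad:\n{grad_output}\n")


# `model` and `output` are defined in the complete code further down.
last_layer = model.layer4[-1]
last_layer.conv3.register_full_backward_hook(save_gradient)
output[0][0].backward()
```

For problem 2, the model keeps its pretrained parameters and is only used for inference, with no training involved, so I used the following approach: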

```python
import torch
from torchvision.models import resnet50


def get_feature_map(model, input_tensor):
    # Run the ResNet stem and the four stages, stopping before avgpool / fc.
    x = model.conv1(input_tensor)
    x = model.bn1(x)
    x = model.relu(x)
    x = model.maxpool(x)

    x = model.layer1(x)
    x = model.layer2(x)
    x = model.layer3(x)
    x = model.layer4(x)
    return x


# get output and feature
model = resnet50(pretrained=True)
input = torch.rand([8, 3, 224, 224], dtype=torch.float, requires_grad=True)
feature = get_feature_map(model, input)
```
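The two snippets above get the gradient from a module-level backward hook and the feature map from a separate partial forward pass. As an alternative sketch (not the author's method), a single forward hook can store a layer's output and attach a tensor hook to it. The feature map and its gradient then come from the same forward pass and refer to exactly the same tensor. The small `toy_model` below is a hypothetical stand-in for ResNet-50 so that the example runs without downloading weights.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy network standing in for ResNet-50.
toy_model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 5),
).eval()

captured = {}


def save_activation(module, inputs, output):
    # Forward hook: keep the layer output (the feature map A) ...
    captured["feature"] = output
    # ... and register a tensor hook that receives d(score)/dA during backward.
    output.register_hook(lambda grad: captured.__setitem__("grad", grad))


handle = toy_model[1].register_forward_hook(save_activation)

x = torch.rand(2, 3, 8, 8)
scores = toy_model(x)
scores[0, 0].backward()   # class 0 score of the first sample
handle.remove()

print(captured["feature"].shape, captured["grad"].shape)
```

Only the first sample's score is backpropagated, so the captured gradient for the second sample is all zeros.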

The complete code is shown below:

```python
import cv2
import numpy as np
import einops
import torch
import torch.nn.functional as F
from torchvision.models import resnet50
from torchvision.transforms import Compose, Normalize, ToTensor

input_grad = []
output_grad = []


def save_gradient(module, grad_input, grad_output):
    # Backward hook: store the gradients w.r.t. the module's inputs and outputs.
    input_grad.append(grad_input)
    # print(f"{module.__class__.__name__} input grad:\n{grad_input}\n")
    output_grad.append(grad_output)
    # print(f"{module.__class__.__name__} output grad:\n{grad_output}\n")


def preprocess_image(img: np.ndarray, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]) -> torch.Tensor:
    preprocessing = Compose([
        ToTensor(),
        Normalize(mean=mean, std=std)
    ])
    return preprocessing(img.copy()).unsqueeze(0)


def show_cam_on_image(img: np.ndarray,
                      mask: np.ndarray,
                      use_rgb: bool = False,
                      colormap: int = cv2.COLORMAP_JET) -> np.ndarray:
    """ This function overlays the cam mask on the image as a heatmap.
    By default the heatmap is in BGR format.
    :param img: The base image in RGB or BGR format.
    :param mask: The cam mask, with values in [0, 1].
    :param use_rgb: Whether to use an RGB or BGR heatmap, this should be set to True if 'img' is in RGB format.
    :param colormap: The OpenCV colormap to be used.
    :returns: The default image with the cam overlay.
    """
    heatmap = cv2.applyColorMap(np.uint8(255 * mask), colormap)
    if use_rgb:
        heatmap = cv2.cvtColor(heatmap, cv2.COLOR_BGR2RGB)
    heatmap = np.float32(heatmap) / 255

    if np.max(img) > 1:
        raise Exception(
            "The input image should be np.float32 in the range [0, 1]")

    cam = heatmap + img
    cam = cam / np.max(cam)
    return np.uint8(255 * cam)


def get_feature_map(model, input_tensor):
    # Run the ResNet stem and the four stages, stopping before avgpool / fc.
    x = model.conv1(input_tensor)
    x = model.bn1(x)
    x = model.relu(x)
    x = model.maxpool(x)

    x = model.layer1(x)
    x = model.layer2(x)
    x = model.layer3(x)
    x = model.layer4(x)
    return x


# prepare image
image_path = './photo/cow.jpg'
rgb_img = cv2.imread(image_path, 1)[:, :, ::-1]   # BGR -> RGB
rgb_img = cv2.resize(rgb_img, (224, 224))
rgb_img = np.float32(rgb_img) / 255
input_tensor = preprocess_image(rgb_img, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
input_tensor.requires_grad_(True)   # needed so that input_tensor.grad is populated
# input_tensor = torch.rand([8, 3, 224, 224], dtype=torch.float, requires_grad=True)

model = resnet50(pretrained=True)   # newer torchvision prefers weights=ResNet50_Weights.DEFAULT
model.eval()                        # inference only: use the BatchNorm running statistics
last_layer = model.layer4[-1]
# Note: this hook sees the gradient at conv3's output, i.e. before bn3,
# the residual addition and the final ReLU of the block.
last_layer.conv3.register_full_backward_hook(save_gradient)
# print(model)

# get output and feature
feature = get_feature_map(model, input_tensor)
output = model(input_tensor)
# The shapes in the comments below are for the random batch of 8 above.
# print("feature.shape:", feature.shape)    # torch.Size([8, 2048, 7, 7])
# print("output.shape:", output.shape)      # torch.Size([8, 1000])

# compute gradients: backpropagate the score of class index 0 for the first image
output[0][0].backward()
grad_info = input_tensor.grad
# print("grad_info.shape: ", grad_info.shape)             # torch.Size([8, 3, 224, 224])
# print("input_grad.shape:", input_grad[0][0].shape)      # torch.Size([8, 512, 7, 7])
# print("output_grad.shape:", output_grad[0][0].shape)    # torch.Size([8, 2048, 7, 7])

feature_grad = output_grad[0][0]
feature_weight = einops.reduce(feature_grad, 'b c h w -> b c', 'mean')
grad_cam = feature * feature_weight.unsqueeze(-1).unsqueeze(-1)     # (b c h w) * (b c 1 1) -> (b c h w)
grad_cam = F.relu(torch.sum(grad_cam, dim=1)).unsqueeze(dim=1)      # (b c h w) -> (b h w) -> (b 1 h w)
grad_cam = F.interpolate(grad_cam, size=(224, 224), mode='bilinear', align_corners=False)  # (b 1 h w) -> (b 1 224 224)
grad_cam = grad_cam[0, 0, :]                                        # (b 1 224 224) -> (224 224)
# print(grad_cam.shape)     # torch.Size([224, 224])

# Scale the heatmap to [0, 1], which show_cam_on_image expects for the mask.
grad_cam = grad_cam - grad_cam.min()
grad_cam = grad_cam / (grad_cam.max() + 1e-8)

# rgb_img is RGB, so build an RGB overlay and convert back to BGR for cv2.imwrite.
cam_image = show_cam_on_image(rgb_img, grad_cam.detach().numpy(), use_rgb=True)
cv2.imwrite('./result/test.jpg', cv2.cvtColor(cam_image, cv2.COLOR_RGB2BGR))
```
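Note that `output[0][0].backward()` in the code above always backpropagates the score of class index 0 for the first image, whatever the image shows. A minimal sketch of explaining the predicted class instead is given below. The tiny `classifier` layer and the random input are hypothetical stand-ins for the ResNet-50 model and the preprocessed image.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

classifier = nn.Linear(6, 4)                 # hypothetical stand-in for the full model
x = torch.rand(3, 6, requires_grad=True)     # hypothetical batch of 3 inputs

logits = classifier(x)                       # (3, 4)
pred = logits.argmax(dim=1)                  # predicted class per sample, shape (3,)

# Pick each sample's top score. Summing them lets one backward call cover the
# whole batch, because each score depends only on its own sample.
selected = logits.gather(1, pred.unsqueeze(1)).sum()
selected.backward()

print(pred, x.grad.shape)
```

For a linear layer the gradient of a class score with respect to the input is the corresponding weight row, so each row of `x.grad` equals `classifier.weight[pred[i]]`.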

2. Grad-CAM results


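Here I use an open-source CAM library from GitHub to generate the heatmap for an image. The reference code is given below; see reference 1 for more details and usage.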

```python
from pytorch_grad_cam import GradCAM, ScoreCAM, GradCAMPlusPlus, AblationCAM, XGradCAM, EigenCAM, FullGrad
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget
from pytorch_grad_cam.utils.image import show_cam_on_image, preprocess_image
from torchvision.models import resnet50
import torch
import argparse
import cv2
import numpy as np


def get_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--use-cuda', action='store_true', default=False,
                        help='Use NVIDIA GPU acceleration')
    parser.add_argument('--image-path', type=str, default='./photo/cow.jpg',
                        help='Input image path')
    parser.add_argument('--method', type=str, default='gradcam',
                        help='Can be gradcam/gradcam++/scorecam/xgradcam/ablationcam')
    parser.add_argument('--eigen_smooth', action='store_true',
                        help='Reduce noise by taking the first principal component '
                             'of cam_weights*activations')
    parser.add_argument('--aug_smooth', action='store_true',
                        help='Apply test time augmentation to smooth the CAM')

    args = parser.parse_args()
    args.use_cuda = args.use_cuda and torch.cuda.is_available()
    if args.use_cuda:
        print('Using GPU for acceleration')
    else:
        print('Using CPU for computation')

    return args


if __name__ == '__main__':

    args = get_args()
    model = resnet50(pretrained=True)

    target_layers = [model.layer4[-1]]

    rgb_img = cv2.imread(args.image_path, 1)[:, :, ::-1]   # BGR -> RGB
    rgb_img = cv2.resize(rgb_img, (224, 224))
    rgb_img = np.float32(rgb_img) / 255
    input_tensor = preprocess_image(rgb_img, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])

    # Construct the CAM object once, and then re-use it on many images.
    # use_cuda is accepted by the library version used here; newer releases may not take it.
    cam = GradCAM(model=model, target_layers=target_layers, use_cuda=args.use_cuda)

    # targets = [e.g ClassifierOutputTarget(281)]
    targets = None

    # You can also pass aug_smooth=True and eigen_smooth=True, to apply smoothing.
    grayscale_cam = cam(input_tensor=input_tensor,
                        targets=targets,
                        eigen_smooth=args.eigen_smooth,
                        aug_smooth=args.aug_smooth)

    # In this example grayscale_cam has only one image in the batch:
    grayscale_cam = grayscale_cam[0, :]

    # rgb_img is RGB, so build an RGB overlay and convert back to BGR for cv2.imwrite.
    cam_image = show_cam_on_image(rgb_img, grayscale_cam, use_rgb=True)
    cv2.imwrite('result.jpg', cv2.cvtColor(cam_image, cv2.COLOR_RGB2BGR))
```

The results are shown below:

[Figure: two Grad-CAM heatmap overlays produced by the code above]

3. Debug


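As a supplement, here are some errors I ran into while writing the code:

  • Problem 1: RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

This error occurs because the input variable does not track gradients. Adding requires_grad=True fixes it: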

```python
# x = torch.tensor([2], dtype=torch.float)
x = torch.tensor([2], dtype=torch.float, requires_grad=True)
```
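Problem 1 can be reproduced and fixed in a few lines. This is only an illustrative sketch; the variable names and the function `x ** 2` are chosen for the example.

```python
import torch

# Without requires_grad, nothing in the graph tracks gradients and backward() fails.
x_no_grad = torch.tensor([2.0])
try:
    (x_no_grad ** 2).backward()
    error_message = None
except RuntimeError as err:
    error_message = str(err)
print(error_message)

# With requires_grad=True the same computation works.
x = torch.tensor([2.0], requires_grad=True)
y = (x ** 2).sum()
y.backward()
print(x.grad)  # d(x^2)/dx = 2x, i.e. tensor([4.])
```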

  • Problem 2: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed.

/tmp/ipykernel_76011/1698254930.py:27: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more information

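Here I needed the gradient of the output with respect to the input x. Since x has to be differentiable, I simply set requires_grad=True, but accessing x.grad to print the gradient then produced the warning shown above.

The warning says that the .grad attribute of a non-leaf tensor is being accessed and that it will not be populated during autograd.backward(). x is a non-leaf tensor here because .reshape() returns a new tensor derived from the original leaf. To get the gradient of a non-leaf tensor, call .retain_grad() on it, which means adding one more line: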

```python
x = torch.tensor([1, 2, 3, 1, 1, 2, 2, 1, 2], dtype=torch.float32, requires_grad=True).reshape(1, 1, 3, 3)
x.retain_grad()
```

Later I saw in another blogger's code that the following also works:

```python
x = torch.tensor([1, 2, 3, 1, 1, 2, 2, 1, 2], dtype=torch.float32, requires_grad=True).reshape(1, 1, 3, 3)
x = torch.autograd.Variable(x, requires_grad=True)
# x.retain_grad()
```
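The difference between leaf and non-leaf tensors behind Problem 2 can be checked directly with `is_leaf`. The sketch below shows two options under the same toy setup: keeping the non-leaf tensor and calling `retain_grad()`, or reshaping first and enabling gradients afterwards so that the reshaped tensor is itself a leaf. The second option is an alternative added for this example, not something taken from the original code.

```python
import torch

values = [1, 2, 3, 1, 1, 2, 2, 1, 2]

# Option 1: reshape after requires_grad=True gives a non-leaf tensor,
# so retain_grad() is needed before backward().
non_leaf = torch.tensor(values, dtype=torch.float32, requires_grad=True).reshape(1, 1, 3, 3)
print(non_leaf.is_leaf)   # False
non_leaf.retain_grad()
(non_leaf ** 2).sum().backward()

# Option 2: reshape first, then enable gradients; the result is a leaf tensor.
leaf = torch.tensor(values, dtype=torch.float32).reshape(1, 1, 3, 3).requires_grad_(True)
print(leaf.is_leaf)       # True
(leaf ** 2).sum().backward()

print(non_leaf.grad.shape, leaf.grad.shape)
```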

  • Problem 3: Trying to backward through the graph a second time (or directly access saved variables after they have already been freed).

RuntimeError: Trying to backward through the graph a second time (or directly access saved variables after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved variables after calling backward.

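This error appeared when I wanted to inspect the backward gradients for both y[0] and y[1]. By the time of the second backward pass, the intermediate values saved in the graph had already been freed, so they have to be kept by specifying retain_graph=True. My fix was as follows:

```python
fc_out[0][0].backward()
x.grad

fc_out[0][1].backward()
x.grad
```

The code above was changed to:

```python
torch.autograd.backward(fc_out[0][0], retain_graph=True)
print("fc_out[0][0].backward:\n", x.grad)

# Gradients accumulate across backward calls, so reset x.grad before the second one.
x.grad.zero_()
torch.autograd.backward(fc_out[0][1], retain_graph=True)
print("fc_out[0][1].backward:\n", x.grad)
```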
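When the goal is only to read the gradient of several outputs separately, `torch.autograd.grad` is another option. It returns the gradients instead of accumulating them into `x.grad`. The sketch below is an alternative to the approach above, not the author's method; the small `weight` matrix is a hypothetical stand-in for the network that produces `fc_out`.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
weight = torch.tensor([[1.0, 0.0, 2.0],
                       [0.0, 3.0, 1.0]])
fc_out = (weight @ x).unsqueeze(0)   # shape (1, 2), stand-in for the real fc_out

# retain_graph=True keeps the graph alive for the second call.
grad_0, = torch.autograd.grad(fc_out[0][0], x, retain_graph=True)
grad_1, = torch.autograd.grad(fc_out[0][1], x)

print(grad_0)  # first row of weight
print(grad_1)  # second row of weight
print(x.grad)  # None: autograd.grad does not write into .grad
```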

References:

  1. http://github.com/jacobgil/pytorch-grad-cam
  2. http://blog.csdn.net/qq_37541097/article/details/123089851