R_DROP
July 7, 2021, 7:37 p.m.
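This post mainly records an equivalent PyTorch implementation,
so that it is easy to reuse later.
The core pieces are cross-entropy and KL divergence,
and the network needs to contain a dropout layer.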
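To make the dropout requirement concrete: R-Drop relies on two forward passes of the same input giving different outputs, which only happens while dropout is active. A minimal sketch, assuming a hypothetical toy classifier (the layer sizes and dropout rate are illustrative and not taken from the paper):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy classifier; any network containing a dropout layer would do.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Dropout(p=0.3),
    nn.Linear(32, 4),
)

x = torch.randn(8, 16)

# Training mode: dropout samples a fresh mask on every forward pass,
# so the same input yields two different sets of logits.
model.train()
logits1 = model(x)
logits2 = model(x)
differ_in_train = not torch.allclose(logits1, logits2)

# Eval mode: dropout is a no-op, so repeated passes agree.
model.eval()
with torch.no_grad():
    same_in_eval = torch.allclose(model(x), model(x))

print(differ_in_train, same_in_eval)
```

Without a dropout layer (or in eval mode) the two passes coincide, and the KL term between them would be zero.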
0x00 CE LOSS
code
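import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import tensorflow as tf

# Random logits for a batch of 10 samples over 512 classes (unseeded, so values differ per run)
logits = np.random.randn(10, 512)
logits_torch = torch.from_numpy(logits).float()
logits_tf = tf.convert_to_tensor(value=logits, dtype=tf.float32)

# Integer class labels, one per sample
label = np.array([99, 11, 22, 33, 44, 88, 47, 478, 500, 501]).astype(np.int32)
label_torch = torch.from_numpy(label).long()  # CrossEntropyLoss expects int64 class indices
label_tf = tf.convert_to_tensor(value=label, dtype=tf.int32)
label_tf = tf.one_hot(label_tf, 512)  # categorical_crossentropy expects one-hot targets

# TensorFlow: apply softmax explicitly, then average the per-sample losses
loss_tf = tf.keras.losses.categorical_crossentropy(label_tf, tf.keras.activations.softmax(logits_tf))
print(loss_tf.numpy().mean())

# PyTorch: CrossEntropyLoss takes raw logits (softmax is applied internally) and averages by default
ce_loss = nn.CrossEntropyLoss()
print(ce_loss(logits_torch, label_torch))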
out:
6.741584
tensor(6.7416)
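The two numbers agree because both compute the mean of the negative log-softmax probability of the true class. A small PyTorch-only sketch that spells this out by hand (the seed and the random labels here are my own choices for illustration, not the values used above):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(10, 512)            # raw scores, batch of 10, 512 classes
labels = torch.randint(0, 512, (10,))    # illustrative random class indices

# Built-in loss: log-softmax + negative log-likelihood, averaged over the batch.
builtin = nn.CrossEntropyLoss()(logits, labels)

# Manual version: take the log-probability of the true class in each row.
log_probs = F.log_softmax(logits, dim=-1)
manual = -log_probs[torch.arange(len(labels)), labels].mean()

print(builtin.item(), manual.item())
```

As a sanity check, for standard-normal logits over 512 classes the expected loss is roughly ln 512 + 0.5 ≈ 6.74, which is consistent with the output above.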
0x01 KL Divergence
To be updated once I have finished reading the paper.
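In the meantime, here is a rough sketch of how the KL term could be combined with cross-entropy in PyTorch. It follows my reading of the R-Drop idea (two dropout passes, averaged cross-entropy, plus a symmetric KL between the two predictions). The function names `symmetric_kl` and `r_drop_loss`, the `alpha` default, and the exact scaling of each term are assumptions for illustration, not the authors' code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def symmetric_kl(logits_p, logits_q):
    """Average of KL(p||q) and KL(q||p), each averaged over the batch."""
    log_p = F.log_softmax(logits_p, dim=-1)
    log_q = F.log_softmax(logits_q, dim=-1)
    # F.kl_div(input, target) takes log-probabilities as input and computes
    # KL(target || input); log_target=True means target is also in log space.
    kl_pq = F.kl_div(log_q, log_p, reduction="batchmean", log_target=True)
    kl_qp = F.kl_div(log_p, log_q, reduction="batchmean", log_target=True)
    return 0.5 * (kl_pq + kl_qp)


def r_drop_loss(model, x, labels, alpha=1.0):
    """Hypothetical helper: two forward passes with dropout, then CE + alpha * KL."""
    logits1 = model(x)  # first dropout mask
    logits2 = model(x)  # second, independently sampled dropout mask
    ce = 0.5 * (F.cross_entropy(logits1, labels) + F.cross_entropy(logits2, labels))
    return ce + alpha * symmetric_kl(logits1, logits2)


# Usage with a toy model (sizes are illustrative only).
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Dropout(0.3), nn.Linear(32, 4))
toy_model.train()  # dropout must be active for the two passes to differ
x = torch.randn(8, 16)
y = torch.randint(0, 4, (8,))
loss = r_drop_loss(toy_model, x, y, alpha=1.0)
loss.backward()
print(loss.item())
```

The weight `alpha` is a hyperparameter that needs tuning per task; the value 1.0 above is only a placeholder.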
0x02 Ref
又是Dropout两次!这次它做到了有监督任务的SOTA (Dropout Twice Again! This Time It Reaches SOTA on Supervised Tasks)
R-Drop: Regularized Dropout for Neural Networks
https://github.com/dropreg/R-Drop