Differences in LSTM Weight Layout Between TensorFlow and PyTorch

July 6, 2018, 8 p.m.


TensorFlow:

# i = input_gate, j = new_input, f = forget_gate, o = output_gate
i, j, f, o = array_ops.split(
    value=lstm_matrix, num_or_size_splits=4, axis=1)

Reference:
https://github.com/tensorflow/tensorflow/blob/r1.8/tensorflow/python/ops/rnn_cell_impl.py#L836

PyTorch:


    Attributes:
        weight_ih_l[k] : the learnable input-hidden weights of the :math:`\text{k}^{th}` layer
            `(W_ii|W_if|W_ig|W_io)`, of shape `(4*hidden_size x input_size)`
        weight_hh_l[k] : the learnable hidden-hidden weights of the :math:`\text{k}^{th}` layer
            `(W_hi|W_hf|W_hg|W_ho)`, of shape `(4*hidden_size x hidden_size)`
        bias_ih_l[k] : the learnable input-hidden bias of the :math:`\text{k}^{th}` layer
            `(b_ii|b_if|b_ig|b_io)`, of shape `(4*hidden_size)`
        bias_hh_l[k] : the learnable hidden-hidden bias of the :math:`\text{k}^{th}` layer
            `(b_hi|b_hf|b_hg|b_ho)`, of shape `(4*hidden_size)`

Reference:
https://pytorch.org/docs/stable/_modules/torch/nn/modules/rnn.html#LSTMCell

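As the two excerpts show, the forget gate and the new-memory (candidate) gate are stored in opposite order: TensorFlow splits the fused matrix as (i, j, f, o), while PyTorch packs its weights as (i, f, g, o), where TensorFlow's j (new_input) corresponds to PyTorch's g. When copying weights from one framework to the other, the second and third gate blocks therefore have to be swapped.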
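The reordering can be sketched in plain Python. This is only an illustration, not code from either framework: the helper names `tf_to_pytorch_perm` and `convert_kernel` are hypothetical, and the sketch assumes the TensorFlow kernel has shape `(input_size + hidden_size, 4 * hidden_size)` with the input rows first, which should be checked against the cell actually in use.

```python
def tf_to_pytorch_perm(hidden_size):
    """Column indices that turn TensorFlow's (i, j, f, o) gate blocks
    into PyTorch's (i, f, g, o) order. Hypothetical helper."""
    h = hidden_size
    i = list(range(0 * h, 1 * h))
    j = list(range(1 * h, 2 * h))  # TF new_input, i.e. PyTorch g
    f = list(range(2 * h, 3 * h))
    o = list(range(3 * h, 4 * h))
    return i + f + j + o


def convert_kernel(tf_kernel, input_size, hidden_size):
    """Convert a TF-style fused kernel (nested lists) into PyTorch-style
    (weight_ih, weight_hh).

    Assumption: tf_kernel has input_size + hidden_size rows (input rows
    first) and 4 * hidden_size columns.
    """
    perm = tf_to_pytorch_perm(hidden_size)
    # Swap the j and f column blocks in every row.
    reordered = [[row[c] for c in perm] for row in tf_kernel]
    # Transpose to (4 * hidden_size) x (input_size + hidden_size).
    transposed = [list(col) for col in zip(*reordered)]
    # Split each row into the input-hidden and hidden-hidden parts.
    weight_ih = [row[:input_size] for row in transposed]
    weight_hh = [row[input_size:] for row in transposed]
    return weight_ih, weight_hh


# Tiny example: input_size = 2, hidden_size = 1, so the kernel is 3 x 4.
tf_kernel = [
    [1, 2, 3, 4],     # first input feature:  i, j, f, o
    [5, 6, 7, 8],     # second input feature: i, j, f, o
    [9, 10, 11, 12],  # hidden state:         i, j, f, o
]
weight_ih, weight_hh = convert_kernel(tf_kernel, input_size=2, hidden_size=1)
print(weight_ih)  # [[1, 5], [3, 7], [2, 6], [4, 8]]
print(weight_hh)  # [[9], [11], [10], [12]]
```

The same permutation applies to the bias vector. Two further points are assumptions worth verifying before relying on a conversion: TensorFlow's `LSTMCell` appears to add its `forget_bias` argument to the forget gate at run time rather than storing it in the bias variable, and PyTorch keeps two bias vectors (`bias_ih` and `bias_hh`) whose sum plays the role of TensorFlow's single bias.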





