---
title: RES
---

```mermaid
flowchart LR
    D(Input) --> E[Layer]
    E --> F(($$+$$))
    D --> F
    F --> G(Output)
```
# Residual Neural Network

Outline: Creating a residual neural network

## Skip Connections
Theoretically, you cannot have too many layers in a network. For example, if a perfect network has 3 layers, then any extra layer should simply do nothing, so a deeper network should always perform at least as well as a shallower one…
In practice, this is not the case. A common explanation is that layers struggle to learn the identity function, so extra layers can actually hurt performance. To address this, we can effectively add an identity path to our network that bypasses the layers, known as a skip connection.
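To make the idea concrete, here is a minimal sketch of a skip connection around a single layer (the layer and tensor sizes are arbitrary placeholders, not part of the model we build below): if the layer learns to output zeros, the block as a whole still passes the input through unchanged.

```python
import torch
from torch import nn

x = torch.randn(8, 16)      # a batch of 8 inputs with 16 features (arbitrary sizes)
layer = nn.Linear(16, 16)   # the layer we want to be able to bypass

out = layer(x) + x          # skip connection: output = F(x) + x
# If layer(x) were all zeros, out would equal x, i.e. the block acts as the identity.
```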
## Residual Networks
Let’s build a basic linear residual network. In essence, it is just an MLP with a skip connection:
```python
import torch
from torch import nn

class ResidualModel(nn.Module):
    def __init__(self,
                 in_neurons: int,
                 hidden_neurons: int,
                 out_neurons: int,
                 dropout: float = 0.,
                 ) -> None:
        super().__init__()
        # Dense path: a small MLP
        layers = [
            nn.Linear(in_neurons, hidden_neurons),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_neurons, out_neurons),
        ]
        self.dense = nn.Sequential(*layers)
        # Skip path: a single linear layer that matches the output width
        self.skip = nn.Linear(in_neurons, out_neurons)

    def forward(self, x) -> torch.Tensor:
        x = self.dense(x) + self.skip(x)
        return x
```
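As a quick sanity check (the sizes below are made up for illustration), we can push a random batch through the model and confirm the output shape:

```python
model = ResidualModel(in_neurons=16, hidden_neurons=64, out_neurons=4, dropout=0.1)
x = torch.randn(32, 16)   # batch of 32 samples, 16 features each
y = model(x)
print(y.shape)            # torch.Size([32, 4])
```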
This is a completely viable network, but it isn’t really how these are used. The idea of a residual block is for many of them to be combined in sequence to form a “deep” network. To do this, we could copy-paste our layers multiple times, or we can create a reusable block.
```python
class ResidualBlock(nn.Module):
    def __init__(self,
                 in_neurons: int,
                 hidden_neurons: int,
                 out_neurons: int,
                 dropout: float = 0.,
                 use_layer_norm: bool = False,
                 ) -> None:
        super().__init__()
        # Dense path, with dropout moved to the end of the block
        layers = [
            nn.Linear(in_neurons, hidden_neurons),
            nn.ReLU(),
            nn.Linear(hidden_neurons, out_neurons),
            nn.Dropout(dropout),
        ]
        self.dense = nn.Sequential(*layers)
        self.skip = nn.Linear(in_neurons, out_neurons)
        # Optional layer norm applied after the residual addition
        self.layer_norm = nn.LayerNorm(out_neurons) if use_layer_norm else None

    def forward(self, x) -> torch.Tensor:
        x = self.dense(x) + self.skip(x)
        if self.layer_norm is not None:
            x = self.layer_norm(x)
        return x
```
Notice that we moved the dropout to the end of the block and added the ability to apply a layer norm to the entire block.
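For example (again with arbitrary sizes), a block constructed with `use_layer_norm=True` normalizes the summed output of the dense and skip paths:

```python
block = ResidualBlock(in_neurons=16, hidden_neurons=64, out_neurons=16,
                      dropout=0.1, use_layer_norm=True)
x = torch.randn(32, 16)
y = block(x)
print(y.shape)            # torch.Size([32, 16])
print(y.mean().item())    # roughly 0 at initialization, since LayerNorm normalizes each row
```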
We can now build a model out of our residual blocks, basically the same way we made our original MLP:
```python
class ResNet(nn.Module):
    def __init__(self,
                 in_neurons: int,
                 hidden_neurons: int,
                 out_neurons: int,
                 no_blocks: int,
                 dropout: float = 0.,
                 use_layer_norm: bool = False,
                 ) -> None:
        super().__init__()
        # First block maps the input width to the hidden width
        layers = [ResidualBlock(in_neurons, hidden_neurons, hidden_neurons, dropout, use_layer_norm)]
        # Middle blocks keep the hidden width
        for _ in range(no_blocks):
            layers.append(ResidualBlock(hidden_neurons, hidden_neurons, hidden_neurons, dropout, use_layer_norm))
        # Last block maps the hidden width to the output width
        layers.append(ResidualBlock(hidden_neurons, hidden_neurons, out_neurons, dropout, use_layer_norm))
        self.model = nn.Sequential(*layers)

    def forward(self, x) -> torch.Tensor:
        x = self.model(x)
        return x
```
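Finally, a short usage sketch (the hyperparameters here are made up for illustration): build a model with a few blocks and run a batch through it.

```python
net = ResNet(in_neurons=16, hidden_neurons=128, out_neurons=10,
             no_blocks=4, dropout=0.1, use_layer_norm=True)
x = torch.randn(32, 16)
logits = net(x)
print(logits.shape)       # torch.Size([32, 10])
```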