Implement Layer Normalization over the last 3 dimensions (F, D1, D2) of a 4D tensor.
The formula for Layer Normalization is:
y = (x − E[x]) / √(Var[x] + ϵ) ∗ γ + β
where the mean E[x] and variance Var[x] are computed over the normalization dimensions (F, D1, D2) for each element in the first dimension (B). γ and β are learnable affine parameters (elementwise scale and shift), and ϵ is a small value added to the variance for numerical stability.
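Written out with explicit reduction indices, the per-sample statistics and the normalized output are just a restatement of the formula above in summation form (using the biased variance, i.e. dividing by the element count):

```latex
\mu_b = \frac{1}{F \cdot D_1 \cdot D_2} \sum_{f=1}^{F} \sum_{i=1}^{D_1} \sum_{j=1}^{D_2} x_{b,f,i,j}
\qquad
\sigma_b^2 = \frac{1}{F \cdot D_1 \cdot D_2} \sum_{f=1}^{F} \sum_{i=1}^{D_1} \sum_{j=1}^{D_2} \left( x_{b,f,i,j} - \mu_b \right)^2
\qquad
y_{b,f,i,j} = \frac{x_{b,f,i,j} - \mu_b}{\sqrt{\sigma_b^2 + \epsilon}} \cdot \gamma_{f,i,j} + \beta_{f,i,j}
```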
Input:
Tensor X of shape (B,F,D1,D2) (input data)
Tensor gamma of shape (F,D1,D2) (scale parameters)
Tensor beta of shape (F,D1,D2) (shift parameters)
Epsilon ϵ (a small float, typically 1e-5)
Output:
Tensor Y of shape (B,F,D1,D2) (normalized data)
Notes:
Compute the mean and variance across the last three dimensions (F,D1,D2) independently for each sample in the batch B.
Apply the normalization using the computed mean/variance and the provided γ and β.
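As a minimal reference sketch of the computation (assuming NumPy arrays with the shapes described above; function and variable names are illustrative, not prescribed by the task):

```python
import numpy as np

def layer_norm(x: np.ndarray, gamma: np.ndarray, beta: np.ndarray,
               eps: float = 1e-5) -> np.ndarray:
    """Layer Normalization over the last three dimensions (F, D1, D2).

    x:     shape (B, F, D1, D2)
    gamma: shape (F, D1, D2), elementwise scale
    beta:  shape (F, D1, D2), elementwise shift
    """
    # Per-sample mean and (biased) variance over the normalization dims.
    mean = x.mean(axis=(1, 2, 3), keepdims=True)   # shape (B, 1, 1, 1)
    var = x.var(axis=(1, 2, 3), keepdims=True)     # shape (B, 1, 1, 1)

    # Normalize, then apply the affine parameters (broadcast over B).
    x_hat = (x - mean) / np.sqrt(var + eps)
    return x_hat * gamma + beta


if __name__ == "__main__":
    # Example usage with random data and identity affine parameters.
    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, 3, 4, 5)).astype(np.float32)
    gamma = np.ones((3, 4, 5), dtype=np.float32)
    beta = np.zeros((3, 4, 5), dtype=np.float32)
    y = layer_norm(x, gamma, beta)
    print(y.shape)  # (2, 3, 4, 5)
```

Note that `keepdims=True` keeps the reduced axes as size 1 so the per-sample mean and variance broadcast back against X, while gamma and beta broadcast across the batch dimension B.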