Perform 2D average pooling on an input tensor:
$$\text{output}[i, j] = \frac{1}{k^2} \sum_{m=0}^{k-1} \sum_{n=0}^{k-1} \text{input}[S \cdot i + m - P,\ S \cdot j + n - P]$$
The average pooling operation slides a window of size k×k over the input tensor with stride S and padding P, computing the average value within each window position.
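As a rough illustration of the formula for a single output element, the sketch below computes one window average on a nested-list matrix. The function name `pool_at` and the list-of-rows layout are assumptions made for readability and are not part of the problem statement.

```python
def pool_at(inp, i, j, k, S, P):
    """Average of the k x k window that produces output position (i, j).

    inp is a list of rows (H x W). Positions that fall outside the
    input are treated as zeros, and the sum is divided by k * k.
    """
    H, W = len(inp), len(inp[0])
    total = 0.0
    for m in range(k):
        for n in range(k):
            r = S * i + m - P  # row index into the unpadded input
            c = S * j + n - P  # column index into the unpadded input
            if 0 <= r < H and 0 <= c < W:
                total += inp[r][c]
    return total / (k * k)


x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
print(pool_at(x, 0, 0, k=2, S=1, P=0))  # (1 + 2 + 4 + 5) / 4 = 3.0
print(pool_at(x, 0, 0, k=2, S=1, P=1))  # only x[0][0] is in range: 1 / 4 = 0.25
```

With P=1 the window for position (0, 0) covers three padded positions and one real element, which is why the result is 0.25 rather than 1.0.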
Input:
- Matrix input of size H×W (input tensor)
- kernel_size (k): Size of the pooling window
- stride (S): Step size between window positions
- padding (P): Number of zero-padding elements added on all sides
Output:
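- Matrix output of size $H_{out} \times W_{out}$, where:
$$H_{out} = \left\lfloor \frac{H + 2P - k}{S} + 1 \right\rfloor$$
$$W_{out} = \left\lfloor \frac{W + 2P - k}{S} + 1 \right\rfloor$$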
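The output dimensions can be checked with a small helper. This is a sketch that assumes the window fits at least once (size + 2P ≥ k), so integer floor division matches the floor in the formula. The name `pooled_size` is hypothetical.

```python
def pooled_size(size, k, S, P):
    """Output length along one axis: floor((size + 2P - k) / S + 1).

    Assumes size + 2 * P >= k, so the numerator is non-negative.
    """
    return (size + 2 * P - k) // S + 1


print(pooled_size(32, k=2, S=2, P=0))  # 30 // 2 + 1 = 16
print(pooled_size(7, k=3, S=2, P=1))   # 6 // 2 + 1 = 4
print(pooled_size(5, k=3, S=2, P=0))   # 2 // 2 + 1 = 2
```

The same helper applies to both axes: call it once with H and once with W.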
Notes:
- All matrices are stored in row-major order
- Zero padding is applied when specified by the padding parameter
- For values outside the input boundaries (after padding), use zero values in the average computation
- The denominator ($k^2$) should always be the full kernel size, even when some elements are outside the input boundaries
- This problem is adapted from KernelBench
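The notes above can be combined into a plain CPU reference that works on flat row-major buffers. This is only a sketch for checking results, not the intended optimized solution. The function name `avg_pool2d_flat` and its return shape are assumptions.

```python
def avg_pool2d_flat(inp, H, W, k, S, P):
    """Average pooling over a flat, row-major H x W buffer.

    Element (r, c) lives at inp[r * W + c]. Out-of-range positions
    contribute zero, and every window is divided by k * k.
    Returns (out, H_out, W_out) with out also flat and row-major.
    """
    H_out = (H + 2 * P - k) // S + 1
    W_out = (W + 2 * P - k) // S + 1
    out = [0.0] * (H_out * W_out)
    for i in range(H_out):
        for j in range(W_out):
            total = 0.0
            for m in range(k):
                for n in range(k):
                    r = S * i + m - P
                    c = S * j + n - P
                    if 0 <= r < H and 0 <= c < W:
                        total += inp[r * W + c]  # row-major lookup
            out[i * W_out + j] = total / (k * k)  # full kernel denominator
    return out, H_out, W_out


flat = [float(v) for v in range(1, 10)]  # 3 x 3 matrix holding 1..9
out, H_out, W_out = avg_pool2d_flat(flat, H=3, W=3, k=2, S=1, P=0)
print(H_out, W_out, out)  # 2 2 [3.0, 4.0, 6.0, 7.0]
```

With k=3, S=1, P=1 on the same input, the corner output is 12/9 rather than 12/4, since the five padded zeros still count toward the denominator.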