An issue in GELU results

hdnhan

Jan 31, 2026

Formula: $\text{GELU}(x) = x \cdot \Phi(x)$
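
Here $\Phi$ is the standard normal cumulative distribution function. For reference, the exact form can be spelled out with the error function (a standard identity, not something specific to this problem):

$$\text{GELU}(x) = x \cdot \Phi(x) = \frac{x}{2}\left( 1 + \text{erf}\left( \frac{x}{\sqrt{2}} \right) \right)$$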

From PyTorch's official docs (https://docs.pytorch.org/docs/2.8/generated/torch.nn.GELU.html):

When the approximate argument is ‘tanh’, GELU is estimated with: $\text{GELU}(x) = 0.5x\left( 1 + \tanh\left( \sqrt{\frac{2}{\pi}}\left( x + 0.044715x^{3}\right) \right) \right)$

But the reference solution is:

def reference_solution(self, input_matrix: torch.Tensor) -> torch.Tensor:
    """
    PyTorch implementation of GELU.
        
    Args:
        input_matrix: Input matrix of shape (M, N)
            
    Returns:
        Result of GELU activation
    """
    with torch.no_grad(), torch.autocast("cuda", enabled=False, dtype=input_matrix.dtype):
        return torch.nn.functional.gelu(input_matrix)

which means $\text{approximate}=\text{none}$ is used by default, not $\tanh$.

For example, when $x=-5$, the expected result should be $0$, or more precisely $\sim -2 \times 10^{-7}$. A simple check:

Go to this site (https://cpp.sh) and paste this script:

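#include <cmath>   // std::tanh
#include <cstdio>  // std::printf
int main() {
    float x = -5.0f;
    // tanh approximation of GELU; 0.797884 is sqrt(2/pi) rounded
    float y = 0.5f * x * (1.0f + std::tanh(0.797884f * (x + 0.044715f * x * x * x)));
    std::printf("%.6f\n", y);
}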

Or go to this website (https://www.wolframalpha.com/input?i=0.5*x*%281+%2B+tanh%28sqrt%282%2Fpi%29+*+%28x+%2B+0.044715+*+x%5E3%29%29%29+where+x+%3D+-5); the result should be shown as $-2.2918 \times 10^{-7}$.
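
To see how far apart the two variants are at this input, here is a minimal sketch in plain Python (standard library only, double precision). The helper names `gelu_exact` and `gelu_tanh` are made up for illustration and are not part of PyTorch or of the reference solution; the exact form is written with `math.erfc`, which is equivalent to $x \cdot \Phi(x)$ but avoids cancellation in the tail.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU, x * Phi(x), using Phi(x) = 0.5 * erfc(-x / sqrt(2))."""
    return 0.5 * x * math.erfc(-x / math.sqrt(2.0))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU, following the formula quoted from the PyTorch docs."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


x = -5.0
print(f"exact GELU(-5): {gelu_exact(x):.4e}")
print(f"tanh  GELU(-5): {gelu_tanh(x):.4e}")
```

In double precision the tanh variant should come out near $-2.29 \times 10^{-7}$, matching the WolframAlpha value above, while the exact variant is about $-1.43 \times 10^{-6}$, so the two differ by roughly a factor of six at $x=-5$. In float32 the figures will be coarser, since $1 + \tanh(\cdot)$ is close to the rounding limit there.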
