A platform for GPU programming challenges. Write efficient GPU kernels and compare your solutions with other developers.
Need help getting started? Join our community on Discord
vector-add.cu
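#include <cuda_runtime.h>

// Baseline body: each thread adds one pair of elements.
__global__ void vectorAdd(const float* A, const float* B, float* C, int N) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global element index
    if (i < N) C[i] = A[i] + B[i];                  // guard the tail block
}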
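Before submitting, it can help to check a kernel with this signature against a CPU reference on your own machine. The following is a minimal sketch of such a check, not Tensara's actual harness: the names `vectorAddRef` and `maxVectorAddError`, the block size of 256, and the input pattern are all assumptions chosen for illustration.

```cuda
#include <cassert>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical baseline kernel: one thread per output element.
__global__ void vectorAddRef(const float* A, const float* B, float* C, int N) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N) C[i] = A[i] + B[i];
}

// Hypothetical local check (not part of Tensara): runs the kernel on n
// elements and returns the largest absolute error against a CPU reference,
// or -1.0f if any CUDA call fails.
float maxVectorAddError(int n) {
    assert(n > 0);
    std::vector<float> hA(n), hB(n), hC(n);
    for (int i = 0; i < n; ++i) {
        hA[i] = 0.5f * i;
        hB[i] = 2.0f * i;
    }

    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;

    // Allocate device buffers and copy the inputs over.
    bool ok = cudaMalloc(&dA, bytes) == cudaSuccess
           && cudaMalloc(&dB, bytes) == cudaSuccess
           && cudaMalloc(&dC, bytes) == cudaSuccess
           && cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        int threads = 256;                         // assumed block size
        int blocks = (n + threads - 1) / threads;  // round up to cover n
        vectorAddRef<<<blocks, threads>>>(dA, dB, dC, n);
        // The blocking device-to-host copy also waits for the kernel.
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    // cudaFree accepts null pointers, so this is safe on every path.
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    if (!ok) return -1.0f;

    float maxErr = 0.0f;
    for (int i = 0; i < n; ++i) {
        maxErr = std::fmax(maxErr, std::fabs(hC[i] - (hA[i] + hB[i])));
    }
    return maxErr;
}
```

Calling `maxVectorAddError(1 << 20)` from a small `main` and comparing the result against a tolerance is one way to catch indexing mistakes early. A negative return value means a CUDA call failed, for example because no device was available.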
Tensara provides a unique platform for honing your GPU programming skills through competitive challenges and detailed benchmarking.
Submissions are run on standardized GPU hardware for fair and accurate performance comparisons.
See how your solutions stack up against others on detailed leaderboards for each problem.
Discuss strategies, share insights, and learn from fellow GPU programming enthusiasts.
New rating system for user rankings. (2 weeks ago)
Added support for Triton-based kernel submissions. (3 weeks ago)
Improved error handling and rate limiting. (3 weeks ago)

Working on allowing direct submissions via CLI. (1 week ago)
Initial release of the Tensara CLI. (1 month ago)
Improved local benchmarking accuracy. (1 month ago)

New set of convolution challenges available. (1 week ago)
Added new matrix multiplication problems. (2 weeks ago)
Added difficulty tags to problems. (2 weeks ago)
© 2025 Tensara. All rights reserved.