
NVIDIA L40

NVIDIA graphics card specifications and benchmark scores

48 GB
VRAM
2,490 MHz
Boost
300 W
TDP
384-bit
Bus Width
Ray Tracing & Tensor Cores

NVIDIA L40 Specifications

⚙️

L40 GPU Core

Shader units and compute resources

The NVIDIA L40 GPU core specifications define its raw processing power for graphics and compute workloads. Shading units (also called CUDA cores, stream processors, or execution units depending on manufacturer) handle the parallel calculations required for rendering. TMUs (Texture Mapping Units) process texture data, while ROPs (Render Output Units) handle final pixel output. Higher shader counts generally translate to better GPU benchmark performance, especially in demanding games and 3D applications.

Shading Units
18,176
TMUs
568
ROPs
192
SM Count
142
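As a quick sanity check, the shader count follows directly from the SM count: each Ada Lovelace SM carries 128 FP32 CUDA cores. A minimal sketch in Python using the values from the table above:

```python
# Derive the L40 shader count from its SM count.
# Each Ada Lovelace SM contains 128 FP32 CUDA cores.
SM_COUNT = 142
CORES_PER_SM = 128

shading_units = SM_COUNT * CORES_PER_SM
print(shading_units)  # 18176, matching the table above
```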
⏱️

L40 Clock Speeds

GPU and memory frequencies

Clock speeds directly impact the L40's performance in GPU benchmarks and real-world gaming. The base clock represents the minimum guaranteed frequency, while the boost clock indicates peak performance under optimal thermal conditions. Memory clock speed affects texture loading and frame buffer operations. The L40 by NVIDIA dynamically adjusts frequencies based on workload, temperature, and power limits to maximize performance while maintaining stability.

Base Clock
735 MHz
Boost Clock
2,490 MHz
Memory Clock
2,250 MHz (18 Gbps effective)
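The "18 Gbps effective" figure is derived from the 2,250 MHz memory clock: GDDR6 transfers eight bits per pin per command-clock cycle. A quick check:

```python
# Effective GDDR6 data rate from the command clock.
# GDDR6 delivers 8 transfers per pin per command-clock cycle.
MEMORY_CLOCK_MHZ = 2250

effective_gbps = MEMORY_CLOCK_MHZ * 8 / 1000
print(effective_gbps)  # 18.0 Gbps per pin, as listed above
```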

L40 Memory

VRAM capacity and bandwidth

VRAM (Video RAM) is dedicated memory for storing textures, frame buffers, and shader data. The L40's memory capacity determines how well it handles high-resolution textures and multiple displays. Memory bandwidth, measured in GB/s, affects how quickly data moves between the GPU and VRAM. Higher bandwidth improves performance in memory-intensive scenarios like 4K gaming. The memory bus width and type (GDDR6, GDDR6X, HBM) significantly influence overall GPU benchmark scores.

Memory Size
48 GB (49,152 MB)
Memory Type
GDDR6
Memory Bus
384-bit
Bandwidth
864.0 GB/s
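Peak bandwidth is simply the product of the bus width and the effective per-pin data rate. Using the figures above:

```python
# Peak memory bandwidth = bus width (in bytes) * per-pin data rate.
BUS_WIDTH_BITS = 384
EFFECTIVE_RATE_GBPS = 18  # per pin

bandwidth_gb_s = BUS_WIDTH_BITS / 8 * EFFECTIVE_RATE_GBPS
print(bandwidth_gb_s)  # 864.0 GB/s, matching the spec table
```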
💾

L40 Cache

On-chip cache hierarchy

On-chip cache provides ultra-fast data access for the L40, reducing the need to fetch data from slower VRAM. L1 and L2 caches store frequently accessed data close to the compute units. Ada Lovelace's greatly enlarged 96 MB L2 cache dramatically increases effective bandwidth, improving GPU benchmark performance without requiring a wider memory bus. Larger caches help maintain high frame rates in memory-bound scenarios and reduce power consumption by minimizing VRAM accesses.

L1 Cache
128 KB (per SM)
L2 Cache
96 MB
📈

L40 Theoretical Performance

Compute and fill rates

Theoretical performance metrics provide a baseline for comparing the NVIDIA L40 against other graphics cards. FP32 (single-precision) performance, measured in TFLOPS, indicates compute capability for gaming and general GPU workloads. FP64 (double-precision) matters for scientific computing. Pixel and texture fill rates determine how quickly the GPU can render complex scenes. While real-world GPU benchmark results depend on many factors, these specifications help predict relative performance levels.

FP32 (Float)
90.52 TFLOPS
FP64 (Double)
1,414.3 GFLOPS (1:64)
FP16 (Half)
90.52 TFLOPS (1:1)
Pixel Rate
478.1 GPixel/s
Texture Rate
1,414.3 GTexel/s
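All three rates can be reproduced from the unit counts and the boost clock; FP32 throughput assumes one fused multiply-add (two FLOPs) per core per cycle:

```python
# Reproduce the theoretical figures from unit counts and boost clock.
SHADERS, TMUS, ROPS = 18176, 568, 192
BOOST_GHZ = 2.49

fp32_tflops  = 2 * SHADERS * BOOST_GHZ / 1000  # FMA = 2 FLOPs per cycle
pixel_rate   = ROPS * BOOST_GHZ                # GPixel/s
texture_rate = TMUS * BOOST_GHZ                # GTexel/s
print(fp32_tflops, pixel_rate, texture_rate)   # ~90.52, ~478.1, ~1414.3
```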

L40 Ray Tracing & AI

Hardware acceleration features

The NVIDIA L40 includes dedicated hardware for ray tracing and AI acceleration. RT Cores handle real-time ray tracing calculations for realistic lighting, reflections, and shadows in supported applications. Tensor Cores accelerate AI workloads, including DLSS upscaling and ML inference. These features enable higher visual quality without proportional performance costs, letting the L40 deliver both high image quality and smooth frame rates in modern graphics workloads.

RT Cores
142
Tensor Cores
568
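Both counts scale directly with the SM count: each Ada Lovelace SM pairs one RT Core with four Tensor Cores. A quick check:

```python
# RT and Tensor Core counts follow from the SM count on Ada Lovelace:
# each SM carries 1 RT Core and 4 Tensor Cores.
SM_COUNT = 142

print(SM_COUNT)      # 142 RT Cores
print(SM_COUNT * 4)  # 568 Tensor Cores
```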
🏗️

Ada Lovelace Architecture & Process

Manufacturing and design details

The NVIDIA L40 is built on NVIDIA's Ada Lovelace architecture, which defines how the GPU processes graphics and compute workloads. The manufacturing process node affects power efficiency, thermal characteristics, and maximum clock speeds. Smaller process nodes pack more transistors into the same die area, enabling higher performance per watt. Understanding the architecture helps predict how the L40 will perform in GPU benchmarks compared to previous generations.

Architecture
Ada Lovelace
GPU Name
AD102
Process Node
5 nm
Foundry
TSMC
Transistors
76.3 billion
Die Size
609 mm²
Density
125.3M / mm²
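The density figure is the transistor count divided by the die area:

```python
# Transistor density = transistor count / die area.
TRANSISTORS_MILLIONS = 76_300
DIE_SIZE_MM2 = 609

density = TRANSISTORS_MILLIONS / DIE_SIZE_MM2
print(round(density, 1))  # 125.3 million transistors per mm^2
```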
🔌

L40 Power & Thermal

TDP and power requirements

Power specifications for the NVIDIA L40 determine PSU requirements and thermal management needs. TDP (Thermal Design Power) indicates the heat output under typical loads, guiding cooler selection. Power connector requirements ensure adequate power delivery for stable operation during demanding GPU benchmarks. The suggested PSU wattage accounts for the entire system, not just the graphics card. Efficient power delivery enables the L40 to maintain boost clocks without throttling.

TDP
300 W
Power Connectors
1x 16-pin
Suggested PSU
700 W
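One way to see where a 700 W suggestion comes from is to budget the GPU's TDP plus the rest of the system, then add transient headroom. The system and headroom figures below are illustrative assumptions, not part of the L40 spec:

```python
# Rough PSU sizing sketch. REST_OF_SYSTEM_W and HEADROOM are
# illustrative assumptions, not official NVIDIA figures.
GPU_TDP_W = 300
REST_OF_SYSTEM_W = 250  # assumed CPU, storage, fans, etc.
HEADROOM = 1.25         # assumed ~25% margin for load transients

suggested_psu_w = (GPU_TDP_W + REST_OF_SYSTEM_W) * HEADROOM
print(round(suggested_psu_w))  # ~688 W, in line with the 700 W suggestion
```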
📐

L40 Physical & Connectivity

Dimensions and outputs

Physical dimensions of the NVIDIA L40 are critical for case compatibility. Card length, height, and slot width determine whether it fits in your chassis. The PCIe interface version affects bandwidth for communication with the CPU. Display outputs define monitor connectivity options, with modern cards supporting multiple high-resolution displays simultaneously. Verify these specifications against your case and motherboard before purchasing to ensure a proper fit.

Slot Width
Dual-slot
Length
267 mm (10.5 in)
Height
111 mm (4.4 in)
Bus Interface
PCIe 4.0 x16
Display Outputs
4x DisplayPort 1.4a
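The PCIe 4.0 x16 link's peak host bandwidth follows from the per-lane rate and the 128b/130b encoding overhead:

```python
# Peak PCIe 4.0 x16 bandwidth per direction.
# PCIe 4.0 runs at 16 GT/s per lane with 128b/130b encoding.
GT_PER_S = 16
LANES = 16
ENCODING = 128 / 130

bandwidth_gb_s = GT_PER_S * LANES * ENCODING / 8
print(round(bandwidth_gb_s, 1))  # ~31.5 GB/s each way
```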
🎮

L40 API Support

Graphics and compute APIs

API support determines which games and applications can fully utilize the NVIDIA L40. DirectX 12 Ultimate enables advanced features like ray tracing and variable rate shading. Vulkan provides cross-platform graphics capabilities with low-level hardware access. OpenGL remains important for professional applications and older games. CUDA (NVIDIA) and OpenCL enable GPU compute for video editing, 3D rendering, and scientific applications. Higher API versions unlock newer graphical features in GPU benchmarks and games.

DirectX
12 Ultimate (12_2)
OpenGL
4.6
Vulkan
1.4
OpenCL
3.0
CUDA
8.9
Shader Model
6.8
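The CUDA entry above refers to compute capability 8.9, which software can query at runtime. A minimal sketch, assuming PyTorch with CUDA support is installed and an L40 is visible to the driver:

```python
# Query the GPU's CUDA compute capability at runtime.
# Assumes PyTorch with CUDA support and a visible NVIDIA GPU.
import torch

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(torch.cuda.get_device_name(0))           # e.g. "NVIDIA L40"
    print(f"Compute capability: {major}.{minor}")  # 8.9 on Ada Lovelace
```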
📦

L40 Product Information

Release and pricing details

The NVIDIA L40 is manufactured by NVIDIA as part of their graphics card lineup. Release date and launch pricing provide context for comparing GPU benchmark results with competing products from the same era. Understanding the product lifecycle helps evaluate whether the L40 by NVIDIA represents good value at current market prices. Predecessor and successor information aids in tracking generational improvements and planning future upgrades.

Manufacturer
NVIDIA
Release Date
Oct 2022
Production
End-of-life
Predecessor
Server Ampere
Successor
Server Hopper

L40 Benchmark Scores

Geekbench OpenCL

Geekbench OpenCL tests GPU compute performance using the cross-platform OpenCL API. It shows how the NVIDIA L40 handles parallel computing tasks like video encoding and scientific simulations. OpenCL is widely supported across GPU vendors and platforms.

Rank: #4 of 582
Score: 330,683
87% of chart maximum (380,114)

Geekbench Vulkan

Geekbench Vulkan tests GPU compute using the modern low-overhead Vulkan API. It shows how the NVIDIA L40 performs with next-generation graphics and compute workloads. Vulkan offers better CPU efficiency than older APIs like OpenGL, and modern games and applications increasingly use it for cross-platform GPU acceleration.

Rank: #8 of 386
Score: 232,627
61% of chart maximum (379,571)
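The percentage figures above are each score normalized against the highest score recorded in that chart:

```python
# Reproduce the percentage figures: score / chart maximum.
scores = {
    "Geekbench OpenCL": (330_683, 380_114),
    "Geekbench Vulkan": (232_627, 379_571),
}
for name, (score, chart_max) in scores.items():
    print(name, f"{score / chart_max:.0%}")  # 87% and 61%
```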

About NVIDIA L40

The NVIDIA L40 stands out with its massive 48 GB of GDDR6 VRAM, making it a strong choice for demanding workloads like AI training and 3D rendering. Built on the Ada Lovelace architecture using a 5 nm process, it runs a 735 MHz base clock that boosts up to 2,490 MHz under load. With a 300 W TDP and a PCIe 4.0 x16 interface, the L40 integrates into high-end systems without straining power delivery. Its benchmark results are strong: 330,683 points in Geekbench OpenCL and 232,627 in Geekbench Vulkan. Released on October 13, 2022, the card remains a top pick for professionals scaling up compute-intensive tasks.

On price-to-performance, the L40 punches above its weight, offering workstation-grade power well below flagship datacenter pricing. The 48 GB VRAM pool handles large datasets without spilling into system RAM, keeping workflows smooth. Ada Lovelace's efficiency also means lower electricity costs over long render jobs compared with older Ampere cards. Boost clocks hold at 2,490 MHz reliably, translating to real-world gains in ray tracing and ML inference. For the money, this is future-ready hardware that won't become obsolete overnight, and it redefines value in the datacenter GPU space.

In market positioning, the L40 slots between consumer RTX cards and enterprise H100s, making it ideal for SMBs and creators who need professional features without enterprise pricing. Future-proofing comes via PCIe 4.0 and the 5 nm node, ready for next-generation software stacks. System requirements are straightforward: a capable PSU for the 300 W TDP and a PCIe x16 slot, with no exotic cooling needed.

  • Price-to-performance ratio excels with 48 GB VRAM at mid-tier pricing.
  • Market positioning targets AI, VFX pros seeking H100 alternatives.
  • Future-proofing via Ada Lovelace ensures longevity through 2025+.
  • System requirements: 700 W+ PSU, PCIe 4.0 motherboard.

The AMD Equivalent of L40

Looking for a similar graphics card from AMD? The AMD Radeon RX 7900 XTX offers comparable performance and features in the AMD lineup.

AMD Radeon RX 7900 XTX

AMD • 24 GB VRAM

