Volta

NVIDIA V100

Proven Volta-generation GPU for inference and legacy AI workloads

VRAM

32 GB

Bandwidth

900 GB/s

FP16

125 TFLOPS

TDP

300W (SXM2) / 250W (PCIe)

NVIDIA V100

Technical Specifications

VRAM: 32 GB HBM2
Memory Bandwidth: 900 GB/s
FP16 Performance: 125 TFLOPS
FP32 Performance: 15.7 TFLOPS
INT8 Performance: 62 TOPS
TDP: 300W (SXM2) / 250W (PCIe)
Form Factor: SXM2 / PCIe Gen3
Interconnect: NVLink 2.0 (300 GB/s, SXM2 only)
PCIe Interface: PCIe Gen3 x16
Max GPUs per Server: 8 (DGX-1V / HGX V100) / 4 (PCIe)

Prices vary with supply and import costs. Contact us for current India pricing.

Best For

  • Cost-effective inference for models up to 13B parameters
  • Running legacy CUDA workloads and frameworks
  • Academic and research labs on a budget
  • Computer vision and NLP inference
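
The 13B figure follows from simple arithmetic: FP16 weights take two bytes per parameter, so a 13B model needs roughly 24 GiB of the card's 32 GB before KV cache and activations. A minimal back-of-the-envelope sketch:

```python
def weights_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GiB: parameters x bytes per parameter.

    bytes_per_param: 2 for FP16 (the V100's half-precision format), 4 for FP32.
    Ignores KV cache, activations, and framework overhead.
    """
    return n_params * bytes_per_param / 1024**3

# A 13B-parameter model in FP16 needs ~24 GiB of weights,
# leaving about 8 GB of the V100's 32 GB for KV cache and activations.
print(round(weights_gib(13e9), 1))
```

Larger models require quantization or multi-GPU sharding, at which point a newer card is usually the better fit.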

Not Ideal For

  • Training large LLMs (newer GPUs offer 5-8x more throughput)
  • Workloads needing BF16 support (the V100 supports FP16 but not BF16)
  • New deployments where power efficiency matters (older architecture)

Overview

The NVIDIA V100 was the first Tensor Core GPU and defined the modern era of deep learning acceleration. Built on the Volta architecture with 32 GB of HBM2 memory, it remains a capable GPU for inference and smaller-scale training workloads at a fraction of the cost of newer-generation hardware.

For inference, the V100 handles most production models including BERT, ResNet, Whisper, and LLaMA 7B with room to spare. Its 32 GB VRAM is sufficient for the majority of inference use cases. For organizations running proven models in production, the V100 offers excellent cost efficiency.
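
For autoregressive decoding, a rough roofline estimate gives a feel for what the 900 GB/s of bandwidth buys: each generated token must stream the full set of weights from HBM once, so single-stream decode speed is bounded by bandwidth divided by model size. A simplified sketch (ignores KV cache traffic, batching, and compute limits):

```python
def decode_tokens_per_sec(bandwidth_gb_s: float, n_params: float,
                          bytes_per_param: int = 2) -> float:
    """Upper bound on single-stream decode throughput.

    Every token generation reads all weights from HBM, so throughput
    is capped at memory bandwidth / weight bytes. Real throughput is
    lower (KV cache reads, kernel overhead); batching raises it.
    """
    weights_gb = n_params * bytes_per_param / 1e9
    return bandwidth_gb_s / weights_gb

# V100 (900 GB/s) with a 7B model in FP16: ~64 tokens/s upper bound.
print(round(decode_tokens_per_sec(900, 7e9), 1))
```

The same arithmetic explains why the V100 remains serviceable for 7B-13B inference while falling behind on larger models.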

The V100 is widely available in the refurbished market, making it an attractive option for academic institutions, startups, and research labs that need GPU compute without the premium of current-generation hardware. We supply certified refurbished V100 SXM2 and PCIe units with warranty.

Note: The V100 does not support BF16 data type. If your framework requires BF16, consider the A100 or newer. For FP16 and FP32 workloads, the V100 remains fully supported by all modern deep learning frameworks.
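
One common way to handle the missing BF16 support in application code is to pick the half-precision dtype at startup and fall back to FP16 on Volta-class hardware. A minimal sketch; the PyTorch probe named in the comment (`torch.cuda.is_bf16_supported()`) is the usual way to detect support at runtime:

```python
def choose_half_dtype(bf16_supported: bool) -> str:
    """Prefer BF16 where the hardware supports it (A100 and newer);
    fall back to FP16 on Volta-class GPUs such as the V100."""
    return "bfloat16" if bf16_supported else "float16"

# With PyTorch installed, the runtime probe would look like:
#   bf16 = torch.cuda.is_bf16_supported()
#   dtype = getattr(torch, choose_half_dtype(bf16))
print(choose_half_dtype(False))
```

Note that FP16 has a narrower exponent range than BF16, so loss-scaling (as in standard mixed-precision training) matters more on the V100.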

Get NVIDIA V100 pricing for your setup

Tell us your workload and cluster size. We'll quote the complete solution including servers, networking, and colocation.