Volta

NVIDIA V100

Proven Volta-generation GPU for inference and legacy AI workloads

VRAM

32 GB

Bandwidth

900 GB/s

FP16

125 TFLOPS

TDP

300W (SXM2) / 250W (PCIe)

NVIDIA V100

Technical Specifications

VRAM: 32 GB HBM2
Memory Bandwidth: 900 GB/s
FP16 Performance: 125 TFLOPS
FP32 Performance: 15.7 TFLOPS
INT8 Performance: 62 TOPS
TDP: 300W (SXM2) / 250W (PCIe)
Form Factor: SXM2 / PCIe Gen3
Interconnect: NVLink 2.0 (300 GB/s, SXM2 only)
PCIe Interface: PCIe Gen3 x16
Max GPUs per Server: 8 (DGX-1V / HGX V100) / 4 (PCIe)

Prices vary with supply and import costs. Contact us for current India pricing.

Best For

  • Cost-effective inference for models up to 13B parameters
  • Running legacy CUDA workloads and frameworks
  • Academic and research labs on a budget
  • Computer vision and NLP inference
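
The 13B figure follows from simple arithmetic: FP16 weights take two bytes per parameter, so a 13B model needs roughly 24 GiB of the card's 32 GB before KV cache and activations. A minimal back-of-the-envelope sketch:

```python
def weights_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GiB: parameters x bytes per parameter.

    bytes_per_param: 2 for FP16 (the V100's half-precision format), 4 for FP32.
    Ignores KV cache, activations, and framework overhead.
    """
    return n_params * bytes_per_param / 1024**3

# A 13B-parameter model in FP16 needs ~24 GiB of weights,
# leaving about 8 GB of the V100's 32 GB for KV cache and activations.
print(round(weights_gib(13e9), 1))
```

Larger models require quantization or multi-GPU sharding, at which point a newer card is usually the better fit.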

Not Ideal For

  • Training large LLMs (newer GPUs offer 5-8x more throughput)
  • Workloads needing BF16 support (the V100 supports FP16 but not BF16)
  • New deployments where power efficiency matters (older architecture)

Overview

The NVIDIA V100 was the first Tensor Core GPU and defined the modern era of deep learning acceleration. Built on the Volta architecture with 32 GB of HBM2 memory, it remains a capable GPU for inference and smaller-scale training workloads at a fraction of the cost of newer-generation hardware.

For inference, the V100 handles most production models including BERT, ResNet, Whisper, and LLaMA 7B with room to spare. Its 32 GB VRAM is sufficient for the majority of inference use cases. For organizations running proven models in production, the V100 offers excellent cost efficiency.
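
For autoregressive decoding, a rough roofline estimate gives a feel for what the 900 GB/s of bandwidth buys: each generated token must stream the full set of weights from HBM once, so single-stream decode speed is bounded by bandwidth divided by model size. A simplified sketch (ignores KV cache traffic, batching, and compute limits):

```python
def decode_tokens_per_sec(bandwidth_gb_s: float, n_params: float,
                          bytes_per_param: int = 2) -> float:
    """Upper bound on single-stream decode throughput.

    Every token generation reads all weights from HBM, so throughput
    is capped at memory bandwidth / weight bytes. Real throughput is
    lower (KV cache reads, kernel overhead); batching raises it.
    """
    weights_gb = n_params * bytes_per_param / 1e9
    return bandwidth_gb_s / weights_gb

# V100 (900 GB/s) with a 7B model in FP16: ~64 tokens/s upper bound.
print(round(decode_tokens_per_sec(900, 7e9), 1))
```

The same arithmetic explains why the V100 remains serviceable for 7B-13B inference while falling behind on larger models.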

The V100 is widely available in the refurbished market, making it an attractive option for academic institutions, startups, and research labs that need GPU compute without the premium of current-generation hardware. We supply certified refurbished V100 SXM2 and PCIe units with warranty.

Note: The V100 does not support BF16 data type. If your framework requires BF16, consider the A100 or newer. For FP16 and FP32 workloads, the V100 remains fully supported by all modern deep learning frameworks.
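
One common way to handle the missing BF16 support in application code is to pick the half-precision dtype at startup and fall back to FP16 on Volta-class hardware. A minimal sketch; the PyTorch probe named in the comment (`torch.cuda.is_bf16_supported()`) is the usual way to detect support at runtime:

```python
def choose_half_dtype(bf16_supported: bool) -> str:
    """Prefer BF16 where the hardware supports it (A100 and newer);
    fall back to FP16 on Volta-class GPUs such as the V100."""
    return "bfloat16" if bf16_supported else "float16"

# With PyTorch installed, the runtime probe would look like:
#   bf16 = torch.cuda.is_bf16_supported()
#   dtype = getattr(torch, choose_half_dtype(bf16))
print(choose_half_dtype(False))
```

Note that FP16 has a narrower exponent range than BF16, so loss-scaling (as in standard mixed-precision training) matters more on the V100.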

Get NVIDIA V100 pricing for your setup

Tell us your workload and cluster size. We'll quote the complete solution including servers, networking, and colocation.