Category: File Formats
Definition
GGUF (commonly expanded as GPT-Generated Unified Format) is a binary file format, introduced by the llama.cpp/ggml project, for storing large language models so they can run efficiently on consumer hardware with limited resources.
How It Works
GGUF stores model weights and metadata in a single file. The weights are usually quantized: supported levels range from roughly 2-bit to 16-bit precision, and lower bit widths trade accuracy for smaller file sizes.
The metadata includes model architecture information such as hyperparameters and tokenizer data, making the file self-contained and portable across inference engines that support GGUF.
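As a rough illustration of the single-file layout, the sketch below reads only the fixed-size header at the start of a GGUF file. It assumes the little-endian layout used by GGUF versions 2 and 3 (4-byte magic `GGUF`, a 32-bit version, then 64-bit tensor and metadata key-value counts); the function name `read_gguf_header` is hypothetical and not part of any library.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) for a GGUF file.

    Assumes the little-endian header layout of GGUF v2/v3.
    """
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != GGUF_MAGIC:
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        (version,) = struct.unpack("<I", f.read(4))
        if version < 2:
            # Version 1 used 32-bit counts; this sketch does not handle it.
            raise ValueError("GGUF v1 headers are not handled in this sketch")
        tensor_count, kv_count = struct.unpack("<QQ", f.read(16))
    return version, tensor_count, kv_count
```

The metadata key-value pairs and tensor descriptions follow this header; parsing them needs the full specification and is left out here.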
Why It Matters
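GGUF helps democratize AI by enabling capable models to run on personal computers without expensive GPUs. A 7-billion-parameter model quantized to 4 bits needs roughly 4GB for its weights and can run on a laptop with 8GB of RAM; a 70-billion-parameter model at the same precision needs roughly 35 to 40GB, which calls for a high-memory workstation.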
This format powers local AI applications and privacy-focused deployments where cloud services aren't viable.
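A rough way to judge whether a model will fit in memory is to multiply the parameter count by the bits stored per weight. The sketch below does only that arithmetic; the function name `estimate_weight_size_gb` is hypothetical, and the result covers weights alone. Real quantization schemes also store scaling factors, and inference needs extra memory for the context cache, so actual requirements are somewhat higher.

```python
def estimate_weight_size_gb(n_params, bits_per_weight):
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes).

    Weights only: ignores quantization scales, metadata, and runtime memory.
    """
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 1e9


# Nominal weight sizes for two model sizes at 4-bit and 16-bit precision.
for n_params in (7_000_000_000, 70_000_000_000):
    for bits in (4, 16):
        size = estimate_weight_size_gb(n_params, bits)
        print(f"{n_params / 1e9:.0f}B params at {bits}-bit: ~{size:.1f} GB")
```

By this estimate a 7-billion-parameter model at 4 bits comes to about 3.5GB of weights, while a 70-billion-parameter model at 4 bits comes to about 35GB.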