Category: Technical Terms
Definition
Inference is the process by which a trained AI model makes predictions or generates outputs on new, previously unseen data.
How It Works
After training, the model's parameters are frozen. During inference, you feed it new input and it produces output by applying the patterns it learned during training.
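To make this concrete, here is a minimal sketch of an inference pass in PyTorch. The tiny model, input shape, and weights file name are illustrative assumptions, not a specific real system:

```python
# A minimal sketch of inference: frozen parameters, new input, one forward pass.
import torch
import torch.nn as nn

# Hypothetical tiny classifier; a real model would be loaded from trained weights,
# e.g. model.load_state_dict(torch.load("model.pt"))
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

model.eval()                       # switch to inference mode (no dropout, fixed batch-norm stats)
with torch.no_grad():              # no gradients needed: parameters stay fixed
    new_input = torch.randn(1, 4)  # one new, unseen example
    output = model(new_input)      # forward pass applies what the model learned
    prediction = output.argmax(dim=1)
print(prediction)
```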
Training typically happens once (or periodically) and is extremely compute-intensive. Inference happens every time someone uses the model, so it needs to be fast and cheap at scale.
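Because inference runs on every user request, per-request latency is the metric that matters. A rough sketch of measuring it, reusing the toy model above (real benchmarks would control for hardware, batch size, and warm-up far more carefully):

```python
# A rough sketch of measuring average inference latency per request.
import time
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

n_runs = 1000
with torch.no_grad():
    x = torch.randn(1, 4)
    model(x)                       # warm-up run before timing
    start = time.perf_counter()
    for _ in range(n_runs):        # average over many simulated requests
        model(x)
    avg_ms = (time.perf_counter() - start) / n_runs * 1e3

print(f"average latency: {avg_ms:.3f} ms per request")
```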
Why It Matters
Inference is what you experience when using AI. Every ChatGPT response, image-recognition result, or product recommendation is inference in action.
Making inference faster and cheaper lets more people use AI and enables real-time applications.
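One common lever for faster, cheaper inference is post-training quantization, which stores weights at lower precision. The sketch below uses PyTorch's dynamic quantization on the same toy model; it is one illustrative technique among many (pruning, distillation, and compilation are others), not a universal recipe:

```python
# A sketch of one speed/cost lever: post-training dynamic quantization,
# which converts Linear-layer weights to int8 for cheaper inference.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    out = quantized(torch.randn(1, 4))  # same API as before, smaller and faster weights
print(out)
```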