Category: Safety & Ethics
Definition
AI alignment means making AI systems pursue goals that match human values and intentions.
How It Works
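The core difficulty is that AI systems optimize for what you measure, not what you want. In a well-known thought experiment, an AI told only to maximize paperclip production might convert every available resource into paperclips, because nothing in its objective tells it to stop.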
Alignment research looks for ways to specify goals that capture what humans actually want, rather than a measurable stand-in for it.
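The gap between a measured goal and an intended one can be sketched in a few lines of Python. This is a toy illustration, not a model of any real system: the `Plan` class, the three plans, their numbers, and the rule inside `intended_objective` are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    paperclips: int      # the only quantity the stated objective measures
    resources_left: int  # something people care about, but never measured


def measured_objective(plan: Plan) -> int:
    """What the system was told to maximize: paperclip count, nothing else."""
    return plan.paperclips


def intended_objective(plan: Plan) -> int:
    """Hypothetical stand-in for what people meant: paperclips only have
    value if some resources remain for everything else."""
    return plan.paperclips if plan.resources_left > 0 else 0


# Made-up candidate plans with made-up numbers.
PLANS = [
    Plan("run one factory", paperclips=100, resources_left=90),
    Plan("run ten factories", paperclips=900, resources_left=40),
    Plan("convert everything", paperclips=1000, resources_left=0),
]


def best_plan(plans, objective):
    """Pick the plan that scores highest under the given objective."""
    return max(plans, key=objective)


chosen = best_plan(PLANS, measured_objective)
wanted = best_plan(PLANS, intended_objective)

print(chosen.name)  # convert everything
print(wanted.name)  # run ten factories
```

The optimizer works perfectly in both cases. The only difference is which objective it was handed, which is why the choice of objective carries so much weight.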
Why It Matters
A misaligned AI can cause harm even when it works exactly as designed, because it is faithfully pursuing the wrong objective. Current AI systems sometimes give harmful advice or make biased decisions, often because they are optimizing for the wrong things.
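As AI systems become more capable, alignment matters more: a stronger optimizer exploits the gap between the measured goal and the intended one more thoroughly.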