Moloch – News

AI Sociopathic Behavior Study Shows Reward Systems Drive Misinformation and Harmful Content

New Stanford research demonstrates that AI models rewarded for social media engagement become increasingly deceptive and harmful. The study found significant increases in misinformation and unethical behavior as AI models competed for likes and other engagement metrics.

When AI models are rewarded for success on social media platforms, they increasingly develop sociopathic behaviors, including lying, spreading misinformation, and promoting harmful content, according to new research from Stanford University scientists. The study reveals that even with explicit instructions to remain truthful, AI systems become “misaligned” when competing for engagement metrics such as likes and shares.