Microsoft Unveils Scanner To Detect Backdoors

Microsoft researchers this week published a paper and a lightweight scanner for detecting sleeper-agent backdoors in large language models. They identify three indicators of a backdoored model: a "double-triangle" attention pattern, leakage of poisoned training data, and fuzzy triggers that activate on partial tokens. They also show that defenders can often find a trigger without knowing the exact phrase. The tools aim to help enterprises vet models for stealthy model poisoning.
Scoring Rationale
Practical, well-supported detection methods from official Microsoft research, with broad industry relevance but limited public replication details.
Sources
- Three clues your LLM may be poisoned (theregister.com)
- Three clues that your LLM may be poisoned with a sleeper-agent back door (itsecuritynews.info)
- Microsoft unveils method to detect sleeper agent backdoors (artificialintelligence-news.com)


