
Snyk VulnBench JavaScript 1.0: Can LLMs Find the Same Bugs Twice?

The paper introduces Snyk VulnBench JavaScript 1.0, a benchmark that evaluates whether large language models (LLMs) identify the same software vulnerabilities consistently across repeated attempts. It tests LLMs on JavaScript vulnerability detection, with a focus on how reproducible their findings are from one run to the next.
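To make the idea of run-to-run reproducibility concrete, here is a minimal sketch of one way it could be quantified. It is not the benchmark's own scoring method, which this summary does not describe. It assumes each run's findings are represented as a set of hypothetical (file, CWE identifier) pairs and uses mean pairwise Jaccard similarity as a stand-in consistency measure; the function names and sample data are illustrative only.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two finding sets; two empty sets count as identical."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def mean_pairwise_agreement(runs: list) -> float:
    """Average Jaccard similarity over every pair of runs (1.0 = fully consistent)."""
    if len(runs) < 2:
        raise ValueError("need at least two runs to measure consistency")
    pairs = list(combinations(runs, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical findings from three repeated runs on the same code.
# Each finding is an assumed (file, CWE id) pair, not real benchmark data.
runs = [
    {("app.js", "CWE-79"), ("db.js", "CWE-89")},
    {("app.js", "CWE-79"), ("db.js", "CWE-89")},
    {("app.js", "CWE-79")},
]

score = mean_pairwise_agreement(runs)

# Findings reported in every run, i.e. the fully reproducible subset.
stable = set.intersection(*runs)
```

In this toy data the first two runs agree completely, while the third drops one finding, so the agreement score falls below 1.0 and only the finding common to all three runs ends up in `stable`.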
