Show HN: CLI tool for detecting non-exact code duplication with embedding models
A new CLI tool called Slopo uses embedding models to detect non-exact code duplication, helping developers find similar code blocks that are not identical copies.
Background
- Slopo is a command-line (CLI) tool that uses embedding models, neural networks that map code snippets to numeric vectors, to find code that is semantically similar but not textually identical (non-exact duplication). This goes beyond plain text-matching tools like `diff` or `grep`, which compare literal characters and so mostly surface exact or near-verbatim copies.
- The approach resembles the code-similarity detection used in plagiarism checkers and large-scale refactoring: embeddings aim to capture the *meaning* of code (e.g., two different implementations of a sorting algorithm) rather than just its literal characters, so blocks with similar behavior end up with vectors that lie close together.
- The tool is authored by rafal-qa and was posted as a "Show HN" on Hacker News, the label used for sharing something you have built so that others can try it and give feedback.
- Why it matters: in large codebases, non-exact duplication (e.g., slightly modified copied blocks) leads to maintenance burden, bugs, and inconsistent fixes. AI-powered detection helps developers find and consolidate such code.
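
To make the idea concrete, here is a minimal sketch of embedding-based near-duplicate detection. It is not Slopo's implementation, which the post does not describe. The `toy_embed` function is a hypothetical stand-in that counts tokens instead of calling a real embedding model, and the names `find_similar_pairs` and the `0.8` threshold are assumptions chosen for illustration. Only the overall shape (vectorize each block, compare pairs by cosine similarity, report pairs above a threshold) reflects the general technique.

```python
import math
import re
from collections import Counter
from itertools import combinations


def toy_embed(code: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a bag-of-tokens vector.

    A real tool would call a neural embedding model here instead.
    """
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code)
    return Counter(tokens)


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar_pairs(
    blocks: dict[str, str], threshold: float = 0.8
) -> list[tuple[str, str, float]]:
    """Return pairs of non-identical blocks whose similarity meets the threshold."""
    vectors = {name: toy_embed(src) for name, src in blocks.items()}
    pairs = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(vectors.items(), 2):
        if blocks[name_a] == blocks[name_b]:
            continue  # exact copies are the easy case; skip them here
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    # Most similar pairs first
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Illustrative input: two near-duplicate functions and one unrelated function.
blocks = {
    "total_price": (
        "def total_price(items):\n"
        "    total = 0\n"
        "    for item in items:\n"
        "        total += item.price\n"
        "    return total\n"
    ),
    "total_cost": (
        "def total_cost(items):\n"
        "    total = 0\n"
        "    for item in items:\n"
        "        total += item.cost\n"
        "    return total\n"
    ),
    "read_config": (
        "def read_config(path):\n"
        "    with open(path) as f:\n"
        "        return json.load(f)\n"
    ),
}

for name_a, name_b, score in find_similar_pairs(blocks):
    print(f"{name_a} ~ {name_b}: {score:.2f}")
# prints: total_price ~ total_cost: 0.94
```

A token-count vector only rewards shared vocabulary, so it would miss two implementations that behave alike but use different names and structure. That gap is what a learned embedding model is meant to close, at the cost of running a model over every block.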