Show HN: We replaced 5 ML models with 1 shared encoder on an $11/month VPS

A developer consolidated five separate MiniLM models into a single shared encoder with five lightweight task heads, cutting memory usage from 455 MB to 25 MB while improving matching scores. Making the multi-task setup work required adding a contrastive objective to keep embedding quality from collapsing. The system now runs on an $11/month VPS with zero API costs and shorter processing times.
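The shared-encoder layout described above can be sketched roughly as follows. This is a toy illustration in plain Python, not the author's code. The embedding size, the task names, the single linear layer standing in for MiniLM, the InfoNCE form of the contrastive term, and the 0.5 loss weight are all assumptions made for the example. The post only says that a contrastive objective was added.

```python
import math
import random

random.seed(0)

EMBED_DIM = 8  # hypothetical size; a real MiniLM embedding is much larger
TASKS = ["task_a", "task_b", "task_c", "task_d", "task_e"]  # placeholder task names


def make_matrix(rows, cols):
    """Small random weight matrix (stand-in for trained weights)."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def l2_normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class SharedEncoderModel:
    """One encoder shared by every task, plus one small linear head per task."""

    def __init__(self, input_dim):
        # Shared weights: stored once, however many tasks there are.
        self.encoder = make_matrix(EMBED_DIM, input_dim)
        # Per-task weights: one tiny row vector per head.
        self.heads = {task: make_matrix(1, EMBED_DIM) for task in TASKS}

    def embed(self, features):
        return l2_normalize(matvec(self.encoder, features))

    def predict_all(self, features):
        emb = self.embed(features)  # the encoder runs once per input
        return {task: matvec(head, emb)[0] for task, head in self.heads.items()}


def info_nce(anchors, positives, temperature=0.1):
    """Contrastive loss (assumed InfoNCE form): anchor i should score highest
    against positive i, with the other positives in the batch as negatives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [sum(a * p for a, p in zip(anchor, pos)) / temperature
                  for pos in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


def multitask_loss(task_losses, contrastive, contrastive_weight=0.5):
    """Sum of the per-head losses plus a weighted contrastive term."""
    return sum(task_losses.values()) + contrastive_weight * contrastive
```

Calling `SharedEncoderModel(input_dim=4).predict_all([1.0, 0.0, 0.5, -1.0])` returns one score per head from a single encoder pass, which is where the memory and latency savings would come from. The contrastive term is computed on the shared embeddings themselves rather than on any head's output, so it pushes against the encoder drifting toward whatever the heads alone happen to need.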