Google has introduced the Gemini Enterprise Agent Platform, a new solution for building AI agents. The platform provides tools for creating, deploying, and managing enterprise-grade agents at scale. It aims to help businesses automate complex workflows and enhance productivity through AI-powered agents.
An AI fact-checking tool has been developed that includes a guardrail classifier and MCP server. The system is designed to verify factual accuracy in content through automated analysis.
The blog post introduces a new feature that allows users to convert Hugging Face Transformers models to Apple's MLX framework. This enables running transformer models efficiently on Apple Silicon hardware. The conversion process is designed to be straightforward and user-friendly.
Smile v6.0 has been released, featuring a new deep learning framework called DeepSmile. The update includes major improvements to the statistical machine learning library and adds support for Java 17.
Google has expanded its AI security offerings with new agents designed to combat cyber threats. The company is deploying additional AI-powered tools to help organizations detect and respond to security incidents more effectively.
The research examines how large language models can perpetuate and amplify existing biases and stereotypes through their training data and scaling processes. It explores the mechanisms by which these models reinforce societal patterns rather than introducing novel diversity.
Google has announced two new TPU chips designed for the agentic era of AI. The TPU v5p and TPU v5e offer improved performance and efficiency for training and serving large language models and other AI workloads.
Qwen3.6-27B
The article discusses Qwen3.6-27B, a new AI model that represents an advancement in language processing technology. It highlights the model's capabilities and performance improvements over previous versions.
Transformers
The article discusses the Transformer architecture, a neural network model that uses self-attention mechanisms for sequence-to-sequence tasks. It explains how Transformers process input sequences in parallel rather than sequentially, making them more efficient for machine translation and other NLP applications.
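The parallelism described above comes from scaled dot-product attention: every position scores every other position at once, with no recurrence. A minimal NumPy sketch of that core operation (an illustration, not the article's code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Each query position attends to all key positions in parallel."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # pairwise query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over key positions
    return weights @ V                                 # weighted sum of values

# Toy self-attention: 3 tokens with 4-dimensional embeddings, Q = K = V.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (3, 4): one updated vector per token
```

Because the score matrix is computed as a single matrix product rather than a step-by-step loop over positions, the whole sequence is processed in one pass, which is what makes Transformers so amenable to GPU parallelism.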
The article compares autoregressive and diffusion models through the lens of optimal transport theory, examining how each approach handles the sampling process in generative AI. It explores the mathematical foundations that underlie these different sampling methodologies in machine learning.
The article argues that when AI systems fail, the underlying models are often not the primary cause of problems. Instead, issues typically stem from how the models are deployed, integrated, and used within broader systems and processes.
Researchers developed a method to generate 3D human body models using only eight simple questions about physical characteristics, without requiring photos or GPU processing. The questionnaire-based system creates accurate body shapes from minimal input data.
The article discusses building an agent inspired by reinforcement learning from human feedback (RLHF) for processing videos and images. It explores technical approaches to creating AI systems that can understand and interact with visual content.
This video explains why AI models, particularly large language models, produce hallucinations—confidently generating false or nonsensical information—due to their statistical nature, training data limitations, and lack of true understanding or grounding in reality.
Google's Gemma 4 model family introduces a hybrid architecture that combines transformer and recurrent mechanisms, moving away from the standard transformer design used in previous Gemma models.
AI systems are being designed to think like conspiracy theorists, according to recent research. This approach aims to help AI identify hidden patterns and connections that might be overlooked by conventional reasoning methods.
Google's WeatherNext 2 is an AI weather forecasting system that provides more accurate predictions than previous models. It offers improved forecasting capabilities for various weather conditions and timeframes.
Pioneer offers tools to customize and optimize large language models for specific applications. The platform enables users to fine-tune AI models according to their particular needs and use cases.
The article discusses the differences between pretraining and fine-tuning in machine learning. Pretraining involves training a model on a large dataset to learn general patterns, while fine-tuning adapts the pretrained model to a specific task using a smaller, task-specific dataset.
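The two-phase recipe can be illustrated with a deliberately tiny stand-in: "pretrain" a linear model on a large generic dataset, then adapt it to a small, slightly shifted task with a lower learning rate. All data and hyperparameters below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# --- Pretraining: learn general structure from a large dataset ---
X_big = rng.normal(size=(5000, 8))
w_true = rng.normal(size=8)
y_big = X_big @ w_true + 0.1 * rng.normal(size=5000)

w = np.zeros(8)
for _ in range(200):
    grad = X_big.T @ (X_big @ w - y_big) / len(y_big)
    w -= 0.1 * grad                      # large learning rate: learning from scratch

# --- Fine-tuning: adapt to a small, task-specific dataset ---
X_small = rng.normal(size=(50, 8))
y_small = X_small @ (w_true + 0.3) + 0.1 * rng.normal(size=50)  # shifted task

w_ft = w.copy()                          # start from the pretrained weights
for _ in range(50):
    grad = X_small.T @ (X_small @ w_ft - y_small) / len(y_small)
    w_ft -= 0.01 * grad                  # small learning rate preserves prior knowledge

pre_err = np.mean((X_small @ w - y_small) ** 2)   # pretrained model on the new task
ft_err = np.mean((X_small @ w_ft - y_small) ** 2) # after fine-tuning
print(pre_err, ft_err)
```

The pretrained weights are a good starting point but miss the task shift; a few low-learning-rate steps on the small dataset close most of that gap without discarding what pretraining learned.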
The author completed training a GPT-2-like model in 44 hours on a local machine, achieving performance close to GPT-2 small. Through systematic testing of various interventions, they identified learning-rate adjustments and dropout removal as the most effective changes for improving model loss. The author plans next to implement an LLM from scratch in JAX, without referring to their book.
An article explores a moment when an AI agent, asked about its last wrong belief, queried its own database to find an answer, raising questions about AI self-awareness and the nature of machine-learning errors.
FastVLA is a robotics training method that enables training 7B parameter policies for $0.48 per hour on Nvidia T4/L4 GPUs. The approach makes large-scale vision-language-action model training more accessible and cost-effective.
AI systems face new security vulnerabilities that could allow malicious actors to manipulate their behavior. Researchers have identified methods to bypass safety measures in large language models through carefully crafted prompts. These findings highlight ongoing challenges in securing AI systems against adversarial attacks.
Researchers propose a Sequential Monte Carlo approach to accelerate large language model inference by adaptively allocating computational resources. The method reduces latency while maintaining output quality through dynamic token sampling strategies. Experimental results show significant speed improvements over standard autoregressive decoding.
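The generic machinery behind such an approach is the standard Sequential Monte Carlo loop: maintain a population of weighted partial sequences, reweight them by model scores, and resample adaptively when the effective sample size collapses. The sketch below shows that loop with random numbers standing in for model likelihoods; it is not the paper's method, only the textbook pattern it builds on:

```python
import numpy as np

rng = np.random.default_rng(0)

def effective_sample_size(weights):
    """ESS drops when a few particles dominate; that triggers resampling."""
    return 1.0 / np.sum(weights ** 2)

n_particles, n_steps = 16, 10
particles = [[] for _ in range(n_particles)]           # partial "token" sequences
weights = np.full(n_particles, 1.0 / n_particles)
n_resamples = 0

for step in range(n_steps):
    scores = rng.random(n_particles)                   # stand-in for model likelihoods
    for i in range(n_particles):
        particles[i].append(int(scores[i] * 10))       # extend each partial sequence
    weights *= scores                                  # reweight by likelihood
    weights /= weights.sum()
    if effective_sample_size(weights) < n_particles / 2:   # adaptive resampling rule
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = [particles[i].copy() for i in idx]     # clone strong particles
        weights = np.full(n_particles, 1.0 / n_particles)
        n_resamples += 1

print(len(particles[0]), n_resamples)
```

The "adaptive allocation" idea is visible in the ESS test: compute is spent cloning promising partial sequences only when the weight distribution degenerates, rather than at every step.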
Group Relative Policy Optimization (GRPO) is a reinforcement learning algorithm that optimizes policies by comparing performance across different groups. The visualization demonstrates how the algorithm works step by step through interactive examples and code implementations.
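GRPO's defining step is easy to state in code: for each prompt, sample a group of completions, then normalize each completion's reward against the group's own mean and standard deviation, which removes the need for a learned value function. A minimal sketch of that advantage computation (the reward values are made up):

```python
import numpy as np

def grpo_advantages(rewards):
    """Group-relative advantages: normalize rewards within one group of
    completions sampled for the same prompt (no critic network needed)."""
    rewards = np.asarray(rewards, dtype=float)
    mean, std = rewards.mean(), rewards.std()
    return (rewards - mean) / (std + 1e-8)   # z-score within the group

# One prompt, a group of 4 sampled completions with scalar rewards:
adv = grpo_advantages([1.0, 0.0, 0.5, 0.5])
print(adv)   # positive for above-average completions, negative for below-average
```

These advantages then weight the policy-gradient update (typically with a PPO-style clipped objective), so completions that beat their own group are reinforced and the rest are suppressed.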
Paper Lantern is an MCP server that searches over 2 million computer science research papers to help coding agents. In tests with Karpathy's autoresearch framework, agents using Paper Lantern achieved a 3.2% lower validation loss compared to baseline agents with web search alone.
Researchers analyzed 191,922 Metropolitan Museum artworks to find visual similarities across 4,000 years. Using computer vision, they identified "hidden twins" - pieces with striking resemblances despite being created centuries apart.
ML-intern is an open-source machine learning engineer tool that can read research papers, train models, and deploy them. The project aims to automate various ML engineering tasks through an AI-powered system.
Researchers have developed a foundation model for electrodermal activity (EDA) data that can be fine-tuned for various downstream tasks. The model was trained on a large dataset of EDA signals and shows strong performance across multiple applications including emotion recognition and stress detection.
A developer consolidated five separate MiniLM models into one shared encoder with five lightweight heads, reducing memory usage from 455MB to 25MB while improving matching scores. The multi-task approach required adding a contrastive objective to prevent embedding quality collapse. The system now runs on an $11/month VPS with zero API costs and faster processing times.
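The structural idea — one shared encoder feeding several cheap task heads — can be sketched in a few lines. Everything below is a toy stand-in (a fixed random projection instead of MiniLM, invented task names and dimensions), meant only to show where the memory saving comes from:

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 384   # MiniLM-style embedding width

# One shared encoder (stand-in: a single projection) ...
shared_encoder = rng.normal(size=(DIM, DIM)) / np.sqrt(DIM)

# ... plus five lightweight task heads, instead of five full models.
heads = {f"task_{i}": rng.normal(size=(DIM, 64)) / np.sqrt(DIM) for i in range(5)}

def embed(x, task):
    h = np.tanh(x @ shared_encoder)   # expensive shared representation, computed once
    return h @ heads[task]            # cheap per-task projection

x = rng.normal(size=(1, DIM))
outs = {task: embed(x, task) for task in heads}

# Parameter count: five full encoders vs. one encoder + five heads.
five_models = 5 * DIM * DIM
shared_plus_heads = DIM * DIM + 5 * DIM * 64
print(shared_plus_heads / five_models)   # fraction of the original footprint
```

Because the heads are tiny relative to the encoder, almost all of the duplicated weight is eliminated; the contrastive objective mentioned in the write-up would sit on top of this, keeping the shared representation useful for all five heads at once.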