Pure-Python symbolic regression that rediscovered Kepler's law from 8 data points
GP-Elite is a pure-Python symbolic regression tool that rediscovered Kepler's third law using only eight data points. The project demonstrates how genetic programming can discover physical laws from minimal data without external dependencies beyond NumPy.
Background
- Symbolic regression is a machine-learning technique that searches for a compact, human-readable formula to fit data (e.g., y = a·x² + b), unlike neural networks, which produce black-box models.
- Kepler's third law (P² ∝ a³): the square of a planet's orbital period P is proportional to the cube of its orbit's semi-major axis a (roughly, its average distance from the Sun). With P in years and a in astronomical units, the proportionality constant is 1, so P² = a³. Discovered in the early 1600s from observational data, it paved the way for Newton's law of gravitation.
- GP-Elite is a pure-Python symbolic regression library built on genetic programming: it evolves formulas by randomly combining operators (+, −, ×, ÷, powers), selecting the best fits, and mutating and recombining them over generations.
- In its demo, given just 8 data points (planetary distances and orbital periods), it rediscovered Kepler's third law without any hints about the form of the relationship.
- "Pure-Python" means the library is written in plain Python and avoids heavy frameworks such as PyTorch, making it easy to run and study.
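
The evolve, select, mutate, and recombine loop described above can be sketched in a few dozen lines of standard-library Python. This is only an illustrative sketch, not GP-Elite's actual code or API: every name here (`DATA`, `OPS`, `CONSTS`, `MAX_NODES`, `error`, `evolve`, and so on) is hypothetical, the eight data points are approximate textbook values for the planets (assumed to resemble the demo's inputs), and the constant pool deliberately includes 1.5, which makes the target easier to hit than in a truly hint-free search.

```python
import math
import random

# Approximate semi-major axis (AU) and orbital period (years), Mercury..Neptune.
DATA = [(0.387, 0.241), (0.723, 0.615), (1.000, 1.000), (1.524, 1.881),
        (5.203, 11.862), (9.537, 29.457), (19.191, 84.011), (30.069, 164.79)]

OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,        # division by zero is caught in error()
    "^": lambda a, b: abs(a) ** b,  # abs() keeps fractional powers real
}
CONSTS = [0.5, 1.0, 1.5, 2.0, 3.0]  # assumption: a small fixed pool of constants
MAX_NODES = 31                      # assumption: size cap to stop formulas bloating

def evaluate(tree, x):
    """A tree is "x", a float constant, or a tuple (op, left, right)."""
    if tree == "x":
        return x
    if isinstance(tree, float):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def error(tree):
    """Mean squared relative error over DATA; invalid formulas score infinity."""
    try:
        total = sum(((evaluate(tree, a) - p) / p) ** 2 for a, p in DATA)
    except (OverflowError, ZeroDivisionError, ValueError):
        return math.inf
    return total / len(DATA) if math.isfinite(total) else math.inf

def random_tree(rng, depth):
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else rng.choice(CONSTS)
    op = rng.choice(list(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def subtrees(tree):
    yield tree
    if isinstance(tree, tuple):
        yield from subtrees(tree[1])
        yield from subtrees(tree[2])

def size(tree):
    return sum(1 for _ in subtrees(tree))

def replace_random(rng, tree, new):
    """Return a copy of `tree` with one randomly chosen subtree swapped for `new`."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return new
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, replace_random(rng, left, new), right)
    return (op, left, replace_random(rng, right, new))

def evolve(seed=0, pop_size=200, generations=40, elite=20):
    rng = random.Random(seed)
    pop = [random_tree(rng, 3) for _ in range(pop_size)]
    history = []                    # best error at the start of each generation
    for _ in range(generations):
        pop.sort(key=error)
        history.append(error(pop[0]))
        parents = pop[:elite]       # truncation selection; elites survive unchanged
        children = []
        while len(children) < pop_size - elite:
            a, b = rng.choice(parents), rng.choice(parents)
            # Crossover: graft a random subtree of b into a.
            child = replace_random(rng, a, rng.choice(list(subtrees(b))))
            if rng.random() < 0.5:  # Mutation: graft in a fresh random subtree.
                child = replace_random(rng, child, random_tree(rng, 2))
            children.append(child if size(child) <= MAX_NODES else a)
        pop = parents + children
    best = min(pop, key=error)
    return best, error(best), history

if __name__ == "__main__":
    best, err, _ = evolve()
    print("best formula:", best, "error:", err)
```

In this encoding, Kepler's law in years and astronomical units is the tree `("^", "x", 1.5)`, that is, P = a^1.5, and `error(("^", "x", 1.5))` is close to zero on the data above. Because the search is random, a given seed is not guaranteed to land on exactly that tree; the only guarantee in this sketch is that the best error never gets worse from one generation to the next, since the elite formulas are carried over unchanged.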