High-Performance Symbolic Regression in Python and Julia
greater operator in sympy mapping by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/590
Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.18.1...v0.18.2
Full Changelog: https://github.com/MilesCranmer/SymbolicRegression.jl/compare/v0.24.1...v0.24.2
Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.18.0...v0.18.1
Filtered to include only the changes relevant to the Python frontend. Also note that not all backend features, such as graph-based expressions and program synthesis, are supported yet, so I don't mention those changes here.
(BREAKING) The swap_operands mutation contributed by @foxtran now has a default weight of 0.1 rather than 0.0.
(BREAKING) Many fields of the Dataset struct have been declared immutable, as a safety precaution.
LoopVectorization.jl has been moved to a package extension. PySR will install it automatically at first use of turbo=True rather than by default, which means faster install and startup times.
turbo=True will have no effect on that version (due to internal changes in Julia), which is why I have instead done the following:
Bumper.jl support added. Passing bumper=True to PySRRegressor() will result in faster performance.
Various fixes to distributed compute; confirmed Slurm support again!
The new keyword-based constructors are now preferred for nodes:
Node{T}(feature=...) # leaf referencing a particular feature column
Node{T}(val=...) # constant value leaf
Node{T}(op=1, l=x1) # unary operator node, using the 1st unary operator
Node{T}(op=1, l=x1, r=1.5) # binary operator node, using the 1st binary operator
rather than the previous constructors Node(op, l, r) and Node(T; val=...). The old constructors still work, but emit a deprecation warning. If you constructed nodes manually, note the new syntax.
Formatting overhaul of backend (https://github.com/MilesCranmer/SymbolicRegression.jl/pull/278)
Upgraded Optim to 1.9
Upgraded DynamicQuantities to 0.13
Upgraded DynamicExpressions to 0.16
The main search loop in the backend has been greatly refactored for readability and improved type inference. It now looks like this (down from a monolithic ~1000-line function):
function _equation_search(
    datasets::Vector{D}, ropt::RuntimeOptions, options::Options, saved_state
) where {D<:Dataset}
    _validate_options(datasets, ropt, options)
    state = _create_workers(datasets, ropt, options)
    _initialize_search!(state, datasets, ropt, options, saved_state)
    _warmup_search!(state, datasets, ropt, options)
    _main_search_loop!(state, datasets, ropt, options)
    _tear_down!(state, ropt, options)
    return _format_output(state, ropt)
end
Backend changes: https://github.com/MilesCranmer/SymbolicRegression.jl/compare/v0.23.1...v0.24.1
Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.17.4...v0.18.0
Small patch to Julia version to avoid buggy libgomp in 1.10.1 and 1.10.2.
Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.17.3...v0.17.4
seval to support multiple expressions
Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.17.2...v0.17.3
Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.17.1...v0.17.2
Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.17.0...v0.17.1
eval -> seval
Vector when calling SymbolicRegression.jl functions (otherwise would get passed as PyList{Any}; see https://github.com/JuliaPy/PythonCall.jl/issues/441)
equation_search code with jl.PythonCall.GC.disable() to avoid multithreading-related segfaults (https://github.com/JuliaPy/PythonCall.jl/issues/298)
np.str_ to str before passing to variable_names, otherwise it becomes a PyArray and not a String (might be worth adding a workaround, it seems like PyJulia does this automatically)
pysr (via JuliaCall)
python -m pysr install. The install process is done by JuliaCall at import time.
pysr.install() and python -m pysr install because JuliaCall now handles this.
python -m pysr install will not give a warning and do nothing.
julia_project argument (ignored; no effect). The user now needs to set this up by customizing juliapkg.json. See updated documentation for instructions.
python -m pysr.test [test] to python -m pysr test [test].
pyproject.toml for building rather than setup.py. However, setup.py install should still work.
PySRRegressor, I am now storing a serialized version of them. This means you can now pickle the search state and warm-start the search from a file, in another Python process!
self.raw_julia_state_ will deserialize it automatically for you
from pysr import SymbolicRegression as SR
x1 = SR.Node(feature=1) # Create expressions manually
<model>.julia_options_ (generated from a serialized format for pickle safety) so that the user can call a variety of functions in SymbolicRegression.jl directly.
ncyclesperiteration => ncycles_per_iteration
loss => elementwise_loss
full_objective => loss_function
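For code that still passes the old keyword names, the renames listed above can be applied mechanically. The following is a minimal sketch, not part of PySR: migrate_kwargs and RENAMED_KWARGS are hypothetical names introduced here for illustration, and the example values are placeholders.

```python
import warnings

# Old keyword -> new keyword, copied from the rename list above.
RENAMED_KWARGS = {
    "ncyclesperiteration": "ncycles_per_iteration",
    "loss": "elementwise_loss",
    "full_objective": "loss_function",
}


def migrate_kwargs(kwargs):
    """Return a copy of kwargs with old parameter names replaced by new ones.

    Hypothetical helper for illustration only; it is not provided by PySR.
    """
    migrated = {}
    for name, value in kwargs.items():
        new_name = RENAMED_KWARGS.get(name, name)
        if new_name != name:
            warnings.warn(
                f"{name!r} was renamed to {new_name!r}", FutureWarning, stacklevel=2
            )
        if new_name in migrated:
            # Both the old and the new spelling were supplied.
            raise TypeError(f"{new_name!r} was given more than once")
        migrated[new_name] = value
    return migrated


# Placeholder settings written with the old names.
old_style = {"ncyclesperiteration": 500, "loss": "L2DistLoss()", "niterations": 40}
with warnings.catch_warnings():
    warnings.simplefilter("ignore", FutureWarning)
    new_style = migrate_kwargs(old_style)
print(new_style)
```

The resulting dictionary could then be unpacked into the regressor constructor, assuming the remaining keys are valid for the installed version.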
juliacall.ipython extension at import time
Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.16.9...v0.17.0
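The note above about np.str_ and variable_names comes down to coercing each name to an exact built-in str before it crosses into Julia. Below is a minimal sketch under stated assumptions: as_plain_strings is a hypothetical helper, and FakeNumpyStr is a local stand-in used only to keep the example dependency-free (np.str_ is likewise a subclass of str, so the same coercion applies to it).

```python
class FakeNumpyStr(str):
    """Stand-in for np.str_ (an assumption made to avoid importing NumPy)."""


def as_plain_strings(names):
    """Convert every entry to an exact built-in str.

    Hypothetical helper for illustration only; it is not provided by PySR.
    """
    return [str(name) for name in names]


# Column names as they might arrive from an array of strings.
columns = [FakeNumpyStr("x0"), FakeNumpyStr("x1")]
variable_names = as_plain_strings(columns)
print([type(name).__name__ for name in variable_names])
```

With real data, the same call could be applied to the column labels before they are passed as variable_names.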
Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.16.8...v0.16.9
typing_extensions for compatibility with Python 3.7 by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/497
Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.16.7...v0.16.8