pypomp.core.pomp.Pomp.train

Pomp.train(J: int, M: int, eta: dict[str, float], key: Array | None = None, theta: Mapping[str, int | float | number | Array] | Sequence[Mapping[str, int | float | number | Array]] | pypomp.core.parameters.PompParameters | None = None, optimizer: str = 'Adam', alpha: float = 0.97, thresh: int = 0, scale: bool = False, ls: bool = False, c: float = 0.1, max_ls_itn: int = 10, eta_cooling: float = 1.0, alpha_cooling: float = 1.0, n_monitors: int = 1, track_time: bool = True, clip_norm: float | None = None) -> None

Optimizes model parameters using a differentiable particle filter and gradient-based methods.

This method performs Maximum Likelihood Estimation (MLE) by treating the particle filter as a differentiable computational graph. It computes gradients of the log-likelihood with respect to the parameters via reverse-mode automatic differentiation (using JAX), and updates the parameters with a gradient-based optimizer (e.g., Adam or SGD).

This implementation leverages JAX to efficiently vectorize the algorithm across multiple initial parameter sets simultaneously. Results are automatically stored in the model’s history and can be accessed using self.results().

Parameters:
  • J (int) – The number of particles in the MOP objective for obtaining the gradient and/or Hessian.

  • M (int) – Maximum number of iterations for the gradient descent optimization.

  • eta (dict[str, float]) – Per-parameter learning rates, given as a dictionary keyed by parameter name.

  • key (jax.Array, optional) – The random key for reproducibility. Defaults to self.fresh_key.

  • theta (ThetaInput, optional) – Parameters involved in the POMP model. Defaults to self.theta. Accepts:

      - A single dictionary: dict[str, Numeric]
      - A list of dictionaries: list[dict[str, Numeric]]
      - An existing PompParameters object

    Providing a list or PompParameters object enables faster, vectorized execution across all parameter sets.

  • optimizer (str, optional) – The gradient-based iterative optimization method to use. Options include “Adam”, “SGD”, “Newton”, “WeightedNewton”, “BFGS”, and “FullMatrixAdam”. Note: options other than “Adam” and “SGD” might be quite slow. The “Adam” option itself can take ~3x longer per iteration than mif does.

  • alpha (float, optional) – Discount factor for MOP.

  • thresh (int, optional) – Threshold value to determine whether to resample particles.

  • scale (bool, optional) – Boolean flag controlling whether to normalize the search direction.

  • ls (bool, optional) – Boolean flag controlling whether to use the line search algorithm. Note: the line search algorithm can be quite slow.

  • c (float, optional) – Line search parameter. The Armijo condition constant, which controls how much the negative log-likelihood needs to decrease before the line search algorithm continues.

  • max_ls_itn (int, optional) – Line search parameter. Maximum number of iterations for the line search algorithm.

  • eta_cooling (float, optional) – Cooling factor for the learning rate (eta) using cosine decay. This represents the factor by which the original learning rate is multiplied by the end of training. Defaults to 1.0 (no cooling).

  • alpha_cooling (float, optional) – Cooling factor for the MOP discount factor (alpha) using cosine decay. This factor represents the multiplier for the distance of alpha from 1.0 by the end of training (i.e., alpha approaches 1.0). Defaults to 1.0 (no cooling).

  • n_monitors (int, optional) – Number of particle filter runs to average for log-likelihood estimation.

  • track_time (bool, optional) – Boolean flag controlling whether to track the execution time.

  • clip_norm (float, optional) – Clips gradient to [-clip_norm, clip_norm]. If None, no clipping is applied.
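
The eta_cooling and alpha_cooling descriptions above can be illustrated with a small sketch. It is not taken from the pypomp source: the helper names (cosine_cooled, cooled_eta, cooled_alpha) are hypothetical, and the step indexing (step 0 through total_steps - 1) is an assumption. It only shows one cosine schedule that matches the stated endpoints: eta ends at eta times eta_cooling, and the distance of alpha from 1.0 ends at that distance times alpha_cooling.

```python
import math


def cosine_cooled(initial, final_fraction, step, total_steps):
    """Cosine interpolation from `initial` at step 0 to
    `initial * final_fraction` at the last step (hypothetical helper)."""
    if total_steps <= 1:
        return initial
    progress = step / (total_steps - 1)  # runs from 0.0 to 1.0
    weight = 0.5 * (1.0 + math.cos(math.pi * progress))  # runs from 1.0 to 0.0
    return initial * (final_fraction + (1.0 - final_fraction) * weight)


def cooled_eta(eta0, eta_cooling, step, total_steps):
    """Learning rate at `step`; ends at eta0 * eta_cooling."""
    return cosine_cooled(eta0, eta_cooling, step, total_steps)


def cooled_alpha(alpha0, alpha_cooling, step, total_steps):
    """Discount factor at `step`; the cooling acts on the distance from 1.0,
    so alpha moves toward 1.0 when alpha_cooling < 1."""
    return 1.0 - cosine_cooled(1.0 - alpha0, alpha_cooling, step, total_steps)


if __name__ == "__main__":
    M = 5  # illustrative number of iterations
    for m in range(M):
        print(m, cooled_eta(0.1, 0.1, m, M), cooled_alpha(0.97, 0.5, m, M))
```

Under these assumptions, a cooling factor of 1.0 leaves the value unchanged at every step, which agrees with the documented "no cooling" default.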

Returns:

None. Updates self.results_history with a PompTrainResult containing the log-likelihoods, parameter traces, and optimizer details from the training run.
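
A call might be assembled as in the following sketch. It uses only keyword names from the signature above; the helper functions (train_kwargs, run_training), the parameter names "beta" and "gamma", and the numeric values are hypothetical placeholders, and model is assumed to be an already-constructed Pomp instance whose parameters carry those names.

```python
def train_kwargs(param_names, learning_rate=0.01):
    """Build keyword arguments for Pomp.train (hypothetical helper).

    Every key is a parameter name from the documented signature;
    the values are illustrative, not recommendations.
    """
    return dict(
        J=1000,                # particles in the MOP objective
        M=50,                  # maximum number of optimizer iterations
        eta={name: learning_rate for name in param_names},
        optimizer="Adam",
        alpha=0.97,            # documented default discount factor
        eta_cooling=0.1,       # final learning rate is 10% of the initial one
        n_monitors=1,
    )


def run_training(model, param_names):
    """Train an existing model and return its stored results.

    `model` is assumed to expose the documented train() and results() methods.
    """
    model.train(**train_kwargs(param_names))  # returns None; results are stored
    return model.results()
```

Because train returns None, the sketch reads the outcome back through results() rather than from the return value.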