{ "info": { "author": "Philippe Remy", "author_email": "", "bugtrack_url": null, "classifiers": [], "description": "# ARMA w. Scipy\nEstimating coefficients of ARMA models with the Scipy package.\n\nInstallation\n```bash\npip install arma-scipy\n```\n\nPython import\n```python\nfrom arma_scipy import fit\n```\n\n## Motivation\n\nAfter choosing p and q, ARMA models can generally be fitted by least\nsquares regression to find the parameter values that minimize\nthe error term. It is generally considered good practice to find the\nsmallest values of p and q that provide an acceptable fit to the data.\nFor a pure AR model, the Yule-Walker equations may be used to provide a\nfit. "Least squares" means that the overall solution minimizes the sum of the\nsquares of the residuals of every single equation.\n\nThe reasons behind this Scipy fit implementation are twofold:\n- provide an alternative when the score function is not the MSE - **more important**\n- provide a way to compare both fit methods (statistical theory vs. optimization) - less important\n\nYou can fit the coefficients of an `ARMA(4,4)` this way:\n```bash\npython generate_arma_process.py\npython scipy_fit_data.py\n```\n\nHere is an example of such a fit:\n```\n################################################################################\nOptimization terminated successfully.\n Current function value: 1.432208\n Iterations: 508\n Function evaluations: 788\nEstimation of the coefficients with the scipy package:\n[ 0.2235 -0.5872 0.3143 -0.1805 0.167 -0.0464 0.6528 0.224 ]\nEstimation of the coefficients with the statsmodels.tsa (least squares) package:\n[ 0.237 -0.4998 0.3467 -0.128 0.1542 -0.1467 0.6244 0.2245]\nTrue ARMA coefficients:\n[ 0.25 -0.5 0.35 -0.15 0.5 -0.4 0.78 0.32]\n```\n\n## Comparison\n\n- It is no surprise that the score function is minimized by the fit of the `statsmodels` package. 
Indeed, the least squares estimation used by `statsmodels` directly minimizes the mean squared error score on the training set.\n- The Scipy minimize function does a relatively good job of getting close to this minimum. However, due to the stochastic nature of the optimization and the crucial choice of x0 (the initial values of the coefficients to optimize), several runs are needed to get close enough to this global minimum. There is strong variability across runs, and a significant proportion of them never come close to this minimum. Over 200 runs, the average score hovers ~10% above the expected minimum before the fit starts to overfit. The best run, however, reaches 1.41807, a score extremely close to the target minimum score of 1.4179.\n\n
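The optimization-based approach can be sketched as follows. This is a minimal illustration of fitting ARMA coefficients by minimizing the one-step-ahead MSE with `scipy.optimize.minimize` (Nelder-Mead, matching the "Optimization terminated successfully" log above), not the package's actual implementation; the ARMA(2,2) coefficients below are made up for the example.

```python
import numpy as np
from scipy.optimize import minimize

np.random.seed(0)

# Illustrative true ARMA(2, 2) coefficients (made up for this sketch,
# not the ones used by generate_arma_process.py).
ar_true = np.array([0.5, -0.3])
ma_true = np.array([0.4, 0.2])

# Simulate y_t = ar1*y_{t-1} + ar2*y_{t-2} + e_t + ma1*e_{t-1} + ma2*e_{t-2}.
n = 500
eps = np.random.randn(n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = ar_true @ y[t - 2:t][::-1] + eps[t] + ma_true @ eps[t - 2:t][::-1]


def score(params, y, p=2, q=2):
    """One-step-ahead mean squared error of an ARMA(p, q) fit."""
    ar, ma = params[:p], params[p:]
    e = np.zeros_like(y)
    for t in range(max(p, q), len(y)):
        pred = ar @ y[t - p:t][::-1] + ma @ e[t - q:t][::-1]
        e[t] = y[t] - pred
    return np.mean(e[max(p, q):] ** 2)


# Nelder-Mead needs no gradients, so any custom score function can be
# swapped in here - the main motivation for the scipy-based fit.
res = minimize(score, x0=np.zeros(4), args=(y,), method='Nelder-Mead')
print('estimated coefficients:', np.round(res.x, 3))
print('train MSE:', round(res.fun, 4))
```

Because the score function is an arbitrary Python callable, replacing the MSE with any other criterion only requires changing `score`, which is the point of fitting via optimization rather than via closed-form least squares.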
\n
\n
\n
\n