====================================================================
Numerical differentiation (``qcip_tools.numerical_differentiation``)
====================================================================

Mathematical background
-----------------------

Numerical derivation
++++++++++++++++++++

Theory
______

This is an application of `Neville's algorithm <https://en.wikipedia.org/wiki/Neville%27s_algorithm>`_.

Let :math:`M(x)` be a `Maclaurin series expansion <https://en.wikipedia.org/wiki/Taylor_series>`_ of a function :math:`f(x)`,

.. math::
    :label: expansion

    M(x) = \sum_{n=0}^{n_{max}} \frac{1}{n!}\frac{\partial^n f(0)}{\partial x^n}\,x^n = \sum_{n=0}^{n_{max}} \frac{A_n}{n!}\,x^n,

where :math:`A_n=f^{(n)}(0)` is the value of the :math:`n`-order derivative of :math:`f(x)` at :math:`x=0`. If the expansion is not truncated, :math:`n_{max}=\infty`.

Given a small value :math:`h>0` and :math:`I_{a,q} \in \mathbb{R}`,

.. math::

    M(h\,I_{a,q}) = \sum_{n=0}^{n_{max}} I_{a,q}^n\,\frac{A_n}{n!}\,h^n,

where

.. math::

    \forall q \in \mathbb{N}: I_{a,q} = \left\{\begin{matrix}
    -a^{|q|-1} & \text{if } q < 0,\\
    0 & \text{if } q = 0, \\
    a^{q-1} & \text{if } q > 0.
    \end{matrix}\right.

Therefore, :math:`\forall h \in \mathbb{R}^+:\forall k\in\mathbb{N}^0:h\,I_{a,k+1}=a^k\,h` defines a geometric progression, where :math:`h` is the smallest value and :math:`a` is the common ratio (or scale factor).

An estimate of the :math:`A_d` factor [the value of :math:`f^{(d)}(0)`], determined by using :math:`h`, is given by

.. math::

    A_d(h) = \frac{d!}{h^d}\,M^{(d)}(h) + \mathcal{O}(h^{d+1}),

where :math:`M^{(d)}(h)` is the value at :math:`h` of the :math:`d`-order derivative of the Maclaurin series.

Let's assume that one can determine :math:`A_d(h)` with a given error :math:`\mathcal{O}(h^{d+p}), p\in\mathbb{N}^+`, by computing

.. math::

    \begin{align}
    A_d(h) &= \frac{d!}{h^d}\,\sum_{q=q_{min}}^{q_{max}} C_q\,M(h\,I_{a,q}) + \mathcal{O}(h^{d+p}), \\
    &= \frac{d!}{h^d}\,\sum_{q=q_{min}}^{q_{max}} C_q\,\sum_{n=0}^{n_{max}} I_{a,q}^n\,\frac{A_n}{n!}\,h^n + \mathcal{O}(h^{d+p}), \\
    &= \frac{d!}{h^d}\,\sum_{n=0}^{n_{max}} \left[\sum_{q=q_{min}}^{q_{max}} I_{a,q}^n\,C_q\right]\,\frac{A_n}{n!}\,h^n + \mathcal{O}(h^{d+p}).
    \end{align}

In order for the equality to be satisfied, it is necessary that:

.. math::
    :label: sys

    \sum_{q=q_{min}}^{q_{max}} I_{a,q}^n\,C_q = \left\{\begin{array}{ll}
    1 & \text{if }n=d,\\
    0 & \text{otherwise.}
    \end{array}\right.

This defines a set of :math:`n_{max}` linear equations with :math:`q_{max}-q_{min}+1` unknowns (the :math:`C_q`). In order to get a unique solution, :math:`q_{max}-q_{min}+1\geqslant d`; therefore, one can set :math:`n_{max}=q_{max}-q_{min}+1=d+p`.

:math:`q_{min}` and :math:`q_{max}` take different values, depending on the approximation:

+------------------+---------------------------------------------------+--------------------------------------------------+
| Type             | :math:`q_{min}`                                   | :math:`q_{max}`                                  |
+==================+===================================================+==================================================+
| Forward (``F``)  | 0                                                 | :math:`d+p-1`                                    |
+------------------+---------------------------------------------------+--------------------------------------------------+
| Backward (``B``) | :math:`-(d+p-1)`                                  | 0                                                |
+------------------+---------------------------------------------------+--------------------------------------------------+
| Centered (``C``) | :math:`-\left\lfloor\frac{d+p-1}{2}\right\rfloor` | :math:`\left\lfloor\frac{d+p-1}{2}\right\rfloor` |
+------------------+---------------------------------------------------+--------------------------------------------------+

Note that :math:`p+d-1` should therefore always be even in the case of centered derivatives.

The system of linear equations given by Eq. :eq:`sys` is then solved. Once the :math:`C_q` are determined, :math:`A_d(h)` is obtained with a precision of :math:`\mathcal{O}(h^{d+p})` by computing

.. math::
    :label: ad

    A_d(h) = \frac{d!}{h^d}\,\sum_{q=q_{min}}^{q_{max}} C_q\,M(h\,I_{a,q}).
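To make Eqs. :eq:`sys` and :eq:`ad` concrete, here is a minimal sketch, independent of ``qcip_tools``, which solves the system with ``numpy`` for a centered first derivative (:math:`d=1`, :math:`p=2`); the helper ``I()`` simply mirrors the definition of :math:`I_{a,q}` given above:

.. code-block:: python

    import numpy as np

    a = 2.0  # common ratio (for |q| <= 1, the result does not depend on it)
    d, p = 1, 2  # first derivative, O(h^{d+p}) accuracy
    q_values = [-1, 0, 1]  # centered: q in [-floor((d+p-1)/2), floor((d+p-1)/2)]

    def I(a, q):
        """Local helper mirroring I_{a,q} (named ak_shifted() in the code)."""
        return 0.0 if q == 0 else np.sign(q) * float(a) ** (abs(q) - 1)

    # Eq. (sys): one equation per power n, one unknown C_q per point
    mat = np.array([[I(a, q) ** n for q in q_values] for n in range(len(q_values))])
    rhs = np.array([1.0 if n == d else 0.0 for n in range(len(q_values))])
    C = np.linalg.solve(mat, rhs)

    print(C)  # [-0.5  0.   0.5]: with the d!/h^d prefactor of Eq. (ad), this
              # is the classic central difference [M(h) - M(-h)] / 2h

In the library itself, this work is done by the ``Coefficients`` class described below.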
It can be easily shown that the coefficients for a multivariate function are a tensor product of the coefficients of the univariate approximations. In this case, :math:`A_d(\mathbf{h})` is a symmetric tensor, while :math:`\mathbf{h}` is a vector.

.. note::

    In practice, it would not change anything if a Taylor series (not centered at :math:`x=0`) were used instead. It also holds for power series if one discards the :math:`d!` in :eq:`ad`.

.. warning::

    If the values of :math:`M(h)` are known with a given precision :math:`\delta y`, the absolute error on :math:`A_d(h)` cannot be lower than :math:`\delta y \times h^{-d}`, no matter the order of magnitude of :math:`A_d(h)`. If there is no physical (or chemical) limitation to the precision, :math:`\delta y` is determined by the precision of the representation, see the `Python documentation on floating point arithmetic <https://docs.python.org/3/tutorial/floatingpoint.html>`_.

Implementation
______________

The :math:`I_{a,q}` function is named ``ak_shifted()`` in the Python code. In the code, to work with the Romberg triangles defined below, :math:`h=a^{k}h_0`, where :math:`k` is the amplitude, while :math:`h_0` is the minimal value.

The ``Coefficients`` class takes care of the computation of the different coefficients, storing them in ``Coefficients.mat_coefs``. The :math:`\frac{d!}{h^d}` part can be computed via the ``Coefficients.prefactor()`` function.

The ``compute_derivative_of_function(c, scalar_function, k, h0, input_space_dimension, **kwargs)`` function does the whole computation (for uni- and multivariate functions) and gives :math:`A_d(h)` (which is the component of a tensor in the case of a multivariate function).

A ``scalar_function(fields, h_0, **kwargs)`` function must be defined, which must return the value of the Maclaurin expansion :math:`\mathbf{M}_i(\mathbf{x})`, with :math:`\mathbf{M}:\mathbb{R}^n\rightarrow\mathbb{R}^m` (where :math:`n` is the input space dimension), given the :math:`n` ``fields`` (the :math:`\mathbf{x}` vector), corresponding to the different :math:`q`. Since a scalar value is expected, the index :math:`i` (the :math:`i^\text{th}` coordinate of :math:`\mathbf{M}(\mathbf{x})\in\mathbb{R}^m`) must be passed through ``**kwargs``, if any.
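As an illustration, here is a hedged sketch of such a function for the :math:`E:\mathbb{R}^3\rightarrow\mathbb{R}` example used below. The polynomial is arbitrary (chosen so that :math:`\frac{\partial^3E(\mathbf{x})}{\partial \mathbf{x}_0\partial\mathbf{x}_1^2}=4` everywhere), and the ``ak_shifted(ratio, q)`` call, which converts a field (a :math:`q` value, see the warning below) back into a coordinate, is an assumption based only on the function name given above:

.. code-block:: python

    from qcip_tools import numerical_differentiation

    a = 2.0  # common ratio, also used by the Coefficients below

    def E(fields, h_0, **kwargs):
        # ``fields`` holds the q values; reconstruct the coordinates as I_{a,q} * h_0
        x = [numerical_differentiation.ak_shifted(a, q) * h_0 for q in fields]
        return 2 * x[0] * x[1] ** 2 + x[2]  # d^3 E / dx0 dx1^2 = 4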
``c`` is a list of tuples ``(Coefficients, int)``, which defines the derivatives to compute and with respect to which coordinates (starting at 0) the differentiation must be performed. For example, given :math:`E:\mathbb{R}^3\rightarrow\mathbb{R}`, the computation of :math:`\frac{\partial^3E(\mathbf{x})}{\partial \mathbf{x}_0\partial\mathbf{x}_1^2}` is requested via

.. sourcecode:: python

    first_order = numerical_differentiation.Coefficients(1, 2, ratio=a, method='C')
    second_order = numerical_differentiation.Coefficients(2, 2, ratio=a, method='C')

    derivative_values = []

    # compute d^3 E(x) / dx0 dx1^2
    derivative_values.append(numerical_differentiation.compute_derivative_of_function(
        [(first_order, 0), (second_order, 1)],
        E,  # defined in the sketch above
        0,  # k = 0
        0.001,  # h0
        3  # input space is R³
    ))

.. warning::

    In the code, :math:`k` is added to (or subtracted from) :math:`q`, so that ``scalar_function()`` receives the :math:`q` values with respect to :math:`h_0` and is expected to return the value of :math:`M(I_{a,q}\,h_0)`.

Romberg's scheme (Richardson extrapolation)
+++++++++++++++++++++++++++++++++++++++++++

Theory
______

In quantum chemistry, `Richardson extrapolation <https://en.wikipedia.org/wiki/Richardson_extrapolation>`_ is also known as the Romberg differentiation scheme. Like its integration counterpart, it allows one to refine the value of a numerical derivative by combining multiple estimates of that derivative.

Given Eq. :eq:`expansion` and a small :math:`h \in \mathbb{R}^+`, :math:`r, m \in \mathbb{N}^+`, :math:`k \in \mathbb{N}^0`,

.. math::

    \begin{align}
    a^{rm}\,M(a^k h) - M(a^{k+1} h) &= a^{rm}\,\sum_{n=0}^{n_{max}} a^{nk}\,\frac{A_n}{n!}\,h^n - \sum_{n=0}^{n_{max}} a^{nk+n}\,\frac{A_n}{n!}\,h^n \\
    &= \sum_{n=0}^{n_{max}} \left[a^{nk+rm}-a^{nk+n}\right]\frac{A_n}{n!}\,h^n.
    \end{align}

It is therefore possible to remove the :math:`s`-power term from the expansion by choosing :math:`s=rm`. Let :math:`H_{k,0}\equiv A_d(h=a^k h_0)`, with :math:`h_0` the minimal value; the following recurrence relation can then be defined:

.. math::
    :label: romberg

    H_{k,m} = \frac{a^{rm}\,H_{k,m-1}-H_{k+1,m-1}}{a^{rm}-1} + \mathcal{O}(h^{r(m+1)}),

where :math:`m` is the number of *refinement* steps (or iterations). To achieve such a :math:`\mathcal{O}(h^{rm})` precision, it is required to know :math:`m+1` values of :math:`H_{k,0}`, with :math:`k\in [0;m]`. In general, :math:`r` should be equal to 1, to remove the :math:`m`-power contamination at iteration :math:`m`; but in the case of centered derivatives, every odd-power term vanishes, so one can use :math:`r=2` to achieve a faster convergence by removing the :math:`2m`-power contamination at iteration :math:`m`.

A Romberg triangle is obtained, with the following shape:

.. code-block:: text

     k  | h     | m=0     m=1     m=2     ...
    ----+-------+-----------------------------
     0  | a⁰*h0 | H(0,0)  H(0,1)  H(0,2)  ...
     1  | a¹*h0 | H(1,0)  H(1,1)  ...
     2  | a²*h0 | H(2,0)  ...
    ... | ...   | ...
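The recurrence itself is short. As a minimal, self-contained sketch (``romberg_triangle()`` below is illustrative, not the library API):

.. code-block:: python

    import numpy as np

    def romberg_triangle(values, a=2.0, r=1):
        """Fill the triangle of Eq. (romberg), given values[k] = H(k, 0)."""
        K = len(values)
        H = np.zeros((K, K))
        H[:, 0] = values
        for m in range(1, K):
            factor = a ** (r * m)
            for k in range(K - m):  # entries with k >= K - m stay at zero
                H[k, m] = (factor * H[k, m - 1] - H[k + 1, m - 1]) / (factor - 1)
        return H  # the most refined estimate sits at H[0, K - 1]

The ``RombergTriangle`` class described below implements the same recurrence, and also stores the triangle for further analysis.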
If the minimal value :math:`h_0` is well chosen, the value of the derivative should be the rightmost value of the triangle. In practice, there is a window of usable values: below its lower bound, the round-off errors become too large, while above its upper bound, the higher-order terms contaminate the values obtained with the required :math:`a^k h_0` (along with quantum chemistry related reasons, which further lower that bound). Without knowledge of this ideal :math:`h_0` window, it is necessary to carry out an analysis of the triangle to select the "best" value. Two quantities are useful:

.. math::

    \begin{align}
    &\varepsilon_k(m) = H_{k+1,m} - H_{k,m}, \\
    &\varepsilon_{m}(k) = H_{k,m+1} - H_{k,m},
    \end{align}

where :math:`\varepsilon_k(m)` is the amplitude error at a given iteration :math:`m` and :math:`\varepsilon_m(k)` is the iteration error for a given amplitude :math:`k`. The "best" value is chosen according to the following flowchart:

.. figure:: ./images/flowchart_romberg_selection.png
    :align: center

    Flowchart to select the "best" value in a Romberg triangle, adapted from the text in M. de Wergifosse *et al.*, *Int. J. Quant. Chem.* **114**, 900 (2014).

Note that the original paper does not give advice on what the threshold value should be, but it clearly depends on the order of magnitude of :math:`A_d(h)` and on the available (and/or required) precision.

.. note::

    In the case where :math:`A_d(\mathbf{h}=a^k\mathbf{h}_0)` is a tensor (because the corresponding function is multivariate), a Romberg triangle must be carried out for each component of this tensor. But since the tensor is symmetric, the number of unique components to evaluate is reduced.

Implementation
______________

The ``RombergTriangle`` class implements the Romberg triangle algorithm. The constructor expects the values corresponding to increasing ``k`` to be passed as argument. The :math:`H_{k,m}` are available as ``RombergTriangle.romberg_triangle[k, m]``. Note that the values for which :math:`k \geqslant k_{max} - m` are set to zero.

The ``find_best_value()`` method implements the flowchart shown above. To help with (eventual) debugging and understanding, a ``verbose`` option is available.

For example, for a univariate function :math:`F(x)`, compute :math:`\frac{dF(x)}{dx}`:

.. code-block:: python

    a = 2.0

    # forward first-order derivative:
    c = numerical_differentiation.Coefficients(1, 1, ratio=a, method='F')

    derivative_values = []

    # compute dF(x) / dx
    for k in range(5):
        derivative_values.append(numerical_differentiation.compute_derivative_of_function(
            [(c, 0)],  # derivative with respect to the first (and only) variable
            univariate_function,  # defined in the sketch below
            k,
            0.001,  # h0
            1  # R → R
        ))

    t = numerical_differentiation.RombergTriangle(derivative_values, ratio=a)
    position, value, iteration_error = t.find_best_value(threshold=1e-5, verbose=True)
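The snippet above assumes that ``univariate_function`` is defined; a possible sketch, under the same assumptions as the sketch of ``E()`` earlier (illustrative ``ak_shifted()`` call, arbitrary test function):

.. code-block:: python

    import math
    from qcip_tools import numerical_differentiation

    def univariate_function(fields, h_0, **kwargs):
        # a single q value is received; reconstruct the coordinate
        x = numerical_differentiation.ak_shifted(2.0, fields[0]) * h_0
        return math.exp(x)  # F(x) = exp(x), so dF/dx = 1 at x = 0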
For a bivariate function :math:`F(x, y)`, compute :math:`\frac{\partial^3 F(x, y)}{\partial x \partial y^2}`:

.. code-block:: python

    a = 2.0

    # centered first- and second-order derivatives:
    c1 = numerical_differentiation.Coefficients(1, 2, ratio=a, method='C')
    c2 = numerical_differentiation.Coefficients(2, 2, ratio=a, method='C')

    derivative_values = []

    # compute d³F(x,y) / dx dy²
    for k in range(5):
        derivative_values.append(numerical_differentiation.compute_derivative_of_function(
            [(c1, 0), (c2, 1)],  # first and second order derivatives, with respect to the first and second variables
            bivariate_function,  # assume defined
            k,
            0.001,  # h0
            2  # R² → R
        ))

    t = numerical_differentiation.RombergTriangle(derivative_values, ratio=a, r=2)  # r=2 because centered
    position, value, iteration_error = t.find_best_value()

Sources
-------

+ D\. Eberly, "Derivative Approximation by Finite Difference", from the `Geometric Tools documentation <https://www.geometrictools.com/>`_ (last consulted: March 4, 2017).
+ J.N. Lyness *et al.*, *Numer. Math.* **8**, 458 (1966).
+ L.F. Richardson *et al.*, *Philos. Trans. R. Soc. London, Ser. A* **226**, 299 (1927).
+ M\. de Wergifosse *et al.*, *Int. J. Quant. Chem.* **114**, 900 (2014).
+ A.A.K. Mohammed *et al.*, *J. Comp. Chem.* **34**, 1497 (2013).

API documentation
-----------------

.. automodule:: qcip_tools.numerical_differentiation
    :members: