pym - A math and numerical methods library in python

Last Updated on 5/28/17

Documentation at alexhagen.github.io/pym/

pym (pronounced pim) is a pretty simple numerical methods library for python. It can be used to do interpolation, extrapolation, integration, normalization, etc. In general, it is a replication of Brian Bradie's book A Friendly Introduction to Numerical Methods in code.

Pym Demonstrations and Screenshots

Installation

To install pym, all we have to do is install numpy, scipy, matplotlib, and colours, then download pym to our code directory (or wherever, really). To do this, we can use

$ pip install numpy
$ pip install scipy
$ pip install matplotlib
$ pip install colours
$ cd ~/code
$ git clone https://github.com/alexhagen/pym.git

and then, we can use the library within any script by using

In [1]:

from pym import func as pym

Curve creation and graphing

The basis of pym is the curve class, which holds x and y data, as well as its associated error. We can create curves holding sinusoids by using the following code

In [2]:

from pym import func as pym
import numpy as np

# use numpy to create some trigonometric functions across two periods
x_data = np.linspace(0., 4. * np.pi, 1000)
sin_data = np.sin(x_data)
cos_data = np.cos(x_data)

# define these data as pym.curves to expose the interface to the numerical
# methods
sin = pym.curve(x_data, sin_data, name=r'$\sin \left( x \right)$')
cos = pym.curve(x_data, cos_data, name=r'$\cos \left( x \right)$')

# Plot these using the function pym.plot which returns a pyg.ah2d object
plot = sin.plot(linecolor='#285668', linestyle='-')
plot = cos.plot(linecolor='#FC8D82', linestyle='-', addto=plot)

# make it pretty with some shading, lines, changing limits, and labels
plot.fill_between(sin.x, np.zeros_like(sin.y), sin.y, fc='#ccccff')
plot.fill_between(cos.x, np.zeros_like(cos.y), cos.y, fc='#ffcccc')
plot.lines_on()
plot.markers_off()
plot.ylim(-1.1, 1.1)
plot.xlim(0., 4. * np.pi)
plot.xlabel(r'x-coordinate ($x$) [$cm$]')
plot.ylabel(r'y-coordinate ($y$) [$cm$]')

# export it to a websvg (which doesn't convert text to paths)
plot.export('_static/curve_plotting', ratio='silver')
plot.show('A pretty chart from data made for a pym curve')

Figure 1: A pretty chart from data made for a pym curve

Integration and normalization

One of the useful options of pym is the ability to normalize a function, either according to its maximum, or according to its integral. The following is an example of this normalization, showing that after normalization, we attain an integral of 1.0.

In [3]:

# use numpy to create a monotonic function to play with
x_data = np.linspace(0., 2., 1000)
y_data = np.power(x_data, 2)

# define these data as pym.curves to expose the interface to the numerical
# methods
y = pym.curve(x_data, y_data, name='$x^{2}$')

# Plot the unmodified function, shade the integral, and add a pointer with the
# integral value
plot = y.plot(linecolor='#285668', linestyle='-')
plot.fill_between(x_data, np.zeros_like(y_data), y_data, fc='#ccccff')
plot.add_data_pointer(1.5, point=1.5,
                      string=r'$\int f \left( x \right) dx = %.2f$' %
                      (y.integrate(0, 2)), place=(0.5, 3.))
plot.lines_on()
plot.markers_off()

# now normalize the curve with respect to the integral
y.normalize(norm='int')
# Plot the modified function, shade the integral, and add a pointer with the
# integral value
plot = y.plot(addto=plot, linecolor='#FC8D82', linestyle='-')
plot.fill_between(x_data, np.zeros_like(y.x), y.y, fc='#ffdddd')
plot.add_data_pointer(1.25, point=0.125,
                      string=r'$\int f_{norm} \left( x \right) dx = %.2f$' %
                      (y.integrate(0, 2)), place=(0.25, 1.5))
plot.lines_on()
plot.markers_off()
plot.xlabel(r'x-coordinate ($x$) [$cm$]')
plot.ylabel(r'y-coordinate ($y$) [$cm$]')
plot.ylim(0.0, 4.0)
plot.xlim(0.0, 2.0)

# export it to a websvg (which doesn't convert text to paths)
plot.export('_static/int_norm', ratio='silver')
plot.show('Normalized curves have a total integral of 1.0')

Figure 2: Normalized curves have a total integral of 1.0

Curve arithmetic

pym makes it easy to do simple arithmetic operations on curves. The arithmetic all happens after copying the curve, so the original curves are left unchanged. The example below illustrates the common identity $$\sin^{2}\left( \theta \right) + \cos^{2}\left( \theta \right) = 1$$

In [4]:

one = sin * sin + cos * cos
one.name = r'$\sin^{2}\left( \theta \right) + \cos^{2}\left( \theta \right) = 1$'
sin2 = sin * sin
cos2 = cos * cos

plot = one.plot(linestyle='-', linecolor='#999999')
plot = sin2.plot(linestyle='--', linecolor='#FC8D82', addto=plot)

plot.fill_between(sin2.x, np.zeros_like(sin2.y), sin2.y, fc='#ffcccc', name=r'$\sin^{2} \left( \theta \right)$')
plot.fill_between(cos2.x, sin2.y, sin2.y + cos2.y, fc='#ccccff', name=r'$\cos^{2} \left( \theta \right)$')

plot.markers_off()
plot.lines_on()

plot.xlim(0, 12)
plot.ylim(0, 1.1)
#plot.legend(loc=1)

plot.export('_static/identity', ratio='silver')
plot.show(r'Trigonometric identity and its contributions from $\cos^{2}$ and $\sin^{2}$')

Figure 3: Trigonometric identity and its contributions from $\cos^{2}$ and $\sin^{2}$

Subclassing

pym is easily subclassable. I personally like to define classes for data I download from instruments and write a load script and add some properties to the object. The following example shows elevation plotting for the Pacific Crest Trail using the wonderful postholer.com.

In [5]:

# urlopen moved in Python 3; fall back to the Python 2 location if needed
try:
    from urllib.request import urlopen
except ImportError:
    from urllib import urlopen
from bs4 import BeautifulSoup

class trail(pym.curve):
    def __init__(self, trail_name):
        # first we have to download the trail off of postholer.com
        trail_string = trail_name.replace(' ', '-')
        url = 'http://www.postholer.com/databook/{trail}'.format(trail=trail_string)
        page = urlopen(url)
        pagestr = page.read()
        self.soup = BeautifulSoup(pagestr, 'lxml')
        # then we have to find the table with the mileage and elevation data
        for table in self.soup.find_all("table"):
            for row in table.find_all("tr"):
                for cell in row.find_all("td"):
                    if cell.string == "Elev":
                        self.table = table
                        break
        # then we read the mileage and elevation data into lists
        mile = []
        elev = []
        for row in self.table.find_all("tr")[1:]:
            mile.extend([float(row.find_all("td")[1].string)])
            elev.extend([float(row.find_all("td")[5].string)])
        # finally, we initialize the parent class ``curve`` of this object with the data downloaded
        # and the name
        super(trail, self).__init__(mile, elev, name=trail_name)
        
# let's download three long distance western trails
pct = trail('pacific crest trail')
cdt = trail('continental divide trail')
ct = trail('colorado trail')
# now that we've initialized the ``trail``s, we can treat them as curves
plot = pct.plot(linestyle='-', linecolor='#7299c6')
plot = cdt.plot(linestyle='-', linecolor='#baa892', addto=plot)
plot = ct.plot(linestyle="-", linecolor='#3f4b00', addto=plot)
plot.xlabel(r"Miles since trail start ($s$) [$\unit{mi}$]")
plot.ylabel(r"Elevation ($E$) [$\unit{ft}$]")
plot.lines_on()
plot.markers_off()
plot.ylim(0, 12000)
plot.legend(loc=2)
plot.export('_static/trail_elevations')
plot.show('First section elevation of several long distance hiking trails')

Figure 4: First section elevation of several long distance hiking trails

Fitting

pym has a quick interface for fitting functions to its curves, and then plotting these.

A cool example of this uses the Google Trends database of search instances. If you search for "Trail Running", or anything outdoorsy, you get back a graph that looks periodic. Below I have an example using the downloaded U.S. results for "Trail Running" since 2004. My hypothesis is that the interest peaks in the nice weather (summer), and is at its nadir during the winter.

In [6]:

import time
import matplotlib.dates as mdates
from matplotlib.dates import DayLocator, HourLocator, DateFormatter, drange

# Download everything and process into two columns
arr = np.loadtxt('_static/trail_running_search.csv', dtype=str, skiprows=3, delimiter=',')
dates = []
scores = []
for row in arr:
    dates.extend([float(time.mktime(time.strptime(row[0], '%Y-%m')))])
    scores.extend([float(row[1])])
# Now create a curve object with this data
search = pym.curve(dates, scores, name='trail running search')
# The data looks sinusoidal but also with a (sort of) linear increase, so let's make a function that would fit that
def sin_fun(x, a, b, c, T):
    return a * np.sin((x) / (T / (2.0 * np.pi))) + b*x + c
# In general we don't need to guess much, but the period of the function is pretty important
# my hypothesis is that the period is a year.  The rest of the values are just eye-balling off of the chart
T = 365.0 * 24.0 * 60.0 * 60.0
search.fit_gen(sin_fun, guess=(10.0, 1.0E-8, 60.0, T))
# Now that we have the fit, let's plot it
plot = search.plot(linestyle='-', linecolor='#dddddd')
plot = search.plot_fit(linestyle='-', linecolor='#999999', addto=plot)
plot.lines_on()
plot.markers_off()
# And add descriptive statistics to the chart
period = search.coeffs[3]
minx = search.find_min()[0, 0]
miny = search.min()
plot.add_data_pointer(minx + 1.5 * period, point=search.fit_at(minx + 1.5 * period), place=(1.07E9, 90.0),
                      string=time.strftime('Max interest occurs in %B each year', time.gmtime(minx + 1.5 * period)))
plot.add_data_pointer(minx + 2.0 * period, point=search.fit_at(minx + 2.0 * period), place=(1.2E9, 40.0),
                      string=time.strftime('Min interest occurs in %B each year', time.gmtime(minx + 2.0 * period)))
plot.add_hmeasure(minx + 4.0 * period, minx + 8.0 * period, 0.90 * search.fit_at(minx + 4.0 * period),
                  string=r'$4\cdot T \sim %.2f \unit{yr}$' % (4.0*period/60.0/60.0/24.0/365.0))
# label the chart and convert the epoch seconds to years
plot.xlabel(r'Date ($t$) [$\unit{s}$]')
plot.ylabel(r'Interest ($\mathbb{I}$) [ ]')
times = [float(time.mktime(time.strptime(str(_y), '%Y'))) for _y in np.arange(2004, 2018, 2)]
times_f = [time.strftime('%Y', time.gmtime(_x)) for _x in times]
plot.ax.xaxis.set_ticks(times)
plot.ax.xaxis.set_ticklabels(times_f)
# finally, let's export
plot.export('_static/trail_running', ratio='silver')
plot.show('Fitting Trail Running Trends with a sinusoid to show its periodic nature')

Figure 5: Fitting Trail Running Trends with a sinusoid to show its periodic nature

Interpolation and error propagation

pym uses a linear interpolation backend to make its curve objects continuous, and it also propagates the error throughout when operations are performed on it.

In [7]:

# coming soon

Class Documentation for func.curve

class func.curve(x, y, name='', u_x=None, u_y=None, data='smooth')

Bases: object

An object to expose some numerical methods and plotting tools.

A curve object takes any two dimensional dataset and its uncertainty (both in the $x$ and $y$ direction). Each data set includes $x$ and $y$ data and uncertainty associated with that, as well as a name and a data shape designation (whether this is smooth data or binned).

There exist three ways to add uncertainty to the measurements. The first is to define an array or list of values that define the absolute uncertainty at each x. The second is to define a list of tuples that define the lower and upper absolute uncertainty at each x, respectively. The final way is to define a two dimensional array, where the first row is the lower absolute uncertainty at each x, and the second row is the upper absolute uncertainty at each x.

Parameters:

x (list-like) – The abscissa data of the curve
u_x (list-like) – The uncertainty in the abscissa data of the curve
y (list-like) – The ordinate data of the curve
u_y (list-like) – The uncertainty in the ordinate data of the curve
name (str) – The name of the data set, used for plotting, etc.
data (str) – The type of data, whether 'smooth' or 'binned'. This parameter affects the interpolation (and in turn, many other functions) by determining what the value is between data points. For smooth data, linear interpolation is used to find values between points; for binned data, constant interpolation is used.

Returns:	the `curve` object.
Return type:	curve

add(right, name=None)

add(value) adds a value to the curve.

The add function will add the provided value to the curve in place.

Parameters:	right (number) – the number or curve to be added to the curve
Returns:	`curve` with added $y$ values

add_data(x, y, u_x=None, u_y=None)

Add data to the already populated x and y.

Parameters:

x (list-like) – The abscissa data to add to the already populated curve object.
y (list-like) – The ordinate data to add to the already populated curve object.
u_x (list-like) – The uncertainty in the abscissa data to be added.
u_y (list-like) – The uncertainty in the ordinate data to be added.

Returns:	A curve object with the added data, fully sorted.
Return type:	curve

at(x, extrapolation=True)

at(x) finds a value at x.

at(x) uses interpolation or extrapolation to determine the value of the curve at a given point, $x$. The function first checks if $x$ is in the range of the curve. If it is in the range, the function calls interpolate() to determine the value. If it is not in the range, the function calls extrapolate() to determine the value.

Parameters:	x (float) – The coordinate at which the value is desired.
Returns:	the value of the curve at point $x$
Return type:	float

average(xmin=None, xmax=None)

average() finds the average y-value between xmin and xmax (by default, across the entire range).

Parameters:

xmin (float) – The lower bound of `x`-value to include in the average. Default: `x.min()`
xmax (float) – The upper bound of `x`-value to include in the average. Default: `x.max()`

Returns:	A float value equal to

\[\bar{y} = \frac{\int_{x_{min}}^{x_{max}} y dx} {\int_{x_{min}}^{x_{max}} dx}\]

Return type:	float
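
As a rough illustration of the ratio above, the sketch below computes it in plain Python for sampled data, using the trapezoidal rule for the numerator. The helper names `trapz` and `average_y` are hypothetical stand-ins written for this example; they are not pym's implementation.

```python
def trapz(xs, ys):
    """Trapezoidal-rule integral of sampled data (xs must be increasing)."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
               for i in range(len(xs) - 1))


def average_y(xs, ys):
    """Average y-value: the integral of y divided by the width of the x range."""
    return trapz(xs, ys) / (xs[-1] - xs[0])


# y = x sampled unevenly on [0, 2]; a straight line averages to its midpoint
avg = average_y([0.0, 0.5, 1.0, 2.0], [0.0, 0.5, 1.0, 2.0])
```

Weighting by the integral rather than taking a plain mean of the y values keeps unevenly spaced samples from skewing the result.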

bin_int(x_min=None, x_max=None)

bin_int integrates a bar chart.

bin_int is a convenience function used throughout the class when calling integrate. It integrates curves that have the .data property set to 'binned'. It does this simply by summing the products of the bin widths and bin heights, such that

\[\int_{x_{min}}^{x_{max}} y \, dx \approx \sum_{i=1}^{N} \Delta x_{i} \cdot y_{i}\]

Note that this function assumes that the last bin has the same width as the penultimate bin. This could be remedied in certain ways, but I’m not sure which to choose yet.

Parameters:

x_min (float) – Optional. The bottom of the range to be integrated.
x_max (float) – Optional. The top of the range to be integrated.

Returns:	the result of the integration.
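
A minimal sketch of this width-times-height sum follows, assuming the x values mark the left edge of each bin (as the rebin documentation suggests) and applying the same last-bin assumption described above. The function name `bin_integral` is hypothetical and the code is not taken from pym.

```python
def bin_integral(left_edges, heights):
    """Sum width * height over bins defined by their left edges."""
    widths = [left_edges[i + 1] - left_edges[i]
              for i in range(len(left_edges) - 1)]
    # the last bin has no right neighbour, so reuse the penultimate width
    widths.append(widths[-1])
    return sum(w * h for w, h in zip(widths, heights))


# bins starting at 0, 1 and 3 with widths 1, 2 and (assumed) 2
total = bin_integral([0.0, 1.0, 3.0], [2.0, 1.0, 4.0])
```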

copy(name=None)

Performs a deep copy of the curve and passes it out as another curve object so that it can be manipulated out-of-place.

Returns:	a copy of the `curve` object calling the function
Return type:	curve

crop(y_min=None, y_max=None, x_min=None, x_max=None, replace=None)

Crop the data within the specified rectangle.

crop(y_min, y_max, x_min, x_max, replace) will find any data points that fall outside of the rectangle with corners at (x_min, y_min) and (x_max, y_max) and replace them with the value specified as replace.

Parameters:

x_min (float) – A value for which any values with $x<x_{min}$ will be replaced with the value replace.
x_max (float) – A value for which any values with $x>x_{max}$ will be replaced with the value replace.
y_min (float) – A value for which any values with $y<y_{min}$ will be replaced with the value replace.
y_max (float) – A value for which any values with $y>y_{max}$ will be replaced with the value replace.
replace (float) – The value to replace any value outside of the rectangle with. Default None.

Returns:

the cropped curve object

curve_div(right)

curve_div(curve) divides one curve by another.

This is a helper method, usually only called through curve.divide, or using the / operator. The method first takes a unique set of x points that are within the range of both curves. Then, it divides the y values of the left curve by those of the right curve.

Parameters:	right (curve) – the curve to divide by.
Returns:	the left `curve` object, with the values divided in place.

curve_mult(mult)

curve_mult(curve) multiplies two curves together.

This is a helper method, usually only called through curve.multiply, or using the * operator. The method first takes a unique set of x points that are within the range of both curves. Then, it multiplies the y values of the two curves together at those points.

Parameters:	mult (curve) – the curve to multiply by
Returns:	the left `curve` object, with the values multiplied in place.
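
The two steps described for curve_mult (build a shared set of x points, then combine the y values) can be sketched in plain Python as below. This is only an illustration under the assumption that both curves are evaluated by linear interpolation on the merged grid; `lin_interp` and `multiply_curves` are hypothetical names, and the sketch returns new lists rather than modifying anything in place.

```python
def lin_interp(xs, ys, x):
    """Linear interpolation of sorted data at a point inside its range."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
            return slope * (x - xs[i]) + ys[i]
    raise ValueError("x is outside the data range")


def multiply_curves(xa, ya, xb, yb):
    """Multiply two sampled curves on the x points they have in common range."""
    lo, hi = max(xa[0], xb[0]), min(xa[-1], xb[-1])
    # unique x points from both curves, restricted to the overlapping range
    xs = sorted({x for x in list(xa) + list(xb) if lo <= x <= hi})
    ys = [lin_interp(xa, ya, x) * lin_interp(xb, yb, x) for x in xs]
    return xs, ys


xs, ys = multiply_curves([0.0, 1.0, 2.0], [0.0, 1.0, 2.0],
                         [0.5, 1.5, 2.5], [1.0, 1.0, 1.0])
```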

decimate(R=None, length=None)

Remove all but every R-th point in the curve.

Parameters:

R (int) – An integer value telling how often to save a point.
length (int) – Alternatively, an integer telling how big you want the final array.

Returns:	the decimated `curve` object

derivative(x, epsilon=None)

derivative(x) takes the derivative at point $x$.

derivative(x) takes the derivative at the provided point x, using a surrounding increment of $\varepsilon$, provided by epsilon. epsilon has a default value of $\min \frac{\Delta x}{100}$, but you can specify a smaller value if your points are closer together. Because we’re currently only using linear interpolation, this won’t change a thing as long as it's smaller than the spacing of your $x$ values.

Parameters:

x (float) – The abscissa to take the derivative at.
epsilon (float) – The $\Delta x$ around the point at $x$ used to calculate the derivative.

Returns:	the derivative at point `x`
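
One way to picture the "surrounding increment" is a central difference on the linear interpolant, sketched below. The central-difference form is an assumption made for this example (the text above does not say which difference pym uses), and `lin_interp` and `derivative_at` are hypothetical helper names. The sketch only handles points at least epsilon away from the ends of the data.

```python
def lin_interp(xs, ys, x):
    """Linear interpolation of sorted data at a point inside its range."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
            return slope * (x - xs[i]) + ys[i]
    raise ValueError("x is outside the data range")


def derivative_at(xs, ys, x, epsilon=None):
    """Central difference of the interpolated curve around x."""
    if epsilon is None:
        # default: one hundredth of the smallest spacing between points
        epsilon = min(b - a for a, b in zip(xs, xs[1:])) / 100.0
    upper = lin_interp(xs, ys, x + epsilon)
    lower = lin_interp(xs, ys, x - epsilon)
    return (upper - lower) / (2.0 * epsilon)


# piecewise-linear data with slope 1 on [0, 1] and slope 3 on [1, 2]
slope_left = derivative_at([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], 0.5)
slope_right = derivative_at([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], 1.5)
```

Because the interpolant is a straight line inside each interval, any epsilon smaller than the distance to the nearest data point gives the same slope.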

divide(denominator)

divide(denominator) divides a curve by a value.

The divide function will divide the curve by the value provided in denominator. Note that this will only change the value (y) of the function, not the abscissa (x).

Parameters:	denominator (number) – the number to divide the curve by.
Returns:	none

divide_by(numerator)

divide_by(numerator) divides a value by the curve.

The divide_by function will divide the value provided in numerator by the values in the curve. Note that this will only change the value (y) of the function, not the abscissa (x).

Parameters:	numerator (number) – the number to be divided by the curve.
Returns:	none

extrapolate(x)

extrapolate(x) finds value of a point out of the curve range.

The function uses linear extrapolation to find the value of a point outside the range of the already existing curve. First, it determines whether the requested point is above or below the existing data. Then, it uses find_nearest_down() or find_nearest_up() to find the nearest point, and uses the same function again to find the second nearest point. Finally, it solves the following equation to determine the value

\[y=\frac{\left(y_{\downarrow}-y_{\downarrow \downarrow} \right)}{\left(x_{\downarrow}-x_{\downarrow \downarrow}\right)} \left(x-x_{\downarrow}\right)+y_{\downarrow}\]

Parameters:	x (float) – the abscissa of the value requested
Returns:	the value of the curve at point $x$
Return type:	float
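
The extrapolation equation above can be sketched in plain Python as follows, with the nearest and second-nearest points taken directly from the ends of sorted lists instead of through find_nearest_down() or find_nearest_up(). The function name `linear_extrapolate` is hypothetical; this is an illustration of the formula, not pym's code.

```python
def linear_extrapolate(xs, ys, x):
    """Extend the line through the two end points nearest to x."""
    if x < xs[0]:
        # below the data: nearest point is the first, then the second
        x_near, y_near, x_next, y_next = xs[0], ys[0], xs[1], ys[1]
    elif x > xs[-1]:
        # above the data: nearest point is the last, then the one before it
        x_near, y_near, x_next, y_next = xs[-1], ys[-1], xs[-2], ys[-2]
    else:
        raise ValueError("x is inside the data range; interpolate instead")
    slope = (y_near - y_next) / (x_near - x_next)
    return slope * (x - x_near) + y_near


above = linear_extrapolate([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], 3.0)
below = linear_extrapolate([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], -1.0)
```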

fft(pos=True, return_curve=True, real=True)

fft finds the fast Fourier transform (FFT) of the curve.

fft assumes that the values contained in curve.x are time values and are evenly spaced, and returns the FFT of the values in curve.y versus curve.x.

Parameters:

pos (bool) – if `True`, returns only the positive frequency components
return_curve (bool) – if `True`, returns the data as a curve

Returns:	`f` the array of frequencies and `a` the amplitude of components present at that frequency

find(y)

find(y) finds values of $x$ that have value $y$

This function takes a parameter $y$ and finds all of the $x$ coordinates that have that value. Basically, this is a root-finding problem, but since we have a linear interpolation, the actual root-finding is trivial. The function first finds all intervals in the dataset that include the value $y$, and then solves the interpolation to find those $x$ values according to

\[x=\left(y-y_{\downarrow}\right)\frac{\left(x_{\uparrow} -x_{\downarrow}\right)}{\left(y_{\uparrow}-y_{\downarrow}\right)} +x_{\downarrow}\]

Parameters:	y (float) – the value for which $x$ coordinates are desired
Returns:	a list of $x$ that have value $y$
Return type:	list
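
A minimal sketch of this interval search follows: it walks each pair of neighbouring points, keeps the intervals that bracket the requested value, and solves the equation above in each one. The name `find_crossings` is hypothetical, and skipping flat intervals is a choice made for this example rather than documented pym behaviour.

```python
def find_crossings(xs, ys, y):
    """Return every x where the piecewise-linear curve takes the value y."""
    found = []
    for i in range(len(xs) - 1):
        x_dn, x_up = xs[i], xs[i + 1]
        y_dn, y_up = ys[i], ys[i + 1]
        if y_dn == y_up:
            continue  # flat interval: skip to avoid dividing by zero
        if min(y_dn, y_up) <= y <= max(y_dn, y_up):
            x = (y - y_dn) * (x_up - x_dn) / (y_up - y_dn) + x_dn
            # a crossing exactly on a data point belongs to two intervals
            if not found or x != found[-1]:
                found.append(x)
    return found


# a zig-zag between 0 and 2 crosses y = 1 once in each interval
crossings = find_crossings([0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 0.0, 2.0], 1.0)
```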

find_first_above(y_min)

Find the first point with y value above the given value y_min.

Parameters:	y_min (float) – the comparison value
Returns:	the tuple (x, y) which is the first in `x` space where `y` is above the given y_min

find_nearest_down(x, error=False)

find_nearest_down(x) will find the actual data point that is closest in negative x-distance to the data point x passed to the function.

Parameters:

x (float) – The data point `x` for which to find the closest value below.
error (bool) – If true, the u_x and u_y will be returned at that point, even if they are `None`.

Returns:	a tuple containing the `x` and `y` value of the data point immediately below in `x` value to the value passed to the function, optionally containing the `u_x` and `u_y` value.

find_nearest_up(x, error=False)

find_nearest_up(x, error=False) will find the actual data point that is closest in positive x-distance to the data point x passed to the function.

Parameters:

x (float) – The data point `x` for which to find the closest value above.
error (bool) – If true, the u_x and u_y will be returned at that point, even if they are `None`.

Returns:	a tuple containing the `x` and `y` value of the data point immediately above in `x` value to the value passed to the function, optionally containing the `u_x` and `u_y` value.
Return type:	tuple

find_peaks(thres=0.3, min_dist=1)

find_peaks finds the peaks in the curve.

fit_at(x)

fit_at returns the value at coordinate $x$ from a previously fitted curve.

Parameters:	x (float) – the abscissa for which the fit value is needed.

fit_cube()

fit_cube fits a function of order 3 to the curve.

fit_cube fits a cubic function of form $y=a x^{3} + b x^{2} + c x + d$ to the curve, returning the parameters $\left(a, b, c, d\right)$ as a tuple.

Returns:	the tuple $\left(a, b, c, d\right)$

fit_exp()

fit_exp fits an exponential to the curve.

fit_exp fits an exponential of form $y=B\cdot \exp \left( \alpha\cdot x\right)$ to the curve, returning the parameters $\left(\alpha, B\right)$ as a tuple.

Returns:	the tuple $\left(\alpha, B\right)$

fit_gauss(guess=None)

fit_gauss fits a gaussian function to the curve.

fit_gauss fits a gaussian function of form $y=\alpha \exp \left[ -\frac{\left(x - \mu\right)^{2}}{2 \sigma^{2}}\right]$ to the curve, returning the parameters $\left(\alpha, \mu, \sigma\right)$ as a tuple.

Returns:	the tuple $\left(\alpha, \mu, \sigma\right)$

fit_gen(fun, guess=None, u_y=None)

fit_gen fits a general function to the curve.

fit_gen fits a general function to the curve. The general function is a Python function that takes the independent variable x, followed by the fit parameters, and returns the value of the function at that point, y. The function must have the prototype def func(x, alpha, beta, ...):. The fitted coefficients are then returned as a tuple.

Returns:	the coefficients to the general function

fit_lin()

fit_lin fits a linear function to the curve.

fit_lin fits a linear function of form $y=m\cdot x + b$ to the curve, returning the parameters $\left(m, b\right)$ as a tuple.

Returns:	the tuple $\left(m, b\right)$
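
For readers who want to see what a straight-line fit of this form involves, here is a closed-form ordinary least-squares sketch in plain Python. Ordinary least squares is an assumption for this example (the text above does not state which fitting method pym uses), and `least_squares_line` is a hypothetical name.

```python
def least_squares_line(xs, ys):
    """Fit y = m * x + b by ordinary least squares and return (m, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance of x and y divided by the variance of x
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    m = s_xy / s_xx
    b = mean_y - m * mean_x
    return m, b


# points lying exactly on y = 2x + 1
m, b = least_squares_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```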

fit_pow(guess=None)

fit_pow fits a power function to the curve.

Returns:	the tuple of fitted parameters

fit_square()

fit_square fits a function of order 2 to the curve.

fit_square fits a quadratic function of form $y=a x^{2} + b x + c$ to the curve, returning the parameters $\left(a, b, c\right)$ as a tuple.

Returns:	the tuple $\left(a, b, c\right)$

inrange(x)

Check if a point is within the range of data.

Parameters:	x (float) – The data point to check if it is in the range of the existing curve data.
Returns:	Whether or not the data is in the range of the curve data.
Return type:	bool

integrate(x_min=None, x_max=None, quad='lin')

integrate integrates under the curve.

integrate will integrate under the given curve, providing the result of $\int_{x_{min}}^{x_{max}} y \, dx$. x_min and x_max can be provided to change the range of integration. quad can also be provided to change the quadrature, but the only quadrature currently supported is 'lin', which uses the trapezoidal rule to integrate the curve.

Parameters:

x_min (float) – Optional. The bottom of the range to be integrated.
x_max (float) – Optional. The top of the range to be integrated.
quad (str) – Optional. The "quadrature" to be used for numerical integration.

Returns:	the result of the integration.

interpolate(x)

interpolate(x) finds the value of a point in the curve range.

The function uses linear interpolation to find the value of a point in the range of the curve data. First, it uses find_nearest_down() and find_nearest_up() to find the two points comprising the interval which $x$ exists in. Then, it casts the linear interpolation as a line in point slope form and solves

\[y=\frac{\left(y_{1}-y_{0}\right)}{\left(x_{1}-x_{0}\right)} \left(x-x_{0}\right)+y_{0}\]

Parameters:	x (float) – The coordinate of the desired value.
Returns:	the value of the curve at $x$
Return type:	float
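
The point-slope formula above can be sketched in a few lines of plain Python. Here the bracketing interval is found with a simple loop over sorted lists rather than through find_nearest_down() and find_nearest_up(); `linear_interpolate` is a hypothetical name and the code is an illustration only.

```python
def linear_interpolate(xs, ys, x):
    """Evaluate the piecewise-linear curve through (xs, ys) at x."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x is outside the data range")
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            x0, x1 = xs[i], xs[i + 1]
            y0, y1 = ys[i], ys[i + 1]
            # point-slope form of the line through (x0, y0) and (x1, y1)
            return (y1 - y0) / (x1 - x0) * (x - x0) + y0


value = linear_interpolate([0.0, 1.0, 2.0], [0.0, 10.0, 40.0], 1.5)
```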

multiply(mult)

multiply(mult) multiplies the curve by a value.

The multiply function will multiply the curve by the value passed to it in mult. This value can be an array with the same size or a scalar of type integer or float. Note that this will only change the value (y) of the function, not the abscissa (x).

Parameters:	mult (number) – the number to multiply the curve by
Returns:	the curve after multiplication

normalize(xmin=None, xmax=None, norm='int')

normalize() normalizes the entire curve.

Caution! This will change all of the y values in the entire curve!

Normalize will take the data of the curve (optionally just the data between xmin and xmax) and normalize it based on the option given by norm. The options for norm are max and int. For a max normalization, first the function finds the maximum value of the curve in the range of the $x$ data and adjusts all $y$ values according to

\[y = \frac{y}{y_{max}}\]

For an int normalization, the function adjusts all $y$ values according to

\[y=\frac{y}{\int_{x_{min}}^{x_{max}}y \left( x \right) dx}\]

Parameters:

xmin (float) – optional argument giving the lower bound of the integral in an integral normalization or the lower bound in which to find the max in a max normalization
xmax (float) – optional argument giving the upper bound of the integral in an integral normalization or the upper bound in which to find the max in a max normalization
norm (str) – a string of 'max' or 'int' (default 'int') which defines which of the two types of normalization to perform

Returns:

None
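
Both normalizations amount to dividing every $y$ value by a single scale factor, which the plain-Python sketch below makes explicit. It uses the trapezoidal rule for the integral and returns a new list instead of changing data in place; `trapz` and `normalized` are hypothetical names, not pym functions.

```python
def trapz(xs, ys):
    """Trapezoidal-rule integral of sampled data (xs must be increasing)."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
               for i in range(len(xs) - 1))


def normalized(xs, ys, norm='int'):
    """Return ys scaled by their maximum ('max') or their integral ('int')."""
    if norm == 'max':
        scale = max(ys)
    elif norm == 'int':
        scale = trapz(xs, ys)
    else:
        raise ValueError("norm must be 'max' or 'int'")
    return [y / scale for y in ys]


# y = x^2 on [0, 2], as in the normalization demonstration earlier
xs = [i / 100.0 for i in range(201)]
ys = [x ** 2 for x in xs]
area_after = trapz(xs, normalized(xs, ys, norm='int'))
peak_after = max(normalized(xs, ys, norm='max'))
```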

rebin(x=None)

Redistribute the curve along a new set of x values.

rebin(x) takes a list-like input of new points on the abscissa and redistributes the ordinate values so that the x values are only on those points. For continuous/smooth data, this simply interpolates the previous curve to the new points. For binned data, this integrates between left bin points and redistributes the fraction of data between those points.

Parameters:	x (list) – the new x values to redistribute the curve. If binned, this indicates the left edge
Returns:	the curve object with redistributed values

rename(name)

Rename the current curve.

return_fit()

Return the fit as a curve.

rolling_avg(bin_width=0.1)

rolling_avg(bin_width) redistributes the data on a certain bin width, propagating the error as needed.

If we have data in an array such as

\[\begin{split}\left[\begin{array}{c} \vec{x}\\ \vec{y} \end{array}\right]=\left[\begin{array}{cccc} 0.1 & 0.75 & 1.75 & 1.9\\ 1.0 & 2.0 & 3.0 & 4.0 \end{array}\right]\end{split}\]

and we want to see the data only on integer bins, we will return

\[\begin{split}\left[\begin{array}{c} \vec{x}\\ \vec{y} \end{array}\right]=\left[\begin{array}{cc} 0.0 & 2.0\\ 1.5 & 3.5 \end{array}\right]\end{split}\]

This function will also return the uncertainty in each bin, taking into account both the uncertainty of each value in the bin, as well as the uncertainty caused by standard deviation within the bin itself. This can be expressed by

\[\begin{split}\left[\begin{array}{c} \vec{x}\\ \vec{y}\\ \vec{u}_{x}\\ \vec{u}_{y} \end{array}\right]=\left[\begin{array}{c} \frac{\sum_{x\text{ in bin}}x}{N_{x}}\\ \frac{\sum_{x\text{ in bin}}y}{N_{y}}\\ \frac{\sum_{x\text{ in bin}}\sqrt{ \left(\frac{\text{bin width}}{2}\right)^{2} +\text{mean}\left(\sigma_{x}\right)^{2}}}{N_{x}}\\ \frac{\sum_{x\text{ in bin}}\sqrt{\sigma_{y}^{2} +stdev_{y}^{2}}}{N_{x}} \end{array}\right]\end{split}\]

Parameters:	bin_width (float) – The width in which the redistribution will happen.
Returns:	The redistributed curve.

static round_to_amt(num, amt)

round_to_amt is a static method that rounds a number to an arbitrary interval.

Given a number num such as $1.2$ and an amount amt such as $0.25$, round_to_amt would return $1.00$ because that is the closest value downward on a $0.25$ wide grid.

Parameters:

num (float) – the number to be rounded.
amt (float) – the amount to round the number to.

Returns:	the number after it has been rounded.
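
Rounding downward onto a grid of arbitrary spacing can be sketched with a floor division, as below. This assumes that "closest value downward" means flooring to the grid point at or below the number; the name `round_down_to` is hypothetical and the sketch is not pym's source.

```python
import math


def round_down_to(num, amt):
    """Return the largest multiple of amt that does not exceed num."""
    return math.floor(num / amt) * amt


on_half_grid = round_down_to(2.6, 0.5)
negative_case = round_down_to(-0.3, 0.25)
```

Note that flooring moves negative numbers away from zero, so the second call lands on the grid point below -0.3.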

sort()

Sort the list depending on the $x$ coordinate.

sort() sorts all of the data input to the curve so that it is in order of increasing $x$.

Returns:	the `curve` object, but it has been sorted in-place.
Return type:	curve

trapezoidal(x_min, x_max, quad='lin')

trapezoidal() uses the trapezoidal rule to integrate the curve.

trapezoidal(x_min, x_max) integrates the curve using the trapezoidal rule, i.e.

\[\int_{x_{min}}^{x_{max}}y dx \approx \sum_{i=1,\dots}^{N} \left(x_{\uparrow} - x_{\downarrow}\right) \cdot \left( \frac{y_{\downarrow} + y_{\uparrow}}{2}\right)\]

Right now, it uses $10 \times N_{x}$ points to integrate between values, but that is completely arbitrary and I’ll be looking into changing this. There is also the ability to pass quad to the function as 'log' (currently failing), in which case it calculates the trapezoids in logarithmic space, giving exact integrals for exponential functions.

Parameters:

x_min (float) – the left bound of integration.
x_max (float) – the right bound of integration.
quad (str) – the type of quadrature to use, currently only `'lin'` or `'log'`

Returns:	the integral of the curve from trapezoidal rule.
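
The trapezoidal sum above is short enough to write out directly. The sketch below integrates over the data points as given, without the resampling onto $10 \times N_{x}$ points mentioned above; `trapezoid_sum` is a hypothetical name used only for this example.

```python
def trapezoid_sum(xs, ys):
    """Sum (x_up - x_down) * (y_down + y_up) / 2 over neighbouring points."""
    total = 0.0
    for i in range(len(xs) - 1):
        total += (xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
    return total


# y = x^2 on [0, 2]; the exact integral is 8/3
xs = [2.0 * i / 1000 for i in range(1001)]
ys = [x ** 2 for x in xs]
integral = trapezoid_sum(xs, ys)
```

With 1000 intervals the result differs from 8/3 only in the sixth decimal place, which is the expected size of the trapezoidal error for this spacing.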

u_y_at(x, dx=0.0)

u_y_at(x) finds the uncertainty of a value at x.

u_y_at(x) uses interpolation or extrapolation to determine the uncertainty of the value of the curve at a given point, $x$. The function first checks if $x$ is in the range of the curve. If it is in the range, the function calls interpolate() and propogate_error() to find the uncertainty of the point. If it is not in the range, the function calls extrapolate() and propogate_error() to determine the value.

We use the following equation to perform the interpolation:

\[y\left(x\right) = \left(x-x_{\downarrow}\right) \frac{\left(y_{\uparrow}-y_{\downarrow}\right)} {\left(x_{\uparrow}-x_{\downarrow}\right)} + y_{\downarrow}\]

And using the error propagation formula from (Knoll, 1999), which is

\[\sigma_{\zeta}^{2} = \left(\frac{\partial\zeta}{\partial x}\right)^{2}\sigma_{x}^{2} + \left(\frac{\partial\zeta}{\partial y}\right)^{2}\sigma_{y}^{2}\]

for a derived value $\zeta$, we can apply this to interpolation and get:

\[\sigma_{y}^{2} = \left(\frac{\partial y}{\partial x}\right)^{2}\sigma_{x}^{2} + \left(\frac{\partial y}{\partial x_{\downarrow}}\right)^{2} \sigma_{x_{\downarrow}}^{2} + \left(\frac{\partial y}{\partial x_{\uparrow}}\right)^{2} \sigma_{x_{\uparrow}}^{2} + \left(\frac{\partial y}{\partial y_{\downarrow}}\right)^{2} \sigma_{y_{\downarrow}}^{2} + \left(\frac{\partial y}{\partial y_{\uparrow}}\right)^{2} \sigma_{y_{\uparrow}}^{2}\]

and, performing the derivatives, we can get:

\[\begin{split}\sigma_{y}^{2}=\left(\frac{\left(y_{\uparrow}-y_{\downarrow}\right)} {\left(x_{\uparrow}-x_{\downarrow}\right)}\right)^{2} \sigma_{x}^{2}+\left(\left(x-x_{\uparrow}\right) \frac{\left(y_{\uparrow}-y_{\downarrow}\right)} {\left(x_{\uparrow}-x_{\downarrow}\right)^{2}}\right)^{2} \sigma_{x_{\downarrow}}^{2}+\left(-\left(x-x_{\downarrow}\right) \frac{\left(y_{\uparrow}-y_{\downarrow}\right)}{ \left(x_{\uparrow}-x_{\downarrow}\right)^{2}}\right)^{2} \sigma_{x_{\uparrow}}^{2}\\+\left(\frac{\left(x_{\uparrow}-x \right)}{\left(x_{\uparrow}-x_{\downarrow}\right)}\right)^{2} \sigma_{y_{\downarrow}}^{2}+\left(\frac{ \left(x-x_{\downarrow}\right)}{\left(x_{\uparrow}-x_{\downarrow} \right)}\right)^{2}\sigma_{y_{\uparrow}}^{2}\end{split}\]

Finally, if we take $m=\frac{\left(y_{\uparrow}-y_{\downarrow} \right)}{\left(x_{\uparrow}-x_{\downarrow}\right)}$, and $\Delta\xi=\frac{\left(x-x_{\downarrow}\right)}{\left(x_{ \uparrow}-x_{\downarrow}\right)}$, we can get:

\[\sigma_{y}^{2}=m^{2}\left[\sigma_{x}^{2}+ \left(1-\Delta\xi\right)^{2}\sigma_{x_{\downarrow}}^{2}+ \Delta\xi^{2}\sigma_{x_{\uparrow}}^{2}\right]+ \left(1-\Delta\xi\right)^{2}\sigma_{y_{\downarrow}}^{2}+ \Delta\xi^{2}\sigma_{y_{\uparrow}}^{2}\]

and the square root of that is the uncertainty.

\[\sigma_{y}=\sqrt{m^{2}\left[\sigma_{x}^{2}+ \left(1-\Delta\xi\right)^{2}\sigma_{x_{\downarrow}}^{2}+ \Delta\xi^{2}\sigma_{x_{\uparrow}}^{2}\right]+ \left(1-\Delta\xi\right)^{2}\sigma_{y_{\downarrow}}^{2}+ \Delta\xi^{2}\sigma_{y_{\uparrow}}^{2}}\]

Note that if an uncertainty in x is not supplied, the first term will go to zero, giving

\[\require{cancel} \sigma_{y}=\sqrt{m^{2}\left[\cancel{\sigma_{x}^{2}}+ \left(1-\Delta\xi\right)^{2}\sigma_{x_{\downarrow}}^{2}+ \Delta\xi^{2}\sigma_{x_{\uparrow}}^{2}\right]+ \left(1-\Delta\xi\right)^{2}\sigma_{y_{\downarrow}}^{2}+ \Delta\xi^{2}\sigma_{y_{\uparrow}}^{2}}\]

Parameters:

x (float) – The coordinate at which the value is desired.
dx (float) – Optional. The uncertainty in the x coordinate requested, given in the above equations as $\sigma_{x}$.

Returns:	$\sigma_{y}$, the uncertainty of the value of the curve at point $x$
Return type:	float
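
As a numerical cross-check of this kind of propagation, the sketch below applies the general formula (sum of squared partial derivatives times squared uncertainties) to the point-slope interpolation formula given for interpolate(), estimating each partial derivative with a small central difference. Everything here is a hypothetical illustration: the function names, the argument order and the step size `h` are choices made for this example, not part of pym.

```python
import math


def interp_value(x, x_dn, x_up, y_dn, y_up):
    """Point-slope linear interpolation between (x_dn, y_dn) and (x_up, y_up)."""
    return (y_up - y_dn) / (x_up - x_dn) * (x - x_dn) + y_dn


def interp_uncertainty(values, sigmas, h=1e-6):
    """Propagate uncertainties through interp_value.

    values and sigmas are both ordered as (x, x_dn, x_up, y_dn, y_up).
    """
    variance = 0.0
    for i, sigma in enumerate(sigmas):
        shifted_up = list(values)
        shifted_dn = list(values)
        shifted_up[i] += h
        shifted_dn[i] -= h
        # central-difference estimate of the partial derivative
        partial = (interp_value(*shifted_up) - interp_value(*shifted_dn)) / (2.0 * h)
        variance += (partial * sigma) ** 2
    return math.sqrt(variance)


# midpoint between (0, 0) and (2, 4), with only the y data uncertain
values = (1.0, 0.0, 2.0, 0.0, 4.0)
u_from_y = interp_uncertainty(values, (0.0, 0.0, 0.0, 0.2, 0.2))
# same point, with only the requested x uncertain
u_from_x = interp_uncertainty(values, (0.1, 0.0, 0.0, 0.0, 0.0))
```

At the midpoint each end point contributes half of its $y$ uncertainty, and an uncertainty in the requested $x$ is scaled by the slope of the segment.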