diff_classifier.features

Functions to calculate trajectory features from input trajectory data

This module provides functions to calculate trajectory features based on the ImageJ plugin TraJClassifier by Thorsten Wagner. See details at https://imagej.net/TraJClassifier.

diff_classifier.features.alpha_calc(track)

Calculates alpha, the exponential fit parameter for MSD data

Parameters:
track : pandas.core.frame.DataFrame

At a minimum, must contain a Frame and an MSDs column. The function msd_calc can be used to generate the correctly formatted pandas DataFrame.

Returns:
alph : numpy.float64

The anomalous exponent derived by fitting MSD values to the function, <rad**2(n)> = 4*dcoef*(n*delt)**alph

dcoef : numpy.float64

The fitted diffusion coefficient derived by fitting MSD values to the function above.

Examples

>>> frames = 5
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.linspace(1, frames, frames)+5,
...          'Y': np.linspace(1, frames, frames)+3}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> alpha_calc(dframe)
(2.0000000000000004, 0.4999999999999999)
>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.sin(np.linspace(1, frames, frames)+3),
...          'Y': np.cos(np.linspace(1, frames, frames)+3)}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> alpha_calc(dframe)
(0.023690002018364065, 0.5144436515510022)
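The power-law fit behind alph and dcoef can be illustrated without the package. This is a minimal sketch and not the module's implementation: it assumes a linear least-squares fit in log-log space (the module may use a nonlinear fit instead), and the name fit_alpha and the default delt=1.0 are hypothetical.

```python
import math


def fit_alpha(lags, msds, delt=1.0):
    """Fit MSD = 4*dcoef*(n*delt)**alph by a straight line in log-log space.

    Hypothetical helper: log(MSD) = log(4*dcoef) + alph*log(n*delt).
    """
    xs = [math.log(n * delt) for n in lags]
    ys = [math.log(m) for m in msds]
    count = len(xs)
    xbar = sum(xs) / count
    ybar = sum(ys) / count
    # Ordinary least-squares slope and intercept.
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    alph = sxy / sxx
    intercept = ybar - alph * xbar
    dcoef = math.exp(intercept) / 4.0
    return alph, dcoef


# A particle moving one unit in X and one unit in Y per frame has MSD(n) = 2*n**2.
lags = [1, 2, 3, 4]
msds = [2.0 * n ** 2 for n in lags]
alph, dcoef = fit_alpha(lags, msds)
print(alph, dcoef)
```

For this straight-line input the sketch recovers an exponent of about 2 and a coefficient of about 0.5, in line with the first example above. Lag 0 is left out because its logarithm is undefined.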
diff_classifier.features.aspectratio(track)

Calculates the aspect ratio of the rectangle containing the input track.

Parameters:
track : pandas.core.frame.DataFrame

At a minimum, must contain an X and Y column. The function msd_calc can be used to generate the correctly formatted pd dataframe.

Returns:
aspratio : numpy.float64

Aspect ratio of the trajectory. Always >= 1.

elong : numpy.float64

Elongation of the trajectory. A transformation of the aspect ratio given by 1 - aspratio**-1.

Examples

>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.linspace(1, frames, frames)+5,
...          'Y': np.linspace(1, frames, frames)+3}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> aspectratio(dframe)
(5732146505273195.0, 0.99999999999999978)
>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.sin(np.linspace(1, frames, frames))+3,
...          'Y': np.cos(np.linspace(1, frames, frames))+3}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> aspectratio(dframe)
(1.0997501702946164, 0.090702573174318291)
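The relationship between the two return values can be checked by hand. This is a minimal sketch, assuming the aspect ratio is the longer side of the bounding rectangle divided by the shorter one (which is what makes it always >= 1); the helper name aspect_and_elongation is hypothetical. The side lengths are the width and height that the minboundrect example reports for the same circular track.

```python
def aspect_and_elongation(width, height):
    """Hypothetical helper: aspect ratio and elongation from rectangle sides."""
    long_side = max(width, height)
    short_side = min(width, height)
    aspratio = long_side / short_side  # undefined for a zero-width rectangle
    elong = 1.0 - 1.0 / aspratio
    return aspratio, elong


aspratio, elong = aspect_and_elongation(1.9949899732081091, 1.8140392491811692)
print(aspratio, elong)
```

A perfectly straight track has a bounding rectangle whose short side is close to zero, which is why the first example above returns a very large aspect ratio and an elongation near 1.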
diff_classifier.features.asymmetry(track)

Calculates the asymmetry of the trajectory.

Parameters:
track : pandas DataFrame

At a minimum, must contain an X and Y column. The function msd_calc can be used to generate the correctly formatted pd dataframe.

Returns:
eig1 : numpy.float64

Dominant eigenvalue of the gyration tensor.

eig2 : numpy.float64

Secondary eigenvalue of the gyration tensor.

asym1 : numpy.float64

Asymmetry of the input track. Equal to 0 for circularly symmetric tracks, and 1 for linear tracks.

asym2 : numpy.float64

Alternate definition of asymmetry. Equal to 1 for circularly symmetric tracks, and 0 for linear tracks.

asym3 : numpy.float64

Alternate definition of asymmetry.

Examples

>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.linspace(1, frames, frames)+5,
...          'Y': np.linspace(1, frames, frames)+3}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> asymmetry(dframe)
(16.5, 0.0, 1.0, 0.0, 0.69314718055994529)
>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.sin(np.linspace(1, frames, frames)+3),
...          'Y': np.cos(np.linspace(1, frames, frames)+3)}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> asymmetry(dframe)
(0.53232560128104522,
0.42766829138901619,
0.046430119259539708,
0.80339606128247354,
0.0059602683290953052)
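The three asymmetry values can be related to the two eigenvalues. The formulas in this sketch are an assumption: they are not stated on this page, but they reproduce both example outputs above. The helper name asymmetry_from_eigs is hypothetical.

```python
import math


def asymmetry_from_eigs(eig1, eig2):
    """Hypothetical helper: asymmetry measures from gyration tensor eigenvalues."""
    # Assumed definitions, chosen because they match the documented examples.
    asym1 = (eig1 ** 2 - eig2 ** 2) ** 2 / (eig1 ** 2 + eig2 ** 2) ** 2
    asym2 = eig2 / eig1
    asym3 = -math.log(1.0 - (eig1 - eig2) ** 2 / (2.0 * (eig1 + eig2) ** 2))
    return asym1, asym2, asym3


# Linear track from the first example: eigenvalues 16.5 and 0.0.
linear = asymmetry_from_eigs(16.5, 0.0)
# Circular track from the second example.
circular = asymmetry_from_eigs(0.53232560128104522, 0.42766829138901619)
print(linear)
print(circular)
```

Under these definitions asym1 goes to 1 and asym2 goes to 0 as the secondary eigenvalue vanishes, consistent with the descriptions of the return values.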
diff_classifier.features.boundedness(track, framerate=1)

Calculates the boundedness, fractal dimension, and trappedness of the input track.

Parameters:
track : pandas.core.frame.DataFrame

At a minimum, must contain a Frame and an MSDs column. The function msd_calc can be used to generate the correctly formatted pandas DataFrame.

framerate : int or float

Framerate of the video being analyzed. It cancels out of the calculation, so the value does not affect the result. Default is 1.

Returns:
bound : float

Boundedness of the input track. Quantifies how much a particle with diffusion coefficient dcoef is restricted by a circular confinement of radius rad when it diffuses for a time duration N*delt. Defined as bound = dcoef*N*delt/rad**2. For this case, dcoef is the short time diffusion coefficient (after 2 frames), and rad is half the maximum distance between any two positions.

fractd : float

The fractal path dimension, defined as fractd = log(N)/log(N*d*L**-1), where L is the total path length (the sum over all step lengths), N is the number of steps, and d is the largest distance between any two positions.

probf : float

The probability that a particle with diffusion coefficient dcoef, traced for a period of time N*delt, is trapped in a region of radius r0. Given by probf = 1 - exp(0.2048 - 0.25117*(dcoef*N*delt/r0**2)). For this case, dcoef is the short time diffusion coefficient, and r0 is half the maximum distance between any two positions.

Examples

>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.linspace(1, frames, frames)+5,
...          'Y': np.linspace(1, frames, frames)+3}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> boundedness(dframe)
(1.0, 1.0000000000000002, 0.045311337970735499)
>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.sin(np.linspace(1, frames, frames)+3),
...          'Y': np.cos(np.linspace(1, frames, frames)+3)}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> boundedness(dframe)
(0.96037058689895005, 2.9989749477908401, 0.03576118370932313)
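The trappedness and fractal dimension formulas given under Returns can be evaluated directly. This is a minimal sketch with hypothetical helper names (trappedness, fractal_dimension). It takes the boundedness value and the path geometry as inputs and does not compute them from a track.

```python
import math


def trappedness(bound):
    """Hypothetical helper: probf from bound = dcoef*N*delt/r0**2."""
    return 1.0 - math.exp(0.2048 - 0.25117 * bound)


def fractal_dimension(nsteps, maxdist, pathlen):
    """Hypothetical helper: log(N)/log(N*d/L) for N steps, largest distance d, path length L."""
    return math.log(nsteps) / math.log(nsteps * maxdist / pathlen)


# A boundedness of 1.0, as in the first example above.
probf = trappedness(1.0)
# A straight track of 9 diagonal unit steps: the largest distance equals the path length.
step = math.sqrt(2.0)
fractd = fractal_dimension(9, 9 * step, 9 * step)
print(probf, fractd)
```

For a straight track the largest distance equals the total path length, so the fractal dimension comes out at approximately 1. Tracks that double back on themselves have a shorter largest distance and a higher fractal dimension.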
diff_classifier.features.calculate_features(dframe, framerate=1, frame=(10, 100))

Calculates multiple features from input MSD dataset and stores in pandas dataframe.

Parameters:
dframe : pandas.core.frame.DataFrame

Output from msd.all_msds2. Must have at a minimum the following columns: Track_ID, Frame, X, Y, and MSDs.

framerate : int or float

Framerate of the input videos from which trajectories were calculated. Required for accurate calculation of some features. Default is 1. Possibly not required. Ignore if performing all calculations without units.

frame : tuple of int

Frames at which to calculate Deff. Default is (10, 100).

Returns:
datai : pandas.core.frame.DataFrame

Contains a row for each trajectory in dframe. Holds the following features of each trajectory: Track_ID, alpha, D_fit, kurtosis, asymmetry1, asymmetry2, asymmetry3, aspect ratio (AR), elongation, boundedness, fractal dimension (fractal_dim), trappedness, efficiency, straightness, MSD ratio, frames, X, and Y.

Examples

See example outputs from individual feature functions.

diff_classifier.features.efficiency(track)

Calculates the efficiency and straightness of the input track.

Parameters:
track : pandas.core.frame.DataFrame

At a minimum, must contain a Frame and an MSDs column. The function msd_calc can be used to generate the correctly formatted pandas DataFrame.

Returns:
eff : float

Efficiency of the input track. Relates the squared net displacement to the sum of squared step lengths. Based on Helmuth et al. (2007) and defined as: E = |xpos(N-1)-xpos(0)|**2/SUM(|xpos(i) - xpos(i-1)|**2)

strait : float

Straightness of the input track. Relates the net displacement netdisp to the sum of step lengths and is defined as: S = |xpos(N-1)-xpos(0)|/SUM(|xpos(i) - xpos(i-1)|)

Examples

>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.linspace(1, frames, frames)+5,
...          'Y': np.linspace(1, frames, frames)+3}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> ft.efficiency(dframe)
(9.0, 0.9999999999999999)
>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.sin(np.linspace(1, frames, frames))+3,
...          'Y': np.cos(np.linspace(1, frames, frames))+3}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> ft.efficiency(dframe)
(0.46192924086141945, 0.22655125514290225)
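The two definitions under Returns can be worked through on plain lists of coordinates. This is a minimal sketch that follows the formulas as written. The helper name efficiency_sketch is hypothetical and the code is not taken from the module.

```python
import math


def efficiency_sketch(xs, ys):
    """Hypothetical helper: efficiency E and straightness S of a 2D track."""
    # Squared net displacement between the first and last positions.
    net_sq = (xs[-1] - xs[0]) ** 2 + (ys[-1] - ys[0]) ** 2
    # Squared length of each individual step.
    steps_sq = [
        (xs[i] - xs[i - 1]) ** 2 + (ys[i] - ys[i - 1]) ** 2
        for i in range(1, len(xs))
    ]
    eff = net_sq / sum(steps_sq)
    strait = math.sqrt(net_sq) / sum(math.sqrt(s) for s in steps_sq)
    return eff, strait


# Same straight diagonal track as the first example: 10 points, unit steps in X and Y.
xs = [float(i) + 5 for i in range(1, 11)]
ys = [float(i) + 3 for i in range(1, 11)]
eff, strait = efficiency_sketch(xs, ys)
print(eff, strait)
```

With 9 equal steps the squared net displacement is 81 times the squared step length while the denominator is only 9 times it, so the efficiency of a straight track equals the number of steps (9.0 here) rather than 1. The straightness is approximately 1.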
diff_classifier.features.feature_violin(tgroups, feature='boundedness', labels=['sample 1', 'sample 2', 'sample 3'], points=40, ylim=[0, 1], nticks=11)

Plots violin plots of features in comparison groups

Parameters:
tgroups : dict of pandas.core.frame.DataFrame

Dictionary containing pandas dataframes containing trajectory features of subgroups to be plotted

feature : string

Feature to be compared

labels : list of strings

Labels of subgroups to be plotted.

points : int

Determines resolution of violin plot

ylim : list of int

Y range of output plot

diff_classifier.features.gyration_tensor(track)

Calculates the eigenvalues and eigenvectors of the gyration tensor of the input trajectory.

Parameters:
track : pandas DataFrame

At a minimum, must contain an X and Y column. The function msd_calc can be used to generate the correctly formatted pd dataframe.

Returns:
eig1 : numpy.float64

Dominant eigenvalue of the gyration tensor.

eig2 : numpy.float64

Secondary eigenvalue of the gyration tensor.

eigv1 : numpy.ndarray

Dominant eigenvector of the gyration tensor.

eigv2 : numpy.ndarray

Secondary eigenvector of the gyration tensor.

Examples

>>> frames = 5
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.linspace(1, frames, frames)+5,
...          'Y': np.linspace(1, frames, frames)+3}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> gyration_tensor(dframe)
(4.0,
4.4408920985006262e-16,
array([ 0.70710678, -0.70710678]),
array([ 0.70710678,  0.70710678]))
>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.sin(np.linspace(1, frames, frames)+3),
...          'Y': np.cos(np.linspace(1, frames, frames)+3)}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> gyration_tensor(dframe)
(0.53232560128104522,
0.42766829138901619,
array([ 0.6020119 , -0.79848711]),
array([-0.79848711, -0.6020119 ]))
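The eigenvalues in the first example can be reproduced by hand. This is a minimal sketch, assuming the gyration tensor is the population covariance matrix of the X and Y positions (normalized by the number of points); that assumption is consistent with the eigenvalues of 4.0 and approximately 0 shown above. The helper name gyration_eigs is hypothetical, and the closed form for a symmetric 2x2 matrix stands in for a general eigensolver.

```python
import math


def gyration_eigs(xs, ys):
    """Hypothetical helper: eigenvalues of the 2x2 gyration tensor, largest first."""
    npts = len(xs)
    xbar = sum(xs) / npts
    ybar = sum(ys) / npts
    # Tensor entries: variances on the diagonal, covariance off the diagonal.
    txx = sum((x - xbar) ** 2 for x in xs) / npts
    tyy = sum((y - ybar) ** 2 for y in ys) / npts
    txy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / npts
    # Closed-form eigenvalues of the symmetric matrix [[txx, txy], [txy, tyy]].
    mean = (txx + tyy) / 2.0
    half = math.sqrt(((txx - tyy) / 2.0) ** 2 + txy ** 2)
    return mean + half, mean - half


# Five-point diagonal track from the first example.
xs = [6.0, 7.0, 8.0, 9.0, 10.0]
ys = [4.0, 5.0, 6.0, 7.0, 8.0]
eig1, eig2 = gyration_eigs(xs, ys)
print(eig1, eig2)
```

A vanishing secondary eigenvalue means all positions lie on a single line, which is the case the asymmetry function reports as fully linear.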
diff_classifier.features.kurtosis(track)

Calculates the kurtosis of input track.

Parameters:
track : pandas.core.frame.DataFrame

At a minimum, must contain an X and Y column. The function msd_calc can be used to generate the correctly formatted pd dataframe.

Returns:
kurt : numpy.float64

Kurtosis of the input track. Calculation based on projected 2D positions on the dominant eigenvector of the radius of gyration tensor.

Examples

>>> frames = 5
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.linspace(1, frames, frames)+5,
...          'Y': np.linspace(1, frames, frames)+3}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> kurtosis(dframe)
2.5147928994082829
>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.sin(np.linspace(1, frames, frames)+3),
...          'Y': np.cos(np.linspace(1, frames, frames)+3)}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> kurtosis(dframe)
1.8515139698652476
diff_classifier.features.minboundrect(track)

Calculates the minimum bounding rectangle of an input trajectory.

Parameters:
track : pandas.core.frame.DataFrame

At a minimum, must contain an X and Y column. The function msd_calc can be used to generate the correctly formatted pd dataframe.

Returns:
rot_angle : numpy.float64

Angle of rotation of the bounding box.

area : numpy.float64

Area of the bounding box.

width : numpy.float64

Width of the bounding box.

height : numpy.float64

Height of the bounding box.

center_point : numpy.ndarray

Center point of the bounding box.

corner_pts : numpy.ndarray

Corner points of the bounding box.

Notes

Based on code from the following repo: https://github.com/dbworth/minimum-area-bounding-rectangle/blob/master/python/min_bounding_rect.py

Examples

>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.linspace(1, frames, frames)+5,
...          'Y': np.linspace(1, frames, frames)+3}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> minboundrect(dframe)
(-2.3561944901923448,
2.8261664256307952e-14,
12.727922061357855,
2.2204460492503131e-15,
array([ 10.5,   8.5]),
array([[  6.,   4.],
       [ 15.,  13.],
       [ 15.,  13.],
       [  6.,   4.]]))
>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.sin(np.linspace(1, frames, frames))+3,
...          'Y': np.cos(np.linspace(1, frames, frames))+3}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> minboundrect(dframe)
(0.78318530717958657,
3.6189901131223992,
1.9949899732081091,
1.8140392491811692,
array([ 3.02076903,  2.97913884]),
array([[ 4.3676025 ,  3.04013439],
       [ 2.95381341,  1.63258851],
       [ 1.67393557,  2.9181433 ],
       [ 3.08772466,  4.32568917]]))
diff_classifier.features.msd_ratio(track, fram1=3, fram2=100)

Calculates the MSD ratio of the input track at the specified frames.

Parameters:
track : pandas.core.frame.DataFrame

At a minimum, must contain a Frame and an MSDs column. The function msd_calc can be used to generate the correctly formatted pandas DataFrame.

fram1 : int

First frame at which to calculate the MSD ratio.

fram2 : int

Last frame at which to calculate the MSD ratio.

Returns:
ratio : numpy.float64

MSD ratio as defined by [MSD(fram1)/MSD(fram2)] - [fram1/fram2] where fram1 < fram2. For Brownian motion, it is 0; for restricted motion it is < 0. For directed motion it is > 0.

Examples

>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.linspace(1, frames, frames)+5,
...          'Y': np.linspace(1, frames, frames)+3}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> ft.msd_ratio(dframe, 1, 9)
-0.18765432098765433
>>> frames = 10
>>> data1 = {'Frame': np.linspace(1, frames, frames),
...          'X': np.sin(np.linspace(1, frames, frames))+3,
...          'Y': np.cos(np.linspace(1, frames, frames))+3}
>>> dframe = pd.DataFrame(data=data1)
>>> dframe['MSDs'], dframe['Gauss'] = msd_calc(dframe)
>>> ft.msd_ratio(dframe, 1, 9)
0.04053708075268797
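The statement that the ratio is 0 for Brownian motion follows from the definition, since a Brownian MSD grows linearly with the frame lag. This is a minimal sketch of that arithmetic only. The names msd_ratio_sketch and brownian_msd are hypothetical, and how the module itself indexes frames into the MSDs column is not specified here.

```python
def msd_ratio_sketch(msd_at, fram1, fram2):
    """Hypothetical helper: [MSD(fram1)/MSD(fram2)] - [fram1/fram2]."""
    return msd_at(fram1) / msd_at(fram2) - fram1 / fram2


def brownian_msd(lag, dcoef=0.5):
    """Idealized 2D Brownian MSD at a given frame lag: 4*dcoef*lag."""
    return 4.0 * dcoef * lag


ratio = msd_ratio_sketch(brownian_msd, 3, 100)
print(ratio)
```

Because the MSD is proportional to the lag, the first term reduces to fram1/fram2 and the two terms cancel.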
diff_classifier.features.unmask_track(track)

Removes empty frames from the input trajectory dataset.

Parameters:
track : pandas.core.frame.DataFrame

At a minimum, must contain a Frame, Track_ID, X, Y, MSDs, and Gauss column.

Returns:
comp_track : pandas.core.frame.DataFrame

Similar to track, but has all masked components removed.