diff_classifier.msd

Functions to calculate mean squared displacements from trajectory data

This module includes functions to calculate mean squared displacements and additional measures from input trajectory datasets as produced by the TrackMate ImageJ plugin.

class diff_classifier.msd.Bunch(**kwds)[source]

Bases: object

diff_classifier.msd.all_msds(data)[source]

Calculates mean squared displacements of a trajectory dataset

Returns a pandas dataframe containing MSD data of all tracks in the input trajectory dataframe.

Parameters:
data : pandas.core.frame.DataFrame

Contains, at a minimum, ‘Frame’, ‘Track_ID’, ‘X’, and ‘Y’ columns. Note: this function assumes that frames begin at 1, not 0. Adjust the frame numbering before passing data to the function.

Returns:
new_data : pandas.core.frame.DataFrame

Similar to input data. All missing frames of individual trajectories are filled in with NaNs, and two new columns, MSDs and Gauss, are added:

MSDs, calculated mean squared displacements using the formula MSD = <(xpos-x0)**2>

Gauss, calculated Gaussianity

Examples

>>> data1 = {'Frame': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
...          'Track_ID': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
...          'X': [5, 6, 7, 8, 9, 1, 2, 3, 4, 5],
...          'Y': [6, 7, 8, 9, 10, 2, 3, 4, 5, 6]}
>>> df = pd.DataFrame(data=data1)
>>> all_msds(df)
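The note above that frames must begin at 1 can be handled by shifting the 'Frame' values before the call. The following is a minimal sketch on a plain dict of columns; the helper name shift_frames_to_one is hypothetical and is not part of diff_classifier.

```python
def shift_frames_to_one(columns):
    """Return a copy of the column dict whose 'Frame' values start at 1.

    Hypothetical helper for illustration, not part of diff_classifier.
    """
    offset = 1 - min(columns['Frame'])
    shifted = dict(columns)  # shallow copy; only 'Frame' is replaced
    shifted['Frame'] = [frame + offset for frame in columns['Frame']]
    return shifted


zero_based = {'Frame': [0, 1, 2, 0, 1, 2],
              'Track_ID': [1, 1, 1, 2, 2, 2],
              'X': [5, 6, 7, 1, 2, 3],
              'Y': [6, 7, 8, 2, 3, 4]}
one_based = shift_frames_to_one(zero_based)
print(one_based['Frame'])  # [1, 2, 3, 1, 2, 3]
```

With a pandas dataframe, the same shift can be written as df['Frame'] = df['Frame'] - df['Frame'].min() + 1.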
diff_classifier.msd.all_msds2(data, frames=651)[source]

Calculates mean squared displacements of input trajectory dataset

Returns a pandas dataframe containing MSD data of all tracks in the input trajectory dataframe.

Parameters:
data : pandas.core.frame.DataFrame

Contains, at a minimum, ‘Frame’, ‘Track_ID’, ‘X’, and ‘Y’ columns. Note: this function assumes that frames begin at 0.

Returns:
new_data : pandas.core.frame.DataFrame

Similar to input data. All missing frames of individual trajectories are filled in with NaNs, and two new columns, MSDs and Gauss, are added:

MSDs, calculated mean squared displacements using the formula MSD = <(xpos-x0)**2>

Gauss, calculated Gaussianity

Examples

>>> data1 = {'Frame': [0, 1, 2, 3, 4, 0, 1, 2, 3, 4],
...          'Track_ID': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
...          'X': [5, 6, 7, 8, 9, 1, 2, 3, 4, 5],
...          'Y': [6, 7, 8, 9, 10, 2, 3, 4, 5, 6]}
>>> df = pd.DataFrame(data=data1)
>>> cols = ['Frame', 'Track_ID', 'X', 'Y', 'MSDs', 'Gauss']
>>> length = max(df['Frame']) + 1
>>> msd.all_msds2(df, frames=length)[cols]
diff_classifier.msd.binning(experiments, wells=4, prefix='test')[source]

Split set of input experiments into groups.

Parameters:
experiments : list of str

List of experiment names.

wells : int

Number of groups to divide experiments into.

Returns:
slices : int

Number of experiments per group.

bins : dict of list of str

Dictionary, keys corresponding to group names, and elements containing lists of experiments in each group.

bin_names : list of str

List of group names
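As a rough illustration of the three return values, the sketch below splits a list of experiment names into equal, consecutive groups. It is not the library code: the function name split_into_groups and the '<prefix>_W<n>' group naming are assumptions made for this example, and any leftover experiments are simply dropped.

```python
def split_into_groups(experiments, wells=4, prefix='test'):
    """Split experiment names into `wells` equal, consecutive groups.

    Sketch only: the '<prefix>_W<n>' group names are an assumption, and
    leftover experiments (count not divisible by wells) are dropped.
    """
    slices = len(experiments) // wells  # experiments per group
    bins = {}
    bin_names = []
    for well in range(wells):
        name = '{}_W{}'.format(prefix, well)
        bins[name] = experiments[well * slices:(well + 1) * slices]
        bin_names.append(name)
    return slices, bins, bin_names


experiments = ['P1_S1', 'P1_S2', 'P1_S3', 'P1_S4',
               'P1_S5', 'P1_S6', 'P1_S7', 'P1_S8']
slices, bins, bin_names = split_into_groups(experiments, wells=4, prefix='P1')
print(slices)         # 2
print(bins['P1_W0'])  # ['P1_S1', 'P1_S2']
```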

diff_classifier.msd.geomean_msdisp(prefix, umppx=0.16, fps=100.02, upload=True, remote_folder='01_18_Experiment', bucket='ccurtis.data', backup_frames=651)[source]

Computes geometric averages of mean squared displacement datasets

Calculates geometric averages and standard errors for MSD datasets. May raise an error if the input is not formatted as output from all_msds2.

Parameters:
prefix : string

Prefix of file name to be plotted e.g. features_P1.csv prefix is P1.

umppx : float

Microns per pixel of original images.

fps : float

Frames per second of video.

upload : bool

True if you want to upload to s3.

remote_folder : string

Folder in S3 bucket to upload to.

bucket : string

Name of S3 bucket to upload to.

Returns:
geo_mean : numpy.ndarray

Geometric mean of trajectory MSDs at all time points.

geo_stder : numpy.ndarray

Geometric standard error of trajectory MSDs at all time points.
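A geometric average is the exponential of the arithmetic mean of the logarithms. The sketch below shows that calculation for MSDs at each time point, together with a standard error computed in log space. It makes several assumptions that this page does not confirm: NaN and non-positive values are skipped, the sample standard deviation (n - 1) is used, and the function name geometric_mean_and_sem is hypothetical. Whether geomean_msdisp itself returns values in log space is not stated here.

```python
import math


def geometric_mean_and_sem(msd_rows):
    """Geometric mean and log-space standard error at each time point.

    msd_rows[t] holds the MSDs of all trajectories at time point t.
    NaN and non-positive values are skipped (assumption of this sketch).
    """
    geo_mean, log_sem = [], []
    for row in msd_rows:
        logs = [math.log(v) for v in row if not math.isnan(v) and v > 0]
        n = len(logs)
        if n < 2:
            # Too few values for a standard error
            geo_mean.append(math.exp(logs[0]) if n == 1 else float('nan'))
            log_sem.append(float('nan'))
            continue
        mean_log = sum(logs) / n
        variance = sum((v - mean_log) ** 2 for v in logs) / (n - 1)
        geo_mean.append(math.exp(mean_log))
        log_sem.append(math.sqrt(variance / n))
    return geo_mean, log_sem


nan = float('nan')
msds = [[1.0, 4.0, nan],   # time point 1: two valid trajectories
        [2.0, 8.0, 4.0]]   # time point 2: three valid trajectories
geo_mean, log_sem = geometric_mean_and_sem(msds)
print(geo_mean)  # approximately [2.0, 4.0]
```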

diff_classifier.msd.make_xyarray(data, length=651)[source]

Rearranges xy position data into 2d arrays

Rearranges xy data from input pandas dataframe into 2D numpy array.

Parameters:
data : pd.core.frame.DataFrame

Contains, at a minimum, ‘Frame’, ‘Track_ID’, ‘X’, and ‘Y’ columns.

length : int

Desired length or number of frames to which to extend trajectories. Any trajectories shorter than the input length will have the extra space filled in with NaNs.

Returns:
xyft : dict of np.ndarray

Dictionary containing xy position data, frame data, and trajectory ID data. Contains the following keys:

farray, frames data (length x particles)

tarray, trajectory ID data (length x particles)

xarray, x position data (length x particles)

yarray, y position data (length x particles)

Examples

>>> data1 = {'Frame': [0, 1, 2, 3, 4, 2, 3, 4, 5, 6],
...          'Track_ID': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
...          'X': [5, 6, 7, 8, 9, 1, 2, 3, 4, 5],
...          'Y': [6, 7, 8, 9, 10, 2, 3, 4, 5, 6]}
>>> df = pd.DataFrame(data=data1)
>>> length = max(df['Frame']) + 1
>>> xyft = msd.make_xyarray(df, length=length)
{'farray': array([[0., 0.],
           [1., 1.],
           [2., 2.],
           [3., 3.],
           [4., 4.],
           [5., 5.],
           [6., 6.]]),
 'tarray': array([[1., 2.],
           [1., 2.],
           [1., 2.],
           [1., 2.],
           [1., 2.],
           [1., 2.],
           [1., 2.]]),
 'xarray': array([[ 5., nan],
           [ 6., nan],
           [ 7.,  1.],
           [ 8.,  2.],
           [ 9.,  3.],
           [nan,  4.],
           [nan,  5.]]),
 'yarray': array([[ 6., nan],
           [ 7., nan],
           [ 8.,  2.],
           [ 9.,  3.],
           [10.,  4.],
           [nan,  5.],
           [nan,  6.]])}
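The rearrangement into (length x particles) grids can be sketched with plain lists. This is an illustration under stated assumptions, not the library implementation: frames are taken to be 0-based integers, columns are ordered by sorted Track_ID, and the function name to_xy_grids is hypothetical.

```python
def to_xy_grids(frames, track_ids, xs, ys, length):
    """Arrange per-row trajectory data into (length x particles) grids.

    Plain-list sketch, not the library implementation. Cells with no
    observation are NaN; frames are assumed to be 0-based integers.
    """
    nan = float('nan')
    tracks = sorted(set(track_ids))
    column = {track: i for i, track in enumerate(tracks)}
    xgrid = [[nan] * len(tracks) for _ in range(length)]
    ygrid = [[nan] * len(tracks) for _ in range(length)]
    for frame, track, x, y in zip(frames, track_ids, xs, ys):
        if frame < length:  # drop observations beyond the requested length
            xgrid[frame][column[track]] = float(x)
            ygrid[frame][column[track]] = float(y)
    return xgrid, ygrid


frames = [0, 1, 2, 3, 4, 2, 3, 4, 5, 6]
track_ids = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
xs = [5, 6, 7, 8, 9, 1, 2, 3, 4, 5]
ys = [6, 7, 8, 9, 10, 2, 3, 4, 5, 6]
xgrid, ygrid = to_xy_grids(frames, track_ids, xs, ys, length=max(frames) + 1)
print(xgrid[2])  # [7.0, 1.0]
```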
diff_classifier.msd.msd_calc(track, length=10)[source]

Calculates mean squared displacement of input track.

Returns a pandas dataframe containing MSD data calculated from an individual track.

Parameters:
track : pandas.core.frame.DataFrame

Contains, at a minimum, ‘Frame’, ‘X’, and ‘Y’ columns.

Returns:
new_track : pandas.core.frame.DataFrame

Similar to input track. All missing frames of individual trajectories are filled in with NaNs, and two new columns, MSDs and Gauss, are added:

MSDs, calculated mean squared displacements using the formula MSD = <(xpos-x0)**2>

Gauss, calculated Gaussianity

Examples

>>> data1 = {'Frame': [1, 2, 3, 4, 5],
...          'X': [5, 6, 7, 8, 9],
...          'Y': [6, 7, 8, 9, 10]}
>>> df = pd.DataFrame(data=data1)
>>> new_track = msd.msd_calc(df, 5)
>>> data1 = {'Frame': [1, 2, 3, 4, 5],
...          'X': [5, 6, 7, 8, 9],
...          'Y': [6, 7, 8, 9, 10]}
>>> df = pd.DataFrame(data=data1)
>>> new_track = msd.msd_calc(df)
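To make the MSD formula concrete, the sketch below computes a time-averaged MSD for the track used in the examples. It assumes that the average in MSD = <(xpos-x0)**2> is taken over every pair of points separated by the same lag, and that the track has no missing frames; this page does not confirm either point. The function name msd_by_lag is hypothetical, and Gaussianity is not computed.

```python
def msd_by_lag(xs, ys):
    """Time-averaged MSD of one gap-free track for lags 0..len-1.

    Sketch under an assumption: the average is taken over every pair
    of points separated by the same lag.
    """
    msds = []
    for lag in range(len(xs)):
        squared = [(xs[i + lag] - xs[i]) ** 2 + (ys[i + lag] - ys[i]) ** 2
                   for i in range(len(xs) - lag)]
        msds.append(sum(squared) / len(squared))
    return msds


msds = msd_by_lag([5, 6, 7, 8, 9], [6, 7, 8, 9, 10])
print(msds)  # [0.0, 2.0, 8.0, 18.0, 32.0]
```

The track moves one unit in x and one in y per frame, so a lag of n frames gives a squared displacement of 2 * n**2.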
diff_classifier.msd.nth_diff(dataframe, n=1, axis=0)[source]

Calculates the nth difference between vector elements

Returns a new vector of size N - n containing the nth difference between vector elements.

Parameters:
dataframe : pandas.core.series.Series of int or float

Input data on which differences are to be calculated.

n : int

Lag between elements. The function calculates xpos(i) - xpos(i - n) for all values in the pandas series.

axis : {0, 1}

Axis along which differences are to be calculated. Default is 0. If 0, input must be a pandas series. If 1, input must be a numpy array.

Returns:
diff : pandas.core.series.Series of int or float

Pandas series of size N - n, where N is the original size of dataframe.

Examples

>>> df = np.ones((5, 10))
>>> nth_diff(df, axis=1)
array([[0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0.]])
>>> df = np.ones((5, 10))
>>> nth_diff(df)
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
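The difference xpos(i) - xpos(i - n) can be sketched on plain lists as follows. The names nth_difference and nth_difference_2d are hypothetical stand-ins for illustration, not the library functions; the 2D version reproduces the two output shapes shown in the examples above.

```python
def nth_difference(values, n=1):
    """Return values[i] - values[i - n] for i = n .. len(values) - 1."""
    return [later - earlier for earlier, later in zip(values, values[n:])]


def nth_difference_2d(rows, n=1, axis=0):
    """Apply nth_difference down the rows (axis=0) or along each row (axis=1)."""
    if axis == 1:
        return [nth_difference(row, n) for row in rows]
    columns = [nth_difference(list(col), n) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


xpos = [1, 4, 9, 16, 25]
print(nth_difference(xpos, n=1))  # [3, 5, 7, 9]
print(nth_difference(xpos, n=2))  # [8, 12, 16]

ones = [[1.0] * 10 for _ in range(5)]
along_rows = nth_difference_2d(ones, axis=1)   # 5 rows of 9 zeros
down_rows = nth_difference_2d(ones, axis=0)    # 4 rows of 10 zeros
```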
diff_classifier.msd.plot_all_experiments(experiments, bucket='ccurtis.data', folder='test', yrange=(0.1, 10), fps=100.02, xrange=(0.01, 1), upload=True, outfile='test.png', exponential=True)[source]

Plots precision-weighted averages of MSD datasets.

Plots pre-calculated precision-weighted averages of MSD datasets calculated from precision_averaging and stored in an AWS S3 bucket.

Parameters:
experiments : list of str

List of experiment names to plot. Each experiment must have an MSD and SEM file associated with it in s3.

bucket : str

S3 bucket from which to download data.

folder : str

Folder in s3 bucket from which to download data.

yrange : list of float

Y range of plot.

xrange : list of float

X range of plot.

upload : bool

True to upload to S3

outfile : str

Filename of output image

diff_classifier.msd.precision_averaging(group, geomean, geo_stder, weights, save=True, bucket='ccurtis.data', folder='test', experiment='test')[source]

Calculates precision-weighted averages of MSD datasets.

Parameters:
group : list of str

List of experiment names to average. Each element corresponds to a key in geo_stder and geomean.

geomean : dict of numpy.ndarray

Each entry in dictionary corresponds to an MSD profile, the key corresponding to an experiment name.

geo_stder : dict of numpy.ndarray

Each entry in dictionary corresponds to the standard errors of an MSD profile, the key corresponding to an experiment name.

weights : numpy.ndarray

Precision weights to be used in precision averaging.

Returns:
geo : numpy.ndarray

Precision-weighted averaged MSDs from experiments specified in group

geo_stder : numpy.ndarray

Precision-weighted averaged SEMs from experiments specified in group
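Precision weighting is commonly done by inverse-variance weighting: each value is weighted by 1/SE**2, and the standard error of the weighted mean is sqrt(1 / sum of weights). The sketch below applies this at a single time point. It assumes that precision_averaging follows this standard scheme, which this page does not spell out, and the function name precision_average is hypothetical.

```python
import math


def precision_average(means, std_errors):
    """Inverse-variance weighted mean and its standard error.

    Sketch of standard precision weighting at one time point; assumed,
    not confirmed, to match what precision_averaging does.
    """
    precisions = [1.0 / se ** 2 for se in std_errors]
    total = sum(precisions)
    average = sum(p * m for p, m in zip(precisions, means)) / total
    return average, math.sqrt(1.0 / total)


# Two experiments: the second has half the standard error, so four times the weight
average, std_error = precision_average([1.0, 2.0], [1.0, 0.5])
print(average)    # close to 1.8
print(std_error)  # close to 0.447
```

The more precise experiment pulls the average toward its own value, and the combined standard error is smaller than either input.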

diff_classifier.msd.precision_weight(group, geo_stder)[source]

Calculates precision-based weights from input standard error data

Calculates precision weights to be used in precision-averaged MSD calculations.

Parameters:
group : list of str

List of experiment names to average. Each element corresponds to a key in geo_stder and geomean.

geo_stder : dict of numpy.ndarray

Each entry in dictionary corresponds to the standard errors of an MSD profile, the key corresponding to an experiment name.

Returns:
weights : numpy.ndarray

Precision weights to be used in precision averaging.

w_holder : numpy.ndarray

Precision values of each video at each time point.
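One way to picture the two return values is the sketch below, which computes a precision of 1/SE**2 for each experiment at each time point and sums them across experiments. The inverse-variance formula, the treatment of NaN or zero standard errors as zero precision, and the use of a dict of lists instead of numpy arrays are all assumptions of this example; the function name precision_weights is hypothetical.

```python
import math


def precision_weights(group, geo_stder):
    """Per-experiment precisions 1/SE**2 and their sum at each time point.

    Sketch assuming inverse-variance weighting; NaN or zero standard
    errors contribute no precision.
    """
    w_holder = {}
    for name in group:
        w_holder[name] = [0.0 if (math.isnan(se) or se == 0) else 1.0 / se ** 2
                          for se in geo_stder[name]]
    # Sum the precisions of all experiments at each time point
    weights = [sum(column) for column in zip(*(w_holder[name] for name in group))]
    return weights, w_holder


group = ['exp1', 'exp2']
geo_stder = {'exp1': [0.5, 1.0],
             'exp2': [0.25, float('nan')]}
weights, w_holder = precision_weights(group, geo_stder)
print(weights)  # [20.0, 1.0]
```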

diff_classifier.msd.random_traj_dataset(nframes=100, nparts=30, seed=1, fsize=(0, 512), ndist=(1, 2))[source]

Creates a random population of random walks.

Parameters:
nframes : int

Number of frames for each random trajectory.

nparts : int

Number of particles in trajectory dataset.

seed : int

Seed for pseudo-random number generator for reproducibility.

fsize : tuple of int or float

Range of coordinates within which particles may start.

ndist : tuple of int or float

Parameters to generate normal distribution, mu and sigma.

Returns:
dataf : pandas.core.frame.DataFrame

Trajectory data containing a ‘Frame’, ‘Track_ID’, ‘X’, and ‘Y’ column.
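The parameters above can be read as a recipe for a synthetic dataset. The sketch below is one possible interpretation using only the standard library, not the library implementation. Its assumptions: each particle starts at a uniform random position within fsize, draws one step scale from a normal distribution with ndist = (mu, sigma), and then takes Gaussian steps of that scale. The function name random_population, the 0-based Track_ID values, and the dict-of-lists return type are hypothetical.

```python
import random


def random_population(nframes=100, nparts=30, seed=1, fsize=(0, 512), ndist=(1, 2)):
    """Build column lists for `nparts` random walks of `nframes` frames each.

    Sketch under assumptions: uniform start positions within fsize, and one
    Gaussian step scale per particle drawn from ndist = (mu, sigma).
    """
    rng = random.Random(seed)
    data = {'Frame': [], 'Track_ID': [], 'X': [], 'Y': []}
    for part in range(nparts):
        x = rng.uniform(*fsize)
        y = rng.uniform(*fsize)
        scale = abs(rng.gauss(*ndist))  # step size for this particle
        for frame in range(nframes):
            data['Frame'].append(frame)
            data['Track_ID'].append(part)
            data['X'].append(x)
            data['Y'].append(y)
            x += rng.gauss(0, scale)
            y += rng.gauss(0, scale)
    return data


dataset = random_population(nframes=5, nparts=3, seed=1)
print(len(dataset['Frame']))  # 15
```

The resulting dict has the four columns expected by the MSD functions and can be passed to pd.DataFrame(data=dataset).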

diff_classifier.msd.random_walk(nsteps=100, seed=1, start=(0, 0))[source]

Creates 2d random walk trajectory.

Parameters:
nsteps : int

Number of steps for trajectory to move.

seed : int

Seed for pseudo-random number generator for reproducibility.

start : tuple of int or float

Starting xy coordinates at which the random walk begins.

Returns:
x : numpy.ndarray

Array of x coordinates of random walk.

y : numpy.ndarray

Array of y coordinates of random walk.
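A 2D random walk with this signature might look like the sketch below. The step rule (one unit up, down, left, or right per step) is an assumption chosen for illustration, as is the decision to count the starting point among the nsteps positions; the function name lattice_walk is hypothetical.

```python
import random


def lattice_walk(nsteps=100, seed=1, start=(0, 0)):
    """2D walk that moves one unit up, down, left, or right per step.

    Assumption: the unit lattice step rule is chosen for illustration.
    """
    rng = random.Random(seed)  # seeded generator for reproducible walks
    x, y = [start[0]], [start[1]]
    for _ in range(nsteps - 1):
        dx, dy = rng.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x.append(x[-1] + dx)
        y.append(y[-1] + dy)
    return x, y


x, y = lattice_walk(nsteps=10, seed=1, start=(5, 5))
print(len(x), x[0], y[0])  # 10 5 5
```

Calling the function twice with the same seed returns the same trajectory, which is the purpose of the seed parameter.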