Optimal number of fitting points for MSD measurements
In the equation above,
The uncertainty with which
Given the Sources of error in measuring diffusion coefficient through nanoparticle tracking analysis, one can assume that averaging over many steps yields more accurate results. Although this is well understood theoretically, it may be harder to realize experimentally[@ernst2013Measuring a diffusion coefficient by single-particle tracking: statistical analysis of experimental mean squared displacement curves].
Therefore the question is how many
The experiment in [@ernst2013Measuring a diffusion coefficient by single-particle tracking: statistical analysis of experimental mean squared displacement curves] is based on a very long track (
Hollow and filled squares represent different numbers of total frames, while the horizontal axis shows how many points were included in the MSD fit (as in the inset). Surprisingly, the optimum (lowest variance) occurs at around 4 or 5 points. Including more data points only worsens the results. The difference between
In the figure above, each distribution was generated at the optimal number of time-steps for the MSD fit, but with a different number of total frames. The most important message is:
For trajectories with a length on the order of 100 data points, the actual outcome of an experiment for the diffusion coefficient can vary by more than a factor of 2.
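The effect of the number of fitting points and of the track length can be explored with a small simulation. The sketch below is not the procedure of the cited paper: it assumes ideal 2D Brownian motion with no localization error, a time-averaged MSD, and an unweighted linear fit with a free intercept. All function names (`simulate_tracks`, `time_averaged_msd`, `estimate_D`, `relative_spread`) and default parameters are hypothetical choices made for illustration. Because experimental noise is not modeled, the position of the optimum need not coincide with the 4 or 5 points reported above.

```python
import numpy as np


def simulate_tracks(n_tracks, n_frames, D=1.0, dt=1.0, seed=0):
    """Simulate ideal 2D Brownian tracks; returns shape (n_tracks, n_frames, 2)."""
    rng = np.random.default_rng(seed)
    # Each displacement component is Gaussian with variance 2*D*dt.
    steps = rng.normal(0.0, np.sqrt(2.0 * D * dt), size=(n_tracks, n_frames - 1, 2))
    start = np.zeros((n_tracks, 1, 2))
    return np.concatenate([start, np.cumsum(steps, axis=1)], axis=1)


def time_averaged_msd(track, max_lag):
    """Time-averaged MSD of a single track for lags 1..max_lag (in frames)."""
    msd = np.empty(max_lag)
    for lag in range(1, max_lag + 1):
        disp = track[lag:] - track[:-lag]
        msd[lag - 1] = np.mean(np.sum(disp**2, axis=1))
    return msd


def estimate_D(track, n_fit, dt=1.0):
    """Fit a line to the first n_fit MSD points (n_fit >= 2).

    In 2D, MSD(t) = 4*D*t, so D is the fitted slope divided by 4.
    """
    lags = dt * np.arange(1, n_fit + 1)
    msd = time_averaged_msd(track, n_fit)
    slope, _intercept = np.polyfit(lags, msd, 1)
    return slope / 4.0


def relative_spread(n_frames, n_fit, n_tracks=200, seed=0):
    """Relative standard deviation of the estimated D over many simulated tracks."""
    tracks = simulate_tracks(n_tracks, n_frames, seed=seed)
    d_hat = np.array([estimate_D(t, n_fit) for t in tracks])
    return d_hat.std() / d_hat.mean()


if __name__ == "__main__":
    for n_frames in (100, 1000):
        for n_fit in (2, 4, 10, 40):
            spread = relative_spread(n_frames, n_fit)
            print(f"frames={n_frames:5d}  fit points={n_fit:3d}  rel. std of D={spread:.3f}")
```

Running the script prints the relative spread of the fitted diffusion coefficient for each combination, which makes it easy to compare short and long tracks and few versus many fitting points under these idealized assumptions.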
The standard deviation for the distributions can be described by:
To summarize, if we use track lengths of around 100 points, the accuracy with which the diffusion coefficient can be calculated is around
This definitely has an impact on the quality of the data generated by nanoparticle tracking analysis, and is probably one of the limitations of nanoparticle tracking analysis. Moreover, a factor 10 increase in the track length only yields a factor 2 improvement in the accuracy of the diffusion coefficient. The question I have is what the Optimal track length for MSD measurements would be, considering not only the accuracy but also the data volume and the time it takes to generate the data.
Also, what is the impact of this uncertainty on the Stokes-Einstein relationship?
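As a first-order estimate, and assuming the standard Stokes-Einstein form for a sphere, the relative uncertainty in the diffusion coefficient carries over directly to the inferred size. The symbols $r$ (hydrodynamic radius), $\eta$ (viscosity), $T$ (temperature), $k_B$ (Boltzmann constant), and $\sigma$ (standard deviation) are not defined elsewhere in this note and are introduced here only for illustration:

$$
r = \frac{k_B T}{6 \pi \eta D}
\quad\Longrightarrow\quad
\frac{\sigma_r}{r} \approx \frac{\sigma_D}{D}
$$

The approximation holds only for small relative uncertainties. Since $r \propto 1/D$, a spread in $D$ as large as a factor of 2 would map onto a skewed distribution of radii rather than a symmetric one.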