Differences between the Regular version and the Professional version of Chaos Data Analyzer in calculating the Correlation Dimension.
I ran both versions of the program on the sample data file CHAOS.DAT that comes with the Professional version and calculated the correlation dimension using an embedding of 5 and n=1. The answers I get are 1.996 ± 0.086 for the Regular version and 2.001 ± 0.103 for the Professional version. The difference (5 parts in 2000) is roughly 5% of the stated error (about 100 parts in 2000) and is thus well within the errors inherent in this method.
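The arithmetic behind that comparison can be checked in a few lines. This is only a sketch: the two value/error pairs are the ones quoted above, while the function name and the choice to compare against the larger of the two quoted errors are assumptions made for illustration.

```python
# Values quoted above: (correlation dimension, quoted error).
regular = (1.996, 0.086)
professional = (2.001, 0.103)

def compare(a, b):
    """Return the absolute difference and its size relative to the larger quoted error."""
    difference = abs(a[0] - b[0])
    larger_error = max(a[1], b[1])  # assumption: compare against the larger error bar
    return difference, difference / larger_error

difference, fraction = compare(regular, professional)
print(f"difference = {difference:.3f}")
print(f"fraction of quoted error = {fraction:.1%}")
```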
It's still fair to ask why there is any difference at all. Major parts of the code were rewritten to improve the calculation speed, most notably by adding an integer log routine for binning the correlation integral data, and the program was recompiled with a different floating point math library. These changes should not account for a difference of that size, however.
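The integer log routine itself is not described here, so the following is only a guess at the general idea, not the program's actual code. One common trick is to take the binary exponent of a floating point number as an integer logarithm, which sorts distances into power-of-two bins without calling a logarithm function. The names `log2_bin` and `bin_distances` are hypothetical.

```python
import math

def log2_bin(value):
    """Return floor(log2(value)) for value > 0 using only the binary exponent."""
    # frexp gives value = mantissa * 2**exponent with 0.5 <= mantissa < 1.
    mantissa, exponent = math.frexp(value)
    return exponent - 1

def bin_distances(squared_distances):
    """Count how many squared distances fall into each power-of-two bin."""
    counts = {}
    for d2 in squared_distances:
        if d2 > 0.0:  # zero distance has no logarithm; skip it
            b = log2_bin(d2)
            counts[b] = counts.get(b, 0) + 1
    return counts

print(bin_distances([0.3, 1.0, 1.5, 7.9, 8.0]))
```

Working with squared distances, as assumed here, also avoids a square root per pair, since the logarithm of a squared distance is just twice the logarithm of the distance.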
In the Regular version of the program, correlations are ignored between the 1% of the points that are closest together in time, whereas such correlations are ignored between the 5% closest together in the Professional version. If the data have been oversampled (too small an interval between data points), the nearby-in-time points produce a spurious correlation and a spuriously low dimension. Thus the 5% criterion was used in the Professional version to be slightly more conservative. Indeed, this raises the dimension of the sample data to a value closer to the accepted value of about 2.03 for the Lorenz attractor.
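To make the exclusion concrete, here is a minimal sketch of a correlation sum that skips pairs of points that are close together in time. It is not the code used in either version of the program. In particular, it assumes that "the 1% (or 5%) closest together in time" means a minimum index separation equal to that fraction of the number of points; the function name and the `exclude_fraction` parameter are hypothetical.

```python
import math

def correlation_sum(points, r, exclude_fraction=0.05):
    """Fraction of point pairs closer than r, skipping pairs that are near in time."""
    count = len(points)
    # Assumption: pairs must be at least this many samples apart to be counted.
    window = max(1, int(exclude_fraction * count))
    close = 0
    total = 0
    for i in range(count):
        for j in range(i + window, count):
            total += 1
            if math.dist(points[i], points[j]) < r:
                close += 1
    return close / total if total else 0.0

# Hypothetical smooth, heavily sampled data: neighbors in time are neighbors in space.
points = [(float(i),) for i in range(100)]
print(correlation_sum(points, 5.5, exclude_fraction=0.01))
print(correlation_sum(points, 5.5, exclude_fraction=0.05))
```

The correlation dimension is estimated from the slope of log C(r) against log r at small r. In this toy case the wider exclusion window removes most of the close pairs that exist only because the points are adjacent in time, which is the effect the 5% criterion is meant to guard against.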
This oversampling problem is also the reason an additional parameter n was added to the Professional version. If the data are sampled too rapidly compared to the dominant frequencies in the dynamics, this stretches the attractor out into a skinny tube along the diagonal of the embedding space. You can see if this is happening by looking at views 2 and 3 of the graph of data. If the data are collapsed onto a diagonal line, try increasing the value of n until the attractor begins to fill out the region. Don't get too carried away, however, because if you go too far, the attractor will begin to fold over onto itself. In the former case, your dimension is likely to be too low, and in the latter case it is likely to be too high. Of course, if you had an infinite amount of noise-free data, none of this would matter.
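The collapse onto the diagonal can be illustrated with a small delay embedding sketch. This is an illustration under stated assumptions rather than the program's own code: the test signal (a sine wave with 200 samples per period) is invented, and `delay_embed` and `off_diagonal_spread` are hypothetical helper names.

```python
import math

def delay_embed(series, dimension, n=1):
    """Build embedding vectors (x[i], x[i+n], ..., x[i+(dimension-1)*n])."""
    span = (dimension - 1) * n
    return [tuple(series[i + k * n] for k in range(dimension))
            for i in range(len(series) - span)]

def off_diagonal_spread(points):
    """Mean |second coordinate - first coordinate|; near zero when points hug the diagonal."""
    return sum(abs(p[1] - p[0]) for p in points) / len(points)

# Hypothetical oversampled signal: 200 samples per period of a sine wave.
series = [math.sin(2 * math.pi * t / 200) for t in range(2000)]
for n in (1, 10, 50):
    embedded = delay_embed(series, dimension=2, n=n)
    print(n, round(off_diagonal_spread(embedded), 3))
```

With n=1 the two coordinates of each point are nearly equal, so the points lie almost on the diagonal; larger n opens the figure up.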
Another way to choose n is to make it on the order of the correlation time divided by the embedding dimension. For the CHAOS.DAT sample data, the correlation time is about 6, and the optimal embedding is about 4 or 5, and thus n=1 should be about right, although n=2 is probably just as good. A similar thing can be done with the Regular version of the program by manipulating the data to take alternate data points before calculating the correlation dimension. However, discarding alternate data points is not quite as accurate as keeping every data point but constructing the embedding space using delays of n times the sample time, as is done in the Professional version.
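Both points can be sketched briefly. The rule of thumb is taken from the paragraph above; rounding to the nearest integer, the function names, and the made-up 1000-point series are assumptions for illustration only.

```python
def suggest_delay(correlation_time, embedding_dimension):
    """Rule of thumb: n on the order of correlation time / embedding dimension."""
    return max(1, round(correlation_time / embedding_dimension))

def delay_embed(series, dimension, n=1):
    """Keep every data point; use delays of n samples between coordinates."""
    span = (dimension - 1) * n
    return [tuple(series[i + k * n] for k in range(dimension))
            for i in range(len(series) - span)]

def decimate_then_embed(series, dimension, n=1):
    """Keep only every n-th data point, then embed with a delay of one sample."""
    return delay_embed(series[::n], dimension, 1)

print(suggest_delay(6, 5), suggest_delay(6, 4))

series = list(range(1000))  # placeholder data; only the point counts matter here
print(len(delay_embed(series, 5, 2)), len(decimate_then_embed(series, 5, 2)))
```

Both constructions use the same spacing between coordinates, but discarding alternate points leaves roughly half as many embedding vectors, which is one way to see why it is the less accurate of the two.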
I'm sorry if this isn't explained very clearly in the documentation, but there is a lot of art in doing these calculations, and you have to be prepared for imprecise and sometimes completely spurious answers. I highly recommend that you use the surrogate data tests included with the CDA:PRO before you draw any important conclusions about small differences in correlation dimension or other measures of hidden chaos. If you have data sets for which the two versions of the program report differences larger than the quoted precision, I would be interested in exploring this project further.
J. C. Sprott
Professor of Physics
Author of Chaos Data Analyzer, Chaos Data Analyzer - the Professional Version and Chaos Demonstrations.