Summary
A script for automatically sorting graph curves, e.g. for gnuplot.
Problem
When you have a bunch of curves, and you plot them in an arbitrary order, you might get the following:

Typically, you want to sort the graphs in what appears to be visually descending order, as follows:

Sorting the curves is usually done manually, by eyeballing them, which becomes tedious as the number of curves grows. It is even trickier when some curves don’t extend as far along the x-axis, because it is unclear where these short curves belong. (A curve might be short if its experimental run trains more slowly.)
Heuristic approach
An automatic heuristic sorting approach is as follows:
- We maintain a sorted list of curves, from highest to lowest. The sorted list is initialized to empty.
- At each iteration, we find the curve that extends furthest along the x-axis among those not yet in the sorted list. We then choose where to insert it into the sorted list.
- For this curve and all curves in the sorted list, we want an estimate of the curve value at the current curve’s furthest x-value. We compute this estimate using a moving average. (For this reason, all curves should share the same, equidistant x-axis steps.)
- We insert this curve into the sorted list at the position that minimizes the number of rank errors among the curve estimates at this x-value.
And that’s it!
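As a rough illustration of the steps above, here is a minimal Python sketch. It is not the actual sort-curves.py: the function names, the trailing three-point window, the tie-breaking, and the definition of a "rank error" (a curve in the list whose estimate disagrees with its position relative to the new curve) are all assumptions made for the example. It also assumes every curve is non-empty and that all curves share one equidistant x grid starting at the same point, so a curve can be stored as a plain list of y-values.

```python
def trailing_average(ys, end, window=3):
    """Mean of up to `window` y-values ending at index `end` (inclusive)."""
    start = max(0, end - window + 1)
    chunk = ys[start:end + 1]
    return sum(chunk) / len(chunk)


def rank_errors(estimates, pos, new_estimate):
    """Count curves that would be misordered if the new curve sat at `pos`.

    `estimates` is in sorted-list order (highest curve first).
    """
    # Curves placed above the new one should not have a lower estimate.
    above = sum(1 for e in estimates[:pos] if e < new_estimate)
    # Curves placed below the new one should not have a higher estimate.
    below = sum(1 for e in estimates[pos:] if e > new_estimate)
    return above + below


def sort_curves(curves, window=3):
    """Order curve names from visually highest to lowest.

    `curves` maps a name to its list of y-values (hypothetical layout).
    """
    ordered = []
    # Longest curves first; ties are broken by name to stay deterministic.
    pending = sorted(curves, key=lambda name: (-len(curves[name]), name))
    for name in pending:
        # Index of the furthest x-value of the curve being inserted.
        end = len(curves[name]) - 1
        new_estimate = trailing_average(curves[name], end, window)
        # Every curve already in the list is at least this long.
        estimates = [trailing_average(curves[n], end, window) for n in ordered]
        best = min(
            range(len(ordered) + 1),
            key=lambda pos: rank_errors(estimates, pos, new_estimate),
        )
        ordered.insert(best, name)
    return ordered
```

Because curves are processed from longest to shortest, every curve already in the list has a value at the new curve's last x-value, which is what makes the comparison possible. When several positions give the same number of rank errors, this sketch simply takes the highest one.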
Example output
Here is the sorted output of a larger, more difficult example, sorted using the above heuristic. Click on this image to get a larger version you can inspect:

A few of the decisions aren’t good. For example, why is curve 15 placed above curve 6? But most of the decisions are reasonable. For example, curve 13 is placed at the bottom, because it is very low compared to the other curves for the short duration that curve 13 is present.
Code
I have written a script implementing the heuristic above.
Here is the latest version of sort-curves.py.
You will also need movingaverage.py from my Python common library.
USAGE:
./sort-curves.py *.dat
where every *.dat is in standard (gnuplot) two-column-per-line format:
xvalue yvalue
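To make the input format concrete, here is a small Python sketch of how such a file could be parsed. The function name parse_curve is hypothetical, and this is not necessarily how sort-curves.py reads its input. It assumes whitespace-separated columns and skips blank lines and anything after a "#", which gnuplot treats as a comment in data files.

```python
def parse_curve(lines):
    """Turn lines of 'xvalue yvalue' text into a list of (x, y) floats."""
    points = []
    for line in lines:
        # Drop gnuplot-style comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # Only the first two columns are used; extra columns are ignored.
        points.append((float(fields[0]), float(fields[1])))
    return points
```

An open file can be passed directly, since iterating over a file object yields its lines.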
Overall, I find this script a useful timesaver.