I have a buffer receiving data, so the data arrive as a stream and there is I/O latency. What I do now is wait until the buffer is full, sort it with qsort, and write the result to disk. But the qsort step adds obvious latency, so I am looking for a sorting algorithm that can start sorting while data is still being added to the buffer, in order to reduce the overall time.
I don't know if I have made myself clear; please leave a comment if anything needs clarifying. Thanks.
asked Mar 9 '12 at 13:03
Heap sort keeps the data permanently in a partially sorted condition and so is comparable to insertion sort. But it is substantially quicker and has a worst case of O(n log n) compared with O(n²) for insertion sort.
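To make the heap idea concrete, here is a minimal sketch of a binary min-heap that accepts values one at a time. It is not the commenter's code: the `MinHeap` type and the `heap_push` / `heap_pop` names are made up for illustration, and it assumes `int` data and caller-provided storage.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-capacity min-heap of ints; all names are illustrative. */
typedef struct {
    int *data;   /* caller-provided storage */
    size_t len;  /* number of elements currently stored */
    size_t cap;  /* capacity of data */
} MinHeap;

static void heap_swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Insert one value as it arrives: sift it up, O(log n). */
void heap_push(MinHeap *h, int value)
{
    assert(h->len < h->cap);
    size_t i = h->len++;
    h->data[i] = value;
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (h->data[parent] <= h->data[i]) break;
        heap_swap(&h->data[parent], &h->data[i]);
        i = parent;
    }
}

/* Remove and return the smallest value: sift the last element down, O(log n). */
int heap_pop(MinHeap *h)
{
    assert(h->len > 0);
    int top = h->data[0];
    h->data[0] = h->data[--h->len];
    size_t i = 0;
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < h->len && h->data[l] < h->data[m]) m = l;
        if (r < h->len && h->data[r] < h->data[m]) m = r;
        if (m == i) break;
        heap_swap(&h->data[m], &h->data[i]);
        i = m;
    }
    return top;
}
```

Call `heap_push` for each value as it comes off the stream, then call `heap_pop` repeatedly once the buffer is full to get the values in ascending order. Note that draining the heap still costs O(n log n) in total, so only the build phase overlaps with the I/O wait.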
How is this going to work? Presumably at some point you have to stop reading from the stream, store what you have sorted, and start reading a new set of data?
I think merge-sort or tree sort can be of great help. Wikipedia explains why.
- When you can cut the huge input into reasonably large blocks, merge-sort is more appropriate.
- When you insert small pieces at a time, tree-sort is more appropriate.
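For the block-wise case, a rough sketch could look like the following: sort each block with `qsort` as soon as it is full, then merge the sorted runs. The function names (`sort_block`, `merge_runs`) and the `int` element type are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort on ints. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sort one block as soon as it is full; smaller blocks mean shorter pauses. */
void sort_block(int *block, size_t n)
{
    qsort(block, n, sizeof block[0], cmp_int);
}

/* Merge two already-sorted runs a[0..na) and b[0..nb) into out[0..na+nb). */
void merge_runs(const int *a, size_t na, const int *b, size_t nb, int *out)
{
    assert(out != NULL);  /* debug-time precondition */
    size_t i = 0, j = 0, k = 0;
    while (i < na && j < nb)
        out[k++] = (b[j] < a[i]) ? b[j++] : a[i++];  /* ties take from a */
    while (i < na) out[k++] = a[i++];
    while (j < nb) out[k++] = b[j++];
}
```

With more than two runs you would merge pairwise or do a k-way merge. The point is that each `sort_block` call is short and can happen while the next block is still filling.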
You want to implement an online sorting algorithm, i.e. an algorithm that runs while the data is still arriving as a stream. Search the web for "online algorithms" and you may find other nice algorithms.
In your case I would use tree sort. Its complexity is no better than quicksort's (both are O(n log n) most of the time and O(n²) in a few bad cases), but it amortizes the cost over each input. This means the delay you have to wait after the last data is added is not of order O(n log n), but
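As an illustration of the tree sort idea, here is a minimal sketch using a plain, unbalanced binary search tree, so it has the O(n²) bad case mentioned above on already-sorted input. The `Node` type and the `tree_insert` / `tree_flush` / `tree_free` names are hypothetical, and `int` keys are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type for an unbalanced binary search tree of ints. */
typedef struct Node {
    int key;
    struct Node *left, *right;
} Node;

/* Insert one key as it arrives; returns 0 on success, -1 if malloc fails.
   Equal keys go to the right subtree, so duplicates are kept. */
int tree_insert(Node **root, int key)
{
    assert(root != NULL);
    Node *n = malloc(sizeof *n);
    if (n == NULL) return -1;
    n->key = key;
    n->left = n->right = NULL;
    while (*root != NULL)
        root = (key < (*root)->key) ? &(*root)->left : &(*root)->right;
    *root = n;
    return 0;
}

/* In-order walk: appends keys in ascending order to out, returns the new count. */
size_t tree_flush(const Node *t, int *out, size_t count)
{
    if (t == NULL) return count;
    count = tree_flush(t->left, out, count);
    out[count++] = t->key;
    return tree_flush(t->right, out, count);
}

/* Release every node (post-order). */
void tree_free(Node *t)
{
    if (t == NULL) return;
    tree_free(t->left);
    tree_free(t->right);
    free(t);
}
```

Each `tree_insert` does its share of the sorting work while you are waiting on I/O anyway. Once the last value is in, all that is left is one `tree_flush` walk that visits each node once before writing to disk.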
You can try my Link Array structure. It should be fine for sequentially adding random data while keeping it sorted (look at the numbers in the table). It is a variation of the skip list approach with a simpler implementation and logic (although a skip list should perform better).