Title: Stupid Columnsort Tricks
1Stupid Columnsort Tricks
- Geeta Chaudhry
- Tom Cormen
- Dartmouth College
- Department of Computer Science
2What Do We Know About Columnsort?
- Sorts N values on an $r \times s$ mesh
- Uses 8 steps
- Each step either sorts each column or performs a
fixed permutation
- Divisibility restriction: s divides r
- Height restriction: $r \ge 2s^2$
- Relaxed height restriction: $r \ge 4s^{3/2}$
- Exponent of s goes from 2 to 3/2
- Mesh need not be quite so tall and skinny
- Cost: 2 additional steps
- Can simultaneously remove the divisibility
restriction and relax the height restriction to
$r \ge 6s^{3/2}$
3Why Relax the Conditions?
- Columnsort applies in more circumstances
- Our motivation: out-of-core sorting
- Column height r is limited by amount of memory
- Either per processor or in entire system
- $N = rs$, $r \ge 2s^2 \Rightarrow N \le r^{3/2}/2^{1/2}$
- $N = rs$, $r \ge 4s^{3/2} \Rightarrow N \le r^{5/3}/4^{2/3}$
- Reducing the exponent of s in the bound for r
allows us to sort more values with a given amount
of memory
- A similar technique works for applying columnsort
to in-core sorting
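The two capacity bounds above can be compared numerically. The following is a minimal sketch, not taken from the talk: the function names are hypothetical, and it ignores the divisibility restriction and any power-of-2 constraints an actual out-of-core implementation might impose.

```python
import math


def max_columns_basic(r):
    """Largest s satisfying the original height restriction r >= 2*s**2."""
    return math.isqrt(r // 2)


def max_columns_relaxed(r):
    """Largest s satisfying r >= 4*s**(3/2).

    Squaring both sides gives the integer test 16*s**3 <= r**2,
    which avoids floating-point rounding.
    """
    s = 0
    while 16 * (s + 1) ** 3 <= r * r:
        s += 1
    return s


def max_values(r, max_columns):
    """Largest N = r*s sortable with column height r under the given bound."""
    return r * max_columns(r)


if __name__ == "__main__":
    r = 2 ** 20  # hypothetical column height, e.g. records that fit in memory
    print("basic:  ", max_values(r, max_columns_basic))
    print("relaxed:", max_values(r, max_columns_relaxed))
```

For r = 2**20 this gives s = 724 under the original bound and s = 4096 under the relaxed one, so the same column height sorts more than five times as many values.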
4This Talk
- Slabpose columnsort
- $r \ge 4s^{3/2}$
- Requires divisibility restriction
- Also in the paper
- Subblock columnsort
- $r \ge 4s^{3/2}$ with divisibility restriction
- $r \ge 6s^{3/2}$ without divisibility restriction
- Proof that the divisibility restriction is
unnecessary in the basic columnsort algorithm
5Columnsort Steps
- Sort each column
- Transpose entire mesh
- Sort each column
- Untranspose entire mesh
- Sort each column
- Shift down by half a column
- Sort each column
- Shift up by half a column
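The eight steps above can be sketched in a few lines. This is an illustrative in-memory version under stated assumptions, not the authors' implementation: the mesh is held as a flat list in column-major order, "transpose" is taken to mean reading in column-major order and writing in row-major order, the half-column shift is done by padding with infinities, and r is assumed even. The function name `columnsort` is hypothetical.

```python
import math


def columnsort(values, r, s):
    """Sort r*s values held column-major on an r x s mesh; returns a new list."""
    assert len(values) == r * s
    # Restrictions stated for basic columnsort (plus r even for the half shift).
    assert r % 2 == 0 and r % s == 0 and r >= 2 * s * s

    def sort_columns(a):
        # Each consecutive run of r entries is one column.
        return [x for k in range(0, len(a), r) for x in sorted(a[k:k + r])]

    flat = sort_columns(list(values))                    # step 1: sort
    # Step 2: transpose. New column j takes old positions j, j+s, j+2s, ...
    flat = [x for j in range(s) for x in flat[j::s]]
    flat = sort_columns(flat)                            # step 3: sort
    # Step 4: untranspose, the inverse of step 2.
    out = [None] * (r * s)
    pos = 0
    for j in range(s):
        for idx in range(j, r * s, s):
            out[idx] = flat[pos]
            pos += 1
    flat = sort_columns(out)                             # step 5: sort
    # Step 6: shift down by half a column, creating s+1 padded columns.
    half = r // 2
    padded = [-math.inf] * half + flat + [math.inf] * half
    padded = sort_columns(padded)                        # step 7: sort
    return padded[half:half + r * s]                     # step 8: shift up


if __name__ == "__main__":
    data = [5, 3, 7, 1, 8, 2, 6, 4, 9, 0, 15, 11, 13, 10, 14, 12]
    print(columnsort(data, r=8, s=2))
```

Every permutation step depends only on r and s, never on the data, which is what makes the algorithm oblivious.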
6Slabpose Columnsort Steps
- Sort each column
- Slabpose: transpose within vertical slabs
- Sort each column
- Shuffle columns
- Slabpose
- Sort each column
- Untranspose entire mesh
- Sort each column
- Shift down by half a column
- Sort each column
- Shift up by half a column
Oblivious!
7Slabpose Columnsort Steps
- Sort each column
- Slabpose: transpose within vertical slabs
- Sort each column
- Shuffle columns and slabpose (combined into a single permutation)
- Sort each column
- Untranspose entire mesh
- Sort each column
- Shift down by half a column
- Sort each column
- Shift up by half a column
Oblivious!
8Why Work With Vertical Slabs?
- In regular columnsort, the matrix needs to be
tall and skinny
- Working with vertical slabs allows us to change
the aspect ratio to use tall and skinny slabs
- We'll use slabs that are $\sqrt{s}$ columns wide
- The mesh will have $\sqrt{s}$ slabs
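A minimal sketch of a slabpose follows. It assumes that slabs are $\sqrt{s}$ columns wide and that "transpose within vertical slabs" means applying the column-major-to-row-major transpose to each slab independently. The function name `slabpose` and the list-of-rows mesh layout are hypothetical choices for illustration.

```python
import math


def slabpose(mesh, r, s):
    """Transpose within each vertical slab of an r x s mesh (list of rows)."""
    q = math.isqrt(s)                 # slab width, assumed to be sqrt(s)
    assert q * q == s, "this sketch assumes s is a perfect square"
    out = [row[:] for row in mesh]
    for k in range(q):                # there are q slabs of q columns each
        cols = range(k * q, (k + 1) * q)
        # Read the slab in column-major order ...
        flat = [mesh[i][j] for j in cols for i in range(r)]
        # ... and write it back into the same slab in row-major order.
        for t, x in enumerate(flat):
            out[t // q][k * q + t % q] = x
    return out


if __name__ == "__main__":
    r, s = 8, 4
    mesh = [[(i, j) for j in range(s)] for i in range(r)]
    for row in slabpose(mesh, r, s):
        print(row)
```

Each element stays inside its own slab, so every slab behaves like a tall, skinny mesh of r rows and $\sqrt{s}$ columns.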
9The 0-1 Principle
- If an oblivious algorithm sorts all input sets
consisting solely of 0s and 1s, then it sorts all
input sets with arbitrary values
- Use the 0-1 Principle by looking at portions of
the $r \times s$ mesh
- Clean: all 0s or all 1s
- Dirty: may be mixed 0s and 1s
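The 0-1 Principle can be illustrated on a small oblivious algorithm. The sketch below is not from the talk: it uses odd-even transposition sort, written as a fixed list of compare-exchange operations, as a stand-in oblivious sorter, and all function names are hypothetical. It checks every 0-1 input and then confirms that arbitrary values are sorted too.

```python
from itertools import permutations, product


def apply_network(network, values):
    """Run a fixed sequence of compare-exchange operations (i < j)."""
    a = list(values)
    for i, j in network:
        if a[i] > a[j]:
            a[i], a[j] = a[j], a[i]
    return a


def odd_even_transposition(n):
    """Comparator list for n rounds of odd-even transposition sort."""
    return [(i, i + 1) for rnd in range(n) for i in range(rnd % 2, n - 1, 2)]


def sorts_all_zero_one(network, n):
    """True if the network sorts every input of n values drawn from {0, 1}."""
    return all(apply_network(network, bits) == sorted(bits)
               for bits in product((0, 1), repeat=n))


if __name__ == "__main__":
    n = 6
    net = odd_even_transposition(n)
    print("sorts all 0-1 inputs:", sorts_all_zero_one(net, n))
    print("sorts all permutations:",
          all(apply_network(net, p) == sorted(p)
              for p in permutations(range(n))))
```

Checking 2^n inputs of 0s and 1s is far cheaper than checking all n! orderings, which is why the correctness arguments in this talk track only clean and dirty regions.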
10Step 1: Sort Each Column
[Figure: $r \times s$ mesh with regions labeled 0, dirty, and 1]
11Step 2: Slabpose
[Figure: a column within a $\sqrt{s}$-slab; the mesh has $\sqrt{s}$ slabs]
12Step 3: Sort Each Column
[Figure: dirty area of $\sqrt{s}$ rows]
13Step 4: Shuffle
[Figure: the mesh before and after the shuffle; labels: $\sqrt{s}$-slab, $\sqrt{s}$ rows, $\sqrt{s}$ slabs]
14Step 5: Slabpose
[Figure: the mesh before and after the slabpose; labels: $\sqrt{s}$-slab, $r/\sqrt{s}$ rows, 2 rows, $\sqrt{s}$ slabs, $\sqrt{s}$ sets of dirty rows]
15Step 6: Sort Each Column
$2\sqrt{s}$ rows $\Rightarrow 2s^{3/2}$ elements
16Step 7: Untranspose Entire Mesh
Dirty area: $2s^{3/2}$ elements
$r \ge 4s^{3/2} \Rightarrow 2s^{3/2} \le r/2 \Rightarrow$ dirty area is at most half a column
Once the size of the dirty area is at most half a
column, the last four steps will finish up
17Step 8: Sort Each Column
Dirty area resides in one column: done
18Step 8: Sort Each Column
Dirty area resides in two columns: no change
19Step 9: Shift Down by Half a Column
Dirty area resides in one column
20Step 10: Sort Each Column
Dirty area resides in one column
21Step 11: Shift Up by Half a Column
Sorted
22Subblock Columnsort
- Adds two steps to columnsort
- Sort each column
- A fixed permutation
- The permutation is any one that distributes all
elements of each $\sqrt{s} \times \sqrt{s}$ subblock to all s
columns
- Like slabpose columnsort, the size of the dirty
area is $2s^{3/2}$ entering the last four steps
- As long as $2s^{3/2} \le r/2$ (half a column), the last
four steps complete the sorting
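The slide allows any permutation with the stated distribution property. The sketch below constructs one hypothetical example, not necessarily the one used in the paper, and assumes that subblocks are $\sqrt{s} \times \sqrt{s}$ and that $\sqrt{s}$ divides r. The function name and the list-of-rows mesh layout are illustrative.

```python
import math


def subblock_permutation(mesh, r, s):
    """Send the s elements of each q x q subblock (q = sqrt(s)) to s distinct columns."""
    q = math.isqrt(s)
    assert q * q == s and r % q == 0, "sketch assumes s square and sqrt(s) | r"
    out = [[None] * s for _ in range(r)]
    for i in range(r):
        for j in range(s):
            # Target column encodes the position inside the subblock,
            # so the q*q = s positions map to the s different columns.
            col = (i % q) * q + (j % q)
            # Target row encodes which subblock the element came from;
            # there are (r/q) * q = r subblocks, one per row.
            row = (i // q) * q + (j // q)
            out[row][col] = mesh[i][j]
    return out


if __name__ == "__main__":
    r, s = 8, 4
    mesh = [[(i, j) for j in range(s)] for i in range(r)]
    for row in subblock_permutation(mesh, r, s):
        print(row)
```

In this particular construction each output row holds exactly the elements of one input subblock, which makes the distribution property easy to check by eye.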
23Removing the Divisibility Restriction from Columnsort
- With the divisibility restriction, the dirty rows
after the transpose step have only 0-1
transitions
- Without the divisibility restriction, there may
also be 1-0 transitions
- The proof shows that even with the 1-0
transitions, the size of the dirty area entering
the last four steps does not increase
- Thus $r \ge 2s^2$ suffices, even without the
divisibility restriction
24Conclusion
- We can get around the restrictions of columnsort
- Reduce the exponent in the height restriction
from 2 to 3/2
- The mesh need not be quite so tall and skinny
- Cost: two extra steps
- In out-of-core implementation, slabpose
columnsort requires no additional I/O
- The divisibility restriction is unnecessary
- Open question: can we reduce the exponent further
within the columnsort framework?