From: Two-way partitioning of a recursive Gaussian filter in CUDA
Lines per block | Process time (ms), Local | Process time (ms), Global | Data request percentage, Local | Data request percentage, Global | Execution and sync (ms), Local | Execution and sync (ms), Global |
---|---|---|---|---|---|---|
4 | 15.2 | 15.1 | 1.2% | 12.3% | 15.0 | 13.5 |
8 | 8.4 | 8.9 | 2.9% | 25.2% | 8.2 | 6.8 |
16 | 5.1 | 5.9 | 6.1% | 41.9% | 4.8 | 3.5 |
32 | 4.9 | 8.4 | 10.2% | 48.9% | 4.4 | 4.3 |
64 | 5.1 | 8.5 | 10.2% | 48.9% | 4.4 | 4.3 |