I reinstated the bit vector, which was a little tricky to do while maintaining performance, but it works very well.

In summary, with the attached 3-patch series, the CPU usage of the common cut path is nearly halved, while the maximum memory allocated for the bit vector is 64KiB.

I'll apply this series in the morning.

thanks,
Pádraig.

p.s. I doubt adding a sentinel to the range pair structure would outperform the bit vector approach, given the significant benefit shown in the benchmark in the commit message.
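
For anyone following the thread without the patches to hand, here is a rough sketch of the general idea of a size-capped bit vector sitting in front of a range pair list. It is not the code from the series: the names (selector, selector_init, is_selected, MAX_BITVEC_BYTES) and the linear-scan fallback for indices beyond the cap are assumptions made purely for illustration.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical cap: at most 64 KiB is ever allocated for the bit vector. */
enum { MAX_BITVEC_BYTES = 64 * 1024 };

/* Inclusive range of 1-based field or byte positions. */
struct range_pair { size_t lo, hi; };

struct selector
{
  unsigned char *bits;          /* one bit per index below nbits */
  size_t nbits;                 /* number of indices the vector covers */
  const struct range_pair *rp;  /* full list, used beyond the cap */
  size_t n_rp;
};

/* Build the bit vector from the range list.  Returns false if the
   allocation fails.  */
static bool
selector_init (struct selector *s, const struct range_pair *rp, size_t n_rp)
{
  size_t limit = (size_t) MAX_BITVEC_BYTES * CHAR_BIT;
  size_t max = 0;
  for (size_t i = 0; i < n_rp; i++)
    if (rp[i].hi > max)
      max = rp[i].hi;

  /* Cover indices 0..max, but never more than the cap allows.  */
  size_t nbits = max < limit ? max + 1 : limit;
  s->bits = calloc ((nbits + CHAR_BIT - 1) / CHAR_BIT, 1);
  if (!s->bits)
    return false;
  s->nbits = nbits;
  s->rp = rp;
  s->n_rp = n_rp;

  /* Mark every selected index that fits in the vector.  */
  for (size_t i = 0; i < n_rp; i++)
    for (size_t k = rp[i].lo; k <= rp[i].hi && k < nbits; k++)
      s->bits[k / CHAR_BIT] |= (unsigned char) (1u << (k % CHAR_BIT));
  return true;
}

/* Common path: a single bit test.  Rare path: scan the range list.  */
static bool
is_selected (const struct selector *s, size_t idx)
{
  if (idx < s->nbits)
    return (s->bits[idx / CHAR_BIT] >> (idx % CHAR_BIT)) & 1;
  for (size_t i = 0; i < s->n_rp; i++)
    if (s->rp[i].lo <= idx && idx <= s->rp[i].hi)
      return true;
  return false;
}
```

The point of the sketch is only that the per-item test in the common case is one shift and mask, with no comparison against range bounds, while very large indices cannot force an unbounded allocation.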