GNU bug report logs - #41535
[PATCH] performance optimization for aarch64
On Sat, May 30, 2020 at 11:19 AM Li Qiang <liqiang64 <at> huawei.com> wrote:
> On 2020/5/26 10:39, l00374334 wrote:
> > From: liqiang <liqiang64 <at> huawei.com>
> >
> > By analyzing the compression and decompression process of gzip, I found
> > that the hot spots of CRC32 and longest_match function are very high.
> >
> > On the aarch64 architecture, we can optimize the efficiency of crc32
> > through the interface provided by the neon instruction set (12x faster
> > in aarch64), and optimize the performance of random access code through
> > prefetch instructions (about 5%~8% improvement). In some compression
> > scenarios, loop expansion can also get a certain performance improvement
> > (about 10%).
> >
> > Modify by Li Qiang.
> >
> > ---
> >  configure | 14 ++++++++++++++
> >  deflate.c | 30 +++++++++++++++++++++++++++++-
> >  util.c    | 45 +++++++++++++++++++++++++++++++++++++++++++++

Thank you for that work and sorry for the delay in responding.
However, for now I prefer not to apply it.
I'd prefer to see arch-specific optimizations added to libz in the
hope (perhaps naive) that someone will find time to make gzip use
libz.
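
For reference, the kind of CRC32 speedup described in the quoted patch
description can be sketched roughly as follows. This is not the code from
the submitted patch. It assumes a compiler that provides the ACLE CRC32
intrinsics (__crc32b, __crc32d from <arm_acle.h>) and defines
__ARM_FEATURE_CRC32, for example GCC or Clang with -march=armv8-a+crc.
The function names crc32_update_portable, crc32_update and crc32_sketch
are hypothetical, chosen only for this illustration. On any other target
the sketch falls back to a plain bitwise loop.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_FEATURE_CRC32)
#include <arm_acle.h>   /* ACLE CRC32 intrinsics: __crc32b, __crc32d */
#endif

/* Portable reference: bitwise CRC-32 using the reflected polynomial
   0xEDB88320 (the CRC used by gzip).  Hypothetical helper name. */
uint32_t crc32_update_portable(uint32_t crc, const unsigned char *p, size_t n)
{
    while (n--) {
        crc ^= *p++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc;
}

/* Update a running CRC.  Uses the hardware CRC32 instructions when the
   compiler advertises them; otherwise uses the portable loop above.
   The 8-byte path assumes a little-endian target. */
static uint32_t crc32_update(uint32_t crc, const unsigned char *p, size_t n)
{
#if defined(__ARM_FEATURE_CRC32) && defined(__BYTE_ORDER__) \
    && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    while (n >= 8) {
        uint64_t w;
        memcpy(&w, p, 8);          /* avoids unaligned-access issues */
        crc = __crc32d(crc, w);    /* consume 8 bytes per instruction */
        p += 8;
        n -= 8;
    }
    while (n--)
        crc = __crc32b(crc, *p++); /* remaining tail bytes */
    return crc;
#else
    return crc32_update_portable(crc, p, n);
#endif
}

/* One-shot CRC-32 of a buffer: initial value 0xFFFFFFFF, final XOR
   with 0xFFFFFFFF, as in the gzip format. */
uint32_t crc32_sketch(const void *buf, size_t n)
{
    return crc32_update(0xFFFFFFFFu, (const unsigned char *)buf, n)
           ^ 0xFFFFFFFFu;
}
```

Both paths should give the same result for the same input, so the
portable loop can serve as a check on the accelerated one. For the
standard test string "123456789" the expected CRC-32 is 0xCBF43926.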
This bug report was last modified 3 years and 74 days ago.