GNU bug report logs - #41615
[feature/native-comp] Dump prettier C code.


Package: emacs;

Reported by: Nicolas Bértolo <nicolasbertolo <at> gmail.com>

Date: Sat, 30 May 2020 15:09:01 UTC

Severity: normal

Done: Andrea Corallo <akrl <at> sdf.org>

Bug is archived. No further changes may be made.



Message #29 received at 41615 <at> debbugs.gnu.org (full text, mbox):

From: Nicolas Bértolo <nicolasbertolo <at> gmail.com>
To: Andrea Corallo <akrl <at> sdf.org>
Cc: 41615 <at> debbugs.gnu.org
Subject: Re: bug#41615: [feature/native-comp] Dump prettier C code.
Date: Sun, 31 May 2020 14:26:46 -0300
> I believe bzero is unnecessary given these are statically allocated.

Ok with me.
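
As a reminder of why that holds, here is a minimal sketch (the names are hypothetical and not taken from the patch): C guarantees that objects with static storage duration are zero-initialized before the program starts, so an explicit bzero() on them should add nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a statically allocated data buffer.
   Static storage duration means it starts out zero-filled. */
char static_blob[64];

/* Return 1 if every byte in buf[0..len) is zero, 0 otherwise. */
int all_zero(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Check the buffer without ever having called bzero() or memset(). */
void check_static_blob_is_zeroed(void)
{
    assert(all_zero(static_blob, sizeof static_blob));
}
```

Calling check_static_blob_is_zeroed() at startup should pass without any explicit clearing step.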

> For memcpy we can just use the standard library implementation, given
> that elns are linked against it.  The other advantage is that, done this
> way (here at least), memcpy is not inlined even at speed 3, so we don't
> fall into the optimizer issue!

This is good!

> All in all, it is even a little faster than the stock patch and closer to
> the one with the specific GCC blob support.

Good.

> Let me know if you like the attached patch and if it does the job for you too.

I like it. I see calls to memcpy even with -O3, which is great.
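
To illustrate the two approaches discussed in this thread, here is a rough sketch. It is not the code from the attached patch: the buffer names, the blob contents, and this particular naive_memcpy() body are assumptions made for the example. One initializer fills static data through a hand-written byte loop, the other through libc's memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical byte-by-byte copy, similar in spirit to the
   naive_memcpy() mentioned in the thread. */
void *naive_memcpy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dest;
}

/* Hypothetical static data and the constant string used to fill it. */
char data_via_naive[16];
char data_via_libc[16];
const char blob[] = "(a b c)";   /* sizeof blob includes the NUL */

/* Fill static data through the hand-written loop. */
void init_via_naive(void)
{
    naive_memcpy(data_via_naive, blob, sizeof blob);
}

/* Fill static data through the standard library implementation. */
void init_via_libc(void)
{
    memcpy(data_via_libc, blob, sizeof blob);
}

/* Both initializers must produce identical contents. */
void check_copies_agree(void)
{
    init_via_naive();
    init_via_libc();
    assert(memcmp(data_via_naive, data_via_libc, sizeof blob) == 0);
}
```

Both versions produce the same bytes. What may differ is the code the optimizer emits for them, which is what I was looking at in the generated assembly.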

Nico

On Sun, 31 May 2020 at 13:57, Andrea Corallo (<akrl <at> sdf.org>) wrote:
>
> Nicolas Bértolo <nicolasbertolo <at> gmail.com> writes:
>
> >> I like this considerably less :)
> >
> > Ok, let's say goodbye to this patch.
> >
> >> It introduces quite a bit of complexity, and the same advantage in
> >> debuggability can be achieved with something like the attached 8 line
> >> patch (untested).
> >
> > Sounds good, I haven't tested it either.
> >
> >> Generally speaking I want to try to keep our back-end as simple as we
> >> can manage.
> >
> > I initially wrote this patch chasing the reason for slow compile times. I think
> > that a 10k line C file should be compiled much faster than what gccjit achieves.
> > I thought that "uncommon" (for C) ways of doing things were causing gccjit to get
> > stuck trying to optimize them hard, until it gave up. I thought that filling the
> > static data using memcpy() and constant strings would help GCC recognize this as
> > a constant initialization and hopefully just store a completely initialized copy
> > in memory.
> >
> > I found that GCC would inline memcpy() and the static initialization would turn
> > into a very long unrolled loop with SSE instructions. I tested this with -O3
> > only in gccjit to force maximum optimization. I found this super strange
> > considering that -ftree-loop-distribute-patterns is enabled at -O3 and it should
> > recognize the naive_memcpy() function as an implementation of memcpy() and issue
> > calls to libc's implementation. Instead, it was inlining and unrolling it.
>
> Ok, so you confirm the suspicions I described in the other mail!
>
> I've used your patch as a base; apart from minor changes here and there,
> I've stripped out the definitions of bzero and memcpy.
>
> I believe bzero is unnecessary given these are statically allocated.
>
> For memcpy we can just use the standard library implementation, given
> that elns are linked against it.  The other advantage is that, done this
> way (here at least), memcpy is not inlined even at speed 3, so we don't
> fall into the optimizer issue!
>
> All in all, it is even a little faster than the stock patch and closer to
> the one with the specific GCC blob support.
>
> Let me know if you like the attached patch and if it does the job for you too.
>
> Thanks
>
>   Andrea
>
> --
> akrl <at> sdf.org




This bug report was last modified 5 years and 39 days ago.


