GNU bug report logs -
#41615
[feature/native-comp] Dump prettier C code.
> I believe bzero is unnecessary given these are statically allocated.
Ok with me.
> For memcpy we can just use the standard library implementation, given
> that elns are linked to it. The other advantage is that this way (here
> at least) memcpy is not inlined even at speed 3, so we don't fall into
> the optimizer issue!
This is good!
> All in all, it is even a little faster than the stock patch and closer
> to the one with the specific GCC blob support.
Good.
> Let me know if you like the attached and if it does the job for you too.
I like it. I see calls to memcpy even with -O3, which is great.
Nico
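The two points agreed on above can be illustrated with a minimal sketch. It assumes a small static buffer and a constant initializer string; the names `blob`, `blob_init`, `BLOB_SIZE`, `all_zero` and `init_blob` are hypothetical and do not come from the patch. Objects with static storage duration start out zeroed, so no bzero call is needed, and the fill is an ordinary call to libc's memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration; not taken from the patch.  */
#define BLOB_SIZE 16

/* Static storage duration: C guarantees this starts out all zeros,
   so no bzero/memset call is needed before first use.  */
static unsigned char blob[BLOB_SIZE];

/* Constant initializer data, emitted as a string literal.  */
static const char blob_init[] = "hello";

/* Return 1 if the first N bytes of BUF are zero, else 0.  */
static int
all_zero (const unsigned char *buf, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (buf[i] != 0)
      return 0;
  return 1;
}

/* Fill the static blob with a plain call to libc's memcpy.  */
static void
init_blob (void)
{
  assert (sizeof blob_init <= sizeof blob);
  memcpy (blob, blob_init, sizeof blob_init);
}
```

Calling `all_zero (blob, BLOB_SIZE)` before `init_blob ()` should report an all-zero buffer, and after the call only the bytes of the string (including its terminator) are set.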
On Sun, May 31, 2020 at 13:57, Andrea Corallo (<akrl <at> sdf.org>) wrote:
>
> Nicolas Bértolo <nicolasbertolo <at> gmail.com> writes:
>
> >> I like this considerably less :)
> >
> > Ok, let's say goodbye to this patch.
> >
> >> It introduces quite some complexity and the same advantage in
> >> debuggability can be achieved with something like the attached 8 line
> >> patch (untested).
> >
> > Sounds good, I haven't tested it either.
> >
> >> Generally speaking I want to try to keep our back-end as simple as we
> >> manage to.
> >
> > I initially wrote this patch chasing the reason for slow compile times. I think
> > that a 10k line C file should be compiled much faster than what gccjit achieves.
> > I thought that "uncommon" (for C) ways of doing things were causing gccjit to get
> > stuck trying to optimize them hard, until it gave up. I thought that filling the
> > static data using memcpy() and constant strings would help GCC recognize this as
> > a constant initialization and hopefully just store a completely initialized copy
> > in memory.
> >
> > I found that GCC would inline memcpy() and the static initialization would turn
> > into a very long unrolled loop with SSE instructions. I tested this with -O3
> > only in gccjit to force maximum optimization. I found this super strange
> > considering that -ftree-loop-distribute-patterns is enabled at -O3 and it should
> > recognize the naive_memcpy() function as an implementation of memcpy() and issue
> > calls to libc's implementation. Instead, it was inlining and unrolling it.
>
> OK, you confirm the suspicions I wrote about in the other mail!
>
> I've used your patch as a base; apart from minor changes here and there,
> I've stripped out the definitions of bzero and memcpy.
>
> I believe bzero is unnecessary given these are statically allocated.
>
> For memcpy we can just use the standard library implementation, given
> that elns are linked to it. The other advantage is that this way (here
> at least) memcpy is not inlined even at speed 3, so we don't fall into
> the optimizer issue!
>
> All in all, it is even a little faster than the stock patch and closer
> to the one with the specific GCC blob support.
>
> Let me know if you like the attached and if it does the job for you too.
>
> Thanks
>
> Andrea
>
> --
> akrl <at> sdf.org
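For readers following the discussion of naive_memcpy() above, here is a sketch of the kind of hand-written copy loop being described. The actual definition from the patch is not shown in this thread, so the body below is an assumption; `check_same_result` is a hypothetical helper added only to compare the loop against libc's memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed shape of a hand-written byte copy loop; the real
   naive_memcpy from the patch is not shown in this thread.  */
static void *
naive_memcpy (void *dest, const void *src, size_t n)
{
  unsigned char *d = dest;
  const unsigned char *s = src;
  for (size_t i = 0; i < n; i++)
    d[i] = s[i];
  return dest;
}

/* Check that the loop and libc's memcpy produce the same bytes.  */
static void
check_same_result (void)
{
  static const char src[] = "static data for an eln";
  char a[sizeof src];
  char b[sizeof src];

  naive_memcpy (a, src, sizeof src);
  memcpy (b, src, sizeof src);
  assert (memcmp (a, b, sizeof src) == 0);
}
```

Both functions produce the same result; the difference discussed in the thread is only in what the optimizer emits. One way to look at that outside gccjit would be to compile a file like this with `gcc -O3 -S` and check whether the loop became a call to memcpy or inline vector code. The outcome may vary with the GCC version and target.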
This bug report was last modified 5 years and 39 days ago.
GNU bug tracking system
Copyright (C) 1999 Darren O. Benham,
1997,2003 nCipher Corporation Ltd,
1994-97 Ian Jackson.