The 3rd surrogate-pair test fails under Cygwin:

> # Also test whether a surrogate-pair in the search string works.

It fails at grep-3.7 and at the latest commit. It reproduces easily enough from the command line:

> printf '%s\n' "$(printf '\360\220\220\205')" >in
> LANG=en_US.utf8
> locale
> src/grep --file=in in

This reports a match under Linux but not under Cygwin. Tested with Cygwin64 on Windows 7 Home and Windows 10.

Comparing gdb sessions between the platforms, I noticed:

> linux:  sbclen = '\001' , '\377' , '\376' , "\377\377"
> cygwin: sbclen = '\001' , '\377' , '\376' , '\377'

in `dfa` (i.e. dfa.localeinfo.sbclen). Also this:

> linux:  enlistnew (cpp=0x, new=0x "\360\220\220\205") at dfa.c:3928
> cygwin: enlistnew (cpp=0x, new=0x "\360\355\260\205") at dfa.c:3928

Locale data is different for the same locale on the 2 systems. I investigated this further by breakpointing the code as it starts to compute sbclen[250], which is \376 under Linux but \377 under Cygwin.

I captured the gdb sessions using `script` and have attached them in the hope they are of some help. If your system rejects the tar.gz attachment I'll send them as plaintext in separate emails.

They compare best in a side-by-side diff highlighting changed characters. I find `tkdiff` good for this: from View choose "Show inline comparison (recursive)".

Uninteresting changes between the sessions are removed:

Automatic
- strip hex numbers (addresses usually) to plain 0x
- remove escape sequences (colouring &c.)
- probably other stuff

Specifics
- force matching locale names
- insert blank lines at linux:72 to line up return stmt
- split linux:100 to more easily see later args

Cheers ... Duncan.
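
A footnote on the two enlistnew strings above: a rough sketch in POSIX shell arithmetic of what the test bytes encode. It decodes \360\220\220\205 by hand, splits the code point into a UTF-16 surrogate pair, and encodes the low surrogate on its own as three UTF-8-style bytes. The variable names (cp, v, hi, lo) are just for illustration. The result, \355\260\205, matches the tail of the string seen under Cygwin, which would be consistent with a 16-bit wchar_t being involved. That is only an inference from the bytes, not something the gdb sessions establish.

```shell
# Dump the bytes written by the reproducer's inner printf (f0 90 90 85).
printf '\360\220\220\205' | od -An -tx1

# Decode the 4-byte UTF-8 sequence F0 90 90 85 into a code point.
cp=$(( ((0xF0 & 0x07) << 18) | ((0x90 & 0x3F) << 12) | ((0x90 & 0x3F) << 6) | (0x85 & 0x3F) ))
printf 'code point: U+%X\n' "$cp"            # U+10405

# Split the code point into a UTF-16 surrogate pair.
v=$(( cp - 0x10000 ))
hi=$(( 0xD800 + (v >> 10) ))
lo=$(( 0xDC00 + (v & 0x3FF) ))
printf 'surrogates: %X %X\n' "$hi" "$lo"     # D801 DC05

# Encode the low surrogate alone as a 3-byte sequence, shown in octal.
printf 'low surrogate bytes: \\%o\\%o\\%o\n' \
  $(( 0xE0 | (lo >> 12) )) \
  $(( 0x80 | ((lo >> 6) & 0x3F) )) \
  $(( 0x80 | (lo & 0x3F) ))                  # \355\260\205
```

Only the first byte (\360) is shared between the Linux and Cygwin strings; the remaining three bytes under Cygwin are exactly the low-surrogate bytes computed here.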