From debbugs-submit-bounces@debbugs.gnu.org Thu Sep 28 22:29:12 2023
From: Stefan Monnier
To: bug-gnu-emacs@gnu.org
Subject: Disassembling a regexp's bytecode
Date: Thu, 28 Sep 2023 22:28:16 -0400
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
X-Debbugs-Envelope-To: submit

--=-=-=
Content-Type: text/plain

Tags: patch

I'd like to add a function that lets us see a regexp's bytecode directly
from within Emacs (recompiling with REGEX_EMACS_DEBUG can be quite useful
in many cases, but it's much more invasive and it's often overkill).

The patch below is what I use currently, but clearly it's not ready for
`master`.
Before I try and clean it, I'd like to discuss some issues to figure
out how best to solve them:

- First, in order to easily use the same code between REGEX_EMACS_DEBUG
  and my new `re--describe-compiled`, I need to print sometimes to
  `stderr` and sometimes to a string, which I do using `open_memstream`.
  AFAIK `open_memstream` is not directly available in Windows (and
  maybe under some other Unixes either, tho it's in POSIX-2008, IIUC).
  Could someone help me get an `open_memstream` emulation working
  (maybe via gnulib)?

- I'm thinking of always providing this function.  Another option would
  be to do it under the control of a compilation flag, tho it doesn't
  seem worth adding a new flag just for that.  I guess we could
  reuse REGEX_EMACS_DEBUG (tho it's too invasive IMO), or
  ENABLE_CHECKING, but I'd rather just always offer the function.
  After all, it might encourage users to look more carefully at their
  regexps and maybe even to help us improve our regexp engine, who knows.


        Stefan


In GNU Emacs 30.0.50 (build 1, x86_64-pc-linux-gnu, X toolkit, cairo
 version 1.16.0, Xaw3d scroll bars) of 2023-09-16 built on pastel
Repository revision: 0954f127b8840bf843a2acfb18d2e18e526166e1
Repository branch: work
Windowing system distributor 'The X.Org Foundation', version 11.0.12101007
System Description: Debian GNU/Linux 12 (bookworm)

Configured using:
 'configure -C --enable-checking --enable-check-lisp-object-type
 --with-modules --with-cairo --with-tiff=ifavailable
 'CFLAGS=-Wall -g3 -Og -Wno-pointer-sign'
 PKG_CONFIG_PATH=/home/monnier/lib/pkgconfig'

--=-=-=
Content-Type: text/patch
Content-Disposition: attachment; filename=regexp.patch

diff --git a/src/regex-emacs.c b/src/regex-emacs.c
index e42c045bb86..bc26bb02dce 100644
--- a/src/regex-emacs.c
+++ b/src/regex-emacs.c
@@ -447,7 +447,7 @@ #define CHARSET_RANGE_TABLE_END(range_table, count) \
 # include "sysstdio.h"
 
 static void
-debug_putchar (int c)
+debug_putchar (FILE *stderr, int c)
 {
   if (c >= 32 && c <= 126)
     putc (c, stderr);
@@ -461,7 +461,7 @@ debug_putchar (int c)
 /* Print the fastmap in human-readable form.  */
 
 static void
-print_fastmap (char *fastmap)
+print_fastmap (FILE *stderr, char *fastmap)
 {
   bool was_a_range = false;
   int i = 0;
@@ -471,7 +471,7 @@ print_fastmap (char *fastmap)
       if (fastmap[i++])
         {
           was_a_range = false;
-          debug_putchar (i - 1);
+          debug_putchar (stderr, i - 1);
           while (i < (1 << BYTEWIDTH) && fastmap[i])
             {
               was_a_range = true;
@@ -479,8 +479,8 @@ print_fastmap (char *fastmap)
             }
           if (was_a_range)
             {
-              debug_putchar ('-');
-              debug_putchar (i - 1);
+              debug_putchar (stderr, '-');
+              debug_putchar (stderr, i - 1);
             }
         }
     }
@@ -492,7 +492,7 @@ print_fastmap (char *fastmap)
    the START pointer into it and ending just before the pointer END.  */
 
 static void
-print_partial_compiled_pattern (re_char *start, re_char *end)
+print_partial_compiled_pattern (FILE *stderr, re_char *start, re_char *end)
 {
   int mcnt, mcnt2;
   re_char *p = start;
@@ -524,8 +524,8 @@ print_partial_compiled_pattern (re_char *start, re_char *end)
           fprintf (stderr, "/exactn/%d", mcnt);
           do
             {
-              debug_putchar ('/');
-              debug_putchar (*p++);
+              debug_putchar (stderr, '/');
+              debug_putchar (stderr, *p++);
             }
           while (--mcnt);
           break;
@@ -567,26 +567,26 @@ print_partial_compiled_pattern (re_char *start, re_char *end)
                 /* Are we starting a range?  */
                 if (last + 1 == c && ! in_range)
                   {
-                    debug_putchar ('-');
+                    debug_putchar (stderr, '-');
                     in_range = true;
                   }
                 /* Have we broken a range?  */
                 else if (last + 1 != c && in_range)
                   {
-                    debug_putchar (last);
+                    debug_putchar (stderr, last);
                     in_range = false;
                   }
 
                 if (! in_range)
-                  debug_putchar (c);
+                  debug_putchar (stderr, c);
 
                 last = c;
               }
 
             if (in_range)
-              debug_putchar (last);
+              debug_putchar (stderr, last);
 
-            debug_putchar (']');
+            debug_putchar (stderr, ']');
 
             p += 1 + length;
 
@@ -737,28 +737,30 @@ print_partial_compiled_pattern (re_char *start, re_char *end)
 }
 
 
-static void
-print_compiled_pattern (struct re_pattern_buffer *bufp)
+void
+print_compiled_pattern (FILE *dest, struct re_pattern_buffer *bufp)
 {
   re_char *buffer = bufp->buffer;
 
-  print_partial_compiled_pattern (buffer, buffer + bufp->used);
-  fprintf (stderr, "%td bytes used/%td bytes allocated.\n",
+  print_partial_compiled_pattern (dest, buffer, buffer + bufp->used);
+  fprintf (dest, "%td bytes used/%td bytes allocated.\n",
            bufp->used, bufp->allocated);
 
   if (bufp->fastmap_accurate && bufp->fastmap)
     {
-      fputs ("fastmap: ", stderr);
-      print_fastmap (bufp->fastmap);
+      fputs ("fastmap: ", dest);
+      print_fastmap (dest, bufp->fastmap);
     }
 
-  fprintf (stderr, "re_nsub: %td\t", bufp->re_nsub);
-  fprintf (stderr, "regs_alloc: %d\t", bufp->regs_allocated);
-  fprintf (stderr, "can_be_null: %d\n", bufp->can_be_null);
+  fprintf (dest, "re_nsub: %td\t", bufp->re_nsub);
+  fprintf (dest, "regs_alloc: %d\t", bufp->regs_allocated);
+  fprintf (dest, "can_be_null: %d\n", bufp->can_be_null);
   /* Perhaps we should print the translate table?  */
 }
 
 
+#ifdef REGEX_EMACS_DEBUG
+
 static void
 print_double_string (re_char *where, re_char *string1, ptrdiff_t size1,
                      re_char *string2, ptrdiff_t size2)
@@ -771,17 +773,15 @@ print_double_string (re_char *where, re_char *string1, ptrdiff_t size1,
       if (FIRST_STRING_P (where))
         {
           for (i = 0; i < string1 + size1 - where; i++)
-            debug_putchar (where[i]);
+            debug_putchar (stderr, where[i]);
           where = string2;
         }
 
       for (i = 0; i < string2 + size2 - where; i++)
-        debug_putchar (where[i]);
+        debug_putchar (stderr, where[i]);
     }
 }
 
-#ifdef REGEX_EMACS_DEBUG
-
 static int regex_emacs_debug = -10000;
 
 # define DEBUG_STATEMENT(e) e
@@ -789,7 +789,7 @@ print_double_string (re_char *where, re_char *string1, ptrdiff_t size1,
   if (regex_emacs_debug > 0) fprintf (stderr, __VA_ARGS__)
 # define DEBUG_COMPILES_ARGUMENTS
 # define DEBUG_PRINT_COMPILED_PATTERN(p, s, e) \
-  if (regex_emacs_debug > 0) print_partial_compiled_pattern (s, e)
+  if (regex_emacs_debug > 0) print_partial_compiled_pattern (stderr, s, e)
 # define DEBUG_PRINT_DOUBLE_STRING(w, s1, sz1, s2, sz2) \
   if (regex_emacs_debug > 0) print_double_string (w, s1, sz1, s2, sz2)
 
@@ -1769,7 +1769,7 @@ regex_compile (re_char *pattern, ptrdiff_t size,
   if (regex_emacs_debug > 0)
     {
       for (ptrdiff_t debug_count = 0; debug_count < size; debug_count++)
-        debug_putchar (pattern[debug_count]);
+        debug_putchar (stderr, pattern[debug_count]);
       putc ('\n', stderr);
     }
 #endif
@@ -2700,7 +2700,7 @@ regex_compile (re_char *pattern, ptrdiff_t size,
     {
       re_compile_fastmap (bufp);
       DEBUG_PRINT ("\nCompiled pattern:\n");
-      print_compiled_pattern (bufp);
+      print_compiled_pattern (stderr, bufp);
     }
   regex_emacs_debug--;
 #endif
diff --git a/src/regex-emacs.h b/src/regex-emacs.h
index bc357633135..e355cd30eb0 100644
--- a/src/regex-emacs.h
+++ b/src/regex-emacs.h
@@ -195,4 +195,6 @@ #define EMACS_REGEX_H 1
 extern re_wctype_t re_wctype_parse (const unsigned char **strp,
                                     ptrdiff_t limit);
 
+extern void print_compiled_pattern (FILE *dest, struct re_pattern_buffer *bufp);
+
 #endif /* EMACS_REGEX_H */
diff --git a/src/search.c b/src/search.c
index 3d86b24c2b5..ed8115d0c54 100644
--- a/src/search.c
+++ b/src/search.c
@@ -115,8 +115,8 @@ compile_pattern_1 (struct regexp_cache *cp, Lisp_Object pattern,
   else
     cp->f_whitespace_regexp = Qnil;
 
-  whitespace_regexp = STRINGP (Vsearch_spaces_regexp) ?
-    SSDATA (Vsearch_spaces_regexp) : NULL;
+  whitespace_regexp = STRINGP (Vsearch_spaces_regexp)
+                      ? SSDATA (Vsearch_spaces_regexp) : NULL;
 
   val = (char *) re_compile_pattern (SSDATA (pattern), SBYTES (pattern),
                                      posix, whitespace_regexp, &cp->buf);
@@ -3385,6 +3385,30 @@ DEFUN ("newline-cache-check", Fnewline_cache_check, Snewline_cache_check,
   set_buffer_internal_1 (old);
   return val;
 }
+
+DEFUN ("re--describe-compiled", Fre__describe_compiled, Sre__describe_compiled,
+       1, 1, 0,
+       doc: /* Return a string describing the compiled form of REGEXP.  */)
+  (Lisp_Object regexp)
+{
+  struct regexp_cache *cache_entry
+    = compile_pattern (regexp, NULL,
+                       (!NILP (BVAR (current_buffer, case_fold_search))
+                        ? BVAR (current_buffer, case_canon_table) : Qnil),
+                       false,
+                       !NILP (BVAR (current_buffer,
+                                    enable_multibyte_characters)));
+  char *buffer = NULL;
+  size_t size = 0;
+  FILE* f = open_memstream (&buffer, &size);
+  if (!f)
+    report_file_error ("open_memstream failed", regexp);
+  print_compiled_pattern (f, &cache_entry->buf);
+  fclose (f);
+  if (!buffer)
+    return Qnil;
+  return make_unibyte_string (buffer, size);
+}
 
 static void syms_of_search_for_pdumper (void);
 
@@ -3464,6 +3488,7 @@ syms_of_search (void)
   defsubr (&Smatch_data__translate);
   defsubr (&Sregexp_quote);
   defsubr (&Snewline_cache_check);
+  defsubr (&Sre__describe_compiled);
 
   pdumper_do_now_and_after_load (syms_of_search_for_pdumper);
 }

--=-=-=--

From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 29 11:07:13 2023
Date: Fri, 29 Sep 2023 18:06:13 +0300
Message-Id: <83bkdlyq6y.fsf@gnu.org>
From: Eli Zaretskii
To: Stefan Monnier
In-Reply-To: (bug-gnu-emacs@gnu.org)
Subject: Re: bug#66261: Disassembling a regexp's bytecode
Cc: 66261@debbugs.gnu.org
X-Debbugs-Envelope-To: 66261

> Date: Thu, 28 Sep 2023 22:28:16 -0400
> From: Stefan Monnier via "Bug reports for GNU Emacs,
>  the Swiss army knife of text editors"
>
> - First, in order to easily use the same code between REGEX_EMACS_DEBUG
>   and my new `re--describe-compiled`, I need to print sometimes to
>   `stderr` and sometimes to a string, which I do using `open_memstream`.
>   AFAIK `open_memstream` is not directly available in Windows (and
>   maybe under some other Unixes either, tho it's in POSIX-2008, IIUC).
>   Could someone help me get an `open_memstream` emulation working
>   (maybe via gnulib)?

Gnulib doesn't have such an emulation, AFAICT.

Why cannot you fall back to temporary files when open_memstream is not
available?

> - I'm thinking of always providing this function.  Another option would
>   be to do it under the control of a compilation flag, tho it doesn't
>   seem worth adding a new flag just for that.  I guess we could
>   reuse REGEX_EMACS_DEBUG (tho it's too invasive IMO), or
>   ENABLE_CHECKING, but I'd rather just always offer the function.
>   After all, it might encourage users to look more carefully at their
>   regexps and maybe even to help us improve our regexp engine, who knows.

I would suggest to have it under ENABLE_CHECKING first, and only
remove the condition if there's a demand.  (I assume that most people
who debug regexps build Emacs with --enable-checking.)
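
For illustration only, the temporary-file fallback suggested in the message above might look roughly like the following C sketch.  It is not code from the patch or from Emacs: `printer_fn`, `capture_via_tmpfile` and `print_demo` are hypothetical names, and `print_demo` merely stands in for a printer such as `print_compiled_pattern` that takes a `FILE *` destination.  It relies only on standard C (`tmpfile`, `ftell`, `rewind`, `fread`), and `tmpfile` itself can fail on some systems, so callers would still need to handle a NULL result.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: print a description of ARG to DEST.  */
typedef void (*printer_fn) (FILE *dest, const void *arg);

/* Run PRINT against a temporary file and return everything it wrote
   as a malloc'd, NUL-terminated string, or NULL on failure.  If LEN is
   non-NULL, store the number of bytes written there.  */
static char *
capture_via_tmpfile (printer_fn print, const void *arg, size_t *len)
{
  assert (print != NULL);

  FILE *f = tmpfile ();
  if (!f)
    return NULL;

  print (f, arg);

  /* The file position after writing is the size of the output.  */
  long end = ftell (f);
  if (end < 0)
    {
      fclose (f);
      return NULL;
    }
  rewind (f);

  char *buf = malloc ((size_t) end + 1);
  if (!buf)
    {
      fclose (f);
      return NULL;
    }

  size_t n = fread (buf, 1, (size_t) end, f);
  fclose (f);			/* Closing also deletes the temporary file.  */
  if (n != (size_t) end)
    {
      free (buf);
      return NULL;
    }

  buf[n] = '\0';
  if (len)
    *len = n;
  return buf;
}

/* Toy printer standing in for a real one (an assumption for the demo).  */
static void
print_demo (FILE *dest, const void *arg)
{
  fprintf (dest, "re_nsub: %d\n", *(const int *) arg);
}
```

A caller would invoke `capture_via_tmpfile (print_demo, &nsub, &len)` and `free` the result once it has been copied into a Lisp string.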

From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 29 11:48:33 2023
From: Stefan Monnier
To: Eli Zaretskii
Subject: Re: bug#66261: Disassembling a regexp's bytecode
In-Reply-To: <83bkdlyq6y.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 29 Sep 2023 18:06:13 +0300")
References: <83bkdlyq6y.fsf@gnu.org>
Date: Fri, 29 Sep 2023 11:47:50 -0400
Cc: 66261@debbugs.gnu.org
X-Debbugs-Envelope-To: 66261

>> - First, in order to easily use the same code between REGEX_EMACS_DEBUG
>>   and my new `re--describe-compiled`, I need to print sometimes to
>>   `stderr` and sometimes to a string, which I do using `open_memstream`.
>>   AFAIK `open_memstream` is not directly available in Windows (and
>>   maybe under some other Unixes either, tho it's in POSIX-2008, IIUC).
>>   Could someone help me get an `open_memstream` emulation working
>>   (maybe via gnulib)?
> Gnulib doesn't have such an emulation, AFAICT.
> Why cannot you fall back to temporary files when open_memstream is not
> available?

Doesn't seem worth the trouble (and I must admit that the idea of using
a temp file hurts my sense of aesthetics, on top of it 😀.  Tho, it'd be
OK if it were done for me by gnulib).

>> - I'm thinking of always providing this function.  Another option would
>>   be to do it under the control of a compilation flag, tho it doesn't
>>   seem worth adding a new flag just for that.  I guess we could
>>   reuse REGEX_EMACS_DEBUG (tho it's too invasive IMO), or
>>   ENABLE_CHECKING, but I'd rather just always offer the function.
>>   After all, it might encourage users to look more carefully at their
>>   regexps and maybe even to help us improve our regexp engine, who knows.
>
> I would suggest to have it under ENABLE_CHECKING first, and only
> remove the condition if there's a demand.  (I assume that most people
> who debug regexps build Emacs with --enable-checking.)

OK, I'll make it conditional on ENABLE_CHECKING as well as on the
presence of `open_memstream`.


        Stefan


From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 29 12:25:26 2023
Date: Fri, 29 Sep 2023 19:24:39 +0300
Message-Id: <83msx5x7zs.fsf@gnu.org>
From: Eli Zaretskii
To: Stefan Monnier
In-Reply-To: (message from Stefan Monnier on Fri, 29 Sep 2023 11:47:50 -0400)
Subject: Re: bug#66261: Disassembling a regexp's bytecode
References: <83bkdlyq6y.fsf@gnu.org>
Cc: 66261@debbugs.gnu.org
X-Debbugs-Envelope-To: 66261

> From: Stefan Monnier
> Cc: 66261@debbugs.gnu.org
> Date: Fri, 29 Sep 2023 11:47:50 -0400
>
> >> - First, in order to easily use the same code between REGEX_EMACS_DEBUG
> >>   and my new `re--describe-compiled`, I need to print sometimes to
> >>   `stderr` and sometimes to a string, which I do using `open_memstream`.
> >>   AFAIK `open_memstream` is not directly available in Windows (and
> >>   maybe under some other Unixes either, tho it's in POSIX-2008, IIUC).
> >>   Could someone help me get an `open_memstream` emulation working
> >>   (maybe via gnulib)?
> > Gnulib doesn't have such an emulation, AFAICT.
> > Why cannot you fall back to temporary files when open_memstream is not
> > available?
>
> Doesn't seem worth the trouble (and I must admit that the idea of using
> a temp file hurts my sense of aesthetics, on top of it 😀.  Tho, it'd be
> OK if it were done for me by gnulib).

Then just let it write to stderr, it's okay to do that in
ENABLE_CHECKING code.
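
As a rough sketch of the approach suggested above (return a string when `open_memstream` exists, otherwise just print to `stderr`), the C fragment below shows the shape such code could take.  It is not the code that was eventually pushed: `describe_demo` and `print_demo` are hypothetical, and `HAVE_OPEN_MEMSTREAM` is an assumed configure-style macro name, approximated here by "anything that is not Windows".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: a real build system would define HAVE_OPEN_MEMSTREAM
   after probing for the function; this guess is only for the demo.  */
#if !defined HAVE_OPEN_MEMSTREAM && !defined _WIN32
# define HAVE_OPEN_MEMSTREAM 1
#endif

/* Toy printer standing in for one that takes a FILE * destination.  */
static void
print_demo (FILE *dest, int nsub)
{
  fprintf (dest, "re_nsub: %d\n", nsub);
}

/* With open_memstream: return the output as a malloc'd string and
   store its length in *LEN.  Without it: print to stderr, set *LEN
   to 0 and return NULL.  */
static char *
describe_demo (int nsub, size_t *len)
{
  assert (len != NULL);
#ifdef HAVE_OPEN_MEMSTREAM
  char *buffer = NULL;
  size_t size = 0;
  FILE *f = open_memstream (&buffer, &size);
  if (!f)
    {
      *len = 0;
      return NULL;
    }
  print_demo (f, nsub);
  /* BUFFER and SIZE are only guaranteed up to date after fflush or
     fclose; the buffer is NUL-terminated and must be freed.  */
  if (fclose (f) != 0)
    {
      free (buffer);
      *len = 0;
      return NULL;
    }
  *len = size;
  return buffer;
#else
  print_demo (stderr, nsub);
  *len = 0;
  return NULL;
#endif
}
```

Because the printer only ever sees a `FILE *`, the same printing code serves both branches; only the choice of destination differs.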

From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 29 12:34:02 2023
From: Stefan Monnier
To: Eli Zaretskii
Subject: Re: bug#66261: Disassembling a regexp's bytecode
In-Reply-To: <83msx5x7zs.fsf@gnu.org> (Eli Zaretskii's message of "Fri, 29 Sep 2023 19:24:39 +0300")
References: <83bkdlyq6y.fsf@gnu.org> <83msx5x7zs.fsf@gnu.org>
Date: Fri, 29 Sep 2023 12:33:36 -0400
Cc: 66261@debbugs.gnu.org
X-Debbugs-Envelope-To: 66261

>> Doesn't seem worth the trouble (and I must admit that the idea of using
>> a temp file hurts my sense of aesthetics, on top of it 😀.  Tho, it'd be
>> OK if it were done for me by gnulib).
>
> Then just let it write to stderr, it's okay to do that in
> ENABLE_CHECKING code.

Good idea.  It's best if I can get the string to ELisp (it's common for
Emacs's stderr to be discarded or hard to reach when started by the
desktop environment), but stderr is better than nothing, when
`open_memstream` is not available.

Thanks,

        Stefan


From debbugs-submit-bounces@debbugs.gnu.org Fri Sep 29 14:57:04 2023
From: Stefan Monnier
To: Eli Zaretskii
Subject: Re: bug#66261: Disassembling a regexp's bytecode
In-Reply-To: (Stefan Monnier's message of "Fri, 29 Sep 2023 12:33:36 -0400")
References: <83bkdlyq6y.fsf@gnu.org> <83msx5x7zs.fsf@gnu.org>
Date: Fri, 29 Sep 2023 14:56:34 -0400
Cc: 66261-done@debbugs.gnu.org
X-Debbugs-Envelope-To: 66261-done

Pushed to `master` with a short mention in search.texi.


        Stefan


From unknown Sun Aug 10 07:32:41 2025
Received: (at fakecontrol) by fakecontrolmessage;
To: internal_control@debbugs.gnu.org
From: Debbugs Internal Request
Subject: Internal Control
Message-Id: bug archived.
Date: Sat, 28 Oct 2023 11:24:11 +0000
User-Agent: Fakemail v42.6.9

# This is a fake control message.
#
# The action:
# bug archived.
thanks
# This fakemail brought to you by your local debbugs
# administrator