GNU bug report logs - #47885
[PATCH] org-table-import: Make it more smarter for interactive use

Previous Next

Package: org-mode;

Reported by: Utkarsh Singh <utkarsh190601 <at> gmail.com>

Date: Mon, 19 Apr 2021 04:44:02 UTC

Severity: normal

Tags: patch

Full log


View this message in rfc822 format

From: Utkarsh Singh <utkarsh190601 <at> gmail.com>
To: Nicolas Goaziou <mail <at> nicolasgoaziou.fr>
Cc: 47885 <at> debbugs.gnu.org
Subject: bug#47885: [PATCH] org-table-import: Make it more smarter for interactive use
Date: Tue, 20 Apr 2021 22:45:22 +0530
Hi,

On 2021-04-20, 15:40 +0200, Nicolas Goaziou <mail <at> nicolasgoaziou.fr> wrote:

> For the problem we're trying to solve, this sounds like over-engineering
> to me. Do we want so badly to guess a separator?

Earlier I took is as an assignment to learn Elisp but now I don't think
we should increase complexity this much.

> Thinking again about it, this needs extra care, as end0 might end up on
> an empty line. You tried to avoid this in your first function, but
> I think this was not sufficient either. Actually, beg0 could also start
> on an empty line.
>
> This needs to be tested extensively, but as a first approximation,
> I think `beg' needs to be defined as:
>
>   (save-excursion
>     (goto-char (min beg0 end0))
>     (skip-chars-forward " \t\n")
>     (if (eobp) (point) (line-beginning-position)))
>
> and `end' as
>
>   (save-excursion
>     (goto-char (max beg end0))
>     (skip-chars-backward " \t\n" beg)
>     (if (= beg (point)) (point) (line-end-position)))
>
> Then you need to bail out if beg = end.
>
>> 	 (sep-rexp '((","  "^[^\n,]+$")
>
> sep-rexp -> sep-regexp
>
>> 		     ("\t" "^[^\n\t]+$")
>> 		     (";"  "^[^\n;]+$")
>> 		     (":"  "^[^\n:]+$")
>> 		     (" "  "^\\([^'\"][^\n\s][^'\"]\\)+$")))
>
> At this point, I suggest to use `rx' macro instead.
>
> I suggest this (yes, I like pattern-matching, `car' and `cdr' are so
> 80's) instead:
>
>   (save-excursion
>     (goto-char beg)
>     (catch :found
>       (pcase-dolist (`(,sep ,regexp) sep-regexp)
>         (save-excursion
>           (unless (re-search-forward regexp end t)
>           (throw :found sep))))
>     nil))
>

Thanks! I was not aware of pcase-dolist function.

Function after doing the necessary changes:

(defun org-table-guess-separator (beg0 end0)
  "Guess separator for `org-table-convert-region' for region BEG0 to END0.

List of preferred separator:
comma, TAB, semicolon, colon or SPACE.

If region contains a line which doesn't contain the required
separator then discard the separator and search again using next
separator."
  (let* ((beg (save-excursion
                (goto-char (min beg0 end0))
                (skip-chars-forward " \t\n")
                (if (eobp) (point) (line-beginning-position))))
	 (end (save-excursion
                (goto-char (max beg end0))
                (skip-chars-backward " \t\n" beg)
                (if (= beg (point)) (point) (line-end-position))))
         (sep-regexp '((","  (rx bol (1+ (not (or ?\n ?,))) eol))
		       ("\t" (rx bol (1+ (not (or ?\n ?\t))) eol))
		       (";"  (rx bol (1+ (not (or ?\n ?\;))) eol))
		       (":"  (rx bol (1+ (not (or ?\n ?:))) eol))
		       (" "  (rx bol (1+ (not (or ?' ?\" ))
                                         (not (or ?\s ?\;))
                                         (not (or ?' ?\"))) eol))))
         sep)
    (unless (= beg end)
      (save-excursion
        (goto-char beg)
        (catch :found
          (pcase-dolist (`(,sep ,regexp) sep-regexp)
            (save-excursion
              (unless (re-search-forward (eval regexp) end t)
                (throw :found sep))))
          nil)))))

> Again all this needs to extensively tested, as there are a lot of
> dangers lurking around.

Summary of things that still requires a review:

+ Setting boundary right

+ When using SPACE as separator is it sufficient to check for all for
all non quoted SPACE's?

-- 
Utkarsh Singh
http://utkarshsingh.xyz




This bug report was last modified 4 years and 50 days ago.

Previous Next


GNU bug tracking system
Copyright (C) 1999 Darren O. Benham, 1997,2003 nCipher Corporation Ltd, 1994-97 Ian Jackson.