Issue224

classification
Title byte compiler problem: miscompiling emacs-w3m
Type defect Module core code 21.5
Severity some work obstructed Platform N/A
Keywords
explanation
process
Status committed   Reason fixed
Superseder   Submitted 2008-01-08.18:07:50
Priority normal   Assigned To
Nosy List acs  

Created on 2008-01-19.06:43:17 by mfabian, last changed 2008-05-13.13:15:59 by aidan.

Messages
msg749 [hidden] ([hidden]) Date: 2008-05-13.13:15:59
  Message-ID: <1210684559.08.0.439478861114.issue224@xemacs.org>
As discussed,
http://hg.debian.org/hg/xemacs/xemacs?cs=e8f448f997ac fixes this.
msg388 [hidden] ([hidden]) Date: 2008-01-19.06:43:17
  Message-ID: <18317.10908.57631.419920@parhasard.net>
On the fifteenth of January, Stephen J. Turnbull wrote: 

 > Stephen J. Turnbull writes:
 >  > Aidan Kehoe writes:
 >  > 
 >  >  > Katsumi, Mike and Dieter: this change
 >  >  > http://hg.debian.org/hg/xemacs/xemacs?cs=e8f448f997ac 
 >  >  > fixes the problem for me. I'd appreciate confirmation of this from your end,
 >  >  > if you have the time to check. 
 >  > 
 >  > Aidan, would you please take the time to explain why this DTRTs?  A
 >  > good place would be as a comment in the code, and you should feel free
 >  > to point that out to a reviewer (probably me ;-) who asks for more
 >  > explanation.
 > 
 > Oh, I see; I gather this is a follow-on to "bind print-gensym to a cons".

Aye. Relatedly, I’ve just committed a fix to the bug I described in 
http://mid.gmane.org/18315.32323.858144.704976@parhasard.net , with a test
case; see http://hg.debian.org//hg/xemacs/xemacs/?cs=cacc942c0d0f . 

 > But
 >
 >  > The reason in this case is that byte compiler expertise is now a rare
 >  > commodity.  Even a small thing like explaining why that
 >  > print-gensym-alist interferes with byte compilation would help raise
 >  > consciousness (speaking for myself, of course, and possibly others).
 > 
 > still applies.  A note that it is conceptually part of that earlier
 > patch might be appropriate?

It’s hard to judge how much detail I need to go into, though. If I’m just
writing for you, then I can be a bit more terse and put a bit less effort
into it than if I’m writing for the general case, and writing for the
general case is hard, because I’m still learning this stuff myself. But I
*should* be writing for the general case, if other people are going to be
able to dive in.

Anyway, when it comes to that question in particular:

   #'gensym returns an uninterned symbol. This symbol is not in obarray [=
the global name->symbol table for elisp], and as such does not have a
canonical name. If nothing else refers to it, it will be garbage collected,
like most Lisp objects but unlike most symbols. See footnote [1] for example
code. 

   When serialising [printing] an interned symbol [= normally one in obarray,
with a canonical name], deciding on a representation for it is trivial; use
its canonical name. This means that the first time you encounter this symbol
on deserialising [reading], you intern the symbol [= create an entry for it
in obarray], and each subsequent time you encounter it, you look it up in
obarray and re-use it.

   Deciding on a representation for serialising an uninterned symbol is
harder. For one, it is entirely possible to create two distinct uninterned
symbols with the same name:

(let ((a (make-symbol "hi there"))
      (b (make-symbol "hi there")))
  (eq a b))
=> nil

So we can’t use only the name, if we’re to preserve the property that the
two are distinct. For another, this property (= that uninterned symbols with
the same name are distinct) is constantly used in Lisp programs--the gensym
counter, which tries to generate distinct names for distinct uninterned
symbols, is dumped, so when byte-compiling one file from a -vanilla start it
will have the same initial value as when byte-compiling another file also
from a -vanilla start. We want the #'gensym calls in Gnus to refer to
different objects than the #'gensym calls in AUCTeX, otherwise using both at
the same time will lead to subtle bugs.

   The CL macros make heavy use of #'gensym calls, and there’s nothing wrong
with that; #'gensym is the best way to create temporary variable names
without polluting the Lisp namespace, or indeed stepping on your own toes if
you’re a heavy macro user.

   The really old-school Emacs Lisp way to serialise [print] an uninterned
symbol was just to print its name. Then, at deserialisation [read] time, it
is interned [= inserted into obarray]. This sucks, and leads to subtle bugs
with large code bases.
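
A minimal sketch of that failure mode, for illustration only. The function
name and the symbol name "scratch-var" are made up for this example; the
calls themselves (make-symbol, prin1-to-string, read, intern) are standard.

```elisp
(defun demo-print-by-name-roundtrip ()
  "Round-trip an uninterned symbol through the printer using only its name.
Return a list (SAME-AS-ORIGINAL SAME-AS-INTERNED)."
  (let* ((print-gensym nil)                 ; print only the bare name
         (sym (make-symbol "scratch-var"))  ; uninterned; name is illustrative
         (text (prin1-to-string sym))       ; just the name, no #: prefix
         (back (read text)))                ; the reader interns that name
    (list (eq sym back)                     ; identity of the original is lost
          (eq back (intern "scratch-var"))))) ; we got the interned symbol instead

(demo-print-by-name-roundtrip)
=> (nil t)
```

The symbol that comes back is the interned one, so two distinct uninterned
symbols that happened to share a name would collapse into a single symbol
after a print/read cycle.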

   The slightly less old-school Emacs Lisp way is to bind print-gensym to
t. With that binding, within a single #'print [serialise] call, the first
time an uninterned Lisp symbol has to be printed, it’s printed as
#N=#:SYMBOL-NAME, where N is a counter, and SYMBOL-NAME is the name of the
symbol ("hi there" for my two examples above). The symbol and its
corresponding counter value are stored in a table for later retrieval. The
next time that same symbol is to be printed within that #'print call, the
code looks through the table to see if the symbol itself is in it;
it finds it, and instead of printing #N=#:SYMBOL-NAME, it just prints #N#.

The Lisp reader interprets #N=#:SYMBOL-NAME as a directive ‘create an
uninterned symbol with the name SYMBOL-NAME, and store N as its index in a
table of uninterned symbols specific to this top-level form’. It interprets
#N# as a directive ‘look up the Nth entry in the uninterned symbol table for
this form, and use that.’
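
A small sketch of the reader side, assuming a reader that understands the
#N= and #N# syntax described above. The function name and the symbol name
"tmp" are made up for this example.

```elisp
(defun demo-read-labelled-gensyms ()
  "Read one form containing a labelled gensym, a back-reference and a bare #:.
Return (LABEL-EQ-BACKREF LABEL-EQ-BARE LABEL-EQ-INTERNED)."
  (let* ((form (read "(#1=#:tmp #1# #:tmp)"))
         (a (nth 0 form))   ; created by #1=#:tmp
         (b (nth 1 form))   ; #1# looks label 1 up again
         (c (nth 2 form)))  ; a bare #:tmp makes a fresh uninterned symbol
    (list (eq a b)
          (eq a c)
          (eq a (intern "tmp")))))

(demo-read-labelled-gensyms)
=> (t nil nil)
```

The first two elements are the same uninterned symbol; the third shares the
name but not the identity, and none of them is the interned symbol tmp.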

So, for example, expanding a call to one of the CL macros gives this:

(let ((print-readably t)
      (print-gensym t))
  (cl-prettyexpand '(loop for i in '(a b c d e f g h)
                     do (message "hi there %S" i)))
  nil)

=> (block nil
     (let* ((#1=#:G32994 '(a b c d e f g h))
            (i nil))
       (while (consp #1#)
         (setq i (car #1#))
         (message "hi there %S" i)
         (setq #1# (cdr #1#)))
       nil))

   The table used by the printer is called print-gensym-alist. If
print-gensym is bound to t, this is reset on entry to and exit from all
printing functions--pretty much #'prin1-to-string, #'prin1, #'princ and
#'print. With print-gensym bound to t, you can’t byte-compile two different
functions and expect state stored in uninterned symbols to be preserved
across the function boundaries, because the byte compiler uses two separate
calls to #'prin1 when writing the functions to the byte compile output
buffer, and the index decided on when printing the first function is no
longer available when printing the second. 

   If print-gensym is bound to a cons, then print-gensym-alist is preserved
across calls to the printer functions. It’s not reset. So if you bound
print-gensym to a cons during the entire byte compilation process, you
could re-use state stored in a single uninterned symbol across function
boundaries, except that:

   The table used by the reader [deserialiser] is called Vread_objects (it’s
not visible to Lisp) and is reset with each top-level form encountered. So
with two functions and a single uninterned symbol, the use of the uninterned
symbol in the second function will provoke an error when the reader looks
through Vread_objects and doesn’t see any entry with the corresponding
index. 

   What’s actually done (now) by the byte compiler is it binds print-gensym
to a cons and print-gensym-alist to nil for each top-level form it
outputs. This preserves the identity of uninterned symbols within top level
forms, and does not preserve it across them. This suits the Lisp reader
quite well.
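
A rough sketch of that binding pattern, not the byte compiler’s actual code.
The function name is hypothetical, and print-gensym-alist and print-readably
are assumed to be the printer variables discussed above; on an Emacs that
lacks them, the bindings are simply ignored.

```elisp
(defun demo-print-top-level-forms (forms)
  "Return a string with each of FORMS printed as its own top-level form."
  (with-temp-buffer
    (dolist (form forms)
      (let ((print-gensym '(t))       ; a cons: the printer leaves the table alone
            (print-gensym-alist nil)  ; so start each form with an empty table
            (print-readably t))
        (prin1 form (current-buffer))
        (terpri (current-buffer))))
    (buffer-string)))

(demo-print-top-level-forms '((a) (b)))
=> "(a)\n(b)\n"
```

Because the table is rebound to nil for every form, a #N# reference can only
point back to a #N= earlier in the same top-level form, which matches the
per-form reset of the reader’s table.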

   GNU have gone and extended this syntax a little; they now serialise and
deserialise circular objects with it, as well as gensyms. So this:

(read
 (let ((thing '(1 2 3 4 5 5))
       (print-circle t))
   (setf (cddr thing) thing)
   (prin1-to-string thing)))

doesn’t error for them, and gives back a circular list. This is something we
should merge, though I hope people will not make active use of it.

Hope that helped a little--I’m not sure whether, if at all, it should be
committed to the documentation somewhere. 

[1] Code to demonstrate that an uninterned symbol will be garbage collected:

(setq box (make-weak-box (gensym)))
=> #<weak_box>

;; At this point, the only reference to the symbol that #'gensym returned is
;; by means of box, and since box is a weak box, references by means of it
;; do not count for the sake of garbage collection.

(weak-box-ref box)
=> G32999

(garbage-collect)
=> [value omitted]

(weak-box-ref box)
=> nil
msg387 [hidden] ([hidden]) Date: 2008-01-19.06:43:17
  Message-ID: <b4msl10vvzt.fsf@jpl.org>
>>>>> Aidan Kehoe <kehoea@parhasard.net> wrote:

> Katsumi, Mike and Dieter: this change
> http://hg.debian.org/hg/xemacs/xemacs?cs=e8f448f997ac
> fixes the problem for me. I’d appreciate confirmation of this from your end,
> if you have the time to check.

The patch fixed the problem perfectly.  Thank you.

msg386 [hidden] ([hidden]) Date: 2008-01-19.06:43:17
  Message-ID: <s3t3at0uvpx.fsf@magellan.suse.de>
Aidan Kehoe <kehoea@parhasard.net> wrote:

>  On the eleventh of January, Katsumi Yamaoka wrote: 
>
>  > >>>>> Mike FABIAN wrote:
>  > > Aidan Kehoe <kehoea@parhasard.net> wrote:
>  > 
>  > >> It seems that the stable version of emacs-w3m doesn't actually
>  > >> compile with XEmacs 21.5 anyway, independent of this problem; is
>  > >> SuSE using another version?
>  > 
>  > > We are using the latest CVS HEAD of emacs-w3m.
>  > 
>  > > The problem was reproducible with CVS HEAD from 20070717 and also with a
>  > > CVS HEAD from yesterday. CVS of emacs-w3m is here:
>  > 
>  > >     cvs -d :pserver:anonymous@cvs.namazu.org:/storage/cvsroot login
>  > >     (CVS password empty)
>  > >     cvs -d :pserver:anonymous@cvs.namazu.org:/storage/cvsroot co emacs-w3m
>  > 
>  > An extraction of emacs-w3m that causes exactly the same problem
>  > with the most recent XEmacs 21.5 is attached below.  There are
>  > a defsubst and two defcustom's.  After the byte compilation, the
>  > first defcustom has "#1=#FOO" and "#1#"s, but the second one
>  > has only "#1#"s.  Because of this, the elc file cannot be loaded.
>
> Katsumi, Mike and Dieter: this change
> http://hg.debian.org/hg/xemacs/xemacs?cs=e8f448f997ac 
> fixes the problem for me. I’d appreciate confirmation of this from your end,
> if you have the time to check. 

Yes, this fix works. Thank you very much!

Updated XEmacs rpm packages with this fix will be at

    http://download.opensuse.org/repositories/M17N/

in a few hours.
msg385 [hidden] ([hidden]) Date: 2008-01-19.06:43:17
  Message-ID: <18315.29135.523193.605739@parhasard.net>
On the eleventh of January, Katsumi Yamaoka wrote: 

 > >>>>> Mike FABIAN wrote:
 > > Aidan Kehoe <kehoea@parhasard.net> wrote:
 > 
 > >> It seems that the stable version of emacs-w3m doesn't actually
 > >> compile with XEmacs 21.5 anyway, independent of this problem; is
 > >> SuSE using another version?
 > 
 > > We are using the latest CVS HEAD of emacs-w3m.
 > 
 > > The problem was reproducible with CVS HEAD from 20070717 and also with a
 > > CVS HEAD from yesterday. CVS of emacs-w3m is here:
 > 
 > >     cvs -d :pserver:anonymous@cvs.namazu.org:/storage/cvsroot login
 > >     (CVS password empty)
 > >     cvs -d :pserver:anonymous@cvs.namazu.org:/storage/cvsroot co emacs-w3m
 > 
 > An extraction of emacs-w3m that causes exactly the same problem
 > with the most recent XEmacs 21.5 is attached below.  There are
 > a defsubst and two defcustom's.  After the byte compilation, the
 > first defcustom has "#1=#FOO" and "#1#"s, but the second one
 > has only "#1#"s.  Because of this, the elc file cannot be loaded.

Katsumi, Mike and Dieter: this change
http://hg.debian.org/hg/xemacs/xemacs?cs=e8f448f997ac 
fixes the problem for me. I’d appreciate confirmation of this from your end,
if you have the time to check.
msg384 [hidden] ([hidden]) Date: 2008-01-19.06:43:17
  Message-ID: <s3tir24kx6x.fsf@magellan.suse.de>
Recent versions of XEmacs apparently miscompile emacs-w3m.

For details please see:

https://bugzilla.novell.com/show_bug.cgi?id=352331

Reproducible like this:

    xemacs -q
    M-x w3m RET
    => Invalid read syntax: Undefined symbol label, 1

The problem disappears after deleting the .elc files of emacs-w3m:

    rm /usr/share/xemacs/site-packages/lisp/w3m/*.elc

The problem is there in a CVS checkout of XEmacs 21.5.28 as of
20071220.

Dieter Klünter reports that it does *not* happen with

   XEmacs 21.5  (beta28) "fuki" (+CVS-20071205)

i.e. apparently a change between 20071205 and 20071220 broke it.
History
Date                 User     Action  Args
2008-05-13 13:15:59  aidan    set     status: chatting -> committed
                                      severity: some work obstructed
                                      platform: + N/A
                                      nosy: + acs
                                      messages: + msg749
                                      module: + core code 21.5
                                      priority: normal
                                      assignedto: acs
                                      reason: fixed
                                      type: defect
2008-01-19 06:43:17  aidan    set     messages: + msg388
2008-01-19 06:43:17  yamaoka  set     messages: + msg387
2008-01-19 06:43:17  mfabian  set     messages: + msg386
2008-01-19 06:43:17  aidan    set     status: new -> chatting
                                      messages: + msg385
2008-01-19 06:43:17  mfabian  create