This problem was identified in an email conversation between me and
Joris Huizer. I'm pasting the conversation below (with permission) to
describe the problem; skip to the bottom for a short summary.
=== Joris:
Hello AIS,
Just some detail that few will probably ever notice, but...
(I'm not sure whether I should post such at the newsgroup or just send
to you?)
I found that in ick version 0.28, invalid input gets silently accepted.
A minimal example:
REALLY TWISTED INPUT WON'T GET SPLATTERED
PLEASE GIVE UP
This input would get the following response in 0.27, as the first line
is missing a DO/PLEASE:
ICL017I DO YOU EXPECT ME TO FIGURE THIS OUT?
ON THE WAY TO 1
CORRECT SOURCE AND RESUBNIT
Not so in 0.28; instead, it compiles. The relevant generated C code is
as follows:
int ick_abstained[2] = {1, };
/*...*/
/* line 000 */
/* (null) */
ick_lineno = 2;
if (ick_roll(0) && !ICKABSTAINED(0)) {
    ick_lose(IE000, 2, "\n\
REALLY TWISTED INPUT WON'T GET SPLATTERED");
}
/* line 002 */
/* PLEASE GIVE UP */
ick_lineno = 3;
if (!ICKABSTAINED(1)) {
    if(ick_onespots) free(ick_onespots);
    if(ick_oneforget) free(ick_oneforget);
    exit(0);
}
The generated code for the first line is rather weird - why did it get
ick_roll(0)? Why did it end up initially abstained?
This is a bug I think. You can't defend parsing this as a comment, as
the required DO/PLEASE part of a statement is missing...
Similarly, consider the following input:
DON'T GIVE UP
REALLY TWISTED INPUT WON'T GET SPLATTERED
PLEASE GIVE UP
This would throw an error previously, but now it marks the first two
lines as one comment.
This isn't *really* a bug - you might defend this as correct (though
it changes semantics somewhat) because additional input after the
GIVE UP may correctly be interpreted as part of the statement,
rendering it a comment.
All in all, this breaks backwards compatibility (code compiling under
ick 0.28 may not compile in older ick versions)
Three patch files are attached:
- splat.patch just initialises the first with exechance = 100, getting
rid of the unexpected ick_roll(0).
- splat2a.patch reverts to previous behavior in the first case
(maintaining the second case of incompatibility, which, arguably,
fixed ick to do the right thing anyway).
- splat2.patch reverts to previous behavior in both cases, while
maintaining CREATE support.
Let me know what you think...
regards,
Joris
=== ais523:
The first bit (the missing exechance) is definitely a bug. Thanks for
spotting it! The issue seems to be that the required tuple is never
initialised at all, so it's getting random data.
The other case is more interesting. My interpretation of the
INTERCAL-72 spec is that invalid input is meant to get silently
accepted, and previous versions of C-INTERCAL were buggy in this
regard; so this was a deliberate change to improve conformance (CREATE
could work without this change, in theory, but see below). I don't
think E017 should ever come up just because the program couldn't be
parsed, because that's a compile-time syntax error, and those don't
happen in INTERCAL. (It still comes up for meshes over #65535, because
the spec specifically forbids those.)
However, the issue now becomes about whether junk after the end of a
command should cause that command to become a syntax error. The
argument in favour of this is that if it didn't, a huge number of
existing programs would now break; for instance, if I write
DON'T LOOK AT THIS: IT'S BROKEN
then the parser will parse this as a valid CREATable statement DON'T
LOOK AT THIS followed by the invalid character string : IT'S BROKEN
(this is invalid because a twospot needs a number after it). The
current implementation therefore invalidates the whole statement.
Under the other interpretation, there would be a non-abstained invalid
statement immediately after the abstained invalid statement, which
would break old comments and badly break backwards compatibility.
However, as you noted below, the current implementation breaks forward
compatibility; some situations which are legal now would have errored
in the past. This seems to me to be not as big an issue as breaking
backwards compatibility, although it's still undesirable.
One possible compromise solution would be for junk to invalidate the
previous statement if and only if it isn't a statement found in the
spec; this would be in keeping with the INTERCAL-72 statement "each
statement starts where the preceding one ends", which is quite
different from the common practice of statements starting at a DO.
CLC-INTERCAL appears to use the full improved semantics you suggested
without problems, but it comes at a huge runtime penalty, because each
command has to be reparsed whenever it's encountered.
If you can think of a way of implementing additional input as a
separate statement in C-INTERCAL without breaking old comments, please
let me know; I'd be happy to hear an improvement on the situation we
have currently. The junk after end of statement is an interesting
problem to deal with; as I said above, it's a deliberate choice, but I
can't see how to implement any other choice without acting even more
counterintuitively.
May I copy this conversation to alt.lang.intercal? It would be
interesting to see what others have to say about it.
=== Joris:
Sure, you can copy it there alright.
Hmm, on second thought, it seems correct for the first example
"program" to die with E000 indeed... shows again how careful one must
be when reading the INTERCAL docs.
I think the breaking of forward compatibility isn't really a problem
here, as this matches the spec better. :-)
I'm not sure what you meant about needing reparsing. It's quite
possibly a subtle problem I can't yet see, being kinda unfamiliar with
the CREATE stuff, but I sure don't think this deserves such a huge
overhead.
=== (end of conversation)
So in short: when you have a valid statement followed by junk, like DO
(1000) NEXT JUNK, should that be one invalid statement, or a valid
statement followed by an invalid statement? The first interpretation
can be counterintuitive in some cases, but so can the second. What
about a totally invalid statement like DON'T ABC : DEF? Is that an
abstained invalid statement, followed by a non-abstained invalid
statement? What if someone's CREATEd the ABC statement? How does
CLC-INTERCAL solve this problem (I think I know, but am not sure and
would appreciate an official answer)?
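To make the two readings concrete, here is a minimal C sketch. It is
not C-INTERCAL's parser: the table known_bodies, the function classify
and the flag split_junk are all hypothetical names invented for this
illustration, and the "grammar" is just a prefix match on a statement
body with its DO/PLEASE already removed. It also ignores abstention
and CREATE, so it says nothing about the DON'T ABC : DEF case.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of statement bodies the toy parser recognises.
   A real INTERCAL parser uses a grammar; this is only an illustration. */
static const char *const known_bodies[] = { "(1000) NEXT", "GIVE UP" };

/* Length of the known body that `body` starts with, or 0 if none matches. */
static size_t valid_prefix_len(const char *body)
{
    size_t i;
    for (i = 0; i < sizeof known_bodies / sizeof known_bodies[0]; i++) {
        size_t n = strlen(known_bodies[i]);
        if (strncmp(body, known_bodies[i], n) == 0)
            return n;
    }
    return 0;
}

/* Returns nonzero if only whitespace remains in s. */
static int only_space(const char *s)
{
    while (*s != '\0') {
        if (!isspace((unsigned char)*s))
            return 0;
        s++;
    }
    return 1;
}

struct split_result {
    int statements;   /* how many statements the text is treated as */
    int first_valid;  /* nonzero if the first of them is valid */
};

/* body: statement text with the DO/PLEASE prefix already removed.
   split_junk == 0: trailing junk invalidates the whole statement.
   split_junk != 0: trailing junk becomes a separate invalid statement. */
struct split_result classify(const char *body, int split_junk)
{
    struct split_result r = { 1, 0 };
    size_t n;

    assert(body != NULL);
    n = valid_prefix_len(body);
    if (n == 0)
        return r;              /* nothing recognisable: one invalid statement */
    if (only_space(body + n)) {
        r.first_valid = 1;     /* clean statement, no junk */
        return r;
    }
    if (split_junk) {
        r.statements = 2;      /* valid statement, then invalid junk */
        r.first_valid = 1;
    }
    return r;                  /* otherwise: one invalid statement */
}
```

With split_junk set to 0, classify("(1000) NEXT JUNK", 0) reports a
single invalid statement; with split_junk set to 1 it reports a valid
statement followed by an invalid one. The question above is which of
those two answers a compiler ought to give.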
Any opinions?
--
ais523