REBOL3 tracker
Ticket #0001167

Type: Bug   Status: tested   Date: 1-Aug-2009 00:08
Version: alpha 76   Category: Syntax   Submitted by: BrianH
Platform: All   Severity: minor   Priority: normal

Summary: to-word "a," needs to be an error
Description: The code TO-WORD "a," generates a word that can't be loaded. Commas should be screened for and should trigger an error.

Related tickets: #330, #733, #537
Example code
>> to-word "a b"
** Script error: contains invalid characters
>> to-word "1"
** Syntax error: invalid character in: "1"
>> to-word "a;"
** Syntax error: invalid character in: "a;"
>> to-word "a,"
== a,  ; should throw ** Syntax error: invalid character in: "a,"
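
The failure above is a round-trip problem: the word gets created, but its molded form cannot be loaded back. A minimal sketch of such a check follows. Note that round-trips? is a hypothetical helper name, not something from this ticket or from REBOL itself, and the sketch assumes R3 behaviour where TRY returns the error as an ordinary value.

```rebol
rebol []

; round-trips?: hypothetical helper (an assumption, not part of the ticket).
; Returns true only when the string converts to a word AND the molded
; word loads back as the same word.
round-trips?: func [s [string!] /local w r] [
    if error? try [w: to-word s] [return false]    ; rejected by TO-WORD
    if error? r: try [load mold w] [return false]  ; molded form not loadable
    to-logic all [word? :r  r = w]
]

print round-trips? "abc"
print round-trips? "a,"
```

For "a," the check should come out false either way: on a build with the bug, TO-WORD succeeds but LOAD fails, and on a build with the fix, TO-WORD itself raises the error.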

Assigned to: n/a   Fixed in: alpha 97   Last Update: 7-Feb-2010 21:33


Comments
(0001452)
Sunanda
1-Aug-2009 08:08

In the char range 0..255 there are a few other strings like "a," that TO-WORD accepts, but whose resulting words cannot be serialised and reloaded.

The full list I see is:
["^@a" "a^-" "a " "a#" "a#a" "a$" "a$a" "a%"
"a%a" "a," "a,a" "a" "a@" "a@a" "a\" "a\a"]

***

That list comes from testing the word-ness and loadability of each of
a?
?a
a?a
where a is "a" and ? is to-char n for n in 0..255

***

This is far fewer than in R2, where things like:
make word! "7a"
were permitted.

***

This code finds the above list:

rebol []
find-nonloadable-words: func [
    /local
    nlw
    s
    w
][
    nlw: copy []

    for n 0 255 1 [
        w: copy []
        append w rejoin ["" to-char n "a"]   ;; test as leading letter
        append w rejoin ["a" to-char n]      ;; test as trailing letter
        append w rejoin ["a" to-char n "a"]  ;; test as mid letter
        foreach c w [
            if not error? try [to-word c] [  ;; test only those that can be to-worded
                s: join c ": 999"
                if any [
                    error? try [unset? load s]  ;; Can we load it?
                    error? try [unset? do s]    ;; Can we do the assignment a: 999 ?
                    unset? do s                 ;; Did we get a value back?
                    999 <> do s                 ;; Was that value 999?
                ][
                    append nlw c  ;; any of the above: it's not a serialisable word
                ]
            ]
        ]
    ]
    return nlw
]

odd-words: find-nonloadable-words



(0001453)
Sunanda
1-Aug-2009 10:14

There is a similar issue with refinements (perhaps the same parsing issue internally):

f: func [/a,] [print 1] ;; odd refinement accepted when making a function
== make function! [[/a,][print 1]]

f/a, ;; cannot be used from console
** Syntax error: invalid "word" -- "a,"
** Near: (line 1) f/a,


f/a ;; has not just been ignored
** Script error: f has no refinement called a


type? first probe words-of :f ;; it is a refinement
[/a,]
== refinement!

words-of self ;; and it's in the words table
== [system f func a, print words-of]
(0001454)
BrianH
1-Aug-2009 17:17

"In the char range 0..255" - R3 syntax is UTF-8, not Latin-1. If you are going to be expanding beyond 0..127 into the multi-byte range, you should go all the way. At least test other space characters to make sure they are delimiters.
(0001455)
Sunanda
2-Aug-2009 21:03

Yes, you are right. In the specific code given, I really mean ASCII, not Latin-1. And I have not catered for generating multibyte chars.

Feel free to enhance the code, if it is of any benefit.
(0001982)
Carl
7-Feb-2010 00:28

Fixed this bug, but it's a fairly big change, so we'll need to test it well; however, it does speed up lexical analysis.

R3 *does not* allow delimiters beyond 127, so it is valid to use those characters within words, even if they are non-printing. This is consistent with the great range of characters, such as Chinese, that may not be displayable to any specific user. The only true form of the source is the UTF-8 itself, not its rendered appearance on any specific system or output device (e.g. a printer).
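
The point about code points beyond 127 can be probed with a small sketch like the one below. single-word? is a hypothetical helper, and the code points used (space, comma, no-break space, em space, ideographic space) are sample picks for illustration, not a list taken from this ticket. The sketch assumes that load/all returns a block of every scanned value and that TRY returns the error as an ordinary value, as in R3.

```rebol
rebol []

; single-word?: hypothetical helper (an assumption, not from the ticket).
; True when "a<char>b" scans as exactly one word, i.e. the char neither
; acted as a delimiter nor caused a syntax error.
single-word?: func [cp [integer!] /local v] [
    v: try [load/all rejoin ["a" to-char cp "b"]]
    to-logic all [block? :v  1 = length? v  word? first v]
]

; Sample code points (assumed picks): space, comma, no-break space,
; em space, ideographic space.
foreach cp [32 44 160 8195 12288] [
    print [cp single-word? cp]
]
```

If the behaviour described above holds, the space and comma entries should report false and the entries beyond 127 should report true, but that is something to confirm on the build under test rather than take from this sketch.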

Date               User     Field              Action    Change
7-Feb-2010 21:33   BrianH   Status             Modified  built => tested
7-Feb-2010 00:28   Carl     Comment : 0001982  Added     -
7-Feb-2010 00:22   Carl     Fixedin            Modified  => alpha 97
7-Feb-2010 00:22   Carl     Status             Modified  reviewed => built
3-Aug-2009 19:26   carl     Status             Modified  submitted => reviewed
2-Aug-2009 21:03   sunanda  Comment : 0001455  Added     -
1-Aug-2009 17:18   BrianH   Comment : 0001454  Modified  -
1-Aug-2009 17:17   BrianH   Comment : 0001454  Added     -
1-Aug-2009 10:14   sunanda  Comment : 0001453  Added     -
1-Aug-2009 08:08   sunanda  Comment : 0001452  Added     -
1-Aug-2009 00:11   BrianH   Code               Modified  -
1-Aug-2009 00:10   BrianH   Description        Modified  -
1-Aug-2009 00:10   BrianH   Code               Modified  -
1-Aug-2009 00:08   BrianH   Ticket             Added     -