Type | Wish | Status | submitted | Date | 24-Mar-2015 09:45 |
---|---|---|---|---|---|
Version | r3 master | Category | Unspecified | Submitted by | fork |
Platform | All | Severity | minor | Priority | normal |
Summary | Support named HTML5 entities table in escaping using ^[entity] |
---|---|
Description |
HTML5 has formalized named entities for Unicode codepoints, about 1500 or so: http://dev.w3.org/html5/html-author/charref If you want to get `Foo ⊗ Bar`, it's certainly nicer to be able to type `{Foo ^[otimes] Bar}` than to have to dig up a table and find that it is `{Foo ^(2297) Bar}`. (It's also rather prettier than HTML's version of `Foo &otimes; Bar`, I think.) @MarkI pointed out that the original proposal to use ^(entity) would mean the entity `&ac;` (∾) would collide with the existing ^(ac), which is already a valid string escape that is not ^(223E). As ^{ is necessary for escaping braces in strings and ^< will be necessary for escaping in tags, the only remaining choice is ^[entity]. Back-of-the-envelope calculation: if you estimate 6 characters on average per entity name, plus 2 bytes for the UTF-16 codepoint, that is roughly 8 bytes for each of the 1500 or so entries, so it's going to be 12K-ish for the data, uncompressed. The data would likely compress well with the existing "paid for" compression routines already in the code. To claim "Unicode support", a feature like this would be very desirable...helping not only the writer, but all people coming down the line trying to read that code. Also, having this table built in would help anyone trying to process HTML, because they could parse out the entity name and then convert it to a character: `ch: none parse "&otimes;" [ "&" copy entity-name to ";" (ch: attempt [load combine [{'^[} entity-name {]}]]) ] either ch [ print [{Entity} entity-name {is equivalent to} ch] ] [ print [ if/only entity-name [{Entity} entity-name {is}] {not a valid HTML5 entity} ] ]` So there's an extra-super cool reason to be compatible and include the table. It may be desirable, however, to offer an API so that the reverse can be done...to turn a character into an entity name (if available). |
Example code |
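
The lookup the Description asks for (entity name to character, plus the reverse direction) can be sketched outside Rebol. This is only an illustration in Python, not the proposed `^[entity]` implementation: Python's standard `html.entities.html5` table stands in for the built-in table the ticket wants, and the helper names `entity_to_chars` and `chars_to_entity` are hypothetical. The "shortest name wins" rule for the reverse lookup is likewise an assumption, since several names can map to the same character.

```python
import re
from html.entities import html5  # stdlib HTML5 table, e.g. "otimes;" -> "\u2297"

# Forward table: entity name (no "&", no ";") -> character(s).
# html5 also lists legacy names without a trailing ";" (e.g. "amp");
# only the ";"-terminated forms are kept here.
NAME_TO_CHARS = {
    name[:-1]: chars for name, chars in html5.items() if name.endswith(";")
}

# Reverse table: character(s) -> one entity name.
# Assumption: prefer the shortest name, ties broken alphabetically.
CHARS_TO_NAME = {}
for _name in sorted(NAME_TO_CHARS, key=lambda n: (len(n), n)):
    CHARS_TO_NAME.setdefault(NAME_TO_CHARS[_name], _name)


def entity_to_chars(text):
    """Return the character(s) for a string like "&otimes;", or None."""
    match = re.fullmatch(r"&([A-Za-z0-9]+);", text)
    if match is None:
        return None
    return NAME_TO_CHARS.get(match.group(1))


def chars_to_entity(chars):
    """Return an entity name for the given character(s), or None."""
    return CHARS_TO_NAME.get(chars)
```

For example, `entity_to_chars("&otimes;")` returns `"⊗"` (U+2297), while an unknown name such as `"&nosuchentity;"` yields `None`, mirroring the `attempt` fallback in the Rebol snippet.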
Assigned to | n/a | Fixed in | - | Last Update | 24-Mar-2015 11:46 |
---|
Comments | |
---|---|
(0004615)
MarkI 24-Mar-2015 10:23 |
Aww ... so *close*. Sadly, this'll break compatibility: ^(ac) is already a valid string escape that is not ^(223E). |
Date | User | Field | Action | Change |
---|---|---|---|---|
24-Mar-2015 11:46 | fork | Summary | Modified | Support named HTML5 entities table in escaping => Support named HTML5 entities table in escaping using ^[entity] |
24-Mar-2015 11:46 | fork | Description | Modified | - |
24-Mar-2015 10:23 | MarkI | Comment : 0004615 | Added | - |
24-Mar-2015 10:12 | Fork | Description | Modified | - |
24-Mar-2015 09:45 | Fork | Ticket | Added | - |