REBOL3 tracker
  0.9.12 beta
Ticket #0001466 User: anonymous

Type: Wish   Status: tested   Date: 7-Feb-2010 09:20
Version: alpha 96   Category: Mezzanine   Submitted by: BrianH
Platform: All   Severity: major   Priority: high

Summary: Support for compressed scripts/modules in LOAD and SAVE
Description: As requested by Carl here: http://www.rebol.net/r3blogs/0274.html

Compressed scripts and modules would save space on disk (or wherever the scripts are stored) and reduce disk or network overhead. Support can be added seamlessly to LOAD and SAVE, or be controlled by a SAVE /compress option. The compressed data could be either raw binary or a binary! value encoded in REBOL script syntax. The data should be UTF-8 encoded string data in REBOL's COMPRESS encoding, which will also allow the scripts to be generated by R2 if necessary.

Format of compressed scripts:
- A UTF-8 encoded REBOL header with a 'compress header option (see #1683)
- Optionally followed by some UTF-8 whitespace
- Followed by COMPRESS-encoded binary data, either as raw binary or a REBOL syntax binary! value.
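
The layout above can be sketched from outside REBOL. The Python below is an illustration only, not the mezzanine code. It assumes that REBOL's COMPRESS encoding is a zlib stream followed by the uncompressed length as a 4-byte little-endian integer, which is consistent with the example bytes in this ticket (they start with 78 9C and end with 15 00 00 00 for a 21-byte body). The names `rebol_compress` and `make_compressed_script` are hypothetical.

```python
import struct
import zlib

# Header text taken from the example output in this ticket.
HEADER = b"REBOL [\n    options: [compress]\n]\n"


def rebol_compress(data: bytes) -> bytes:
    """Assumed COMPRESS layout: zlib stream + 4-byte little-endian length."""
    return zlib.compress(data) + struct.pack("<I", len(data))


def make_compressed_script(source: str) -> bytes:
    """UTF-8 header with the compress option, then raw COMPRESS-encoded data."""
    return HEADER + rebol_compress(source.encode("utf-8"))


script = make_compressed_script('print "Hello World!"\n')
payload = script[len(HEADER):]
print(payload[:2].hex(), payload[-4:].hex())
```

The exact deflate bytes may differ from what REBOL emits, since different zlib builds and levels can produce different but equally valid streams.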

SAVE /compress method option, or 'compress header option:
- The method parameter can be one of the logic values true or false, or the word 'script. All other values are prohibited.
- A provided script header could have a 'compress header option. Or not.
- If there is no header provided for a compressed script, an otherwise empty header will be added.
- The /compress option should override any 'compress header option already in a provided script header.
- If the /compress method is 'script, SAVE will encode the compressed data as a base-64 binary! value in REBOL syntax ("script-encoded").
- There is no way to specify script-encoding in the header. You have to use SAVE /compress 'script.
- SAVE should set the 'compress header option in the saved header, or remove it if not compressed.
- SAVE shouldn't modify its arguments.
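
The method rules above can be modeled roughly as follows. This is a Python sketch for illustration, not the SAVE implementation: `save_compressed` is a hypothetical name, it only handles source text supplied without a header, and it assumes COMPRESS output is a zlib stream followed by a 4-byte little-endian length.

```python
import base64
import struct
import zlib

HEADER = b"REBOL [\n    options: [compress]\n]\n"


def save_compressed(source: str, method) -> bytes:
    """Hypothetical model of SAVE /compress for headerless source text.

    method must be True, False, or the string "script" (standing in for
    the word 'script); every other value is rejected.
    """
    if method is not True and method is not False and method != "script":
        raise ValueError("method must be True, False, or 'script'")
    body = source.encode("utf-8")
    if method is False:
        # Not compressed: no compress option, and no header is added.
        return body
    packed = zlib.compress(body) + struct.pack("<I", len(body))
    if method == "script":
        # Script-encoded: a base-64 binary! literal in REBOL syntax.
        return HEADER + b"64#{" + base64.b64encode(packed) + b"}"
    # Raw binary directly after the header.
    return HEADER + packed


encoded = save_compressed('print "Hello World!"\n', "script")
print(encoded.decode("ascii"))
```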

Gotchas and cheats:
- If the data is compressed then DECOMPRESS can figure it out. However, this is expensive, so the 'compress header option is still required.
- If the 'compress header option is set then the type of compression will be autodetected: first by trying to decompress the data as raw binary, then by decoding a REBOL binary! value and trying to decompress that.
- We can probably afford to ignore anything after the script-encoded compressed binary! value in the source.
- If the 'compress header option is set then the data had better be compressed, or else a 'script 'bad-press error should be thrown.
- A script-in-a-block can support a compressed script encoded as a binary! value, but not raw binary.
- If you LOAD/header a script-encoded compressed script, it won't automatically save again as script-encoded. You have to specify /compress 'script to SAVE.
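
The autodetection order described above can be sketched like this. It is a Python illustration under stated assumptions, not the loader's code: `rebol_decompress` and `decode_body` are hypothetical names, COMPRESS data is assumed to be a zlib stream plus a 4-byte little-endian length, and only base-64 binary! literals are recognized in the script-encoded branch.

```python
import base64
import re
import struct
import zlib


def rebol_decompress(data: bytes) -> bytes:
    """Assumed COMPRESS layout: zlib stream + 4-byte little-endian length."""
    if len(data) < 4:
        raise ValueError("too short to be compressed data")
    body = zlib.decompress(data[:-4])
    if struct.unpack("<I", data[-4:])[0] != len(body):
        raise ValueError("length trailer mismatch")
    return body


def decode_body(rest: bytes) -> str:
    """Decode what follows a header whose options include compress."""
    rest = rest.lstrip()  # optional whitespace after the header
    try:
        return rebol_decompress(rest).decode("utf-8")  # 1. raw binary
    except (zlib.error, ValueError):
        pass
    # 2. script-encoded; anything after the closing brace is ignored.
    match = re.match(rb"64#\{([A-Za-z0-9+/=\s]*)\}", rest)
    if match is None:
        raise ValueError("bad-press: compress option set, data not compressed")
    try:
        packed = base64.b64decode(match.group(1))
        return rebol_decompress(packed).decode("utf-8")
    except (zlib.error, ValueError) as err:
        raise ValueError("bad-press: data not compressed") from err


# Body of the LOAD example in this ticket, after its header.
sample = b" 64#{eJwrKMrMK1FQ8kjNyclXCM8vyklRVAIAS5IGzxQAAAA=}"
print(decode_body(sample))
```

Trying raw binary first keeps the common case cheap: a script-encoded body fails the zlib header check immediately, so the fallback costs little.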

There must continue to be no particular importance given to the file extension of the script. We must be able to determine from the data itself whether it is compressed. That way we can embed compressed scripts in the host code, or store them in databases or some other non-file containers.

Example code:
>> save/compress none [print "Hello World!"] true
== #{
5245424F4C205B0A202020206F7074696F6E733A205B636F6D70726573735D0A
5D0A789C2B28CACC2B5150F248CDC9C95708CF2FCA495154E20200526B06D915
000000
}

>> save/compress none [print "Hello World!"] 'script
== #{
5245424F4C205B0A202020206F7074696F6E733A205B636F6D70726573735D0A
5D0A3634237B654A77724B4D724D4B314651386B6A4E79636C58434D3876796B
6C52564F494341464A7242746B56414141417D
}

>> to-string save/compress none [print "Hello World!"] 'script
== {REBOL [
    options: [compress]
]
64#{eJwrKMrMK1FQ8kjNyclXCM8vyklRVOICAFJrBtkVAAAA}}

>> load "REBOL [options: [compress]] 64#{eJwrKMrMK1FQ8kjNyclXCM8vyklRVAIAS5IGzxQAAAA=}"
== [print "Hello World!"]

>> load/header "REBOL [options: [compress]] 64#{eJwrKMrMK1FQ8kjNyclXCM8vyklRVAIAS5IGzxQAAAA=}"
== [make object! [
        title: "Untitled"
        name: none
        type: none
        version: none
        date: none
        file: none
        author: none
        needs: none
        options: [compress]
        checksum: none
    ] print "Hello World!"]

Assigned to: n/a   Fixed in: alpha 109   Last Update: 21-Oct-2010 05:33


Comments
(0001989)
BrianH
7-Feb-2010 09:33

Implemented in mezz-load.r 6867.
(0002446)
BrianH
15-Jul-2010 00:26

Some notes:
- The 'string keyword could be changed to 'script, and perhaps should.
- Embedding compressed scripts/modules in the host code is not recommended, due to the overhead. However, those scripts will likely still need to be checked for compression, just in case.
- The same method can be used to support encrypted scripts, or other compression methods. There will be limits to autodetection though.
(0002450)
Carl
16-Jul-2010 00:08

Feedback:

1. Call the refinement just /compress ?

2. Makes sense on 'script change.

3. Compression ok in host code, but developer should be smart about the overhead.

4. My understanding from above is: SAVE LOAD sequence of a compressed script would produce a compressed output because header indicates compressed. Sounds fine.
(0002451)
BrianH
16-Jul-2010 03:43

1-3, done. For 4, true for SAVE/header LOAD/header, after a tweak to LOAD to make it restore the Compressed: 'script header for script-encoded binaries so they stay script-encoded unless you specify otherwise.

The new implementation is in mezz-load.r 7578. Tweaked the ticket description based on the changes.
(0002637)
BrianH
20-Oct-2010 12:50

Ticket description changed again based on the #1683 options header that made it into a108 and the revamped system model. The loader is implemented according to this description in a108, but SAVE is still using the old proposal behavior so it doesn't yet work. This ticket has been fully implemented in a109.

Date User Field Action Change
21-Oct-2010 05:34 BrianH Comment : 0002637 Modified -
21-Oct-2010 05:33 BrianH Status Modified pending => tested
21-Oct-2010 05:33 BrianH Fixedin Modified => alpha 109
20-Oct-2010 12:50 BrianH Comment : 0002637 Modified -
20-Oct-2010 12:50 BrianH Comment : 0002637 Added -
20-Oct-2010 12:47 BrianH Code Modified -
20-Oct-2010 12:47 BrianH Description Modified -
16-Jul-2010 03:56 BrianH Code Modified -
16-Jul-2010 03:56 BrianH Description Modified -
16-Jul-2010 03:43 BrianH Comment : 0002451 Modified -
16-Jul-2010 03:43 BrianH Comment : 0002451 Added -
16-Jul-2010 03:14 BrianH Code Modified -
16-Jul-2010 03:12 BrianH Description Modified -
16-Jul-2010 00:08 carl Comment : 0002450 Added -
15-Jul-2010 00:26 BrianH Comment : 0002446 Added -
15-Jul-2010 00:22 BrianH Description Modified -
7-Feb-2010 09:33 BrianH Comment : 0001989 Added -
7-Feb-2010 09:32 BrianH Status Modified submitted => pending
7-Feb-2010 09:32 BrianH Code Modified -
7-Feb-2010 09:32 BrianH Description Modified -
7-Feb-2010 09:20 BrianH Ticket Added -