REBOL3 tracker
Ticket #0002105

Type: Issue   Status: submitted   Date: 16-Feb-2014 09:12
Version: r3 master   Category: Unspecified   Submitted by: fork
Platform: All   Severity: major   Priority: high

Summary: Ecology of Headers and Naming
Description: Hand-in-hand with the question of file extensions goes a related issue. With the rise of Red and Red/System... and with Rebmu poised to define the future of computing... the question of the "header ecology" is on our plate.

For the ecology to succeed, @DocKimbel is emphatic that Rebol must check for a valid Rebol header when running a file via DO. Today it does not. I can make a file containing the string PRINT "HELLO" and save it as test.reb. Then:

>> do %test.reb
Script: none Version: none Date: none
HELLO

What is generally agreed upon is that a Rebol-family system should be agnostic about the file extension. But enforcing a header--even an empty one--helps the file itself contain some metadata that travels with it, regardless of its tag in the filesystem.

Header detection is an issue. We know that requiring it to be the very first line of the file creates a problem if you want to write a file that the shell can dispatch, e.g.

#!/usr/local/bin/rebol -cs
Rebol []
print "HELLO"

Currently, passing this test merely means that some line in the file starts with "Rebol"; if one does, processing begins from there. This created a problem in my Draem web content dialect (now powering hostilefork.com), which allows the embedding of code samples:

Draem []
{The following is some *cool code*!}
[code rebol {
Rebol []
}]

Currently Red is using a similar pattern. I feel that accepting arbitrary data up to a line starting with whatever you're looking for is the wrong way to go. Instead, detection should be stricter and tolerate only the situations that exist in practice (such as skipping over lines starting with the # symbol).
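
To make the difference concrete, here is a minimal sketch in Python (deliberately not Rebol, so it is not mistaken for any interpreter's real loader). The names `find_header_loose` and `find_header_strict` are hypothetical, the header pattern is an assumption, and the strict rule (skip only blank lines and lines starting with #, then demand the header) is just one possible reading of the suggestion above, not what Rebol or Red actually implement.

```python
import re

def _header_re(name):
    # Assumed header shape: the header word, optional whitespace, then "["
    # at the very start of a line. Matching is case-insensitive.
    return re.compile(r"^" + re.escape(name) + r"\s*\[", re.IGNORECASE)

def find_header_loose(text, name="Rebol"):
    """Index of the first line anywhere in the text that starts a header, or None."""
    pattern = _header_re(name)
    for i, line in enumerate(text.splitlines()):
        if pattern.match(line):
            return i
    return None

def find_header_strict(text, name="Rebol"):
    """Like the loose search, but only blank lines and '#' lines may precede the header."""
    pattern = _header_re(name)
    for i, line in enumerate(text.splitlines()):
        if pattern.match(line):
            return i
        if line.strip() == "" or line.startswith("#"):
            continue  # tolerated preamble, e.g. a shebang line
        return None   # any other content before the header: reject
    return None

# The two sample files from the ticket text.
shell_script = '#!/usr/local/bin/rebol -cs\nRebol []\nprint "HELLO"\n'
draem_page = ('Draem []\n{The following is some *cool code*!}\n'
              '[code rebol {\nRebol []\n}]\n')
```

With these samples, both functions locate the header of the shell-dispatched script on its second line. The loose search also "finds" a Rebol header inside the Draem page (the embedded code sample), while the strict search returns None for it, which is the behavior argued for here.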

I'd like to propose a further point, that by convention, any file ending in .reb should also have a Rebol header. Similarly, any file ending in .red should have a Red header, any file ending in .reds should have a Red/System header, etc.

This raises the question of what file extension a data file should have if it is designed to be structural and LOADed in a generic way. Something analogous to .txt or .dat, for information that just happens to be compatible with Rebol's parser but is not specifically designed for execution or any particular schema?

In thinking about this issue, Ren (REadable Notation) came to mind:

https://github.com/humanistic/REN

Unlike other cases, the header in Ren is optional... allowing it to position against things like JSON. Luckily, the .ren file extension has no very popular existing use. From the web:

"REN files are Uncommon Files primarily associated with Renai Tsumyu Resshon Tsukuru 2 (Love Lesson Maker 2) (Enterpbrain). REN files are also associated with Possibly a Renamed File and FileViewPro."

I think the Ren proposal is great, and it's unfortunate that we don't have a lot of implementations of it. However--I've pointed out to @DocKimbel that if a user of Red wants an ANSI-C format library for manipulating Red files...Rebol *is* that library (plus evaluator, network protocols, etc.)

To sum up my proposal here:

* Rebol's DO should not accept any file or URL where the target lacks a Rebol[] header.

* If a file has a Rebol header, the ideal extension should be .reb (unenforced, just a rule of thumb). If the file has another type of header, it should be a different extension (again, no members of the ecology should enforce behavior based on extension).

* If there is no header and just a raw feed of Rebol data, then the go-to extension should be .ren (as opposed to .dat, .txt, .reb, or no extension)

* The content that a member of the ecology tolerates as a preamble before the header it seeks should be chosen such that no two members can accidentally disagree about which class of header a file has.
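
The naming points above can be sketched as an advisory check, again in Python as a neutral illustration. The table and the function names `suggested_extension` and `convention_note` are hypothetical; only the extension pairings come from this ticket. The check returns a note and never refuses a file, since the proposal is explicit that nothing should enforce behavior based on extension.

```python
import os

# Header word -> conventional extension, as proposed in this ticket.
# None stands for "no header at all" (a raw feed of loadable data).
SUGGESTED_EXTENSION = {
    "rebol": ".reb",
    "red": ".red",
    "red/system": ".reds",
    None: ".ren",
}

def suggested_extension(header_name):
    """Conventional extension for a header word, or None if the ticket names none."""
    key = header_name.lower() if header_name is not None else None
    return SUGGESTED_EXTENSION.get(key)

def convention_note(filename, header_name):
    """Return an advisory string when file name and header disagree, else None."""
    expected = suggested_extension(header_name)
    actual = os.path.splitext(filename)[1].lower()
    if expected is None:
        # Some other dialect (e.g. Draem): only comment if it borrows an
        # extension that the convention pairs with a different header.
        if actual in SUGGESTED_EXTENSION.values():
            return f"{filename}: extension {actual} is conventionally paired with another header"
        return None
    if actual != expected:
        return f"{filename}: header suggests {expected}, found {actual or 'no extension'}"
    return None
```

For example, `convention_note("test.reb", "Rebol")` returns None, while a headerless `test.reb` yields a note suggesting .ren.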

As for dealing with collisions in the ecology, one trick people have used is to piggyback on the DNS (for better or worse). If we went down that road, we might wind up with headers starting "com/hostilefork/draem []" instead of just "Draem []". I think that a more organic approach would be to come up with some kind of certified mode where additional information was put in the header, if you were concerned about collisions with things like "another Red/System". So perhaps:

Red/System [
TypeID: http://red-lang.org/red-system/
]
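
As a rough illustration of how a tool might use such a field, the Python sketch below pulls the header word and an optional TypeID out of header text and treats the pair as the file's identity. The function name `header_identity` is hypothetical, the regular expressions are a simplification (they do not handle nested blocks or a "]" inside a string), and treating the pair as the identity is my assumption about how a "certified mode" could work.

```python
import re

def header_identity(header_text):
    """Return (header word, TypeID or None) for text that begins at a header.

    Returns None if the text does not look like "Word [ ... ]".
    Simplification: the header body is taken to end at the first "]".
    """
    m = re.match(r"\s*([^\s\[]+)\s*\[(.*?)\]", header_text, re.DOTALL)
    if m is None:
        return None
    name, body = m.group(1), m.group(2)
    tid = re.search(r"^\s*TypeID:\s*(\S+)", body, re.MULTILINE | re.IGNORECASE)
    return (name.lower(), tid.group(1) if tid else None)

certified = header_identity(
    "Red/System [\nTypeID: http://red-lang.org/red-system/\n]"
)
plain = header_identity("Red/System []")
```

Here `certified` carries the URL as its second element and `plain` carries None, so a tool worried about "another Red/System" could tell a certified header from an uncertified one that merely shares the word.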

On my own repositories, as I go through and rename them to .reb (and fix up web links...eh) I'm following this strategy. Posting this here to solicit debate.

Assigned to: n/a   Fixed in: -   Last Update: 17-Feb-2014 21:01


Comments
(0004231)
fork
16-Feb-2014 12:24

An additional thing to note:

The more file extensions people decide to use for their custom dialect files, the less likely the syntax highlighters of the world are to recognize them by default.

It might be concluded that the best thing, therefore, is to end all files with the same extension. Yet just as Red isn't content to distinguish its files by the header alone and toss .red and .reds in favor of .reb, I don't want Rebmu files to end in .reb... or Draem files to end in .reb. The extension should belong to the user, for clarity in labeling files in the filesystem. Rebol's emphasis on putting the type information into the file itself makes this more feasible.

One interesting note regarding syntax highlighting: GitHub and other sites are becoming more "heuristics-based" and paying attention to the content to guess highlighting (instead of going by the extensions).

But again, I was only proposing a convention: if your header is not Rebol [], you should probably use a different extension. There would be no enforcement, so it would be a policy decision depending on your files and on which environments would (or would not, beyond your control) be syntax highlighting them.
(0004232)
abolka
16-Feb-2014 16:33

Related: #2091, which discusses the optionality of the `Rebol []` header.
(0004234)
BrianH
17-Feb-2014 21:01

You also need to consider multi-scripts, which can be all in one language or in multiple languages. As long as all the dialects use length-embedding and comparable header block syntax, there should be no categorical reason why we can't have SCRIPT? just skip past non-Rebol scripts. For fully syntax-compatible dialects, even script-in-a-block embedding should work.

Heterogeneous multi-scripts are one way to be able to make compiled extensions that both Rebol and Red can use. Even homogeneous multi-scripts are useful for extensions.

Date               User    Field              Action    Change
17-Feb-2014 21:02  BrianH  Comment : 0004234  Modified  -
17-Feb-2014 21:01  BrianH  Comment : 0004234  Added     -
16-Feb-2014 16:33  abolka  Comment : 0004232  Added     -
16-Feb-2014 12:24  Fork    Comment : 0004231  Added     -
16-Feb-2014 09:12  Fork    Ticket             Added     -