REBOL3 tracker
Ticket #0001993

Type: Bug | Status: submitted | Date: 12-Mar-2013 20:53
Version: r3 master | Category: Native | Submitted by: BrianH
Platform: All | Severity: minor | Priority: normal

Summary: FOR start end bump behavior inconsistent, doesn't make sense
Description: The behavior of FOR with different start, end and bump values isn't consistent between start > end, start = end, and start < end, and doesn't necessarily make sense in any of those cases. We need a clean and consistent model for its behavior with those parameters, especially when start-vs-end and bump conflict.

The main thing that these 3 parameters decide is the direction of advancement, and for that matter to what extent the loop happens at all. And the main potential conflict is that both the bump and the relative values of start and end could decide this, but we need to decide which gets precedence.

If we decide that the bump is the primary factor for deciding advancement, then a positive bump would mean going forward, a negative bump should mean going back, and a zero bump (+0.0 and -0.0 too) should trigger an error because it would be by definition undefined behavior. The start and end parameters would only be paid attention to after the bump is considered, with the loop not happening at all if their relative direction is the opposite of the sign of the bump.
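As a rough illustration of the bump-first model, here is a minimal sketch in Python rather than Rebol. The function name `for_values_bump_first` is made up for this example, and it assumes two things the paragraph above doesn't spell out: that the end value is inclusive, and that the body never changes the loop variable.

```python
def for_values_bump_first(start, end, bump):
    """Values the loop variable would take under the bump-first model (sketch)."""
    if bump == 0:  # also true for +0.0 and -0.0
        raise ValueError("bump must be non-zero")
    values = []
    value = start
    if bump > 0:
        # Positive bump: go forward; nothing happens if start is already past end.
        while value <= end:
            values.append(value)
            value += bump
    else:
        # Negative bump: go backward; nothing happens if start is already below end.
        while value >= end:
            values.append(value)
            value += bump
    return values
```

Under these assumptions `for_values_bump_first(5, 1, 1)` gives an empty list, because the direction from start to end opposes the sign of the bump.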

If, on the other hand, we consider the relative values or indexes of the start and end positions to be the primary determinant of the direction, start < end would mean going forward, start > end would mean going backward, and start = end would mean not advancing at all (looping once and stopping). After that direction is set, the bump would be considered simply to set the velocity of the loop in that direction. If the direction is forward, bump > 0 would go forward, and bump <= 0 would not loop at all. If the direction is backward, bump < 0 would go backward, and bump >= 0 would not loop at all. For the start = end case, the bump would be ignored. No errors triggered.
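The start-vs-end-first model can be sketched the same way. This is only an illustration in Python, not a proposed implementation: the name `for_values_direction_first` is hypothetical, the end value is assumed to be inclusive, and the body is assumed not to change the loop variable.

```python
def for_values_direction_first(start, end, bump):
    """Values the loop variable would take under the start-vs-end-first model (sketch)."""
    if start == end:
        # Not advancing at all: loop once and stop, bump is ignored.
        return [start]
    values = []
    value = start
    if start < end:
        # Direction is forward; only a positive bump advances.
        if bump <= 0:
            return []
        while value <= end:
            values.append(value)
            value += bump
    else:
        # Direction is backward; only a negative bump advances.
        if bump >= 0:
            return []
        while value >= end:
            values.append(value)
            value += bump
    return values
```

Here `for_values_direction_first(1, 5, 0)` gives an empty list and no error is raised for any combination of arguments.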

Either of those models would make sense, and in practice the only difference between them is that bump=0 would trigger an error for the bump-first model, while bump=0 would just do nothing for the start-vs-end-first model.

Which should we choose?

Assigned to: n/a | Fixed in: - | Last Update: 18-Apr-2014 21:20


Comments
(0003612)
BrianH
12-Mar-2013 21:12

I'm leaning towards the start-vs-end-first model, because it makes just as much sense as the other model and it triggers fewer errors. If a good enough argument can be made that a set of behavior isn't an error, triggering an error seems rude. What do you think?
(0003613)
fork
12-Mar-2013 22:08

I've always felt FOR was a bit unimpressive, and when you start getting to this point where a Rebol function takes five arguments, the ugliness of the call starts to look inferior to other languages, instead of hitting the ball out of the park.

Rebol has LOOP (want to do something N times, don't care about which time it is within the block), REPEAT (want to do something N times, would like to know which time this is), and other things like WHILE and UNTIL. Those all look pretty cool to me.

But FOR seems just too ugly to be in Rebol, it has poor aesthetics IMO.

We all know that when a function takes twelve arguments etc. in Rebol you probably missed the boat and were supposed to make a dialect. This is one of those cases. I would propose that a proper solution to this problem would be the death of FOR (at least as a native, perhaps mezzanine) and creation of a new native based on dialecting. Perhaps call it EVERY:

EVERY x [1 11 bump 2] [print x]

Or whatever, I'm sure someone's tried this kind of thing, it could be different:

EVERY x [1 (2) 11] [print x]

But the key is that if you just said something simple like:

EVERY x [1 10] [print x]

The dialect would presume an integer +1 increment, or whatever.
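To make the idea concrete, here is a small sketch of how such a spec block might be interpreted, written in Python with the block modelled as a list. Everything here is hypothetical: the `every_values` name, the handling of only the `[1 10]` and `[1 11 bump 2]` forms, the inclusive end, and the choice to reject a zero bump are assumptions made for the example.

```python
def every_values(spec):
    """Expand a hypothetical EVERY spec such as [1, 10] or [1, 11, "bump", 2]."""
    if len(spec) == 2:
        start, end = spec
        bump = 1  # the presumed default increment
    elif len(spec) == 4 and spec[2] == "bump":
        start, end, _, bump = spec
    else:
        raise ValueError(f"unrecognized EVERY spec: {spec!r}")
    if bump == 0:
        # Assumption for this sketch only: a zero bump is rejected outright.
        raise ValueError("bump must be non-zero in this sketch")
    values = []
    value = start
    # Walk toward end in the direction of the bump, end inclusive.
    while (value <= end) if bump > 0 else (value >= end):
        values.append(value)
        value += bump
    return values
```

With this sketch, `every_values([1, 11, "bump", 2])` visits the odd numbers from 1 to 11 and `every_values([1, 10])` visits 1 through 10.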

I will make the observation that the dialect for specifying points in integer space vs. positions in a series, as per "slice", are rather obviously related.
(0003614)
BrianH
12-Mar-2013 23:04

Well, we can't change the number of arguments of FOR because it's a legacy function. Fortunately FOR and REPEAT take series arguments as well, so your observation at the end is already covered by them.

It would be a good idea to have a general-purpose dialected loop, with a powerful enough dialect that we have an answer to the Python fans who request list comprehensions. EVERY isn't a bad name for such a loop, much better than FOR even if FOR wasn't taken already. That dialect you specify for EVERY seems a little weak though; it doesn't add anything to what the other loops can already do, so it needs a bit of work before it would be worth doing. Rebol doesn't just do dialects for the sake of doing them; it does them when they are the best approach to the problem. Every dialect needs to justify its existence. Make a ticket for EVERY and we'll hammer it out.

On another note, what do you think about the actual subject of this ticket? Which of the two models is better?
(0003615)
Gregg
12-Mar-2013 23:41

Brian, just asking, is there a doc somewhere about what we are and aren't willing to break in R3? I know FOR is probably used by a lot of people, no matter how much we've discouraged it, but it may be worth breaking if we can make things much better going forward.

On the current topic, the two models you laid out are so close that I don't have a strong preference. My main concern is that a bump of zero doesn't lead to infinite loops. Between the two, I would choose the latter (start-end model), because of your comparison here. If a bump of zero means an infinite loop instead, I vote for the bump-error model.

On EVERY, I use that name the following way:

every: func [size value body /local interval] [
    if zero? value // size [
        interval: value / size
        do bind/copy body 'interval
    ]
]
comment {
    repeat i 50 [every 5 i [print [i interval]]]
}
(0003616)
BrianH
13-Mar-2013 01:04

The policy is mostly documented in the many places where I've mentioned the policy in CC, so many I don't even have a count. It was set by Carl back at the beginning, so he might have written a blog about it, I don't remember.

The basic themes are in #666 and #667. Overall, it's a matter of having to justify breaking things, and in many cases we can, especially when the old function was so broken that no one really uses it (#1973 for instance, #1972 maybe). In other cases, we change the name of the function when it was the name itself that was seriously broken (#1971). In most cases though, we just fix the internal semantics so that the external API will pretty much work the same, but better (like this one). In just a few cases, the API of the old function was so broken or so tied into the old R2 system model that we just had to revamp it even knowing that things would break (LOAD is the only one that comes to mind, with too many tickets to even list in #666). We even try to keep the order of function options the same when we can so it doesn't break APPLY calls, though if we need to remove an option that becomes less important.

Internal semantic changes are easier to justify than API changes, but only if they're improvements or fixes. Sometimes we have made some arguments take more or fewer types, to clean up their models - the rebalancing of none propagation, and the systemic binary conversions revamp, are good examples of those. In one case (issue!) we even changed the semantic model of a datatype from one type class to another, but it's possible to gloss over that quite a bit since those classes have more in common than they don't (it's on my todo list). In one notable case and likely a few more, we might want to revert a few changes or change them again better (at least before 3.0 when core semantics and naming of existing stuff gets standardized again).

If you are replacing an old R2 function with a new one, you need to both justify adding the new function, and separately justify dropping the old. The only reason I've seen so far for replacing a function with another function of the same name is if the old name was so bad that it was actively misleading, and we want to make sure the renaming sticks just for our developers' benefit (the only instance I can recall is #1972, and even that might just result in removing those function names altogether). More often we just add a new function with a new name. And for the most part we only remove an old function altogether if it no longer makes sense in the new system model (AS-STRING and AS-BINARY), or is basically hostile in the new model (ALIAS).

More often, legacy functions might become optional, exported from a module that you don't need to import if you don't want to. Modules are intended as the general answer to the question of "Do you need it?", the answer generally being "Yes, someone needs it, but maybe not by default, and maybe not built in."; modules are intended as the balance between /Command/View maximalists and /Base minimalists. All current functions are considered to be subject to that culling, especially if they're mezzanine, as long as they're not used to implement R3 itself.

One thing that you need to consider though is that natives don't add any overhead to Rebol when they're not used. Mezzanines have to be loaded and initialized at startup time, but natives just sit there. You can even reassign their words if some native that you don't like gets exported, though you might want to check before changing them in lib if they're being used internally. DO operations aren't keywords (of DO at least), they're all pretty much optional. This isn't PARSE, remember. So it's hard to justify removing a native from Rebol altogether when people are already using it, since you really don't need to, and R3 being open source means that you can do whatever you like with your own builds.

In the case of FOR, people actually use it when it makes sense to do so (like me, for instance). They didn't use it much in R2, but that wasn't because it was badly designed for its task, it was because it was badly implemented (slow and buggy). But even given that, they still used it, which showed that there was some merit to its design. That suggests that we keep the API for the most part, fix the ugly parts of the semantics, and optimize the heck out of it so people will actually want to use it.

And if someone wants to make an awesome dialected general loop, they'll have to give it a new name like EVERY, because FOR is taken by something that works. Or they can just make their own add-on module and name it whatever they like, their choice.
(0003617)
Gregg
13-Mar-2013 01:21

Great. Thanks. Can we put that somewhere on a prominent doc wiki page for future reference and refinement?
(0003618)
Ladislav
13-Mar-2013 01:29

"But FOR seems just too ugly" - yes, I often felt that way
(0003626)
fork
13-Mar-2013 15:17

I will add that the sooner we get started on R3/Backward, the sooner we can worry less about compatibility issues, as if we are to be realistic at all we must realize that R2 code just won't run in R3. I will take the idea of being nervous about improving FOR seriously as soon as SHIFT reverts to the R2 interpretation of sign.

(I don't think it should, I'm just sayin'...point of no return already reached, so let's learn how to manage it elegantly.)

So make an R3/Backward implementation of FOR that does whatever you decide here, and replace FOR with something more like Ladislav's "CFOR" (if not exactly Ladislav's CFOR). How's that?

http://curecode.org/rebol3/ticket.rsp?id=884
(0003627)
BrianH
13-Mar-2013 19:13

Ladislav, thanks for writing all of that code in AltME, it will help with the tests.

As Fork says, this might be a matter for R3/Backward and rebol-patches, but it still needs doing.
(0003629)
Gregg
13-Mar-2013 20:02

Ladislav, thanks for writing things so clearly, with examples and explanations for each. It's very helpful.
(0003631)
BrianH
13-Mar-2013 20:44

Semantically, what is the difference? How are your tests any different from the behavior of the start-vs-end primacy model? If you have to explain in code, go ahead. The important thing is how it will behave differently.

The model is only important in being an explanation of why something behaves the way it does, the meaning of the behavior. Simply saying what it does is not an explanation, it's a description. Your code is a great description of what behavior to expect, in simple and extensive enough terms that it can be translated to unit test vectors, and that is definitely needed. However, an explanation that can't be stated in two sentences or less needs rethinking. I had to actually go through the code and compare test cases to get an understanding of your model, because your explanation didn't pass the doesn't-know-Rebol-newbie-programmer test. If you need to be a CS major to understand a model then that model won't work, even if it accurately results in the behavior we want.

If there is any behavioral difference between your above tests and what I was describing as being the resultant behavior of the start-vs-end-sets-direction, bump-sets-velocity model I stated in the ticket, let us know. Otherwise, we can assume that the behavior in your comment above is what you want.
(0003641)
BrianH
13-Mar-2013 23:25

I'm sorry, it was an English problem, my bad. "Primacy" didn't mean what you thought it meant. It doesn't mean that start-vs-end would be more important than bump (or vice-versa), it meant that it would be processed first. The second one would be just as important, but it would be more important later on in the process. Your stuff was talking about termination tests, which are just as important, but processed later.

Also, you keep using the term "your proposal" in the singular sense. There were two proposals, both for high-level philosophical models to explain the reason why we would choose one of two different sets of behavior. The set of behavior you are advocating is in keeping with one of those models, the one that explains why we don't trigger an error when bump is 0; it's the other model that explains why bump being 0 would trigger an error. We would pick between the two sets of behavior for practical reasons, and then use the model to pretend that we did so for philosophical reasons.

Once you start getting into termination conditions the loop has already gotten well past the point where the difference between these two models matters at all. That is why your termination conditions description has to be so involved: It is using termination conditions to try to explain choices made before termination is even a factor. Sorry for the confusion.
(0003646)
Ladislav
14-Mar-2013 11:50

"I'm sorry, it was an English problem, my bad. "Primacy" didn't mean what you thought it meant. It doesn't mean that start-vs-end would be more important than bump (or vice-versa), it meant that it would be processed first. The second one would be just as important, but it would be more important later on in the process. Your stuff was talking about termination tests, which are just as important, but processed later. " - I am sorry, but this still misses big. The point is that I do not mind what words you use. I do mind what the meaning is. Therefore, I do not mind whether you write to give "primacy" (you wrote that) or "precedence" (you wrote that as well) or "process first" (again, you wrote that). What I mind about is that you don't handle the factors equally, which is obvious no matter which words you use to describe it.
(0003647)
Ladislav
14-Mar-2013 17:00

Regarding the proposal. This came from the discussion at GG Rebol mailing list and it is not giving precedence to either START-END or BUMP:

First of all, it is useful to terminate before evaluating the cycle body in some cases.

To be able to terminate before evaluating the cycle body we need a test to be applied *just before* the cycle body is evaluated.

So, we need to have a cycle test to compare VALUE (the value of the cycle variable) with END (the cycle argument) and evaluate the body when the test is TRUE, terminating the cycle otherwise.

* START and END parameters can be used to determine the cycle test in the following way:

** if START <= END and VALUE is the value of the cycle variable, the test should look as follows:

VALUE <= END

** if START >= END the test should look as follows:

VALUE >= END

** the above two cases combined imply that if START = END the test should look as follows:

VALUE = END

* the BUMP value can be used to determine the cycle test as well

** if BUMP >= 0 the test should look as follows:

VALUE <= END

** if BUMP <= 0 the test should look as follows:

VALUE >= END

** the above two combined imply that if BUMP = 0 the test should look as follows:

VALUE = END

* since we have two ways to determine the cycle test, we need to resolve the conflict

** the best way is to use the conjunction of both tests, putting both "test sources" on equal footing

Examples:

for i 1 2 1

- both methods yield the same VALUE <= END test. The conjunction yields the VALUE <= END cycle test.

for i 2 1 -1

- both methods yield the same VALUE >= END test. The conjunction yields the VALUE >= END cycle test.

for i 2 1 1

- the START-END method yields the VALUE >= END test while the BUMP method yields the VALUE <= END test. The conjunction yields the VALUE = END cycle test. This test is already FALSE for START causing the cycle to not evaluate the body.

for i 1 2 -1

- the START-END method yields the VALUE <= END test while the BUMP method yields the VALUE >= END test. The conjunction yields the VALUE = END test. This test is already FALSE for START causing the cycle to not evaluate the body.

for i 1 2 0

- the START-END method yields the VALUE <= END test while the BUMP method yields the VALUE = END test. The conjunction yields the VALUE = END test. The test is already FALSE for START causing the cycle to not evaluate the body.

for i 2 1 0

- the START-END method yields the VALUE >= END test while the BUMP method yields the VALUE = END test. The conjunction yields the VALUE = END test. The test is already FALSE for START causing the cycle to not evaluate the body.

for i 1 1 1

- the START-END method yields the VALUE = END test while the BUMP method yields the VALUE <= END test. The conjunction yields the VALUE = END cycle test.

for i 1 1 -1

- the START-END method yields the VALUE = END test while the BUMP method yields the VALUE >= END test. The conjunction yields the VALUE = END cycle test.

for i 1 1 0

- both the START-END method as well as the BUMP method yield the VALUE = END test. The conjunction yields the VALUE = END test.

Note that in this case the test obtained would cause the cycle to become infinite, though. If wanting the cycle to not become infinite, there is no other way than to use some other "arbitrary" test. Due to the fact that we need termination and the value of the cycle variable is assumed to never change, the arbitrary test has to fail at the start causing the cycle to not evaluate the body.

Note: the above observations don't account for possible arithmetic overflow cases. Those issues need a separate consideration.
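The derivation above can be sketched in a few lines. This is an illustration in Python, not Ladislav's own code: the function names are invented for the example, the three possible tests are represented by the strings "<=", ">=" and "=", and the special case for START = END with BUMP = 0 is one possible reading of the "arbitrary" test mentioned in the note. Arithmetic overflow is ignored here too.

```python
import operator

# Each candidate cycle test compares VALUE with END using one of these.
TESTS = {"<=": operator.le, ">=": operator.ge, "=": operator.eq}


def start_end_test(start, end):
    """Cycle test suggested by START and END alone."""
    if start < end:
        return "<="
    if start > end:
        return ">="
    return "="


def bump_test(bump):
    """Cycle test suggested by BUMP alone."""
    if bump > 0:
        return "<="
    if bump < 0:
        return ">="
    return "="


def cycle_test(start, end, bump):
    """Conjunction of both candidate tests."""
    a = start_end_test(start, end)
    b = bump_test(bump)
    # Identical tests stay as they are; any two different tests among
    # <=, >= and = can only both hold when VALUE = END.
    return a if a == b else "="


def body_runs_at_start(start, end, bump):
    """True when the body would be evaluated for the START value."""
    if start == end and bump == 0:
        # The "arbitrary" test: fail at once so the cycle cannot be infinite.
        return False
    return TESTS[cycle_test(start, end, bump)](start, end)
```

For instance, `cycle_test(2, 1, 1)` yields "=" and `body_runs_at_start(2, 1, 1)` is false, matching the `for i 2 1 1` example.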
(0003648)
Ladislav
14-Mar-2013 17:05

"Semantically, what is the difference?"

* the difference is that the proposal I wrote does not use arbitrary decisions like "primacy", "precedence", "should trigger an error", "looping once and stopping"
* the main "vehicle" of my proposal is the cycle test, which needs to be determined.
* In a manner compatible with your proposal it is shown how the cycle test could be constructed using just the START-END arguments
* Again in a manner compatible with your proposal it is shown how the cycle test could be constructed using just the BUMP argument
* Having two (possibly incompatible) sources of the cycle test I put both ways on equal footing (no "primacy" or "precedence") stating that the actual cycle test shall be the conjunction of both particular tests. This puts both cycle test sources on exactly equal footing since conjunction does not give precedence to any of the factors. Also it eliminates any "arbitrariness" or "explanations" why the precedence, primacy or the "processing priority" was (or should have been) given to one factor.
* My proposal (just) determines the cycle test which happens to be FALSE in some specific cases (as demonstrated on the illustrative set of examples) for the START value explaining why the loop does not actually evaluate the cycle body at all.
* Also, since just the cycle test is obtained, I can determine when the cycle body isn't evaluated (it is when the START value cycle test already yields FALSE), but I cannot (nor do I want to) state things like "looping once" since that is what we cannot know in advance because it depends on the value of the cycle variable which can be changed in the loop body and is tested only after the body was evaluated.

I hope this makes it clear where the differences are.
(0003649)
BrianH
14-Mar-2013 18:32

It came out in AltME that when you get past all of the argument over verbiage, there is one actual semantic difference between Ladislav's proposal and my two proposals.

My two proposals were both intended to make absolutely sure that no combination of start, end and bump would by themselves end up with FOR doing an infinite loop. The main purpose of this is to allow the developer to remove expensive conditional code that would be needed to screen start, end and bump in combination to make sure that such loops don't happen by accident. The fact that FOREVER exists makes it unnecessary to use FOR for your infinite loops, and if an infinite loop is your intention then using FOREVER makes that more clear so this would increase code clarity as well. This would allow us to take advantage of the constrained usage model of FOR to add an additional constraint to benefit the developer, since screening for infinite loops in FOR's native code would be much less expensive. And it doesn't really prevent the developer from making infinite loops intentionally using FOREVER or even changing the index word in the body block.

In Ladislav's proposal, he wants to make infinite loops with FOR. He doesn't see the need to screen for them, and AFAICT it is because he genuinely doesn't make the kinds of mistakes that regular developers who would benefit from this kind of screening make. No offence is meant by that, it's kind of amazing to see his code. Nonetheless, that is what "handle the factors equally" meant: he insisted that since the body wasn't screened, the other parameters shouldn't be screened either.

Rebol 2 was originally aimed at newbies to programming, which is an admirable goal that no one has really managed yet, including Rebol. What Rebol got instead was interesting people, many of whom are power users like Ladislav, but many of whom aren't quite at that level yet. Because of the kinds of things Rebol was actually used to build, Carl decided to change the target market of Rebol 3 to these power users and interesting people (I forget the blog link). That goal has affected the design of R3 much to its benefit, but unfortunately it hasn't gone back in time to retroactively design R2 with that goal in mind in the first place so we still have functions like FOR.

One thing we have to consider is that the new target market requires balance in the design, because interesting people and power users became that way by being different, with different goals, and great minds don't think alike. One of those issues is the need to balance the flexibility that power users need, against the increased need to make it easier for our developers to correct mistakes when they make them, because interesting people make interesting mistakes.

In Rebol 3, one of the design principles we have been doing fairly consistently through the language is that while code is data, for security and stability reasons you, as a developer, should be more careful with code than you are with data - "code" in this case meaning any-function values and code blocks. This isn't an arbitrary thing. Non-active values can be screened fairly easily at runtime just with type testing, and typesets and ASSERT/type make that efficient. Simple value screening can be somewhat easy to do too, if somewhat more expensive. Screening the contents of larger structures can be a bit expensive, but depending on what you want to screen for it could be possible. Screening code at runtime is considered to be a genuinely hard problem and in some cases impossible (people get PhDs for making any headway in this), with way too much overhead to be reasonable, so asking a function to do so is just silly.

However, an actual competent developer reviewing code before they run it is capable of figuring out whether code is safe to run, in some cases using information that Rebol could never know, like the developer's intentions. And because a developer in R3's new target market should be able to handle this concept, we have been working under a policy that R3 developers will be expected to do their own code screening. That means that if a code block makes it as far as being passed to a function to be executed, it is presumed to be OK to execute even if it does weird stuff, because it is presumed that the developer intended to do what they said they wanted to do. The functions protect themselves from malicious code with APPLY and such, and SECURE protects the system a bit, but the code itself is presumed to be intentional. Maybe they intended to shoot themselves in the foot, who are we to judge? Maybe the foot had it coming.

For non-code values, we presume them to be possibly unintentional, maybe the result of data from unknown sources. In those cases, the developer would need help screening out bad data, preferably in native code as much as possible because conditional screening code in DO dialect code is relatively expensive, and if that process involves screening for triggered errors then it gets even more expensive. So we have a policy of trying to rebalance the code so that errors are only triggered in a case where it is a real error, at least according to the declared and documented semantic model and constraints of the function - that makes the remaining errors something you want triggered, for your own benefit. And for value screening, we can presume that the developer actually wanted that screening, because they called that function with those options in their code, rather than another function or options. Because of this we are careful with designing the semantic model of the function, and deciding on the range of the values supported, and what to do when presented with values out of that range, since the right balance of constraints will make the function more useful to the developer. Then they can pick the function that does the screening they need so they don't have to do it themselves.

It is this difference in how we treat code vs. non-code that lets us balance Rebol so it may meet the needs of power-users like Ladislav, and powerful on a good day but would appreciate a little help on occasion users like me. That difference isn't arbitrary, it's a balanced design policy. Consistency would be a mistake here.

Now in this particular case, I would recommend that we implement the start-vs-end-sets-direction bump-sets-velocity velocity-must-advance model in rebol-patches and R3/Backward, because this is an R2 function and we should therefore aim it at the R2 market - which is admittedly now just R2 fans who never really used this function much and just need it to have the same number of parameters in the same order serving the same roles. Hopefully with a rewrite and some useful constraints against infinite loops they might start using it.

For R3 and R2/Forward, I recommend implementing #884 and calling it FOR. No, really, it's a better fit for R3's target market. R2's FOR has some useful constraints, but it isn't itself as useful as the #884 function. #884 takes 4 code blocks, not non-code values, and has "General loop" right at the beginning of its main doc string, so it's clearly a power user function that isn't meant to be constrained. That's enough of a "Here there be dragons!" warning that we can assume it wouldn't be called by someone without a sword and shield handy.
(0003660)
BrianH
15-Mar-2013 00:09

I had a tough time reading your arguments, because they seemed to be focusing on stuff that wasn't at all relevant to the actual semantics involved. Treating all of the arguments equally? Do you think the bump parameter cares what I think about it? This is code, not people.

The infinite loop thing was the only actual semantic difference between your proposals and my proposals afaict. I specifically crafted my proposals to exclude infinite loops - the only difference was what to do instead. Excluding infinite loops was a feature. That was a feature you disagreed with, fine.

But going on about having the parameters treated equally even when they aren't actually equal, and then saying that the decision to treat them differently was wrong and arbitrary, that needed an answer. It was a design choice, one which we have been applying to the R3 project for 5 years now. If you didn't understand that design choice, I have helpfully explained the sensible rationale for it above.
(0003661)
Ladislav
15-Mar-2013 00:11

"The model is only important in being an explanation of why something behaves the way it does, the meaning of the behavior. Simply saying what it does is not an explanation, it's a description. Your code is a great description of what behavior to expect, in simple and extensive enough terms that it can be translated to unit test vectors, and that is definitely needed. However, an explanation that can't be stated in two sentences or less needs rethinking. I had to actually go through the code and compare test cases to get an understanding of your model, because your explanation didn't pass the doesn't-know-Rebol-newbie-programmer test. If you need to be a CS major to understand a model then that model won't work, even if it accurately results in the behavior we want." - this deserves a note. My text describing how FOR should work is not any "explanation" or whatnot. It is:

* a complete specification of the behaviour, i.e., it precisely specifies how the cycle has to behave (contrasting that to Brian's proposal which simply does not describe the behaviour completely enough to know what happens when a cycle variable changes in the cycle body)
* a proof of the concept demonstrating how the behaviour can be derived from the basic principles mentioned
(0003664)
Ladislav
15-Mar-2013 00:45

"I had a tough time reading your arguments, because they were focusing on stuff that wasn't at all relevant to the actual semantics involved" - LOL. It is the other way around. My specification not just *concentrates* on the semantics involved, it actually does so in a *complete* manner, specifying the semantics completely without needing to add clarifications later
(0003666)
Ladislav
15-Mar-2013 00:53

"Treating all of the arguments equally? Do you think the bump parameter cares what I think about it? This is code, not people. " - I am not sure this makes sense to discuss at all, but trying to just inform the uninformed:

* I never said I treated the arguments equally. What I did treat equally was the *sources* of cycle tests allowing me to not give precedence to some of the sources arbitrarily suppressing the role of the other when defining the cycle test needed to specify the behaviour completely

* "Do you think the bump parameter cares what I think about it?" - Do you think that the programmer does not care what the implementer of the loop thinks about the BUMP parameter?

Also, when speaking about the cycle tests:

* specifying/knowing the cycle test used allows me to completely specify/know the behaviour of the cycle

* treating all possible sources of the cycle tests equally allows me to use every available bit of information when constructing the cycle test
(0003671)
Ladislav
15-Mar-2013 02:00

Also, this looks worth examining: "I specifically crafted my proposals to exclude infinite loops " - hmm, did we already decide what to do in cases like:

for i 1.0 1e30 1.0 [...]

, which looks like an infinite loop to me.
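Why this would never finish can be checked with ordinary 64-bit floating point. The sketch below is in Python rather than Rebol and assumes that Rebol's decimal! behaves like an IEEE 754 double; the helper name absorbs is made up for illustration.

```python
def absorbs(value, bump):
    """True when adding bump leaves value unchanged (the bump is rounded away)."""
    return value + bump == value

# Doubles carry 53 bits of mantissa. At 2**53 the neighbouring values are
# 2 apart, and 2**53 + 1.0 is a tie that rounds back to 2**53 itself.
stall_point = 2.0 ** 53

print(absorbs(1.0, 1.0))          # False: small values still advance
print(absorbs(stall_point, 1.0))  # True: a counter stepping by 1.0 sticks here
print(absorbs(1e30, 1.0))         # True: the end value is far beyond that point
```

Under that assumption, a counter starting at 1.0 with a bump of 1.0 would climb to 2^53 (about 9.0e15) and stay there, never reaching 1e30.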
(0003672)
BrianH
15-Mar-2013 03:06

It looks like we have been arguing past each other rather than from an understanding of the real differences between our proposals, or of the purpose of this ticket. Let me help things by trying to explain my proposal better. I posted tests that implement the main goal of the proposal in AltME and explained them there too, but just in case that isn't enough, here are the 5 steps that matter for the purposes of this discussion:

1) When the function first starts, it checks whether start < end, start > end, or start = end, and then picks one of 3 sets of termination conditions the loop will use. Each set contains starting and post-cycle termination conditions.
2) The starting condition only pays attention to the values start, end and bump, and its main goal is to make sure that bump > 0 when start < end, that bump < 0 when start > end, or that start = end, and if any of those are not true then the loop doesn't start even a single cycle. For the starting condition you don't need to check bump at all if start = end, it only matters when start != end.
3) If you get this far, run one cycle of the loop by doing body. If the result of that is a break unwind, stop and return the associated value (default unset). If it's a continue unwind, go to 4. If it's another kind of unwind, stop and return it. (Replace the "unwind" stuff with the R2 equivalent in R2.)
4) Having finished a cycle (if you get this far), check the post-cycle termination condition chosen in step 1. The post-cycle condition only pays attention to start, end, and the value assigned to word at the end of the cycle (:word); it ignores bump. The :word value might have changed since the start of the cycle, so don't assume it's the same and don't revert it. If start = end then terminate if :word = end. If start < end then terminate if :word >= end. If start > end then terminate if :word <= end. If you terminate, return the result of the body evaluation.
5) If you haven't terminated, add bump to :word and go back to step 3. Note that steps 1 and 2 are irrelevant at this point.
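As an illustration only, the five steps can be followed literally in a small simulation. It is written in Python rather than Rebol so that it runs standalone; the names for_loop and max_cycles are assumptions of this sketch, the body is modelled as a function that receives the word and returns its (possibly changed) value, and the break/continue handling of step 3 is left out.

```python
def for_loop(start, end, bump, body, max_cycles=1000):
    """Return the word values seen at the start of each cycle.

    max_cycles is a safety cap for this sketch only, not part of the steps.
    """
    # Step 1: choose the post-cycle termination test from the direction.
    if start < end:
        done = lambda word: word >= end
    elif start > end:
        done = lambda word: word <= end
    else:
        done = lambda word: word == end

    # Step 2: starting condition; bump is not looked at when start == end.
    if (start < end and bump <= 0) or (start > end and bump >= 0):
        return []

    seen = []
    word = start
    for _ in range(max_cycles):
        seen.append(word)
        word = body(word)   # Step 3: run one cycle; the body may change word.
        if done(word):      # Step 4: test whatever value the body left behind.
            break
        word = word + bump  # Step 5: advance and go around again.
    return seen


print(for_loop(1, 3, 1, lambda i: i))   # [1, 2, 3]
print(for_loop(3, 1, -1, lambda i: i))  # [3, 2, 1]
print(for_loop(1, 3, 0, lambda i: i))   # []: zero bump with start != end
print(for_loop(2, 2, 0, lambda i: i))   # [2]: start = end runs one cycle
```

Because the test in step 4 runs before the bump is added, a literal reading lets the word pass end when the bump does not land on it exactly: for_loop(1, 10, 4, lambda i: i) visits 1, 5, 9 and 13. The sketch reproduces that reading rather than correcting it.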

Now, as for the actual subject of this ticket, it is step 2 in that list, the starting condition. Of my two proposals, one has it trigger an error when bump = 0 and start != end, and one has the loop just not do anything and return none. I like the second of those choices, because you could plausibly argue that we have decided that a bump of 0 is just out of range when start != end. That will allow the developer to cut down on conditional code to avoid errors, which is the whole reason for defining 0 to be out of range in the first place, so it would be counterproductive to require them to add back the conditional code for another reason.

If you are trying to make an argument that we should skip that starting condition check altogether then you will not convince me, because adding that starting condition is the entire purpose for me writing the ticket in the first place.

If you say that the model is too complex then I will ask you to make a simple model that accomplishes the goal of adding that starting condition, preferably one that also doesn't consider the starting condition again at any point after the cycles start the way the above does. If you say it isn't formulated correctly then it will be on you to reformulate it in a way that explains semantics that meet the same goal of adding that starting condition, because I don't care about form. If there is a problem in anything other than the starting condition, like say the post-cycle condition, let me know.

If you say that the starting condition isn't affected by changes to the value of :word in the body code block then I will point you to step 4, which says that :word can change; and my above message, which says that stuff that happens in a code block is considered to be intentional and the developer's responsibility, not mine; and to where in the process the starting condition is even considered, specifically before those changes could possibly occur because the body block hasn't run yet.

If you say that R2-style FOR sucks, then I will point out that it is off topic for this ticket, and redirect you to #884 where I agree with you at length and propose that #884 replace FOR completely for R3 and R2/Forward code, but then tell you that we will still need to add the starting condition to the FOR in R3/Backward and rebol-patches.

I hope that is more clear.
(0003676)
Ladislav
15-Mar-2013 07:21

"For the starting condition you don't need to check bump at all if start = end" - it turns out then that I find your proposal to check BUMP first determining if it is zero more consistent.
(0003679)
BrianH
15-Mar-2013 18:22

Cool. The starting condition thing was the main point. And I didn't even consider the existing starting conditions that I didn't think were a problem, such as the rule that if start and end are both series (with direction being set by the relative index positions) then they should be references to the same underlying series data.

If it isn't supported already, and it isn't too confusing, I think that we might consider allowing the case of start referring to a series and end referring to a number, where the number would be interpreted either as an index (one-based, relative to the head of the series) or as an offset (zero-based, relative to the start position); we'd have to choose one of those two and only support that one, so I'd prefer offsets but would go with whichever. MOVE supports both, but it has a /to refinement to make that choice, and it just doesn't make sense to add a refinement to FOR since it never had to process one before.

The initial proposal pretty much assumed that the post-cycle termination condition would be something that made sense - well, one of a set of 3 termination conditions depending on the direction, but they would all make sense. If the post-cycle condition needs work, feel free to chime in.
(0003680)
Ladislav
15-Mar-2013 18:43

I would like to demonstrate one bug in how the FOR cycle would behave under the rules you summarized above:

for i 1 1 1 [print i i: i + 1]

would be an infinite cycle printing 1 3 5 7 9 ...

That is inconsistent with

for i 1 2 1 [print i i: i + 1]

, which would print 1 and terminate. The problem is that, comparing two cycles with the same body, nobody would expect the latter to "terminate sooner".
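The two traces can be reproduced with a short simulation of the post-cycle rules summarized in comment 3672. This is a Python sketch rather than Rebol; the function name trace and the limit cap are made up for illustration, and the starting condition is left out because both calls pass it.

```python
def trace(start, end, bump, limit=5):
    """Values printed by `for i start end bump [print i i: i + 1]`.

    limit caps the output so the non-terminating case can be shown.
    """
    printed = []
    i = start
    while len(printed) < limit:
        printed.append(i)   # print i
        i = i + 1           # i: i + 1 (the body changes the word)
        # Post-cycle termination test, chosen by the start/end relation.
        if start == end:
            stop = i == end
        elif start < end:
            stop = i >= end
        else:
            stop = i <= end
        if stop:
            break
        i = i + bump        # advance by bump and run another cycle
    return printed


print(trace(1, 1, 1))  # [1, 3, 5, 7, 9] (cut off by limit; it never stops itself)
print(trace(1, 2, 1))  # [1]
```

With start = end the equality test is skipped over as soon as the body moves the word past end, whereas the inequality used for start < end still catches it.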
(0003681)
Ladislav
15-Mar-2013 18:46

"if start and end are both series (with direction being set by the relative index positions) then they should be references to the same underlying series data" - looks reasonable and works this way in R2 but not in R3.
(0003682)
Ladislav
15-Mar-2013 19:00

"we'd have to choose one of those two and only support that one, so I'd prefer offsets but would go with whichever" I prefer to use a poll for this, however, there already is a certain way Carl chose, and I am not sure it is good to decline.
(0003684)
BrianH
15-Mar-2013 19:16

(In reply to comment 3680)

Should the start=end post-cycle termination condition (in my above model) just be to terminate, regardless of what :word and end are? That would deal with the potentially dangerous inconsistency you mentioned in comment 3680, and would be more consistent with start=end just ignoring bump in the starting condition. That would be in keeping with the theme of trying to help the developer avoid unintended infinite loops, and would make the just-one-cycle definition of start=end more consistent too.

The relative inequalities of the start != end cases do a better job of protecting developers than the equality test that the start = end case currently has as a post-cycle condition.
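A minimal sketch of the amended post-cycle test, assuming the start = end case simply stops after its single cycle. It is written in Python rather than Rebol, and the name should_stop is hypothetical.

```python
def should_stop(start, end, word):
    """Post-cycle termination test with the amended start = end rule."""
    if start == end:
        return True         # amended rule: one cycle, whatever word now holds
    if start < end:
        return word >= end  # ascending: stop once word reaches or passes end
    return word <= end      # descending: stop once word reaches or passes end


# The body `i: i + 1` leaves word at 2 after the first cycle in both cases.
print(should_stop(1, 1, 2))  # True: for i 1 1 1 now ends after one cycle
print(should_stop(1, 2, 2))  # True: for i 1 2 1 ends after one cycle as before
```

With this rule the two loops from comment 3680 both run exactly one cycle.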

(Comment 3682) "I prefer to use a poll for this, but we may even get no answer" - that means it's too iffy to add as a feature to an R2-like function. If there is no obvious answer to that question, any choice we make would be a support nightmare. Let's just skip it.

(Comment 3681) Assuming that by "but not in R3" you don't mean #884, that might need a ticket for that problem. It's a regression.

(Comment 3671) Assume that I don't understand your argument here. Explain how you would tweak the post-cycle condition of my model to deal with this, please?
(0003686)
Ladislav
15-Mar-2013 20:09

"Explain how you would tweak the post-cycle condition of my model to deal with this" - the issue is as follows: Rebol decimals have limited precision and some numbers don't increase when we add a positive (but relatively small) BUMP to them. Thus, increasing may stop at some moment even when we are adding a positive number. I am not sure FOR *should* handle such cases in some special way, though, because every exception handled increases the overhead. In this case I would suggest to try to just ignore the issue, what do you think?
(0003687)
BrianH
15-Mar-2013 20:33

Given that it isn't really FOR's fault, I agree, we probably can ignore this whole class of issues. But if you want the starting condition to be something like (bump + end) > end rather than bump > 0 in this case (and (bump + end) < end in the descending case), that could work too, assuming overflow is handled. Or whatever math ensures the (bump advances in the appropriate direction if start != end) starting constraint. Only fix it if it's pragmatic to do so.
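A sketch of that alternative starting condition, in Python rather than Rebol and assuming decimals behave like IEEE 754 doubles. The name may_start is hypothetical, and the descending case is taken here as the mirror image of the ascending one (end + bump < end), which is an assumption of this sketch. Python integers do not overflow, so the overflow caveat does not show up here.

```python
def may_start(start, end, bump):
    """Starting condition that asks whether bump actually moves end."""
    if start == end:
        return True              # bump is not looked at when start = end
    if start < end:
        return end + bump > end  # ascending: bump must push end upward
    return end + bump < end      # descending: bump must push end downward


print(may_start(1, 10, 1))         # True
print(may_start(10, 1, -1))        # True
print(may_start(1, 10, 0))         # False: zero bump never advances
print(may_start(1.0, 1e30, 1.0))   # False: 1.0 is rounded away next to 1e30
```

For ordinary values this agrees with the plain sign test on bump; it differs only when bump is too small to change end at all.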
(0003732)
fork
26-Mar-2013 17:50

Ladislav's CFOR is R3 FOR.

The old FOR does whatever stupid R2 FOR did.

The end.
(0004404)
fork
18-Apr-2014 21:20

I didn't even realize I'd commented on this issue beFORe. But I apparently had. As per the poet Donald Rumsfeld:

"I believe what I said yesterday. I don't know what I said, but I know what I think, and, well, I assume it's what I said."

CFOR, as in the idea of "Rebol's incarnation of C's FOR loop", is a good name for something like this:

cfor [x: 0 y: 0] [all [x < 10 y < 5]] [++ x ++ y] [
    print x
]

@rgchris sketched an implementation as:

cfor: func [init [block!] test [block!] step [block!] body [block!] /local out] [
    init: context init
    while bind test init bind compose/deep [set/any 'out (to paren! body) (step)] init
    get/any 'out
]
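For readers who do not know Rebol, the same shape can be sketched in Python. This is only an analogy, not a translation of the binding mechanics: the loop variables live in a dict instead of a context, and test, step and body are ordinary callables. All names here are illustrative.

```python
def cfor(init, test, step, body):
    """C-style loop: init is a dict of variables; the rest are callables on it."""
    env = dict(init)     # private copy, standing in for the context made from init
    out = None
    while test(env):
        out = body(env)  # remember the last body result, like 'out above
        step(env)
    return out


seen = []

def body(env):
    seen.append(env["x"])  # stands in for `print x`
    return env["x"]

def step(env):
    env["x"] += 1          # ++ x
    env["y"] += 1          # ++ y

last = cfor({"x": 0, "y": 0}, lambda e: e["x"] < 10 and e["y"] < 5, step, body)
print(seen)  # [0, 1, 2, 3, 4]
print(last)  # 4
```

The loop stops after five cycles because y reaches 5 before x reaches 10.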

As we think of a dialected solution that could support FOR x [1 thru 10] [print x] vs FOR x [1 to 10] [print x], and thinking about overall language consistency... we can see that a dialected FOR overlaps with REPEAT. Thus we should probably take a step back and look at the words in play.

LOOP, FOR, REPEAT, EVERY, (others?)

Of this set, FOR is sort of the least meaningful. It is taken for granted because existing programmers have been told what a "FOR-LOOP" is. But the word FOR does not imply a loop of any kind. It must be combined with another word to get that meaning. (FOR-EACH) A standalone FOR is perhaps best summarized as "purpose":

http://www.merriam-webster.com/dictionary/for

LOOP, REPEAT, and EVERY are better. But how can one apply intuition to know which of these would set a word for each iteration vs which would not? If I said "LOOP N TIMES" vs "REPEAT N TIMES" is there any indication of which would let you tell which step of the iteration you were on? We could argue that LOOP has a negative connotation of the "infinite loop" and that is sort of about a clueless program that lacks self-awareness... "stuck in a loop"... and if you had access to a number showing you'd performed the iteration a million times, how could you get stuck?

This might be a decent rationale for defining LOOP as it is. But it leaves the door open to what the range dialect iterator which names the iteration variable should be called. Is FOR a lousy word? Is REPEAT better? EVERY?

Date User Field Action Change
28-Aug-2014 16:48 Ladislav Comment : 0003646 Modified -
28-Aug-2014 16:47 Ladislav Comment : 0003646 Modified -
18-Apr-2014 21:20 fork Comment : 0004404 Added -
26-Mar-2013 17:50 fork Comment : 0003732 Added -
16-Mar-2013 16:53 BrianH Comment : 0003684 Modified -
16-Mar-2013 16:53 BrianH Comment : 0003672 Modified -
16-Mar-2013 16:52 BrianH Comment : 0003649 Modified -
15-Mar-2013 23:49 BrianH Comment : 0003662 Removed -
15-Mar-2013 23:45 BrianH Comment : 0003673 Removed -
15-Mar-2013 23:44 BrianH Comment : 0003672 Modified -
15-Mar-2013 23:38 BrianH Comment : 0003660 Modified -
15-Mar-2013 23:34 BrianH Comment : 0003627 Modified -
15-Mar-2013 23:31 BrianH Comment : 0003684 Modified -
15-Mar-2013 23:18 Ladislav Comment : 0003685 Removed -
15-Mar-2013 23:16 Ladislav Comment : 0003683 Removed -
15-Mar-2013 23:15 Ladislav Comment : 0003682 Modified -
15-Mar-2013 23:13 Ladislav Comment : 0003675 Removed -
15-Mar-2013 23:12 Ladislav Comment : 0003674 Removed -
15-Mar-2013 23:11 Ladislav Comment : 0003670 Removed -
15-Mar-2013 23:10 Ladislav Comment : 0003670 Modified -
15-Mar-2013 23:09 Ladislav Comment : 0003669 Removed -
15-Mar-2013 23:08 Ladislav Comment : 0003668 Removed -
15-Mar-2013 23:07 Ladislav Comment : 0003657 Removed -
15-Mar-2013 23:06 Ladislav Comment : 0003630 Removed -
15-Mar-2013 23:06 Ladislav Comment : 0003656 Removed -
15-Mar-2013 20:50 BrianH Comment : 0003687 Modified -
15-Mar-2013 20:49 BrianH Comment : 0003687 Modified -
15-Mar-2013 20:33 BrianH Comment : 0003687 Added -
15-Mar-2013 20:09 Ladislav Comment : 0003686 Added -
15-Mar-2013 20:01 Ladislav Comment : 0003685 Added -
15-Mar-2013 19:50 BrianH Comment : 0003684 Modified -
15-Mar-2013 19:45 BrianH Comment : 0003684 Modified -
15-Mar-2013 19:22 BrianH Comment : 0003684 Modified -
15-Mar-2013 19:20 BrianH Comment : 0003684 Modified -
15-Mar-2013 19:17 BrianH Comment : 0003684 Modified -
15-Mar-2013 19:16 BrianH Comment : 0003684 Added -
15-Mar-2013 19:06 Ladislav Comment : 0003683 Modified -
15-Mar-2013 19:03 Ladislav Comment : 0003683 Added -
15-Mar-2013 19:00 Ladislav Comment : 0003682 Added -
15-Mar-2013 18:49 Ladislav Comment : 0003681 Modified -
15-Mar-2013 18:46 Ladislav Comment : 0003681 Added -
15-Mar-2013 18:44 Ladislav Comment : 0003680 Modified -
15-Mar-2013 18:43 Ladislav Comment : 0003680 Added -
15-Mar-2013 18:29 Ladislav Comment : 0003657 Modified -
15-Mar-2013 18:25 BrianH Comment : 0003679 Modified -
15-Mar-2013 18:24 BrianH Comment : 0003679 Modified -
15-Mar-2013 18:22 BrianH Comment : 0003679 Added -
15-Mar-2013 07:23 Ladislav Comment : 0003676 Modified -
15-Mar-2013 07:21 Ladislav Comment : 0003676 Added -
15-Mar-2013 07:18 Ladislav Comment : 0003675 Modified -
15-Mar-2013 07:17 Ladislav Comment : 0003675 Added -
15-Mar-2013 07:04 Ladislav Comment : 0003674 Modified -
15-Mar-2013 07:01 Ladislav Comment : 0003674 Modified -
15-Mar-2013 06:59 Ladislav Comment : 0003674 Modified -
15-Mar-2013 06:55 Ladislav Comment : 0003674 Added -
15-Mar-2013 03:44 BrianH Comment : 0003673 Added -
15-Mar-2013 03:06 BrianH Comment : 0003672 Added -
15-Mar-2013 02:00 Ladislav Comment : 0003671 Added -
15-Mar-2013 01:33 Ladislav Comment : 0003670 Modified -
15-Mar-2013 01:31 Ladislav Comment : 0003670 Added -
15-Mar-2013 01:28 Ladislav Comment : 0003669 Added -
15-Mar-2013 01:21 Ladislav Comment : 0003668 Modified -
15-Mar-2013 01:17 Ladislav Comment : 0003668 Modified -
15-Mar-2013 01:14 Ladislav Comment : 0003668 Modified -
15-Mar-2013 01:10 Ladislav Comment : 0003668 Modified -
15-Mar-2013 01:06 Ladislav Comment : 0003668 Added -
15-Mar-2013 01:04 Ladislav Comment : 0003666 Modified -
15-Mar-2013 01:02 Ladislav Comment : 0003666 Modified -
15-Mar-2013 00:53 Ladislav Comment : 0003666 Added -
15-Mar-2013 00:45 Ladislav Comment : 0003664 Added -
15-Mar-2013 00:13 BrianH Comment : 0003662 Modified -
15-Mar-2013 00:11 BrianH Comment : 0003662 Added -
15-Mar-2013 00:11 Ladislav Comment : 0003661 Added -
15-Mar-2013 00:09 BrianH Comment : 0003660 Added -
14-Mar-2013 23:52 Ladislav Comment : 0003657 Modified -
14-Mar-2013 23:48 Ladislav Comment : 0003657 Modified -
14-Mar-2013 23:46 Ladislav Comment : 0003657 Added -
14-Mar-2013 23:38 Ladislav Comment : 0003656 Added -
14-Mar-2013 18:38 BrianH Comment : 0003649 Modified -
14-Mar-2013 18:32 BrianH Comment : 0003649 Added -
14-Mar-2013 17:18 Ladislav Comment : 0003647 Modified -
14-Mar-2013 17:07 Ladislav Comment : 0003647 Modified -
14-Mar-2013 17:05 Ladislav Comment : 0003648 Added -
14-Mar-2013 17:01 Ladislav Comment : 0003636 Removed -
14-Mar-2013 17:00 Ladislav Comment : 0003647 Added -
14-Mar-2013 16:59 Ladislav Comment : 0003625 Removed -
14-Mar-2013 16:54 Ladislav Comment : 0003625 Modified -
14-Mar-2013 12:27 Ladislav Comment : 0003646 Modified -
14-Mar-2013 12:25 Ladislav Comment : 0003625 Modified -
14-Mar-2013 12:14 Ladislav Comment : 0003625 Modified -
14-Mar-2013 11:50 Ladislav Comment : 0003646 Added -
13-Mar-2013 23:26 BrianH Comment : 0003641 Modified -
13-Mar-2013 23:25 BrianH Comment : 0003641 Added -
13-Mar-2013 22:46 Ladislav Comment : 0003636 Modified -
13-Mar-2013 22:43 Ladislav Comment : 0003636 Modified -
13-Mar-2013 22:42 Ladislav Comment : 0003636 Added -
13-Mar-2013 20:44 BrianH Comment : 0003631 Added -
13-Mar-2013 20:28 BrianH Comment : 0003627 Modified -
13-Mar-2013 20:20 Ladislav Comment : 0003630 Added -
13-Mar-2013 20:02 Gregg Comment : 0003629 Added -
13-Mar-2013 19:13 BrianH Comment : 0003627 Added -
13-Mar-2013 17:06 Ladislav Comment : 0003625 Modified -
13-Mar-2013 15:19 Fork Comment : 0003626 Modified -
13-Mar-2013 15:17 Fork Comment : 0003626 Added -
13-Mar-2013 14:25 Ladislav Comment : 0003625 Modified -
13-Mar-2013 14:20 Ladislav Comment : 0003625 Modified -
13-Mar-2013 14:16 Ladislav Comment : 0003625 Modified -
13-Mar-2013 14:15 Ladislav Comment : 0003625 Modified -
13-Mar-2013 14:08 Ladislav Comment : 0003625 Modified -
13-Mar-2013 14:07 Ladislav Comment : 0003625 Modified -
13-Mar-2013 13:50 Ladislav Comment : 0003625 Modified -
13-Mar-2013 13:45 Ladislav Comment : 0003625 Modified -
13-Mar-2013 13:44 Ladislav Comment : 0003625 Modified -
13-Mar-2013 13:21 Ladislav Comment : 0003625 Modified -
13-Mar-2013 13:20 Ladislav Comment : 0003625 Modified -
13-Mar-2013 13:19 Ladislav Comment : 0003625 Modified -
13-Mar-2013 13:19 Ladislav Comment : 0003625 Modified -
13-Mar-2013 13:15 Ladislav Comment : 0003625 Modified -
13-Mar-2013 13:15 Ladislav Comment : 0003625 Modified -
13-Mar-2013 13:13 Ladislav Comment : 0003625 Modified -
13-Mar-2013 13:11 Ladislav Comment : 0003625 Modified -
13-Mar-2013 12:30 Ladislav Comment : 0003625 Modified -
13-Mar-2013 12:25 Ladislav Comment : 0003625 Modified -
13-Mar-2013 12:24 Ladislav Comment : 0003625 Modified -
13-Mar-2013 12:23 Ladislav Comment : 0003625 Modified -
13-Mar-2013 12:17 Ladislav Comment : 0003625 Modified -
13-Mar-2013 12:14 Ladislav Comment : 0003625 Added -
13-Mar-2013 02:00 BrianH Comment : 0003616 Modified -
13-Mar-2013 01:29 Ladislav Comment : 0003618 Added -
13-Mar-2013 01:21 Gregg Comment : 0003617 Added -
13-Mar-2013 01:07 BrianH Comment : 0003616 Modified -
13-Mar-2013 01:04 BrianH Comment : 0003616 Added -
12-Mar-2013 23:42 Gregg Comment : 0003615 Modified -
12-Mar-2013 23:42 Gregg Comment : 0003615 Modified -
12-Mar-2013 23:41 Gregg Comment : 0003615 Added -
12-Mar-2013 23:06 BrianH Comment : 0003614 Modified -
12-Mar-2013 23:04 BrianH Comment : 0003614 Added -
12-Mar-2013 22:17 fork Comment : 0003613 Modified -
12-Mar-2013 22:08 fork Comment : 0003613 Added -
12-Mar-2013 21:12 BrianH Comment : 0003612 Added -
12-Mar-2013 20:53 BrianH Ticket Added -