REBOL3 tracker
  0.9.12 beta
Ticket #0001494 User: anonymous

Project:



rss
TypeWish Statussubmitted Date16-Feb-2010 01:17
Versionalpha 97 CategoryDatatype Submitted byPeterWood
PlatformAll Severitymajor Prioritynormal

Summary Hash! datatype
Description The hash! datatype in Rebol 2 was very, very useful for creating fast case sensitive look up lists. The map! datatype does not provide equivalent functionality without using work-arounds which eat into any performance improvements of the map! datatype.

Please add a hash! datatype to rebol 3
Example code

			

Assigned ton/a Fixed in- Last Update2-Jan-2013 04:03


Comments
(0002031)
BrianH
16-Feb-2010 04:40

Peter, we have been using the Severity field here as an estimate of how difficult it would be to implement the wish or fix the bug, and the Priority for how important it is (relative to current plans if possible). In this case you probably got it right: Adding a new datatype is a major hassle (relatively speaking).

That being said, there are some things that the block! type can do that the map! type can't:
- Optional case-sensitivity (see #1315 and #1437 for the previous failed attempts to add it)
- Persistent ordering, as maps could in theory be compacted and reordered
- Full series-like behavior, rather than just a subset

Now the object!-like behavior of map! (including the case-insensitivity) has been shown to be much more widely useful than R2's hash! type, but there are still rare occasions in highly optimal low-level code where having a hashed block! has been useful. In many cases code is greatly improved by switching from hash! to map!, but others need block! behavior and map! doesn't work for them.

If we do re-add the hash! type we should not use the R2 hash function, we should use the R3 map! hash function, modified to be case-sensitive. In particular, FIND/case and SELECT/case should also be sensitive to the preserved case of R3 words, and non-string values should benefit from hashing as much as they do when used as map! keys.

The R2 hash! type isn't worth adding back as is, but with a few tweaks we might be able to make it worth using instead of map! in some sufficiently important cases. And a lot of the code to implement it could be shared with block! and map!, so it might not be so severe a task to add.

Or this might be another vote for utype!.
(0002033)
abolka
16-Feb-2010 15:43

+1 from me for the general idea of having a hashed, case-sensitive datatype.

I suggest adding a datatype called STRICT-MAP!. This would behave just like MAP! but using STRICT-EQUAL? for key comparison instead of MAP!'s use of EQUAL?.
(0002037)
BrianH
17-Feb-2010 02:50

Note that abolka's suggestion is a counter-proposal to hash!: Same case-sensitivity rules (though not STRICT-EQUAL? because word binding should be ignored), but map!-like behavior, not block!-like.
(0002906)
oldes
19-Nov-2010 11:57

+1 for case-sensitive hashed datatype. Maybe REBOL is by default case-insensitive, but I need it to cooperate with case-sensitive world and using only binary keys as a workaround is really not a solution! It's enough that SWITCH has no /case refinement as well, so I must use PARSE instead.
(0002908)
Ladislav
21-Nov-2010 09:52

As I see it, the need described is not to have a HASH! datatype, but to have a case-sensitive alternative.

My opinion is based on the simple fact that there is nothing in the request requiring any other property than a case-sensitivity.
(0002914)
BrianH
21-Nov-2010 23:00

That would be a different request, Ladislav, for which we have #1774. This is a request for a block-like hashed datatype. You can tell by the name: Calling the datatype hash! and then not being even vaguely compatible with the hash! type is a no-go. The map! type isn't called hash! because it isn't block-like.

The issue! situation is different because they can be made mostly compatible, but hash! has position and persistent ordering and you can't retrofit that into an object-like type.

The motivation of the request is not the same as the request itself. #1774 came from the same motivation, but it is a different request. For that matter, #1774 was suggested earlier than hash! in the conversation that started with the #1315 request (just not submitted until recently). This request came out later in the conversation when people had brought up the other compatibility issues beyond case.
(0003311)
Ladislav
28-Dec-2012 17:25

"That would be a different request, Ladislav, for which we have #1774." - how do you know? Did you ask Peter? (there is, in fact, nothing indicating you are right)
(0003312)
BrianH
28-Dec-2012 18:39

Yes. We had a long discussion about these tickets in AltME. The reasons are those that I gave above.

The name needing to be different is why we had to make #1774. We can't use the name "hash!" for a case-sensitive version of map! - that would be grounds for immediately dismissing the request. There have been frequent requests for restoring the hash! type from R2 though, so this ticket can serve as a placeholder for those.
(0003314)
Ladislav
31-Dec-2012 16:58

"There have been frequent requests for restoring the hash! type " - well, I am questioning the word "frequent". If somebody wants case-sensitive look-ups that is a relevant request, but it does not suggest they need the hash! datatype. Similarly for other requests I have seen: I did not find any request that would suggest the hash! datatype was really needed or wanted.
(0003315)
BrianH
31-Dec-2012 18:10

"Needed", no, hash! would rarely be needed. "Wanted", on the other hand, there are people who definitely want hash! back. Requests for the hash! type have popped up fairly often on AltME. Usually, what the person is hoping to do would be done better by using the map! type, though that sometimes needs code changes. Or they can use blocks without code changes, but with different speed (better or worse, depending on circumstances). Every once in a while the person really needs a hash!, usually for indexes in database replacement code. It's rare, but it happens.

By "frequent", let us contrast the list! type, for which there have been no requests I recall in the last several years. Frequent is relative to the community size, which hasn't been very large for R3 until the open sourcing. A half-dozen decent requests is frequent enough to merit a ticket.

I agree that a case-sensitive map! type (#1774) would be a more important request to implement, since it will be more useful in more situations than an indexed series type (hash!) would. That only makes this ticket less important though, not unimportant.
(0003316)
Ladislav
31-Dec-2012 22:51

For example, the "The hash! datatype in Rebol 2 was very, very useful for creating fast case sensitive look up lists. " - is false as my tests proved. Thus, the requesters actually do not know at all what they want.

The same can be said about all requests I read.
(0003317)
Ladislav
31-Dec-2012 23:01

"let us contrast the list! type, for which there have been no requests I recall in the last several years." - that may be a result of my decency - I am not trying to enforce my opinion in this case although I am missing lists (I even suggested a better implementation for R3, which you, probably on purpose, are ignoring when saying the above)
(0003318)
BrianH
1-Jan-2013 00:24

Nope, wasn't ignoring your list! stuff, just missed it or forgot. I ignored my own request for list! though, which I only used in an obscure circumstance where I needed extra references to a particular list position to survive inserts and deletes of earlier elements, the main semi-useful behavioral difference compared to other series types. I also used hash! elsewhere in that code, though map! would have been better.

I am not saying that we won't reject this ticket on its merits; R3 dropped the hash! type in favor of map! for very good reasons, and it is rare to find code that can't be improved by switching from hash! to map! or block!. Maybe that reasoning will continue to hold true.

However, if someone decides to make a request that a case-sensitive map! type (#1774) be named hash!, that request will be rejected on well-established naming principles. I don't argue against the merits of a case-sensitive map! type, just against it being called hash!. Dropping the hash! type causes easily trackable errors when porting from R2 to R3, and the methods for converting to the use of block! or map! are well established. Changing the semantic category of the hash! type without renaming it makes things much more difficult when adjusting to the transition between R2 and R3, because it will make the errors caused by those semantic differences much harder to track down.

This naming policy isn't arbitrary, it's based on solid principles.
(0003319)
Ladislav
1-Jan-2013 11:33

" I ignored my own request for list! though, which I only used in an obscure circumstance where I needed extra references to a particular list position to survive inserts and deletes of earlier elements" - my implementation actually solves just these obscurities using a sophisticated (and fast) approach, do you happen to have something similar?
(0003320)
BrianH
1-Jan-2013 22:55

No, I actually needed the list! behavior as it was in R2, otherwise I could have used block!. I was using a combination of hash! and list! to be an index. The hash! provided the index, but the list! made sure that the offset references in the hash! would refer to the same elements if other elements in the list were inserted or deleted. If I had put the data in a block!, inserts and deletes would invalidate the index because the references would be referring to different elements. It really made it seem like the reference invalidation that you get from inserting or deleting earlier elements in other series types was a bit of a bug, and that only list! worked properly with INSERT.

As for how this could be on topic, the whole thing could have been replaced with a map! if that had existed in R2, and I wouldn't have needed list! or hash!.

A map! in R3 can be used like the table type in Lua to fake array-like behavior by using integer keys. However, the only thing you can't do with map! that you can with hash! is have references to an offset position, and ordering of the elements. I'm sure that you can prove that this lack can be worked around with code though, so that might not be a sufficient argument to justify adding hash! back. If that isn't enough, the only arguments left are possible memory savings relative to map! (I haven't compared them), or for backwards compatibility.
(0003321)
Ladislav
1-Jan-2013 23:44

"No, I actually needed the list! behavior as it was in R2" - surely, my proposed list! implementation for R3:

- has got the needed list! properties (fast remove - O(1) for removing one list element, fast sequential access - O(1) to move to the next element, fast insert - O(1) to insert one element to the list)
- while dealing with references to removed list elements in a fast (O(1), in fact) way and yet not referring "out"

I guess that is what you mean by "list! behavior as it was in R2" and "particular list position to survive inserts and deletes"?

- aha, I may be wrong regarding "particular list position to survive inserts" - what exactly do you mean by that? (my guess is that you simply mean the behaviour as observed in the list! datatype in R2, don't you?)

However, your implementation using block and hash! never can exhibit the proper list! behaviour (the O(1) operations simply aren't there)
(0003322)
BrianH
2-Jan-2013 01:27

I still don't remember your proposal, so don't assume I know what's in it :)

O(1) inserts and deletes were completely irrelevant for what I was doing. This is what I'm talking about:

>> a: make list! [x y z]
== make list! [x y z]
>> b: skip a 2
== make list! [z]
>> first b
== z
>> remove next a
== make list! [z]
>> first b
== z
>> insert b 'w
== make list! [z]
>> first b
== z

Removing or inserting an element earlier than the b reference doesn't invalidate b - b still points to the same element. Blocks don't work like that, they keep the index the same, not the referenced element. Behavior is the first priority, speed not as much so. Keeping that reference persistence is more important than O(1) inserts and deletes.

The way that map! would replace this usage of list! is something like this:

>> a: make map! [1 x 2 y 3 z]
== make map! [
1 x
2 y
3 z
]
>> b: 3
== 3
>> select a b
== z
>> a/2: none
== none
>> a
== make map! [
1 x
3 z
]
>> select a b
== z
>> a/2: 'w
== w
>> a
== make map! [
1 x
2 w
3 z
]
>> select a b
== z

You don't have offset references, but you do have key persistence and that can be good enough in many cases.
(0003323)
Ladislav
2-Jan-2013 02:58

" Behavior is the first priority, speed not as much so." - that is funny, usually the list! is defined the other way, by specifying which operations are elementary. However, luckily for you, the behaviour you like is actually optimal for lists. (but your proposed implementation surely isn't)
(0003324)
BrianH
2-Jan-2013 04:03

Nope, I didn't propose any implementation for lists. I just was describing some external behavior that I found useful.

I recall having a discussion with you about the implementation of blocks, but that is a different issue. Actually, forgive me for bringing up lists in a ticket about hashes. Big side-track.

Date User Field Action Change
2-Jan-2013 04:03 BrianH Comment : 0003324 Added -
2-Jan-2013 03:03 Ladislav Comment : 0003323 Modified -
2-Jan-2013 02:58 Ladislav Comment : 0003323 Added -
2-Jan-2013 02:16 BrianH Comment : 0003322 Modified -
2-Jan-2013 01:30 BrianH Comment : 0003322 Modified -
2-Jan-2013 01:27 BrianH Comment : 0003322 Added -
1-Jan-2013 23:51 Ladislav Comment : 0003321 Modified -
1-Jan-2013 23:50 Ladislav Comment : 0003321 Modified -
1-Jan-2013 23:46 Ladislav Comment : 0003321 Modified -
1-Jan-2013 23:44 Ladislav Comment : 0003321 Added -
1-Jan-2013 22:55 BrianH Comment : 0003320 Added -
1-Jan-2013 11:33 Ladislav Comment : 0003319 Added -
1-Jan-2013 02:48 BrianH Comment : 0003318 Modified -
1-Jan-2013 00:47 BrianH Comment : 0003318 Modified -
1-Jan-2013 00:46 BrianH Comment : 0003318 Modified -
1-Jan-2013 00:44 BrianH Comment : 0003318 Modified -
1-Jan-2013 00:42 BrianH Comment : 0003318 Modified -
1-Jan-2013 00:39 BrianH Comment : 0003318 Modified -
1-Jan-2013 00:26 BrianH Comment : 0003318 Modified -
1-Jan-2013 00:24 BrianH Comment : 0003318 Added -
31-Dec-2012 23:01 Ladislav Comment : 0003317 Added -
31-Dec-2012 22:54 Ladislav Comment : 0003316 Modified -
31-Dec-2012 22:52 Ladislav Comment : 0003316 Modified -
31-Dec-2012 22:51 Ladislav Comment : 0003316 Added -
31-Dec-2012 21:30 BrianH Comment : 0003315 Modified -
31-Dec-2012 18:10 BrianH Comment : 0003315 Added -
31-Dec-2012 16:59 Ladislav Comment : 0003314 Modified -
31-Dec-2012 16:58 Ladislav Comment : 0003314 Added -
28-Dec-2012 18:39 BrianH Comment : 0003312 Added -
28-Dec-2012 17:26 Ladislav Comment : 0003311 Modified -
28-Dec-2012 17:25 Ladislav Comment : 0003311 Added -
28-Dec-2012 17:24 Ladislav Comment : 0002908 Modified -
28-Dec-2012 17:23 Ladislav Comment : 0002908 Modified -
22-Nov-2010 21:42 BrianH Comment : 0002914 Modified -
22-Nov-2010 21:38 BrianH Comment : 0002914 Modified -
22-Nov-2010 18:49 Ladislav Comment : 0002908 Modified -
22-Nov-2010 18:48 Ladislav Comment : 0002917 Removed -
22-Nov-2010 18:48 Ladislav Comment : 0002917 Added -
21-Nov-2010 23:45 BrianH Comment : 0002914 Modified -
21-Nov-2010 23:26 BrianH Comment : 0002914 Modified -
21-Nov-2010 23:21 BrianH Comment : 0002914 Modified -
21-Nov-2010 23:02 BrianH Comment : 0002914 Modified -
21-Nov-2010 23:00 BrianH Comment : 0002914 Added -
21-Nov-2010 09:52 Ladislav Comment : 0002908 Added -
19-Nov-2010 11:57 oldes Comment : 0002906 Added -
17-Feb-2010 02:50 BrianH Comment : 0002037 Added -
16-Feb-2010 15:43 abolka Comment : 0002033 Added -
16-Feb-2010 04:40 BrianH Comment : 0002031 Added -
16-Feb-2010 01:21 PeterWood Type Modified Bug => Wish
16-Feb-2010 01:21 PeterWood Description Modified -
16-Feb-2010 01:17 PeterWood Ticket Added -