REBOL3 tracker
Ticket #0001984

Type: Wish    Status: reviewed    Date: 5-Mar-2013 12:50
Version: r3 master    Category: Native    Submitted by: Sunanda
Platform: All    Severity: minor    Priority: normal

Summary: Can we have set operations on map! please?
Description: In R2 I can use hash! for fast set operations, e.g.:

dataset1: to-hash [1 2 3]
dataset2: to-hash [1 3 5]
find dataset1 1 ;; membership yes/no
datasetu: union dataset1 dataset2 ;; set of all members
datasetd: difference dataset1 dataset2 ;; set of differences
dataseti: intersect dataset1 dataset2 ;; set of common members

These other datasets are hash!s themselves and so are fast to search.

In R3 I can do set membership like this:
dataset1: to-map [1 true 2 true 3 true]
dataset2: to-map [1 true 3 true 5 true]
select dataset1 1 ;; membership yes/no

And I can do other operations like this:
datasetu: union words-of dataset1 words-of dataset2 ;; set of all members
datasetd: difference words-of dataset1 words-of dataset2 ;; set of differences
dataseti: intersect words-of dataset1 words-of dataset2 ;; set of common members

But what I end up with is a block! not a map!

I know I can convert datasetu etc to map! but that's quite an overhead in code.
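
For reference, a minimal sketch of that conversion step, assuming R3's APPEND accepts a key/value block for a map!. The helper name keys-to-map is hypothetical and not part of R3:

```rebol
; Hypothetical helper (not part of R3): turn a block of keys back into a
; membership map!, with every key mapped to true.
keys-to-map: func [keys [block!] /local result] [
    result: make map! []
    foreach key keys [append result reduce [key true]]
    result
]

dataset1: to-map [1 true 2 true 3 true]
dataset2: to-map [1 true 3 true 5 true]

; The set operation still runs on blocks of keys; only the result is rebuilt.
datasetu: keys-to-map union words-of dataset1 words-of dataset2
select datasetu 5 ;; membership yes/no
```

Every set operation needs this extra rebuild pass over the result, which is the overhead a native map! implementation would avoid.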

Assigned to: n/a    Fixed in: -    Last Update: 8-Mar-2013 13:53


Comments
(0003567)
BrianH
6-Mar-2013 00:50

Yes please! UNIQUE could just be a shallow copy, but the rest would be useful. And it might make sense to do a UNION, INTERSECT, DIFFERENCE or EXCLUDE with a map and a block that would be interpreted as a collection of keys. Their /skip refinements can be ignored though, just as they are with SELECT.

Note that since keys are unique anyway we can just compare on keys. No #428 /skip issues here.
(0003568)
abolka
6-Mar-2013 01:22

+1 as well.

Note that this leaves open the question of what to do with _values_ for the UNION and INTERSECT set operations. While there are several useful possibilities, I suggest implementing the most straightforward one: by definition, have the value from the second (*) series argument ("SET2") survive:

>> union map [a 1 b 2] map [b 3 c 4]
== map [a 1 b 3 c 4]

>> intersect map [a 1 b 2] map [b 3 c 4]
== map [b 3]

(*) Alternatively, define that the value from the first series argument survives. That may be better if we consider this definition giving "precedence" to the two arguments, and that the first argument can be considered "more important" than the second argument.

On the other hand, the definition of having values from the second series argument survive is nicely in line with how the MAP constructor behaves:

>> map [a 1 b 2 b 3 c 4]
== map [a 1 b 3 c 4]

(With this reading, UNION could for example be considered as the result of joining both map body blocks and then creating a map from that.)
(0003569)
BrianH
6-Mar-2013 01:45

The set functions themselves are otherwise first-wins, so it would make sense to be first-wins here as well. For example:

>> get first union reduce [in construct [a: 1] 'a] reduce [in construct [a: 2] 'a]
== 1 ; the first a won

And since map operations are all implied /skip 2 /compare 1, we won't have to worry about those being the default and assumed behavior.
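
A rough mezzanine-level sketch of the first-wins behavior for UNION on maps, assuming the native would compare on keys only. The name map-union is hypothetical, and this is an illustration rather than the proposed native implementation:

```rebol
; Hypothetical mezzanine (not part of R3): key-based union of two maps
; where the value from the first argument survives on shared keys.
map-union: func [map1 [map!] map2 [map!] /local result] [
    result: make map! []
    ; Add map2's pairs first, then map1's, so that map1's value
    ; overwrites map2's value wherever both maps share a key.
    foreach key words-of map2 [append result reduce [key select map2 key]]
    foreach key words-of map1 [append result reduce [key select map1 key]]
    result
]

m: map-union make map! [a 1 b 2] make map! [b 3 c 4]
select m 'b ;; value taken from the first map
```

Swapping the order of the two FOREACH loops would give the second-wins variant suggested in the previous comment.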

Date User Field Action Change
8-Mar-2013 13:53 sunanda Description Modified -
6-Mar-2013 01:52 BrianH Comment : 0003569 Modified -
6-Mar-2013 01:45 BrianH Comment : 0003569 Added -
6-Mar-2013 01:22 abolka Comment : 0003568 Added -
6-Mar-2013 01:09 BrianH Comment : 0003567 Modified -
6-Mar-2013 01:08 BrianH Description Modified -
6-Mar-2013 01:07 BrianH Description Modified -
6-Mar-2013 01:07 BrianH Category Modified Datatype => Native
6-Mar-2013 01:07 BrianH Status Modified submitted => reviewed
6-Mar-2013 00:56 BrianH Comment : 0003567 Modified -
6-Mar-2013 00:55 BrianH Comment : 0003567 Modified -
6-Mar-2013 00:50 BrianH Comment : 0003567 Added -
5-Mar-2013 12:50 sunanda Ticket Added -