Type | Bug | Status | built | Date | 8-Jul-2011 23:17 |
---|---|---|---|---|---|
Version | alpha 111 | Category | Mezzanine | Submitted by | GHigley |
Platform | All | Severity | minor | Priority | low |
Summary | SPLIT adds empty string to end of returned block |
---|---|
Description | SPLIT returns an empty string at the end of the returned block when using a delimiter. Although this may be by design, I'm skeptical. (I have not tested this on any platform but OS X.) |
Example code |
foo: "a.b.c"
split foo "." ; returns ["a" "b" "c" ""]
foo: "a.b.c."
split foo "." ; returns ["a" "b" "c" ""], same as above |
Assigned to | n/a | Fixed in | r3 master | Last Update | 19-Feb-2014 21:27 |
---|
Comments | |
---|---|
(0003190)
Gregg 19-Jul-2011 20:44 |
The following implementation is a bit ugly, with the special handling added, but it does address all the SPLIT tickets I've found here. |
(0003191)
Gregg 19-Jul-2011 20:44 |
split: func [
    "Split a series into pieces; fixed or variable size, fixed number, or at delimiters"
    series [series!] "The series to split"
    dlm [block! integer! char! bitset! any-string!] "Split size, delimiter(s), or rule(s)."
    /into "If dlm is an integer, split into n pieces, rather than pieces of length n."
    /local size piece-size count mk1 mk2 res fill-val add-fill-val
][
    either all [block? dlm  parse dlm [some integer!]] [
        map-each len dlm [
            either positive? len [
                copy/part series series: skip series len
            ] [
                series: skip series negate len
                ; return unset so that nothing is added to output
                ()
            ]
        ]
    ][
        size: dlm   ; alias for readability
        res: collect [
            parse/all series case [
                all [integer? size  into] [
                    if size < 1 [cause-error 'Script 'invalid-arg size]
                    count: size - 1
                    piece-size: to integer! round/down divide length? series size
                    if zero? piece-size [piece-size: 1]
                    [
                        count [copy series piece-size skip (keep/only series)]
                        copy series to end (keep/only series)
                    ]
                ]
                integer? dlm [
                    if size < 1 [cause-error 'Script 'invalid-arg size]
                    [any [copy series 1 size skip (keep/only series)]]
                ]
                'else [ ; = any [bitset? dlm  any-string? dlm  char? dlm]
                    [any [mk1: some [mk2: dlm break | skip] (keep/only copy/part mk1 mk2)]]
                ]
            ]
        ]
        ;-- Special processing, to handle cases where the spec'd more items in
        ;   /into than the series contains (so we want to append empty items),
        ;   or where the dlm was a char/string/charset and it was the last char
        ;   (so we want to append an empty field that the above rule misses).
        fill-val: does [copy either any-block? series [[]] [""]]
        add-fill-val: does [append/only res fill-val]
        case [
            all [integer? size  into] [
                ; If the result is too short, i.e., fewer items than 'size, add
                ; empty items to fill it to 'size.
                ; We loop here, because insert/dup doesn't copy the value inserted.
                if size > length? res [
                    loop (size - length? res) [add-fill-val]
                ]
            ]
            ; integer? dlm [
            ; ]
            'else [ ; = any [bitset? dlm  any-string? dlm  char? dlm]
                ; If the last thing in the series is a delimiter, there is an
                ; implied empty field after it, which we add here.
                case [
                    bitset? dlm [
                        ; ATTEMPT is here because LAST will return NONE for an
                        ; empty series, and finding none in a bitset is not allowed.
                        if attempt [find dlm last series] [add-fill-val]
                    ]
                    char? dlm [
                        if dlm = last series [add-fill-val]
                    ]
                    string? dlm [
                        if all [
                            find series dlm
                            empty? find/last/tail series dlm
                        ] [add-fill-val]
                    ]
                ]
            ]
        ]
        res
    ]
] |
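For orientation, here is a short usage sketch of the modes the function above handles: fixed piece size, a fixed number of pieces via /INTO, a block of piece sizes, and a delimiter. The calls and the expected results are copied from the tests posted later in this ticket, not captured from a fresh run, and they assume the SPLIT definition above has been loaded into an R3 session.

```rebol
; Assumes the SPLIT definition from the comment above is loaded.
; Expected results are taken from the tests later in this ticket.

; Integer dlm: pieces of length 4.
probe split "1234567812345678" 4
; expected: ["1234" "5678" "1234" "5678"]

; Integer dlm with /into: 2 pieces.
probe split/into [1 2 3 4 5 6] 2
; expected: [[1 2 3] [4 5 6]]

; Block of integers: one piece per size.
probe split [1 2 3 4 5 6] [2 1 3]
; expected: [[1 2] [3] [4 5 6]]

; Char delimiter.
probe split "abc,de,fghi,jk" #","
; expected: ["abc" "de" "fghi" "jk"]
```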
(0003192)
Gregg 19-Jul-2011 20:46 |
A quick test func:
test: func [block expected-result /local res] [
    if error? try [
        print [mold/only :block newline tab mold res: do block]
        if res <> expected-result [print [tab 'FAILED! tab 'expected mold expected-result]]
    ][
        print [mold/only :block newline tab "ERROR!"]
    ]
] |
(0003193)
Gregg 19-Jul-2011 20:47 |
And a few tests:
test [split "1234567812345678" 4] ["1234" "5678" "1234" "5678"]
test [split "1234567812345678" 3] ["123" "456" "781" "234" "567" "8"]
test [split "1234567812345678" 5] ["12345" "67812" "34567" "8"]
test [split/into [1 2 3 4 5 6] 2] [[1 2 3] [4 5 6]]
test [split/into "1234567812345678" 2] ["12345678" "12345678"]
test [split/into "1234567812345678" 3] ["12345" "67812" "345678"]
test [split/into "1234567812345678" 5] ["123" "456" "781" "234" "5678"]
; Dlm longer than series
test [split/into "123" 6] ["1" "2" "3" "" "" ""] ;or ["1" "2" "3"]
test [split/into [1 2 3] 6] [[1] [2] [3] [] [] []] ;or [1 2 3]
test [split [1 2 3 4 5 6] [2 1 3]] [[1 2] [3] [4 5 6]]
test [split "1234567812345678" [4 4 2 2 1 1 1 1]] ["1234" "5678" "12" "34" "5" "6" "7" "8"]
test [split first [(1 2 3 4 5 6 7 8 9)] 3] [(1 2 3) (4 5 6) (7 8 9)]
test [split #{0102030405060708090A} [4 3 1 2]] [#{01020304} #{050607} #{08} #{090A}]
test [split [1 2 3 4 5 6] [2 1]] [[1 2] [3]]
test [split [1 2 3 4 5 6] [2 1 3 5]] [[1 2] [3] [4 5 6] []]
test [split [1 2 3 4 5 6] [2 1 6]] [[1 2] [3] [4 5 6]]
; Old design for negative skip vals
;test [split [1 2 3 4 5 6] [3 2 2 -2 2 -4 3]] [[1 2 3] [4 5] [6] [5 6] [3 4 5]]
; New design for negative skip vals
test [split [1 2 3 4 5 6] [2 -2 2]] [[1 2] [5 6]]
test [split "abc,de,fghi,jk" #","] ["abc" "de" "fghi" "jk"]
test [split "abc |
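The two inputs from the original report can be written in the same form. The expected values below are an assumption: they come from reading the proposed implementation (an empty field is appended only when the series ends with the delimiter), not from a recorded run. `test` is the helper from the previous comment.

```rebol
; Assumes the SPLIT and TEST definitions from the comments above are loaded.
; Expected values are inferred from the implementation, not from a recorded run.

; No trailing delimiter: no empty field at the end.
test [split "a.b.c" "."] ["a" "b" "c"]

; Trailing delimiter: one implied empty field after it.
test [split "a.b.c." "."] ["a" "b" "c" ""]
```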
(0003429)
abolka 6-Feb-2013 02:41 |
Gregg, that looks fantastic.
Seems one of the tests got messed up/misformatted by CureCode, but I can confirm that the rest of the tests pass fine with R3 A111. |
(0003934)
abolka 18-Aug-2013 10:48 |
I'd like to see this merged. I think there was a desire (expressed by BrianH, for example, IIRC) to have the /INTO refinement of the function proposed above renamed.
If I remember this correctly, are there any suggestions as to what to rename /INTO to? |
(0003938)
johnk 19-Aug-2013 07:47 |
Merged into mainline: https://github.com/rebol/rebol/pull/130
Keep the ticket open to discuss renaming /INTO. |
(0003940)
onetom 19-Aug-2013 11:24 |
Why not break it out into its own ticket with a back reference to this one?
It's easier to search for and the discussion can be more focused too. (Awesome work, Gregg. A gem, and it's been buried here for so long...) Btw, are we not moving to GitHub issues? |
(0003941)
onetom 19-Aug-2013 11:43 |
Like this: https://github.com/rebol/rebol/issues/131 ? |
(0003942)
abolka 19-Aug-2013 12:06 |
A separate ticket sounds good.
But please keep the discussion here on CureCode for now, until we properly migrate this whole CureCode database to GitHub issues. |
(0003943)
abolka 19-Aug-2013 14:45 |
Created ticket #2051 to discuss the renaming of /INTO. |
(0003946)
abolka 21-Aug-2013 00:24 |
Now in the core tests suite. (Added Gregg's tests from the comment above.) |
Date | User | Field | Action | Change |
---|---|---|---|---|
19-Feb-2014 21:27 | BrianH | Platform | Modified | Mac OSX => All |
19-Feb-2014 21:27 | BrianH | Status | Modified | submitted => built |
19-Feb-2014 21:27 | BrianH | Fixedin | Modified | => r3 master |
19-Feb-2014 21:27 | BrianH | Code | Modified | - |
21-Aug-2013 00:24 | abolka | Comment : 0003946 | Added | - |
19-Aug-2013 14:45 | abolka | Comment : 0003943 | Added | - |
19-Aug-2013 12:06 | abolka | Comment : 0003942 | Added | - |
19-Aug-2013 11:43 | onetom | Comment : 0003941 | Added | - |
19-Aug-2013 11:25 | onetom | Comment : 0003940 | Modified | - |
19-Aug-2013 11:24 | onetom | Comment : 0003940 | Added | - |
19-Aug-2013 07:47 | johnk | Comment : 0003938 | Added | - |
19-Aug-2013 02:38 | johnk | Comment : 0003937 | Removed | - |
19-Aug-2013 01:53 | johnk | Comment : 0003937 | Modified | - |
19-Aug-2013 01:53 | johnk | Comment : 0003937 | Modified | - |
19-Aug-2013 01:52 | johnk | Comment : 0003937 | Modified | - |
19-Aug-2013 01:52 | abolka | Comment : 0003934 | Modified | - |
19-Aug-2013 01:52 | abolka | Comment : 0003934 | Modified | - |
19-Aug-2013 01:44 | johnk | Comment : 0003937 | Added | - |
18-Aug-2013 10:48 | abolka | Comment : 0003934 | Added | - |
6-Feb-2013 02:41 | abolka | Comment : 0003429 | Added | - |
19-Jul-2011 20:47 | Gregg | Comment : 0003193 | Added | - |
19-Jul-2011 20:46 | Gregg | Comment : 0003192 | Added | - |
19-Jul-2011 20:45 | Gregg | Comment : 0003191 | Modified | - |
19-Jul-2011 20:45 | Gregg | Comment : 0003191 | Modified | - |
19-Jul-2011 20:44 | Gregg | Comment : 0003191 | Modified | - |
19-Jul-2011 20:44 | Gregg | Comment : 0003191 | Added | - |
19-Jul-2011 20:44 | Gregg | Comment : 0003190 | Added | - |
8-Jul-2011 23:17 | GHigley | Ticket | Added | - |