|
|||
|
On Fri, Nov 20, 2009 at 9:23 PM, Daniel Moore <yahivin@gmail.com> wrote:
> > ## Distinct Sets (#225) > > Aloha Rubyists, > > This week's quiz comes from Ruby Quiz Suggestions MVP Martin DeMello[1]. > > [based on a surprisingly tricky stackoverflow problem] > > You have an list of sets, which you want to transform by the following > method: if any two sets have a common element, merge them into a > single set. You will be left with a reduced list partitioning all the > elements into sets where every set is disjoint from every other. $ ruby -v 225.rb ruby 1.8.2 (2004-12-25) [powerpc-darwin8.0] Loaded suite 225 Started |
|
|
||||
|
||||
|
|
|
|||
|
On Nov 21, 2009, at 11:50 AM, brabuhr@gmail.com wrote: > On Fri, Nov 20, 2009 at 9:23 PM, Daniel Moore <yahivin@gmail.com> > wrote: >> >> ## Distinct Sets (#225) >> >> Aloha Rubyists, >> >> This week's quiz comes from Ruby Quiz Suggestions MVP Martin >> DeMello[1]. >> >> [based on a surprisingly tricky stackoverflow problem] >> >> You have an list of sets, which you want to transform by the >> following >> method: if any two sets have a common element, merge them into a >> single set. You will be left with a reduced list partitioning all the >> elements into sets where every set is disjoint from every other. > > $ ruby -v 225.rb > ruby 1.8.2 (2004-12-25) [powerpc-darwin8.0] > Loaded suite 225 > Started > . > Finished in 0.059557 seconds. > > 1 tests, 10 assertions, 0 failures, 0 errors > > But, I cheated a bit in my assertions; sorting the expected and actual > values before asserting them equal: I'm sure that there's a better way than mine, but it seems to work well. ruby -v -rubygems distinct_sets_test.rb ruby 1.8.6 (2008-08-11 patchlevel 287) [universal-darwin9.0] /Library/Ruby/Gems/1.8/gems/thoughtbot-shoulda-2.9.1/lib/shoulda/ context.rb:4: warning: method redefined; discarding old contexts Loaded suite distinct_sets_test Started .................... Finished in 0.018326 seconds. 20 tests, 148 assertions, 0 failures, 0 errors This includes all the tests given in the original quiz description and all the tests from brabuhr@gmail.com, but using Shoulda and splitting the one "tests" method with 10 assertions into 4 separate methods with one assertion each and eliminating the duplicates. http://gist.github.com/240457 I'm guessing the speed difference is due more to 1.8.2 v. 1.8.6 and the actual machines than any significant difference in our algorithms. -Rob Rob Biedenharn http://agileconsultingllc.com Rob@AgileConsultingLLC.com |
|
|||
|
> http://gist.github.com/240457
Both of your tests use rather small input sets. It would be interesting to know how the solutions deal with input that contains many (10, 50, 100, ....) sets and/or many different signs (not just letters). |
|
|||
|
On Nov 22, 2009, at 1:51 AM, lith wrote:
>> http://gist.github.com/240457 > > Both of your tests use rather small input sets. It would be > interesting to know how the solutions deal with input that contains > many (10, 50, 100, ....) sets and/or many different signs (not just > letters). I accept your challenge! The gist has been updated with sets that use numbers and symbols as well as strings. There are also some tests of large sets (which worked fine, but getting the test setup by hand was nasty). ruby -rubygems distinct_sets_test.rb Loaded suite distinct_sets_test Started .......................... Finished in 1.237836 seconds. 26 tests, 202 assertions, 0 failures, 0 errors The only change that I has to make was in how I sorted the final array to account for symbols or mixed contents: Numerics compare "naturally" with <=> and non-numeric or mixed are compared using the #to_s representation. -Rob P.S. I could add my solution to the gist, too, but I'll give everyone a chance to try the new tests first. Rob Biedenharn http://agileconsultingllc.com Rob@AgileConsultingLLC.com |
|
|||
|
On Mon, Nov 23, 2009 at 11:10 AM, Rob Biedenharn
<Rob@agileconsultingllc.com> wrote: > On Nov 22, 2009, at 1:51 AM, lith wrote: >>> http://gist.github.com/240457 >> >> Both of your tests use rather small input sets. It would be >> interesting to know how the solutions deal with input that contains >> many (10, 50, 100, ....) sets and/or many different signs (not just >> letters). > > I accept your challenge! =A0The gist has been updated with sets that use > numbers and symbols as well as strings. =A0There are also some tests of l= arge > sets (which worked fine, but getting the test setup by hand was nasty). Thanks. > ruby1.8 -v -rubygems distinct_sets_test.rb ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux] /var/lib/gems/1.8/gems/shoulda-2.10.2/lib/shoulda/context.rb:4: warning: method redefined; discarding old contexts Loaded suite distinct_sets_test Started ..................EEE..... Finished in 3.424045 seconds. 1) Error: test: non-uniform contents should handle matching on symbols. (DistinctSetsTest): ArgumentError: comparison of String with :bill failed ./distinct_sets.rb:7:in `sort' 2) Error: test: non-uniform contents should handle mix of strings and symbols (matching on string). (DistinctSetsTest): ArgumentError: comparison of String with :bill failed ./distinct_sets.rb:7:in `sort' 3) Error: test: non-uniform contents should handle mix of strings, numbers, and symbols. (DistinctSetsTest): ArgumentError: comparison of Fixnum with :emergency failed ./distinct_sets.rb:7:in `sort' 26 tests, 175 assertions, 0 failures, 3 errors > The only change that I has to make was in how I sorted the final array to > account for symbols or mixed contents: Numerics compare "naturally" with = <=3D> > and non-numeric or mixed are compared using the #to_s representation. Simply not sorting: 26 tests, 202 assertions, 17 failures, 0 errors Simply sort_by{to_s}: 26 tests, 202 assertions, 3 failures, 0 errors More complex sort{}: 26 tests, 202 assertions, 0 failures, 0 errors |
|
|||
|
[Note: parts of this message were removed to make it a legal post.]
Hey Rubyists! I think I got quite a fast solution, using 1.9. Here is my test file : http://pastie.org/711737 You just need to change METHOD and require your own file So here is my best result for the 19tests: Finished in 0.222688 seconds. 1 tests, 19 assertions, 0 failures, 0 errors, 0 skips Without sorting(I just had to change the order of some tests of Rob) Maye I should also say I'm using a 64bit ruby 1.9.2 ? Anyway I think this method is far faster than my others(about 100 times) and probably some of yours. Enjoy the quiz, Benoit 2009/11/23 <brabuhr@gmail.com> > On Mon, Nov 23, 2009 at 11:10 AM, Rob Biedenharn > <Rob@agileconsultingllc.com> wrote: > > On Nov 22, 2009, at 1:51 AM, lith wrote: > >>> http://gist.github.com/240457 > >> > >> Both of your tests use rather small input sets. It would be > >> interesting to know how the solutions deal with input that contains > >> many (10, 50, 100, ....) sets and/or many different signs (not just > >> letters). > > > > I accept your challenge! The gist has been updated with sets that use > > numbers and symbols as well as strings. There are also some tests of > large > > sets (which worked fine, but getting the test setup by hand was nasty). > > Thanks. > > > ruby1.8 -v -rubygems distinct_sets_test.rb > ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux] > /var/lib/gems/1.8/gems/shoulda-2.10.2/lib/shoulda/context.rb:4: > warning: method redefined; discarding old contexts > Loaded suite distinct_sets_test > Started > ..................EEE..... > Finished in 3.424045 seconds. > > 1) Error: > test: non-uniform contents should handle matching on symbols. > (DistinctSetsTest): > ArgumentError: comparison of String with :bill failed > ./distinct_sets.rb:7:in `sort' > > 2) Error: > test: non-uniform contents should handle mix of strings and symbols > (matching on string). (DistinctSetsTest): > ArgumentError: comparison of String with :bill failed > ./distinct_sets.rb:7:in `sort' > > 3) Error: > test: non-uniform contents should handle mix of strings, numbers, and > symbols. (DistinctSetsTest): > ArgumentError: comparison of Fixnum with :emergency failed > ./distinct_sets.rb:7:in `sort' > > 26 tests, 175 assertions, 0 failures, 3 errors > > > The only change that I has to make was in how I sorted the final array to > > account for symbols or mixed contents: Numerics compare "naturally" with > <=> > > and non-numeric or mixed are compared using the #to_s representation. > > Simply not sorting: > > 26 tests, 202 assertions, 17 failures, 0 errors > > Simply sort_by{to_s}: > > 26 tests, 202 assertions, 3 failures, 0 errors > > More complex sort{}: > > 26 tests, 202 assertions, 0 failures, 0 errors > > |
|
|||
|
Here's my not-fast solution:
require 'set' class Set def intersect?(other) other.each { |o| return true if include?(o) } false end end def distinct_sets(array_of_arrays) set_of_sets = array_of_arrays.map{|a| a.to_set }.to_set set_of_sets.divide{|i, j| i.intersect?(j) }.map{|s| s.flatten.to_a } end Adding the intersect? method to Set was primarily motivated by readability, but also provided a noticeable speed improvement over my original alternative (intersection.size.>). |
|
|||
|
[Note: parts of this message were removed to make it a legal post.]
I must admit is a very elegant solution. And it's not so slow, 0.36s with my tests and adding .map { |s| s.flatten.to_a*.sort* }, it pass all the tests without sorting. Awesome for a Ruby-based class! A nice exemple of using forgot methods like divide. 2009/11/23 <brabuhr@gmail.com> > Here's my not-fast solution: > > require 'set' > > class Set > def intersect?(other) > other.each { |o| return true if include?(o) } > false > end > end > > def distinct_sets(array_of_arrays) > set_of_sets = array_of_arrays.map{|a| > a.to_set > }.to_set > > set_of_sets.divide{|i, j| > i.intersect?(j) > }.map{|s| > s.flatten.to_a > } > end > > Adding the intersect? method to Set was primarily motivated by > readability, but also provided a noticeable speed improvement over my > original alternative (intersection.size.>). > > |
|
|||
|
> And it's not so slow,
I wrote a small approximative benchmark for that runs the script with different set configurations. http://pastie.org/711915 It might be interesting to compare the runtime behaviour of your script with other solutions. |
|
|||
|
On Tue, Nov 24, 2009 at 1:20 AM, lith <minilith@gmail.com> wrote:
>> And it's not so slow, > > I wrote a small approximative benchmark for that runs the script with > different set configurations. > http://pastie.org/711915 > cat /proc/cpuinfo | fgrep name model name : Intel(R) Pentium(R) M processor 1.60GHz > uname -a Linux eXist 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:04:26 UTC 2009 i686 GNU/Linux > ruby1.8 -v 711915.rb 225.rb ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux] user system total real elements * sets: 1 * 5 => 1 * 1 0.010000 0.000000 0.010000 ( 0.010348) elements * sets: 1 * 5 => 2 * 1 0.010000 0.000000 0.010000 ( 0.008285) elements * sets: 2 * 5 => 3 * 1 0.010000 0.000000 0.010000 ( 0.015534) elements * sets: 2 * 5 => 4 * 1 0.020000 0.000000 0.020000 ( 0.020882) elements * sets: 2 * 5 => 5 * 1 0.020000 0.010000 0.030000 ( 0.023482) elements * sets: 3 * 5 => 6 * 1 0.020000 0.000000 0.020000 ( 0.021767) elements * sets: 3 * 5 => 7 * 1 0.040000 0.000000 0.040000 ( 0.035285) elements * sets: 3 * 5 => 8 * 1 0.020000 0.010000 0.030000 ( 0.028141) elements * sets: 4 * 5 => 9 * 1 0.020000 0.000000 0.020000 ( 0.028562) elements * sets: 4 * 5 => 10 * 1 0.020000 0.010000 0.030000 ( 0.029465) elements * sets: 18 * 255 => 1 * 51 0.780000 0.140000 0.920000 ( 0.922092) elements * sets: 35 * 255 => 2 * 51 1.330000 0.190000 1.520000 ( 1.526110) elements * sets: 52 * 255 => 3 * 51 1.110000 0.220000 1.330000 ( 1.338236) elements * sets: 69 * 255 => 4 * 51 1.670000 0.280000 1.950000 ( 1.942574) elements * sets: 86 * 255 => 5 * 51 1.440000 0.320000 1.760000 ( 1.763389) elements * sets: 103 * 255 => 6 * 51 1.820000 0.330000 2.150000 ( 2.147160) elements * sets: 120 * 255 => 7 * 51 1.720000 0.430000 2.150000 ( 2.165951) elements * sets: 137 * 255 => 8 * 51 2.300000 0.480000 2.780000 ( 2.774613) elements * sets: 154 * 255 => 9 * 51 2.180000 0.390000 2.570000 ( 2.571221) elements * sets: 171 * 255 => 10 * 51 2.430000 0.500000 2.930000 ( 2.921943) elements * sets: 34 * 505 => 1 * 101 2.550000 0.650000 3.200000 ( 3.199823) elements * sets: 68 * 505 => 2 * 101 4.540000 1.000000 5.540000 ( 5.547267) elements * sets: 102 * 505 => 3 * 101 3.980000 0.740000 4.720000 ( 4.744038) elements * sets: 135 * 505 => 4 * 101 5.890000 1.210000 7.100000 ( 7.099683) elements * sets: 169 * 505 => 5 * 101 5.170000 1.120000 6.290000 ( 6.288145) elements * sets: 203 * 505 => 6 * 101 6.660000 1.510000 8.170000 ( 11.120129) elements * sets: 236 * 505 => 7 * 101 6.470000 1.360000 7.830000 ( 7.843291) elements * sets: 270 * 505 => 8 * 101 8.390000 1.830000 10.220000 ( 10.233364) elements * sets: 304 * 505 => 9 * 101 7.750000 1.630000 9.380000 ( 9.377075) elements * sets: 337 * 505 => 10 * 101 8.730000 2.000000 10.730000 ( 10.752132) And another one-off benchmark I had run: per sets set user system total real 10, 1 0.000000 0.000000 0.000000 ( 0.005441) 10, 10 0.010000 0.000000 0.010000 ( 0.006944) 10, 100 0.040000 0.010000 0.050000 ( 0.044136) 10, 1000 0.310000 0.060000 0.370000 ( 0.372266) 10, 10000 1.620000 0.090000 1.710000 ( 1.718613) 10, 100000 18.700000 1.020000 19.720000 ( 19.869159) user system total real 1, 10 0.000000 0.000000 0.000000 ( 0.002515) 10, 10 0.010000 0.000000 0.010000 ( 0.006701) 100, 10 0.370000 0.080000 0.450000 ( 0.454811) 1000, 10 36.210000 8.340000 44.550000 ( 44.798496) user system total real 1, 1 0.000000 0.000000 0.000000 ( 0.000435) 10, 10 0.010000 0.000000 0.010000 ( 0.006239) 100, 100 2.900000 0.750000 3.650000 ( 3.762373) (Sets were filled with random integers.) |
|
|||
|
> =A0 set_of_sets.divide{|i, j|
> =A0 =A0 i.intersect?(j) > =A0 } I didn't know the Set#divide method. Interesting. Here is a graph-based approach: http://pastie.org/714759 Regards, Tom |
|
|||
|
On Wed, Nov 25, 2009 at 11:59 AM, lith <minilith@gmail.com> wrote:
> Here is a graph-based approach: > http://pastie.org/714759 Running the small approximative benchmark: > fgrep name /proc/cpuinfo model name : Intel(R) Pentium(R) M processor 1.60GHz > uname -a Linux eXist 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:04:26 UTC 2009 i686 GNU/Linux > ruby1.8 -v 711915.rb 714759.rb ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux] user system total real elements * sets: 1 * 5 => 1 * 1 0.000000 0.000000 0.000000 ( 0.003731) elements * sets: 1 * 5 => 2 * 1 0.000000 0.000000 0.000000 ( 0.002467) elements * sets: 2 * 5 => 3 * 1 0.000000 0.000000 0.000000 ( 0.003154) elements * sets: 2 * 5 => 4 * 1 0.010000 0.000000 0.010000 ( 0.004540) elements * sets: 2 * 5 => 5 * 1 0.000000 0.000000 0.000000 ( 0.005251) elements * sets: 3 * 5 => 6 * 1 0.010000 0.000000 0.010000 ( 0.004601) elements * sets: 3 * 5 => 7 * 1 0.000000 0.000000 0.000000 ( 0.006505) elements * sets: 3 * 5 => 8 * 1 0.000000 0.010000 0.010000 ( 0.008302) elements * sets: 4 * 5 => 9 * 1 0.000000 0.000000 0.000000 ( 0.008121) elements * sets: 4 * 5 => 10 * 1 0.010000 0.010000 0.020000 ( 0.008174) elements * sets: 18 * 255 => 1 * 51 0.070000 0.000000 0.070000 ( 0.068440) elements * sets: 35 * 255 => 2 * 51 0.120000 0.020000 0.140000 ( 0.145098) elements * sets: 52 * 255 => 3 * 51 0.220000 0.030000 0.250000 ( 0.253006) elements * sets: 69 * 255 => 4 * 51 0.310000 0.060000 0.370000 ( 0.375845) elements * sets: 86 * 255 => 5 * 51 0.450000 0.070000 0.520000 ( 0.516071) elements * sets: 103 * 255 => 6 * 51 0.580000 0.090000 0.670000 ( 0.680203) elements * sets: 120 * 255 => 7 * 51 0.740000 0.130000 0.870000 ( 0.864627) elements * sets: 137 * 255 => 8 * 51 0.920000 0.160000 1.080000 ( 1.082008) elements * sets: 154 * 255 => 9 * 51 1.190000 0.130000 1.320000 ( 1.319013) elements * sets: 171 * 255 => 10 * 51 1.380000 0.190000 1.570000 ( 1.569024) elements * sets: 34 * 505 => 1 * 101 0.140000 0.010000 0.150000 ( 0.151603) elements * sets: 68 * 505 => 2 * 101 0.260000 0.060000 0.320000 ( 0.336318) elements * sets: 102 * 505 => 3 * 101 0.490000 0.060000 0.550000 ( 0.537813) elements * sets: 135 * 505 => 4 * 101 0.700000 0.090000 0.790000 ( 0.802585) elements * sets: 169 * 505 => 5 * 101 0.950000 0.160000 1.110000 ( 1.106535) elements * sets: 203 * 505 => 6 * 101 1.240000 0.220000 1.460000 ( 1.456670) elements * sets: 236 * 505 => 7 * 101 1.590000 0.260000 1.850000 ( 1.850735) elements * sets: 270 * 505 => 8 * 101 2.030000 0.260000 2.290000 ( 2.294973) elements * sets: 304 * 505 => 9 * 101 2.350000 0.430000 2.780000 ( 2.784864) elements * sets: 337 * 505 => 10 * 101 2.860000 0.470000 3.330000 ( 3.335518) And a little one-off benchmark (sets of random integers): per sets set user system total real 10, 1 0.000000 0.000000 0.000000 ( 0.000776) 10, 10 0.000000 0.000000 0.000000 ( 0.005667) 10, 100 0.350000 0.030000 0.380000 ( 0.384304) 10, 1000 Timeout::Error - execution expired (> 3 minutes) user system total real 1, 10 0.000000 0.000000 0.000000 ( 0.000743) 10, 10 0.000000 0.000000 0.000000 ( 0.009329) 100, 10 0.040000 0.010000 0.050000 ( 0.056932) 1000, 10 5.180000 0.070000 5.250000 ( 7.455180) 10000, 10 8.460000 0.510000 8.970000 ( 12.744708) 100000, 10 Timeout::Error - execution expired (> 3 minutes) user system total real 1, 1 0.010000 0.000000 0.010000 ( 0.003573) 10, 10 0.010000 0.000000 0.010000 ( 0.008709) 100, 100 28.750000 0.350000 29.100000 ( 42.588412) 1000, 1000 Timeout::Error - execution expired (> 3 minutes) |
|
|||
|
<brabuhr@gmail.com>:
>> Here's my not-fast solution: Benoit Daloze <eregontp@gmail.com>: > I must admit is a very elegant solution. Here's a fresh non-elegant solution: def distinct_sets(sets) sets = sets.dup h1 = {}; h2 = {} sets.each{|s| h1[s.object_id] = s.dup; s.each{|e| (h2[e] ||= []) << s.object_id}} merges = h2.select{|_, ids| ids.size > 1}.map{|_, ids| ids} return sets.sort.uniq if merges.size == 0 flag = true while flag flag = false merges = h1.keys.map{|id| merges.select{|m| m.include?(id)}.tap{|m| flag = true if m.size > 1}.flatten.uniq }.uniq end result = [] merges.each{|m| result << m.map{|id| s = h1[id]; h1.delete(id); s}.flatten.sort.uniq } (result + h1.values).sort.uniq end Slight bug in this version, (sometimes) adds an empty set to "result", e.g.: [ [], ["B", "C", "D", "F", "G", "J", "K", "L", "M", "N", "Q"], ["E", "H"] ] ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux] user system total real elements * sets: 1 * 5 => 1 * 1 0.010000 0.000000 0.010000 ( 0.008809) elements * sets: 1 * 5 => 2 * 1 0.000000 0.000000 0.000000 ( 0.008136) elements * sets: 2 * 5 => 3 * 1 0.010000 0.000000 0.010000 ( 0.007661) elements * sets: 2 * 5 => 4 * 1 0.010000 0.000000 0.010000 ( 0.007797) elements * sets: 2 * 5 => 5 * 1 0.010000 0.000000 0.010000 ( 0.008351) elements * sets: 3 * 5 => 6 * 1 0.010000 0.000000 0.010000 ( 0.008181) elements * sets: 3 * 5 => 7 * 1 0.010000 0.000000 0.010000 ( 0.011584) elements * sets: 3 * 5 => 8 * 1 0.010000 0.000000 0.010000 ( 0.011224) elements * sets: 4 * 5 => 9 * 1 0.010000 0.000000 0.010000 ( 0.011744) elements * sets: 4 * 5 => 10 * 1 0.020000 0.000000 0.020000 ( 0.014110) elements * sets: 18 * 255 => 1 * 51 0.630000 0.100000 0.730000 ( 0.752032) elements * sets: 35 * 255 => 2 * 51 1.760000 0.230000 1.990000 ( 2.104564) elements * sets: 52 * 255 => 3 * 51 2.230000 0.330000 2.560000 ( 2.578473) elements * sets: 69 * 255 => 4 * 51 2.760000 0.400000 3.160000 ( 3.179041) elements * sets: 86 * 255 => 5 * 51 3.230000 0.520000 3.750000 ( 3.770260) elements * sets: 103 * 255 => 6 * 51 3.740000 0.560000 4.300000 ( 4.326179) elements * sets: 120 * 255 => 7 * 51 4.250000 0.630000 4.880000 ( 4.919036) elements * sets: 137 * 255 => 8 * 51 4.870000 0.700000 5.570000 ( 5.998519) elements * sets: 154 * 255 => 9 * 51 5.350000 0.710000 6.060000 ( 6.334980) elements * sets: 171 * 255 => 10 * 51 5.960000 0.700000 6.660000 ( 6.852846) elements * sets: 34 * 505 => 1 * 101 2.200000 0.310000 2.510000 ( 2.528937) elements * sets: 68 * 505 => 2 * 101 6.230000 0.860000 7.090000 ( 7.158239) elements * sets: 102 * 505 => 3 * 101 8.130000 1.290000 9.420000 ( 10.024654) elements * sets: 135 * 505 => 4 * 101 10.190000 1.370000 11.560000 ( 12.229713) elements * sets: 169 * 505 => 5 * 101 12.060000 1.710000 13.770000 ( 14.657392) elements * sets: 203 * 505 => 6 * 101 13.870000 2.060000 15.930000 ( 16.317533) elements * sets: 236 * 505 => 7 * 101 15.860000 2.310000 18.170000 ( 18.707348) elements * sets: 270 * 505 => 8 * 101 17.610000 2.750000 20.360000 ( 20.495151) elements * sets: 304 * 505 => 9 * 101 19.780000 2.880000 22.660000 ( 23.630379) elements * sets: 337 * 505 => 10 * 101 22.020000 2.980000 25.000000 ( 26.895880) per set set user system total real 10, 1 0.000000 0.000000 0.000000 ( 0.000205) 10, 10 0.000000 0.000000 0.000000 ( 0.000565) 10, 100 0.010000 0.000000 0.010000 ( 0.004842) 10, 1000 0.070000 0.000000 0.070000 ( 0.069138) 10, 10000 0.770000 0.100000 0.870000 ( 0.872998) 10, 100000 18.600000 1.130000 19.730000 ( 19.776977) user system total real 1, 10 0.000000 0.000000 0.000000 ( 0.000126) 10, 10 0.000000 0.000000 0.000000 ( 0.000569) 100, 10 0.010000 0.000000 0.010000 ( 0.007341) 1000, 10 1.360000 0.040000 1.400000 ( 1.419874) 10000, 10 (> 3 minutes) user system total real 1, 1 0.000000 0.000000 0.000000 ( 0.000044) 10, 10 0.000000 0.000000 0.000000 ( 0.000587) 100, 100 0.080000 0.010000 0.090000 ( 0.077950) 1000, 1000 (> 3 minutes) |
|
|||
|
<brabuhr@gmail.com>:
>>> Here's my not-fast solution: Benoit Daloze <eregontp@gmail.com>: >> I must admit is a very elegant solution. <brabuhr@gmail.com> wrote: > Here's a fresh non-elegant solution: Had an epiphany while thinking about the last one I had sent :-) The last half of the previous one can be applied directly to the input array of arrays of items instead of to the intermediate array of arrays of object ids: def distinct_sets(sets) sets = sets.dup values = sets.flatten.sort.uniq flag = true while flag flag = false sets = values.map{|v| sets.select{|s| s.include?(v) }.tap{|s| flag = true if s.size > 1 }.flatten.sort.uniq }.uniq end sets end Easier code, but generally scales more poorly than the previous more complex version: ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux] user system total real elements * sets: 1 * 5 => 1 * 1 0.000000 0.000000 0.000000 ( 0.001928) elements * sets: 1 * 5 => 2 * 1 0.000000 0.000000 0.000000 ( 0.002136) elements * sets: 2 * 5 => 3 * 1 0.000000 0.000000 0.000000 ( 0.003218) elements * sets: 2 * 5 => 4 * 1 0.000000 0.000000 0.000000 ( 0.003926) elements * sets: 2 * 5 => 5 * 1 0.010000 0.000000 0.010000 ( 0.007761) elements * sets: 3 * 5 => 6 * 1 0.010000 0.000000 0.010000 ( 0.012438) elements * sets: 3 * 5 => 7 * 1 0.010000 0.000000 0.010000 ( 0.009863) elements * sets: 3 * 5 => 8 * 1 0.010000 0.000000 0.010000 ( 0.015234) elements * sets: 4 * 5 => 9 * 1 0.020000 0.000000 0.020000 ( 0.022202) elements * sets: 4 * 5 => 10 * 1 0.020000 0.000000 0.020000 ( 0.025353) elements * sets: 18 * 255 => 1 * 51 0.400000 0.120000 0.520000 ( 0.514576) elements * sets: 35 * 255 => 2 * 51 0.970000 0.180000 1.150000 ( 1.280102) elements * sets: 52 * 255 => 3 * 51 1.690000 0.210000 1.900000 ( 2.260107) elements * sets: 69 * 255 => 4 * 51 2.310000 0.390000 2.700000 ( 2.888354) elements * sets: 86 * 255 => 5 * 51 3.220000 0.460000 3.680000 ( 4.250457) elements * sets: 103 * 255 => 6 * 51 4.090000 0.590000 4.680000 ( 4.723892) elements * sets: 120 * 255 => 7 * 51 5.200000 0.600000 5.800000 ( 6.258614) elements * sets: 137 * 255 => 8 * 51 6.330000 0.700000 7.030000 ( 7.694748) elements * sets: 154 * 255 => 9 * 51 7.470000 0.870000 8.340000 ( 8.730147) elements * sets: 171 * 255 => 10 * 51 8.970000 0.820000 9.790000 ( 9.871180) elements * sets: 34 * 505 => 1 * 101 1.720000 0.250000 1.970000 ( 2.331276) elements * sets: 68 * 505 => 2 * 101 3.620000 0.750000 4.370000 ( 4.983125) elements * sets: 102 * 505 => 3 * 101 6.110000 0.920000 7.030000 ( 7.258356) elements * sets: 135 * 505 => 4 * 101 8.730000 1.460000 10.190000 ( 10.912860) elements * sets: 169 * 505 => 5 * 101 12.130000 1.630000 13.760000 ( 14.714311) elements * sets: 203 * 505 => 6 * 101 15.280000 2.040000 17.320000 ( 17.556731) elements * sets: 236 * 505 => 7 * 101 19.090000 2.490000 21.580000 ( 22.005613) elements * sets: 270 * 505 => 8 * 101 23.430000 2.620000 26.050000 ( 26.366511) elements * sets: 304 * 505 => 9 * 101 28.090000 3.020000 31.110000 ( 32.421622) elements * sets: 337 * 505 => 10 * 101 33.270000 3.550000 36.820000 ( 39.281699) per set set user system total real 10, 1 0.000000 0.000000 0.000000 ( 0.000338) 10, 10 0.010000 0.000000 0.010000 ( 0.006647) 10, 100 0.320000 0.000000 0.320000 ( 0.326307) 10, 1000 (>3 minutes) user system total real 1, 10 0.000000 0.000000 0.000000 ( 0.000311) 10, 10 0.010000 0.000000 0.010000 ( 0.006137) 100, 10 0.290000 0.020000 0.310000 ( 0.305198) 1000, 10 77.240000 8.000000 85.240000 ( 90.868828) 10000, 10 (>3 minutes) user system total real 1, 1 0.000000 0.000000 0.000000 ( 0.000052) 10, 10 0.000000 0.000000 0.000000 ( 0.006212) 100, 100 84.440000 1.080000 85.520000 ( 91.856640) 1000, 1000 (>3 minutes) |
|
|||
|
Sorry for all the posts :-) I think I'm done now. I refined the
complex, non-elegant solution a bit more; so, to make it easy to reference later, here are my three main solutions: First: require 'set' class Set def intersect?(other) other.each { |o| return true if include?(o) } false end end def distinct_sets(array_of_arrays) set_of_sets = array_of_arrays.map{|a| a.to_set }.to_set set_of_sets.divide{|i, j| i.intersect?(j) }.map{|s| s.flatten.to_a.sort } end Third: def distinct_sets(sets) sets = sets.dup values = sets.flatten.sort.uniq flag = true while flag flag = false sets = values.map{|v| sets.select{|s| s.include?(v) }.tap{|s| flag = true if s.size > 1 }.flatten.sort.uniq }.uniq end sets end Second (from the earlier post): def distinct_sets(sets) sets = sets.dup h1 = {}; h2 = {} sets.each{|s| h1[s.object_id] = s.dup s.each{|e| (h2[e] ||= []) << s.object_id} } merges = h2.select{|_, ids| ids.size > 1 }.map{|_, ids| ids} return sets.sort.uniq if merges.size == 0 flag = true while flag flag = false merges = h1.keys.map{|id| merges.select{|m| m.include?(id) }.tap{|m| flag = true if m.size > 1 }.flatten.uniq }.uniq end result = [] merges.each{|m| result << m.map{|id| s = h1[id]; h1.delete(id); s }.flatten.sort.uniq } (result + h1.values).sort.uniq end Second (refined version): def distinct_sets(sets) sets = sets.dup h1 = {}; h2 = {} sets.each{|s| h1[s.object_id] = s.dup s.each{|e| (h2[e] ||= []) << s.object_id} } merges = h2.values.sort.uniq flag = true while flag flag = false merges = h1.keys.map{|id| merges.select{|m| m.include?(id) }.tap{|m| flag = true if m.size > 1 }.flatten.sort.uniq }.sort.uniq end merges.map{|m| m.map{|id| s = h1[id]; h1.delete(id); s }.flatten.sort.uniq } end |
|
|
|
|
![]() |
| Popular Tags in the Forum |
| #225, distinct, quiz, sets |
| Thread Tools | |
| Display Modes | |
|
|
Similar Threads
|
||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| Los Angeles Locksmith Install High security locks featuring MedecoLocks 1-877-364-5264 | lifeine | Newsgroup comp.lang.ruby | 0 | 10-12-2009 02:16 AM |
| Sherman Oaks Locksmith Install deadbolts knob sets Locks Re-key locks1-877-364-5264 | lifeine | Newsgroup comp.lang.ruby | 0 | 10-12-2009 02:15 AM |
| Los Angeles Locksmith Install deadbolts knob sets Locks Re-key locks1-877-364-5264 | lifeine | Newsgroup comp.lang.ruby | 0 | 10-12-2009 02:15 AM |
| Re: PROC SQL--select DISTINCT | Richard Read Allen | Newsgroup comp.soft-sys.sas | 0 | 01-10-2008 10:31 PM |
| Re: PROC SQL--select DISTINCT | data _null_, | Newsgroup comp.soft-sys.sas | 0 | 01-10-2008 09:13 PM |