Copyright (c) 2022 JEON Myoungjin <jeongoon@g… >
LICENSE: Open Software License 3.0
Combinations in Haskell Series
- Combinations In Haskell (called Leaders and Followers)
- Combinations In Haskell (called Tail After Tail)
- Tail After Tail Story (Combinations In Haskell)
Same Theory; Different Implementation
In programming world, the pseudo code or theory will trigger your programming implementation in many ways. This kind of diversion makes programming interesting as well.
Previously I made TailAfterTail.lhs
And I found that inits
or tails
are not quite necessary if I don’t rely on scanl
or zipWith
.
It is probably generally accepted that some different kind of implementation, which consist of even small amount of codes, but which run at very high frequency, will eventually show huge gaps in performance after many iterations of execution.
So, this is about what I found during implementation of Tail After Tail combinations.
Module Begins
Firstly, I going to write down the all function will be exposed.{-# LANGUAGE BangPatterns #-}
module TailAfterTail
( combinationsWithScanl
, combinationsWithSingleStep
, combinations
, combinationsWithTwoSteps
, allCombinationsWithScanl
, allCombinationsWithSingleStep
, allCombinationsWithTwoSteps
, allCombinationswhere
)
import Data.List (tails, inits, scanl') -- only required for **WithScanl
Original Version (scanl)
Please Find out more information HERE.
However, combinations1'
, flatten_allCombinationsGrouped
and
genPart
will be common helper functions.
combinations1' :: [a] -> [[[a]]]
= [ [[m]] | m <- ms ]
combinations1' ms
= map concat . allComboFunc
flatten_allCombinationsGrouped allComboFunc
genPart :: Foldable t => a -> t [[a]] -> [[a]]
= [ leader : followers
genPart leader followerGroups | followers <- concat followerGroups ]
And I define some helper functions.
usefulTails :: [a] -> [[a]]
= init . tails
usefulTails
genStep :: [[[a]]] -> [a] -> [[[a]]]
=
genStep prevTails members' zipWith genPart members' (usefulTails prevTails')
where
= tail prevTails -- head is not useful
prevTails'
= reverse . tail . inits -- tail is used to skip empty list. membersTails
and finally combinationsWithScanl
family goes below.
allCombinationsWithScanl' :: [a] -> [[[[a]]]]
=
allCombinationsWithScanl' ms
scanl' genStep (combinations1' ms) (membersTails ms)
allCombinationsWithScanlGrouped :: [a] -> [[[a]]]
=
allCombinationsWithScanlGrouped
flatten_allCombinationsGrouped allCombinationsWithScanl'
allCombinationsWithScanl :: [a] -> [[a]]
= concat . allCombinationsWithScanlGrouped allCombinationsWithScanl
Pure Implementation Without Scanl (SingleStep)
The following code is created without scanl or zipWith.
It gains slightly more performance with (bang pattern: !). Which will be covered may be in another article. But IMHO, it helps to reduce laziness and use less stack.
unsafe_allCombinationsWithSingleStep :: [a] -> [[[[a]]]]
=
unsafe_allCombinationsWithSingleStep members let
! cases = -- bang pattern added
helper let
:ms) (_:cs:[]) = [ [ m : c | c <- cs ] ]
genStep (m:ms) (_:cs) =
genStep (m-- note ^ : we don't use first element
: genStep ms cs
genPart m cs in
: helper (genStep members cases)
cases in
. combinations1' $ members helper
As you can see helper
function is just an entry level wrapper function
and make a recursion call.
genStep
will actually create next cases and act as thunk which is evaluated
later thanks to laziness in haskell.
I named the function as unsafe_
on purpose. Because helper function actually
doesn’t know when it will stop, and if you run unsafe_allCombinationsWithSingleStep
in bare context will explode with exception.
sh> ghci
GHCi, version 8.10.7: https://www.haskell.org/ghc/ :? for help
Loaded GHCi configuration from /your/home/.config/ghc/ghci.conf
λ> :l 2022-04-15-Combinations-TailAfterTail'.lhs
[1 of 1] Compiling TailAfterTail ( 2022-04-15-Combinations-TailAfterTail'.lhs, interpreted )
Ok, one module loaded.
λ> unsafe_allCombinationsWithSingleStep [1..5]
[[[[1]],[[2]],[[3]],[[4]],[[5]]],[[[1,2],[1,3],[1,4],[1,5]],[[2,3],[2,4],[2,5]],[[3,4],[3,5]],[[4,5]]],[[[1,2,3],[1,2,4],[1,2,5],[1,3,4],[1,3,5],[1,4,5]],[[2,3,4],[2,3,5],[2,4,5]],[[3,4,5]]],[[[1,2,3,4],[1,2,3,5],[1,2,4,5],[1,3,4,5]],[[2,3,4,5]]],[[[1,2,3,4,5]]],[[]*** Exception: 2022-04-15-Combinations-TailAfterTail'.lhs:(120,9)-(122,52): Non-exhaustive patterns in function genStep
unsafe_allCombinationsWithSingleStepGrouped
flatten the result but yet
grouped by selection size, this is still unsafe but combinationsWith
will handle it.
unsafe_allCombinationsWithSingleStepGrouped :: [a] -> [[[a]]]
=
unsafe_allCombinationsWithSingleStepGrouped flatten_allCombinationsGrouped unsafe_allCombinationsWithSingleStep
So now we could get all combinations by flatten a more time.
allCombinationsWithSingleStep :: [a] -> [[a]]
=
allCombinationsWithSingleStep members concat
-- this makes unsafe_* safe by limiting the size of list.
. take (length members)
. unsafe_allCombinationsWithSingleStepGrouped
$ members
With Two Steps
This is another version of without scanl
. the Main improvement is that
this function separates the jobs into two operations:
- create first cases from the previous tails.
- create rest of cases and start next next case based on the result.
allCombinationsWithTwoSteps' :: [a] -> [[[[a]]]]
allCombinationsWithTwoSteps'@(fm:rms) = -- ^ fm : first member; rms: rest members
memberslet
= [[fm]]
initFirstCase = combinations1' rms
initRestCases
= genPart fm
genFirstCases
= []
genRestCases _ [] :ms) rcs@(_:rcs') = -- ^ rcs : rest of cases
genRestCases (m$ rcs) : (genRestCases ms rcs') (genPart m
It looks almost identical when comparing to SingleStep
but
now helper
function knows exactly where to start as newTail
is
memorized at the moment. It only saves time to tail
by pattern matching
in SingleStep
but resuts are propagated when the choices are growing.
BTW, tail
by pattern matching means (_:cs) in the following code.
:ms) (_:cs) =
genStep (m-- note ^ : we don't use first element
: c | c <- concat cs ] : genStep ms cs [ m
= []
helper [] ! prevTail =
helper let
= genRestCases rms (tail prevTail)
newTail in
: newTail) : helper newTail
((genFirstCases prevTail) in (initFirstCase : initRestCases) : helper initRestCases
the following steps are similar to the other implementation.
allCombinationsWithTwoStepsGrouped :: [a] -> [[[a]]]
=
allCombinationsWithTwoStepsGrouped
flatten_allCombinationsGrouped allCombinationsWithTwoSteps'
allCombinationsWithTwoSteps :: [a] -> [[a]]
=
allCombinationsWithTwoSteps members concat . allCombinationsWithTwoStepsGrouped $ members
Another benefit of the TwoSteps
implementation is that we can stop
easily because now newTail
is always available and we could know whether
next step is available or not. I don’t need to name it unsafe_ any more.
combinations variant from each implementation
Now, it’s time to make select K
out of given choice.
And I found that this is a common helper function:
combinationsWith
combinationsWith :: ([a] -> [[[a]]]) -> [a] -> Int -> Int -> [[a]]
@selectFrom n2@selectTo =
combinationsWith allComboGroupedFunc ms n1let
= -- smaller value first
( isFlipped, n1', n2' ) if n1 < n2 then ( False
max n1 0
, max n2 0)
, else ( True
max n2 0
, max n1 0)
, -- and ensure all range value are zero or positive by using `max`
= n2' - n1' + 1
rangeLength
reverseIfNeeded| isFlipped = reverse
| otherwise = id
in
-- note: read from the bottom
concat -- 4. final flattening
. reverseIfNeeded -- 3. if user put opposite way, reverse it.
. take rangeLength -- 2. takes only interested lists
. drop (pred n1') -- 1. ignore some
$ allComboGroupedFunc ms
And all variant combinations* function available as below:
= combinationsWith allCombinationsWithScanlGrouped
combinationsWithScanl = combinationsWith unsafe_allCombinationsWithSingleStepGrouped
combinationsWithSingleStep = combinationsWith allCombinationsWithTwoStepsGrouped combinationsWithTwoSteps
Benchmark
you can find the benchmark code on my github repository. To save your time, THIS is one of my benchmark result.
Choose Default allCombinations and combinations
After benchmarking, I found AllcombinationsWithTwoSteps
shows best result
in all categories(small, medium, large) among them.
= allCombinationsWithTwoSteps
allCombinations = combinationsWithTwoSteps combinations