Copyright (c) 2022 JEON Myoungjin <jeongoon@g… >
LICENSE: Open Software License 3.0
Combinations in Haskell Series
- Combinations In Haskell (called Leaders and Followers)
- Combinations In Haskell (called Tail After Tail)
- Tail After Tail Story (Combinations In Haskell)
Same Theory; Different Implementation
In programming world, the pseudo code or theory will trigger your programming implementation in many ways. This kind of diversion makes programming interesting as well.
Previously I made TailAfterTail.lhs
And I found that inits
or tails
are not quite necessary if I don’t rely on scanl or zipWith.
It is probably generally accepted that some different kind of implementation, which consist of even small amount of codes, but which run at very high frequency, will eventually show huge gaps in performance after many iterations of execution.
So, this is about what I found during implementation of Tail After Tail combinations.
Module Begins
Firstly, I going to write down the all function will be exposed.{-# LANGUAGE BangPatterns #-}
module TailAfterTail
( combinationsWithScanl
, combinationsWithSingleStep
, combinations
, combinationsWithTwoSteps
, allCombinationsWithScanl
, allCombinationsWithSingleStep
, allCombinationsWithTwoSteps
, allCombinations
) where
import Data.List (tails, inits, scanl') -- only required for **WithScanlOriginal Version (scanl)
Please Find out more information HERE.
However, combinations1', flatten_allCombinationsGrouped and
genPart will be common helper functions.
combinations1' :: [a] -> [[[a]]]
combinations1' ms = [ [[m]] | m <- ms ]
flatten_allCombinationsGrouped allComboFunc = map concat . allComboFunc
genPart :: Foldable t => a -> t [[a]] -> [[a]]
genPart leader followerGroups = [ leader : followers
| followers <- concat followerGroups ]And I define some helper functions.
usefulTails :: [a] -> [[a]]
usefulTails = init . tails
genStep :: [[[a]]] -> [a] -> [[[a]]]
genStep prevTails members' =
zipWith genPart members' (usefulTails prevTails')
where
prevTails' = tail prevTails -- head is not useful
membersTails = reverse . tail . inits -- tail is used to skip empty list.and finally combinationsWithScanl family goes below.
allCombinationsWithScanl' :: [a] -> [[[[a]]]]
allCombinationsWithScanl' ms =
scanl' genStep (combinations1' ms) (membersTails ms)
allCombinationsWithScanlGrouped :: [a] -> [[[a]]]
allCombinationsWithScanlGrouped =
flatten_allCombinationsGrouped allCombinationsWithScanl'
allCombinationsWithScanl :: [a] -> [[a]]
allCombinationsWithScanl = concat . allCombinationsWithScanlGroupedPure Implementation Without Scanl (SingleStep)
The following code is created without scanl or zipWith.
It gains slightly more performance with (bang pattern: !). Which will be covered may be in another article. But IMHO, it helps to reduce laziness and use less stack.
unsafe_allCombinationsWithSingleStep :: [a] -> [[[[a]]]]
unsafe_allCombinationsWithSingleStep members =
let
helper ! cases = -- bang pattern added
let
genStep (m:ms) (_:cs:[]) = [ [ m : c | c <- cs ] ]
genStep (m:ms) (_:cs) =
-- note ^ : we don't use first element
genPart m cs : genStep ms cs
in
cases : helper (genStep members cases)
in
helper . combinations1' $ membersAs you can see helper function is just an entry level wrapper function
and make a recursion call.
genStep will actually create next cases and act as thunk which is evaluated
later thanks to laziness in haskell.
I named the function as unsafe_ on purpose. Because helper function actually
doesn’t know when it will stop, and if you run unsafe_allCombinationsWithSingleStep
in bare context will explode with exception.
sh> ghci
GHCi, version 8.10.7: https://www.haskell.org/ghc/ :? for help
Loaded GHCi configuration from /your/home/.config/ghc/ghci.conf
λ> :l 2022-04-15-Combinations-TailAfterTail'.lhs
[1 of 1] Compiling TailAfterTail ( 2022-04-15-Combinations-TailAfterTail'.lhs, interpreted )
Ok, one module loaded.
λ> unsafe_allCombinationsWithSingleStep [1..5]
[[[[1]],[[2]],[[3]],[[4]],[[5]]],[[[1,2],[1,3],[1,4],[1,5]],[[2,3],[2,4],[2,5]],[[3,4],[3,5]],[[4,5]]],[[[1,2,3],[1,2,4],[1,2,5],[1,3,4],[1,3,5],[1,4,5]],[[2,3,4],[2,3,5],[2,4,5]],[[3,4,5]]],[[[1,2,3,4],[1,2,3,5],[1,2,4,5],[1,3,4,5]],[[2,3,4,5]]],[[[1,2,3,4,5]]],[[]*** Exception: 2022-04-15-Combinations-TailAfterTail'.lhs:(120,9)-(122,52): Non-exhaustive patterns in function genStepunsafe_allCombinationsWithSingleStepGrouped flatten the result but yet
grouped by selection size, this is still unsafe but combinationsWith will handle it.
unsafe_allCombinationsWithSingleStepGrouped :: [a] -> [[[a]]]
unsafe_allCombinationsWithSingleStepGrouped =
flatten_allCombinationsGrouped unsafe_allCombinationsWithSingleStepSo now we could get all combinations by flatten a more time.
allCombinationsWithSingleStep :: [a] -> [[a]]
allCombinationsWithSingleStep members =
concat
-- this makes unsafe_* safe by limiting the size of list.
. take (length members)
. unsafe_allCombinationsWithSingleStepGrouped
$ membersWith Two Steps
This is another version of without scanl. the Main improvement is that
this function separates the jobs into two operations:
- create first cases from the previous tails.
- create rest of cases and start next next case based on the result.
allCombinationsWithTwoSteps' :: [a] -> [[[[a]]]]
allCombinationsWithTwoSteps'
members@(fm:rms) = -- ^ fm : first member; rms: rest members
let
initFirstCase = [[fm]]
initRestCases = combinations1' rms
genFirstCases = genPart fm
genRestCases _ [] = []
genRestCases (m:ms) rcs@(_:rcs') = -- ^ rcs : rest of cases
(genPart m $ rcs) : (genRestCases ms rcs')It looks almost identical when comparing to SingleStep but
now helper function knows exactly where to start as newTail is
memorized at the moment. It only saves time to tail by pattern matching
in SingleStep but resuts are propagated when the choices are growing.
BTW, tail by pattern matching means (_:cs) in the following code.
genStep (m:ms) (_:cs) =
-- note ^ : we don't use first element
[ m : c | c <- concat cs ] : genStep ms cs helper [] = []
helper ! prevTail =
let
newTail = genRestCases rms (tail prevTail)
in
((genFirstCases prevTail) : newTail) : helper newTail
in (initFirstCase : initRestCases) : helper initRestCasesthe following steps are similar to the other implementation.
allCombinationsWithTwoStepsGrouped :: [a] -> [[[a]]]
allCombinationsWithTwoStepsGrouped =
flatten_allCombinationsGrouped allCombinationsWithTwoSteps'
allCombinationsWithTwoSteps :: [a] -> [[a]]
allCombinationsWithTwoSteps members =
concat . allCombinationsWithTwoStepsGrouped $ membersAnother benefit of the TwoSteps implementation is that we can stop
easily because now newTail is always available and we could know whether
next step is available or not. I don’t need to name it unsafe_ any more.
combinations variant from each implementation
Now, it’s time to make select K out of given choice.
And I found that this is a common helper function:
combinationsWith
combinationsWith :: ([a] -> [[[a]]]) -> [a] -> Int -> Int -> [[a]]
combinationsWith allComboGroupedFunc ms n1@selectFrom n2@selectTo =
let
( isFlipped, n1', n2' ) = -- smaller value first
if n1 < n2 then ( False
, max n1 0
, max n2 0)
else ( True
, max n2 0
, max n1 0)
-- and ensure all range value are zero or positive by using `max`
rangeLength = n2' - n1' + 1
reverseIfNeeded
| isFlipped = reverse
| otherwise = id
in
-- note: read from the bottom
concat -- 4. final flattening
. reverseIfNeeded -- 3. if user put opposite way, reverse it.
. take rangeLength -- 2. takes only interested lists
. drop (pred n1') -- 1. ignore some
$ allComboGroupedFunc msAnd all variant combinations* function available as below:
combinationsWithScanl = combinationsWith allCombinationsWithScanlGrouped
combinationsWithSingleStep = combinationsWith unsafe_allCombinationsWithSingleStepGrouped
combinationsWithTwoSteps = combinationsWith allCombinationsWithTwoStepsGroupedBenchmark
you can find the benchmark code on my github repository. To save your time, THIS is one of my benchmark result.
Choose Default allCombinations and combinations
After benchmarking, I found AllcombinationsWithTwoSteps shows best result
in all categories(small, medium, large) among them.
allCombinations = allCombinationsWithTwoSteps
combinations = combinationsWithTwoSteps