This release packs a LOT of new features, so I will begin with the most significant feature: proxy transformers. The proxy transformer pattern provides a very simple extension framework that cleanly solves many problems that iteratee library authors face.
Users of the library should read Control.Proxy.Trans.Tutorial, which explains how proxy transformers work. However, this post also provides a decent introduction to them.
Introduction
Wouldn't it be nice if you could catch and handle errors within a proxy? Now you can! It's as simple as:
import Control.Monad (forever)
import Control.Monad.Trans (lift)
import Control.Proxy
import Control.Proxy.Trans.Either as E
import Safe (readMay)
promptInts :: () -> EitherP String Proxy C () () Int IO r
promptInts () = recover $ forever $ do
    str <- lift getLine
    case readMay str of
        Nothing -> E.throw "Could not parse an integer"
        Just n  -> liftP $ respond n

recover p =
    p `E.catch` (\str -> lift (putStrLn str) >> recover p)

main = runProxy $ runEitherK $ mapP printD <-< promptInts
>>> main
1<Enter>
1
Test<Enter>
Could not parse an integer
Apple<Enter>
Could not parse an integer
5<Enter>
5
The above program condenses many new features of this release into a nice compact example, and I'll use it to showcase each feature.
Proxy transformers
The above program uses the EitherP proxy transformer. To access this feature, you just import the transformer you wish to use:
import Control.Proxy.Trans.Either as E
Control.Proxy imports all the remaining machinery you need.
EitherP extends any proxy-like type with the ability to throw and catch errors locally, as if it lived inside a native EitherT block. It does so in a way that preserves composition (and the category laws!), so you can directly compose the result without unwrapping the EitherP.
When you are done composing, just use runEitherK to convert it back to the underlying proxy:
runEitherK
    :: (q -> EitherP e p a' a b' b m r )
    -> (q -> p a' a b' b m (Either e r))
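To make the throw/catch idea concrete without the proxy machinery, here is a minimal sketch that uses only plain Either from base. The names throwE', catchE' and firstInt are hypothetical helpers invented for this illustration; they are not part of pipes, and the real EitherP additionally threads the error through proxy composition.

```haskell
import Text.Read (readMaybe)

-- Hypothetical stand-in for E.throw: abort with an error value.
throwE' :: e -> Either e a
throwE' = Left

-- Hypothetical stand-in for E.catch: handle a failure locally.
catchE' :: Either e a -> (e -> Either e' a) -> Either e' a
catchE' (Left e)  handler = handler e
catchE' (Right a) _       = Right a

-- Parse one line of input, throwing on failure.
parseInt :: String -> Either String Int
parseInt str = case readMaybe str of
  Nothing -> throwE' "Could not parse an integer"
  Just n  -> Right n

-- Recover from each failed parse by moving on to the next line,
-- in the same spirit as 'recover' in the program above.
firstInt :: [String] -> Either String Int
firstInt []           = throwE' "No more input"
firstInt (str : rest) = parseInt str `catchE'` \_ -> firstInt rest

main :: IO ()
main = do
  print (firstInt ["Test", "Apple", "5"])  -- Right 5
  print (firstInt ["Test"])                -- Left "No more input"
```

The final Either that main prints plays the same role as the Either e r that runEitherK exposes once you leave the extended proxy.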
Utilities
This release introduces the "proxy prelude", a set of convenience functions for users of the library. Control.Proxy automatically exports these and they don't clash with the Prelude or any common libraries.
Our old friend printer got a name change and now goes by printD. This utility function prints all values bound downstream:
printD :: Show a => x -> Proxy x a x a IO r
I provide many more utility functions under the Control.Proxy.Prelude hierarchy, and people who enjoyed my functor design pattern post will also enjoy the abundance of cute trivial examples of the functor pattern in the documentation for Control.Proxy.Prelude.Base.
Proxy transformers are functors
However, this release includes a far more sophisticated set of functors: the proxy transformers themselves. Each proxy transformer implements the ProxyTrans class, which defines two functions, mapP and liftP, related by the equation:
mapP = (liftP .)
mapP defines two separate functors.
The first functor behaves like a traditional monad transformer, converting the base Kleisli category to the extended Kleisli category:
mapP return = return
mapP (f >=> g) = mapP f >=> mapP g
You can write these laws using liftP to see that our proxy transformers behave like ordinary monad transformers:
liftP $ return x = return x

do x <- liftP m
   liftP $ f x
= liftP $ do x <- m
             f x
The above program uses this capacity of liftP to lift operations from the Proxy monad to the EitherP String Proxy monad.
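The two Kleisli functor laws can be spot-checked on a much smaller example. The sketch below is an analogy, not the pipes implementation: it assumes maybeToList as a stand-in for liftP (it is a monad morphism from Maybe to lists) and defines a hypothetical mapK by the same equation, mapK = (liftM' .). The names liftM', mapK, halve and predPos are made up for this illustration.

```haskell
import Control.Monad ((>=>))
import Data.Maybe (maybeToList)

-- Stand-in for liftP: a monad morphism from Maybe to [].
liftM' :: Maybe a -> [a]
liftM' = maybeToList

-- Stand-in for mapP, defined by the same equation as in the post.
mapK :: (a -> Maybe b) -> (a -> [b])
mapK = (liftM' .)

-- Two sample Kleisli arrows in the base (Maybe) category.
halve :: Int -> Maybe Int
halve n = if even n then Just (n `div` 2) else Nothing

predPos :: Int -> Maybe Int
predPos n = if n > 0 then Just (n - 1) else Nothing

-- mapK return = return, checked pointwise.
identityLaw :: Int -> Bool
identityLaw x = mapK return x == (return x :: [Int])

-- mapK (f >=> g) = mapK f >=> mapK g, checked pointwise.
compositionLaw :: Int -> Bool
compositionLaw x =
  mapK (halve >=> predPos) x == (mapK halve >=> mapK predPos) x

main :: IO ()
main = do
  let inputs = [-4 .. 8]
  print (all identityLaw inputs)
  print (all compositionLaw inputs)
```

Checking a handful of inputs is of course not a proof, but it shows what the laws claim: lifting before or after composing gives the same arrow.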
The second functor lifts the base proxy composition to the extended proxy composition:
mapP idT = idT
mapP (p1 >-> p2) = mapP p1 >-> mapP p2
This latter functor lets you compose simpler proxies with extended proxies. The above program uses mapP in this capacity to promote printD for composition with promptInts:
mapP printD <-< promptInts
This demonstrates a concrete application of the functor design pattern, allowing seamless interoperability between proxies written to varying feature sets. The proxy transformers lift both the monad instance and the composition instance correctly so that simpler proxies play nicely with extended proxies.
Type signatures
The above program demos the new replacement for Void: C. This shortens type signatures and also removes the dependency on void.
Also, the Proxy type is now a newtype around the underlying FreeT implementation. This gives nicer type errors when things go wrong.
Proxy Transformer Stacks
Just like monad transformers, you can stack proxy transformers to automatically combine their effects. By combining the StateP and EitherP proxy transformers, you can implement non-backtracking parsers for free:
{-# LANGUAGE GeneralizedNewtypeDeriving, OverloadedStrings #-}
import Control.Monad.Trans
import Control.Proxy
import Control.Proxy.Trans.Either as E
import Control.Proxy.Trans.State
import Data.Text as T hiding (take)
newtype ParseP p a' a b' b m r =
    ParseP { unParseP ::
        StateP Text (EitherP Text p) a' a b' b m r }
    deriving (Monad, MonadTrans, Channel)

instance ProxyTrans ParseP where
    liftP = ParseP . liftP . liftP

runParseK
    :: (q -> ParseP p a' a b' b m r)
    -> (q -> p a' a b' b m (Either Text (r, Text)))
runParseK = runEitherK . runStateK T.empty . (unParseP .)
The Channel type class defines proxy composition, so we can compose our parsing proxies seamlessly.
Let's write a few parsing primitives:
import Data.Monoid
import Data.Text.IO as T
import Prelude hiding (take)
take n = ParseP go where
    go = do
        s <- get
        if (T.length s < n)
            then do
                s' <- liftP $ liftP $ request ()
                put (s <> s')
                go
            else do
                let (h, t) = T.splitAt n s
                put t
                return h

parseFail str = ParseP $ liftP $ E.throw str

string str = do
    str' <- take (T.length str)
    if (str' == str)
        then return str
        else parseFail $
            "Expected: " <> str <> " -- Found: " <> str'
You wouldn't even know those were proxies if it were not for that single request statement.
Let's write a contrived parser based off of those primitives:
parser () = do
    string "Hello"
    str <- take 5
    lift $ T.putStrLn str
... and supply it with some input:
source () = do
    respond "Hell"
    respond "o, world!"
Now compose!
>>> runProxy $ runParseK $ parser <-< mapP source
, wor
Right ((),"ld!")
Let's see how failed parses turn out:
invalid () = do
    respond "A"
    respond "AAAAAAAA"
>>> runProxy $ runParseK $ parser <-< mapP invalid
Left "Expected: Hello -- Found: AAAAA"
I didn't include parsers in the library because I didn't want to add a bytestring or text dependency to the main pipes package. Instead, I will release the parsing extension as a separate library. This library will provide you with the streaming benefits of attoparsec with the ability to interleave effects.
Pushback
The above parsing example suggests my solution to push-back, which is to give each proxy its own local state using the StateP proxy transformer. You can then use the local state to keep track of unused input, as the above parsing example did.
Like all proxy transformers, this extension requires no special integration with the underlying proxy type and you can layer it anywhere within a proxy transformer stack with no special considerations.
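The push-back idea itself does not depend on proxies. The following minimal sketch uses only base and models the local state as a pair of unused input and the chunks still available upstream. Everything here (ParseState, Parser, takeN, and the end-of-input error) is a hypothetical simplification for illustration: the real parser pulls chunks with request instead of from a list, and uses Text rather than String.

```haskell
-- Local parser state: (unused input, chunks still available upstream).
type ParseState = (String, [String])

-- A parser either fails with a message or returns a value and a new state.
type Parser a = ParseState -> Either String (a, ParseState)

-- Take exactly n characters, pulling more chunks only when the buffer
-- is too short. Whatever is not consumed stays in the buffer (push-back).
takeN :: Int -> Parser String
takeN n (buf, chunks)
  | length buf >= n = let (h, t) = splitAt n buf in Right (h, (t, chunks))
  | otherwise = case chunks of
      []       -> Left "Unexpected end of input"  -- assumption: no blocking
      (c : cs) -> takeN n (buf ++ c, cs)

-- Match a literal string or fail with a descriptive error.
string :: String -> Parser String
string str st = do
  (str', st') <- takeN (length str) st
  if str' == str
    then Right (str, st')
    else Left ("Expected: " ++ str ++ " -- Found: " ++ str')

-- The contrived parser from above: match "Hello", then take 5 characters.
parser :: Parser String
parser st = do
  (_, st') <- string "Hello"
  takeN 5 st'

main :: IO ()
main = do
  print (parser ("", ["Hell", "o, world!"]))  -- Right (", wor",("ld!",[]))
  print (parser ("", ["A", "AAAAAAAA"]))      -- Left "Expected: Hello -- Found: AAAAA"
```

The leftover "ld!" in the first result is exactly the unused input that the local state keeps around for whichever parser runs next.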
Extra categories
The library now provides two additional categories for interacting with the Proxy type. These are term-rewriting categories (I believe the technical term is "sesquicategory", but I may be mistaken).
The first category's composition operator replaces all request statements within a Proxy with a suitably typed replacement:
f /</ g -- Replace all occurrences of 'request' in 'f' with 'g'
request is the identity of this category, so we expect that:
-- Replacing 'request' with 'request' changes nothing
f /</ request = f
-- Replacing 'request' with 'f' gives 'f'
request /</ f = f
Also, this substitution is associative:
(f /</ g) /</ h = f /</ (g /</ h)
Similarly, the respond command has its own substitution operator, (\<\), and they form their own category:
f \<\ g -- Replaces all 'respond's in 'g' with 'f'
f \<\ respond = f
respond \<\ f = f
(f \<\ g) \<\ h = f \<\ (g \<\ h)
Each category distributes in one direction over the Kleisli category:
-- Distributivity
r \<\ (f <=< g) = (r \<\ f) <=< (r \<\ g)
-- Zero
r \<\ return = return
-- Distributivity
(f <=< g) /</ r = (f /</ r) <=< (g /</ r)
-- Zero
return /</ r = return
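To see why these laws are plausible, here is a minimal sketch of request substitution on a toy type that can only issue requests. It is not the pipes Proxy type: Client, replaceRequests, run and the sample arrows are all hypothetical names for this illustration, and the local (/</) is a re-implementation that only mimics the operator described above.

```haskell
import Control.Monad (ap, liftM, (<=<))

-- Toy stand-in for a proxy that can only 'request' (an assumption,
-- not the library's type).
data Client a' a r
  = Done r
  | Request a' (a -> Client a' a r)

instance Functor (Client a' a) where
  fmap = liftM

instance Applicative (Client a' a) where
  pure = Done
  (<*>) = ap

instance Monad (Client a' a) where
  Done r      >>= f = f r
  Request x k >>= f = Request x (\a -> k a >>= f)

request :: a' -> Client a' a a
request x = Request x Done

-- Replace every 'request' in the second argument with the handler 'g'.
replaceRequests :: (a' -> Client b' b a) -> Client a' a r -> Client b' b r
replaceRequests _ (Done r)      = Done r
replaceRequests g (Request x k) = g x >>= \a -> replaceRequests g (k a)

-- Toy analogue of the post's operator: replace requests in 'f' with 'g'.
(/</) :: (c -> Client a' a r) -> (a' -> Client b' b a) -> (c -> Client b' b r)
(f /</ g) x = replaceRequests g (f x)

-- Interpret a client by answering each request with a pure function,
-- returning the log of requests together with the final result.
run :: (a' -> a) -> Client a' a r -> ([a'], r)
run _ (Done r) = ([], r)
run answer (Request x k) =
  let (xs, r) = run answer (k (answer x)) in (x : xs, r)

-- Functions cannot be compared directly, so compare observable behaviour.
same :: Client Int Int Int -> Client Int Int Int -> Bool
same p q = run (+ 10) p == run (+ 10) q

addTwo, doubled, negated :: Int -> Client Int Int Int
addTwo n  = do { a <- request n; b <- request (a + 1); return (a + b) }
doubled x = do { y <- request (x * 2); return (y + 1) }
negated x = fmap negate (request x)

main :: IO ()
main = do
  -- Identity laws
  print (same ((addTwo /</ request) 1) (addTwo 1))
  print (same ((request /</ addTwo) 1) (addTwo 1))
  -- Associativity
  print (same (((addTwo /</ doubled) /</ negated) 1)
              ((addTwo /</ (doubled /</ negated)) 1))
  -- Distributivity over Kleisli composition
  print (same (((addTwo <=< doubled) /</ negated) 1)
              (((addTwo /</ negated) <=< (doubled /</ negated)) 1))
```

Each check compares the request log and the final result of both sides under one fixed way of answering requests, so it is a spot check on sample inputs rather than a proof.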
Lifting request and respond
I originally envisioned that proxy transformers would also automatically lift request and respond statements. The laws for this lifting are quite simple:
mapP request = request
mapP respond = respond
In other words, these are the functor laws, applied to the identities of the two new categories I just introduced. Unfortunately, Haskell's type class system severely got in my way and I could not solve the issue before the release. I have a tentative plan for how to solve this using Edward's constraints package, but it will take time. Until then, you will have to manually lift request and respond statements from the base Proxy type.
Overall, I was pretty disappointed with Haskell's type class system (more so than usual). This library exercised it considerably, and I even had to drop an additional proxy transformer because it was unimplementable due to the broken constraint system.
Performance
Raw proxies give performance comparable to conduit when doing simple IO:
import Control.Monad
import Control.Monad.Trans
import Control.Proxy hiding (await)
import Data.Conduit
import Data.Conduit.List as L
import Data.Maybe (fromJust) -- You did not see this
n = 100000 :: Int
-- Choose your poison
main = runProxy $ printD <-< enumFromToS 1 n
main = L.enumFromTo 1 n
    $$ forever (await >>= lift . print . fromJust)
Using pipes:
real 0m1.761s
user 0m0.384s
sys 0m0.712s
Using conduit:
real 0m1.528s
user 0m0.224s
sys 0m0.660s
Conduit is 15% faster.
The margin is substantially larger for entirely pure code:
import Control.Monad
import Control.Monad.Trans
import Control.Proxy hiding (await)
import Data.Conduit
import Data.Conduit.List as L
n = 100000 :: Int
main = runProxy $ discard <-< enumFromToS 1 n
discard' = do
    a <- await
    case a of
        Nothing -> return ()
        Just _  -> discard'

main = L.enumFromTo 1 n $$ discard'
Using pipes:
real 0m0.085s
user 0m0.088s
sys 0m0.000s
Using conduit:
real 0m0.011s
user 0m0.004s
sys 0m0.004s
Conduit is almost 8(!) times faster.
Conduit dramatically improves for entirely pure code since it bends the monad transformer laws to skip binds in the base monad. This is one reason that this pipes release type-classes all the Proxy operations. If people request that I copy conduit's approach, I will release a separate library that copies conduit's optional monad bind and have it implement all the same type classes. Then all the proxy transformers are guaranteed to work transparently with it because they abstract completely over the type classes.
Additionally, I want to note that the pipes library currently has only one optimization PRAGMA in the entire library:
{-# INLINABLE mapK #-} -- An obscure utility function
... whereas conduit uses a considerable number of rewrite rules and INLINABLE pragmas. I don't know how much these contribute to conduit's speed, but I will copy Michael's optimizations in the next few releases and benchmark how much they contribute to performance.
I've also benchmarked the overhead of proxy transformers. First, comparing performance for some trivial IO:
import Control.Monad
import Control.Monad.Trans
import Control.Proxy
import Control.Proxy.Trans.Writer
import Data.Monoid
n = 100000 :: Int
main = runProxy $ without <-< enumFromToS 1 n
main :: IO ((), Sum Int)
main = runProxy $ runWriterK $ with <-< mapP (enumFromToS 1 n)
with
    :: (Monoid w, Show a)
    => () -> WriterP w Proxy () a () C IO r
with () = forever $ do
    n <- liftP $ request ()
    lift $ print n

without :: (Show a) => () -> Proxy () a () C IO r
without () = forever $ do
    n <- request ()
    lift $ print n
Using the bind in the WriterP w Proxy monad (i.e. with):
real 0m1.739s
user 0m0.396s
sys 0m0.680s
Using the bind in the Proxy monad (i.e. without):
real 0m1.704s
user 0m0.368s
sys 0m0.668s
A difference of 2%(!).
Again, the difference widens if you switch to pure code:
import Control.Monad
import Control.Monad.Trans
import Control.Proxy
import Control.Proxy.Trans.Writer
import Data.Monoid
n = 100000 :: Int
main = runProxy $ without <-< enumFromToS 1 n
main :: IO ((), Sum Int)
main = runProxy $ runWriterK $ with <-< mapP (enumFromToS 1 n)
with
    :: (Monoid w, Show a)
    => () -> WriterP w Proxy () a () C IO r
with () = forever $ liftP $ request ()
without :: (Show a) => () -> Proxy () a () C IO r
without () = forever $ request ()
Using WriterP w Proxy's bind:
real 0m0.134s
user 0m0.124s
sys 0m0.008s
Using Proxy's bind:
real 0m0.084s
user 0m0.076s
sys 0m0.004s
Now it's about a factor of 1.6.
So I can summarize these benchmarks by saying that if you are doing even a little bit of IO, the performance differences are pretty small, and as I aggressively optimize the library, they should get even smaller.
Switch to free
Edward was kind enough to migrate my transformers-free functionality into his free package, so now pipes uses free for its free monad transformer dependency.
Resource management
I plan on releasing a Proxy-like type that implements resource management that will replace the Frame type. This type will include functions to promote existing Proxy code to this resource-managed version. Until then, you will have to manually manage resources by opening all file handles before composition, and closing them all afterwards, like so:
import Control.Proxy
import System.IO
main = do
    h <- openFile "test.txt" WriteMode
    runProxy $ hPrintD h <-< enumFromToS 1 10
    hClose h
... or you can use Michael's ResourceT in the base monad, if that is your thing.
You won't get the benefit of conserving handles, but you will still get predictable streaming performance.
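One possible variation on the manual approach, sketched here with base only, is to let withFile open and close the handle around the whole session, which also closes it if the action throws. This is a suggestion rather than something the library prescribes, and the mapM_ line is just a stand-in for the runProxy session shown above.

```haskell
import System.IO (IOMode (WriteMode), hPrint, withFile)

main :: IO ()
main =
  -- withFile opens the handle, runs the action, and closes the handle
  -- afterwards, even if the action throws an exception.
  withFile "test.txt" WriteMode $ \h ->
    -- Stand-in for: runProxy $ hPrintD h <-< enumFromToS 1 10
    mapM_ (hPrint h) [1 .. 10 :: Int]
```

As with the openFile/hClose version, the handle stays open for the entire session, so this does not conserve handles either.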
Library writers
If you are considering building off the pipes library, I recommend implementing any functionality using the Proxy type, which I guarantee will be promotable to any future extensions, and I plan on personally writing several Proxy-based libraries over the next few months.
While I still preserve the Pipe type, I fully endorse the Proxy type as the type to standardize on as it has many more nice theoretical properties than the Pipe type and also supports greater functionality.
Conclusions
This release is very close to the final state I envisioned for the core pipes library. Most existing features won't disappear, with the exception of Control.Frame, which I will phase out once I release a suitable replacement in a separate library.
Most additional features that I plan on implementing will go into separate libraries that build on top of this one. I only plan on adding functionality to the core library if I discover additional interesting structure for the Proxy type.