суббота, 29 декабря 2018 г.

Type annotations vs partial type signatures vs visible type applications

In Haskell, function calls must sometimes be annotated. One of well known examples is reading arbitrary types: read "5" :: Int. Without the type annotation :: Int, compiler cannot decide what the user wants to read. In point-free expressions type annotations may grow in length, say read :: String -> Int. Often, when an expression is wrapped inside an appropriate type context, compiler is able to infer the type and the type annotation gets no longer needed. Say, in [1, 2] ++ read "[3, 4]", the type of the read’s argument can only be a list of numbers. Let’s consider an example where the type annotations are essential. For this, we will implement specialized read functions inside a standalone module ReadFromByteString.
{-# LANGUAGE TypeFamilies, EmptyDataDecls #-}

module ReadFromByteString (readFromByteString
                          ,readFromByteStringAsJSON
                          ,readFromByteStringWithRPtr
                          ,readFromByteStringWithRPtrAsJSON
                          ) where

import           Foreign.Ptr
import           Foreign.Storable
import qualified Data.ByteString as B
import           Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as C8
import qualified Data.ByteString.Lazy as L
import           Data.Binary.Get
import           Data.Aeson
import           Control.Arrow
import           Safe

data Readable a
data ReadableAsJSON a

class FromByteString a where
    type WrappedT a
    fromByteString :: Maybe a -> ByteString -> Maybe (WrappedT a)

instance Read a => FromByteString (Readable a) where
    type WrappedT (Readable a) = a
    fromByteString = const $ readMay . C8.unpack

instance FromJSON a => FromByteString (ReadableAsJSON a) where
    type WrappedT (ReadableAsJSON a) = a
    fromByteString = const decodeStrict

readFromByteString :: Read a => ByteString -> Maybe a
readFromByteString = fromByteString (Nothing :: Maybe (Readable a))

readFromByteStringAsJSON :: FromJSON a => ByteString -> Maybe a
readFromByteStringAsJSON = fromByteString (Nothing :: Maybe (ReadableAsJSON a))

readFromByteStringWithRPtr :: Read a => ByteString -> (Ptr (), Maybe a)
readFromByteStringWithRPtr = readRPtr &&& readFromByteString . skipRPtr

readFromByteStringWithRPtrAsJSON :: FromJSON a =>
    ByteString -> (Ptr (), Maybe a)
readFromByteStringWithRPtrAsJSON =
    readRPtr &&& readFromByteStringAsJSON . skipRPtr

readRPtr :: ByteString -> Ptr ()
readRPtr = wordPtrToPtr . fromIntegral . runGet getWordhost . L.fromStrict

skipRPtr :: ByteString -> ByteString
skipRPtr = B.drop $ sizeOf (undefined :: Word)
I leveraged GHC extensions TypeFamilies and EmptyDataDecls to build robust and type-safe functions whose names start with readFromByteString. They all can read only custom types deriving or implementing Read or FromJSON. This restriction comes from the fact that type class FromByteString being not exported from the module, provides only instances bound with type classes Read and FromJSON. Data Readable and ReadableAsJSON are used for indexing the instances, while type WrappedT serves as a selector for the return type of fromByteString, simply unwrapping it from Readable or ReadableAsJSON. Function fromByteString expects an argument of type Maybe a which is not used inside though: without this, the module will not compile because type variable a must be referred elsewhere in the type signature besides the return type. I wrapped this argument in Maybe to simplify implementations of the exported functions. Notice also, that exported functions with WithRPtr inside their names, expect a raw pointer in front of the serialized data. Let’s compile this and begin using.
ghc --make ReadFromByteString.hs 
[1 of 1] Compiling ReadFromByteString ( ReadFromByteString.hs, ReadFromByteString.o )
I want to create two custom data types Conf and ConfJSON.
{-# LANGUAGE DeriveGeneric #-}

module TestReadFromByteString

import           ReadFromByteString

import           Data.ByteString (ByteString)
import qualified Data.ByteString.Lazy as L
import qualified Data.ByteString.Lazy.Char8 as C8L
import           Data.Aeson
import           GHC.Generics

showAsLazyByteString :: Show a => a -> L.ByteString
showAsLazyByteString = C8L.pack . show

newtype Conf = Conf Int deriving (Read, Show)

data ConfJSON = ConfJSONCon1 Int
              | ConfJSONCon2 deriving (Generic, Show)
instance FromJSON ConfJSON
This is a header of a new module TestReadFromByteString, and all expected language extensions and imports are in. Let’s implement two simple converters using functions from module ReadFromByteString.
testReadConf :: ByteString -> L.ByteString
testReadConf = showAsLazyByteString . readFromByteString

testReadConfJSON :: ByteString -> L.ByteString
testReadConfJSON = showAsLazyByteString . readFromByteStringAsJSON
Apparently, this won’t compile: both functions do not have any notion of what must be converted. Therefore we must add type annotations. As soon as the functions are implemented in the point-free style, the annotations will be rather lengthy.
testReadConf :: ByteString -> L.ByteString
testReadConf = showAsLazyByteString .
    (readFromByteString :: ByteString -> Maybe Conf)

testReadConfJSON :: ByteString -> L.ByteString
testReadConfJSON = showAsLazyByteString .
    (readFromByteStringAsJSON :: ByteString -> Maybe ConfJSON)
This time it must compile.
ghc --make TestReadFromByteString.hs 
[2 of 2] Compiling TestReadFromByteString ( TestReadFromByteString.hs, TestReadFromByteString.o )
We can load this in GHCi, and play a bit.
*TestReadFromByteString> :set -XOverloadedStrings 
*TestReadFromByteString> testReadConf "Conf 8"
"Just (Conf 8)"
*TestReadFromByteString> testReadConfJSON "{\"tag\":\"ConfJSONCon2\"}"
"Just ConfJSONCon2"
Now let’s implement similar converters for functions that expect a raw pointer.
testReadConfWithRPtr :: ByteString -> L.ByteString
testReadConfWithRPtr = showAsLazyByteString .
    (readFromByteStringWithRPtr :: ByteString -> (Ptr (), Maybe Conf))

testReadConfWithRPtrJSON :: ByteString -> L.ByteString
testReadConfWithRPtrJSON = showAsLazyByteString .
    (readFromByteStringWithRPtrAsJSON :: ByteString -> (Ptr (), Maybe ConfJSON))
So long and tedious! Even worse, it won’t compile!
ghc --make TestReadFromByteString.hs 
[2 of 2] Compiling TestReadFromByteString ( TestReadFromByteString.hs, TestReadFromByteString.o )

TestReadFromByteString.hs:41:51: error:
    Not in scope: type constructor or class ‘Ptr’
   |
41 |     (readFromByteStringWithRPtr :: ByteString -> (Ptr (), Maybe Conf))
   |                                                   ^^^

TestReadFromByteString.hs:45:57: error:
    Not in scope: type constructor or class ‘Ptr’
   |
45 |     (readFromByteStringWithRPtrAsJSON :: ByteString -> (Ptr (), Maybe ConfJSON))
   |                                                         ^^^
Yes, we must import module Foreign.Ptr only to satisfy the type annotation!
import           Foreign.Ptr
Now that compilation succeeds, let’s play in GHCi again.
*TestReadFromByteString> testReadConfWithRPtr "blahblahConf 90"
"(0x68616c6268616c62,Just (Conf 90))"
*TestReadFromByteString> testReadConfWithRPtrJSON "blahblah{\"tag\":\"ConfJSONCon1\",\"contents\":7}"
"(0x68616c6268616c62,Just (ConfJSONCon1 7))"
Here the blahblah is a raw 8-byte pointer, why not? :) Ok, let’s think what we can do better. Let’s give a try to the Partial Type Signatures extension. At least, the Foreign.Ptr import can be removed because we do not use pointers somehow in the code, while the corresponding type annotations will be replaced by placeholders.
{-# LANGUAGE DeriveGeneric, PartialTypeSignatures #-}

-- ...

--import           Foreign.Ptr  -- no longer needed

-- ...

testReadConf :: ByteString -> L.ByteString
testReadConf = showAsLazyByteString .
    (readFromByteString :: _ -> Maybe Conf)

testReadConfJSON :: ByteString -> L.ByteString
testReadConfJSON = showAsLazyByteString .
    (readFromByteStringAsJSON :: _ -> Maybe ConfJSON)

testReadConfWithRPtr :: ByteString -> L.ByteString
testReadConfWithRPtr = showAsLazyByteString .
    (readFromByteStringWithRPtr :: _ -> (_, Maybe Conf))

testReadConfWithRPtrJSON :: ByteString -> L.ByteString
testReadConfWithRPtrJSON = showAsLazyByteString .
    (readFromByteStringWithRPtrAsJSON :: _ -> (_, Maybe ConfJSON))
It compiles, but now we get many warnings.
ghc --make TestReadFromByteString.hs 
[2 of 2] Compiling TestReadFromByteString ( TestReadFromByteString.hs, TestReadFromByteString.o )

TestReadFromByteString.hs:35:28: warning: [-Wpartial-type-signatures]
    • Found type wildcard ‘_’ standing for ‘ByteString’
    • In an expression type signature: _ -> Maybe Conf
      In the second argument of ‘(.)’, namely
        ‘(readFromByteString :: _ -> Maybe Conf)’
      In the expression:
        showAsLazyByteString . (readFromByteString :: _ -> Maybe Conf)
    • Relevant bindings include
        testReadConf :: ByteString -> C8L.ByteString
          (bound at TestReadFromByteString.hs:34:1)
   |
35 |     (readFromByteString :: _ -> Maybe Conf)
   |                            ^
Of course, we could add option -Wno-partial-type-signatures in the GHC command-line, but it looks like we’re using a feature that was not designed for this case. Nevertheless, GHCi tests will run as expected (but I won’t show GHCi sessions anymore). Let’s try the third option: Visible Type Applications available in GHC since version 8.0.1. They have very interesting semantics which meets our intention very well: annotated types get substituted in place of type variables in the function signature. It means that we can finally get rid of building long type annotations!
{-# LANGUAGE DeriveGeneric, TypeApplications #-}

-- ...

testReadConf :: ByteString -> L.ByteString
testReadConf = showAsLazyByteString .
    readFromByteString @Conf

testReadConfJSON :: ByteString -> L.ByteString
testReadConfJSON = showAsLazyByteString .
    readFromByteStringAsJSON @ConfJSON

testReadConfWithRPtr :: ByteString -> L.ByteString
testReadConfWithRPtr = showAsLazyByteString .
    readFromByteStringWithRPtr @Conf

testReadConfWithRPtrJSON :: ByteString -> L.ByteString
testReadConfWithRPtrJSON = showAsLazyByteString .
    readFromByteStringWithRPtrAsJSON @ConfJSON
This looks really nice! So let me make a small conclusion regarding this, a little bit contrived example.
  1. Type annotations for an external function may require not only its type structure, but also references to the type names or implementations (i.e. import of modules where these types are declared).
  2. Partial type signatures still require the type structure of the function (making use of the scaffold of the type signature).
  3. Visible type applications is the tersest and cleanest way to annotate an external function’s type. It does not require re-building of the type signature in case of ambiguity.
The original source code can be found in module NgxExport.Tools. The specialized read functions are used there to facilitate typed exchange between Haskell handlers and directives in Nginx configuration files. See also examples in the documentation for the module.