Haskell Language

Topics related to Haskell Language:

Getting started with Haskell Language

Haskell logo

Haskell is an advanced purely-functional programming language.

Features:

  • Statically typed: Every expression in Haskell has a type which is determined at compile time. Static type checking is the process of verifying the type safety of a program based on analysis of a program's text (source code). If a program passes a static type checker, then the program is guaranteed to satisfy some set of type safety properties for all possible inputs.
  • Purely functional: Every function in Haskell is a function in the mathematical sense. There are no statements or instructions, only expressions which cannot mutate variables (local or global) nor access state like time or random numbers.
  • Concurrent: Its flagship compiler, GHC, comes with a high-performance parallel garbage collector and light-weight concurrency library containing a number of useful concurrency primitives and abstractions.
  • Lazy evaluation: Functions don't evaluate their arguments. Delays the evaluation of an expression until its value is needed.
  • General-purpose: Haskell is built to be used in all contexts and environments.
  • Packages: Open source contribution to Haskell is very active with a wide range of packages available on the public package servers.

The latest standard of Haskell is Haskell 2010. As of May 2016, a group is working on the next version, Haskell 2020.

The official Haskell documentation is also a comprehensive and useful resource. Great place to find books, courses, tutorials, manuals, guides, etc.

Overloaded Literals

Integer Literals

is a numeral without a decimal point

for example 0, 1, 42, ...

is implicitly applied to fromInteger which is part of the Num type class so it indeed has type Num a => a - that is it can have any type that is an instance of Num


Fractional Literals

is a numeral with a decimal point

for example 0.0, -0.1111, ...

is implicitly applied to fromRational which is part of the Fractional type class so it indeed has type a => a - that is it can have any type that is an instance of Fractional


String Literals

If you add the language extension OverloadedStrings to GHC you can have the same for String-literals which then are applied to fromString from the Data.String.IsString type class

This is often used to replace String with Text or ByteString.


List Literals

Lists can defined with the [1, 2, 3] literal syntax. In GHC 7.8 and beyond, this can also be used to define other list-like structures with the OverloadedLists extension.

By default, the type of [] is:

> :t []
[] :: [t]

With OverloadedLists, this becomes:

[] :: GHC.Exts.IsList l => l

Foldable

If t is Foldable it means that for any value t a we know how to access all of the elements of a from "inside" of t a in a fixed linear order. This is the meaning of foldMap :: Monoid m => (a -> m) -> (t a -> m): we "visit" each element with a summary function and smash all the summaries together. Monoids respect order (but are invariant to different groupings).

Traversable

Lens

What is a Lens?

Lenses (and other optics) allow us to separate describing how we want to access some data from what we want to do with it. It is important to distinguish between the abstract notion of a lens and the concrete implementation. Understanding abstractly makes programming with lens much easier in the long run. There are many isomorphic representations of lenses so for this discussion we will avoid any concrete implementation discussion and instead give a high-level overview of the concepts.

Focusing

An important concept in understanding abstractly is the notion of focusing. Important optics focus on a specific part of a larger data structure without forgetting about the larger context. For example, the lens _1 focuses on the first element of a tuple but doesn't forget about what was in the second field.

Once we have focus, we can then talk about which operations we are allowed to perform with a lens. Given a Lens s a which when given a datatype of type s focuses on a specific a, we can either

  1. Extract the a by forgetting about the additional context or
  2. Replace the a by providing a new value

These correspond to the well-known get and set operations which are usually used to characterise a lens.

Other Optics

We can talk about other optics in a similar fashion.

OpticFocuses on...
LensOne part of a product
PrismOne part of a sum
TraversalZero or more parts of a data structure
Isomorphism...

Each optic focuses in a different way, as such, depending on which type of optic we have we can perform different operations.

Composition

What's more, we can compose any of the two optics we have so-far discussed in order to specify complex data accesses. The four types of optics we have discussed form a lattice, the result of composing two optics together is their upper bound.

enter image description here

For example, if we compose together a lens and a prism, we get a traversal. The reason for this is that by their (vertical) composition, we first focus on one part of a product and then on one part of a sum. The result being an optic which focuses on precisely zero or one parts of our data which is a special case of a traversal. (This is also sometimes called an affine traversal).

In Haskell

The reason for the popularity in Haskell is that there is a very succinct representation of optics. All optics are just functions of a certain form which can be composed together using function composition. This leads to a very light-weight embedding which makes it easy to integrate optics into your programs. Further to this, due to the particulars of the encoding, function composition also automatically computes the upper bound of two optics we compose. This means that we can reuse the same combinators for different optics without explicit casting.

QuickCheck

Common GHC Language Extensions

These language extensions are typically available when using the Glasgow Haskell Compiler (GHC) as they are not part of the approved Haskell 2010 language Report. To use these extensions, one must either inform the compiler using a flag or place a LANGUAGE programa before the module keyword in a file. The official documentation can be found in section 7 of the GCH users guide.

The format of the LANGUAGE programa is {-# LANGUAGE ExtensionOne, ExtensionTwo ... #-}. That is the literal {-# followed by LANGUAGE followed by a comma separated list of extensions, and finally the closing #-}. Multiple LANGUAGE programas may be placed in one file.

Free Monads

Type Classes

The following diagram taken from the Typeclassopedia article shows the relationship between the various typeclasses in Haskell.

Relationships among standard Haskell type classes, Figure 1 as published in Typeclassopedia.

IO

Record Syntax

Partial Application

Let's clear up some misconceptions that beginners might make.

You may have encountered functions such as:

max :: (Ord a) => a -> a -> a  
max m n  
  | m >= n = m  
  | otherwise = n  

Beginners will typically view max :: (Ord a) => a -> a -> a as function that takes two arguments (values) of type a and returns a value of type a. However, what is really happening, is that max is taking one argument of type a and returning a function of type a -> a. This function then takes an argument of type a and returns a final value of type a.

Indeed, max can be written as max :: (Ord a) => a -> (a -> a)

Consider the type signature of max:

Prelude> :t max  
max :: Ord a => a -> a -> a  

Prelude> :t (max 75)  
(max 75) :: (Num a, Ord a) => a -> a  

Prelude> :t (max "Fury Road")  
(max "Fury Road") :: [Char] -> [Char]  

Prelude> :t (max "Fury Road" "Furiosa")  
(max "Fury Road" "Furiosa") :: [Char]  

max 75 and max "Fury Road" may not look like functions, but in actuality, they are.

The confusion stems from the fact that in mathematics and many, other, common programming languages, we are allowed to have functions that take multiple arguments. However, in Haskell, functions can only take one argument and they can return either values such as a, or functions such as a -> a.

Monoid

Category Theory

Lists

  1. The type [a] is equivalent to [] a.
  2. [] constructs the empty list.
  3. [] in a function definition LHS, e.g. f [] = ..., is the empty list pattern.
  4. x:xs constructs a list where an element x is prepended to the list xs
  5. f (x:xs) = ... is a pattern match for a non-empty list where x is the head and xs is the tail.
  6. f (a:b:cs) = ... and f (a:(b:cs)) = ... are the same. They are a pattern match for a list of at least two elements where the first element is a, the second element is b, and the rest of the list is cs.
  7. f ((a:as):bs) = ... is NOT the same as f (a:(as:bs)) = .... The former is a pattern match for a non-empty list of lists, where a is the head of the head, as is the tail of the head, and bs is the tail.
  8. f (x:[]) = ... and f [x] = ... are the same. They are a pattern match for a list of exactly one element.
  9. f (a:b:[]) = ... and f [a,b] = ... are the same. They are a pattern match for a list of exactly two elements.
  10. f [a:b] = ... is a pattern match for a list of exactly one element where the element is also a list. a is the head of the element and b is the tail of the element.
  11. [a,b,c] is the same as (a:b:c:[]). Standard list notation is just syntactic sugar for the (:) and [] constructors.
  12. You can use all@(x:y:ys) in order to refer to the whole list as all (or any other name you choose) instead of repeating (x:y:ys) again.

Sorting Algorithms

Type Families

Monads

Stack

Generalized Algebraic Data Types

Recursion Schemes

Data.Text

Text is a more efficient alternative to Haskell's standard String type. String is defined as a linked list of characters in the standard Prelude, per the Haskell Report:

type String = [Char]

Text is represented as a packed array of Unicode characters. This is similar to how most other high-level languages represent strings, and gives much better time and space efficiency than the list version.

Text should be preferred over String for all production usage. A notable exception is depending on a library which has a String API, but even in that case there may be a benefit of using Text internally and converting to a String just before interfacing with the library.

All of the examples in this topic use the OverloadedStrings language extension.

Using GHCi

GHCI is the interactive REPL that comes bundled with GHC.

Strictness

Syntax in Functions

Functor

A Functor can be thought of as a container for some value, or a computation context. Examples are Maybe a or [a]. The Typeclassopedia article has a good write-up of the concepts behind Functors.

To be considered a real Functor, an instance has to respect the 2 following laws:

Identity

fmap id == id

Composition

fmap (f . g) = (fmap f) . (fmap g)

Testing with Tasty

Creating Custom Data Types

Reactive-banana

Optimization

Concurrency

Function composition

Databases

Data.Aeson - JSON in Haskell

Higher-order functions

Higher Order Functions are functions that take functions as parameters and/or return functions as their return values.

Containers - Data.Map

Fixity declarations

Web Development

Vectors

It [Data.Vector] has an emphasis on very high performance through loop fusion, whilst retaining a rich interface. The main data types are boxed and unboxed arrays, and arrays may be immutable (pure), or mutable. Arrays may hold Storable elements, suitable for passing to and from C, and you can convert between the array types. Arrays are indexed by non-negative Int values.

The Haskell Wiki has these recommendations:

In general:

  • End users should use Data.Vector.Unboxed for most cases
  • If you need to store more complex structures, use Data.Vector
  • If you need to pass to C, use Data.Vector.Storable

For library writers;

  • Use the generic interface, to ensure your library is maximally flexible: Data.Vector.Generic

Cabal

Type algebra

Arrows

Typed holes

Rewrite rules (GHC)

Date and Time

The Data.Time module from time package provides support for retrieving & manipulating date & time values:

List Comprehensions

Streaming IO

Google Protocol Buffers

To use Protocol Buffers with Haskell you should install the htprotoc package:

  1. Clone the project from Github
  2. Use Stack to build and install

You should now find the hprotoc executable in $HOME/.local/bin/.

Template Haskell & QuasiQuotes

What is Template Haskell?

Template Haskell refers to the template meta-programming facilities built into GHC Haskell. The paper describing the original implementation can be found here.

What are stages? (Or, what is the stage restriction?)

Stages refer to when code is executed. Normally, code is exectued only at runtime, but with Template Haskell, code can be executed at compile time. "Normal" code is stage 0 and compile-time code is stage 1.

The stage restriction refers to the fact that a stage 0 program may not be executed at stage 1 - this would be equivalent to being able to run any regular program (not just meta-program) at compile time.

By convention (and for the sake of implementation simplicity), code in the current module is always stage 0 and code imported from all other modules is stage 1. For this reason, only expressions from other modules may be spliced.

Note that a stage 1 program is a stage 0 expression of type Q Exp, Q Type, etc; but the converse is not true - not every value (stage 0 program) of type Q Exp is a stage 1 program,

Futhermore, since splices can be nested, identifiers can have stages greater than 1. The stage restriction can then be generalized - a stage n program may not be executed in any stage m>n. For example, one can see references to such stages greater than 1 in certain error messages:

>:t [| \x -> $x |]

<interactive>:1:10: error:
    * Stage error: `x' is bound at stage 2 but used at stage 1
    * In the untyped splice: $x
      In the Template Haskell quotation [| \ x -> $x |]

Using Template Haskell causes not-in-scope errors from unrelated identifiers?

Normally, all the declarations in a single Haskell module can be thought of as all being mutually recursive. In other words, every top-level declaration is in the scope of every other in a single module. When Template Haskell is enabled, the scoping rules change - the module is instead broken into groups of code seperated by TH splices, and each group is mutually recursive, and each group in the scope of all further groups.

Phantom types

Modules

Tuples (Pairs, Triples, ...)

  • Haskell does not support tuples with one component natively.

  • Units (written ()) can be understood as tuples with zero components.

  • There are no predefined functions to extract components of tuples with more than two components. If you feel that you need such functions, consider using a custom data type with record labels instead of the tuple type. Then you can use the record labels as functions to extract the components.

Graphics with Gloss

State Monad

Newcomers to Haskell often shy away from the State monad and treat it like a taboo—like the claimed benefit of functional programming is the avoidance of state, so don't you lose that when you use State? A more nuanced view is that:

  • State can be useful in small, controlled doses;
  • The State type provides the ability to control the dose very precisely.

The reasons being that if you have action :: State s a, this tells you that:

  • action is special because it depends on a state;
  • The state has type s, so action cannot be influenced by any old value in your program—only an s or some value reachable from some s;
  • The runState :: State s a -> s -> (a, s) puts a "barrier" around the stateful action, so that its effectfulness cannot be observed from outside that barrier.

So this is a good set of criteria for whether to use State in particular scenario. You want to see that your code is minimizing the scope of the state, both by choosing a narrow type for s and by putting runState as close to "the bottom" as possible, (so that your actions can be influenced by as few thing as possible.

Pipes

As the hackage page describes:

pipes is a clean and powerful stream processing library that lets you build and connect reusable streaming components

Programs implemented through streaming can often be succinct and composable, with simple, short functions allowing you to "slot in or out" features easily with the backing of the Haskell type system.

await :: Monad m => Consumer' a m a

Pulls a value from upstream, where a is our input type.

yield :: Monad m => a -> Producer' a m ()

Produce a value, where a is the output type.

It's highly recommended you read through the embedded Pipes.Tutorial package which gives an excellent overview of the core concepts of Pipes and how Producer, Consumer and Effect interact.

Infix operators

Most Haskell functions are called with the function name followed by arguments (prefix notation). For functions that accept two arguments like (+), it sometimes makes sense to provide an argument before and after the function (infix).

Parallelism

Simon Marlow's book, Concurrent and Parallel Programming in Haskell, is outstanding and covers a multitude of concepts. It is also very much accessible for even the newest Haskell programmer. It is highly recommended and available in PDF or online for free.

Parallel vs Concurrent

Simon Marlow puts it best:

A parallel program is one that uses a multiplicity of computational hardware (e.g., several processor cores) to perform a computation more quickly. The aim is to arrive at the answer earlier, by delegating different parts of the computation to different processors that execute at the same time.

By contrast, concurrency is a program-structuring technique in which there are multiple threads of control. Conceptually, the threads of control execute “at the same time”; that is, the user sees their effects interleaved. Whether they actually execute at the same time or not is an implementation detail; a concurrent program can execute on a single processor through interleaved execution or on multiple physical processors.

Weak Head Normal Form

It's important to be aware of how lazy-evaluation works. The first section of this chapter will give a strong introduction into WHNF and how this relates to parallel and concurrent programming.

Parsing HTML with taggy-lens and lens

Foreign Function Interface

While cabal has support for including a C and C++ libraries in a Haskell package, there are a few bugs. First, if you have data (rather than a function) defined in b.o that is used in a.o, and list the C-sources: a.c, b.c, then cabal will be unable to find the data. This is documented in #12152. A workaround when using cabal is to reorder the C-sources list to be C-sources: b.c, a.c. This may not work when using stack, because stack always links the C-sources alphabetically, regardless of the order in which you list them.

Another issues is that you must surround any C++ code in header (.h) files with #ifdef __cplusplus guards. This is because GHC doesn't understand C++ code in header files. You can still write C++ code in header files, but you must surround it with guards.

ccall refers to the calling convention; currently ccall and stdcall (Pascal convention) are supported. The unsafe keyword is optional; this reduces overhead for simple functions but may cause deadlocks if the foreign function blocks indefinitely or has insufficient permission to execute1.

Gtk3

On many Linux distributions, the Haskell Gtk3 library is available as a package in the systems package manager (e.g. libghc-gtk in Ubuntu's APT). However, for some developers it might be preferable to use a tool like stack to manage isolated environments, and have Gtk3 installed via cabal instead of via an global installation by the systems package manager. For this option, gtk2hs-buildtools is required. Run cabal install gtk2hs-buildtools before adding gtk, gtk3 or any other Gtk-based Haskell libraries to your projects build-depends entry in your cabal file.

Monad Transformers

Bifunctor

A run of the mill Functor is covariant in a single type parameter. For instance, if f is a Functor, then given an f a, and a function of the form a -> b, one can obtain an f b (through the use of fmap).

A Bifunctor is covariant in two type parameters. If f is a Bifunctor, then given an f a b, and two functions, one from a -> c, and another from b -> d, then one can obtain an f c d (using bimap).

first should be thought of as an fmap over the first type parameter, second as an fmap over the second, and bimap should be conceived as mapping two functions covariantly over the first and second type parameters, respectively.

Proxies

Applicative Functor

Definition

class Functor f => Applicative f where
    pure  :: a -> f a
    (<*>) :: f (a -> b) -> f a -> f b

Note the Functor constraint on f. The pure function returns its argument embedded in the Applicative structure. The infix function <*> (pronounced "apply") is very similar to fmap except with the function embedded in the Applicative structure.

A correct instance of Applicative should satisfy the applicative laws, though these are not enforced by the compiler:

pure id <*> a = a                              -- identity
pure (.) <*> a <*> b <*> c = a <*> (b <*> c)   -- composition
pure f <*> pure a = pure (f a)                 -- homomorphism
a <*> pure b = pure ($ b) <*> a                -- interchange

Common monads as free monads

Common functors as the base of cofree comonads

Arithmetic

The numeric typeclass hierarchy

Num sits at the root of the numeric typeclass hierarchy. Its characteristic operations and some common instances are shown below (the ones loaded by default with Prelude plus those of Data.Complex):

λ> :i Num
class Num a where
  (+) :: a -> a -> a
  (-) :: a -> a -> a
  (*) :: a -> a -> a
  negate :: a -> a
  abs :: a -> a
  signum :: a -> a
  fromInteger :: Integer -> a
  {-# MINIMAL (+), (*), abs, signum, fromInteger, (negate | (-)) #-}
      -- Defined in ‘GHC.Num’
instance RealFloat a => Num (Complex a) -- Defined in ‘Data.Complex’
instance Num Word -- Defined in ‘GHC.Num’
instance Num Integer -- Defined in ‘GHC.Num’
instance Num Int -- Defined in ‘GHC.Num’
instance Num Float -- Defined in ‘GHC.Float’
instance Num Double -- Defined in ‘GHC.Float’

We have already seen the Fractional class, which requires Num and introduces the notions of "division" (/) and reciprocal of a number:

λ> :i Fractional
class Num a => Fractional a where
  (/) :: a -> a -> a
  recip :: a -> a
  fromRational :: Rational -> a
  {-# MINIMAL fromRational, (recip | (/)) #-}
      -- Defined in ‘GHC.Real’
instance RealFloat a => Fractional (Complex a) -- Defined in ‘Data.Complex’
instance Fractional Float -- Defined in ‘GHC.Float’
instance Fractional Double -- Defined in ‘GHC.Float’

The Real class models .. the real numbers. It requires Num and Ord, therefore it models an ordered numerical field. As a counterexample, Complex numbers are not an ordered field (i.e. they do not possess a natural ordering relationship):

λ> :i Real
class (Num a, Ord a) => Real a where
  toRational :: a -> Rational
  {-# MINIMAL toRational #-}
      -- Defined in ‘GHC.Real’
instance Real Word -- Defined in ‘GHC.Real’
instance Real Integer -- Defined in ‘GHC.Real’
instance Real Int -- Defined in ‘GHC.Real’
instance Real Float -- Defined in ‘GHC.Float’
instance Real Double -- Defined in ‘GHC.Float’

RealFrac represents numbers that may be rounded

λ> :i RealFrac
class (Real a, Fractional a) => RealFrac a where
  properFraction :: Integral b => a -> (b, a)
  truncate :: Integral b => a -> b
  round :: Integral b => a -> b
  ceiling :: Integral b => a -> b
  floor :: Integral b => a -> b
  {-# MINIMAL properFraction #-}
      -- Defined in ‘GHC.Real’
instance RealFrac Float -- Defined in ‘GHC.Float’
instance RealFrac Double -- Defined in ‘GHC.Float’

Floating (which implies Fractional) represents constants and operations that may not have a finite decimal expansion.

λ> :i Floating
class Fractional a => Floating a where
  pi :: a
  exp :: a -> a
  log :: a -> a
  sqrt :: a -> a
  (**) :: a -> a -> a
  logBase :: a -> a -> a
  sin :: a -> a
  cos :: a -> a
  tan :: a -> a
  asin :: a -> a
  acos :: a -> a
  atan :: a -> a
  sinh :: a -> a
  cosh :: a -> a
  tanh :: a -> a
  asinh :: a -> a
  acosh :: a -> a
  atanh :: a -> a
  GHC.Float.log1p :: a -> a
  GHC.Float.expm1 :: a -> a
  GHC.Float.log1pexp :: a -> a
  GHC.Float.log1mexp :: a -> a
  {-# MINIMAL pi, exp, log, sin, cos, asin, acos, atan, sinh, cosh,
              asinh, acosh, atanh #-}
      -- Defined in ‘GHC.Float’
instance RealFloat a => Floating (Complex a) -- Defined in ‘Data.Complex’
instance Floating Float -- Defined in ‘GHC.Float’
instance Floating Double -- Defined in ‘GHC.Float’

Caution: while expressions such as sqrt . negate :: Floating a => a -> a are perfectly valid, they might return NaN ("not-a-number"), which may not be an intended behaviour. In such cases, we might want to work over the Complex field (shown later).

Role

Arbitrary-rank polymorphism with RankNTypes

GHCJS

XML

Reader / ReaderT

Function call syntax

In general, the rule for converting a C-style function call to Haskell, in any context (assignment, return, or embedded in another call), is to replace the commas in the C-style argument list with whitespace, and move the opening parenthesis from the C-style call to contain the function name and its parameters.

If any expressions are wrapped entirely in parentheses, these (external) pairs of parentheses can be removed for readability, as they do not affect the meaning of the expression.
There are some other circumstances where parentheses can be removed, but this only affects readability and maintainability.

Logging

Attoparsec

zipWithM

Profunctor

Type Application