Iterables

Other topics

New iterable type

In Julia, when looping through an iterable object I is done with the for syntax:

for i = I   # or  "for i in I"
    # body
end

Behind the scenes, this is translated to:

state = start(I)
while !done(I, state)
    (i, state) = next(I, state)
    # body
end

Therefore, if you want I to be an iterable, you need to define start, next and done methods for its type. Suppose you define a type Foo containing an array as one of the fields:

type Foo
    bar::Array{Int,1}
end

We instantiate a Foo object by doing:

julia> I = Foo([1,2,3])
Foo([1,2,3])

julia> I.bar
3-element Array{Int64,1}:
 1
 2
 3

If we want to iterate through Foo, with each element bar being returned by each iteration, we define the methods:

import Base: start, next, done

start(I::Foo) = 1

next(I::Foo, state) = (I.bar[state], state+1)

function done(I::Foo, state)
    if state == length(I.bar)
        return true
    end
    return false
end

Note that since these functions belong to the Base module, we must first import their names before adding new methods to them.

After the methods are defined, Foo is compatible with the iterator interface:

julia> for i in I
           println(i)
       end

1
2
3

Combining Lazy Iterables

The standard library comes with a rich collection of lazy iterables (and libraries such as Iterators.jl provide even more). Lazy iterables can be composed to create more powerful iterables in constant time. The most important lazy iterables are take and drop, from which many other functions can be created.

Lazily slice an iterable

Arrays can be sliced with slice notation. For instance, the following returns the 10th to 15th elements of an array, inclusive:

A[10:15]

However, slice notation does not work with all iterables. For instance, we cannot slice a generator expression:

julia> (i^2 for i in 1:10)[3:5]
ERROR: MethodError: no method matching getindex(::Base.Generator{UnitRange{Int64},##1#2}, ::UnitRange{Int64})

Slicing strings may not have the expected Unicode behaviour:

julia> "αααα"[2:3]
ERROR: UnicodeError: invalid character index
 in getindex(::String, ::UnitRange{Int64}) at ./strings/string.jl:130

julia> "αααα"[3:4]
"α"

We can define a function lazysub(itr, range::UnitRange) to do this kind of slicing on arbitrary iterables. This is defined in terms of take and drop:

lazysub(itr, r::UnitRange) = take(drop(itr, first(r) - 1), last(r) - first(r) + 1)

The implementation here works because for UnitRange value a:b, the following steps are performed:

  • drops the first a-1 elements
  • takes the ath element, a+1th element, and so forth, until the a+(b-a)=bth element

In total, b-a elements are taken. We can confirm our implementation is correct in each case above:

julia> collect(lazysub("αααα", 2:3))
2-element Array{Char,1}:
 'α'
 'α'

julia> collect(lazysub((i^2 for i in 1:10), 3:5))
3-element Array{Int64,1}:
  9
 16
 25

Lazily shift an iterable circularly

The circshift operation on arrays will shift the array as if it were a circle, then relinearize it. For example,

julia> circshift(1:10, 3)
10-element Array{Int64,1}:
  8
  9
 10
  1
  2
  3
  4
  5
  6
  7

Can we do this lazily for all iterables? We can use the cycle, drop, and take iterables to implement this functionality.

lazycircshift(itr, n) = take(drop(cycle(itr), length(itr) - n), length(itr))

Along with lazy types being more performant in many situations, this lets us do circshift-like functionality on types that would otherwise not support it:

julia> circshift("Hello, World!", 3)
ERROR: MethodError: no method matching circshift(::String, ::Int64)
Closest candidates are:
  circshift(::AbstractArray{T,N}, ::Real) at abstractarraymath.jl:162
  circshift(::AbstractArray{T,N}, ::Any) at abstractarraymath.jl:195

julia> String(collect(lazycircshift("Hello, World!", 3)))
"ld!Hello, Wor"
0.5.0

Making a multiplication table

Let's make a multiplication table using lazy iterable functions to create a matrix.

The key functions to use here are:

  • Base.product, which computes a Cartesian product.
  • prod, which computes a regular product (as in multiplication)
  • :, which creates a range
  • map, which is a higher order function applying a function to each element of a collection

The solution is:

julia> map(prod, Base.product(1:10, 1:10))
10×10 Array{Int64,2}:
  1   2   3   4   5   6   7   8   9   10
  2   4   6   8  10  12  14  16  18   20
  3   6   9  12  15  18  21  24  27   30
  4   8  12  16  20  24  28  32  36   40
  5  10  15  20  25  30  35  40  45   50
  6  12  18  24  30  36  42  48  54   60
  7  14  21  28  35  42  49  56  63   70
  8  16  24  32  40  48  56  64  72   80
  9  18  27  36  45  54  63  72  81   90
 10  20  30  40  50  60  70  80  90  100

Lazily-Evaluated Lists

It's possible to make a simple lazily-evaluated list using mutable types and closures. A lazily-evaluated list is a list whose elements are not evaluated when it's constructed, but rather when it is accessed. Benefits of lazily evaluated lists include the possibility of being infinite.

import Base: getindex
type Lazy
    thunk
    value
    Lazy(thunk) = new(thunk)
end

evaluate!(lazy::Lazy) = (lazy.value = lazy.thunk(); lazy.value)
getindex(lazy::Lazy) = isdefined(lazy, :value) ? lazy.value : evaluate!(lazy)

import Base: first, tail, start, next, done, iteratorsize, HasLength, SizeUnknown
abstract List
immutable Cons <: List
    head
    tail::Lazy
end
immutable Nil <: List end

macro cons(x, y)
    quote
        Cons($(esc(x)), Lazy(() -> $(esc(y))))
    end
end

first(xs::Cons) = xs.head
tail(xs::Cons) = xs.tail[]
start(xs::Cons) = xs
next(::Cons, xs) = first(xs), tail(xs)
done(::List, ::Cons) = false
done(::List, ::Nil) = true
iteratorsize(::Nil) = HasLength()
iteratorsize(::Cons) = SizeUnknown()

Which indeed works as it would in a language like Haskell, where all lists are lazily-evaluated:

julia> xs = @cons(1, ys)
Cons(1,Lazy(false,#3,#undef))

julia> ys = @cons(2, xs)
Cons(2,Lazy(false,#5,#undef))

julia> [take(xs, 5)...]
5-element Array{Int64,1}:
 1
 2
 1
 2
 1

In practice, it is better to use the Lazy.jl package. However, the implementation of the lazy list above sheds lights into important details about how to construct one's own iterable type.

Syntax:

  • start(itr)
  • next(itr, s)
  • done(itr, s)
  • take(itr, n)
  • drop(itr, n)
  • cycle(itr)
  • Base.product(xs, ys)

Parameters:

ParameterDetails
ForAll Functions
itrThe iterable to operate on.
Fornext and done
sAn iterator state describing the current position of the iteration.
Fortake and drop
nThe number of elements to take or drop.
ForBase.product
xsThe iterable to take first elements of pairs from.
ysThe iterable to take second elements of pairs from.
...(Note that product accepts any number of arguments; if more than two are provided, it will construct tuples of length greater than two.)

Contributors

Topic Id: 5466

Example Ids: 19440,23900,24139

This site is not affiliated with any of the contributors.