Lexical Analysis With F# – Part 4

Confronting immutability

As I’ve been working on making the lexical analyzer (hereafter called “tokenizer”) more complete and simpler to understand, it has really begun to dawn on me why immutability is so important: by eliminating the traditional concept of assignment, we are left with little else but function invocation.

This is why it’s possible to perceive an F# program as just a tree of nested function dependencies; in a sense there’s little to do beyond invoking a function or computing a return value, and so F# and other functional code reads, in effect, as one large nested set of functions.

We’ve all written code, rightly or wrongly, that has flags or sentinels scattered around, often to deal with special cases or with change requests that demand functionality outside the initial design’s pattern.

Sometimes these are obvious, even being named “_flag” or similar; at other times they’re less obvious and appear as an enum or a special numeric test like:


while (error_count < 10)
{
    if (error_count == 0)
        do_something_special();               // special case: no errors seen yet
    if (not_first_time_around)
        append_node(some_tree, some_node);    // flag guards the first pass
    // Other stuff...
}

Understanding imperative code with stuff like this in it isn’t always easy: one needs to mentally visualize the scenarios, fathom out the various behaviors that arise from differing combinations of values for these flags and counters, try to assess whether some combinations are possible or impossible, and so on.

These are the kinds of things that really can and do catch us out and lead to unanticipated runtime behavior unless they’re very carefully thought through. What’s emerging as I explore functional languages is that they aren’t strictly necessary, certainly not in a functional version anyway.

Because functional development a) abolishes or discourages this kind of mutability and b) equips developers with techniques that enable them to think creatively without it, it becomes quite feasible to reduce, if not eliminate, a whole class of complex error scenarios associated with code like the above.
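
To make that concrete, here’s a minimal sketch (the names, and the list standing in for the tree, are hypothetical illustrations, not code from the tokenizer) of how the flag-driven loop above might look recast in F#. The counter and the “first time around” test become ordinary parameters threaded through a recursive call, so there is no mutable state to reason about:

let rec processErrors errorCount acc errors =
    match errors with
    | _ when errorCount >= 10 -> acc    // the loop guard is now a pattern
    | [] -> acc
    | e :: rest ->
        let acc' =
            if errorCount = 0
            then acc                    // the first-time-around special case
            else e :: acc               // "append_node" becomes consing
        processErrors (errorCount + 1) acc' rest

// Each iteration is just a fresh function invocation with new values;
// nothing is ever assigned to twice.
let collected = processErrors 0 [] [ "e1"; "e2"; "e3" ]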

I’m also wondering how challenging it may be to write F# code that is easily readable. Because one can nest functions almost without a second thought, we might reach a situation where the code is good quality and free from nasty side effects, yet non-trivial F# code remains challenging to understand.

Improved Tokenizer

My self-imposed project of writing a hand-coded tokenizer for some language is progressing steadily. Just to reiterate: the purpose of this little project is to help me learn, really learn, F# and functional development.

I’m primarily striving to deliver a well-written F# solution to the problem: one that uses F# as it should be used, respects the functional idioms, has no mutability and so on.

The next post will explore the latest version of the code, in which both the input file and the token stream itself are exposed as demand-driven sequences: tokens are created only when requested, and characters are read from the input only as tokens are created.
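
As a taste of the idea, here’s a minimal sketch of a demand-driven token stream, assuming a hypothetical nextToken function and a deliberately tiny Token type (the real code in the next post differs). Seq.unfold is what makes it lazy: no token is built, and no character inspected, until a consumer asks for the next element:

type Token =
    | Word   of string
    | Number of string
    | Symbol of char

// Hypothetical 'nextToken': skip whitespace, then produce one token
// plus the unconsumed input, or None when the input is exhausted.
let nextToken (chars: char list) : (Token * char list) option =
    match chars |> List.skipWhile System.Char.IsWhiteSpace with
    | [] -> None
    | c :: rest when not (System.Char.IsLetterOrDigit c) ->
        Some (Symbol c, rest)           // punctuation: a one-character token
    | (c :: _) as cs ->
        let pred =
            if System.Char.IsLetter c
            then System.Char.IsLetter
            else System.Char.IsDigit
        let lexeme = System.String(cs |> List.takeWhile pred |> Array.ofList)
        let rest'  = cs |> List.skipWhile pred
        Some ((if System.Char.IsLetter c then Word lexeme else Number lexeme), rest')

// Seq.unfold makes the stream demand driven: tokens are only created
// when whoever enumerates the sequence requests them.
let tokenize (input: string) : seq<Token> =
    Seq.unfold nextToken (List.ofSeq input)

For example, tokenize "let x = 42" |> Seq.toList yields [Word "let"; Word "x"; Symbol '='; Number "42"].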

I’m also going to implement a 2D state table and a 2D action table, so that both the next state to move to and the action to perform when a character is read are defined by lookup tables, leading to very simple and elegant code. Doing this in F# should prove very interesting…
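
By way of a preview, here’s one shape those tables might take; the states, character classes and actions below are placeholders rather than the real design. The attraction is that all the per-character logic collapses into two array lookups:

type CharClass = Letter = 0 | Digit = 1 | Space = 2

type Action =
    | Skip      // ignore the character (whitespace between tokens)
    | Start     // begin a new lexeme with this character
    | Append    // add the character to the current lexeme
    | Emit      // the current lexeme is complete; emit a token

let classify (c: char) : CharClass =
    if System.Char.IsLetter c then CharClass.Letter
    elif System.Char.IsDigit c then CharClass.Digit
    else CharClass.Space

// Rows are states (0 = start, 1 = in a word, 2 = in a number) and
// columns are character classes; the real tables would be richer.
let nextState : int[,] =
    array2D [ [ 1; 2; 0 ]
              [ 1; 1; 0 ]
              [ 2; 2; 0 ] ]

let actions : Action[,] =
    array2D [ [ Start;  Start;  Skip ]
              [ Append; Append; Emit ]
              [ Append; Append; Emit ] ]

// One step of the machine: no conditionals, just two table lookups.
let step (state: int) (c: char) : int * Action =
    let cls = int (classify c)
    nextState.[state, cls], actions.[state, cls]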
