The new C# range feature

The new C# range feature is a helpful addition to the language. However, it also appears to be a good example of where the committee approach to managing language design and improvements may have stifled common sense.

In this post I’m going to explain why I have formed this opinion.

Here’s a snippet from the current Microsoft documentation:

“A range specifies the start and end of a range. Ranges are exclusive, meaning the end is not included in the range.”

This already begins to reveal the potential for misunderstanding. The paragraph says that ranges are exclusive, but then has to qualify that (because it’s not true) with the proviso that only the end is exclusive.

So right away we have to regard the start value of a range as being fundamentally different from the end value; they are different concepts. The start value is the offset of the first element we want, whereas the end value is the offset of the last element we want plus one. (By offset I mean the distance, in elements, from the array’s base address.)

That is to say that ^0 (using the array in the examples below) is the offset of a non-existent element one past the end of the array. Used as the end of a range, it refers to the element at that offset minus one, which is the last element.
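The behaviour of ^0 can be checked with a few lines. This is only a sketch: it builds the ten-house array with LINQ for brevity and uses top-level statements (C# 9 or later), neither of which comes from the examples in this post.

```csharp
using System;
using System.Linq;

// Ten houses: "House 1" at index 0 through "House 10" at index 9.
string[] houses = Enumerable.Range(1, 10).Select(n => $"House {n}").ToArray();

Index last = ^1;
Index pastEnd = ^0;

// Index.GetOffset(length) converts an index into a zero-based offset from the start.
Console.WriteLine(last.GetOffset(houses.Length));    // 9
Console.WriteLine(pastEnd.GetOffset(houses.Length)); // 10, one past the last element

Console.WriteLine(houses[last]);                     // House 10

// As a single subscript, ^0 is out of bounds...
try
{
    Console.WriteLine(houses[pastEnd]);
}
catch (IndexOutOfRangeException)
{
    Console.WriteLine("^0 is not a valid subscript");
}

// ...but as the exclusive end of a range it is perfectly legal.
Console.WriteLine(houses[8..pastEnd].Length);        // 2
```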

Imagine we are discussing something tangible, like a range of houses in a street that has 10 houses. We might say things like “I want 4 houses starting at the 5th house”. This is a natural way to express the concept, but in C# using ranges we’d express it as “I want the range 4..8”.

The new range feature offers no support for the number of elements, only the index of the first element and the index of the last element (+ 1, don’t forget), both counted from the start of the array. There is also support for specifying an index from the end of the array (again, + 1); this is indicated by using the ^ operator.

So, since we know we have 10 houses, we could express our range as 4..^2 (the last house we want, House 8, is the third from the end, so the exclusive end is ^2). But there is no support for expressing how many contiguous elements we want, which would have been a very helpful feature to include.
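To make the arithmetic concrete, here is a small sketch of “4 houses starting at the 5th house” written three ways. The names start and count are mine, and the array is built with LINQ purely to keep the listing short.

```csharp
using System;
using System.Linq;

// Ten houses: "House 1" at index 0 through "House 10" at index 9.
string[] houses = Enumerable.Range(1, 10).Select(n => $"House {n}").ToArray();

const int start = 4; // index of the 5th house
const int count = 4; // how many houses we want

string[] byEnd   = houses[4..8];                   // explicit exclusive end
string[] fromEnd = houses[4..^2];                  // ^2 resolves to 10 - 2 = 8
string[] byCount = houses[start..(start + count)]; // count folded into the end value

Console.WriteLine(string.Join(", ", byEnd));       // House 5, House 6, House 7, House 8
Console.WriteLine(byEnd.SequenceEqual(fromEnd));   // True
Console.WriteLine(byEnd.SequenceEqual(byCount));   // True
```

The count only appears indirectly, as part of the expression for the end value.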

In my experience small misunderstandings about offsets, positions, counts, start elements and end elements are frequently a source of subtle and frustrating bugs, so surely making it more intuitive to express these ideas is what a language should strive to do?

Examples

Here are some examples.

            var houses = new string[]
             {
                "House 1", // 0
                "House 2", // 1
                "House 3", // 2
                "House 4", // 3
                "House 5", // 4
                "House 6", // 5
                "House 7", // 6
                "House 8", // 7
                "House 9", // 8
                "House 10" // 9
             };

            // (remember the second value in a .. expression is always EXCLUSIVE)
            var set1 = houses[0..10]; // all elements
            var set2 = houses[0..^0]; // all elements 

            var last1 = houses[9];  // House 10
            var last2 = houses[^1]; // House 10 

            

 

Notice how the representation of “the last element” differs in the case of a range and in the case of a single subscript? In the case of a single subscript it is 9 or ^1 but in a range it is represented by 10 or ^0.

This is because the second value in a range is an exclusive bound: the last element actually included is always the one at that value minus one (or, when counting from the end with ^, at that value plus one).
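What the runtime makes of the two values can be inspected with Range.GetOffsetAndLength, which resolves a range against a given collection length. A minimal sketch, assuming a collection of 10 elements as in the houses array:

```csharp
using System;

Range all = 0..^0;

// The end is stored as "0 from the end"; nothing is resolved until a length is supplied.
Console.WriteLine(all.End.Value);     // 0
Console.WriteLine(all.End.IsFromEnd); // True

// Resolve against a collection of 10 elements.
(int offset, int length) = all.GetOffsetAndLength(10);
Console.WriteLine($"{offset}, {length}"); // 0, 10

Range middle = 4..8;
(offset, length) = middle.GetOffsetAndLength(10);
Console.WriteLine($"{offset}, {length}"); // 4, 4

// The last element included sits at offset + length - 1, one below the end value.
Console.WriteLine(offset + length - 1);   // 7
```

So internally a range does resolve to a start and a count; the count just cannot be written in the range expression itself.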

Missing the obvious

We can see from the above examples that the following basic concepts are present in these problems:

  1. Inclusivity and Exclusivity
  2. Directionality
  3. Contiguous element count

Inclusivity

The inclusivity/exclusivity is now “hard wired” into the language: the first value in a range is always regarded as inclusive and the last value is always regarded as exclusive. A developer has no means of explicitly stating which is intended.

Directionality

The “direction” of a value in a range is expressed by either including or omitting the ^ operator as a prefix. The absence of the operator means “from the start” and its presence means “from the end”. This isn’t too bad but could have been better: with two tokens, one meaning “from the start” and one meaning “from the end”, we’d end up with more symmetric-looking range expressions, and these are easier to reason about.

Element count

Given that many examples discussed on the internet about the new range feature contrast it with LINQ, where Skip and Take express “start here and give me this many”, it is striking that this basic concept has been left out of range expressions.
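For comparison, this is roughly how the same four houses look in LINQ and as a range. It is a sketch only; the variable names are mine and the array is generated with LINQ to keep it short.

```csharp
using System;
using System.Linq;

// Ten houses: "House 1" at index 0 through "House 10" at index 9.
string[] houses = Enumerable.Range(1, 10).Select(n => $"House {n}").ToArray();

// LINQ: a position and a count.
var viaLinq = houses.Skip(4).Take(4);

// Range: a start and an exclusive end.
var viaRange = houses[4..8];

Console.WriteLine(string.Join(", ", viaLinq));     // House 5, House 6, House 7, House 8
Console.WriteLine(viaLinq.SequenceEqual(viaRange)); // True
```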

How it could have been implemented

So after all my complaining, is there really any significantly better way this could have been designed? I think so. By focusing on the three concepts above and emphasizing symmetry (or at least providing the ability to express things symmetrically), a much more intuitive and natural syntax could have been devised.

Inclusivity and Exclusivity

Rather than forcing this and making it differ between the first and last part of a range, a small set of tokens could have been introduced. (The syntax below is my proposal, not legal C#.)

        static void Main(string[] args)
        {
            var houses = new string[]
             {
                "House 1", // 0 - 9
                "House 2", // 1 - 8
                "House 3", // 2 - 7
                "House 4", // 3 - 6
                "House 5", // 4 - 5
                "House 6", // 5 - 4
                "House 7", // 6 - 3
                "House 8", // 7 - 2
                "House 9", // 8 - 1
                "House 10" // 9 - 0
             };

            // House 4, House 5 and House 6 (implicitly inclusive)
            var set1 = houses[3 .. 5];

            // House 4, House 5 and House 6 (explicitly inclusive)
            var set2 = houses[3 >..< 5];

            // House 5: exclude index 3 from the start and exclude index 5 from the start
            var set3 = houses[3 <..> 5];

            // House 5, House 6: exclude index 3 from the start but include index 5 from the start
            var set4 = houses[3 <..< 5];

            // House 4, House 5: include index 3 from the start but exclude index 5 from the start
            var set5 = houses[3 >..> 5];

        }
        

As you can see, this requires just four simple tokens:

  • <..>
  • <..<
  • >..>
  • >..<

And allows us to completely and symmetrically express inclusivity and exclusivity for both the start and the end elements.
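Since these tokens don’t exist, the closest you can get in real C# is a helper that takes the inclusivity of each bound as a flag. The Slice function below is hypothetical (it is not part of .NET), and is only a sketch of how the four combinations map onto the built-in range, which always wants an inclusive start and an exclusive end.

```csharp
using System;
using System.Linq;

// Ten houses: "House 1" at index 0 through "House 10" at index 9.
string[] houses = Enumerable.Range(1, 10).Select(n => $"House {n}").ToArray();

// Hypothetical helper: each bound carries its own inclusivity flag.
static T[] Slice<T>(T[] source, int start, bool includeStart, int end, bool includeEnd)
{
    int first = includeStart ? start : start + 1; // first index to keep
    int stop = includeEnd ? end + 1 : end;        // exclusive end, as C# ranges expect
    return source[first..stop];
}

// 3 >..< 5: both bounds included
Console.WriteLine(string.Join(", ", Slice(houses, 3, true, 5, true)));   // House 4, House 5, House 6
// 3 <..> 5: both bounds excluded
Console.WriteLine(string.Join(", ", Slice(houses, 3, false, 5, false))); // House 5
// 3 <..< 5: start excluded, end included
Console.WriteLine(string.Join(", ", Slice(houses, 3, false, 5, true)));  // House 5, House 6
// 3 >..> 5: start included, end excluded (today's C# behaviour)
Console.WriteLine(string.Join(", ", Slice(houses, 3, true, 5, false)));  // House 4, House 5
```

Boolean flags are much harder to read than tokens would be, which is rather the point.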

Directionality

This can be expressed adequately with the ^ operator as it now stands, but again having explicit tokens for each direction would have been an improvement:

        static void Main(string[] args)
        {
            var houses = new string[]
             {
                "House 1", // 0 - 9
                "House 2", // 1 - 8
                "House 3", // 2 - 7
                "House 4", // 3 - 6
                "House 5", // 4 - 5
                "House 6", // 5 - 4
                "House 7", // 6 - 3
                "House 8", // 7 - 2
                "House 9", // 8 - 1
                "House 10" // 9 - 0
             };

            // House 4, House 5 and House 6 (implicitly inclusive)
            var set1a = houses[ 3 ..  5];
            var set1b = houses[^6 .. ^4];

            // House 4, House 5 and House 6 (explicitly inclusive)
            var set2a = houses[ 3 >..<  5];
            var set2b = houses[^6 >..< ^4];

            // House 5: exclude index 3 from the start and exclude index 5 from the start
            var set3a = houses[ 3 <..>  5];
            var set3b = houses[^6 <..> ^4];

            // House 5, House 6: exclude index 3 from the start but include index 5 from the start
            var set4a = houses[ 3 <..<  5];
            var set4b = houses[^6 <..< ^4];

            // House 4, House 5: include index 3 from the start but exclude index 5 from the start
            var set5a = houses[ 3 >..>  5];
            var set5b = houses[^6 >..> ^4];

        }
        

Had this approach been taken, it would have been natural to regard the first element (lowest address) of an array as element zero without the hat operator and the last element (highest address) as element zero with the hat operator. This symmetry would in turn make it easier to reason about code.
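In today’s C# the from-end numbering starts at 1 (^1 is the last element), so the symmetric “zero from the end” idea has to be emulated. A rough sketch, where FromEndZeroBased is a hypothetical helper of my own naming:

```csharp
using System;
using System.Linq;

// Ten houses: "House 1" at index 0 through "House 10" at index 9.
string[] houses = Enumerable.Range(1, 10).Select(n => $"House {n}").ToArray();

// Hypothetical helper: 0 means the last element, mirroring 0 for the first element.
static Index FromEndZeroBased(int n) => ^(n + 1);

Console.WriteLine(houses[0]);                   // House 1
Console.WriteLine(houses[FromEndZeroBased(0)]); // House 10
Console.WriteLine(houses[FromEndZeroBased(9)]); // House 1

// The proposed inclusive range ^6 >..< ^4, resolved into real offsets.
int first = FromEndZeroBased(6).GetOffset(houses.Length); // 3
int last = FromEndZeroBased(4).GetOffset(houses.Length);  // 5
string[] set = houses[first..(last + 1)];                 // + 1 because the real end is exclusive

Console.WriteLine(string.Join(", ", set));      // House 4, House 5, House 6
```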

Element count

It is possible to express an element count by using the same value for the start and the end of a range and then adding the count to the end value, but you must be cautious when using the hat operator, where the count has to be subtracted instead. Here are two pairs of ranges expressed without and with the hat operator; in each pair they yield 2 and 3 elements respectively. (This is legal C#, hence the element number comments differ from those in my examples above.)

        static void Main(string[] args)
        {
            var houses = new string[]
             {
                "House 1", // 0 - 10
                "House 2", // 1 - 9
                "House 3", // 2 - 8
                "House 4", // 3 - 7
                "House 5", // 4 - 6
                "House 6", // 5 - 5
                "House 7", // 6 - 4
                "House 8", // 7 - 3
                "House 9", // 8 - 2
                "House 10" // 9 - 1
             };

            // Count-based ranges starting at House 4 (index 3, or ^7 from the end)

            int TWO_ELEMENTS = 2;
            int THREE_ELEMENTS = 3;

            var set1a = houses[3..(3 + TWO_ELEMENTS)];     // House 4, House 5
            var set1b = houses[3..(3 + THREE_ELEMENTS)];   // House 4, House 5, House 6

            var set2a = houses[^7..^(7 - TWO_ELEMENTS)];   // House 4, House 5
            var set2b = houses[^7..^(7 - THREE_ELEMENTS)]; // House 4, House 5, House 6

        }
        

I have created some extension methods that simplify this kind of thing, because as it stands it can get quite bewildering. For example, consider the statement “get me two elements starting at element 2 from the end”.

Well, what does this mean? Element 2 (counting from the end) is “House 9” and we want two elements. But if we’re doing this “from” the end, should we expect an array that has two elements, element 0 being “House 9” and element 1 being “House 8”? After all, this is “from the end”, where counting moves to elements at lower addresses.
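The two readings can be put side by side. The helpers below are hypothetical and written only for this illustration (they are not the extension methods mentioned above); they are plain local functions here, though either could just as well be turned into an extension method.

```csharp
using System;
using System.Linq;

// Ten houses: "House 1" at index 0 through "House 10" at index 9.
string[] houses = Enumerable.Range(1, 10).Select(n => $"House {n}").ToArray();

// Reading 1: start at the given element and count towards higher indexes.
static T[] TakeForward<T>(T[] source, Index start, int count)
{
    int offset = start.GetOffset(source.Length);
    return source[offset..(offset + count)];
}

// Reading 2: start at the given element and count towards lower indexes.
static T[] TakeBackward<T>(T[] source, Index start, int count)
{
    int offset = start.GetOffset(source.Length);
    T[] result = source[(offset - count + 1)..(offset + 1)];
    Array.Reverse(result); // put the starting element first
    return result;
}

Console.WriteLine(string.Join(", ", TakeForward(houses, ^2, 2)));  // House 9, House 10
Console.WriteLine(string.Join(", ", TakeBackward(houses, ^2, 2))); // House 9, House 8
```

Both are defensible answers to the same English sentence, so whichever one a helper implements needs to be obvious from its name.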

 

 
