Thursday, November 26, 2015

Modifying Sets during iteration in ECMAScript 6

I discovered something pleasantly weird while playing with new Javascript features in ES6: You can mutate sets while iterating over their elements.

Sets come with any decent standard library and are useful for quickly keeping track of objects you've seen, holding event handlers, and stripping duplicates from collections (to name a few use cases). Sets are container objects which differ from arrays two ways: 1) They don't store items in any particular order, and 2) They may only contain a given item once.

Because sets are theoretically unordered and usually don't expose their internal data structures, one must use an iterator to list their elements. Most implementations also don't like it when a set changes while iterating over it.

Here is an example in Python:

s = set([1,2, 4,5])
for x in s:
    print x
    if x == 2:

It fails with the following message:

user@hostname /tmp $ python 
Traceback (most recent call last):
  File "", line 2, in 
    for x in s:
RuntimeError: Set changed size during iteration

Here's an equivalent program in Java:

import java.util.*;

public class test {
    public static void main (String[] args) {
        Set s = new HashSet(
            Arrays.asList(1, 2, 4, 5)
        for (Integer x : s) {
            if (x == 2) {
user@hostname /tmp $ javac && java test
Exception in thread "main" java.util.ConcurrentModificationException
 at java.util.HashMap$HashIterator.nextNode(
 at java.util.HashMap$
 at test.main(

Now, let's look at the same code in Javascript:

'use strict';

const s = new Set([1,2, 4,5]);
for (let x of s) {
    if (x == 2) {
karl@charmander /tmp $ node --version
karl@charmander /tmp $ node test.js

Not only does the Set iterator not complain, it also iterates over the new element!

I wonder if this is in the Javascript spec, a brilliant piece of engineering on Google's part, or some sort of laziness. Does the Set keep an internal list of all the elements that have been added since iteration began? Or is it lazy, keeping an internal array of its elements and scanning the list in O(n) time as new elements are added?