Java – Volatile keyword and Double Checked Locking

Java volatile keyword:

(http://tutorials.jenkov.com/java-concurrency/volatile.html)

  1. A CPU may have its own cache, a variable’s value may reside in cpu’s cache when a thread read it.
  2. volatile keyword inform enforce that the variable need to be read/write from/to main memory (i.e., no cpu cache)

volatile keyword also enforce that

  • If Thread A writes to a volatile variable and Thread B subsequently reads the same volatile variable, then all variables visible to Thread A before writing the volatile variable, will also be visible to Thread B after it has read the volatile variable. Example:
int a = 0;
volatile int b = 0;
a = 1;
b= 2;    
// Here, not only b's value need to be write into main memory, but also               
// a need to be write to main memory so that all other thread can read                 
// the same a value as current thread
  • The reading and writing instructions of volatile variables cannot be reordered by the JVM (the JVM may reorder instructions for performance reasons as long as the JVM detects no change in program behavior from the reordering). Instructions before and after can be reordered, but the volatile read or write cannot be mixed with these instructions. Whatever instructions follow a read or write of a volatile variable are guaranteed to happen after the read or write.

For a given shared variable, if there is one write thread, and multiple read thread, volatile keyword is good enough to maintain the correct access concurrency. If there is multiple write thread, volatile keyword is not good enough anymore, we need synchronization (lock) for write access.

Double checked locking problem

(https://en.wikipedia.org/wiki/Double-checked_locking)

Following code return a helper variable, works ok in single-thread case.

// Single-threaded version
class Foo {
    private Helper helper;
    public Helper getHelper() {
        if (helper == null) {
            helper = new Helper();
        }
        return helper;
    }

    // other functions and members...
}

To support multi-thread, synchronization works but could be expensive

class Foo {
    private Helper helper;
    public synchronized Helper getHelper() {
        if (helper == null) {
            helper = new Helper();
        }
        return helper;
    }

    // other functions and members...
}

Following code may looks works, but actually not in java. Note that the purpose of this double-checked locking code is to do lazy-initialization, while trying to avoid synchronization as much as possible.

// Broken multithreaded version
// "Double-Checked Locking" idiom
class Foo {
    private Helper helper;
    public Helper getHelper() {
        if (helper == null) {  // check 1
            synchronized(this) {
                if (helper == null) {  // check 2
                    helper = new Helper();    // potential problem
                }
            }
        }
        return helper;
    }

    // other functions and members...
}

Why it could fails? Because when helper = new Helper(); is called in thread 1, before Helper’s constructer  is fully finished and the helper object instance is fully initialized,  JVM may optimize the code to make helper variable available (i.e, not null anymore). This means when thread 2 call getHelper function before thread 1 finish the new statement, thread 2 can get “helper” variable, even though it is not fully initialized. If thread 2 access helper.xxx, where xxx is an un-initialized member in Helper object, problem could happens.

Using volatile keyword can help solve the problem as follows:

// Works with acquire/release semantics for volatile in Java 1.5 and later
// Broken under Java 1.4 and earlier semantics for volatile
class Foo {
    private volatile Helper helper;
    public Helper getHelper() {
        Helper result = helper;
        if (result == null) {
            synchronized(this) {
                result = helper;
                if (result == null) {
                    helper = result = new Helper();
                }
            }
        }
        return result;
    }

    // other functions and members...
}

Since helper is volatile, it will not be make available before it is fully initialized (i.e., no JVM optimization).

Note the local variable result, which seems unnecessary. The effect of this is that in cases where helper is already initialized (i.e., most of the time), the volatile field is only accessed once (due to “return result;” instead of “return helper;”), which can improve the method’s overall performance by as much as 25 percent.[7]

Best lazy singleton lazy initialization practice

this relies on the fact that nested classes are not loaded until they are referenced.

// Correct lazy initialization in Java
class Foo {
    private static class HelperHolder {
       public static final Helper helper = new Helper();
    }

    public static Helper getHelper() {
        return HelperHolder.helper;
    }
}

 

Java – Volatile keyword and Double Checked Locking

ACID explained

Acid: Atomicity, Consistency, Isolation, Durability

Isolation Level Explained:

Assuming that there is one read transaction (with two exact same read) and one write transaction, the time sequence are (read1,  write1 + commit write, read2 + commit read)

Serializable (Higest Level)-  Read and write transaction need to be fully serializable, i.e., One cannot start after the other one finished

Repeatable reads – No range lock support, this means it is possible for write transaction to delete a row and add another back. Even though the final result may (or maynot) be same for read (e.g, count(*), max() ), but it may not be expected.   The problem this causes is called Phantom Read.  For single row read, the DBMS must return the old value for the second SELECT.

Read commit – Each read is guaranteed to see “committed” write transaction value.  But two read is not guaranteed to get the same result. The problem this causes is called “non-repeatable read”

Read uncommitted – Read may get committed write result (dirty value). The problem this causes is called “Dirty read”

In summary, we have

Dirty Read            non-repeatable Read      Phantom Read

Read uncommitted             V                                   V                                        V

Read committed                 X                                    V                                        V

Repeatable read                 X                                    X                                        V

Serializable                         X                                    X                                        X

ACID explained