WHAT IS THREAD-LOCAL STORAGE (TLS)?

Thread-local storage[5] behaves like global and static memory, but instead of being shared between threads, each thread gets its own copy. To better explain this, I will briefly go over the types of memory in a program[6]:

Data: initialized, modifiable static and global data
BSS: uninitialized (zero-filled) static and global data
Heap: dynamically allocated memory
Stack: local variables and parameters
Registers: really fast CPU memory

The types of memory that are shared among threads are Data, BSS, and Heap. Each thread has its own stack and its own set of registers.
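The sharing described above can be checked with a small C sketch. It is only an illustration: the names data_var, bss_var, heap_var, record_addresses, and regions_shared_as_expected are hypothetical, and registers are left out because they cannot be addressed from C. A worker thread records the addresses it sees, and the caller compares them with its own.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

int data_var = 42;   // Data: initialized global
int bss_var;         // BSS: uninitialized (zero-filled) global
int *heap_var;       // points at a heap allocation made by the caller

// Addresses as seen from inside the worker thread, stored as integers
// so they can be compared safely after the worker has exited.
struct seen { uintptr_t data, bss, heap, stack; };

static void *record_addresses(void *arg)
{
  struct seen *s = arg;
  int local = 0;     // Stack: lives on the worker's own stack

  s->data = (uintptr_t)&data_var;
  s->bss = (uintptr_t)&bss_var;
  s->heap = (uintptr_t)heap_var;
  s->stack = (uintptr_t)&local;
  *heap_var += 1;    // write through the shared heap pointer
  return NULL;
}

// Returns 1 if the worker saw the same Data/BSS/Heap objects as the caller
// but a different stack variable, and 0 otherwise.
int regions_shared_as_expected(void)
{
  int local = 0;     // lives on the caller's stack
  struct seen s = {0, 0, 0, 0};
  pthread_t t;
  int rc, ok;

  heap_var = malloc(sizeof *heap_var);
  assert(heap_var != NULL);   // sketch-level error handling
  *heap_var = 0;

  rc = pthread_create(&t, NULL, record_addresses, &s);
  assert(rc == 0);
  (void)rc;
  pthread_join(t, NULL);

  ok = s.data == (uintptr_t)&data_var
    && s.bss == (uintptr_t)&bss_var
    && s.heap == (uintptr_t)heap_var
    && *heap_var == 1                  // the worker's write is visible here
    && s.stack != (uintptr_t)&local;   // each thread has its own stack

  free(heap_var);
  heap_var = NULL;
  return ok;
}
```

Calling regions_shared_as_expected() from main (compiled with -pthread where the toolchain requires it) should return 1 on a typical POSIX system.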

This poses a problem when you want a variable that is static or global but not shared among threads (i.e., each thread has its own copy of the variable). Different languages solve this problem in different ways.
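As a rough illustration of the problem, the following C sketch uses an ordinary static variable where per-thread state was wanted. The names last_result, store_id, and shared_value_after_two_threads are hypothetical. The two threads run one after the other so that the sketch contains no data race; the point is only that a single copy exists.

```c
#include <assert.h>
#include <pthread.h>

static int last_result;   // one copy, shared by every thread

static void *store_id(void *arg)
{
  last_result = *(int *)arg;   // each thread wants to remember its own value
  return NULL;
}

// Runs two threads sequentially and returns what is left in the
// shared variable afterwards.
int shared_value_after_two_threads(void)
{
  int first = 1, second = 2;
  pthread_t t;
  int rc;

  rc = pthread_create(&t, NULL, store_id, &first);
  assert(rc == 0);             // sketch-level error handling
  pthread_join(t, NULL);

  rc = pthread_create(&t, NULL, store_id, &second);
  assert(rc == 0);
  (void)rc;
  pthread_join(t, NULL);

  return last_result;          // the first thread's value has been overwritten
}
```

Only the second thread's value survives, which is exactly the behavior thread-local storage is meant to avoid.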

LANGUAGE IMPLEMENTATIONS

PThread Implementation

POSIX threads (Pthreads) in C provide pthread_key_create and pthread_key_delete to create and delete a key that is visible to every thread, and pthread_getspecific and pthread_setspecific to retrieve and set the calling thread's own value for that key.

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

#define NUMTHREADS 4

pthread_key_t glob_var_key;

void do_something(void)
{
  // get thread-specific data
  int *glob_spec_var = pthread_getspecific(glob_var_key);
  // pthread_t is an opaque type; the cast is for display only and is not portable
  printf("Thread %lu before mod value is %d\n", (unsigned long)pthread_self(), *glob_spec_var);
  *glob_spec_var += 1;
  printf("Thread %lu after mod value is %d\n", (unsigned long)pthread_self(), *glob_spec_var);
}

void* thread_func(void *arg)
{
  int *p = malloc(sizeof(int));
  *p = 1;
  pthread_setspecific(glob_var_key, p);
  do_something();
  do_something();
  pthread_setspecific(glob_var_key, NULL);
  free(p);
  pthread_exit(NULL);
}

int main(void)
{
  pthread_t threads[NUMTHREADS];
  int i;

  pthread_key_create(&glob_var_key, NULL);
  for (i=0; i < NUMTHREADS; i++)
    pthread_create(threads+i,NULL,thread_func,NULL);

  for (i=0; i < NUMTHREADS; i++)
    pthread_join(threads[i], NULL);

  pthread_key_delete(glob_var_key); // release the key once no thread uses it
  return 0;
}
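The example above passes NULL as the second argument to pthread_key_create and frees each thread's value by hand. That second argument can instead be a destructor, which POSIX calls at thread exit for every thread whose value is non-NULL. The sketch below shows this variation; counter_key, destructor_calls, free_counter, and run_with_destructor are hypothetical names, and the call counter exists only to make the behavior observable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define MAX_THREADS 16

static pthread_key_t counter_key;
static int destructor_calls;   // bookkeeping for illustration only
static pthread_mutex_t calls_lock = PTHREAD_MUTEX_INITIALIZER;

// Runs automatically when a thread exits with a non-NULL value for the key.
static void free_counter(void *value)
{
  free(value);
  pthread_mutex_lock(&calls_lock);
  destructor_calls++;
  pthread_mutex_unlock(&calls_lock);
}

static void *worker(void *arg)
{
  int *p = malloc(sizeof *p);
  (void)arg;
  if (p != NULL) {
    *p = 1;
    pthread_setspecific(counter_key, p);   // no manual free needed
  }
  return NULL;
}

// Starts n threads (at most MAX_THREADS) and returns how many times
// the destructor ran.
int run_with_destructor(int n)
{
  pthread_t threads[MAX_THREADS];
  int i, started = 0;

  assert(n >= 0 && n <= MAX_THREADS);
  destructor_calls = 0;
  pthread_key_create(&counter_key, free_counter);

  for (i = 0; i < n; i++) {
    if (pthread_create(&threads[started], NULL, worker, NULL) == 0)
      started++;
  }
  for (i = 0; i < started; i++)
    pthread_join(threads[i], NULL);

  pthread_key_delete(counter_key);
  return destructor_calls;
}
```

With this approach a thread cannot forget to free its value, at the cost of cleanup happening implicitly at thread exit rather than at a visible call site.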

Swift Implementation

In Swift, thread-local data is typically kept in a per-thread dictionary provided by Foundation, called threadDictionary.

import Foundation

public func checkThreadLocal<T: AnyObject>(key: String, create: () -> T) -> T {
  let threadDictionary = Thread.current.threadDictionary

  // Return the cached object if this thread already has one for the key
  if let cachedObject = threadDictionary[key] as? T {
    return cachedObject
  } else {
    let newObject = create()
    threadDictionary[key] = newObject
    return newObject
  }
}

func getFormatter() -> DateFormatter {
  return checkThreadLocal(key: "SomeName") {
    print("This block will only be executed once per thread")
    let enUSPOSIXLocale = Locale(identifier: "en_US_POSIX")
    let formatter = DateFormatter()
    formatter.locale = enUSPOSIXLocale
    formatter.dateFormat = "yyyy'-'MM'-'dd'T'HH':'mm':'ss'Z'"
    formatter.timeZone = TimeZone(secondsFromGMT: 0)
    return formatter
  }
}

let x1 = getFormatter()
let x2 = getFormatter()
let x3 = getFormatter()

In the above code example, the checkThreadLocal function checks whether an entry with the given key has already been added to the current thread's threadDictionary. If it has, the function simply returns the already created object. Otherwise it creates the object (using the create() closure), stores it in the dictionary, and returns it. x1, x2, and x3 all reference the same instance, which belongs to the currently running thread.

However, if you were to call getFormatter() again on a different thread, you would create a separate formatter instance:

let x1 = getFormatter() // NEW INSTANCE CREATED!!
let x2 = getFormatter() // References the instance created by x1
let x3 = getFormatter() // References the instance created by x1

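DispatchQueue.global().async {
  let x4 = getFormatter() // NEW INSTANCE CREATED!!
  let x5 = getFormatter() // References the instance created by x4
  let x6 = getFormatter() // References the instance created by x4
}

let x7 = getFormatter() // References the instance created by x1

This is because a block submitted with DispatchQueue.global().async runs on a background worker thread, which has its own threadDictionary (the comments assume that worker thread has not already cached a formatter). A block submitted to DispatchQueue.main, by contrast, runs on the main thread, so if the surrounding code is already on the main thread it would find the instance created by x1.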

Java Implementation

Java's implementation uses a data structure that is nearly identical to what Flat uses. Java provides a ThreadLocal class that takes a generic argument for the type of data being stored, and you use its get, set, and remove methods to manage the calling thread's copy.

public class ThreadLocalExample {
  public static class MyRunnable extends Thread {
    private static final ThreadLocal<Integer> threadLocal = new ThreadLocal<Integer>();

    @Override
    public void run() {
      threadLocal.set((int)(Math.random() * 100));

      try {
          Thread.sleep(2000);
      } catch (InterruptedException e) {
          Thread.currentThread().interrupt(); // preserve the interrupt status
      }

      System.out.println(threadLocal.get());
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Thread thread1 = new MyRunnable();
    Thread thread2 = new MyRunnable();

    thread1.start();
    thread2.start();

    thread1.join(); // wait for thread 1 to terminate
    thread2.join(); // wait for thread 2 to terminate
  }
}

Flat Implementation

The same code in Flat would look like:

class ThreadLocalExample {
  static class MyRunnable extends Thread {
    static ThreadLocal<Int> threadLocal = ThreadLocal()

    public run() {
      threadLocal.set((Int)(Math.random() * 100))

      Thread.sleep(2000)

      Console.writeLine(threadLocal.get())
    }
  }

  public static main(String[] args) {
    let thread1 = MyRunnable()
    let thread2 = MyRunnable()

    thread1.start()
    thread2.start()

    thread1.join() // wait for thread 1 to terminate
    thread2.join() // wait for thread 2 to terminate
  }
}

They are nearly identical. However, this is not the only way that Flat lets you declare thread locals. With a little syntactic sugar from the thread_local modifier (or the [ThreadLocal] annotation), you can achieve the same result while writing almost exactly the code you would write for an ordinary static field:

class ThreadLocalExample {
  static class MyRunnable extends Thread {
    static thread_local Int threadLocal

    public run() {
      threadLocal = (Int)(Math.random() * 100)

      Thread.sleep(2000)

      Console.writeLine(threadLocal)
    }
  }

  public static main(String[] args) {
    let thread1 = MyRunnable()
    let thread2 = MyRunnable()

    thread1.start()
    thread2.start()

    thread1.join() // wait for thread 1 to terminate
    thread2.join() // wait for thread 2 to terminate
  }
}

The thread_local modifier also allows for more platform-specific optimizations to take place. For instance, when compiled to C, the compiler will output the thread_local fields using the __thread modifier (a GCC and Clang extension), which typically makes access cheaper than a pthread_getspecific lookup. Flat's thread_local modifier will be available in version 0.3.7 and up.
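To give a feel for what a __thread field looks like in C, here is a minimal sketch. It is not actual Flat compiler output: the names thread_local_value, add_hundred, and thread_local_copies_are_independent are hypothetical, and the code assumes a compiler that supports the __thread extension (GCC or Clang).

```c
#include <assert.h>
#include <pthread.h>

// One copy per thread, zero-initialized in each new thread.
static __thread int thread_local_value;

static void *add_hundred(void *arg)
{
  int *slot = arg;

  thread_local_value = *slot;   // touches this thread's copy only
  thread_local_value += 100;
  *slot = thread_local_value;   // report the result back to the caller
  return NULL;
}

// Returns 1 if both workers kept separate values and the caller's
// own copy was left untouched, and 0 otherwise.
int thread_local_copies_are_independent(void)
{
  int a = 1, b = 2;
  pthread_t ta, tb;
  int rc;

  thread_local_value = 7;       // the calling thread's copy

  rc = pthread_create(&ta, NULL, add_hundred, &a);
  assert(rc == 0);              // sketch-level error handling
  rc = pthread_create(&tb, NULL, add_hundred, &b);
  assert(rc == 0);
  (void)rc;

  pthread_join(ta, NULL);
  pthread_join(tb, NULL);

  return a == 101 && b == 102 && thread_local_value == 7;
}
```

Unlike the pthread_getspecific version, no key has to be created or looked up; the variable is read and written like an ordinary static.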

Conclusion

Of the approaches that the different languages in this article showcase, Flat chose a ThreadLocal data structure because it can be implemented cleanly behind the thread_local modifier. There needed to be a seamless way to define thread-local data without using some sort of key/value pair to keep track of it. Instead of requiring you to keep track of both a key (or thread ID) and the variable, the thread_local modifier lets you focus on the variable itself.

Footnotes:

5. More information on thread-local storage can be found here.

6. More information can be found here.
