Skip navigation

Fast Thread-Safe Map Cache

Recently I came across an interesting programming problem I want to share with you. In short this post discusses thread-safe and fast java implementation of a cache for some special case.

To give you some background, this problem arose in scope of my pet project PUNKSearch (local network indexer and searcher based on Lucene). PUNKSearch shows online/offline status of the hosts displayed in search results. These statuses are cached for several minutes to avoid “dogpile effect” on remote hosts and increase overall responsiveness of the application. Cache is organized as HashMap for which keys are IPs and values are pairs of Date and Boolean wrapped in some extra object named HostStatus. The Date is used to determine if we need to refresh the status for the host and Boolean defines whatever host is online. Obtaining status is a Slow Process (timeout usually is 5 seconds).

Since PUNKSearch is a web application this cache must be thread-safe. Another property we want it to have is speed. Locking the whole cache for the period some thread tries to determine if a host is online is no way.

Lets try to construct such a cache.
Try 1: The simple not thread-safe implementation (cache renewal is dropped for the sake of clearness)


class MapCacheSimpleFastBroken {

	private Map<String, Object> cache       = new HashMap<String, Object>();
	private SlowProcess         slowProcess = new SlowProcess();

	public Object get(String key) {
		Object value = cache.get(key);
		if (value == null) {
			value = slowProcess.get(key);
			cache.put(key, value);
		}
		return value;
	}
}

This class obviously not thread-safe. What can we do to make it thread-safe? Of course, make the method “synchronized”!

Try 2: The simple thread-safe implementation (cache renewal is dropped for the sake of clearness)


class MapCacheSimpleSlowSafe {

	private Map<String, Object> cache       = new HashMap<String, Object>();
	private SlowProcess         slowProcess = new SlowProcess();

	public synchronized Object get(String key) {
		Object value = cache.get(key);
		if (value == null) {
			value = slowProcess.get(key);
			cache.put(key, value);
		}
		return value;
	}
}

This implementation is thread-safe, good. But it is extremely slow! How can we avoid locking the whole cache object? What if we lock only required portion of the map? The map key in particular. Lets try.

Try 3: The fine-grained want-to-be thread-safe implementation (cache renewal is dropped for the sake of clearness)


class MapCacheTrickyFastBroken {

	private Map<String, Object> cache       = new HashMap<String, Object>();
	private SlowProcess         slowProcess = new SlowProcess();
	private static final Object STUB        = new Object();

	public Object get(String key) throws InterruptedException {
		String lock = "";
		// sync on the whole cache to: 1) get cached data, 2) insert stub if necessary
		synchronized (cache) {
			Object value = cache.get(key);
			if (value != null && value != STUB) { // "null" only if this is the first time we see this key
				return value;
			} else if (value == null) { // we see the key for the first time, add the stub to create a key in map keySet
				cache.put(key, STUB);
			}
			// init lock with the key for the required object
			for (String cacheKey : cache.keySet()) {
				if (cacheKey.equals(key)) {
					lock = cacheKey;
					break;
				}
			}
		}
		// grab the lock only on the part of the cache map, so online checks for different hosts can go simultaneously
		// can't use "key" argument as lock here, since we want to sync on cache's key object
		synchronized (lock) {
			Object value = cache.get(key);
			// maybe object was already updated by other thread while we were waiting for lock?
			// call the slow process only if that was not happened
			if (value == STUB) {
				value = slowProcess.get(key);
				cache.put(key, value);
			}
			return value;
		}
	}
}

This implementation seems to be ok. We lock the whole map for a short period of time to verify if we have required object in the cache and possibly modify the map. If the object was not found we put a stub into the map to sync against its key later.
Bad news. This implementation has a serious flaw. Can you find it? Look under the cut for the explanation and the correct implementation.
Read More »

The old well-known(?) puzzle

Suppose we have two sorted arrays: A and B of size n.
We would like to know if there is such a pair of indexes (i,j) that A[i] + B[j] = C.
C is a given number.
What’s the best worst-case time complexity (Big Oh) of the algorithm to solve the problem?

Looking at The Law of Demeter

Have you heard of The Law of Demeter? I haven’t until recently I’ve come across this term a few times in one day. Immediately I decided to take more closer look and get familiar myself with that term. It turned out that it’s not about ancient mythology… well… it’s named in honor of Demeter which is in Greek mythology:

is the goddess of grain and fertility, the pure nourisher of the youth and the green earth, the health-giving cycle of life and death, and preserver of marriage and the sacred law.

But our Law of Demeter is about software engineering, to be more precise it is about a design guideline for developing software also called as “Only talk to your immediate friends”.

First a came across this term reading Martin Fowler’s article “Mocks Aren’t Stubs” which is really great article where he explains differences of Test Doubles. If you don’t really know the difference between Dummy, Fake, Stub and Mock objects, go and read it, the article is great source of that knowledge. BTW, I think I’m a classic TDDer who uses Mocks for the real objects that are awkward to work with.

So, back to LoD in “Mocks Aren’t Stubs” Martin Fowler talks how your testing style influence the design style.

Mockist testers do talk more about avoiding ‘train wrecks’ - method chains of style of getThis().getThat().getTheOther(). Avoiding method chains is also known as following the Law of Demeter. While method chains are a smell, the opposite problem of middle men objects bloated with forwarding methods is also a smell.

And there is a good article on Google Testing Blog describing why violation of LoD is a smell and how to avoid it.

To me it’s not only about number of dots in a “train wreck” it’s about behavior of objects and their responsibility.
The most famous example showing LoD violation is Paper boy.

The first point is that the customer usually does not give the wallet to the paper boy to pay for the paper bill. Instead, the customer takes out the money from the wallet and hands it to the paper boy.

Instead of using (in class PaperBoy)

	customer.wallet.totalMoney;

we use a method:

	customer.getPayment(..)

You don’t want to give away your wallet that’s true, but sometimes two dots don’t necessarily mean that there is a LoD violation.
Let’s take an example getOrder().getCustomer().getCustomerAddress(). Do we want to delegate the getCustomerAddress() instead?

	getOrder().getCustomerAddress();

The question is should the order have a getCustomerAddress() method?

That’s what a few experts think about LoD:

The basic effect of applying this Law is the creation of loosely coupled classes, whose implementation secrets are encapsulated. Such classes are fairly unencumbered, meaning that to understand the meaning of one class, you need not understand the details of many other classes.
— G. Booch.

Avoid traversing multiple links or methods. A method should have limited knowledge of an object model. A method must be able to traverse links to obtain its neighbors and must be able to call operations on them, but it should not traverse a second link from the neighbor to a third class.
– J. Rambaugh

There are so many resources out there about this subject, if you’re interested I would suggest to read:

How to write Evil, Unstable code

How to write evil code? It’s easy, follow the guidelines. :)

Conditional Slalom - Always, always, feel good when writing lengthy if branches and switch statements. These increase the number of possible execution paths that tests will need to cover when exercising the code under test. The higher the Cyclomatic complexity, the harder it is to test! When someone suggests to use polymorphism instead of conditionals, laugh at their thoughtfulness towards testing. Make the branching both deep and wide: if you’re not consistently at least 5 conditionals deep, you’re spoon feeding testable code to the TDD zealots

Use More Statics - Statics are a really powerful tool to bring TDD Infected engineers to their knees. Static methods can’t be overridden in a subclass (sometimes subclassing a class and overriding methods is a technique for testing). When you use static methods, they can’t be mocked using mocking libraries (nixing another trick up the pro-testing engineer’s sleeve).

Use static initializes - Do as much work as possible when your class is loaded. Testing nuts will be so frustrated when they find out just loading your class causes nasty stuff like network or file access.

There is more following the link. It’s worth reading. I like that blog!

Hard World of Ordinary User

I was in vacation last two weeks. My macbook was deliberately left at home, because I want to rest from the world of HiTech — that was vacation, right? Indeed, I left the HiTech world, but dive into the hard world on ordinary user. As you may assume I had access to computers and to be honest at most times against my will. As IT guy (do not confuse with It girl) I was forced to help people around with their computer problems.

The computer world of ordinary user is hard. The first problem is viruses. That was nightmare! Being *nix adept for a long time I completely forgot the “virus” word. In the hard world viruses are around. The anti-virus solutions are diverse, can’t live with each other, need constant internet connection to update their bases and internet connection supplies you with new viruses again and again. Having no internet connection is death — the first USB drive with doc/xls/jpg files from you friend and you have a trojan.

The second problem is LCD monitors. Yes, LCD monitors — that simple. Impressing number of users set 1024×768 on 15/17 inch displays while the native resolution is 1280×1024! You can imagine the quality of the picture! I simply can’t understand that. Is the ordinary user blind? Don’t they just care? And finally some of them buy modern and stylish widescreen monitors and set 4:3 ratio. Hard, hard world :(

I was simply happy back to my laptop! Wish you only positive impressions during your vacation :)