Java wait/notify/notifyAll — Stop Deadlocks at 2 AM
Race conditions corrupt data; deadlocks freeze your app at 2 AM.
20+ years shipping production Java in banking & fintech. Lessons pulled from things that broke in production.
- wait() must always be called inside a while loop re-checking the condition, never an if, to guard against spurious wakeups and race conditions.
- notify() wakes one arbitrarily chosen thread; use it only when all waiting threads are waiting for the same condition and exactly one can make progress.
- notifyAll() wakes every waiting thread; it's always safe but can cause thundering herd overhead under high contention.
- Always call wait/notify/notifyAll inside a synchronized block that holds the object's monitor; the thread must own the lock before calling these methods.
- Prefer java.util.concurrent (e.g., BlockingQueue, ReentrantLock, CountDownLatch) over raw wait/notify in application code; the low-level API is error-prone and reserved for foundational infrastructure.
Imagine a coffee shop with one barista and a line of customers. When the barista runs out of coffee beans, they put up a 'Wait' sign — customers stop and sit down instead of crowding the counter. When the beans arrive, the barista either taps one specific customer (notify) or shouts 'everyone back in line!' (notifyAll). Java's wait/notify is exactly that: a polite way for threads to pause and resume without burning CPU cycles staring at a wall.
Wait, notify, and notifyAll are the lowest-level thread coordination primitives in Java. They're also the most dangerous. Master them, and you can build high-performance, resource-efficient concurrent systems. Get them wrong, and your application will deadlock silently in production, or burn CPU with busy loops while waiting for work that never arrives.
How wait/notify/notifyAll Actually Coordinates Threads
wait, notify, and notifyAll are the low-level inter-thread communication primitives in Java, built into every object via the Object class. A thread calling wait() on an object releases that object's intrinsic lock (monitor) and enters the object's wait set, suspending execution until another thread calls notify() or notifyAll() on the same object. This is not a polling mechanism — it's a precise signaling contract that requires the calling thread to already hold the object's monitor (i.e., be inside a synchronized block or method).
The key property: wait() atomically releases the lock and blocks, so there is no race between releasing and waiting. When a thread is notified, it does not resume immediately — it must re-acquire the lock before returning from wait(). This means the notified thread competes with other threads for the monitor, and the order of resumption is not guaranteed (JVM-dependent). notify() wakes one arbitrarily chosen thread; notifyAll() wakes all waiting threads. The awakened threads then re-check their condition, typically in a loop, because spurious wakeups are possible and the condition may have changed by the time they reacquire the lock.
Use wait/notify/notifyAll when you need a thread to wait for a condition that depends on another thread's action — for example, a producer-consumer queue where a consumer must wait until an item is available. In production systems, this pattern is the foundation of bounded blocking queues (like ArrayBlockingQueue) and thread pools. The alternative — busy-waiting with a while loop and sleep — wastes CPU cycles and introduces latency. But the low-level API is error-prone; in modern Java, prefer java.util.concurrent constructs (BlockingQueue, CountDownLatch, Phaser) unless you have a very specific reason to manage the monitor yourself.
- Never write a bare if (condition) wait(); always use while (condition) wait(); to guard against spurious wakeups and ensure the condition is truly met before proceeding.
- A notify() used instead of notifyAll() in a thread pool's task queue caused a priority inversion: a high-priority task waited indefinitely because a low-priority thread was the one woken.
- notify() does not release the lock; the notifying thread keeps the monitor until its synchronized block exits.
- Always call wait() in a while loop checking the condition; spurious wakeups are real and platform-dependent.
How wait() and notify() Actually Work Inside the JVM
Every Java object carries two hidden data structures inside its monitor: a lock (a mutex) and a wait-set (a queue of sleeping threads). When you call synchronized(someObject) you're competing for that object's lock. Once you hold it, calling someObject.wait() does three things atomically: it adds the calling thread to the wait-set, releases the lock, and suspends the thread. That release is crucial — without it, no other thread could ever call notify() because they'd never acquire the lock.
When another thread calls someObject.notify(), the JVM picks one thread from the wait-set and moves it to the entry-set — the queue competing for the lock. The notified thread doesn't run immediately; it re-acquires the lock first, then returns from wait(). This is why you must always re-check your condition after wait() returns using a while loop, not an if. The window between being notified and re-acquiring the lock is a real window, and another thread can slip in and invalidate the condition you were waiting for.
The JVM spec also permits spurious wakeups — a thread can return from wait() with no notify() having been called at all. This isn't just theoretical; it happens on Linux due to how POSIX condition variables are implemented under the hood. The while-loop pattern isn't defensive paranoia — it's mandatory correctness.
A thread can return from wait() without your condition being true. The while loop costs nothing and prevents data corruption that only shows up under load.
Building a Correct Bounded Producer-Consumer Queue
The classic application of wait/notify is a bounded blocking queue — a buffer with a fixed capacity shared between producers and consumers. Producers wait when the buffer is full; consumers wait when it's empty. This pattern appears everywhere: thread pools, message brokers, async pipelines.
The two conditions to model are: 'buffer is not full' (producers wait on this) and 'buffer is not empty' (consumers wait on this). With a single lock object, both conditions share the same wait-set, which is why notifyAll() becomes important here — a notify() might wake the wrong type of waiter.
Pay close attention to where notify/notifyAll is called in the code below: inside the synchronized block, after the state change, before the lock is released. This ordering guarantees the notified thread will see the updated state when it re-acquires the lock.
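A minimal sketch of such a buffer, assuming a single intrinsic lock (the buffer object itself) and a hypothetical class name BoundedBuffer; it is an illustration of the ordering described above, not a replacement for ArrayBlockingQueue:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical name; a single-lock bounded buffer for illustration only.
final class BoundedBuffer<T> {
    private final Deque<T> items = new ArrayDeque<>();
    private final int capacity;

    BoundedBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Producers block here while the buffer is full.
    public synchronized void put(T item) throws InterruptedException {
        while (items.size() == capacity) {
            wait();              // releases this object's monitor while blocked
        }
        items.addLast(item);     // 1. change the state
        notifyAll();             // 2. signal, still holding the lock
    }                            // 3. lock released when the method exits

    // Consumers block here while the buffer is empty.
    public synchronized T take() throws InterruptedException {
        while (items.isEmpty()) {
            wait();
        }
        T item = items.removeFirst();
        notifyAll();             // producers and consumers share one wait-set
        return item;
    }

    public synchronized int size() {
        return items.size();
    }
}
```

Both methods use notifyAll() because producers and consumers sleep in the same wait-set; a woken thread that finds its own condition still false simply loops back into wait().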
notify() might wake a thread that still can't proceed (e.g., waking a producer when the buffer is still full). notifyAll() is O(n) on the number of waiting threads but prevents missed signals and is correct by default. Switch to notify() only after profiling proves the wakeup storm is a real bottleneck, typically with a single condition and many identical waiters.
notify() vs notifyAll() — When Each One Is the Right Tool
This is one of the most misunderstood distinctions in Java concurrency. The short version: notify() is an optimisation, not a default. Use it only when you can guarantee that exactly one waiting thread can make progress after the state change, and all waiting threads are waiting for the same condition.
notifyAll() wakes all threads in the wait-set. Each one re-acquires the lock in turn, re-checks the condition, and either proceeds or goes back to sleep. Yes, this produces a 'thundering herd' — every woken thread competes for the lock, and most will just go back to sleep. Under high contention with hundreds of waiting threads this overhead is measurable. But it's always safe.
notify() wakes exactly one thread — JVM-chosen, not your choice. If that thread can't proceed (wrong condition), no other thread gets woken. You now have a system that's deadlocked even though progress is possible. This is called a 'missed signal' or 'lost wakeup' and it is brutally hard to debug.
The canonical rule: you can safely use notify() if and only if both conditions hold — (1) all threads waiting on this object are waiting for the same condition, and (2) one notification is sufficient to allow exactly one thread to proceed.
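One way to make both conditions hold by construction is to give each logical condition its own wait-set with java.util.concurrent.locks.Condition. The sketch below assumes a hypothetical ConditionBuffer class with one ReentrantLock and two Condition objects; with that split, a single signal() is enough because every thread waiting on a given Condition is waiting for the same thing:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical name; one lock, one Condition per logical condition.
final class ConditionBuffer<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();   // producers wait here
    private final Condition notEmpty = lock.newCondition();  // consumers wait here
    private final Deque<T> items = new ArrayDeque<>();
    private final int capacity;

    ConditionBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    public void put(T item) throws InterruptedException {
        lock.lock();
        try {
            while (items.size() == capacity) {
                notFull.await();      // same while-loop rule as wait()
            }
            items.addLast(item);
            notEmpty.signal();        // one new item lets exactly one consumer proceed
        } finally {
            lock.unlock();
        }
    }

    public T take() throws InterruptedException {
        lock.lock();
        try {
            while (items.isEmpty()) {
                notEmpty.await();
            }
            T item = items.removeFirst();
            notFull.signal();         // one free slot lets exactly one producer proceed
            return item;
        } finally {
            lock.unlock();
        }
    }
}
```

The unlock sits in a finally block because, unlike synchronized, an explicit lock is not released automatically when an exception escapes.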
ReentrantLock with Condition solves the notify() problem elegantly. You create separate Condition objects, one per logical condition, so notFull.signal() wakes only producers and notEmpty.signal() wakes only consumers. This is how ArrayBlockingQueue is implemented in the JDK; LinkedBlockingQueue applies the same idea with separate put and take locks. Know both: wait/notify for the interview, Condition for production code.
Production Gotchas — What Goes Wrong in Real Systems
Knowing the API is only half the battle. Here's what actually bites engineers in production.
Calling wait() outside a synchronized block throws IllegalMonitorStateException immediately: no data corruption, just a crash. Easy to catch. The harder bug is calling notify() on a different object than the one you called wait() on. Both mistakes compile without a warning, but only the first fails loudly at runtime; the second quietly produces missed signals.
InterruptedException handling is where a lot of production code quietly breaks. Swallowing the interrupt (catch block that does nothing) means the thread will never respond to a shutdown signal. Always either re-throw or call Thread.currentThread().interrupt() to restore the flag so the caller can react.
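As a rough illustration of the second option, the sketch below restores the interrupt flag and stops waiting. The class and method names (InterruptAwareWaiter, awaitReady, markReady) are made up for this example, and returning false on interruption is just one possible policy:

```java
// Hypothetical names; shows restoring the interrupt flag instead of swallowing it.
final class InterruptAwareWaiter {
    private final Object lock = new Object();
    private boolean ready = false;

    // Returns true once the condition holds, false if the thread was interrupted first.
    boolean awaitReady() {
        synchronized (lock) {
            while (!ready) {
                try {
                    lock.wait();
                } catch (InterruptedException e) {
                    // wait() cleared the flag when it threw; put it back for the caller.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }

    void markReady() {
        synchronized (lock) {
            ready = true;
            lock.notifyAll();
        }
    }
}
```

A caller higher up the stack can then check Thread.currentThread().isInterrupted() and shut down cleanly, which an empty catch block would make impossible.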
Holding the lock too long is a performance killer. Everything inside the synchronized block is serialised. If your condition check or state update does I/O, database calls, or heavy computation, every other thread queues up. Push heavy work outside the synchronized block; use the lock only for reading/writing the shared state and calling wait/notify.
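Nested locks: never call wait() while holding two locks. wait() releases only the monitor of the object it is called on; every other lock the thread holds stays held while it sleeps. If Thread A holds Lock1 and waits on Lock2's monitor, and Thread B holds Lock2 and waits on Lock1's monitor, you have a classic deadlock. Lock ordering rules exist for this reason.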
wait(0) does not return immediately: a timeout of zero means wait indefinitely, exactly like lock.wait(). This surprises engineers who expect it to be a non-blocking check. If you want a non-blocking check, just read the condition variable directly (inside a synchronized block) without calling wait() at all.
The Spurious Wakeup — Why Your while() Loop Saves Your Ass
Competitors will tell you to use a while loop around wait(). They won't tell you why you'll get paged at 3 AM if you don't.
Spurious wakeups are real. The JVM spec explicitly allows wait() to return without a corresponding notify() or notifyAll(). Some operating systems (looking at you, certain POSIX implementations) deliver them when signals interrupt a thread. Your code must be hardened against this.
The fix is simple: never use if around wait(). Always while. The condition you waited on might be false when you wake up. Check it again. Loop until it's true.
Here's the pattern that'll keep your production systems green: the thread checks the predicate, waits if false, and rechecks when it awakens. No exceptions. No shortcuts.
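A small sketch of that pattern, assuming a hypothetical single-slot Mailbox class. The poke() method exists only for demonstration: it wakes waiters without changing any state, which is what a spurious wakeup looks like from the waiter's side. A second deliver() before the first message is taken would overwrite it; this sketch does not try to handle that.

```java
// Hypothetical single-slot handoff; names are illustrative.
final class Mailbox {
    private final Object lock = new Object();
    private String message;   // null means "nothing delivered yet"

    String awaitMessage() throws InterruptedException {
        synchronized (lock) {
            while (message == null) {   // predicate rechecked after every wakeup
                lock.wait();
            }
            String received = message;
            message = null;
            return received;
        }
    }

    void deliver(String newMessage) {
        synchronized (lock) {
            message = newMessage;       // state change and signal under the same lock
            lock.notifyAll();
        }
    }

    // Demonstration only: a wakeup with no state change, like a spurious wakeup.
    void poke() {
        synchronized (lock) {
            lock.notifyAll();
        }
    }
}
```

With an if instead of the while, the first poke() would let awaitMessage() fall through and return null; with the loop, the waiter just goes back to sleep until deliver() makes the predicate true.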
Use while(), not if(). Every. Single. Time.
Always call wait() in a while loop checking the condition. Spurious wakeups are explicitly permitted by the spec, and your code must handle them.
Lost Wakeup — The Silent Killer of Thread Coordination
The wait/notify contract has a hidden landmine: lost wakeups. It happens when you call notify() before the waiting thread has entered wait(). The notification vanishes. The waiting thread waits forever.
This is not a theoretical edge case. It's the #1 bug I find in code review from teams new to synchronization. The root cause: thinking of notify() as a signal that gets stored when it's actually a pulse that only reaches threads already in the wait-set.
The fix is defensive: record the event in shared state and have the waiter check that state under the lock before it calls wait(), so a notification that fired early is never needed. Use notifyAll() unless you have a concrete reason for single-thread wakeup. And always pair your state changes with notifications inside the same synchronized block. The producer must set the condition flag AND call notify() before releasing the lock.
Race conditions between checking and waiting? That's what happens when you try to be clever with timing. Don't. Use the lock correctly.
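To make "use the lock correctly" concrete, here is a sketch with a hypothetical ReadySignal class. The notification itself is never stored, but the ready flag is, so the order in which the two threads arrive stops mattering:

```java
// Hypothetical names; the flag, not the notification, carries the information.
final class ReadySignal {
    private final Object lock = new Object();
    private boolean ready = false;

    void signalReady() {
        synchronized (lock) {
            ready = true;        // remembered even if nobody is waiting yet
            lock.notifyAll();    // wakes whoever is already waiting
        }
    }

    void awaitReady() throws InterruptedException {
        synchronized (lock) {
            // Check and wait happen under the same lock, so signalReady()
            // cannot slip in between the check and the call to wait().
            while (!ready) {
                lock.wait();
            }
        }
    }
}
```

If signalReady() runs first, awaitReady() sees ready == true and returns without ever calling wait(). If awaitReady() runs first, it is already in the wait-set when the notification fires.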
If notify() and state mutation aren't in the same synchronized block, you have a bug. Full stop. The happens-before guarantee comes from one thread releasing a monitor and another thread then acquiring that same monitor.
Always mutate state and call notify()/notifyAll() inside the same synchronized block. Never split them across separate lock acquisitions.
wait(long timeout, int nanos) — The Nanosecond Mirage
You've used wait(1000) to park a thread for up to a second. But wait(long timeout, int nanos) exists for a reason, and it's not because the JVM team had nothing better to do. The nanos argument is not a precision tool; it's a rounding hint. The JVM will never guarantee nanosecond accuracy, and recent OpenJDK versions simply round any non-zero nanos value up to the next whole millisecond. On most operating systems, thread scheduling quanta are in the millisecond range. That 500,000 nanosecond (0.5 ms) timeout? It'll likely round up to the next OS tick.
Why does this matter in production? Because developers cargo-cult this API thinking they can build microsecond-precision timeouts. You can't. The real value is in interval-based polling where you want finer granularity than a full millisecond, but you're still at the mercy of the OS scheduler. If you need real-time precision, you're in the wrong language. Use this to prevent tight loops from burning CPU when you know a condition will resolve in sub-millisecond time, but never as a substitute for a proper cooldown period.
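Whatever granularity you get, a timed wait still needs the condition loop, and the remaining time has to be recomputed after every wakeup. A sketch under those assumptions, with a hypothetical TimedGate class and moderate timeouts (very large values could overflow the deadline arithmetic):

```java
import java.util.concurrent.TimeUnit;

// Hypothetical names; a deadline-based timed wait around wait(long, int).
final class TimedGate {
    private final Object lock = new Object();
    private boolean open = false;

    void open() {
        synchronized (lock) {
            open = true;
            lock.notifyAll();
        }
    }

    // Returns true if the gate opened before the timeout elapsed, false otherwise.
    boolean awaitOpen(long timeout, TimeUnit unit) throws InterruptedException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        synchronized (lock) {
            while (!open) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    return false;                 // timed out, condition still false
                }
                long millis = remaining / 1_000_000;
                int nanos = (int) (remaining % 1_000_000);
                // remaining > 0 here, so millis and nanos are never both zero;
                // wait(0, 0) would mean "wait forever".
                lock.wait(millis, nanos);
            }
            return true;
        }
    }
}
```

Treat the timeout as a lower bound on how long the call may block, not as a precise timer: the method only returns false once the deadline has actually passed, and the scheduler may add more delay on top.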
Conclusion: What You Actually Need to Remember
Wait, notify, and notifyAll are the bare metal of Java thread coordination. You've seen how they work inside the JVM, why spurious wakeups force you to use while() loops, and how lost wakeups silently leave threads waiting forever. The bounded buffer example showed the pattern that works: synchronized on the same object, check condition in a loop, signal after state change.
Here's the production cheat sheet. Never call wait() without a condition loop — the one time you skip it, a spurious wakeup will toast your invariants. Use notifyAll() unless you can prove single-waiter correctness; notify() saves one thread switch but costs you a debugging nightmare. And never hold multiple locks when waiting — that's how deadlocks are born.
You don't need to memorize the JVM internals. Memorize the pattern: synchronized → while(not ready) → wait() → recheck → done. Every senior dev I know has been burned by at least one of these gotchas. Now you won't be.
If you see wait() outside a while() loop in code review, block the merge. Every time. No exceptions.
The pattern is synchronized → while (not ready) → wait() → recheck → proceed. Forget the loop, and you'll debug intermittent failures for weeks.
Notifier — Who Actually Calls notify() and How It Works
The notifier is the thread responsible for calling notify() or notifyAll() to wake waiting threads. Without a notifier, every waiting thread blocks forever. The notifier must hold the same monitor as the waiting threads when it calls notify(); that is, it must be inside a block synchronized on the same object. A common mistake is calling notify() outside the synchronized block, which throws IllegalMonitorStateException. In producer-consumer patterns, the producer acts as notifier after adding items, waking consumers. The key insight: notify() only moves one thread out of the wait-set; the actual handoff happens when the notifier releases the lock. The awakened thread must reacquire the lock before returning from wait(). Never call notify() for a condition you haven't changed; otherwise you cause needless wakeups that waste CPU.
Calling notify() without changing the condition can cause the awakened thread to immediately re-enter wait(), wasting CPU and risking lost wakeups if the condition never changes.
Change the condition first, then call notify(), and do both inside the same synchronized block.
WaitNotifyTest — A Concrete Test to Validate Your Coordination
WaitNotifyTest is a simple test harness that validates wait/notify coordination under concurrency. It creates one waiting thread and one notifier thread that share a lock object and a volatile boolean flag to signal readiness. The test asserts that the waiting thread blocks until notified and then completes before a deadline. Key components: an ExecutorService for thread management, a CountDownLatch to synchronize test start, and an assertion that the waiting thread finishes within the deadline. This catches lost wakeup bugs: if the code under test waits on the notification alone and the notifier signals before the waiter starts waiting, the waiter blocks forever and the deadline assertion fails. Always structure tests to start the waiter first, then the notifier, and wrap wait() in a while(condition) loop to handle spurious wakeups. Without such tests, these bugs tend to surface only in production.
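The original harness is not shown here, so the following is only a sketch of what such a test could look like, built from the components listed above. The method name waiterFinishesWithin and the latch name are assumptions, and the deadline check is returned as a boolean so it can be plugged into any assertion library:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// A sketch of the harness described above; names other than the class are assumptions.
final class WaitNotifyTest {
    private final Object lock = new Object();
    private volatile boolean ready = false;

    // Returns true if the waiter completed before the deadline, false if it hung.
    boolean waiterFinishesWithin(long deadlineMillis) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CountDownLatch waiterHoldsLock = new CountDownLatch(1);
        try {
            Future<?> waiter = pool.submit(() -> {
                synchronized (lock) {
                    // Counted down while holding the lock: the notifier cannot
                    // enter its synchronized block until wait() releases it.
                    waiterHoldsLock.countDown();
                    while (!ready) {
                        lock.wait();
                    }
                }
                return null;   // value-returning lambda, so wait() may throw
            });
            pool.submit(() -> {
                waiterHoldsLock.await();
                synchronized (lock) {
                    ready = true;
                    lock.notifyAll();
                }
                return null;
            });
            waiter.get(deadlineMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;      // waiter never woke up: a lost wakeup or missing notify
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A test would then assert that new WaitNotifyTest().waiterFinishesWithin(2000) is true. Because the flag is only touched under the lock in this sketch, volatile is not strictly required; it is kept to match the description.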
Have the waiter count down a latch inside its synchronized block, just before it calls wait(). This guarantees the waiter is waiting before the notifier fires, preventing race conditions in tests.
Key takeaways
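- The while loop around wait() is not defensive paranoia; it protects against spurious wakeups and against the condition changing in the window between notify() and lock re-acquisition.
- notify() is an optimisation valid only when all waiting threads are identical and exactly one can always proceed.
Interview Questions on This Topic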
Frequently Asked Questions