An Optimized Reduction Design to Minimize Atomic Operations in Shared Memory Multiprocessors