不积跬步,无以至千里

并发安全Map实现


不同语言下都会有Map的实现,Map的存在使得查找的复杂度降低到O(1),但Map并不是并发安全的,使用原生的Map做并发操作时,需要添加同步锁才能保证多线程读写的安全性,但加锁的机制同时会造成读阻塞.

java中的ConcurrentHashMap通过分段锁降低了读阻塞的概率,但是当对某一个段做频繁的读写操作的话,依旧会对性能造成较大的影响.而HashTable通过sychronized关键字实现的同步机制更不必说.

在golang中,sync包下同样实现了Map来应对并发.

首先看结构体定义:

type Map struct {
	mu Mutex
	read atomic.Value // readOnly
	dirty map[interface{}]*entry
	misses int
}
  • mu
    mu是一个同步锁
  • read
    read可以看做是dirty的一份拷贝,对读操作和更新操作线程安全,但不能新增key
  • dirty
    对dirty做操作时,需要通过mu做同步
  • misses
    misses记录了读操作时无法获取到指定key的次数

可以看到Map中维护两份数据,一个用来读,一个用来写.对应的两个主要的方法有Load,Store

type entry struct {
	// p points to the interface{} value stored for the entry.
	//
	// If p == nil, the entry has been deleted and m.dirty == nil.
	//
	// If p == expunged, the entry has been deleted, m.dirty != nil, and the entry
	// is missing from m.dirty.
	//
	// Otherwise, the entry is valid and recorded in m.read.m[key] and, if m.dirty
	// != nil, in m.dirty[key].
	//
	// An entry can be deleted by atomic replacement with nil: when m.dirty is
	// next created, it will atomically replace nil with expunged and leave
	// m.dirty[key] unset.
	//
	// An entry's associated value can be updated by atomic replacement, provided
	// p != expunged. If p == expunged, an entry's associated value can be updated
	// only after first setting m.dirty[key] = e so that lookups using the dirty
	// map find the entry.
	p unsafe.Pointer // *interface{}
}

read和dirty的底层数据都指向一个entry

读取

func (m *Map) Load(key interface{}) (value interface{}, ok bool) {
	read, _ := m.read.Load().(readOnly)
	e, ok := read.m[key]
	if !ok && read.amended {
		m.mu.Lock()
		// Avoid reporting a spurious miss if m.dirty got promoted while we were
		// blocked on m.mu. (If further loads of the same key will not miss, it's
		// not worth copying the dirty map for this key.)
		read, _ = m.read.Load().(readOnly)
		e, ok = read.m[key]
		if !ok && read.amended {
			e, ok = m.dirty[key]
			// Regardless of whether the entry was present, record a miss: this key
			// will take the slow path until the dirty map is promoted to the read
			// map.
			m.missLocked()
		}
		m.mu.Unlock()
	}
	if !ok {
		return nil, false
	}
	return e.load()
}

相关流程可以总结为:

从read取数据    
    取到数据
    未取到数据,并且dirty中包含read中没有的key
        加锁
        从dirty取数据
        记录misses//当missed的个数和read的个数相当时,用dirty替换read
        解锁
返回数据

写入

func (m *Map) Store(key, value interface{}) {
	read, _ := m.read.Load().(readOnly)
	if e, ok := read.m[key]; ok && e.tryStore(&value) {
		return
	}

	m.mu.Lock()
	read, _ = m.read.Load().(readOnly)
	if e, ok := read.m[key]; ok {
		if e.unexpungeLocked() {
			// The entry was previously expunged, which implies that there is a
			// non-nil dirty map and this entry is not in it.
			m.dirty[key] = e
		}
		e.storeLocked(&value)
	} else if e, ok := m.dirty[key]; ok {
		e.storeLocked(&value)
	} else {
		if !read.amended {
			// We're adding the first new key to the dirty map.
			// Make sure it is allocated and mark the read-only map as incomplete.
			m.dirtyLocked()
			m.read.Store(readOnly{m: read.m, amended: true})
		}
		m.dirty[key] = newEntry(value)
	}
	m.mu.Unlock()
}

相关流程可以总结为:

从read取数据    
    取到数据,并且取到的数据和要写入的数据相等
返回
加锁
再次从read取数据
    取到数据
        把read中的entry删除
        成功?
            dirty中添加entry
    entry中的value赋值
未取到数据,从dirty中取到数据
    entry中的value赋值
read和dirty的数据一致//read和dirty都没有读到指定key的数据,并且数据一样,说明这个key是新添加的key
    read中的amended标记为true,表示dirty中包含read中没有的key
    dirty添加元素
解锁

以上可以看到golang中实现map多现成安全的策略是读写分离,read用来读,dirty用来写,从read中获取不到指定key的时候才从dirty中读取;同时misses达到一定次数之后才会用dirty替换read,达到更新read的目的.

由于对dirty的更新同样需要借助锁同步,所以当写入操作频繁的话,读操作获取不到指定的key,读的压力依旧会落到dirty,除非及时更新read.所以这里misses的设定需要设定一个比较合理的值,才能让读写达到更好的效率.

简单来说,如果写频繁,可以将misses的值设小一点,及时更新read,避免影响度读效率;而如果读操作更加频繁,misses可以设定高一点,避免频繁更新read.