@InterfaceAudience.Private public class ReplicationSourceManager extends Object
When a region server dies, this class is notified through a watcher and tries to grab a lock so that it can transfer all of that server's queues into a local old source.
Synchronization specification:

- sources: no synchronization is needed. sources is a ConcurrentHashMap and there is a Lock for peer id in PeerProcedureHandlerImpl, so there is no race for peer operations.
- walsById: synchronization is needed. Four methods modify it: addPeer(String), removePeer(String), cleanOldLogs(String, boolean, ReplicationSourceInterface) and preLogRoll(Path). walsById is a ConcurrentHashMap and there is a Lock for peer id in PeerProcedureHandlerImpl, so there is no race between addPeer(String) and removePeer(String). cleanOldLogs(String, boolean, ReplicationSourceInterface) is called by ReplicationSourceInterface, so it does not race with addPeer(String). removePeer(String) terminates the ReplicationSourceInterface first and then removes the wals from walsById, so cleanOldLogs does not race with removePeer(String) either. The only pair that needs synchronization is cleanOldLogs(String, boolean, ReplicationSourceInterface) and preLogRoll(Path).
- walsByIdRecoveredQueues: no synchronization is needed. Three methods modify it: removePeer(String), cleanOldLogs(String, boolean, ReplicationSourceInterface) and claimQueue(ServerName, String). cleanOldLogs(String, boolean, ReplicationSourceInterface) is called by ReplicationSourceInterface. removePeer(String) terminates the ReplicationSourceInterface first and then removes the wals from walsByIdRecoveredQueues. claimQueue(ServerName, String) adds the wals to walsByIdRecoveredQueues first and then starts up a ReplicationSourceInterface. So there is no race here. claimQueue(ServerName, String) and removePeer(String) already synchronize on oldsources, so no synchronization on walsByIdRecoveredQueues is needed.
- latestPaths: synchronization is needed so that a newly opened source does not miss a new log.
- oldsources: synchronization is needed to avoid adding a recovered source for the to-be-removed peer.

| Constructor and Description |
|---|
| ReplicationSourceManager(ReplicationQueueStorage queueStorage, ReplicationPeers replicationPeers, org.apache.hadoop.conf.Configuration conf, Server server, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path logDir, org.apache.hadoop.fs.Path oldLogDir, UUID clusterId, WALFactory walFactory, org.apache.hadoop.hbase.replication.regionserver.SyncReplicationPeerMappingManager syncReplicationPeerMappingManager, MetricsReplicationGlobalSourceSource globalMetrics): Creates a replication manager and sets the watch on all the other registered region servers |
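
The synchronization specification in the class description can be illustrated with a minimal sketch. This is not the HBase implementation: `WalTrackerSketch` and all of its methods are hypothetical stand-ins, and the nested layout (peer id -> WAL group -> sorted WAL names) is an assumption suggested by the return type of getWALs(). It shows only the one pair the specification says needs locking, a log roll and a log cleanup, sharing a monitor while the outer map is a ConcurrentHashMap.

```java
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the walsById bookkeeping; not the HBase implementation. */
final class WalTrackerSketch {
  // Assumed layout: peer id -> WAL group -> sorted WAL names.
  private final Map<String, Map<String, NavigableSet<String>>> walsById =
      new ConcurrentHashMap<>();

  /** Registers a peer with no tracked WALs (per-peer locking is assumed to happen elsewhere). */
  void addPeer(String peerId) {
    walsById.putIfAbsent(peerId, new ConcurrentHashMap<>());
  }

  /** Records a newly rolled WAL for every peer; shares a monitor with cleanOldLogs. */
  void preLogRoll(String group, String walName) {
    synchronized (walsById) {
      for (Map<String, NavigableSet<String>> byGroup : walsById.values()) {
        byGroup.computeIfAbsent(group, g -> new TreeSet<>()).add(walName);
      }
    }
  }

  /** Drops WALs sorted before walName (or up to it, if inclusive); returns how many went. */
  int cleanOldLogs(String peerId, String group, String walName, boolean inclusive) {
    synchronized (walsById) {
      Map<String, NavigableSet<String>> byGroup = walsById.get(peerId);
      if (byGroup == null || !byGroup.containsKey(group)) {
        return 0;
      }
      NavigableSet<String> old = byGroup.get(group).headSet(walName, inclusive);
      int removed = old.size();
      old.clear(); // headSet is a view, so clearing it removes from the backing set
      return removed;
    }
  }

  /** Returns a copy of the WALs tracked for a known peer and group. */
  NavigableSet<String> wals(String peerId, String group) {
    synchronized (walsById) {
      return new TreeSet<>(walsById.get(peerId).get(group));
    }
  }

  public static void main(String[] args) {
    WalTrackerSketch tracker = new WalTrackerSketch();
    tracker.addPeer("peer1");
    tracker.preLogRoll("rs1", "rs1.0001");
    tracker.preLogRoll("rs1", "rs1.0002");
    tracker.preLogRoll("rs1", "rs1.0003");
    int removed = tracker.cleanOldLogs("peer1", "rs1", "rs1.0002", false);
    // prints "1 removed, remaining [rs1.0002, rs1.0003]"
    System.out.println(removed + " removed, remaining " + tracker.wals("peer1", "rs1"));
  }
}
```

Without the shared monitor, a cleanup iterating a TreeSet could interleave with a roll adding to the same set, which is the race the specification singles out.
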
| Modifier and Type | Method and Description |
|---|---|
| void | addHFileRefs(TableName tableName, byte[] family, List<Pair<org.apache.hadoop.fs.Path,org.apache.hadoop.fs.Path>> pairs) |
| void | addPeer(String peerId): Add peer to replicationPeers; add the normal source and related replication queue; add HFile Refs |
| void | cleanUpHFileRefs(String peerId, List<String> files) |
| void | drainSources(String peerId): This is used when we transit a sync replication peer to SyncReplicationState.STANDBY. |
| org.apache.hadoop.fs.FileSystem | getFs(): Get the handle on the local file system |
| org.apache.hadoop.fs.Path | getLogDir(): Get the directory where wals are stored by their RSs |
| org.apache.hadoop.fs.Path | getOldLogDir(): Get the directory where wals are archived |
| List<ReplicationSourceInterface> | getOldSources(): Get a list of all the recovered sources of this rs |
| ReplicationPeers | getReplicationPeers(): Get the ReplicationPeers used by this ReplicationSourceManager |
| ReplicationSourceInterface | getSource(String peerId): Get the normal source for a given peer |
| List<ReplicationSourceInterface> | getSources(): Get a list of all the normal sources of this rs |
| String | getStats(): Get a string representation of all the sources' metrics |
| long | getTotalBufferLimit(): Returns the maximum size in bytes of edits held in memory which are pending replication across all sources inside this RegionServer. |
| AtomicLong | getTotalBufferUsed() |
| Map<String,Map<String,NavigableSet<String>>> | getWALs(): Get a copy of the wals of the normal sources on this rs |
| void | join(): Terminate the replication on this region server |
| void | logPositionAndCleanOldLogs(ReplicationSourceInterface source, org.apache.hadoop.hbase.replication.regionserver.WALEntryBatch entryBatch): This method will log the current position to storage. |
| void | postLogRoll(org.apache.hadoop.fs.Path newLog) |
| void | preLogRoll(org.apache.hadoop.fs.Path newLog) |
| void | refreshSources(String peerId): Close the previous replication sources of this peer id and open new sources to trigger the new replication state changes or new replication config changes. |
| void | removePeer(String peerId): Remove peer for replicationPeers; remove all the recovered sources for the specified id and related replication queues; remove the normal source and related replication queue; remove HFile Refs |
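
getTotalBufferUsed() returns a shared AtomicLong and getTotalBufferLimit() returns the byte limit that applies across all sources. How callers combine the two is not stated on this page, so the following is only one plausible pattern, a compare-and-set reservation against a fixed limit. `BufferQuotaSketch`, `tryAcquire` and `release` are hypothetical names introduced for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical shared byte quota; illustrates an AtomicLong counter checked against a limit. */
final class BufferQuotaSketch {
  private final AtomicLong totalBufferUsed = new AtomicLong();
  private final long totalBufferLimit;

  BufferQuotaSketch(long totalBufferLimit) {
    this.totalBufferLimit = totalBufferLimit;
  }

  /** Reserves bytes if the limit allows it; returns false when the caller should back off. */
  boolean tryAcquire(long bytes) {
    while (true) {
      long used = totalBufferUsed.get();
      if (used + bytes > totalBufferLimit) {
        return false;
      }
      // Retry if another thread changed the counter between get and compareAndSet.
      if (totalBufferUsed.compareAndSet(used, used + bytes)) {
        return true;
      }
    }
  }

  /** Returns previously reserved bytes to the quota. */
  void release(long bytes) {
    totalBufferUsed.addAndGet(-bytes);
  }

  long used() {
    return totalBufferUsed.get();
  }

  public static void main(String[] args) {
    BufferQuotaSketch quota = new BufferQuotaSketch(100);
    System.out.println(quota.tryAcquire(60)); // true: 60 of 100 bytes in use
    System.out.println(quota.tryAcquire(60)); // false: 120 would exceed the limit
    quota.release(60);
    System.out.println(quota.tryAcquire(60)); // true again after the release
  }
}
```
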
public ReplicationSourceManager(ReplicationQueueStorage queueStorage, ReplicationPeers replicationPeers, org.apache.hadoop.conf.Configuration conf, Server server, org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path logDir, org.apache.hadoop.fs.Path oldLogDir, UUID clusterId, WALFactory walFactory, org.apache.hadoop.hbase.replication.regionserver.SyncReplicationPeerMappingManager syncReplicationPeerMappingManager, MetricsReplicationGlobalSourceSource globalMetrics) throws IOException
Parameters:
queueStorage - the interface for manipulating replication queues
conf - the configuration to use
server - the server for this region server
fs - the file system to use
logDir - the directory that contains all wal directories of live RSs
oldLogDir - the directory where old logs are archived
Throws:
IOException

public void addPeer(String peerId) throws IOException
Parameters:
peerId - the id of replication peer
Throws:
IOException

public void removePeer(String peerId)
Parameters:
peerId - the id of the replication peer

public void drainSources(String peerId) throws IOException, ReplicationException
This is used when we transit a sync replication peer to SyncReplicationState.STANDBY.
When transiting to SyncReplicationState.STANDBY, we can remove all the pending wal
files for a replication peer as we do not need to replicate them any more. And this is
necessary, otherwise when we transit back to SyncReplicationState.DOWNGRADE_ACTIVE
later, the stale data will be replicated again and cause inconsistency.
See HBASE-20426 for more details.
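
The reasoning above can be illustrated with a toy model. It is a sketch under stated assumptions, not the HBase code: `DrainSketch`, `ToyState` and every method here are hypothetical, and the rule that nothing ships while in STANDBY is assumed only for this example. It compares a round trip through STANDBY with and without draining the pending WALs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy model of one sync replication peer; hypothetical, not the HBase implementation. */
final class DrainSketch {
  /** Only the two states named in the text above. */
  enum ToyState { DOWNGRADE_ACTIVE, STANDBY }

  private ToyState state = ToyState.DOWNGRADE_ACTIVE;
  private final Queue<String> pendingWals = new ArrayDeque<>();

  void enqueue(String wal) {
    pendingWals.add(wal);
  }

  /** Changes state; when entering STANDBY, optionally drops all pending WALs. */
  void transit(ToyState next, boolean drainOnStandby) {
    if (next == ToyState.STANDBY && drainOnStandby) {
      pendingWals.clear();
    }
    state = next;
  }

  /** Ships whatever is pending. Assumed for this toy: nothing is shipped while in STANDBY. */
  List<String> replicatePending() {
    List<String> shipped = new ArrayList<>();
    if (state == ToyState.STANDBY) {
      return shipped;
    }
    while (!pendingWals.isEmpty()) {
      shipped.add(pendingWals.poll());
    }
    return shipped;
  }

  /** Runs DOWNGRADE_ACTIVE -> STANDBY -> DOWNGRADE_ACTIVE and returns what ships afterwards. */
  static List<String> roundTrip(boolean drainOnStandby) {
    DrainSketch peer = new DrainSketch();
    peer.enqueue("wal-before-standby");
    peer.transit(ToyState.STANDBY, drainOnStandby);
    peer.transit(ToyState.DOWNGRADE_ACTIVE, drainOnStandby);
    peer.enqueue("wal-after-return");
    return peer.replicatePending();
  }

  public static void main(String[] args) {
    System.out.println("with drain:    " + roundTrip(true));  // [wal-after-return]
    System.out.println("without drain: " + roundTrip(false)); // stale WAL ships first
  }
}
```

Without the drain, the WAL queued before the STANDBY transition is shipped once the peer returns to DOWNGRADE_ACTIVE, which is the stale replication the description warns about.
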
Parameters:
peerId - the id of the sync replication peer
Throws:
IOException
ReplicationException

public void refreshSources(String peerId) throws IOException
Parameters:
peerId - the id of the replication peer
Throws:
IOException

public void logPositionAndCleanOldLogs(ReplicationSourceInterface source, org.apache.hadoop.hbase.replication.regionserver.WALEntryBatch entryBatch)
Parameters:
source - the replication source
entryBatch - the wal entry batch we just shipped

public void preLogRoll(org.apache.hadoop.fs.Path newLog) throws IOException
Throws:
IOException

public void postLogRoll(org.apache.hadoop.fs.Path newLog) throws IOException
Throws:
IOException

public void join()
public Map<String,Map<String,NavigableSet<String>>> getWALs()
public List<ReplicationSourceInterface> getSources()
public List<ReplicationSourceInterface> getOldSources()
public ReplicationSourceInterface getSource(String peerId)
public AtomicLong getTotalBufferUsed()
public long getTotalBufferLimit()
public org.apache.hadoop.fs.Path getOldLogDir()
public org.apache.hadoop.fs.Path getLogDir()
public org.apache.hadoop.fs.FileSystem getFs()
public ReplicationPeers getReplicationPeers()
public String getStats()
public void addHFileRefs(TableName tableName, byte[] family, List<Pair<org.apache.hadoop.fs.Path,org.apache.hadoop.fs.Path>> pairs) throws IOException
Throws:
IOException

Copyright © 2007–2020 The Apache Software Foundation. All rights reserved.