When an abnormally large message is sent between Group Replication group members, it can result in some group members being reported as failed and expelled from the group. This is because the single thread used by Group Replication's group communication engine (XCom, a Paxos variant) is occupied processing the message for too long, so some of the group members might report the receiver as failed. From MySQL 8.0.16, by default, large messages are automatically split into fragments that are sent separately and reassembled by the recipients.
The system variable
group_replication_communication_max_message_size
specifies a maximum message size for Group Replication
communications, above which messages are fragmented. The default
maximum message size is 10485760 bytes (10 MiB). The greatest
permitted value is the same as the maximum value of the
slave_max_allowed_packet
system
variable, which is 1073741824 bytes (1 GB). The setting for
group_replication_communication_max_message_size
must be less than the
slave_max_allowed_packet
setting,
because the applier thread cannot handle message fragments larger
than slave_max_allowed_packet
. To
switch off fragmentation, specify a zero value for
group_replication_communication_max_message_size
.
As with most other Group Replication system variables, you must restart the Group Replication plugin for the change to take effect. For example:
STOP GROUP_REPLICATION;
SET GLOBAL group_replication_communication_max_message_size= 5242880;
START GROUP_REPLICATION;
Message delivery for a fragmented message is considered complete when all the fragments of the message have been received and reassembled by all the group members. Fragmented messages include information in their headers that enables a member joining during message transmission to recover the earlier fragments that were sent before it joined. If the joining member fails to recover the fragments, it expels itself from the group.
In order for a replication group to use fragmentation, all group
members must be at MySQL 8.0.16 or above, and the Group
Replication communication protocol version in use by the group
must allow fragmentation. You can inspect the communication
protocol in use by a group by using the
group_replication_get_communication_protocol()
UDF, which returns the oldest MySQL Server version that the group
supports. Versions from MySQL 5.7.14 allow compression of
messages, and versions from MySQL 8.0.16 also allow fragmentation
of messages. If all group members are at MySQL 8.0.16 or above and
there is no requirement to allow members at earlier releases to
join, you can use the
group_replication_set_communication_protocol()
UDF to set the communication protocol version to MySQL 8.0.16 or
above in order to allow fragmentation. For more information, see
Section 18.4.2.4, “Setting a Group's Communication Protocol Version”.
If a replication group cannot use fragmentation because some
members do not support it, the system variable
group_replication_transaction_size_limit
can be used to limit the maximum size of transactions the group
accepts. In MySQL 8.0, the default setting is approximately 143
MB. Transactions above this size are rolled back. You can also use
the system variable
group_replication_member_expel_timeout
to allow additional time (up to an hour) before a member under
suspicion of having failed is expelled from the group.