자료 : SoK: Sharding on Blockchain
지난 포스트 - SoK: Sharding on Blockchain 8

7 EPOCH RECONFIGURATION 본문

Sharding protocols partition the consensus nodes into differ- ent shards, so that each shard can process the transactions in parallel, and hence improve the scalability of the whole system. However, partitioning the nodes into shards in block- chain sharding introduces new challenges when dealing with the phenomenon of the churn. For example, corrupted nodes could strategically leave and rejoin the shards, so that even- tually they can take over one of the shards and break the security guarantees of the blockchain protocol. Moreover, the adversary can actively corrupt a constant number of uncor- rupted nodes in each epoch even if no nodes join/rejoin [141]. Most current sharding protocols did not explicitly provide the approaches to deal with the epoch reconfiguration. How- ever, the epoch reconfiguration is critical to guarantee the security of blockchain system.

Clearly, to prevent attacks from the adversary, e.g, cor- rupting a specific shard, the adversary should not have the knowledge, in advance, how the partition (reconfiguration) process works. This requires that the partition process should not be affected by the adversary who do not know which participating nodes will be assigned to which shard ahead. Also, for each shard working correctly, it must guarantee that the majority of participating nodes within each shard (e.g., at least 2/3 of the shard members) are honest and fol- low the consensus protocol. One simple and naive way is to leverage the randomness, discussed in Section 5. By applying the randomness on epoch reconfiguration, the probability of one shard being bad is negligible (e.g., less than 10−7). In this section, we present several state-of-the-art schemes to deal with epoch reconfiguration, which typically rely on the (modified) epoch randomness and the specific mechanisms together. We call epoch reconfiguration and shard reconfig- uration interchangeably in this section.

7.1 Hash + Final Committee

One simple and naive approach for epoch reconfiguration is to re-elect all committees periodically faster than the ad- versary’s ability to generate the churn. A previous approach is used to generate epoch randomness [8]. However, this so- lution tolerates at most 1/6 fraction of malicious nodes and only works for a small network since it essentially bears an excessive message complexity. The cryptographic hash operations can be used to achieve the same purpose at some extent. In the last step of Elastico [100], it takes a similar but optimized approach via the final committee (or called consensus committee) to achieve epoch reconfiguration. The final committee at the final step generates a set of random strings used for next epoch. In general, Elastico consists of two main phases for epoch reconfiguration.

In the first phase of the reconfiguration, each member of the final committee chooses a r-bit random string Ri and sends a hash H (Ri ) to everyone in that committee. The final committee then runs an interactive consistency protocol to agree on a single set of hash values S [112] and broadcasts S to everyone in the network. This set S contains at least 2c/3 (where c is the size of the final committee) hash values and serves as a commitment to the random strings. In the second phase, each member of the final committee broadcasts a message containing the random string Ri itself to everyone (i.e., not just to the final committee). This phase starts only after the agreement of S is done, i.e., having 2c/3 signatures on S. This is to guarantee that honest members release their commitments only after they are sure that the committee has agreed on S and the adversary cannot change its commitment. After the second phase, each node in this system has received at least 2c/3 and at most 3c/2 pairs of Ri and H(Ri) from members of the final committee, since the honest members follow the protocol, while the malicious nodes may choose not to release their commitments. Nodes discard any random strings Ri that do not match the commitments H (Ri ). Finally, the agreed-to set S is used to configure the next epoch.

However, there exist several weaknesses in this kind of epoch reconfiguration. First, re-generating all the commit- tees is very expensive due to the large overhead of the boot- strapping protocol. Second, maintaining a separate ledger for each committee is challenging when several committee members may be fully replaced in every epoch. Third, the randomness used in each epoch can be biased by an adver- sary, and hence, compromise the committee selection process and even allow malicious nodes to precompute PoW puzzles. Besides, Elastico requires a trusted setup for generating an initial common randomness that is revealed to all parties at the same time.

7.2 DRG + PoW + Cuckoo Rule

RapidChain adopts a different approach to handle partial issues in Elastico via Cuckoo rule [9] [119]. In general, the epoch reconfiguration has three components: offline PoW, epoch randomness generation, and reconfiguration process. The reconfiguration process uses Cuckoo rule to re-organize only a subset of shard members during the reconfiguration event that shards are balanced with respect to their sizes as nodes join or leave the network.

RapidChain relies on PoW to protect against Sybil attack by requiring every node who wants to join or stay in the protocol to solve a PoW puzzle. In each epoch, a fresh puzzle is generated based on the epoch randomness so that the adversary cannot precompute the solutions ahead of the time to compromise the committees. All nodes in RapidChain solve a PoW offline without making the protocol stop and wait for the solution. Thus, the expensive PoW calculations are performed off the critical latency path. The reference committee (CR) in RapidChain is responsible to check the PoW solutions of all nodes at the start of each epoch, and then agrees on a reference block consisting of the list of all active nodes for that epoch as well as their assigned committees.

To compute an offline PoW solution, an epoch randomness generation process is needed, in which the members of the reference committee run a distributed random generation (DRG) protocol to agree on an unbiased random value. CR includes the randomness in the reference block so that other committees can randomize their epochs. RapidChain uses a verifiable secret sharing (VSS) of Feldman [67] to generate an unbiased randomness within the reference committee. Any new node who wishes to join the system can contact any node in any committees at any time and request the randomness of this epoch as a fresh PoW puzzle.

To assign the nodes to shards, it first maps each node to a random position in [0, 1) using a hash function. Then the range [0, 1) is partitioned into k regions of size k/n, and a committee is defined as the group of nodes that are assigned to O(loд(n)) regions, for some constant k. Awerbuch and Scheideler [9] propose the Cuckoo rule to ensure that the set of committees created in the range [0, 1) remain robust to join-leave attacks. Based on this rule, when a node wants to join the network, it is placed at a random position x ∈ [0, 1), while all nodes in a constant-sized interval surrounding x are moved (or cuckoo’ed) to a new random position in [0, 1). It is proved that given ε ≤ 1/2 − 1/k in a steady state, all regions of size O(loд(n))/n have O(loд(n)) nodes (i.e., they are balanced) of which less than 1/3 are faulty, with high probability, for any polynomial number of rounds.

7.3 VRF + Global Reconfiguration

Similar to Elastico, OminiLedger also runs a global recon- figuration protocol at each epoch, e.g., once a day, to allow new participants to join the protocol. The protocol generates identities and assigns participants to shards using a slow identity blockchain protocol that assumes the synchronous channels. In each epoch, a fresh randomness is generated using a bias-resistant random generation protocol that relies on a verifiable random function (VRF) [105] for unpredicat- ble leader election in a way similar to the lottery algorithm of Algorand [72]. Then, the protocol uses the elected leader as the client in the RandHound [127] protocol to generate the epoch randomness.

More specifically, at the beginning of an epoch, each valida- tor computes a ticket which contains all properly registered validators of the current epoch (e.g., as stored in the identity blockchain) and the view counter. Validators then gossip these tickets with each other for a time δ , after which they lock in the lowest-value valid ticket they have seen thus far and accept the corresponding node as the leader of the Rand- Hound protocol run. Once the validators have successfully completed a run of RandHound and the leader has broadcast randomness together with its correctness proof, each of the registered validators can verify and use this randomness to compute a permutation, and subdivide the result into ap- proximately equally-sized buckets, thereby determining the assignment of nodes to shards.

7 EPOCH RECONFIGURATION

Elastico의 epoch 재구성 프로세스는 두 단계로 진행됩니다. 첫 번째 단계에서 최종 위원회의 각 구성원은 임의 문자열을 생성하고 해시값을 위원회의 모든 구성원에게 전송합니다. 두 번째 단계에서 최종 위원회의 각 구성원은 임의 문자열 자체를 네트워크의 모든 구성원에게 전송합니다. 이 프로세스를 통해 블록체인 네트워크는 다음 epoch의 위원회를 구성할 수 있습니다.

Elastico의 epoch 재구성 프로세스는 공격자에게 영향을 받지 않도록 설계되었습니다. 공격자는 최종 위원회의 구성원을 공격하기 어렵기 때문에 epoch 재구성 프로세스를 공격하기 어렵습니다.

샤딩 프로토콜이 노드를 샤드로 분할하는 중에 노드 이탈 현상이 발생할 수 있다. 이 이탈 현상(churn)은 샤딩 프로토콜의 보안을 위협할 수 있습니다.

노드 이탈 현상(노드가 블록체인 네트워크에서 나가고 다시 가입하는 현상)이 보안을 위협하는 부분
손상된 노드를 사용하여 샤드를 장악
공격자는 손상된 노드를 사용하여 합의 프로토콜에 대한 제어권을 획득하고, 블록체인 네트워크에 허위 트랜잭션을 포함하거나, 블록체인 네트워크를 중단할 수 있다.
이탈 현상을 사용하여 블록체인 네트워크의 투명성을 손상
공격자는 손상된 노드를 사용하여 트랜잭션을 숨기거나, 트랜잭션의 순서를 변경할 수 있다.
이탈 현상을 사용하여 블록체인 네트워크의 속도를 저하시킬 수 있다.
공격자는 손상된 노드를 사용하여 트랜잭션을 처리하는 것을 지연하거나, 블록체인 네트워크의 통신을 방해할 수 있다.

따라서, 샤딩 프로토콜은 epoch 재구성과 같은 손상된 노드를 제거하고 블록체인 네트워크의 보안을 유지하는 방법이 필요하다.

이 섹션에서는 epoch 재구성 또는 샤드 재구성 프로세스를 설계하는 방법을 소개

7.1 Hash + Final Committee

(무작위성을 사용한 epoch 재구성 프로세스)

무작위성을 사용하여 epoch를 재구성하게 되면, 공격자가 프로세스를 예측할 수 없게 만들고, 악용하기 어렵게 만들 수 있다.

이러한 방법을 사용한 프로토콜이 바로, “Elastico”.

Elastico

epoch 재구성 프로세스를 사용하여 churn을 방지하는 샤딩 프로토콜

최종 위원회를 사용하여 epoch 재구성 프로세스를 수행
- 최종 위원회는 무작위성을 사용하여 다음 epoch의 위원회를 결정
이렇게 만들면 공격자가 최종 위원회를 공격하기 어렵기 때문에 epoch 재구성 프로세스를 공격하기 어렵다.

Elastico의 epoch 재구성 두 가지 단계

최종 위원회의 각 구성원은 임의 문자열을 생성하고 해시값을 위원회의 모든 구성원에게 전송
- 최종 위원회의 각 구성원은 r 비트의 임의 문자열 Ri를 선택하고 해당 위원회의 모든 사람에게 해시 H (Ri)를 보낸다.
- 최종 위원회는 다음으로 단일 해시 값 집합 S에 동의하기 위해 대화 형 일관성 프로토콜을 실행하고,
- 이 집합 S를 네트워크의 모든 사람에게 브로드캐스트한다.
  - 이 집합 S에는 최소 2c/3 (여기서 c는 최종 위원회의 크기)의 해시 값이 포함되어 있으며 임의 문자열에 대한 약속으로 작용한다.
최종 위원회의 각 구성원은 임의 문자열 자체를 네트워크의 모든 구성원에게 전송
- 최종 위원회의 각 구성원은 임의 문자열 Ri 자체를 포함하는 메시지를 모든 사람에게 브로드캐스트합니다 (즉, 최종 위원회에만 제한되지 않음).
  - 이 단계는 S의 합의가 완료된 후에만 시작된다. 즉, S에 대해 2c/3 개의 서명이 있는 경우.
  - 이는 정직한 구성원이 위원회가 S에 동의하고 공격자가 약속을 변경할 수 없음을 확신한 후에만 약속을 공개할 수 있도록 보장하기 위함.
- 두 번째 단계가 끝나면 이 시스템의 각 노드가 최종 위원회의 구성원로부터 최소 2c/3, 최대 3c/2 쌍의 Ri와 H (Ri)를 수신했다.
  - 정직한 구성원은 프로토콜을 따르지만 악의적인 노드는 약속을 공개하지 않을 수 있다. 노드는 H (Ri)와 일치하지 않는 임의 문자열 Ri를 버린다.
- 마지막으로, 합의된 집합 S는 다음 epoch을 구성하는 데 사용된다.

Elastico의 epoch 재구성 약점

그러나 이러한 epoch 재구성에는 몇 가지 약점이 있다.

첫째, 부트스트랩 프로토콜의 높은 오버헤드로 인해 모든 위원회를 다시 생성하는 데 비용이 많이 든다.
둘째, 각 epoch마다 여러 위원회 구성원이 완전히 교체될 수 있는 경우 각 위원회에 대해 별도의 원장(ledger)을 유지하는 것이 어렵다.
셋째, 각 epoch에서 사용되는 무작위성은 공격자에 의해 편향될 수 있으며, 따라서 위원회 선택 프로세스를 손상시키고 심지어 악의적인 노드가 PoW 퍼즐을 미리 계산할 수 있도록 허용할 수 있다.
또한 Elastico는 모든 당사자에게 동시에 공개되는 초기 공통 무작위성을 생성하기 위해 신뢰할 수 있는 설정을 필요로 하다.

이럴거면 안쓰는게 나을지도….

7.2 DRG + PoW + Cuckoo Rule

(RapidChain의 Cuckoo 규칙)

RapidChain은 Cuckoo 규칙을 사용하여 Elastico의 epoch 재구성 프로세스를 개선했다고 함.

Cuckoo Rule

Cuckoo 규칙에는 PoW와 DRG가 포함되어 샤드의 구성원을 재구성하는 프로세스이다.

1) PoW로 epoch 재구성 프로세스를 방해하는 것을 방지

RapidChain은 PoW를 사용하여 Sybil 공격을 방지한다.
Sybil 공격은 악의적인 노드가 많은 수의 가짜 ID를 생성하여 네트워크를 제어하려는 시도
RapidChain은 각 epoch마다 새로운 PoW 퍼즐을 생성하여 공격자가 미리 계산할 수 없도록 한다.
RapidChain의 모든 노드는 PoW 퍼즐을 오프라인에서 해결하므로 프로토콜이 중단되고 솔루션을 기다릴 필요가 없다.
- 따라서 비용이 많이 드는 PoW 계산은 중요한 지연 경로에서 수행된다.
RapidChain의 참조 위원회(CR)는 각 epoch의 시작 부분에서 모든 노드의 PoW 솔루션을 확인하고 해당 epoch의 모든 활성 노드 목록과 할당된 위원회로 구성된 참조 블록에 동의한다.

2) 분산형 난수 생성(DRG) 프로토콜

분산형 난수 생성(DRG)는 여러 대의 컴퓨터가 함께 작동하여 난수를 생성하는 프로세스이다.

RapidChain의 오프라인 PoW 솔루션을 계산하려면 epoch 무작위성 생성 프로세스가 필요한데, 여기서 참조 위원회의 구성원들은 분산형 난수 생성(DRG) 프로토콜을 실행하여 편향되지 않은 난수값에 동의하게 된다.

이 때, 참조위원회는 VSS(검증 가능한 비밀 공유, 여러 대의 컴퓨터가 함께 작동하여 비밀을 공유하는 프로세스)를 이용해서 난수를 생성한다.

Cuckoo Rule 동작방식

각 노드를 [0, 1)의 임의의 위치로 매핑하는 해시 함수를 사용하여 노드를 샤드에 할당
- Cuckoo 규칙에 따라 네트워크에 가입하려는 노드는 임의의 위치 x ∈ [0, 1)에 배치
그런 다음 범위 [0, 1)을 k/n 크기의 k 영역으로 분할
- 그 후, x 주변의 일정 크기의 간격에 있는 모든 노드는 새로운 임의의 위치 [0, 1)로 이동
k라는 상수에 대해 O(loд(n)) 영역에 할당된 노드 그룹으로 위원회를 정의
- ε ≤ 1/2 − 1/k가 안정 상태에서 주어질 경우, 모든 O(loд(n))/n 크기의 영역에는 O(loд(n)) 개의 노드가 있으며, 그중 고정확률로 3분의 1 미만이 고장
  즉, 다항 시간 동안 고장날 노드의 수가 적어도 1/3 미만

Cuckoo 규칙은 join-leave 공격에 강력하다.

join-leave 공격: 악의적인 노드가 네트워크에 참여하거나 탈퇴함으로써 네트워크를 공격하는 시도

7.3 VRF + Global Reconfiguration

(OminiLedger의 글로벌 재구성 프로토콜)

OminiLedger는 Elastico와 유사하게 각 epoch마다 글로벌 재구성 프로토콜을 실행한다.

이 프로토콜은 VRF를 사용하여 리더를 무작위로 선택하고, RandHound를 사용하여 epoch 무작위 값을 생성한다.

VRF: 검증 가능한 무작위 함수 RandHound: 무작위 값을 생성하는 데 사용되는 프로토콜

글로벌 재구성 프로토콜 동작방식

epoch의 시작 부분에서 각 검증자는 현재 epoch의 모든 정상적으로 등록된 검증자 (예: 신원 블록체인에 저장된 검증자)와 뷰 카운터(view counter)를 포함하는 티켓(validator가 epoch마다 생성하는 토큰)을 계산한다.
view counter: epoch의 진행 상황을 추적하는 카운터
그런 다음, 검증자들은 δ시간 동안 이러한 티켓들을 서로 교환하며, 그 중에서 지금까지 본 가장 작은 값을 가지고 있는 유효한 티켓을 고정시키고 해당 노드를 RandHound 프로토콜의 리더로 인정한다.
- 검증자들이 RandHound를 성공적으로 실행하고 리더가 무작위성과 그의 정당성 증명을 브로드캐스트.
- 가장 작은 값을 가지고 있는 유효한 티켓을 왜 고정하는가?
각 등록된 검증자는 이 무작위성을 검증하고 사용하여 순열을 계산하고 결과를 거의 동일한 크기의 버킷으로 나누어 노드를 샤드에 할당

다음에 계속…

BlockChain Core - Sharding on Blockchain 9

7 EPOCH RECONFIGURATION

7.1 Hash + Final Committee

Elastico

Elastico의 epoch 재구성 두 가지 단계

Elastico의 epoch 재구성 약점

7.2 DRG + PoW + Cuckoo Rule

Cuckoo Rule

1) PoW로 epoch 재구성 프로세스를 방해하는 것을 방지

2) 분산형 난수 생성(DRG) 프로토콜

Cuckoo Rule 동작방식

7.3 VRF + Global Reconfiguration

글로벌 재구성 프로토콜 동작방식

관련된 글

BlockChain Core - Sharding on Blockchain 8

BlockChain Core - Sharding on Blockchain 10

BlockChain Core - Sharding on Blockchain 11