Wireless Coded Computation With Error Detection

In wireless networks with distributed computing, the computational performance is limited by stragglers. To mitigate the stragglers’ effect, coded computation is adopted through computational redundancy. Moreover, in wireless transmission, transmission errors may occur due to noise, channel fading and so on. Existing works design coded computation and error detection separately. However, this leads to frequent encoding and inefficient allocation. In this paper, we propose a joint computation and transmission coding (JCTC) scheme to design coded computation and error detection jointly. The coded computation is based on the Luby transform (LT) code, and linear error-detecting codes are applied for the re-transmission mechanism. To achieve low dynamic encoding, two-layer encoding is adopted. Then, the performance of the JCTC scheme is analyzed in terms of latency and computation reliability. Finally, to achieve efficient task and redundancy allocation, the wireless LT coded computation with error detection (WLTCC-ED) algorithm is given from both iterative and low-complexity perspectives. Theoretical analysis and numerical simulations show that the proposed JCTC scheme has significant advantages over separate designs.

To address the stragglers' effect, a new framework named coded computation [6] was proposed. Inspired by classical coding theory, the authors in [6] applied maximum distance separable (MDS) code to speed up the distributed matrix multiplications by introducing necessary computational redundancy in the homogeneous networks with nodes of uniform computation capabilities. Without waiting for the responses from all the nodes, the desired computational results could be recovered only using some fast-responding nodes. It implied that MDS coding scheme could reduce the computation time significantly and achieve an order-wise improvement over the original uncoded scheme. The authors in [7] and [8] further studied the corresponding scheme to allocate the optimal computational task for nodes in the heterogeneous networks with disparate computation capabilities.
Compared to MDS code with a fixed rate, Luby transform (LT) code offers the rateless property and lower decoding complexity. Using the rateless property, the corresponding LT coding scheme was proposed in [9]. Through sub-block division, this scheme could exploit the computed results from all the nodes including stragglers. It led to negligible redundant computation and maximum straggler tolerance for a lower latency. Moreover, the authors in [10] showed the LT coding scheme could further reduce the computation latency at the expense of an increased communication load.
In wireless distributed networks, transmission latency also has an important effect on the performance. For homogeneous wireless networks, the authors in [11] analyzed the performance of MDS coding scheme from the total latency's point of view. With packet losses due to channel fading, the work of [12] investigated the performance of total latency and provided guidelines to design optimal MDS code. For heterogeneous wireless networks, the authors in [13] proposed wireless coded computation scheme to deal with both computation and transmission stragglers. Then, the authors in [14] further exploited the computed results of stragglers in wireless networks based on block-division. As for LT coding scheme, the work of [15] proposed block-design based wireless LT coded computation scheme to balance both computation and transmission latency.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
The works of [19] and [20] discussed the convolution and the regression problem, respectively, and the distributed computing problem of arbitrary functions was studied in [21]. For the scenario where the exact computational result was not required, the authors in [22], [23], and [24] provided a strategy of approximating coded distributed computing to realize a tradeoff between accuracy and speed. As for a more practical distributed network setup, the heterogeneous multi-hop network was considered in [25] and the work of [26] studied multiple distributed matrix multiplication tasks in a multi-master heterogeneous-worker scenario. The deployment of coded computation in wireless edge computing was discussed in [27], [28], and [29].
Most existing works on coded computation disregarded the transmission errors in wireless networks, which may lead to a severe performance degradation. For example, although the works of [11], [13], and [15] considered the transmission latency in wireless distributed networks, they ignored the negative effects caused by the wireless channel. The works of [12] and [14] only analyzed the performance of wireless coded computation with packet losses and did not discuss data errors, which are more complex and more common in practice. Also, they did not give a design of wireless coded computation with high reliability. For accuracy-sensitive computational tasks, results with low reliability are intolerable. The error-detecting mechanism based on re-transmission [30] was applied to address this. However, coded computation and error detection were designed separately. Such a separate design has the following problems.
1) Frequent Encoding. Each time the input data changes, the corresponding computed result has to be encoded again before transmission, which leads to excessive encoding tasks and a huge encoding latency. For delay-sensitive computational tasks, the high latency causes an intolerable performance degradation. Such a scheme is therefore not suitable for highly dynamic scenarios.
2) Inefficient Allocation. The error-detecting redundancy is designed locally without the global network parameters, while the computational tasks are allocated in the fusion center without considering the re-transmission overhead and reliability. For example, if a worker with good computation and transmission capabilities is in a very bad channel condition, this worker will still be allocated many computational tasks in the existing separate designs. This may lead to a huge re-transmission latency or an unsatisfactory reliability. Thus, this allocation is neither efficient nor optimal.
To address the above issues, we propose to design coded computation and error detection jointly. Specifically, we first give the new joint computation and transmission coding (JCTC) scheme. The coded computation is based on LT code, while the error-detecting mechanism is based on re-transmission using linear codes. Then, its performance is analyzed in terms of latency and computation reliability. Finally, within the required computation reliability, sub-optimal efficient task and redundancy allocation strategies are obtained based on an iterative optimization algorithm and a low-complexity algorithm, respectively. The main contributions of this paper are summarized as follows:
• Joint Coding Design. The two-layer encoding is performed at the fusion center so that both computation coding and transmission coding can be done offline. Also, low dynamic encoding can be achieved no matter how frequently the input data changes.
• Performance Benefits. Compared with the separate designs with the same computation reliability, our JCTC scheme can achieve less encoding and lower computation latency. Besides, latency and computation reliability can be balanced well in the proposed scheme.
• Efficient Task and Redundancy Allocation. Within the required computation reliability, sub-optimal efficient task and redundancy allocation can be obtained at the fusion center by the wireless LT coded computation with error detection (WLTCC-ED) algorithm based on iterative optimization to realize a tradeoff between computation and transmission latency. For the scenario of low error rate, an approximate algorithm is proposed to simplify the solving process with a lower complexity.
Organization: The rest of this paper is organized as follows. In Section II, the wireless LT coded computation is reviewed and the drawbacks of the existing separate designs are discussed. The proposed JCTC scheme is presented in Section III. Then, the latency and computation reliability of the JCTC scheme are analyzed in Section IV. In Section V, the sub-optimal task and redundancy allocation strategy is obtained through iterative optimization, and a low-complexity algorithm is given for the scenario of low error rate. Simulation results are shown in Section VI and the conclusion is presented in Section VII.

II. SYSTEM MODEL
We consider a classical distributed master-worker setup [6], [7] in a wireless network, as shown in Fig. 1. The whole network consists of one master and $n$ workers that have different computation and transmission capabilities. The goal is to compute a matrix-vector multiplication $y = Ax$ wirelessly and reliably at the master with the help of the workers, where $A \in \mathbb{F}_{2^q}^{m \times d}$ is a pre-stored matrix in this distributed network, $x \in \mathbb{F}_{2^q}^{d}$ is an input vector that is broadcast to each worker by the master, and $y \in \mathbb{F}_{2^q}^{m}$ is the output vector. To speed up the computational tasks in heterogeneous wireless networks, the rateless LT coded computation [9] is applied. In the LT coding approach, the master first generates the encoded matrix $\tilde{A} \in \mathbb{F}_{2^q}^{\alpha m \times d}$ ($\alpha > 1$, and $\alpha$ can be very large to achieve the rateless property) by treating the $m$ rows of $A$ as source symbols according to the robust soliton degree distribution [31]. Dividing $\tilde{A}$ equally by rows, the data block $\tilde{A}_i \in \mathbb{F}_{2^q}^{l \times d}$ ($l = \alpha m / n$) will be assigned to worker $i$, $i \in [n]$. Then, in order to further utilize the rateless property of LT code and the partial work done by stragglers, the rows of $\tilde{A}_i$ are divided again by the master into sub-blocks of the same size as $\{\tilde{A}_{i,j} \in \mathbb{F}_{2^q}^{b_i \times d}\}_{j=1}^{\lceil l/b_i \rceil}$ and each data sub-block will be stored in the corresponding worker, where $b_i$ denotes the data sub-block size for worker $i$, i.e., each data sub-block includes $b_i$ inner products to be calculated. For the traditional LT coding approach, the size of data sub-blocks cannot be too large and a fine-grained dividing strategy is usually adopted, i.e., $b_i = 1$, $i \in [n]$.
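The LT encoding step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `robust_soliton` and `lt_encode`, the tuning parameters `c` and `delta`, and the use of a fixed random seed are all hypothetical. Elements of $\mathbb{F}_{2^q}$ are represented as $q$-bit integers, so field addition is a bitwise XOR.

```python
import math
import random


def robust_soliton(m, c=0.1, delta=0.5):
    """Robust soliton pmf over degrees 1..m (list index d-1 holds degree d).

    c and delta are tuning parameters chosen here only for illustration.
    """
    R = c * math.log(m / delta) * math.sqrt(m)
    spike = max(1, min(m, int(round(m / R))))
    # Ideal soliton part.
    rho = [1.0 / m] + [1.0 / (d * (d - 1)) for d in range(2, m + 1)]
    # Extra mass on small degrees plus one spike near m / R.
    tau = [0.0] * m
    for d in range(1, spike):
        tau[d - 1] = R / (d * m)
    tau[spike - 1] = max(R * math.log(R / delta) / m, 0.0)
    total = sum(rho) + sum(tau)
    return [(r + t) / total for r, t in zip(rho, tau)]


def lt_encode(A, alpha, seed=0):
    """Generate ceil(alpha * m) coded rows from the m rows of A.

    Each coded row is the XOR (addition in a field of characteristic 2)
    of a random set of source rows whose size follows the robust soliton pmf.
    Returns the coded rows and, for each, the sorted source-row indices used.
    """
    rng = random.Random(seed)
    m = len(A)
    pmf = robust_soliton(m)
    degrees = list(range(1, m + 1))
    coded, neighbors = [], []
    for _ in range(math.ceil(alpha * m)):
        d = rng.choices(degrees, weights=pmf, k=1)[0]
        idx = rng.sample(range(m), d)
        row = [0] * len(A[0])
        for s in idx:
            row = [u ^ v for u, v in zip(row, A[s])]
        coded.append(row)
        neighbors.append(sorted(idx))
    return coded, neighbors
```

The coded rows would then be split into per-worker blocks and sub-blocks of size $b_i$ as described in the text; that partitioning is a plain slicing step and is omitted here.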
After receiving the input $x$, worker $i$ starts to compute $\{\tilde{A}_{i,j} x\}_{j=1}^{\lceil l/b_i \rceil}$. When a partial result $\tilde{A}_{i,j} x$ is done, worker $i$ can transmit it to the master as soon as possible instead of waiting for the complete result $\tilde{A}_i x$. A worker that finishes its computation earlier transmits its data sub-block earlier, and only one worker can transmit its result at a time. Due to severe channel fading, noise and so on, data errors may occur during the transmission. We model these transmission errors as a binary symmetric channel with a fixed bit error probability $\varepsilon_i$ for worker $i$, and the channel transition probability matrix $H_i$ of worker $i$ is given as
$$H_i = \begin{bmatrix} 1-\varepsilon_i & \varepsilon_i \\ \varepsilon_i & 1-\varepsilon_i \end{bmatrix}, \tag{1}$$
where $\varepsilon_i$ can be obtained on the basis of the number of erroneous symbols when transmitting a reference signal, or by other channel estimation techniques [32], [33].
As for an inner product transmitted by worker $i$, assume that each bit error occurs independently and there is an error in the inner product if at least one bit is erroneous [30]. Then, the error probability for the inner product can be obtained by $\varepsilon_{q,i} = 1 - (1 - \varepsilon_i)^q$, where each inner product is represented by $q$ bits. In order to avoid these transmission errors, the re-transmission mechanism [30] is considered. By introducing error-detecting redundancy, some transmission errors can be detected, and re-transmission is required for the corresponding sub-block to ensure the reliability of transmission. Furthermore, because of the limited bandwidth of the wireless channel, a uniform maximum number of sub-blocks that can be transmitted successfully from each worker is pre-allocated to avoid frequent interaction between workers and the master, which is denoted as a constant $k$. In other words, each worker can send up to $k$ sub-blocks to the master to prevent excessive occupation of channel resources.
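The relation between the bit error probability and the inner-product error probability can be checked numerically. The sketch below is illustrative only: the function names and the Monte Carlo setup are assumptions, and the simulation simply flips each of the $q$ bits independently with probability $\varepsilon_i$, as in the binary symmetric channel model above.

```python
import random


def word_error_prob(eps, q):
    """Closed form: a q-bit inner product is wrong if at least one bit flips."""
    return 1.0 - (1.0 - eps) ** q


def simulate_word_error_rate(eps, q, trials, seed=0):
    """Empirical fraction of q-bit words hit by at least one bit flip."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        # Each bit is flipped independently with probability eps.
        if any(rng.random() < eps for _ in range(q)):
            errors += 1
    return errors / trials
```

For a small bit error probability, the closed form is close to $q\,\varepsilon_i$, which is why longer bit representations make each inner product noticeably more fragile.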
Once receiving a sub-block, the master detects whether there are any transmission errors. If transmission errors are detected, the corresponding sub-block will be re-transmitted; otherwise, the master will accept this sub-block and decode it. According to the decoding features of LT code, the master can recover the desired computational result $y$ successfully once any $(1 + \eta)m$ accepted data inner products are received from all the workers, where $\eta$ is a small decoding overhead ($\eta \to 0$ as $m \to \infty$).
From the above discussion, the existing schemes design coded computation and error detection separately, as illustrated in Fig. 2. This has the following drawbacks.
1) Frequent and high dynamic encoding. The pre-stored model matrix $A$ changes slowly (low dynamic), while the input vector $x$ is highly dynamic in many machine learning and big data applications [34]. Each time the input data changes, the corresponding computed result has to be encoded again by each worker before transmission, which causes burdensome encoding tasks and a huge encoding latency. Such a severe performance degradation is intolerable for delay-sensitive computational tasks.
2) Inefficient task and redundancy allocation. The computational task is allocated in the fusion center without considering the re-transmission latency and reliability, while the error-detecting redundancy is designed by each worker locally without the global network parameters at the fusion center. In other words, the data sub-block size $b_i$, $i \in [n]$, is designed by the master but the error-detecting redundancy $r_i$ is decided by worker $i$. The design of the whole sub-block size is fragmented and inefficient. For example, if a worker with good computation and transmission capabilities is in a very bad channel condition, this worker will still be allocated a large sub-block size in the existing separate designs, which leads to a huge re-transmission latency or an unsatisfactory reliability. So this allocation strategy is not optimal.

III. JOINT COMPUTATION AND TRANSMISSION CODING DESIGN
In order to overcome the drawbacks of separate designs, the JCTC scheme is proposed. It performs both coded computation and error-detecting coding in the fusion center, as shown in Fig. 3, to achieve low dynamic encoding and efficient allocation. The specific process can be described as follows.
1) Two-Layer Encoding. As shown in Fig. 3a, the master first encodes $A$ for computation to speed up matrix multiplication. After that, the encoded and divided sub-block $\tilde{A}_{i,j}$ is encoded again for error detection to ensure the reliability of inner products during transmission. In this paper, a linear error-detecting code is applied. The corresponding data sub-block after error-detecting encoding is denoted as $\tilde{A}''_{i,j} \in \mathbb{F}_{2^q}^{b_i \times d}$, and $S_{i,j} \in \mathbb{F}_{2^q}^{r_i \times d}$ represents the additional redundancy for error detection, where $r_i$ is the size of redundancy in each sub-block for worker $i$. Together, $\tilde{A}''_{i,j}$ and $S_{i,j}$ constitute the two-layer encoded sub-matrix $\tilde{A}'_{i,j} \in \mathbb{F}_{2^q}^{(b_i + r_i) \times d}$. Then, the master will send $\tilde{A}'_{i,j}$ to worker $i \in [n]$.
2) Distributed Computing and Serial Transmitting. After receiving $\tilde{A}'_{i,j}$, worker $i$ computes the matrix multiplication $\tilde{A}'_{i,j} x$ and then sends the corresponding computed results back to the master. The total computation time for worker $i$ is denoted as a random variable $T_i^{\mathrm{cmp}}$.
3) Error Detection and Re-transmission. The transmission from workers to the master may incur transmission errors. Once receiving the sub-blocks transmitted by workers, the master will perform error detection. For a received sub-block, if it contains no error, the master will accept it directly; if it contains a detectable error pattern, the corresponding sub-block will be re-transmitted; if it contains an undetected error pattern, the master will also accept it with the undetected transmission error, which means the master commits a decoding error and decreases the reliability of the whole network. For worker $i$, the total time spent on transmitting computed sub-blocks until accepted by the master is denoted as $T_i^{\mathrm{trn}}(t_c)$, where $t_c$ is the given computation time. For the whole network, the number of data inner products with undetected errors is represented as $N_{\mathrm{un}}$.
4) Recovering Desired Result. After receiving enough accepted data inner products, the master is able to recover the desired result $Ax$.
Fig. 4. A simple example of the JCTC scheme with a master and 3 workers. After the two-layer encoding, $A'_i$ is allocated to worker $i$. During the first transmission for worker 1, there is a transmission error. Thus, worker 1 re-transmits its computed result. Worker 2 is a straggler, which slows down the whole network. With the help of coded computation, the master can recover $Ax$ without waiting for worker 2.
To facilitate the understanding, a simple example is given as follows.
Example 1: As illustrated in Fig. 4, a wireless distributed network with one master and three workers is considered. The corresponding steps in the JCTC scheme can be described as follows.
1) Two-Layer Encoding. Matrix $A$ is partitioned into 2 submatrices: $A_1$ and $A_2$. Each submatrix contains two row vectors. Then, the two-layer encoded matrices $A'_1$, $A'_2$, and $A'_3$ are generated, where $A'_3 = A'_1 + A'_2$, and each is sent to a corresponding worker by the master. These encoded matrices contain three row vectors, where the third one is used for error detection and is obtained by summing up the first two row vectors.
2) Distributed Computing and Serial Transmitting. After receiving the input vector $x$ broadcast from the master, each worker multiplies $x$ with its two-layer encoded matrix and transmits the computed result back to the master.
3) Error Detection and Re-transmission. The master then checks whether transmission errors occur. For instance, there is a transmission error during the first transmission of worker 1 if the third received entry does not equal the sum of the first two received entries. Thus, worker 1 re-transmits the computed result and the second transmitted result is accepted by the master.
4) Recovering Desired Result. Because of the outage of worker 2, the master only receives the results $A'_1 x$ and $A'_3 x$ from workers 1 and 3, respectively. By subtracting $A'_1 x$ from $A'_3 x$, the master can recover $A'_2 x$ and hence $Ax$ without waiting for the slowest worker.
This example implies that our JCTC scheme can not only mitigate the stragglers' effect, but also achieve the low dynamic encoding because of the two-layer encoding strategy for A. No matter how the input vector x changes, the master can still detect the transmission errors.And workers never perform the error-detecting coding before transmission.
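The steps of Example 1 can be traced with a few lines of code. The sketch below is a toy illustration only: it uses ordinary integer arithmetic instead of $\mathbb{F}_{2^q}$, the concrete matrix entries and input vector are made up, and all function names are hypothetical.

```python
def add_checksum_row(block):
    """Second encoding layer: append the sum of the rows as a checksum row."""
    checksum = [sum(col) for col in zip(*block)]
    return [list(row) for row in block] + [checksum]


def matvec(M, x):
    """Each output entry is one inner product of a row of M with x."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def passes_check(result):
    """Master-side check: last entry must equal the sum of the data entries."""
    return result[-1] == sum(result[:-1])


def run_example():
    A1 = [[1, 2], [3, 4]]          # first half of A (toy values)
    A2 = [[5, 6], [7, 8]]          # second half of A (toy values)
    x = [2, -1]                    # input vector broadcast by the master
    C1 = add_checksum_row(A1)      # sent to worker 1
    C2 = add_checksum_row(A2)      # sent to worker 2 (the straggler)
    C3 = [[u + v for u, v in zip(r1, r2)] for r1, r2 in zip(C1, C2)]  # worker 3
    y1, y3 = matvec(C1, x), matvec(C3, x)
    first_try = y1.copy()
    first_try[0] += 1              # transmission error on worker 1's first attempt
    detected = not passes_check(first_try)
    accepted = passes_check(y1) and passes_check(y3)  # retransmission and worker 3
    y2 = [c - a for a, c in zip(y1, y3)]              # recover worker 2's result
    Ax = y1[:-1] + y2[:-1]         # drop checksum entries, keep data entries
    return detected, accepted, Ax
```

Note that the checksum rows depend only on the stored matrix, so they are computed once at the master; the workers only perform matrix-vector products, and the check at the master keeps working for any input vector.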
As matrix multiplication is one of the key and fundamental computational tasks underlying machine learning and big data analytics, our proposed JCTC scheme also has potential applications in those areas. For example, convolutional neural networks (CNN) convolve their input data with kernels in each layer [35]. With regard to $m$ kernels, $m$ convolutions need to be computed and each convolution operation can be performed as an inner product of two vectors. In other words, the matrix $A$ consists of $m$ kernels and the vector $x$ represents the input to the neural network. As another example, the encoders in the Transformer perform the matrix calculations of self-attention [36]. The system matrix $A$ represents the trained weight matrices and the input represents the embeddings. Then, the output query, key, and value matrices can be produced through multiplications. Also, the proposed JCTC scheme can be extended to more complex wireless environments, once the transition probability (or the channel bit error rate) of each worker is obtained by the reference signal or other channel estimation techniques.
According to the proposed JCTC scheme, we can define the following metrics to evaluate the performances of the whole networks.
Definition 1 (Computation Latency): The computation latency, denoted as $T^{\mathrm{cmp}}$, is the time spent on calculating $(1 + \eta)m$ data inner products for the whole network. $T^{\mathrm{cmp}}$ is a random variable and can be given as: where all the random variables $T_1^{\mathrm{cmp}}, T_2^{\mathrm{cmp}}, \ldots, T_n^{\mathrm{cmp}}$ are assumed to be mutually independent.
Definition 2 (Transmission Latency): The transmission latency, denoted as $T^{\mathrm{trn}}$, is the time spent on transmitting $(1 + \eta)m$ accepted data inner products for the whole network. $T^{\mathrm{trn}}$ is a random variable related to $T^{\mathrm{cmp}}$ and can be given as:
$$T^{\mathrm{trn}} = \sum_{i=1}^{n} T_i^{\mathrm{trn}}\left(T^{\mathrm{cmp}}\right). \tag{3}$$
Definition 3 (Computation Reliability): The computation reliability for the whole network, denoted as $R^{\mathrm{cmp}}$, represents the ratio of correct data inner products to all the $(1 + \eta)m$ data inner products accepted by the master, which can be given as:
$$R^{\mathrm{cmp}} = 1 - \frac{N_{\mathrm{un}}}{(1 + \eta)m}. \tag{4}$$
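Definitions 2 and 3 translate directly into two small helper functions. The sketch below is illustrative: the function names are hypothetical, and it assumes that the transmission latency is the sum of the per-worker transmission times (workers transmit one at a time) and that the reliability is the fraction of the $(1+\eta)m$ accepted inner products that carry no undetected error.

```python
def transmission_latency(per_worker_times):
    """Total transmission latency under serial transmission.

    per_worker_times[i] is the time worker i spends transmitting its
    sub-blocks until they are accepted (an input here, not modeled).
    """
    return sum(per_worker_times)


def computation_reliability(n_undetected, m, eta):
    """Ratio of correct inner products to the (1 + eta) * m accepted ones."""
    accepted = (1 + eta) * m
    return 1.0 - n_undetected / accepted
```

For instance, with $m = 1000$, $\eta = 0.05$ and 21 inner products carrying undetected errors, the reliability is $1 - 21/1050 = 0.98$.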

IV. PERFORMANCE ANALYSIS
In this section, we will analyze the performance of the JCTC scheme. First, the bounds of the expected computation latency and the expectation of the transmission latency will be obtained. Then, we will discuss the factors that influence the computation reliability and present the constraint of computation reliability. At last, the superiority of the JCTC scheme will be shown.

A. Latency Analysis
1) Computation Latency: Due to the sub-block division and the error-detecting coding at the master, the time of computing $j$ sub-blocks, i.e., $j(b_i + r_i)$ inner products, is denoted as a random variable $T_{i,j}^{\mathrm{cmp}}$. The cumulative distribution function (CDF) of $T_{i,j}^{\mathrm{cmp}}$ can be described as a shifted exponential distribution [6]:
$$\Pr\left[T_{i,j}^{\mathrm{cmp}} \le t\right] = 1 - \exp\left(-\frac{\mu_i^{\mathrm{cmp}}}{j(b_i + r_i)}\bigl(t - j(b_i + r_i)a_i\bigr)\right) \tag{5}$$
for $t \ge j(b_i + r_i)a_i$ and $j \le k$, where $\mu_i^{\mathrm{cmp}}$ and $a_i$ denote the straggling and shift parameters, respectively, determined by the computation capability of worker $i$. This latency model fits the distribution of computation time in cloud computing environments well. From Eq. (5), we can observe that
$$T_i^{\mathrm{cmp}} = c_i(b_i + r_i)\left(\bar{T}_i^{\mathrm{cmp}} + a_i\right), \tag{6}$$
where the random variable $\bar{T}_i^{\mathrm{cmp}}$ is exponentially distributed with rate parameter $\mu_i^{\mathrm{cmp}}$, representing the initial setup time at worker $i$ before actually beginning to compute an inner product, and $c_i$ is the number of sub-blocks computed completely by worker $i$ before a total of $(1 + \eta)m$ accepted data inner products is completed in the network. In Eq. (6), $\bar{T}_i^{\mathrm{cmp}} + a_i$ indicates the time spent on computing one inner product by worker $i$.
As mentioned in our JCTC scheme, workers perform their computations with sub-block division. The number of sub-blocks calculated by worker $i$ until the given computation time $t_c$ is denoted as $x_i(t_c)$ in the following lemma.
Lemma 1: With a given computation time $t_c$, the average number of sub-blocks calculated by worker $i$ can be derived as
Proof: See Appendix A. ■
According to Lemma 1, we can find that $c_i = x_i(T^{\mathrm{cmp}})$ and should be satisfied so that the master can recover the desired result successfully.
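The quantity in Lemma 1 can also be estimated by simulation. The sketch below is a Monte Carlo illustration under explicit assumptions, not the closed form of the lemma: it assumes that worker $i$ draws one exponential setup time with rate $\mu_i^{\mathrm{cmp}}$, that each inner product then takes that setup time plus the shift $a_i$, and that at most $k$ sub-blocks of $b_i + r_i$ inner products are computed. All function and parameter names are hypothetical.

```python
import random


def sample_subblocks_done(t_c, b, r, mu, a, k, rng):
    """Number of complete sub-blocks finished by time t_c for one random draw."""
    setup = rng.expovariate(mu)            # exponential part, mean 1 / mu
    per_block = (b + r) * (setup + a)      # time for one sub-block (requires a > 0)
    return min(k, int(t_c // per_block))   # only complete sub-blocks count


def average_subblocks_done(t_c, b, r, mu, a, k, trials=10000, seed=0):
    """Monte Carlo estimate of the average number of sub-blocks done by t_c."""
    rng = random.Random(seed)
    total = sum(sample_subblocks_done(t_c, b, r, mu, a, k, rng)
                for _ in range(trials))
    return total / trials
```

Because the shift $a_i$ is deterministic, no sub-block can finish before $(b_i + r_i)a_i$, so the estimate is exactly zero for any earlier $t_c$, and it saturates at $k$ for large $t_c$.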
For the heterogeneous networks, the order statistics cannot be used to describe the computation latency due to sub-block division and disparate capabilities of workers, so that the exact expression of $E[T^{\mathrm{cmp}}]$ is hard to obtain. So the lower and upper bounds of computation latency are discussed in the following.
Lemma 2: The lower bound of $T^{\mathrm{cmp}}$ is given as where $T_{(1)}^{\mathrm{cmp}}$ is the first order statistic that follows an exponential distribution with rate parameter $\mu_g^{\mathrm{cmp}}$.
Proof: See Appendix B. ■
As a result, the expected lower bound can be described in the following proposition.
Proposition 1 (The Expected Lower Bound of Computation Latency): In the JCTC scheme, the expected lower bound of computation latency $L^{\mathrm{cmp}}$ can be given as
Proof: Based on Lemma 2 and the characteristics of order statistics, $L^{\mathrm{cmp}}$ can be obtained by taking the expectation of (8).
Lemma 3: The upper bound of $T^{\mathrm{cmp}}$ is given as ■
As a result, the expected upper bound can be described in the following proposition.
Proposition 2 (The Expected Upper Bound of Computation Latency): In the JCTC scheme, the expected upper bound of computation latency $U^{\mathrm{cmp}}$ can be given as
Proof: Based on Lemma 3, $U^{\mathrm{cmp}}$ can be obtained by taking the expectation of (10).
[11], one has from Eq. (9) and Eq. (11). Consider a scenario where the channel condition is so good that since the error-detecting redundancy is not required.
2) Transmission Latency: Due to the instability of the wireless channel and the disparate transmission capabilities, it is assumed that the transmission times of single inner products are mutually independent and exponentially distributed [12], [37] with rate parameter $\mu_i^{\mathrm{trn}}$, which represents the transmission capability of worker $i$. In our JCTC scheme, a sub-block with detectable transmission errors is required to be re-transmitted. We denote the total number of times a sub-block is transmitted by worker $i$ as $k_{\mathrm{re},i}$, which follows a geometric distribution that can be given as:
$$\Pr\left[k_{\mathrm{re},i} = u\right] = (1 - p_{s,i})^{u-1} p_{s,i}, \quad u = 1, 2, \ldots, \tag{12}$$
where $p_{s,i}$ is the success probability, whose detailed expression will be discussed in Section IV-B. Then, the time $T_i^{\mathrm{trn}}(T^{\mathrm{cmp}})$ spent on transmitting $c_i$ sub-blocks can be obtained by
$$T_i^{\mathrm{trn}}\left(T^{\mathrm{cmp}}\right) = \sum_{\kappa=1}^{c_i(b_i + r_i)} T_{i,\mathrm{re},(\kappa^{\mathrm{th}})}^{\mathrm{trn}} \tag{13}$$
for $c_i > 0$, where $T_{i,(\kappa^{\mathrm{th}})}^{\mathrm{trn}}$ is the time for the result of the $\kappa$-th inner product transmitted by worker $i$ and $T_{i,\mathrm{re},(\kappa^{\mathrm{th}})}^{\mathrm{trn}} = \sum_{u=1}^{k_{\mathrm{re},i}} T_{i,(\kappa^{\mathrm{th}})}^{\mathrm{trn}}$ is the transmission time of the $\kappa$-th inner product until it is accepted by the master. Obviously, $T_i^{\mathrm{trn}}(T^{\mathrm{cmp}}) = 0$ if $c_i = 0$, which means that worker $i$ has no completed sub-block to transmit by the time $T^{\mathrm{cmp}}$. In the following lemma, we state the statistical property of $T_{i,\mathrm{re},(\kappa^{\mathrm{th}})}^{\mathrm{trn}}$.
Lemma 4: The random variable $T_{i,\mathrm{re},(\kappa^{\mathrm{th}})}^{\mathrm{trn}}$ follows an exponential distribution with rate parameter $p_{s,i}\mu_i^{\mathrm{trn}}$, i.e., $\Pr\left[T_{i,\mathrm{re},(\kappa^{\mathrm{th}})}^{\mathrm{trn}} \le t\right] = 1 - e^{-p_{s,i}\mu_i^{\mathrm{trn}} t}$.
Proof: See Appendix D. ■
Based on Lemma 4, we present another lemma to show the expectation of $T_i^{\mathrm{trn}}(t_c)$ in the following.
Lemma 5: The expectation $E[T_i^{\mathrm{trn}}(t_c)]$ can be given as
Proof: See Appendix E. ■
As a result, the expectation of transmission latency can be given in the following proposition.
Proposition 3 (The Expectation of Transmission Latency): In the JCTC scheme, the expectation of transmission latency $E[T^{\mathrm{trn}}]$ for the whole networks can be given as
Proof: According to Lemma 5 and Eq. (3), $E[T^{\mathrm{trn}}]$ can be obtained by substituting $t_c = T^{\mathrm{cmp}}$ into Eq. (15) and summing it over all $i \in [n]$. ■
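Lemma 4 states that the time until an inner product is accepted is again exponentially distributed, with the rate scaled by the success probability. A quick simulation can make this plausible. The sketch below is illustrative only: it assumes that every transmission attempt takes an independent exponential time with rate $\mu_i^{\mathrm{trn}}$ and succeeds independently with probability $p_{s,i}$; the function names and parameters are hypothetical.

```python
import math
import random


def sample_accept_time(p_s, mu_trn, rng):
    """Total time of repeated transmissions until the first accepted one."""
    total = 0.0
    while True:
        total += rng.expovariate(mu_trn)   # one transmission attempt
        if rng.random() < p_s:             # accepted with probability p_s
            return total


def mean_accept_time(p_s, mu_trn, trials=20000, seed=0):
    """Empirical mean of the time until acceptance."""
    rng = random.Random(seed)
    return sum(sample_accept_time(p_s, mu_trn, rng) for _ in range(trials)) / trials


def empirical_cdf(t, p_s, mu_trn, trials=20000, seed=0):
    """Empirical probability that the time until acceptance is at most t."""
    rng = random.Random(seed)
    hits = sum(sample_accept_time(p_s, mu_trn, rng) <= t for _ in range(trials))
    return hits / trials


def lemma4_cdf(t, p_s, mu_trn):
    """CDF stated in Lemma 4: exponential with rate p_s * mu_trn."""
    return 1.0 - math.exp(-p_s * mu_trn * t)
```

The empirical mean should be close to $1/(p_{s,i}\mu_i^{\mathrm{trn}})$, which shows how a low success probability inflates the effective transmission time of each inner product.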

B. Computation Reliability Analysis
After the wireless transmission from workers to the master, several scenarios may occur at the master:
• No error. The probability that the master receives a sub-block with no error from worker $i$ is denoted as $p_{c,i}$. From the channel model, we know that
$$p_{c,i} = (1 - \varepsilon_{q,i})^{b_i + r_i} = (1 - \varepsilon_i)^{q(b_i + r_i)}.$$
• Undetected errors. The probability that the master receives a sub-block with an undetected error pattern from worker $i$ is denoted as $p_{e,i}$. With regard to all the $(b_i + r_i, b_i)$ linear error-detecting codes, the average probability of undetected errors has been proved [38], [39], [40] In this paper, it is assumed that $p_{e,i} \approx \bar{p}_{e,i}$, i.e.
• Detectable errors. The probability that the master receives a sub-block with a detectable error pattern from worker $i$ is denoted as $p_{d,i}$. It can be obtained from $p_{c,i}$ and $p_{e,i}$, i.e.,
$$p_{d,i} = 1 - p_{c,i} - p_{e,i}.$$
A received sub-block is accepted by the master only if it contains either no error or an undetected error pattern. Otherwise, if the master detects transmission errors, the corresponding sub-block will be re-transmitted until it is accepted. Notice that the number of transmissions for a single sub-block follows a geometric distribution with the success probability $p_{s,i} = p_{c,i} + p_{e,i}$ in Eq. (12).
For wireless coded computation, the accepted sub-blocks with undetected errors will affect the accuracy of the desired result $Ax$ and decrease the computation reliability for the whole distributed network. In the JCTC scheme, we require that the ratio of the number of data inner products with undetected errors to the $(1 + \eta)m$ accepted data inner products in $Ax$ does not exceed $p_r$, where $p_r$ is the maximum tolerable rate of erroneous inner products. It means that there are at most $(1 + \eta)m p_r$ inner products with undetected errors in the desired result.
As a result, the expected computation reliability for the whole networks can be given in the following proposition.
Proposition 4 (The Expectation of Computation Reliability): In the JCTC scheme, the expectation of computation reliability $E[R^{\mathrm{cmp}}]$ for the whole network can be obtained by
Proof: Due to the re-transmission under detectable errors, the expected total number of transmissions for worker $i$ sending $x_i(t_c)$ sub-blocks is denoted as $E[z_i(t_c)]$ with the given computation time $t_c$. Then, $E[z_i(t_c)]$ can be obtained by With regard to a sub-block transmitted by worker $i$, it can be accepted on the initial transmission or on any re-transmission. Even if it is re-transmitted many times, an accepted sub-block can still contain errors because of the limited error-detecting ability. We denote the probability that an accepted sub-block contains undetected errors as $p_{u,i}$ and it can be given by
$$p_{u,i} = \sum_{t=0}^{\infty} p_{d,i}^{t}\, p_{e,i} = \frac{p_{e,i}}{1 - p_{d,i}} = \frac{p_{e,i}}{p_{c,i} + p_{e,i}}, \tag{22}$$
where $1 - p_{d,i} = p_{c,i} + p_{e,i}$ has been used. From Eq. (21) and Eq. (22), we know that the average number of data inner products with undetected error patterns can be given as Then, $E[R^{\mathrm{cmp}}]$ can be obtained by taking the expectation of Eq. (4) and substituting Eq. (23) into it. ■
Remark 2 (The Constraint of Computation Reliability): From Proposition 4, the corresponding constraint of computation reliability in our JCTC scheme can be shown as: where the right-hand side of the constraint (24) implies the ratio of the minimum number of correct inner products to the total $(1 + \eta)m$ inner products. Our design must satisfy the constraint in order to obtain the desired result and meet the requirement of computation reliability at the same time.
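The probabilities used in this subsection can be combined in a few helper functions. The sketch below is illustrative and rests on stated assumptions: the no-error probability follows from independent bit errors over the $q(b_i + r_i)$ bits of a sub-block, the undetected-error probability `p_e` is passed in as an input because its expression is not reproduced here, and the reliability check uses the bound of at most $(1+\eta)m p_r$ inner products with undetected errors. All function names are hypothetical.

```python
def no_error_prob(eps, q, b, r):
    """Probability that all q * (b + r) bits of a sub-block arrive unflipped."""
    return (1.0 - eps) ** (q * (b + r))


def accepted_with_error_prob(p_c, p_e):
    """Probability that an accepted sub-block carries an undetected error.

    A sub-block is re-sent after every detected error, so acceptance happens
    with probability p_c + p_e per attempt, and a fraction p_e / (p_c + p_e)
    of the accepted sub-blocks is erroneous.
    """
    return p_e / (p_c + p_e)


def truncated_series(p_c, p_e, terms=200):
    """Same quantity as a sum over the number of detected errors before acceptance."""
    p_d = 1.0 - p_c - p_e
    return sum((p_d ** t) * p_e for t in range(terms))


def meets_reliability(expected_undetected, m, eta, p_r):
    """True if the expected number of undetected errors is within the budget."""
    return expected_undetected <= (1 + eta) * m * p_r
```

With $p_{c,i} = 0.9$ and $p_{e,i} = 0.001$, about 0.11% of the accepted sub-blocks would contain an undetected error, slightly more than $p_{e,i}$ itself because re-transmissions give errors additional chances to slip through.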

C. Comparison With Separate Designs
In the existing separate designs, each worker needs to encode its computed sub-block for error detection by itself before sending it to the master, as shown in Fig. 2. In other words, coded computation is independent of error detection and they are designed separately. It implies that each worker not only computes a data sub-block but also spends some time encoding it for error detection. The encoding task performed by worker $i$ can be described by a matrix multiplication as follows:
$$G_{i,j}\left(\tilde{A}_{i,j} x\right), \tag{25}$$
where $G_{i,j} \in \mathbb{F}_{2^q}^{(b_i + r_i) \times b_i}$ is the coding matrix used for error detection. For the JCTC scheme, worker $i$ only calculates the matrix multiplication $\tilde{A}'_{i,j} x$, which includes the data matrix multiplication $\tilde{A}''_{i,j} x$ and the redundancy matrix multiplication $S_{i,j} x$. So the computational task for error detection in the JCTC scheme can be represented by the redundancy matrix multiplication. With the same computation reliability, the JCTC scheme requires a smaller calculated amount for error detection than the separate designs. The comparison of the calculated amount for workers is shown in the following proposition.
Proposition 5 (Comparison of Calculated Amount for Error Detection): Assume that the data sub-block size and the error-detecting redundancy obtained by both the JCTC scheme and the separate designs are the same, which implies that the computation reliability of both schemes is also the same. Compared with separate designs, the JCTC scheme can decrease the calculated amount served as error detection for all $n$ workers by at least
Proof: In the JCTC scheme, worker $i$ calculates each sub-block with a redundancy matrix multiplication $S_{i,j} x$. It needs extra $r_i(d - 1)$ additions and $r_i d$ multiplications. So there are a total of $\sum_{i=1}^{n} x_i(t_c)\, r_i (2d - 1)$ operations served as error detection for all $n$ workers with the given time $t_c$.
For the separate designs, the encoding task performed by worker $i$ is described as $G_{i,j}(\tilde{A}_{i,j} x)$. Worker $i$ encodes each sub-block for error detection with extra $(b_i + r_i)(b_i - 1)$ additions and $(b_i + r_i) b_i$ multiplications. Thus, there are a total of $\sum_{i=1}^{n} x_i(t_c)(b_i + r_i)(2b_i - 1)$ encoding operations for all $n$ workers in the separate designs.
Notice that when d ≤ min i b i , the JCTC scheme only needs at most n i=1 x i (t c ) r i (2b i − 1) operations served as error detection for all n workers and can decrease at least n i=1 x i (t c ) b i (2b i − 1) operations, compared with the existing design.
■

Because each worker encodes its computed data sub-blocks by itself, the whole computation time for the separate designs, denoted by $T^{S}_{tot,c}$, contains the original time for calculating the matrix multiplication and the encoding time for error detection, represented by $T^{S}_{cmp}$ and $T^{S}_{cc}$, respectively. Since a worker with poor capability for matrix multiplication is also weak in encoding, we assume that $T^{S}_{tot,c}$ can be approximated by the sum of the two, i.e., $T^{S}_{tot,c} = T^{S}_{cmp} + T^{S}_{cc}$. Further, $T^{S}_{cmp}$ and $T^{S}_{cc}$ are obtained from the per-worker times $T^{S,cmp}_{i}$ and $T^{S,cc}_{i}$, where $T^{S,cmp}_{i}$ is the original time for calculating the matrix multiplications $\{\tilde{A}_{i,j} x\}_{j=1}^{c_i}$ and $T^{S,cc}_{i}$ denotes the encoding time for calculating $\{G_{i,j}(\tilde{A}_{i,j} x)\}_{j=1}^{c_i}$ for worker $i$. The CDFs of $T^{S,cmp}_{i}$ and $T^{S,cc}_{i}$ are given by Eq. (26) and Eq. (27), respectively, where $\mu^{cc}_i$ and $a^{cc}_i$ represent the encoding capability of worker $i$. In the following proposition, we compare the whole computation time of the two schemes.
Proposition 6 (Comparison of the Whole Computation Time): Assume that the data sub-block size and the error-detecting redundancy obtained by the JCTC scheme and by the separate designs are the same, which implies that the computation reliability of both schemes is also the same. When $\mu^{cc}_i = \mu^{cmp}_i$ and $a^{cc}_i = a_i$, the difference in the expected whole computation time between the two schemes is bounded in terms of $L^{S}_{cmp}$ and $U^{S}_{cmp}$, where $L^{S}_{cmp}$ is the lower bound of $E[T^{S}_{cmp}]$ and $U^{S}_{cmp}$ is its upper bound.
Proof: From Eq. (26) and Eq. (27), we notice that $T^{S,cmp}_{i}$ and $T^{S,cc}_{i}$ can each be rewritten in terms of an exponentially distributed random variable, with rate parameters $\mu^{cmp}_i$ and $\mu^{cc}_i$, respectively. Similar to the computation latency analysis of the JCTC scheme in Section IV-A.1, bounds on $E[T^{S}_{cmp}]$ and $E[T^{S}_{cc}]$ can be given in terms of $\mu^{cc}_g = \max_i \mu^{cc}_i$, $a^{cc}_g = \min_i a^{cc}_i$, $\mu^{cc}_b = \min_i \mu^{cc}_i$ and $a^{cc}_b = \max_i a^{cc}_i$, where $L^{S}_{cc}$ and $U^{S}_{cc}$ represent the lower bound and the upper bound of $E[T^{S}_{cc}]$, respectively. Then, the expected whole computation time in the separate designs can be described as $E[T^{S}_{tot,c}] = E[T^{S}_{cmp}] + E[T^{S}_{cc}]$. Thus, when $\mu^{cc}_i = \mu^{cmp}_i$ and $a^{cc}_i = a_i$, the stated bound follows from Eq. (9), Eq. (11) and Eq. (33).
■

The above propositions imply that a smaller calculated amount for the workers also leads to a lower computation latency for the whole network. Hence, the total latency of the JCTC scheme is lower than that of the separate designs under the same computation reliability. Fig. 5 and Fig. 6 show the expected calculated amount for error detection and the whole computation latency versus the number of workers, respectively. The Monte Carlo simulation results are in good agreement with the theoretical ones. As $n$ increases, both the calculated amount and the latency decrease, since more computing resources are utilized with larger $n$. Moreover, the JCTC scheme performs better than the separate designs, which confirms the theoretical analysis.
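The operation counts used in the proof of Proposition 5 can be checked numerically. The following is a minimal sketch, not the authors' code: the worker values `x`, `b`, `r` and the dimension `d` are hypothetical, and the separate-design count assumes a dense $(b_i + r_i) \times b_i$ encoding matrix, i.e., $(b_i + r_i)(b_i - 1)$ additions and $(b_i + r_i) b_i$ multiplications per sub-block.

```python
def jctc_ops(x, r, d):
    """Error-detection operations in the JCTC scheme: sum_i x_i * r_i * (2d - 1)."""
    return sum(xi * ri * (2 * d - 1) for xi, ri in zip(x, r))


def separate_ops(x, b, r):
    """Encoding operations in separate designs (assumed dense encoding matrix):
    sum_i x_i * (b_i + r_i) * (2 b_i - 1)."""
    return sum(xi * (bi + ri) * (2 * bi - 1) for xi, bi, ri in zip(x, b, r))


def guaranteed_saving(x, b):
    """Lower bound on the saving from Proposition 5: sum_i x_i * b_i * (2 b_i - 1)."""
    return sum(xi * bi * (2 * bi - 1) for xi, bi in zip(x, b))


# Hypothetical example with three workers.
x = [4, 3, 2]    # x_i(t_c): sub-blocks computed by worker i within t_c
b = [10, 12, 8]  # data sub-block sizes b_i
r = [2, 3, 1]    # error-detecting redundancy r_i
d = 8            # length of the vector x; satisfies d <= min_i b_i

saving = separate_ops(x, b, r) - jctc_ops(x, r, d)
bound = guaranteed_saving(x, b)
```

With these assumed values the actual saving exceeds the guaranteed bound; the two coincide when $d = b_i$ for every worker.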

V. OPTIMAL TASK AND REDUNDANCY ALLOCATION
For the JCTC scheme, we consider minimizing the upper bound of the expected total latency $E[T_{cmp} + T_{trn}]$, which still leads to a decrease in total latency. Provided that the required computation reliability is satisfied, latency is reduced as much as possible by designing the optimal data sub-block size and the corresponding optimal error-detecting redundancy for each worker. The resulting optimization problem is denoted by $P_0$. In $P_0$, constraint (34) determines the range of the data sub-block size and the error-detecting redundancy. Constraint (35) ensures that the master can aggregate a sufficient number of data inner products to recover the desired computational result successfully, and the computation reliability of the whole network is guaranteed by (36). For any given $\mu^{cmp}_i > 0$, $a_i > 0$, $\mu^{trn}_i > 0$ and $0 \le \varepsilon_i < 1/2$, $P_0$ is always feasible because there exists at least one feasible solution, i.e., $b_i = l/k$ and $r_i = +\infty$ for $i \in [n]$, satisfying the constraints of $P_0$.
However, due to the strong coupling between the transmission and computation latency of each worker, it is challenging to obtain the exact expression of $E[T_{cmp} + T_{trn}]$, which makes this problem hard to solve. Following [7, Section III-A], we introduce a new variable $t_{cmp}$ to relax the term $E[T_{cmp}]$ and simultaneously optimize the computation latency $t_{cmp}$, the data sub-block sizes $\{b_i\}_{i=1}^{n}$ and the error-detecting redundancies $\{r_i\}_{i=1}^{n}$ when the distribution of the random variable $T_{cmp}$ is unknown. This yields the reformulated problem $P_1$, in which the set $\{t_i\}_{i=1}^{n}$ is introduced to relax the term $E[T_{trn}]$ and represents the transmission time of each worker. Constraint (37) captures the relationship between computation and transmission for each worker, ensuring that the number of transmitted sub-blocks is no more than the number of computed sub-blocks. To aggregate sufficient data inner products to recover the desired result, the total number of accepted results should be more than $(1 + \eta)m$, which leads to constraint (38). The computation reliability required by the whole network is expressed by constraint (39). The solution to $P_1$ is provably asymptotically optimal as $n$ becomes very large [13].
For $P_1$, there are difference-of-convex (DC) structures and product-of-convex-functions (PF) structures [41] in constraints (37), (38) and (39), which make this problem non-convex. In the following, we solve this problem in two different ways.

A. Iterative Optimization Algorithm
The non-convexity of $P_1$ is caused by the DC structures and the PF structures in the constraints. Using successive convex approximation (SCA) algorithms [42], we can transform such non-convex structures into convex approximations and iteratively solve the relaxed convex optimization problem to get sub-optimal solutions. For the DC structure, we linearize the concave part by taking its Taylor expansion to obtain a convex upper approximation. For the PF structure, the product of convex functions is first rewritten as a function with the DC structure according to [41], and the corresponding convex upper approximation is then obtained by linearizing the concave part of the rewritten function. Hence, the relaxed convex optimization problem $P'_1$ can be given, where $f_{1,i}(t_i, b_i, r_i)$, $f_{2,i}(t_i, b_i, r_i)$ and $f_{3,i}(t_i, b_i, r_i)$ are convex functions with respect to $t_i$, $b_i$ and $r_i$. See Appendix F for the detailed convex approximations of the DC and PF structures and the concrete expressions of these functions.

In $P'_1$, note that the objective function and constraint (40) are linear. Besides, constraint (41) can be rewritten as a convex exponential cone. Constraints (42) and (43) are also convex since they are sums of convex functions. Thus, $P'_1$ is a convex problem and we can solve it iteratively to find a sub-optimal approximate solution to $P_1$. The wireless LT coded computation with error detection based on SCA (WLTCC-ED(SCA)) algorithm is given as Alg. 1. Its main steps are as follows:
• Set the number of iterations $\beta = 0$ and a proper initial step-size $\theta^{(0)} \in (0, 1]$, and adopt proper initial points.¹
• In each iteration, solve $P'_1$, update $t^0$, $b^0$ and $r^0$, and apply a diminishing step-size rule [42].
• After the loop ends, return $b^*_i$ and $r^*_i$ for worker $i$.
During each iteration of the WLTCC-ED(SCA) algorithm, $P'_1$ must be solved, which falls into the category of convex exponential cone programming. It can be solved efficiently to a desired accuracy by interior-point methods with MOSEK, with polynomial computational complexity $O(n^{3.5})$.
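To illustrate the SCA idea on a problem small enough to follow by hand, the sketch below applies it to a toy one-dimensional DC objective rather than to $P'_1$. Everything in it is an assumption made for illustration: the objective $f(x) = x^4 - x^2$, the closed-form surrogate minimizer, and the step-size rule $\theta^{(\beta+1)} = \theta^{(\beta)}(1 - \zeta\theta^{(\beta)})$, which is one commonly used diminishing rule and not necessarily the one in Alg. 1.

```python
def sca_minimize(x0, theta0=1.0, zeta=0.1, iters=300):
    """Toy SCA: minimize f(x) = x**4 - x**2 over x > 0.

    f is a difference of the convex functions g(x) = x**4 and h(x) = x**2.
    Linearizing h at the current point x_k gives the convex surrogate
    x**4 - (x_k**2 + 2*x_k*(x - x_k)), whose minimizer is (x_k / 2)**(1/3).
    """
    x, theta = x0, theta0
    for _ in range(iters):
        x_hat = (x / 2.0) ** (1.0 / 3.0)      # closed-form surrogate minimizer
        x = x + theta * (x_hat - x)           # move toward it with step theta
        theta = theta * (1.0 - zeta * theta)  # diminishing step-size (assumed rule)
    return x


x_star = sca_minimize(x0=1.0)
```

The iterates approach the stationary point $x = 1/\sqrt{2}$ of $f$, where $4x^3 - 2x = 0$. In the paper's setting the closed-form surrogate step is replaced by a call to a conic solver on $P'_1$.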
Remark 3 (Negligible Channel Transition Probability): When the channel condition is so good that the sub-blocks transmitted from workers are almost error-free, i.e., $\varepsilon_i \to 0$, $i \in [n]$, the error-detecting redundancy obtained by WLTCC-ED is $r^*_i = 0$. In other words, only coded computation is needed, and error-detecting coding is not required in this situation. Moreover, if the transmission capability $\mu^{trn}_i$ of each worker is the same and the bandwidth of the wireless channel is unlimited, the data sub-block size obtained by WLTCC-ED is $b^*_i = 1$, which degenerates to the fine-grained LT coding approach.
Remark 4 (Trade-off Between Computation and Transmission): WLTCC-ED realizes a trade-off between computation latency and transmission latency. When the transmission capability and channel condition of all workers are the same, i.e., $\mu^{trn}_i = \mu^{trn}_j$ and $\varepsilon_i = \varepsilon_j$ for all $i, j \in [n]$, the workers with more powerful computation capability will complete more computational tasks. In other words, for worker $i$, a larger value of $\mu^{cmp}_i$ and a smaller value of $a_i$ lead to a larger value of $k b^*_i$ in WLTCC-ED. If the computation capability of all workers is the same, i.e., $\mu^{cmp}_i = \mu^{cmp}_j$ and $a_i = a_j$ for all $i, j \in [n]$, the workers with more powerful transmission capability and better channel condition will complete more computational tasks. In other words, for worker $i$, a larger value of $\mu^{trn}_i$ and a smaller value of $\varepsilon_i$ lead to a larger value of $k b^*_i$ in WLTCC-ED.

¹ One way to find the initial points is to choose the optimal solution to $P''_1$ as their initial values.

B. Low-Complexity Algorithm
The iterative procedure in Alg. 1 may incur high computational complexity. To reduce the complexity, an approximate method is provided for the case where the error rate is small. First, in the low-error-rate scenario, the following approximations are applied to some terms in $P_1$.
• Approximation 1. Since the channel condition is good, i.e., the value of $\varepsilon_i$ is very small, a large amount of error-detecting redundancy is not needed to meet the computation reliability requirement of the whole network. In other words, the value of $r_i$ is also very small and satisfies $r_i \ll b_i$ for each worker, which yields the approximation in Eq. (44).
• Approximation 2. According to [30], $p_{e,i}$ can be approximated by a weaker upper bound in the low-error-rate scenario, which gives Eq. (45).
• Approximation 3. For the $(b_i + r_i, b_i)$ linear code, up to $r_i$ erroneous inner products can be detected [38], [39], [40], i.e., $p_{d,i} \le \sum_{j=1}^{r_i} C_{b_i + r_i}^{j}\, \varepsilon_{q,i}^{j} (1 - \varepsilon_{q,i})^{b_i + r_i - j}$. Approximating the binomial distribution by the Poisson distribution [43] and then using Stirling's approximation, we obtain Eq. (46), where $\chi_i = (b_i + r_i)\, \varepsilon_{q,i}$. Step (a) of the derivation is the approximation of the binomial distribution by the Poisson distribution, while step (b) holds because of Stirling's approximation.
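The first step of Approximation 3, replacing the binomial distribution by a Poisson distribution with mean $\chi_i = (b_i + r_i)\varepsilon_{q,i}$, can be checked numerically. The sketch below is illustrative only; the values of `b`, `r` and `eps_q` are hypothetical, and the Stirling step is not reproduced.

```python
import math


def binomial_bound(b, r, eps_q):
    """sum_{j=1}^{r} C(b+r, j) * eps^j * (1 - eps)^(b+r-j)."""
    n = b + r
    return sum(math.comb(n, j) * eps_q**j * (1 - eps_q) ** (n - j)
               for j in range(1, r + 1))


def poisson_bound(b, r, eps_q):
    """Same sum with Binomial(b+r, eps) replaced by Poisson(chi), chi = (b+r)*eps."""
    chi = (b + r) * eps_q
    return sum(math.exp(-chi) * chi**j / math.factorial(j)
               for j in range(1, r + 1))


# Hypothetical low-error-rate setting: small eps_q and r much smaller than b.
exact = binomial_bound(b=50, r=3, eps_q=1e-3)
approx = poisson_bound(b=50, r=3, eps_q=1e-3)
gap = abs(exact - approx)
```

For these values the gap stays below $(b_i + r_i)\varepsilon_{q,i}^2$, which is what Le Cam's inequality predicts; the approximation degrades as $\varepsilon_{q,i}$ grows.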
Then, utilizing the inequality of arithmetic and geometric means (AM-GM), the PF structures in $P_1$ are bounded as follows:
• Substitute Eq. (45) into constraint (37) of $P_1$ and apply the AM-GM inequality.
• Substitute Eq. (44) and Eq. (46) into constraint (38) of $P_1$ and apply the AM-GM inequality.
• Substitute Eq. (44) and Eq. (45) into constraint (39) of $P_1$ and apply the AM-GM inequality.

Finally, by replacing the original constraints in $P_1$ with their tighter convex bounds, the approximate optimization problem for the low-error-rate scenario, denoted by $P''_1$, is obtained. Since the objective function and the constraints in $P''_1$ are composed of sums of convex functions, this approximate optimization problem is also convex. Based on the Lagrange function with the Karush-Kuhn-Tucker (KKT) conditions [44], we can get the optimal computation time in closed form, expressed through $\gamma_i$ in Eq. (48) and $\lambda_i$ in Eq. (49), where $\lambda_i$ is the straggling factor that indicates whether worker $i$ is a straggler or not. Moreover, the optimal transmission time, error-detecting redundancy and data sub-block size can also be obtained in closed form, with the latter two given by Eq. (51) and Eq. (52), respectively.

Then, the wireless LT coded computation with error detection in the scenario of low error rate (WLTCC-ED(LER)) is given as Alg. 2. Its main steps are as follows:
• Obtain $\gamma_i$ in Eq. (48) and $\lambda_i$ in Eq. (49) for worker $i$.
• If $\lambda_i > 0$, worker $i$ is chosen; otherwise, worker $i$ is abandoned.
• Return $b^*_i$ and $r^*_i$ for worker $i$ according to Eq. (52) and Eq. (51).
Compared with the iterative SCA algorithm, Alg. 2 can be carried out in constant time. It has low complexity and can obtain approximate solutions faster in the low-error-rate scenario.
Remark 5 (Straggler Recognition): There are not only computation stragglers with poor computation capability, but also transmission stragglers with weak transmission capability or bad channel condition. In Alg. 2, whether a worker is a straggler is decided by $\lambda_i$, $i \in [n]$. When $\lambda_i \le 0$, worker $i$ is a straggler, which implies that it would cause a severe performance degradation for the whole network. Thus, worker $i$ will not compute or transmit any inner products, i.e., $b^*_i = 0$.
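The selection rule of Remark 5 can be sketched as a simple filter. This is a hedged illustration: the straggling factors `lam` are taken as given inputs because Eq. (49) is not reproduced here, the candidate allocations are hypothetical, and setting $r^*_i = 0$ for an abandoned worker is an assumption (the text only states $b^*_i = 0$).

```python
def select_workers(lam, b_candidate, r_candidate):
    """Keep workers with lambda_i > 0; stragglers (lambda_i <= 0) get zero allocation."""
    b_star, r_star = [], []
    for lam_i, b_i, r_i in zip(lam, b_candidate, r_candidate):
        if lam_i > 0:        # worker i is chosen
            b_star.append(b_i)
            r_star.append(r_i)
        else:                # worker i is abandoned (assumed r_i = 0 as well)
            b_star.append(0)
            r_star.append(0)
    return b_star, r_star


# Hypothetical straggling factors and candidate allocations for three workers.
b_star, r_star = select_workers(lam=[0.4, -0.1, 0.0],
                                b_candidate=[10, 8, 6],
                                r_candidate=[2, 1, 1])
```

Because each worker is handled independently once $\lambda_i$ is known, the filter involves no iteration, in contrast to the SCA loop of Alg. 1.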

VI. SIMULATION RESULTS AND DISCUSSION
In this section, we present numerical results to show the performance of the proposed JCTC scheme.
Similar to [7], [13], [32], and [33], we choose the number of rows in $A$ as $m = 5000$, the number of workers as $n = 100$, the tolerable maximum error inner product rate as $p_r = 0.005$, and the maximum number of sub-blocks that can be transmitted by each worker as $k = 4$. Also, we assume that $\alpha = 2.8$ and $q = 1$. The value of the decoding parameter $\eta$ in the LT coding approach can be determined as $\eta = 0.0326$ [15], which implies that the master can recover the desired result successfully once it receives $(1 + \eta)m = 5163$ data inner products. For error detection, an MDS code is applied and the encoding capability of workers in the separate designs is chosen as $a^{cc}_i \sim U(0.1, 2)$, $\mu^{cc}_i \sim U(10, 30)$, $i \in [n]$. With the given $\{\mu^{cmp}_i\}$, $\{a_i\}$, $\{\mu^{trn}_i\}$, $\{\varepsilon_i\}$ and $p_r$, the data sub-block size $b_i$ and error-detecting redundancy $r_i$ of worker $i$ can be obtained by Alg. 2, where latency is optimized using approximations.

In order to compare the performance of different schemes, four scenarios are considered, as listed in Table I.² For Scenario 1, which considers no transmission error, 100 workers are divided into four groups with different computation capabilities and the same transmission capabilities, whereas each group in Scenario 2 has disparate computation capabilities, transmission capabilities and channel conditions. Scenario 3 is the case of heterogeneous wireless networks where the parameters of each worker are drawn from the corresponding random sources. Scenario 4 is based on a practical wireless distributed computing system with 50 workers. First, we observe the computation time, transmission time and channel conditions of these workers to collect statistical data. Then, by fitting the data on computation and transmission time to the exponential model, we get the computation and transmission capabilities of the workers. The channel conditions can be obtained by using the reference signal. Finally, the workers can be divided into 4 groups as shown in Table I.

² The unit of $1/\mu^{cmp}_i$, $a_i$, $1/\mu^{cc}_i$ and $a^{cc}_i$ is milliseconds per row, and the unit of $\mu^{trn}_i$ is the number of inner products per millisecond.

Performance comparisons between the implemented schemes in the above four scenarios are shown in Fig. 7 and Fig. 8 for latency and computation reliability, respectively. We can observe that WLTCC-ED(SCA) and WLTCC-ED(LER) avoid encoding for error detection at the workers and minimize the total latency, achieving a sub-optimal trade-off between computation and transmission latency compared with the separate designs. For the computation reliability, WLTCC-ED(SCA) always satisfies the required reliability, but the low-complexity algorithm is only applicable to the low-error-rate scenario, such as Scenario 1 and Scenario 4, because of the approximations.
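The derived quantities of this setup follow directly from the stated parameters. The sketch below recomputes them with exact rational arithmetic to avoid floating-point rounding; the variable names are illustrative, and the per-worker allocations use the formulas $l = m/n$ and $l = \alpha m/n$, $b_i = \alpha m/(kn)$ quoted for the baseline schemes.

```python
from fractions import Fraction

# Parameters stated in the simulation setup.
m, n, k = 5000, 100, 4
alpha = Fraction("2.8")
eta = Fraction("0.0326")

decode_threshold = (1 + eta) * m  # inner products the master needs for LT decoding
l_uncoded = Fraction(m, n)        # rows per worker without coding: l = m/n
l_coded = alpha * m / n           # rows per worker with LT redundancy: l = alpha*m/n
b_max_grained = l_coded / k       # sub-block size: b_i = alpha*m/(k*n)
```

With these numbers each worker holds 140 coded rows, split into $k = 4$ sub-blocks of 35 rows, and the master can decode after 5163 inner products.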
For Scenario 2, the changes in latency and computation reliability with $\varepsilon_i$ are shown in Fig. 9 and Fig. 10. The total latency of our JCTC schemes is always lower than that of the separate designs regardless of the channel condition. Moreover, in order to guarantee the computation reliability, the implemented schemes need many re-transmissions, especially when the channel condition is not satisfactory, which makes them sensitive to $\varepsilon_i$. For computation reliability, WLTCC-ED(LER) does not satisfy the required computation reliability when the channel condition is bad, since approximations for the low-error-rate scenario are applied in this scheme. The other schemes keep the expected computation reliability above $1 - p_r$. Fig. 11 and Fig. 12 show the effect of $p_r$ on latency and computation reliability, respectively. The expected total latency is reduced as the required computation reliability is decreased. In particular, if an extremely high computation reliability is required in a scenario with bad channel condition, the total latency may increase to infinity due to constant re-transmission. In terms of reliability, the low-complexity algorithm cannot guarantee an error-free final computational result regardless of $p_r$ because of the bad channel condition, and the expected computation reliability of WLTCC-ED(LER) is worse than that of the other schemes because of the approximation. However, WLTCC-ED(LER) still performs well in a scenario with a low computation reliability requirement and good channel condition.

VII. CONCLUSION
In this paper, we have proposed the JCTC scheme to design coded computation and error detection jointly. Owing to the two-layer encoding strategy, low dynamic encoding has been achieved. Then, the performance of the JCTC scheme, including latency and computation reliability, has been analyzed. Under the same computation reliability, theoretical performance comparisons with separate designs have shown the advantages of the JCTC scheme in terms of both calculated amount and latency. Finally, to achieve efficient task and redundancy allocation, the WLTCC-ED algorithms have been presented based on both iterative and low-complexity methods. The simulation results have also verified the superiority of our proposed scheme.

APPENDIX A PROOF OF LEMMA 1
The result follows from Eq. (5).

APPENDIX B PROOF OF LEMMA 2
When the computation capability tuple for all $n$ workers is $(\mu^{cmp}_g, a_g)$, the best computational performance for the whole network is achieved. This implies that the heterogeneous network is reduced to the corresponding homogeneous network with the best computational performance, from which the result stated in Lemma 2 is obtained.

APPENDIX C PROOF OF LEMMA 3
When the computation capability tuple for all $n$ workers is $(\mu^{cmp}_b, a_b)$, the worst computational performance for the whole network is achieved. This implies that the heterogeneous network is reduced to the corresponding homogeneous network with the worst computational performance. Thus, we can obtain a bound on $T_{cmp}$ for each worker $w \in W_e$. Summing over all $w \in W_e$, we get the result stated in Lemma 3.

APPENDIX D PROOF OF LEMMA 4
For a given $k_{re,i} = j$, $T^{trn}_{i,re,(\kappa\text{th})}$ follows an Erlang distribution with shape parameter $j$ and rate parameter $\mu^{trn}_i$ according to the convolution formula. The result stated in Lemma 4 then follows from the total probability theorem.

APPENDIX E PROOF OF LEMMA 5
From Eq. (13) and Eq. (14), we notice that the random variable $T^{trn}_i(t_c) = \sum_{\kappa=1}^{(b_i + r_i) x_i(t_c)} T^{trn}_{i,re,(\kappa\text{th})}$ follows an Erlang distribution with shape parameter $(b_i + r_i)\, x_i(t_c)$ and rate parameter $p_{s,i}\, \mu^{trn}_i$ for the given time $t_c$ and fixed $x_i(t_c)$. Then, the CDF of $T^{trn}_i(t_c)$ can be obtained through the total probability theorem as Eq. (53), where $U_{i,j}$ is a random variable following an Erlang distribution with shape parameter $j (b_i + r_i)$ and rate parameter $p_{s,i}\, \mu^{trn}_i$. According to Eq. (5) and Eq. (53), the expectation of $T^{trn}_i(t_c)$ can be obtained directly from the definition of the mean.
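The Erlang form used above can be checked with a short Monte Carlo sketch. The model below is an assumption made for illustration: each transmission attempt of one inner product takes an exponential time with rate $\mu^{trn}_i$ and succeeds independently with probability $p_{s,i}$, so the time per delivered inner product is exponential with rate $p_{s,i}\mu^{trn}_i$. All numerical values are hypothetical.

```python
import random


def transmit_time(num_products, mu_trn, p_s, rng):
    """Total time to deliver num_products inner products with re-transmission.

    Each attempt takes an Exp(mu_trn) time and succeeds with probability p_s;
    a failed attempt is repeated until it succeeds (assumed model).
    """
    total = 0.0
    for _ in range(num_products):
        while True:
            total += rng.expovariate(mu_trn)
            if rng.random() < p_s:
                break
    return total


rng = random.Random(0)
num_products, mu_trn, p_s = 12, 5.0, 0.8  # hypothetical values
samples = [transmit_time(num_products, mu_trn, p_s, rng) for _ in range(20000)]
empirical_mean = sum(samples) / len(samples)
erlang_mean = num_products / (p_s * mu_trn)  # Erlang mean = shape / rate
```

Here `num_products` plays the role of the shape parameter $(b_i + r_i)x_i(t_c)$, and the empirical mean should sit close to the Erlang mean of shape divided by rate.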

APPENDIX F THE CONVEX APPROXIMATIONS OF DC AND PF STRUCTURES IN P 1
In $P_1$, the constraint functions are denoted by $f_{1,i}$, $f_{2,i}$ and $f_{3,i}$. For the DC structures in $f_{1,i}$, $f_{2,i}$ and $f_{3,i}$, we define the terms $f^{DC}_{1,i}$ and $f^{DC}_{2,i}$ and linearize them at the given point $b^0$ and $r^0$. The PF structures in $f_{1,i}$, $f_{2,i}$ and $f_{3,i}$ are rewritten in DC form at the given point $t^0$.
Then, according to [41, Section IV-B], $f_{1,i}(t_i, b_i, r_i)$, $f_{2,i}(t_i, b_i, r_i)$ and $f_{3,i}(t_i, b_i, r_i)$ can be written as convex functions, where $f_{p2,i}(t_i, b_i, r_i)$, $f_{p3,i}(t_i, b_i, r_i)$ and $f_{p4,i}(t_i, b_i, r_i)$ are obtained from the functions $f^{PF}_{2,i}(t_i, b_i, r_i)$, $f^{PF}_{3,i}(t_i, b_i, r_i)$ and $f^{PF}_{4,i}(t_i, b_i, r_i)$. Note that $f_{l1,i}$, $f_{l2,i}$, $f_{p2,i}$, $f_{p3,i}$ and $f_{p4,i}$ are composed of linear functions with respect to $t_i$, $b_i$ and $r_i$, and $f_{p1,i}$ is also a convex function with respect to $t_i$ and $b_i$. According to the result on the convexity of composite functions [44], $f_{1,i}(t_i, b_i, r_i)$, $f_{2,i}(t_i, b_i, r_i)$ and $f_{3,i}(t_i, b_i, r_i)$ are also convex, so that $P_1$ can be relaxed to the convex problem $P'_1$.

Fig. 1. Distributed coded computation with transmission error in wireless networks. Each square represents a sub-block, which is transmitted through a binary symmetric channel with bit error transition probability $\{\varepsilon_i\}_{i=1}^{n}$. For worker $i$, the computation capability is evaluated by $(\mu^{cmp}_i, a_i)$ and the transmission capability is measured by $\mu^{trn}_i$.

Fig. 2. The workflow of separate designs. Coding for computation is performed by the master, while coding for error detection is done by each worker. As a result, the whole computational tasks for worker $i$ contain the original computational tasks $\{\tilde{A}_{i,j} x\}_{j=1}^{\lceil l/b_i \rceil}$ and the encoding tasks for error detection.

Fig. 3. The workflow of the JCTC scheme. Both coding for computation and coding for error detection are performed by the master. Each worker calculates the matrix multiplications $\{\tilde{A}'_{i,j} x\}_{j=1}^{\lceil l/b_i \rceil}$, including data matrix multiplications and redundancy matrix multiplications.

In Lemma 3, the bound holds for $w \in W_e$, where $W_e = \{\, i \mid c_i b_i < \alpha m / n \,\}$ denotes the set of workers that have not completed all their computational tasks by $T_{cmp}$, and $T^{cmp}_w$ is an exponential random variable with rate parameter $\mu^{cmp}_b$. Proof: See Appendix C.

$\sum_{i=1}^{n} x_i(t_c)\, b_i (2b_i - 1)$ operations, including additions and multiplications, with the given computation time $t_c$, when $d \le \min_i b_i$.

Fig. 7. Latency comparison between four separate designs and our JCTC schemes in four different scenarios, where $p_r = 0.005$.

Fig. 8. Computation reliability comparison between four separate designs and our JCTC schemes in four different scenarios, where $p_r = 0.005$.

Fig. 9. The expected total latency $E[T_{tot}]$ versus channel condition $\varepsilon_i$, where $p_r = 0.005$ and other parameters are chosen from Scenario 2.

Fig. 10. The expected error inner product rate $1 - E[R_{cmp}]$ versus channel condition $\varepsilon_i$, where $p_r = 0.005$ and other parameters are chosen from Scenario 2.

Fig. 11. The expected total latency $E[T_{tot}]$ versus the tolerable maximum error inner product rate $p_r$, where the parameters of workers are chosen from Scenario 2.

Fig. 12. The expected error inner product rate $1 - E[R_{cmp}]$ versus the tolerable maximum error inner product rate $p_r$, where the parameters of workers are chosen from Scenario 2.

$\sum_{w \in W_e} T_{cmp} \le \sum_{w \in W_e} (c_w + 1)(b_w + r_w)\, T^{cmp}_w$

Borui Fang received the B.E. degree in communication engineering from Dalian Maritime University, Dalian, China, in 2020. He is currently pursuing the Ph.D. degree with the Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, China. His research interests include coded distributed computing, wireless networks, and integrated communication and computation.

Li Chen (Senior Member, IEEE) received the B.E. degree in electrical and information engineering from the Harbin Institute of Technology, Harbin, China, in 2009, and the Ph.D. degree in electrical engineering from the University of Science and Technology of China, Hefei, China, in 2014. He is currently an Associate Professor with the Department of Electronic Engineering and Information Science, University of Science and Technology of China. His research interests include integrated communication and computation, integrated sensing and communication, and the wireless IoT networks.

Yunfei Chen (Senior Member, IEEE) received the B.E. and M.E. degrees in electronics engineering from Shanghai Jiao Tong University, Shanghai, China, in 1998 and 2001, respectively, and the Ph.D. degree from the University of Alberta in 2006. He is currently a Professor with the Department of Engineering, Durham University, U.K. His research interests include wireless communications, performance analysis, and joint radar communications designs.

Changsheng You (Member, IEEE) received the B.Eng. degree from the University of Science and Technology of China (USTC) in 2014 and the Ph.D. degree from The University of Hong Kong (HKU) in 2018. He was a Research Fellow with the National University of Singapore (NUS). He is currently an Assistant Professor with the Southern University of Science and Technology. His research interests include intelligent reflecting surface, UAV communications, edge learning, and mobile-edge computing. He received the IEEE Communications Society Asia-Pacific Region Outstanding Paper Award in 2019, the IEEE ComSoc Best Survey Paper Award in 2021, and the IEEE ComSoc Best Tutorial Paper Award in 2023. He is listed as a Highly Cited Chinese Researcher and an Exemplary Reviewer of the IEEE TRANSACTIONS ON COMMUNICATIONS and IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS. He is an Editor of IEEE COMMUNICATIONS LETTERS, IEEE TRANSACTIONS ON GREEN COMMUNICATIONS AND NETWORKING, and IEEE OPEN JOURNAL OF THE COMMUNICATIONS SOCIETY.

Xiaohui Chen (Member, IEEE) received the B.S. and M.S. degrees in communication and information engineering from the University of Science and Technology of China (USTC), Hefei, China, in 1998 and 2004, respectively. He is currently an Associate Professor with the Department of Electronic Engineering and Information Science, USTC. His current research interests include wireless network QoS, mobile computing, and AI-based communication.

Weidong Wang received the B.S. degree from the Beijing University of Aeronautics and Astronautics, Beijing, China, in 1989, and the M.S. degree from the University of Science and Technology of China, Hefei, China, in 1993. He is currently a Full Professor with the Department of Electronic Engineering and Information Science, University of Science and Technology of China. His research interests include wireless communication, microwave and millimeter-wave, and radar technology. He is a member of the Committee of Optoelectronic Technology, Chinese Society of Astronautics.
Algorithm 1 Wireless LT Coded Computation With Error Detection Based on SCA
Require: The parameter tuple $(\mu^{cmp}_i, a_i, \mu^{trn}_i, \varepsilon_i, p_r)$ for each worker $i \in [n]$.
Ensure: The data sub-block size $b^*_i$ and error-detecting redundancy $r^*_i$ for worker $i$.
procedure WLTCC-ED(SCA)

Algorithm 2 Wireless LT Coded Computation With Error Detection in the Scenario of Low Error Rate
Require: The parameter tuple $(\mu^{cmp}_i, a_i, \mu^{trn}_i, \varepsilon_i, p_r)$ for each worker $i \in [n]$.
Ensure: The data sub-block size $b^*_i$ and error-detecting redundancy $r^*_i$ for worker $i$.
procedure WLTCC-ED(LER)

In the separate designs, the encoding capability of workers is drawn as $a^{cc}_i \sim U(0.1, 2)$ and $\mu^{cc}_i \sim U(10, 30)$, $i \in [n]$. The schemes studied are given as follows.
1) UUA (Uniform Uncoded Allocation). Computation and error detection are designed separately. Each worker is assigned the same number of rows and does not divide the local data block into sub-blocks, i.e., $l = m/n$, $b_i = l = m/n$ for all $i \in [n]$. The error-detecting redundancy $r_i$, …
… Coded computation and error detection are designed separately. Each worker is assigned the same number of rows with maximum-grained sub-block division [9, Sec. 3.2] based on LT code, i.e., $l = \alpha m/n$, $b_i = \alpha m/(kn)$ for all $i \in [n]$. The error-detecting redundancy $r_i$, $i \in [n]$, is obtained by [33, Eq. (15)];
4) BD-WLTCC (Block-Design Based Wireless LT Coded Computation). Coded computation and error detection are designed separately. With the given $\{\mu^{cmp}_i\}$, …