Dependency-Aware Service Migration for Backhaul-Free Vehicular Edge Computing Networks

Vehicular edge computing (VEC) is a promising paradigm to improve vehicular services by offloading complex computation tasks to edge servers. However, the high mobility of vehicles requires frequent service migration among edge servers to guarantee uninterrupted services when vehicles traverse multiple cells, which brings great challenges. In this article, we design a dependency-aware backhaul-free migration scheme to enable service migration without relying on backhaul while satisfying task-dependency constraints. Specifically, the vehicle proactively fetches the migrated results based on task dependencies from the original server and migrates the results to its dynamically connected servers along the traveling path. Considering the intermittent communication and computation incurred by vehicle mobility, a joint offloading and migration optimization problem for determining the time to offload tasks and fetch results is formulated as a time-varying Markov decision process (MDP) to minimize the total energy consumption. Time-varying transition probability functions are derived to characterize the dynamics during intermittent offloading and fetching. Based on the MDP framework, an efficient online value iteration algorithm is developed by exploiting temporal correlation to estimate the time-varying value functions. Simulation results demonstrate that our proposed algorithm can achieve superior energy-saving performance compared to the baseline online schemes.


Qibing Fan, Li Chen, Senior Member, IEEE, Changsheng You, Member, IEEE, Yunfei Chen, Senior Member, IEEE, and Huarui Yin, Member, IEEE
Index Terms—Backhaul-free network, Markov decision process, service migration, task dependency, vehicular edge computing.

I. INTRODUCTION
With the rapid development of the Internet of Vehicles (IoV) and autonomous driving, various vehicular applications are emerging, such as image-aided navigation, intelligent vehicle control, and augmented vehicular reality [1]. These high-complexity applications demand massive computation resources under stringent delay constraints, yet the computation capabilities of vehicles are limited and insufficient to meet such requirements. Vehicular edge computing (VEC) has been proposed as a promising paradigm that deploys computing services in close proximity to vehicles to greatly reduce latency [2], [3], [4]. The research on VEC can be traced back to investigations of mobile edge computing (MEC), a more general computing paradigm targeting not only vehicles but also other mobile devices. The state-of-the-art works on MEC from the communication perspective have been summarized in [5]. For example, for a single-user MEC system, an energy-optimal binary offloading strategy was proposed in [6], [7] by comparing the energy consumption of offloading and local execution. To enable fine-grained computation offloading, bit-wise independence in task partitioning was considered in [8], [9], and a resource allocation scheme was proposed for a multi-user MEC system. Liu et al. [10] proposed a new low-latency and reliable communication-computing system design for enabling mission-critical applications. Joint task offloading and resource allocation in a multi-user collaborative MEC network was investigated in [11] to minimize the total energy consumption under task delay constraints.
When the MEC technology is applied to the vehicular network, VEC arises. For the architecture design, a vehicle-assisted architecture that leverages vehicles as infrastructure for communication and computation was first proposed in [12] to make better use of the underutilized resources of numerous vehicles. A cloud-assisted VEC architecture was developed in [3] to enable collaborative computation across cloud, edge, and vehicles. Based on these architectures, optimal control of VEC systems has been studied. Considering the costs of both vehicles and edge servers, a dual-side offloading decision and resource allocation scheme was proposed in [13] for a single-server system. To avoid performance degradation due to overload, the authors in [14] proposed a joint load balancing and offloading algorithm for maximizing system utility in a multi-server system. In [15], a task scheduling problem was investigated under a highly dynamic vehicular network to minimize the system energy consumption while satisfying task latency constraints. Regarding VEC realization, several enabling technologies have been introduced. Liu et al. [16] proposed a VEC network enabled by software-defined networking (SDN) to provide low-latency and high-reliability communication. Considering the requirements of various vehicle applications, network slicing was applied to VEC in [17] to support network service differentiation and diversification.
As vehicles may traverse different cells due to high mobility, a key topic in VEC-related research is service migration [18], [19]. Specifically, the ongoing computation services need to be migrated to the dynamically connected servers/BSs as the vehicle travels to guarantee uninterrupted services. In [20], an optimal migration policy was designed based on the Markov decision process (MDP) to achieve a good tradeoff between the migration cost and quality of service (QoS). The optimal policy was proved to be threshold-based in [21]. To proactively reshape the distribution of resource demands, mobility optimization was introduced into the service migration problem in [22]. Using both vehicle-to-vehicle and vehicle-to-infrastructure communications, Wang et al. [23] optimized task offloading and migration decisions based on game theory. As the service migration process involves BS handover, some studies focus on handover management. A vertical handover protocol based on Proxy Mobile IPv6 (PMIPv6) and the IEEE 802.21 Media Independent Handover (MIH) standard was proposed in [24], which uses the received signal strength (RSS) and dynamic thresholds to trigger the handover. A blockchain-based handover authentication protocol for vehicular ad hoc networks (VANETs) was designed in [25] to guarantee the security of V2I communications.
Most of the above-mentioned works make two assumptions. First, all base stations are required to be deployed with backhaul. This incurs a significant infrastructure cost due to installation obstacles of fiber backhaul links [26], [27], especially in dense VEC networks where base stations are densely deployed to increase the network capacity and provide ubiquitous services for vehicles. To avoid migration via backhaul, Zhang et al. [4] proposed to pre-offload tasks through multihop vehicle-to-vehicle connections, and parked and moving vehicles were utilized for computation and communication in [12], [28]. Second, the IoV applications are assumed to consist of independent subtasks [20], [29]. Nevertheless, in practice, inter-task dependencies exist in most IoV applications, where the outputs of some subtasks are the inputs of others. Taking the example of monitoring abnormal driving behavior in public safety [30], the vehicle first offloads various data collected by the on-board sensors, such as in-vehicle audio, video, and driving trajectory, to the edge for behavior feature extraction. The edge then performs a fusion analysis and sends alerts to the driver and passengers. Due to the continuity of the behavior monitoring application, the behavior analysis subtask relies on behavior data extracted by both the current and previous data processing subtasks, resulting in complex inter-task dependencies. However, traditional service migration methods generally only focus on whether to migrate [20], [29], ignoring the question of how much to migrate. This is essentially because they overlook inter-task dependencies, making it impossible to provide an explicit mathematical description of the migrated intermediate results. Consequently, achieving fine-grained service migration without relying on backhaul under the task dependency model is an urgent problem that deserves attention. In this article, we consider a multi-cell VEC network without backhaul and aim to jointly optimize
computation offloading and service migration with inter-task dependency. The joint optimization addresses the following challenges. First, the results of multiple associated subtasks need to be migrated together to the same server due to task dependency. Second, due to the high mobility of vehicles, intermittent wireless connections may result in unsuccessful computation offloading and service migration. To enable service migration without backhaul but with task dependency, we propose an indirect migration scheme, where the vehicle proactively fetches intermediate results based on task dependencies from the original server and migrates the results to its dynamically connected servers/BSs along its traveling path. Accounting for possibly unsuccessful migrations of dependent results, some subtasks are recomputed after a BS handover, which leads to intermittent computation. To tackle intermittent communication and computation, the vehicle independently makes online decisions on when to offload tasks and fetch results based on statistical information of the vehicular movement. The joint optimization problem minimizes the accumulated energy consumption under a highly dynamic environment of computation and communication.
The main contributions of this article are summarized as follows.
• Dependency-aware indirect migration scheme: We first model the sequential offloading process of a successive task based on general task graphs. Based on this, we propose a novel dependency-aware indirect migration scheme to enable service migration without backhaul. Considering the effects of intermittent computation and communication, an indirect migration design on offloading and fetching is proposed to achieve efficient offloading and migration.
• Joint optimization of computation offloading and results fetching based on MDP: We study the joint offloading and migration problem to determine the time to offload tasks and fetch results so as to minimize the total energy consumption of completing all subtasks in a given task graph. The optimization problem is formulated based on a time-varying MDP accounting for time-varying vehicle movement and wireless channels. In the MDP, we first map the vehicle location into the BS index to reduce the state dimensions. Then, time-varying transition probability functions are derived to characterize the dynamics during intermittent offloading and fetching.
• Energy-efficient online offloading and fetching policy: The formulated energy minimization problem is solved using an MDP-based online value iteration (OVI) algorithm to obtain a real-time offloading and fetching policy, where the time-varying value function in the time-varying MDP is estimated based on temporal correlation. In the proposed algorithm, an extra constraint on the value function is also used to avoid a selfish policy. Simulation results confirm that our proposed algorithm can achieve superior energy-saving performance compared to the baseline online schemes.

The remainder of this article is organized as follows. The VEC system model is described in Section II. The proposed indirect migration scheme is depicted in detail in Section III. The MDP formulation and the corresponding algorithm for offloading and fetching are presented in Section IV.

II. SYSTEM MODEL
As depicted in Fig. 1, we consider a VEC system with a moving vehicle and N base stations (BSs), denoted by the set N = {1, 2, . . ., N}. Each BS is equipped with a VEC server and can provide edge computing services. We assume that the coverage areas of the BSs do not overlap; thus, the vehicle can only connect to one BS during its movement. The vehicle's computing tasks, such as analyzing images of the surrounding environment, need to be offloaded to the BS due to the limited computing capability of the vehicle. The notations used in this article are summarized in Table I.
The vehicle travels through multiple cells in a two-dimensional (2-D) region A ⊂ R², as shown in Fig. 1. Time is discretized into multiple time slots, i.e., t ∈ T = {1, 2, . . ., T}, and each time slot lasts T_0. The vehicle's location at slot t is denoted as r_t. The mobility model is described by a known probability distribution of the vehicle location at each time slot, denoted as f_{R(t)}(r_t).
We consider a sequential offloading process of a successive task. Specifically, it is assumed that the computing task can be further divided into M subtasks. They will be offloaded in sequence during the movement and are indexed as [1, 2, . . ., M] according to their offloading priority. The computation of a subtask requires the outputs of several previous subtasks, which is referred to as task dependency. Considering the dependencies, we assume that the m-th offloaded subtask can be parameterized by v_m ≜ (L_m, α_m, I_w(m)), where L_m (in bits) denotes the size of subtask v_m, α_m is the computation intensity in terms of CPU cycles required to process one bit of the task, and I_w(m) is the intermediate results that get updated after subtask v_m is completed. In principle, the output results of all completed tasks are included in the intermediate results. However, from the perspective of task dependencies, the intermediate results can be further trimmed to reduce unnecessary storage. Specifically, if a completed task is not required by any subsequent task, it should be excluded from the intermediate results; otherwise, it should be retained. Therefore, the intermediate results I_w(m) can be updated as

I_w(m) = { w_i | i ≤ m, succ(v_i) ⊄ {v_1, . . ., v_m} },   (1)

where w_i is the output size of each subtask v_i, and succ(v_i) denotes the set containing all the successors depending on the output of v_i in the task topology diagram.
To illustrate the above sequential offloading process, we give an example in Fig. 2. Fig. 2(a) shows the topology of the example task, where the directed arrow from node v_i to node v_j indicates that subtask v_j requires the output of subtask v_i, denoted as w_i. Fig. 2(b) shows the sequential offloading process with the corresponding intermediate results I_w(m) marked on each dotted line, where the sum of outputs Σ_{i=j,k,...} w_i is abbreviated as w_{j,k,...}. After all the subtasks are offloaded and computed, the vehicle fetches the results of the last subtask, i.e., the target results of the computation task. The parameter information of all the subtasks is known a priori in our successive task offloading model. Unless otherwise specified, the tasks below refer to subtasks.
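The update rule for the intermediate results described above can be sketched in code. The following is a minimal illustration (not the paper's implementation, and the function and variable names are ours): given output sizes w_i and successor sets succ(v_i), the intermediate results after completing the first m subtasks keep only those outputs still needed by a not-yet-computed successor.

```python
def intermediate_results(succ, w, m):
    """Return the task indices kept in I_w(m) and their total size.

    succ[i] is the set of successors of subtask i (1-indexed),
    w[i] is the output size of subtask i, and m is the index of the
    last completed subtask in the offloading order 1..M.
    """
    completed = set(range(1, m + 1))
    # Keep the output of a completed task only if some successor
    # (a task that needs it as input) has not been computed yet.
    kept = {i for i in completed if succ[i] - completed}
    return kept, sum(w[i] for i in kept)

# Toy task graph: v1 -> v3, v2 -> v3, v3 -> v4 (4 subtasks).
succ = {1: {3}, 2: {3}, 3: {4}, 4: set()}
w = {1: 10, 2: 20, 3: 5, 4: 1}
```

For this toy graph, after completing v_1 and v_2 both outputs must be retained (v_3 still needs them), whereas after v_3 finishes only w_3 remains relevant.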
The mobility of the vehicle causes disruption to offloading. Specifically, the vehicle may not be able to offload all the tasks within the coverage area of a single BS, considering the vehicle's mobility as well as the limited coverage of the BSs. In this case, after the vehicle moves out of the original BS coverage and connects to a new BS, the sequential offloading process will be interrupted due to task dependency limitations.
To guarantee continuous computation, a direct way is service migration [19], namely, migrating the intermediate results I_w(m) from the original BS to the vehicle's next BS along the traveling path. Let n_t ∈ N denote the vehicle's associated BS at slot t. Clearly, results migration is triggered when n_t ≠ n_{t−1}. The intermediate results are generally migrated via the backhaul network. However, providing backhaul connectivity for BSs incurs significant infrastructure costs in VEC networks. Therefore, we consider a backhaul-free scenario in this work, where each BS works independently without backhaul connectivity. To support continuous computation without backhaul, we propose an indirect migration scheme that utilizes the connected vehicle to migrate intermediate results.

III. PROPOSED INDIRECT MIGRATION SCHEME
In this section, an indirect migration method is first proposed to achieve more flexible migration without backhaul, where the vehicle proactively fetches intermediate results from the original BS and migrates them to the next BS as it travels. Considering the changing channels during the movement, the vehicle may experience channel distortion and suffer from time-out failures during the offloading and fetching process. To achieve efficient offloading and migration when intermittent computation and communication occur, an indirect migration design is proposed. After that, a detailed analysis of offloading and fetching decisions is provided for online decision-making.

A. Indirect Migration
The proposed indirect migration, as depicted in Fig. 3, is elaborated as follows. We denote m_t ∈ M as the index of the task to be offloaded at slot t. After the task v_{m_t} is offloaded and computed at a certain slot t, the vehicle proactively fetches the intermediate results I_w(m_t) from the connected BS n_t via the downlink. We refer to the operation of fetching I_w(m_t) as setting a checkpoint, and m_t is a checkpoint in the successive task. When the vehicle connects to a new BS at a later slot t′, it migrates the fetched results I_w(m_t) to the new BS via the uplink. Compared with the direct migration scheme, our proposed indirect migration avoids the overhead of backhaul and is more flexible due to wireless connectivity.
However, there are still some challenges. First, some of the tasks may need to be recomputed after a handover. Specifically, when the vehicle migrates the intermediate results I_w(m_t) to the new BS at slot t′, the BS can only start the computation from the task following the checkpoint m_t, not from the interrupted task indexed by m_{t′}. This is because the migrated intermediate results I_w(m_t) cannot satisfy the dependencies of tasks beyond m_t + 1. In this case, tasks indexed from m_t + 1 to m_{t′} need to be recomputed, which is termed the move-out failure. In the highly dynamic scenario of vehicle mobility, the computation of successive tasks is an intermittent computation process involving checkpoint settings, since the computation service is interrupted upon move-out and unfetched tasks need to be recomputed after a handover. Second, channel variation due to high vehicular mobility may lead to failures in offloading and fetching. In the proposed indirect migration, the results migration, task offloading, and results fetching are highly dependent on wireless channel conditions. Considering the varying path loss, shadowing, and multipath fading over time and space, the channel changes dynamically with vehicle movement. Therefore, the vehicle may experience severe channel distortion at some slots. We consider a bandwidth-limited scenario where the latency of task processing at a certain slot t may exceed the slot length T_0, which is termed the time-out failure. With the dynamic channel, the computation process of a successive task is also a process of intermittent communication. The channel power gain from the vehicle to BS n at slot t, denoted as H_n(t), is assumed to be frequency-flat block fading, i.e., the channel remains static within each time slot but varies randomly over different slots.

B. Indirect Migration Design
Considering the intermittent computation and communication, a key question is whether to offload and fetch at each slot. On one hand, to avoid an invalid offload resulting from a time-out failure, an offloading decision on whether to offload the task should be made at each slot. On the other hand, after a task is offloaded and computed, the vehicle makes a fetching decision at each slot on whether to fetch the intermediate results for future migration, so as to reduce the possibility of a move-out failure. We assume that a time-out failure can occur during the offloading and fetching phases, whereas the migration operation does not time out. This is reasonable because the migrated results are usually smaller in size than the offloaded tasks, and the migration operation is performed earlier than the fetching operation within a slot.
Based on the above intermittent offloading and fetching, we propose an indirect migration design to achieve efficient computation of M tasks under possible time-out failure and move-out failure, which is summarized in Fig. 4. To simplify, we denote the latest fetched task (checkpoint) up to slot t as k_t. It indicates the state of fetching and is updated after a successful fetch. For example, if the intermediate results I_w(m_t) are fetched at slot t without time-out, k_{t+1} is updated to m_t; otherwise, it is not updated. At each slot t, there are three steps in our design:
1) Step 1. Migration Judgement: The vehicle system observes whether there is a BS handover at the current slot t. If a handover is observed, i.e., n_{t−1} ≠ n_t, a migration of the intermediate results I_w(k_t) is first performed at the new BS, and thus the index of the task m_t is updated to k_t + 1.
2) Step 2. Offloading Decision: The system decides at slot t whether to offload the task m_t. If it decides to offload m_t and the offloading and computing operations do not time out, the next task in the sequence will be offloaded at t + 1, i.e., m_{t+1} = m_t + 1. On the contrary, if a time-out occurs or the system decides not to offload, the task needs to be recomputed at the next slot, i.e., m_{t+1} = m_t. Moreover, we keep k_{t+1} = k_t until it is updated after a successful fetch.
3) Step 3. Fetching Decision: After a successful offload, the system decides whether to fetch the intermediate results I_w(m_t). If the system chooses to fetch the results and the fetching operation does not time out, the latest checkpoint k_{t+1} is updated to m_t. A failed fetch or a negative fetching decision leads to a failure in updating k_{t+1}, that is, k_{t+1} = k_t. After the results of the last task I_w(M) are fetched at a certain slot t_0, i.e., k_{t_0+1} = M, the computation of M tasks is completed.
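The three per-slot steps above can be sketched as a state-update function. This is an illustrative model only (the function and argument names are ours, not the paper's): `offload_ok` and `fetch_ok` stand in for the time-out outcomes of the respective operations, which in the paper depend on the channel-determined delays.

```python
def step(n_prev, n_cur, m, k, action, offload_ok, fetch_ok, M):
    """One slot of the indirect migration design.

    n_prev/n_cur: BS indices at slots t-1 and t; m: task to offload;
    k: latest fetched checkpoint; action in {0, 1, 2} = standby /
    offload only / offload and fetch. Returns (m', k') for slot t+1.
    """
    # Step 1: migration judgement -- on a handover, resume from the
    # checkpoint: tasks k+1..m-1 were never fetched and are lost.
    if n_cur != n_prev:
        m = k + 1
    # Step 2: offloading decision; the final task M is re-offloaded
    # until the termination condition k = M is met.
    offloaded = action >= 1 and offload_ok
    m_next = min(m + 1, M) if offloaded else m
    # Step 3: fetching decision (only meaningful after a successful offload).
    fetched = offloaded and action == 2 and fetch_ok
    k_next = m if fetched else k
    return m_next, k_next
```

Replaying the six slots of the Fig. 5 example with this function reproduces the rollback to task v_3 after the move-out at slot t = 6.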
To illustrate the indirect migration design, an exemplary intermittent offloading and fetching process in the VEC network is given in Fig. 5. First, located in the coverage of BS_1, the vehicle offloads task v_1 at slot t = 1 but offloads no task at slot t = 2 on account of the predicted time-out for offloading. After that, as the vehicle moves away from BS_1 at slot t = 3, it offloads the task v_2 and fetches the intermediate results I_w(2); thus, the latest fetched task at slot t = 4 is updated to k_4 = 2. Then, at t = 4, the vehicle moves out of the coverage of BS_1 and connects to BS_2. It migrates the latest fetched intermediate results I_w(2) and offloads task v_3 with the intention of fetching I_w(3), but ends up with a time-out failure during the fetching phase. At slot t = 5, the vehicle offloads the task v_4 without fetching I_w(4). As a result, a move-out failure occurs when the vehicle moves into the coverage of BS_3 at slot t = 6, and the tasks indexed after k_6 = 2, i.e., tasks v_3 and v_4, are required to be recomputed. In other words, the process of task offloading rolls back to the offloading of task v_3 at slot t = 6.

C. Analysis of Offloading and Fetching Decisions
It is easy to see that the performance of the proposed scheme depends on the decisions on whether to offload and fetch. Furthermore, in order to adapt to the time-varying environment in real time, we focus on online decision-making in this work, without requiring future information on the user trajectory and wireless channel states.
1) Offloading Decision Analysis: First, it is difficult to determine whether to offload in real time with limited channel information. Considering that no energy is consumed if a task is not offloaded, the vehicle can defer offloading to future slots with higher transmission rates to achieve lower energy consumption. However, the global channel states are unknown to the vehicle in an online decision system.
In addition, the coupling between offloading and fetching decisions should also be considered. In particular, when the vehicle moves away from the BS, it is likely to suffer from increasingly poor channel conditions. Therefore, it becomes increasingly difficult for the vehicle to fetch intermediate results before moving out. If the computation of the M tasks cannot be completed before the move-out, subsequent offloads are meaningless, since the updated intermediate results cannot be fetched for migration.
2) Fetching Decision Analysis: For fetching decisions, the future movement trajectory is not available beforehand. Therefore, it is difficult to precisely predict the move-out slot t′ and fetch the latest intermediate results one slot in advance to avoid recomputation. Even if a correct fetching decision is made, the fetching operation may time out when the vehicle suffers from a bad channel.
The optimization of offloading and fetching decisions is a sequential decision problem. Two widely used analytical tools for sequential decisions are Lyapunov optimization and the Markov decision process (MDP). The Lyapunov method assumes a deterministic decision-making process, whereas in intermittent communication and computing systems caused by random vehicle movements, the decision-making process becomes stochastic. Fortunately, the latter, MDP, is a promising approach to model sequential decision problems with probabilistically dynamic transitions in highly stochastic environments. Moreover, the Markov property of the MDP, namely that the result of an action does not depend on the previous actions and visited states but only on the current state, can be satisfied in our problem after adjusting the state variables. Therefore, the MDP will be used to solve the problem.

IV. MDP-BASED OFFLOADING AND FETCHING DESIGN
In this section, the problem of offloading and fetching is formulated based on a discrete-time MDP model to optimize the performance of the proposed indirect migration scheme. First, we define the system states and actions. The time-varying state transition probabilities and immediate costs are then derived. Afterward, the joint optimization is formulated as a finite-horizon time-varying MDP problem to minimize the accumulated energy consumption, and it is solved by employing an online value iteration algorithm.
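As background for the formulation that follows, a finite-horizon MDP with time-varying transition probabilities P_t and costs c_t is classically solved by backward induction over the horizon. The sketch below is a generic template of that recursion, not the paper's OVI algorithm (which instead estimates the time-varying value functions online from temporal correlation); the toy two-state example at the bottom is our own.

```python
def backward_value_iteration(states, actions, P, c, T):
    """Finite-horizon value iteration:
    V[t][s] = min_a ( c(t, s, a) + sum_s' P(t, s, a, s') * V[t+1][s'] ).
    P and c are callables; T is the horizon length."""
    V = [{s: 0.0 for s in states} for _ in range(T + 1)]  # V[T] = 0
    policy = [dict() for _ in range(T)]
    for t in range(T - 1, -1, -1):
        for s in states:
            best_a, best_q = None, float("inf")
            for a in actions:
                q = c(t, s, a) + sum(P(t, s, a, s2) * V[t + 1][s2]
                                     for s2 in states)
                if q < best_q:
                    best_a, best_q = a, q
            V[t][s], policy[t][s] = best_q, best_a
    return V, policy

# Toy 2-state example: state 0 costs 1 per slot, state 1 costs 0;
# action 1 jumps to state 1 at an action cost of 0.1, action 0 stays.
P = lambda t, s, a, s2: 1.0 if s2 == (1 if a == 1 else s) else 0.0
c = lambda t, s, a: (1.0 if s == 0 else 0.0) + 0.1 * a
V, policy = backward_value_iteration([0, 1], [0, 1], P, c, 3)
```

In the toy example, paying the one-off action cost immediately (total 1.1 over three slots) beats staying in the expensive state (total 3.0), so the optimal first action in state 0 is to jump.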

A. State Space and Action Space
The state space is not only the characterization of all possible system states but also a key source of the curse of dimensionality in the MDP model. We map the vehicle's location into the corresponding BS, i.e., transform an infinite-scale state space into a finite-scale one, thereby reducing the state dimensions [31]. Specifically, the state space is defined as S = Φ × M × K, where Φ = {1, 2, . . ., N} × {1, 2, . . ., N} is the set of BSs connected by the vehicle at the latest two slots, M = {1, 2, . . ., M} is the set of tasks to be offloaded, and K = {0, 1, 2, . . ., M} is the set of the latest fetched tasks, in which the element 0 implies that no task has been fetched before t. Correspondingly, the state can be characterized by a composite state s_t = (n_t, m_t, k_t) ∈ S, where n_t, m_t, and k_t denote the index pair of the BSs, the task to be offloaded, and the latest fetched task at slot t, respectively. The sub-state n_t = (n_t, n_{t−1}) denotes the indexes of the BSs to which the vehicle is connected at slots t and t − 1. Note that n_{t−1} is introduced into s_t to indicate a BS handover at slot t. In this case, the result of an action does not depend on the previous actions and visited states but only on the current state, which satisfies the Markov property.
Then, the offloading and fetching decision at each slot t is expressed as an action a_t in the MDP. The action space A = {0, 1, 2} represents the set of possible actions: the system decides to stand by (a_t = 0), offload without fetching (a_t = 1), or perform both offloading and fetching (a_t = 2) at slot t. Table II provides the detailed state and action information for the example in Fig. 5.
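As a quick check of the construction above, the following sketch enumerates S = Φ × M × K and A for illustrative sizes N and M of our own choosing, confirming that the finite state space has |S| = N² · M · (M + 1) elements.

```python
from itertools import product

N, M = 4, 6  # example network/task sizes (arbitrary, for illustration)
bs_pairs = list(product(range(1, N + 1), repeat=2))  # n_t = (n_t, n_{t-1})
tasks = range(1, M + 1)                              # m_t, task to offload
fetched = range(0, M + 1)                            # k_t, 0 = nothing fetched yet
states = [(n, m, k) for n in bs_pairs for m in tasks for k in fetched]
actions = (0, 1, 2)  # standby / offload only / offload and fetch
```

Mapping locations to BS indices is what keeps this enumeration finite: with raw 2-D coordinates in the state, the state space would be uncountable.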

B. Time-Varying State Transition Probability
By applying action a_t ∈ A in state s_t ∈ S, the system makes a transition from s_t to a new state s_{t+1} ∈ S. Considering the highly dynamic environment with time-varying channels and mobility patterns, we characterize the dynamics using time-varying transition functions. The transition function at slot t is defined as P_t : S × A × S → [0, 1], and the state transition probability of ending up in state s_{t+1} after taking action a_t in state s_t at slot t is denoted as P_t(s_{t+1} | s_t, a_t), or P_t(s′ | s, a) for short. It can be decomposed as

P_t(s′ | s, a) = P_t(n′ | s, a) P_t(m′ | n′, k′, s, a) P_t(k′ | s, a),   (2)

where the three factors are the BS transition probability, the offloading task transition probability, and the latest fetched task transition probability, in turn. Their detailed derivations are given below.
1) The BS Transition Probability: The BS transition probability P_t(n′ | s, a) can be simplified as P_t(n′ | n, a) because the BS state n is independent of k and m. Since n′ at slot t + 1 is related to the location of the vehicle at slot t, P_t(n′ | n, a) can be derived according to the law of total probability as

P_t(n′ | n, a) = ∫_{A_{n_t}} P_t(n′ | n, a, r_t) f_{R(t)}(r_t | n, a) dr_t,   (3)

where r_t is the vehicle's location in the t-th slot, f_{R(t)}(r_t | n, a) is the conditional probability density function (pdf) of the vehicular location variable R(t) given the current BS state n and action a, and A_{n_t} denotes the coverage range of BS n_t. The location-based conditional state transition probability P_t(n′ | n, a, r_t) can be derived as

P_t(n′ | n, a, r_t) = P_t(n_{t+1} | n_t, a, r_t) I(n′(2) = n(1)),   (4)

where n′(i) and n(i) denote the i-th term of n′ and n, respectively, and n_{t+1} is independent of n_{t−1} in the Markov process. Further, P_t(n_{t+1} | n_t, a, r_t) is derived as

P_t(n_{t+1} | n_t, a, r_t) = ∫_{A_{n_{t+1}}} f_{R(t+1),R(t)}(r_{t+1}, r_t) dr_{t+1} / f_{R(t)}(r_t),   (5)

where f_{R(t)}(r_t) is the pdf of the vehicle's location in the t-th slot and f_{R(t+1),R(t)}(r_{t+1}, r_t) is the joint pdf of the vehicle's locations in the t-th and (t + 1)-th slots. Further, we can use the Bayes theorem to derive the conditional pdf of the vehicle's location at slot t in (3) given the BS state and action as

f_{R(t)}(r_t | n, a) = P_t(n_t | r_t) f_{R(t)}(r_t) / ∫_{A} P_t(n_t | r) f_{R(t)}(r) dr,   (6)

where P_t(n_t | r_t) indicates whether the location r_t belongs to the coverage range of BS n_t. It returns the value of one if r_t is in the coverage range of BS n_t, and zero otherwise, i.e.,

P_t(n_t | r_t) = I(r_t ∈ A_{n_t}).   (7)

Then, we substitute (4), (5), (6), and (7) into (3) to obtain the BS state transition probability

P_t(n′ | n, a) = I(n′(2) = n(1)) ∫_{A_{n_t}} ∫_{A_{n_{t+1}}} f_{R(t+1),R(t)}(r_{t+1}, r_t) dr_{t+1} dr_t / ∫_{A_{n_t}} f_{R(t)}(r) dr.   (8)

The BS state transition probability for any mobility model can be computed by (8), once f_{R(t)}(r_t) and f_{R(t),R(t+1)}(r_t, r_{t+1}) are given.
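Although the derivation above holds for any mobility model, a concrete instance may help. The sketch below estimates one row of the BS transition probabilities by Monte Carlo integration for a hypothetical 1-D mobility model (location uniform within the current cell, Gaussian displacement per slot); the cell layout, widths, and parameter values are our illustrative assumptions, not the paper's.

```python
import random

def bs_transition_prob(n_from, n_to, cell_width=100.0, n_bs=3,
                       step_std=20.0, samples=20000, seed=0):
    """Monte Carlo estimate of P(n_{t+1} = n_to | n_t = n_from) for a
    1-D vehicle: location uniform in the current cell, Gaussian step."""
    rng = random.Random(seed)

    def cell_of(x):
        # BSs 1..n_bs cover [0, w), [w, 2w), ...; clamp to the road ends.
        return min(max(int(x // cell_width), 0), n_bs - 1) + 1

    hits = 0
    for _ in range(samples):
        x = (n_from - 1) * cell_width + rng.random() * cell_width
        x_next = x + rng.gauss(0.0, step_std)
        if cell_of(x_next) == n_to:
            hits += 1
    return hits / samples

# One row of the transition matrix, starting from the middle cell.
row = [bs_transition_prob(2, n) for n in (1, 2, 3)]
```

Since every sampled next location falls in exactly one cell, the row sums to one, and with a step deviation much smaller than the cell width the vehicle most likely stays with its current BS.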
2) The Latest Fetched Task Transition Probability: According to the proposed indirect migration design in Fig. 4, the next fetched task state k' is updated only when a task is fetched without a time-out failure; otherwise, k' remains equal to k. Therefore, the transition of k is independent of n', and the transition probability can be expressed as in (9), where I(·) returns one if the condition in the bracket holds and zero otherwise, and I^tout_t(s, a) indicates whether a fetching operation times out at slot t. Considering that the fetching operation is the last step of the edge computation process within a slot, I^tout_t(s, a) can be expressed as in (10), where D_t(s, a) is the total duration of the operations performed at slot t, including the migration, computation offloading, computation waiting, and fetching operations, as in (11), where D^m_t(s, a), D^u_t(s, a), D^w_t(s, a), and D^d_t(s, a) are the migration, offloading, waiting, and fetching delays in the t-th slot, respectively. These delays are presented in Section IV-C.
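The time-out test in (10)–(11) and the resulting update of k can be sketched as follows; the action encoding (a = 2 for a fetch) follows the paper, while the assumption that a successful fetch advances k to the next pending task is our illustrative reading of the sequential design:

```python
def times_out(d_migrate, d_offload, d_wait, d_fetch, slot_len):
    """(10)-(11): fetching is the last step within a slot, so the fetch
    times out when the accumulated operation delay exceeds the slot
    length T_0 (here `slot_len`)."""
    return d_migrate + d_offload + d_wait + d_fetch > slot_len

def fetched_task_transition(k, a, timed_out):
    """Sketch of (9): k advances only when the action is a fetch (a == 2)
    and the fetching operation does not time out; otherwise k is kept.
    The k + 1 update rule assumes results are fetched sequentially."""
    if a == 2 and not timed_out:
        return k + 1
    return k
```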
3) The Offloading Task Transition Probability: In the indirect migration design, there are three possible values for the task m' to be offloaded at slot t+1, namely the recomputed task k'+1, the next task m+1, and the original task m. Specifically, if there is a move-out at t+1, the task m' is updated to k'+1. Conversely, if the vehicle stays in the original BS coverage area, m' is updated based on the offloading decision and time-out possibilities. On one hand, if the task m is offloaded and successfully computed, denoted as I^succ_t(n', s, a) = 1 and defined in (12), m' is updated to m+1. On the other hand, if the task m is not offloaded or the offloading operation times out, i.e., I^succ_t(n', s, a) = 0, the task m will be recomputed at slot t+1, i.e., m' = m. Notice that when offloading the final task M without suffering a move-out failure, the vehicle keeps offloading it until the termination condition k = M is met, namely m' = m = M. Given the above considerations, the offloading task transition probability P_t(m' | n', k', s, a) is given by (13), shown at the bottom of this page, where I^mout_{t+1} indicates whether the vehicle moves out of the coverage of the original BS at t+1, as defined in (14). Note that an additional restriction, m' = k'+1, is added to the first two conditions in (13) to avoid overlapping conditions when k' equals m or m−1 in the third condition. Furthermore, it can be seen from (13) that the next state m_{t+1} is related to the current migration delay D^m_t(s, a). The current migration delay is incurred when the vehicle moves out of the coverage area of the original BS n_{t−1}, i.e., n_t ≠ n_{t−1}. Therefore, the next state m_{t+1} depends on the past BS index n_{t−1}. To satisfy the Markov property required by the MDP, we unbind m_{t+1} from the past state by including n_{t−1} in the state s_t, as described in Section IV-A.
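The three-case update of the offloaded task in (13) can be sketched as a simple decision function (a deterministic reading of the transition, with the randomness over move-out and success supplied externally):

```python
def next_offload_task(m, k, moved_out, success, M):
    """Sketch of the three cases in (13) for the task offloaded at t+1.

    m         : task offloaded at slot t
    k         : latest successfully fetched task
    moved_out : whether the vehicle leaves the original BS at t+1
    success   : whether task m was offloaded and computed in time
    M         : total number of subtasks
    """
    if moved_out:
        return k + 1        # move-out: restart from the first un-fetched task
    if m == M:
        return M            # keep offloading the final task until k == M
    if success:
        return m + 1        # offloaded and computed in time: advance
    return m                # not offloaded or timed out: recompute m
```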

C. Time-Varying Immediate Cost
The vehicle consumes energy for data transmission after taking an action in a state. The cost function at slot t is defined as E_t : S × A → R, which specifies the immediate communication energy cost of action a_t in state s_t, abbreviated as E_t(s, a). The uplink and downlink transmission rates between the vehicle and BS n at slot t are given by [32]

c^u_n(t) = B_u log2(1 + P_u H_n(t) / N_0),  c^d_n(t) = B_d log2(1 + P_B H_n(t) / N_0),  (15)

where B_u, B_d represent the uplink and downlink bandwidths, P_u, P_B denote the transmit powers of the vehicle and the BS, respectively, and N_0 is the noise power. H_n(t) is the channel power gain between the vehicle and BS n, given by [33]

H_n(t) = ψ_t g_t A d_n(t)^{−γ},  (16)

where ψ_t is the small-scale fast-fading power component, assumed to be exponentially distributed with unit mean [34], g_t is a log-normal shadowing component with standard deviation ξ, A is the path-loss constant, d_n(t) is the distance between the vehicle and BS n at slot t, and γ is the decay exponent. The distance d_n(t) can be calculated as

d_n(t) = sqrt(||r_t − z_n||^2 + (h_B − h_v)^2),  (17)

where r_t is the location of the vehicle at slot t, z_n is the location of the connected BS n, and h_v, h_B represent the antenna heights of the vehicle and the BS, respectively. It is worth noting that the location r_t of the vehicle is updated in real time. In addition, we assume that the vehicle can acquire the uplink and downlink transmission rates c^u_n(t) and c^d_n(t) in real time before taking an action in state s_t. Based on the above communication model, the delays of the migration, offloading, computation waiting, and fetching operations are given as follows.
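The communication model above can be sketched directly from the listed symbols; the Shannon-type rate form is assumed from the quantities named after (15):

```python
import math

def channel_gain(psi, g, A, d, gamma):
    # (16): H_n(t) = psi_t * g_t * A * d_n(t)^(-gamma)
    return psi * g * A * d ** (-gamma)

def distance(r, z, h_v, h_B):
    # (17)-style distance between the vehicle at r and the BS at z on a
    # 1-D road, accounting for the antenna-height difference (assumed form).
    return math.sqrt((r - z) ** 2 + (h_B - h_v) ** 2)

def uplink_rate(B_u, P_u, H, N0):
    # Assumed from the symbols after (15): c_u = B_u * log2(1 + P_u*H/N_0)
    return B_u * math.log2(1.0 + P_u * H / N0)
```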
First, the migration delay is incurred by the uplink transmission of the intermediate results I_w(k) when there is a move-out, i.e.,

D^m_t(s, a) = I_w(k) / c^u_n(t) if a move-out occurs at slot t, and 0 otherwise.  (18)

The offloading delay, induced by the uplink transmission of task m, can be expressed as

D^u_t(s, a) = L_m / c^u_n(t).  (19)

After that, the waiting delay is incurred by waiting for the task computation:

D^w_t(s, a) = α_m L_m / f_c(n),  (20)

where α_m is the required computation intensity of task m and f_c(n) is the computing rate of server n, i.e., the CPU cycle frequency available at the VEC server for task processing. Finally, the fetching delay in the t-th slot, induced by the downlink transmission of the intermediate results, is

D^d_t(s, a) = I_w(m) / c^d_n(t).  (21)
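The four per-slot delays can be sketched together (sizes in bits, rates in bit/s, computing rate in cycles/s; the fetched payload I_w(m) is our assumed reading of the fetching step):

```python
def operation_delays(moved_out, I_w_k, L_m, alpha_m, I_w_m, c_u, c_d, f_c):
    """Sketch of (18)-(21) under the stated definitions."""
    d_migrate = I_w_k / c_u if moved_out else 0.0  # uplink of migrated results
    d_offload = L_m / c_u                          # uplink of task m
    d_wait = alpha_m * L_m / f_c                   # edge computation of task m
    d_fetch = I_w_m / c_d                          # downlink of results
    return d_migrate, d_offload, d_wait, d_fetch
```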
Given the above four operation delays, the immediate energy cost under all possible time-out cases can be given as in (22), where P_u and P_r are the vehicle's transmit and receive powers, respectively, and the indicator function flag(s, a), which distinguishes the different time-out failure cases under (s, a), is defined in (23). From the perspective of uplink and downlink transmissions, (22) can be rewritten as (24). It is worth noting that the above delays under all state-action pairs (s, a) can be computed from the known task parameters and the transmission rates acquired in real time before a specific action is taken. In this way, the immediate cost function E_t, as well as the transition function P_t, is available before the system takes an action, based on (2), (8), (9), (13), and (24). Notice that in (8), the BS state transition function is computed from prior statistical knowledge of the vehicle mobility.
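A minimal sketch of the energy accounting in (22)/(24): transmit energy is power times uplink airtime, receive energy is power times downlink airtime, and the time-out flag truncates the terms that never happen. The flag encoding below (0 = all steps succeed, 1 = the fetch times out after a successful offload, 2 = the offload itself times out) is our assumption, not the paper's definition in (23):

```python
def immediate_energy(P_u, P_r, d_migrate, d_offload, d_fetch, flag):
    """Illustrative version of (22): uplink energy for migration and
    offloading plus receive energy for fetching, truncated by `flag`."""
    e = P_u * d_migrate          # migration uplink always precedes the rest
    if flag <= 1:
        e += P_u * d_offload     # the offload was attempted and transmitted
    if flag == 0:
        e += P_r * d_fetch       # the results were actually received
    return e
```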

D. Online MDP-Based Offloading and Fetching Algorithm
Based on the MDP model, we formulate an MDP optimization problem to minimize the expected accumulated communication energy consumption during T slots under the state transition function, which is given by (25), where π_t is a time-related deterministic policy, defined as π_t : S → A, i.e., a_t = π_t(s_t) ∈ A, and π = {π_t}_{t∈T}; the expectation E(·) averages over all possible accumulated energy consumptions in the random process; and the modified cost Ẽ_t(s, a) is defined in (26). The modified cost Ẽ_t(s, a) replaces E_t(s, a) in (25) because no more energy is consumed once the final task is fetched. Correspondingly, the ending state is not transferred, as stated in (27).

Algorithm 1: Online Value Iteration (OVI).
1: Initialize V_0(s) as in (31) and set t = 1.
2: while t ≤ T do
3:   Step 1: update the value V_t(s) by K iterations of value iteration. Set V^0_t(s) = V_{t−1}(s), s ∈ S.
4:   for k = 1 : K do
5:     Update V^k_t(s), s ∈ S, as in (30).
6:   end for
7:   Set V_t(s) = V^K_t(s), s ∈ S.
8:   Step 2: generate the policy π_t at the current slot t to minimize the estimated Q function, as in (29).
9:   Set t = t + 1.
10: end while

Due to the time-varying transition probabilities and costs as well as the finite number of slots, the problem in (25) is a finite-horizon time-varying MDP problem, which proceeds as follows. At each slot t = 1, 2, ..., T: 1) the system observes the current state s and computes the cost function E_t and the transition probability function P_t; 2) the system takes an action a = π_t(s) based on the policy π_t and incurs a cost Ẽ_t(s, a); 3) the new state s' ∈ S is drawn according to the transition probability distribution P_t(· | s, a). The optimal policy at each slot t, denoted as π*_t, can be determined by the Bellman optimality equation [35] in (28), where the optimal value V_t(s) indicates the expected accumulated cost when starting in s at slot t and taking optimal actions thereafter. The optimal policy, given the current state s, is given by

π*_t(s) = arg min_{a∈A} [ E_t(s, a) + Σ_{s'∈S} P_t(s' | s, a) V_{t+1}(s') ].  (29)

Unfortunately, V_{t+1}(s) is unavailable at the current slot t for any online system. To address this problem, we propose an Online Value Iteration (OVI) algorithm, inspired by the work in [36]. The main idea is to leverage the time-adjacent value V_t(s) to approximate V_{t+1}(s), exploiting the similarity of the environment at adjacent slots. In this way, (28) can be rewritten as (30), which can be solved by the well-known value iteration algorithm [35]. As illustrated in Algorithm 1, the OVI algorithm performs two steps at each slot t.

1) Step 1: The algorithm runs K iterations of value iteration based on the immediate cost function Ẽ_t, the transition probability function P_t, and the previously estimated value function V_{t−1}, and generates a new estimate of the value function V_t.

2) Step 2: The algorithm generates an online policy based on the current estimate of the value function V_t, the immediate cost function Ẽ_t, and the transition probability function P_t. Because no energy is consumed if no task is offloaded, the algorithm may tend to offload no task at any slot t (i.e., π_t(s) = 0), referred to as a selfish offloading policy. To avoid this, we set the initial value V_0(s) as in (31), where H is a constant satisfying H ≫ J_π. In this case, the Q-value function Q_1(s, a) under the selfish offloading policy remains as large as H, since the initial state cannot be transferred to the ending states. Likewise, the subsequent Q_t(s, a) from t = 2 onward is indirectly influenced by V_0(s). As a result, the algorithm tends to choose a policy that eventually guides the state to an ending state to avoid excessive costs, thus avoiding selfish offloading. Notice that the algorithm dynamically adjusts the offloading and fetching policy to adapt to the varying environment. For example, when the vehicle speed increases or the channel worsens, a fetch decision (a_t = 2) is preferred.
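For a tabular MDP, one slot of Algorithm 1 can be sketched as follows; the matrix shapes (cost E_t of shape (S, A), transitions P_t of shape (S, A, S)) and function name are our own illustration:

```python
import numpy as np

def ovi_step(V_prev, E_t, P_t, K):
    """One slot of OVI: Step 1 runs K rounds of value iteration starting
    from the previous slot's value estimate (temporal correlation),
    Step 2 extracts the greedy policy from the resulting Q function."""
    V = V_prev.copy()
    for _ in range(K):                       # Step 1 (lines 3-7)
        Q = E_t + P_t @ V                    # Q_t(s,a) = E_t + sum_s' P_t V
        V = Q.min(axis=1)
    policy = (E_t + P_t @ V).argmin(axis=1)  # Step 2 (line 8)
    return V, policy
```

Each slot costs O(|S|^2 · |A| · K), matching the complexity analysis below.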
It can be proven that the proposed OVI algorithm converges; the proof is presented in the Appendix. Based on the pseudocode in Algorithm 1, the computational complexity at each slot t is determined by the product of the complexity of one value-function update and the number of iterations. For an MDP problem with |S| states and |A| actions, the computational complexity of one value-function update is O(|S|^2 · |A|). Therefore, the total computational complexity of the OVI algorithm at each slot t is O(|S|^2 · |A| · K). By mapping the vehicle's locations to the corresponding BSs using the prior mobility information, we transform an infinite state space into a finite one, which yields a smaller |S| and a lower algorithmic complexity.

V. NUMERICAL RESULTS AND DISCUSSION
This section provides simulation results to illustrate the performance of the proposed OVI scheme. We consider a highway scenario of length L, where N = 5 BSs are placed evenly along the roadside, as depicted in Fig. 6. The vehicle moves according to the ordered uniform distribution model [37]; that is, the probability distribution of the vehicle's location variable in the t-th timeslot during T_m movement timeslots is given by (32), where f_R(r_t) and F_R(r_t) are the probability density function (pdf) and the cumulative distribution function (cdf) of the uniform distribution of the vehicle location. Furthermore, the joint pdf of the location variables at slots t and t+1 can be derived from the theory of order statistics [38], as in (33). In our simulation, the vehicle's movement trajectory is randomly generated 1000 times according to the mobility model in (32). The segment length and the number of movement timeslots are set to L = 2000 m and T_m = 40, respectively. The number of timeslots T equals the number of vehicle movement timeslots T_m. The slot length is T_0 = 2 s and the average speed of the vehicle is v = L / (T_m × T_0) = 90 km/h. For the task parameters, the number of divided subtasks is set to M = 7. The subtask size is L_m ∈ [8, 12] Mbits, the computation intensity is α_m ∈ [100, 1000] cycles/bit, and the output size of each subtask is w_m ∈ [1, 1.5] Mbits. The task topology is generated following [39]. Given the task topology, the offloading priority in the sequential offloading scenario is determined by a topological sorting algorithm [39], and the corresponding intermediate results I_w(m) are generated according to (1).
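The location pdf in (32) is the t-th order statistic of T_m i.i.d. uniform samples on [0, L]; a small sketch (function name ours) makes the density explicit:

```python
import math

def ordered_uniform_pdf(r, t, T_m, L):
    """Sketch of (32): density of the t-th order statistic of T_m i.i.d.
    Uniform(0, L) samples, used as the vehicle-location pdf at slot t."""
    if not 0 <= r <= L:
        return 0.0
    F = r / L                  # uniform cdf F_R(r)
    f = 1.0 / L                # uniform pdf f_R(r)
    c = math.comb(T_m, t) * t  # T_m! / ((t-1)! (T_m - t)!)
    return c * F ** (t - 1) * (1 - F) ** (T_m - t) * f
```

Integrating the density over [0, L] returns one, which is a convenient check before plugging it into (8).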
We follow the simulation setup for the highway channel detailed in 3GPP TR 36.885 [40]. The vehicle's transmit power is P_u = 23 dBm [33] and its receive power is P_r = 26 dBm [41]. Besides, the uplink and downlink channel bandwidths are set as in [31], and the noise power is N_0 = −114 dBm [33]. On the edge side, the BS transmit power is P_B = 30 dBm [33] and the computing rate of each VEC server is f_c = 10 GHz [31]. On each generated movement trajectory, the transmission rate at each timeslot is randomly generated according to the communication model. Notice that the vehicle movement causes Doppler spread and random signal variations, posing a challenge to vehicular communication. Nevertheless, since the 3GPP TR 36.885 protocol already employs enhanced Demodulation Reference Signal (DMRS) technology to deal with high Doppler effects in vehicular communications, we assume that the impact of Doppler effects has been compensated by the enhanced DMRS technology and does not significantly affect the simulations. The main parameters used in the simulations are summarized in Table III. Note that some parameters may change in different figures.
We compare the performance of the proposed MDP offloading and fetching scheme with three online baseline schemes:

• Adventurous scheme [6]: The system never fetches intermediate results at any slot, to lower the current energy consumption, unless the last task is offloaded. This scheme is widely adopted in the traditional task-indivisible scenario, in which tasks computed at the edge are fetched only after all of them have been processed.

• Conservative scheme [42]: The system always fetches intermediate results at each slot to avoid extra recomputations as much as possible. This is the traditional scheme in the task-divisible scenario.
• Threshold-based scheme [43]: The system weighs offloading against fetching to balance the energy consumption of recomputation and frequent fetches. Specifically, considering that negative fetching decisions may incur more potential energy consumption due to the retransmission of recomputed tasks, the future recomputation energy cost is reflected into the current cost to guide the current decision, referred to as the risk cost C^r_t(s, a). The system makes an online decision to minimize the weighted cost C_t(s, a) = β Ẽ_t(s, a) + (1 − β) C^r_t(s, a), where Ẽ_t(s, a) is the immediate communication cost defined in (26). We set β = 0.5 in the simulation. Additionally, if the last task is offloaded, the system fetches its results. This scheme can be written in a threshold-based form with a dynamic threshold Ē_t(s).

In the above three schemes, an additional constraint is imposed on the decision-making. Considering that the delays under all state-action pairs are available before a decision is made, the standby decision (a = 0) replaces the offloading or fetching decision (a = 1, 2) to avoid wasted energy when the offloading or fetching operation is predicted to time out.
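The threshold-based baseline's weighted decision rule can be sketched in a few lines (the dictionary encoding of per-action costs is our illustration):

```python
def threshold_decision(E_t, C_r, beta=0.5):
    """Weighted-cost rule of the threshold-based baseline:
    C_t(s, a) = beta * E_t(s, a) + (1 - beta) * C_r_t(s, a),
    mixing the immediate communication cost with the risk (future
    recomputation) cost; returns the minimizing action."""
    costs = {a: beta * E_t[a] + (1 - beta) * C_r[a] for a in E_t}
    return min(costs, key=costs.get)
```

When β = 1 the rule degenerates into myopic cost minimization; β = 0.5 balances the two terms, as used in the simulation.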
For comparison, we also compute a lower bound on the expected total energy consumption. We consider an offline case [31], where the cost function E_t and the transition function P_t at each slot are revealed in advance, so J_π can achieve the global optimum. For optimal offline decisions, the system picks the optimal slots for offloading or fetching from all the given slots so that J_π is minimized. The classic backward induction (BI) algorithm [44] is adopted to determine the optimal policy. The algorithm consists of the following three steps: 1) it first sets V_T(s) = min_a Ẽ_T(s, a) for all s ∈ S_end and V_T(s) = H for all s ∈ S_n-end to avoid a selfish offloading policy, as (31) does; 2) it computes V_t(s), s ∈ S, in reverse chronological order from t = T−1 to t = 1 based on (28); 3) it determines the optimal policy according to (29) for t = 1, ..., T−1, and the optimal policy for t = T is π*_T(s) = arg min_{a∈A} Ẽ_T(s, a). For any of the above online and offline policies π on a movement trajectory, the expected total energy consumption J_π is obtained by iteratively updating the Bellman equation [35] in (35) in reverse chronological order from t = T−1 to t = 1, where V_T(s) equals Ẽ_T(s, π_T(s)). The J_π of a simulation trajectory is equal to V_1(s_0), where s_0 is the initial state, fixed to (1, 1, 1, 0), and J_π is averaged over the 1000 simulated trajectories.
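The three BI steps above can be sketched for a tabular finite-horizon MDP (array shapes and names are our own illustration of the offline bound):

```python
import numpy as np

def backward_induction(E, P, S_end, H):
    """Offline lower bound: E[t] is the (S, A) cost at slot t, P[t] is the
    (S, A, S) transition tensor, S_end flags ending states, and H penalises
    non-ending terminal states to rule out the selfish never-offload policy."""
    T = len(E)
    # Step 1: terminal values.
    V = np.where(S_end, E[T - 1].min(axis=1), H)
    policies = [None] * T
    policies[T - 1] = E[T - 1].argmin(axis=1)
    # Steps 2-3: Bellman recursion in reverse chronological order.
    for t in range(T - 2, -1, -1):
        Q = E[t] + P[t] @ V
        policies[t] = Q.argmin(axis=1)
        V = Q.min(axis=1)
    return V, policies   # V holds V_1(s); J_pi = V_1(s_0)
```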

A. Performance Analysis of Proposed OVI for Different Task Parameters
In order to evaluate the performance of the proposed OVI scheme in different task scenarios, three task parameters are examined, i.e., the task size, the number of tasks, and the ratio of output size to task size.
1) Impact of the Task Size: The total energy consumption J_π of the five schemes for different task sizes is shown in Fig. 7. As the average task size increases, the total energy consumption gradually increases, because a larger task size lengthens the transmission delay and thereby incurs more energy consumption. Furthermore, our OVI scheme outperforms the other online schemes with a lower total energy consumption for all task sizes. Its superiority lies in the long-term consideration of energy costs in a real-time environment: the offloading and fetching policy at each slot is determined by minimizing the state value function (30), which accounts for the state transition probabilities derived from the vehicle mobility model and the acquired real-time transmission rates, rather than by myopically optimizing each slot.
As the task size approaches the limit of what can be successfully transferred within T_0, the number of slots available for successful offloading decreases. In this case, our OVI scheme, like the offline scheme, essentially maintains steady growth. In contrast, the threshold-based scheme and the adventurous scheme perform worse: they require frequent recomputations and their energy consumption increases sharply, because the likelihood of moving out grows as the task completion period lengthens. Note that the conservative scheme performs even worse in this limit case: the vehicle does not offload tasks when it cannot fetch the intermediate results, which decreases the number of tasks completed during the T slots and accounts for the apparent significant reduction of its energy consumption in Fig. 7.
2) Impact of the Number of Tasks: The total energy consumption for different numbers of tasks is shown in Fig. 8. Compared with the other online schemes, the total energy consumption of our proposed scheme increases more slowly as the number of tasks M grows. Specifically, the possibility of completing all tasks within a single BS decreases as the number of tasks increases, which significantly increases the energy consumption imposed by frequent move-outs and recomputations for the adventurous scheme. For the conservative scheme, frequent fetches incur greater downlink transmission energy when the number of tasks is small, while for a large number of tasks more recomputation overhead is avoided at a relatively small fetch cost. Therefore, the consumption of the conservative scheme falls below that of the adventurous scheme from M = 5 onward. The threshold-based scheme avoids blind offloading or fetching by compromising between recomputation and communication to a certain extent. However, it incurs an energy loss by focusing excessively on whether to fetch while neglecting the selection of the optimal offloading slots. The superiority of our OVI scheme stems from the joint consideration of the immediate communication cost (the first term in (30)) and the potential recomputation cost in the real-time environment (the second term in (30)); the policy is determined by minimizing their sum.
3) Impact of the Ratio of Output Size to Task Size: Finally, we evaluate the performance of the OVI scheme for different ratios of output size to task size. We consider the general case where the output size w_m is smaller than the task size L_m, and set the upper limit of the ratio to 50% to prevent oversized migration results from causing migration timeouts in the experiment. As can be seen from Fig. 9, our proposed scheme outperforms the other online schemes at all ratios; that is, our scheme adapts to various task scenarios with different output ratios. This is because our scheme takes the fetch energy consumption into account in the cost function (24), and thus adaptively reduces the number of fetches when the fetch energy consumption is large (i.e., for a large output size). This also explains why the consumption of our proposed OVI scheme gradually approaches that of the adventurous scheme as the ratio increases.

B. Performance Analysis of Proposed OVI for Different Numbers of BSs
In Fig. 10, we show the total energy consumption of the proposed OVI scheme for different numbers of BSs. The total energy consumption increases with the number of BSs, because the move-out probability of the vehicle, i.e., the BS handover probability, increases as BSs are deployed more densely at a fixed average vehicle speed, thus incurring a greater recomputation energy cost. Compared with the other schemes, our proposed OVI scheme yields a lower total energy consumption for all numbers of BSs. This is attributed to the potential recomputation cost being taken into account: when the recomputation cost increases due to more frequent handovers, our scheme adaptively increases the number of fetches to reduce the amount of recomputed tasks. In contrast, since the adventurous scheme does not consider the recomputation cost, its total energy consumption grows approximately exponentially with the number of BSs; the conservative and threshold-based schemes consider recomputation costs but ignore the selection of optimal offloading timeslots, as stated before. We conclude that our OVI scheme performs better in the random move-out scenario, especially with dense BS deployments.

C. Performance Analysis of Proposed OVI for Different Variation Levels of SNR
In Fig. 11, we show the total energy consumption of the proposed OVI scheme with respect to the variation of the signal-to-noise ratio (SNR), considering the highly dynamic characteristics of the channel when the vehicle is moving. We characterize the variation level of the SNR by the standard deviation of the shadowing, a large-scale propagation effect. Note that we keep the average vehicle speed constant to avoid interference from a varying move-out probability in the simulation. Also, the effect of path loss on the channel is essentially consistent across experiments due to the fixed mobility model. We observe that for all schemes, the total energy consumption is insensitive to the shadowing standard deviation for ξ < 10, while it begins to increase for ξ > 10. The reason is that, when ξ is small, the channel fluctuates only lightly and can guarantee the task transmission and result fetching; when ξ becomes large, the transmission rate may suffer severe degradation, leading to more fetching failures and more recomputed tasks. The gain in transmission rate from favorable channel fluctuations is not sufficient to offset the negative impact of recomputations. Nevertheless, the total energy consumption of our proposed scheme increases at a lower rate than that of the online benchmark schemes, thanks to its failure-aware migration mechanism that balances the fetching energy and the recomputation energy.

Fig. 12 shows the influence of the average vehicle speed on the total energy consumption. In the scenario of intermittent computation and intermittent communication, a variation of the average vehicle speed induces changes in both the move-out probability and the channel. We can see that the total energy consumption increases with the average speed for all schemes, due to the growth in recomputed tasks. Compared with the other online schemes, our proposed scheme has a lower energy consumption at all vehicle speeds, especially in high-speed scenarios, which imply a higher move-out probability and faster channel changes. This is because our scheme takes into account both the fetch cost and the potential recomputation cost, and adaptively adjusts the offloading and fetching decisions to channel variations and move-out possibilities by minimizing the joint costs.

VI. CONCLUSION
In this article, we have studied the problem of dependency-aware computation offloading and service migration in scenarios without backhaul. We have developed a novel dependency-aware indirect migration scheme and have jointly optimized the offloading and fetching decisions based on a time-varying MDP model. A general expression of the time-varying transition probabilities has been derived to characterize the dynamics of intermittent offloading and fetching under temporally varying vehicular mobility patterns and channel qualities. To solve the MDP problem with both time-varying transitions and immediate costs, we have designed an online algorithm, called OVI, based on an online implementation of value iteration. Taking both the immediate communication cost and the potential recomputation cost into consideration, our proposed algorithm optimizes the energy consumption while satisfying the task dependency constraints. Simulations have shown that the proposed OVI algorithm achieves superior energy performance compared with the baseline online schemes, especially when the vehicle mobility is high.

APPENDIX PROOF OF CONVERGENCE FOR THE OVI ALGORITHM
The proposed online value iteration (OVI) algorithm can converge to the optimal value at each slot t.We first prove the convergence of a general value iteration algorithm and then prove the convergence of the proposed OVI algorithm based on this convergence theorem.
Theorem 1 (Convergence theorem for value iteration): For the value iteration in (36), U^k(s) converges to the unique value U*(s) as k → ∞.
The above proof establishes the existence of the fixed point; we next prove its uniqueness by contradiction.
Assume that U and V are both fixed points with U ≠ V. Then (40) holds, and according to (38) we can conclude (41). From the contradiction between (40) and (41), the hypothesis is false; that is, the fixed point V* is unique.
For the proposed OVI algorithm, K iterations of the following value iteration are performed at each slot t.
V^k_t(s) = min_{a∈A} [ E_t(s, a) + Σ_{s'∈S} P_t(s' | s, a) V^{k−1}_t(s') ].  (42)

The value iteration in (42) is a special case of the general value iteration in (36), with the reward function r(s, a, s') replaced by the negative cost −E_t(s, a). Therefore, Theorem 1 guarantees that the algorithm converges to the optimal value as K approaches infinity. In practical simulations, since P_t(s' | s, a) is sparse, i.e., it equals zero for most state-action-state triples (s, a, s'), a small value of K suffices for convergence to the optimal value.
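The convergence claimed by Theorem 1 is easy to observe numerically. The sketch below (a toy MDP of our own, with an absorbing zero-cost ending state as in the paper's model) runs repeated Bellman updates and records the sup-norm gap between successive iterates, which shrinks to zero:

```python
import numpy as np

def value_iteration_fixed_point(E, P, iters=200):
    """Numerical illustration of Theorem 1: repeated Bellman updates
    drive V^k to a fixed point. E is (S, A), P is (S, A, S)."""
    V = np.zeros(E.shape[0])
    gaps = []
    for _ in range(iters):
        V_new = (E + P @ V).min(axis=1)        # one update of (42)
        gaps.append(np.abs(V_new - V).max())   # sup-norm contraction gap
        V = V_new
    return V, gaps
```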

Fig. 2. Example task and the corresponding intermediate results for each subtask. (a) A task topology diagram describing the dependencies among subtasks. (b) The sequential offloading process graph with intermediate results marked on the dotted line. We abbreviate Σ_{i=j,k,...} w_i as w_{j,k,...} for brevity.

Fig. 5. Exemplary intermittent offloading and fetching process under the indirect migration design.

Fig. 8. Total energy consumption for different numbers of tasks.

Fig. 9. Total energy consumption for different ratios of output size to task size.

Fig. 10. Total energy consumption for different numbers of BSs.

Fig. 11. Total energy consumption for different levels of channel variation.


Fig. 12. Total energy consumption at different speeds of the vehicle.

TABLE I. Definitions of Notations.