A Cost-Quality Beneficial Cell Selection Approach for Sparse Mobile Crowdsensing With Diverse Sensing Costs

The Internet of Things (IoT) and mobile techniques enable real-time sensing for urban computing systems. By recruiting only a small number of users to sense data from selected subareas (namely, cells), sparse mobile crowdsensing (MCS) emerges as an effective paradigm to reduce sensing costs for monitoring the overall status of a large-scale area. The current sparse MCS solutions reduce the sensing subareas (by selecting the most informative cells) based on the assumption that each sample has the same cost, which is not always realistic in the real world, as the cost of sensing in a subarea can be diverse due to many factors, e.g., the condition of the device, location, and routing distance. To address this issue, we proposed a new cell selection approach consisting of three steps (information modeling, cost estimation, and cost-quality beneficial cell selection) to further reduce the total costs and improve the task quality. Specifically, we discussed the properties of the optimization goals and modeled the cell selection problem as a solvable biobjective optimization problem under certain assumptions and approximations. Then, we presented two selection strategies, i.e., the Pareto optimization selection (POS) and generalized cost-benefit greedy (GCB-GREEDY) selection along with our proposed cell selection algorithm. Finally, the superiority of our cell selection approach is assessed through four real-life urban monitoring data sets (Parking, Flow, Traffic, and Humidity) and three cost maps (independent identically distributed with dynamic cost map, monotonic with dynamic cost map, and spatial-correlated cost map). Results show that our proposed selection strategies POS and GCB-GREEDY can save up to 15.2% and 15.02% sample costs and reduce the inference errors to a maximum of 16.8% (15.5%) compared to the baseline-query by committee (QBC) in a sensing cycle. The findings show important implications in sparse MCS for urban context properties.


I. INTRODUCTION
THE RAPID development of the Internet of Things (IoT) and mobile computing technologies [1], [2] promotes the emergence of intelligent, open, and large-scale sensing mechanisms, which allow citizens to effectively collect and share real-time information and enable innovative urban computing solutions to tackle city-level challenges, such as carbon emission [3], noise [4], traffic congestion [5], and infrastructure status [6]. With widely adopted sensor-rich smartphones, mobile crowdsensing (MCS) [7], [8] plays an increasingly important role in urban computing for addressing various urban-scale monitoring needs. To ensure high-quality sensing services, MCS systems often require a large number of mobile users to satisfy a high coverage ratio (the quality metric) [9]-[11], which is often expensive and unrealistic when budgets and the number of participants are limited. Since sensing maps usually have a low-rank feature, researchers proposed to use compressive sensing (CS) [12] techniques to collect data from only a few subareas and then to deduce the missing data of the unsensed cells by exploiting the inherent correlation of sensing data. In this way, Wang et al. [13] proposed sparse MCS to reduce the number of required samples while still meeting the data quality predefined by the MCS organizers.
In sparse MCS, one important issue is cell selection: the organizer needs to decide where and when to collect sensed data from mobile users. Since the data of different MCS systems may involve diverse spatiotemporal correlations, it is a nontrivial task to design proper cell selection strategies. We reviewed the following methods used by existing sparse MCS studies. The first one is based on query by committee (QBC), which selects the next salient cell to sense by calculating the uncertainty of the missing data in unsensed cells. While QBC only considers the subarea that is the most uncertain at that moment [14], Liu et al. [15], [16] proposed a deep Q-network-based cell selection strategy, which can approximate the globally optimal strategy given sufficient training data. Different from the above-mentioned approaches, Xie et al. [17] proposed a bipartite-graph-based sensing scheduling scheme to actively determine the sample locations, which is suitable for linear systems and requires knowledge of the matrix rank in advance; thus, it is hard to apply this method to real-world nonlinear systems. More importantly, these methods only aim to maximize the informativeness, without considering the diversity of sample costs when selecting subareas for recovering the unsensed values.
To the best of our knowledge, the state-of-the-art sparse MCS techniques assume that the sensing cost is constant spatially and temporally. However, in practical MCS activities, an activity organizer has to consider the diversity of sensing costs for several reasons: 1) the sensors possessed by mobile users are inherently diverse, and the measurement accuracy largely depends on the sensors; generally, data reports with high precision should be rewarded more; 2) the cost of reporting sensed data to the organizer varies with the network condition, the distance to the nearest cell tower, the cellular data plan, or other concurrent activities on the device [18]; and 3) prior work also found evidence that the final cost may be affected by the subjective perception of participants; for instance, a user would ask for a higher reward when running out of battery [18]. In brief, different conditions can result in vastly different costs in crowdsensing activities. Therefore, this article aims to improve the existing MCS solutions by reducing the sample costs and inference errors (i.e., the quality metric for sparse MCS systems) with explicit incorporation of cost diversity into the cell selection strategies.
To realize the research target of this article, an effective cell selection strategy is crucial, since a cell with low cost might run counter to the need of collecting more information for inferring the missing data. For instance, if we have quantified the informativeness and sample cost in all subareas (shown in Fig. 1), one naïve approach is to always select the sample locations with the lowest cost, which will inevitably result in poor recovery of the missing data; in other words, the predefined data quality requirement cannot be satisfied. Another naïve approach is to simply divide the informativeness by the sample cost, but it may fail when one of the two factors dominates the other. Suppose instead that we have an elaborate selection strategy that selects more informative subareas on the premise of reducing, or at least not increasing, the overall cost. Since more cells can then be sensed with real values, the inference error is also reduced. For instance, if we set the cost budget of one selection to 120 (e.g., the unit of cost is CNY), a selection containing the top-left cell and the bottom-left cell in Fig. 1 is evidently better than the selection of the bottom-right cell because more information is obtained. However, such a selection mechanism is difficult to design. First, a quantitative model is needed to model the information of the selected subareas for inferring the missing data. Second, a proper sample cost estimation method is required for accurate estimation of the sample cost. Finally, proper strategies to find cells that are low cost and yet collect sufficient information are the most important requirement for capturing the underlying data structure to enable accurate recovery.
To tackle the above-mentioned challenges, this article contributes the following.
1) We select four city-level data sets from various application domains and verify the inherent low-rank and spatial-temporal correlation features in these urban data sets, which form the basis of the research in this article.
2) To the best of our knowledge, this is the first work providing comprehensive modeling of cost diversity in sparse MCS and considering a dynamic cost budget in the biobjective cell selection problem. Incorporating such costs enhances the practicability of sparse MCS methods. A novel cell selection approach, consisting of three steps: a) information modeling; b) cost estimation; and c) cost-quality beneficial selection, is proposed in this article. In particular, three important cost factors, i.e., the routing cost, measurement cost, and perception cost, are discussed in detail and used to formulate a cost estimation function. Finally, the cell selection process is formulated as a biobjective optimization problem with the target of maximizing the informativeness of the selected subareas for recovery while minimizing the total sample cost. To solve the optimization, we propose two selection strategies, namely, generalized cost-benefit greedy (GCB-GREEDY) selection and Pareto optimization selection (POS).
3) We provide extensive discussions on the objective functions of our cell selection optimization problem. Note that submodularity is an attractive property encoding a natural diminishing-returns condition, but submodularity and monotonicity do not hold under all conditions. Since the subset selection problem is NP-hard, our cell selection problem is also NP-hard and only solvable under certain assumptions (sufficient participants in each subarea, so that routing costs can be ignored). Under the solvable condition, we formalize the cost-quality beneficial cell selection algorithm for an MCS task.
4) We discuss the potential of applying the cost-quality beneficial sparse MCS approach to urban context computing and actuation. By integrating the power of crowds, the urban context can be sensed in a cost-aware and sparse way over a large-scale region. By leveraging the wisdom of crowds, the efficiency of smart city systems can be optimized, e.g., for the rebalancing problem of shared bikes in modern cities.

5) We evaluate the performance of our proposed cell selection strategies using real-world cases. Taking QBC as the information modeling method and generating three types of cost maps [independent identically distributed (i.i.d.) with dynamic cost map, monotonic with dynamic cost map, and spatial-correlated cost map] based on the cost estimation function, we conduct extensive experiments on the Parking, Flow, Traffic, and Humidity monitoring data sets. The results verify the effectiveness and feasibility of our proposed approach and strategies. Taking humidity sensing as an example, our proposed strategies outperform the baselines QBC and SIMP-GREEDY by lowering sample costs by up to 15.2% and 8.5% while reducing inference errors by up to 10.1% (3.8%). Similar tendencies are observed in the other three sensing tasks.

The remainder of this article is organized as follows. We first review the related works in Section II. Then, the research problem is formulated and the system model is explained in Section III. Next, our three-step cell selection approach is presented in Section IV. Subsequently, we analyze the potential of cost-quality-aware sparse MCS-assisted urban sensing and actuation in Section V. Finally, we evaluate the performance of the proposed strategies in Section VI and conclude this article in Section VII.

II. RELATED WORKS
In this section, we review the related work from three perspectives: 1) multiobjective task assignment methods in MCS; 2) sparse MCS; and 3) techniques in subset selection.

A. Multiobjective Task Assignment Methods in MCS
Quality, coverage, and cost are among the main factors affecting the performance of the algorithms designed for the task allocation process of MCS. Therefore, recent research has aimed to formulate the task allocation problem from a multiobjective perspective considering the aforementioned factors. Liu et al. [19] discussed the existing strategies to reduce the resource cost and improve the Quality of Service (QoS). Specifically, Xu et al. [20] proposed a compressive crowdsensing (CCS) framework to realize reduced amounts of collected data and acceptable levels of overall accuracy at the same time. The limitation of this work is the assumption that the structure and relationships within the data and phenomenon remain unchanged from what is observed in historical data. To balance the signal quality and the crowdsourcing cost, He and Shin [21] proposed an incentive mechanism based on Bayesian CCS; the contribution of this work is the link between missing value inference and confidence estimation and stopping. Differently, Meng et al. [22] focused on the unevenly distributed user observations over the monitored entities and designed an integrated framework to realize truth discovery from redundant and sparse data. Xia et al. [23] designed a mobile-edge computing architecture to select the minimal set of users in each time cycle with maximized user spatiotemporal coverage while keeping the predefined data requirement. To provide a unified task assignment design, UniTask [24] optimized the overall system utility by jointly considering the representative MCS performance metrics (i.e., coverage, latency, and accuracy). Focusing on vehicle-based crowdsourcing, Zhang et al. [25] formulated the worker recruitment problem as a biobjective optimization model with respect to query reliability and sensing coverage. Different from the above-mentioned studies, we follow the research line of sparse MCS and focus on providing comprehensive modeling of nonuniform sensing costs. The task assignment problem is simplified as a biobjective cell selection process (i.e., informativeness and sensing cost). We also discuss the solvable conditions and assumptions of our model. Moreover, we propose two heuristic cell selection strategies and evaluate their performance on four city-level real-world data sets.

B. Sparse MCS
As almost all monitored physical conditions are continuous, sensory data generally exhibit strong spatial-temporal correlation; thus, the environment ground-truth matrix [26] often has a low-rank feature.
With this insight, Wang et al. [27] proposed to use the overall inference error, rather than the sensing area coverage, as the data quality metric. In such MCS tasks, CS has become the de facto choice of the inference algorithm. Then, Wang et al. [13] defined this specific MCS problem as sparse MCS and discussed the challenges as well as the opportunities from three aspects. Next, Wang et al. [28] extended sparse MCS to dynamically select a small set of subareas for sensing in each timeslot for multitask scenarios. Later, Wang et al. [29] also added a privacy protection mechanism into sparse MCS. In the above-mentioned sparse MCS works, the entropy-based cell selection only chooses the cell that is the most uncertain at that moment and ignores whether the current selection would help future inference. Thus, Liu et al. [15] and Wang et al. [30] proposed deep reinforcement learning-based cell selection methods for sparse MCS, which have been shown to narrow the gap with the optimal solution.
Different from the previous works, Xie et al. [17] found that the matrix completion method saves more samples than the vector-based CS methods. They proposed an active sparse MCS scheme that includes a bipartite-graph-based sensing scheduling scheme to actively determine the sampling locations in each upcoming time slot, and a bipartite-based matrix completion algorithm to robustly and accurately recover the unsensed data in the presence of sensing and communication errors. Recently, Liu et al. [31] combined the deep reinforcement learning-based cell selection method with a practical user recruitment model to further improve data inference. In this article, we also focus on the cell selection process and aim to incorporate cost diversity into our cell selection strategies.

C. Techniques in Subset Selection
It is a fundamental problem to select the optimal subset from a large set of variables in various learning tasks, such as feature selection, sparse regression, and dictionary learning [32]. Obviously, our research problem of selecting partial sample locations and inferring missing data in the unsensed cells can also be transformed into a subset selection problem. The subset selection problem is, however, generally NP-hard [33].
To address this problem, previous techniques can mainly be categorized into two branches: 1) greedy algorithms and 2) convex relaxation methods. Generally, greedy algorithms iteratively select or abandon the one instance that makes the current objective optimal [34]. The greedy nature of the generalized greedy algorithm guarantees an efficient fixed runtime but limits its performance at the same time. Convex relaxation methods usually replace the set size constraint (i.e., the l0-norm) with convex constraints and then find the optimal solutions to the relaxed problem, which, however, can be distant from the true optimum. In recent years, Qian et al. proposed an evolutionary algorithm, namely, POSS, which treats the subset selection problem as a biobjective optimization problem that optimizes some specific criterion and the subset size simultaneously. POSS is an anytime algorithm that can use more time to find better solutions.
Note that Huang et al. [35] presented the first work leveraging the POSS algorithm to solve bicriteria feature acquisition in low-rank active matrix approximation problems. Inspired by their work, we formulate our cell selection process as a subset selection problem with two objectives as well (maximizing the information of the selected cells while minimizing the sample costs). However, in this article, we propose two heuristic strategies, not only the POS strategy. Also, the cost budget in [33] and [35] is constant, whereas the cost budget of a selection in this article is dynamic because we set it to the largest cost value of the unsensed cells plus one. Moreover, we also adjust the definition and form of the information objectives.

III. SYSTEM MODEL AND PROBLEM STATEMENT
In this section, we first present the three-stage model describing the activity on a sparse MCS platform: cell selection, quality assessment, and data inference. We then briefly introduce the data inference method, i.e., CS, and the quality assessment method, i.e., leave-one-out statistical analysis (LOO-SA). Finally, we discuss the cost-quality beneficial cell selection solution in sparse MCS using a running example. Table I shows the main concepts and notations used in this article.

A. System Model
A typical MCS sensing scenario begins with a sensing task launched by its organizer to obtain fine-grained urban context results, e.g., humidity over a large-scale target area for a long time, as shown in Fig. 2. To provide high-quality sensing services, the target sensing area is divided into m subareas according to the organizer's requirement. In the meantime, the whole sensing campaign is split into n equal sensing cycles. For instance, the organizer needs to update the full humidity sensing map once every hour (the sensing cycle), and in each sensing cycle, the data quality requirement is that the mean absolute error (MAE) for the whole area should be less than 1.5% (humidity). To meet the data quality requirement under the constraint of the cost budget B^one_cost in each selection, the organizer needs to carefully select subareas so as to make a tradeoff between the informativeness (i.e., maximizing the information to reduce the inference error) and the sensing cost (i.e., reducing the total cost) of a subarea. After the quality requirement is met in a sensing cycle, the humidity values of the remaining cells are deduced based on the sensed humidity values of the selected cells. Through this crowd-powered subset sensing approach, the organizer can obtain sufficient data based on the sensing requirement and costs.

B. Data Inference
In sparse MCS, we often leverage the historical and currently sensed data to infer the data of the remaining unsensed cells. CS, as a good choice for inferring the full ground data matrix from partially collected sensing values with convincing theoretical guarantees, has shown its effectiveness in several scenarios [26], [36]. We recover the full ground data matrix Ĝ_{m×n} based on the low-rank property as follows:

$$\min \ \operatorname{rank}(\hat{G}) \quad \text{s.t.} \quad \hat{G} \circ S = G \circ S$$

where ∘ denotes the elementwise multiplication and each entry S_ij denotes whether cell i at cycle j is selected for sensing; thus, S_ij equals 0 or 1. Based on the singular value decomposition (SVD) and CS theory, i.e., Ĝ = LR^T, we convert the above nonconvex optimization problem as follows:

$$\min \ \lambda \left( \|L\|_F^2 + \|R\|_F^2 \right) + \left\| S \circ \left( LR^T \right) - S \circ G \right\|_F^2.$$

This optimization changes the rank minimization problem (minimizing the rank of Ĝ) to minimizing the sum of the Frobenius norms of L and R, where λ allows a tunable tradeoff between rank minimization and accuracy fitness. To obtain the optimal Ĝ, an alternating least-squares procedure [37] is leveraged to estimate L and R iteratively.
Moreover, strong spatial-temporal correlations can be discovered in the sensing data [38], [39]. Thus, adding explicit spatiotemporal correlations into CS, the optimization function can be further formulated as

$$\min \ \lambda_r \left( \|L\|_F^2 + \|R\|_F^2 \right) + \left\| S \circ \left( LR^T \right) - S \circ G \right\|_F^2 + \lambda_s \left\| \mathbf{S} \left( LR^T \right) \right\|_F^2 + \lambda_t \left\| \left( LR^T \right) \mathbf{T}^T \right\|_F^2$$

where **S** and **T** are the spatial and temporal constraint matrices, respectively, and λ_r, λ_s, and λ_t control the tradeoff between the different correlations. Concerning the spatial and temporal constraint matrices, interested readers are referred to [26].
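To make the inference step concrete, the following is a minimal sketch of the alternating least-squares completion described above, with a ridge-regularized row update for L and R; the function name and parameters are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def als_complete(G, S, rank=5, lam=0.1, iters=50):
    """Recover a low-rank matrix from partial observations.
    G: m x n data matrix (unobserved entries arbitrary).
    S: m x n 0/1 selection mask (1 = sensed).
    Minimizes ||S*(L R^T) - S*G||_F^2 + lam*(||L||_F^2 + ||R||_F^2).
    """
    m, n = G.shape
    rng = np.random.default_rng(0)
    L = rng.standard_normal((m, rank))
    R = rng.standard_normal((n, rank))
    I = lam * np.eye(rank)
    for _ in range(iters):
        # Update each row of L using only the observed entries in that row.
        for i in range(m):
            idx = S[i] == 1
            if idx.any():
                Ri = R[idx]
                L[i] = np.linalg.solve(Ri.T @ Ri + I, Ri.T @ G[i, idx])
        # Update each row of R symmetrically, per observed column entries.
        for j in range(n):
            idx = S[:, j] == 1
            if idx.any():
                Lj = L[idx]
                R[j] = np.linalg.solve(Lj.T @ Lj + I, Lj.T @ G[idx, j])
    return L @ R.T  # the inferred full matrix G_hat
```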

C. Quality Assessment
In this article, LOO-SA is used to assess the inference quality. First, a leave-one-out resampling mechanism is implemented to obtain a set of (inferred, true) data pairs. Then, by comparing the inferred data with the corresponding true collected data, Bayesian inference or Bootstrap analysis is leveraged to assess whether the current data quality satisfies the predefined (ε, p)-quality requirement.
Leave-one-out (LOO) is a popular resampling method for measuring the performance of many prediction and classification algorithms. Suppose that we have collected sensing data from m′ out of all the m cells. The idea of LOO is that, each time, we leave one observation out and infer it based on the remaining m′ − 1 observations by using CS or interpolation algorithms. After running this process for all m′ observations, we obtain m′ predictions accompanying the m′ true observations

$$\left\{ \left( \hat{g}_1, g_1 \right), \left( \hat{g}_2, g_2 \right), \ldots, \left( \hat{g}_{m'}, g_{m'} \right) \right\}.$$

Based on the m′ (inferred, true) data pairs, we can use Bayesian inference or Bootstrap analysis to estimate the probability distribution of the inference error ε, e.g., the MAE, to help quality assessment. In fact, satisfying the (ε, p)-quality can be converted into calculating the probability of ε_k ≤ ε, i.e., P(ε_k ≤ ε), for the current cycle k. If P(ε_k ≤ ε) ≥ p holds for every cycle k, then the (ε, p)-quality is expected to be satisfied as a whole. In this article, two statistical analysis methods, i.e., Bayesian inference [40] and Bootstrap analysis [41], are leveraged for estimating P(ε_k ≤ ε). Different from Bayesian inference (which requires the error metric to be normally distributed), the advantage of Bootstrap is that no assumption on the distribution of the observations needs to be made. Detailed information about the Bayesian and Bootstrap analyses can be found in [13].
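As an illustration, the following is a minimal Bootstrap-style sketch of this quality check, taking the MAE as the error metric; the resampling count and function names are our own choices, not the paper's.

```python
import numpy as np

def satisfies_quality(inferred, true, eps, p, n_boot=1000, seed=0):
    """Estimate P(MAE <= eps) by bootstrap over the LOO (inferred, true)
    pairs and compare it against the required confidence p."""
    rng = np.random.default_rng(seed)
    errors = np.abs(np.asarray(inferred) - np.asarray(true))
    maes = np.empty(n_boot)
    for b in range(n_boot):
        # Resample the per-cell absolute errors with replacement.
        sample = rng.choice(errors, size=errors.size, replace=True)
        maes[b] = sample.mean()
    return (maes <= eps).mean() >= p
```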

1) Assumptions:
We follow the basic assumptions in [16] and [27], except for the assumption on the sample cost. The unsound assumption in previous sparse MCS work is that each sample has the same cost: the goal is then simply to reduce the number of samples while achieving good recovery accuracy. Since the cost of obtaining a sample depends highly on the location, time, sensing device type, condition of the device, and many other factors, we break this assumption and further assume that the cost of obtaining a specific sample in each subarea and cycle is diverse. This cost-diversity assumption makes the cell selection problem more practical and allows us to make a tradeoff between the sample cost and the informativeness of a spatial-temporal cell.
2) Problem Formulation: Based on the above system model, assumptions, and the brief introduction to CS and LOO-SA, we define our research problem with a focus on cell selection. Given an MCS task with m cells and n cycles, the sensing budget B^all_cost, a sensing matrix inference algorithm R, an information estimation function f_1(s), and a sample cost estimation function f_2(s), the MCS organizer attempts to select a subset of the most informative sensing cells under the task budget during the whole process (i.e., use the minimal cost to find the subset of cells with maximal information) while satisfying the (ε, p)-quality:

$$\min_{s \in \{0,1\}^{mn}} \left( -f_1(s),\ f_2(s) \right) \quad \text{s.t.} \quad f_2(s) \le B^{all}_{cost}, \quad P(\varepsilon_k \le \varepsilon) \ge p,\ \forall k \in \{1, \ldots, n\}. \tag{5}$$

This optimization problem maximizes the information estimation function f_1(s) and minimizes the cost estimation function f_2(s) simultaneously; here, we maximize the information of the selected cells by minimizing its negative −f_1(s). The overall sensing error ε_k and the error metric (MAE) are defined in Table I. Note that we use a Boolean vector s ∈ {0, 1}^{mn} to replace the cell selection matrix S_{m×n}, where the (i + m × j)th bit s_{i+m×j} = 1 means that the entry S_ij equals 1. In this article, we do not distinguish between s ∈ {0, 1}^{mn} and its corresponding representation S_{m×n}. As we cannot foresee the ground-truth matrix G_{m×n} for an MCS task, it is impossible to obtain the optimal cell selection matrix S_{m×n} in reality. To overcome this difficulty, we propose a novel cell selection method, which leverages an iterative process to select sensing cells in each cycle, with details elaborated in the following section.
3) Use Case Study: Fig. 3 shows the basic idea of our proposed cell selection process in a sensing cycle. Suppose the target area contains six cells and the sixth sensing cycle has just started; at first, no sensing data are collected in the sixth cycle [Fig. 3(1)]. Our proposed cost-quality cell selection method (taking the POS strategy as an example) works as follows.
1) Under the given cost budget (e.g., 2 CNY) of a selection, our strategy lists all possible solution combinations (here, there are seven possible solutions within the cost budget: the single cells 1-6 and the combination of cells 1 and 6), compares the solutions by considering the tradeoff between information and sample costs, selects the first two salient cells (cell 1 and cell 6) under the budget, and allocates the sensing tasks to two mobile users in cell 1 and cell 6, respectively. The mobile users perform the sensing tasks and return the sensing data to the MCS server [Fig. 3(2)].
2) Then, given the collected sensing data, the quality assessment module decides whether the data quality satisfies the predefined (ε, p)-quality requirement. If it does not, we have to select more cells for sensing [cell 5 is chosen in Fig. 3(3)]. In this way, our strategy continues to allocate tasks to new cells and collect sensing data [as illustrated in step 1); more details are introduced in the cell selection section] until the quality of the collected sensing data satisfies the predefined requirement.
3) Given the collected sensing values, the CS module infers the values of the unsensed cells.

IV. COST-QUALITY BENEFICIAL CELL SELECTION IN SPARSE MCS
In this section, we will introduce the information modeling, sample cost estimation, and cost-quality beneficial cell selection in sequence.

A. Information Modeling
In traditional active learning, if the model is less certain about its prediction on an instance, the instance is considered more informative for improving the model and more likely to be selected for label querying [35]. Inspired by this idea, we can leverage a reward criterion or a value criterion [42] to estimate the informativeness of an entry, i.e., a spatiotemporal cell in the matrix G. The challenge here is how to quantify the informativeness of an instance in a subarea for recovering the entries in other subareas. The entropy-based method (e.g., QBC) and the mutual-information-based method (e.g., Gaussian-process-based mutual information) belong to the reward criteria, which indicate what is good in an immediate sense, while the value-function-based method (the reinforcement learning-based method) considers what is good in the long run. Though the mutual-information-based and value-function-based methods can better quantify which cell is more informative, they require sufficient historical data to compute the informativeness of a cell and cannot be applied to a fresh task when insufficient data have been acquired. Therefore, in this article, we take the simple but general method, i.e., QBC, as the information estimation method in the evaluation section.
QBC originates from such an idea that if the variance of an entry is large, it implies that the entry cannot be certainly decided by the algorithm, and thus may contain more useful information to recover the estimated full sensing matrix. The QBC framework quantifies the prediction uncertainty based on the level of disagreement among an ensemble of matrix completion algorithms. Specifically, a committee of matrix completion algorithms is applied to the partially observed data matrix to impute the missing values. The variance of prediction (among the committee members) of each missing entry is taken as a measure of uncertainty of that entry. In this article, the committee consists of several commonly used inference algorithms, including CS, STCS, K-nearest-neighbors (KNNs), and SVD.
Assume that the committee includes a set of L inference algorithms. In a sensing cycle j, the cells already selected and measured in this cycle are denoted by S_j (S_j ⊆ V). The sensor measurements at these selected locations are represented by χ_{S_j} = x_{S_j}. By using the lth inference algorithm, we have Ĝ_l(:, j) = R_l(x_{S_j}). For an unsensed cell υ ∉ S_j, the informativeness of this cell can be formulated as

$$I_{\upsilon,j} = \frac{1}{L} \sum_{l=1}^{L} \left( \hat{G}_l(\upsilon, j) - \bar{G}(\upsilon, j) \right)^2 \tag{6}$$

where I_{υ,j} represents the information of the unsensed cell υ in time cycle j; Ḡ(υ, j) denotes the average value predicted by the committee; and Ĝ_l(υ, j) is the value predicted by the lth matrix completion algorithm for cell υ.
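A minimal sketch of this committee-variance computation is given below, assuming each committee member is a callable that imputes a full column of the sensing matrix from the partial observations (the interface is our own illustrative choice).

```python
import numpy as np

def qbc_informativeness(x_sensed, sensed_idx, m, committee):
    """Score each unsensed cell by the variance of the committee's predictions.
    x_sensed: values measured at the selected cells in this cycle.
    sensed_idx: indices of the selected cells.
    m: total number of cells.
    committee: list of inference functions, each mapping the partial
               observations to a full length-m column of predictions.
    """
    # Each row holds one committee member's imputed column.
    preds = np.stack([algo(x_sensed, sensed_idx, m) for algo in committee])
    info = preds.var(axis=0)          # per-cell disagreement, as in Eq. (6)
    info[sensed_idx] = -np.inf        # already-sensed cells need no score
    return info
```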

B. Cost Estimation
Since the cost of obtaining a specific sample in practical MCS systems depends highly on the location, time, condition of the device, human expectation, density of participants, moving distance, and many other factors, we consider cost diversity in the cell selection process. Different from the cost modeling in [18], in the following, we first discuss broader types of costs occurring in MCS systems, including the routing cost, measurement cost, perception cost, and their combination. Then, we introduce a new cost function to estimate the sensing cost. Finally, we present several challenges of incorporating costs into sparse MCS, which force us to make some compromises in the implementation of cost estimation.
1) Cost Factors: In practice, different types of costs often occur in MCS systems, including but not limited to: 1) routing cost; 2) measurement cost; 3) perception cost.
Routing Cost: Consider a scenario in which the mobile users in some cells are insufficient, so both the cost of moving from the present location to the target location and that of making measurements in the subareas need to be considered. We use the cost function c_R(V) to denote such a cost, defined as the cost of the shortest walk that visits each selected subarea in V at least once. Note that c_R(V) is generally nonsubmodular and cannot be computed exactly in polynomial time; this will be discussed in detail in the following section.
Measurement Cost: We use c_υ to denote the cost of collecting measurements in each subarea υ ∈ V. Generally, this kind of cost consists of the energy consumption and data consumption on the sensing devices as well as the data management cost. Devices consume energy in both measuring and reporting a sample, e.g., locating a GPS signal and reporting the position; this cost depends on the location as well as the status of the device. The reporting cost may depend on the network, i.e., WiFi, 3G, or 4G, the signal strength, the network variability, and the congestion level, and the reporting may incur cellular data costs when using cellular networks. Also, storing the submitted data in the cloud or network and controlling the data quality incur a management cost.
Perception Cost: Finally, mobile users may have different perceptions of a given cost. In other words, this cost is a subjective evaluation of the provided services. For example, a user carrying a smartphone with a full battery may not regard the energy consumption for GPS locating as a large cost, whereas other users may be more sensitive to the same amount of energy usage when their smartphone battery is running out. Such perception-based cost adjustments c_b should be considered, as they are important to the user experience in MCS applications.
2) Cost Functions: Here, we introduce a cost function to estimate the value of sample costs when the actual cost is hard to acquire. Since it is assumed that sufficient participants exist in each subarea waiting for recruitment in the previous sparse MCS, the routing cost across different cells will thus not be included in this article. To estimate the measurement cost, we can conduct outdoor measurements of GPS energy consumption using smartphones at hundreds of subareas across the target region and record the data consumption of the 3G/4G/5G cellular network at those locations in the meantime.
Furthermore, we naturally consider the remaining battery level of a device as a type of "perception cost": the lower the remaining battery, the more valuable it is and the higher the cost that should be assigned. We define c_b = B^{1−b} as the perception-based cost for the remaining battery, where b is the ratio of the remaining battery and B is a constant. In particular, as b goes to 0, the cost grows quickly and approaches B. The intuition is that when b is large, mobile users are not sensitive and the measurement cost dominates; when b is small, users are sensitive and this factor contributes a lot to the final cost. Therefore, we choose a multiplicative combination of the measurement cost (the initial cost function) and the perception cost (a multiplier, forming a dynamic cost function) as the overall sensing cost function c_υ · c_b.
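For concreteness, a small sketch of this overall cost function follows, assuming a static measurement-cost vector as input; the sampled battery ratios are our own illustrative choice.

```python
import numpy as np

def overall_cost(c_measure, b, B=2.0):
    """Overall sensing cost c_v * c_b for a cell (works elementwise on arrays).
    c_measure: static measurement cost of the cell.
    b: remaining-battery ratio in [0, 1] of the recruited user's device.
    B: constant controlling how strongly a low battery inflates the cost.
    """
    c_b = B ** (1.0 - b)      # perception cost: approaches B as b -> 0
    return c_measure * c_b

# Example: a dynamic cost map for m cells in one cycle, with measurement
# costs drawn i.i.d. and battery ratios sampled per (hypothetical) user.
rng = np.random.default_rng(0)
m = 30
c_measure = rng.uniform(1.0, 3.0, size=m)
battery = rng.uniform(0.0, 1.0, size=m)
cost_map = overall_cost(c_measure, battery)
```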
A prior study [18] showed that synthetic cost maps based on an overall cost estimation function are also feasible for performance evaluation when it is hard to conduct practical measurement and estimation. Therefore, we generate three synthetic cost maps based on our proposed cost function: the i.i.d. with dynamic cost map, the monotonic with dynamic cost map, and the spatial-correlated with dynamic cost map. Taking the final dynamic cost map on the Traffic data set as an example, the cost distribution over different subareas over time (in a day) is exhibited in Fig. 4. In the rest of the article, we use CT1, CT2, and CT3 to refer to the i.i.d. with dynamic cost map, the monotonic with dynamic cost map, and the spatial-correlated cost map, respectively. As shown in Fig. 4, the three sampling cost maps present different characteristics (randomness in CT1, monotonicity in CT2, and spatial correlation in CT3), with darker colors indicating larger costs.

3) Challenges of Estimating Costs:
1) It is difficult to estimate costs accurately. Since real costs are hard to obtain, a cost estimation function is often required. Due to the cost diversity, it is hard for a simple cost estimation function to estimate the value of a sample cost accurately. Though in present practice multifactor regression models are trained to estimate the current cost of an operation, with the influence of prior cost observations also considered, such designs are still far from practical. Designing an estimator that gives an entirely accurate estimation of the sensing cost is not the focus of this article; thus, we leverage the synthetic cost maps generated by the cost function c_υ · c_b.
2) It is difficult to estimate certain types of cost, e.g., the routing cost, in polynomial time. Unlike typical additive cost constraints, such route-planning costs are themselves NP-hard to evaluate [43]. Therefore, approximation is necessary in practice to ensure the efficiency of the cell selection algorithm.

C. Discussions on Objective Functions
We first give the definition of submodularity and then discuss whether the objective functions have this property when certain conditions are satisfied. If Ω is a finite set, a submodular function is a set function f : 2^Ω → R, where 2^Ω denotes the power set of Ω, that satisfies the following condition: for every A ⊆ B ⊆ Ω and every x ∈ Ω \ B,

$$f(A \cup \{x\}) - f(A) \ge f(B \cup \{x\}) - f(B).$$

1) Discussion on f_1: Since f_1(s) is the informativeness of the selected cell set, it is nonnegative and nondecreasing when a new element is added. Meanwhile, we notice that f_1 : 2^V → R is a submodular function satisfying the definition. If the informativeness of a cell is estimated by the QBC method, f_1(s) is the entropy of the set of random variables s, and f_1(s) is then a monotone submodular function. However, if the informativeness of a cell is estimated by the mutual-information method or the value-function-based method, then f_1(s) is a nonmonotone submodular function.
2) Discussion on f_2: If the routing cost is not considered, then no matter whether the sensing cost is constant or not, f_2(s) = Σ_{S_ij=1} C_ij is a linear function. Additionally, since C_ij is nonnegative, f_2(s) is a monotone submodular function. If the routing cost is considered and measured by the number of edges (edges formed by the vertices of the subareas), then f_2(s) is a nonmonotone submodular function. If the routing cost is denoted as the cost of the shortest walk that visits each selected subarea, as in this article, then f_2(s) is nonsubmodular and nonmonotone.
The cell selection problem in this article is NP-hard (the proof follows from [32], since it is a subset selection problem) and is sometimes hard to solve when the estimation of the routing cost is considered. Since we have assumed sufficient participants in each subarea (so f_2 is monotone submodular) and f_1 is estimated by QBC (so f_1 is monotone submodular), the cell selection problem is solvable by the following algorithms (Algorithms 1 and 2).
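As a quick numerical illustration of the diminishing-returns condition, the toy check below verifies submodularity by brute force for a linear cost function and a small coverage-style information function; both toy definitions are ours, chosen only to mirror the roles of f_2 and f_1.

```python
from itertools import combinations

def is_submodular(f, ground):
    """Brute-force check of f(A+x) - f(A) >= f(B+x) - f(B) for all A <= B."""
    sets = [frozenset(c) for r in range(len(ground) + 1)
            for c in combinations(ground, r)]
    for A in sets:
        for B in sets:
            if A <= B:
                for x in ground - B:
                    if f(A | {x}) - f(A) < f(B | {x}) - f(B) - 1e-12:
                        return False
    return True

ground = frozenset(range(4))
cost = {0: 1.0, 1: 2.0, 2: 1.5, 3: 0.5}
linear_f2 = lambda s: sum(cost[v] for v in s)    # linear cost: submodular
coverage_f1 = lambda s: len({v % 3 for v in s})  # coverage-style: submodular
print(is_submodular(linear_f2, ground), is_submodular(coverage_f1, ground))
```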

1) Cost-Quality Beneficial Cell Selection Strategy:
With the cost and informativeness estimation methods above, in this section, the diversity of the sample cost is incorporated into the cell selection process, and we propose two selection strategies, namely, GCB-GREEDY and POS, to balance the two objectives at the same time: minimizing the sensing cost and maximizing the informativeness of the collected cells. The detailed cell selection strategies are formulated as follows.

1) Generalized Cost-Benefit Greedy Selection Strategy (GCB-GREEDY):
The cell selection process in the previous sparse MCS can be described as a typical subset selection problem. Generally, the subset selection problem tries to select a subset S_j (salient cells) from the subarea set V with an objective function f_1(S) (the information function) and a constraint on the subset size (selecting one by one) in each cycle j. Therefore, the previous cell selection problem can be formalized as

$$\arg\max_{S_j \subseteq V} f_1(S_j) \quad \text{s.t.} \quad |S_j| \le B_{size} \tag{7}$$

where |·| denotes the size of a set and B_size is the maximum number of selected elements (the stopping criterion is decided by LOO-SA). In a cost-quality beneficial selection, however, the subset size constraint should be transformed into the budget constraint f_2(S_j) ≤ B^one_cost. At the core of the GCB greedy algorithm is the following heuristic: in each iteration k, add to the set S_j an element υ_k such that

$$\upsilon_k = \arg\max_{\upsilon \in V \setminus S_j^{k-1}} \frac{f_1\left(S_j^{k-1} \cup \{\upsilon\}\right) - f_1\left(S_j^{k-1}\right)}{f_2\left(S_j^{k-1} \cup \{\upsilon\}\right) - f_2\left(S_j^{k-1}\right)}$$

where S_j^0 = ∅ and S_j^k = {υ_1, ..., υ_k}. The number of cells in one selection depends on the information and the cost budget. Since the routing cost is ignored in the cost function due to the sufficient-participants assumption, our cell selection problem becomes one of maximizing a monotone submodular function f_1 under a monotone approximate cost constraint f_2. The corresponding GCB greedy algorithm is shown in Algorithm 1; it iteratively selects one subarea υ to sense such that the ratio of the marginal gains of f_1 and f_2 obtained by adding υ is maximized.

Algorithm 1 GCB-GREEDY Selection
Input: A monotone objective function, f_1; a monotone approximate cost function, f_2; the budget constraint, B^one_cost.
Output: The solution S_j ⊆ V with f_2(S_j) ≤ B^one_cost.
1: Let S_j = ∅ and V′ = V;
2: repeat
3: υ* ← arg max_{υ∈V′} [f_1(S_j ∪ {υ}) − f_1(S_j)] / [f_2(S_j ∪ {υ}) − f_2(S_j)];
4: if f_2(S_j ∪ {υ*}) ≤ B^one_cost then S_j ← S_j ∪ {υ*};
5: V′ ← V′ \ {υ*};
6: until V′ = ∅;
7: return S_j.

2) POS Strategy: Inspired by the solutions in [32], the subset selection problem in (7) can be reformulated as the optimization of a binary vector. We introduce a binary vector s ∈ {0, 1}^m to indicate the subset membership, where s_i = 1 if the ith element of V is selected in a sensing cycle and s_i = 0 otherwise. The cell selection problem can then be formulated as a biobjective minimization model

$$\min_{s \in \{0,1\}^m} \left( -\hat{f}_1(s),\ f_2(s) \right), \quad \hat{f}_1(s) = \begin{cases} -\infty, & f_2(s) \ge B^{one}_{cost} + 1 \\ \sum_{i=1}^{m} s_i I_{ij}, & \text{otherwise} \end{cases}, \quad f_2(s) = \sum_{i=1}^{m} s_i C_{ij} \tag{8}$$

where |s| denotes the number of 1s in s; S_ij denotes the entry of the cell selection matrix S_{m×n}; I_ij represents the information of cell i in sensing cycle j; C_ij is the approximate sample cost of cell i in sensing cycle j; B^one_cost is the cost budget of one selection, which is set to the maximal cost value of the unsensed cells (this kind of dynamic cost budget has never been considered before); and f̂_1 is set to −∞ to avoid trivial or overcost solutions. We use the value B^one_cost + 1 instead of B^one_cost in the definition of f̂_1, as this gives the algorithm some lookahead for larger constraint bounds; any value of at least B^one_cost would work for our theoretical analysis, the only drawback being a potentially larger population size, which influences the runtime bounds. The biobjective optimization model performs active selection to maximize the informativeness and, in the meantime, minimize the sample costs of the selected cells. We then employ the recently proposed Pareto optimization for monotone constraints (POMC) algorithm [33] to solve this problem. POMC is an evolutionary-style algorithm that maintains a solution archive and iteratively updates the archive by replacing some solutions with better ones. It is also known as global SEMO in the evolutionary computation literature [44] and is shown in Algorithm 2.

Algorithm 2 POS (POMC-Based) Selection
Input: The biobjective model in (8); the number of iterations, T.
Output: The best feasible solution in the archive P.
1: Let s = {0}^m;
2: P ← {s};
3: for t = 1 to T do
4: Select s from P uniformly at random;
5: Generate s′ by flipping each bit of s with probability 1/m;
6: if ∃ z ∈ P such that z dominates s′ then discard s′;
7: else P ← (P \ {z ∈ P | s′ weakly dominates z}) ∪ {s′};
8: end for
9: return arg max_{s∈P, f_2(s) ≤ B^one_cost} f_1(s).
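For concreteness, here is a compact Python sketch of both strategies, following the pseudocode above; f1 and f2 are assumed to be callables over index sets, and the implementation is our own illustrative rendering rather than the authors' released code.

```python
import random

def gcb_greedy(V, f1, f2, budget):
    """Algorithm 1: pick cells by the marginal-gain ratio f1/f2 under a budget."""
    S, rest = set(), set(V)
    while rest:
        def ratio(v):
            dc = f2(S | {v}) - f2(S)
            return (f1(S | {v}) - f1(S)) / max(dc, 1e-12)
        v = max(rest, key=ratio)
        if f2(S | {v}) <= budget:
            S.add(v)
        rest.remove(v)
    return S

def pomc(m, f1, f2, budget, T=2000, seed=0):
    """Algorithm 2 (POMC / global SEMO): evolve a Pareto archive of bit vectors."""
    rng = random.Random(seed)
    def objs(s):
        sel = {i for i in range(m) if s[i]}
        c = f2(sel)
        g = float("-inf") if c >= budget + 1 else f1(sel)  # overcost guard
        return (-g, c)  # both objectives are minimized
    def dominates(a, b):  # a weakly dominates b and differs somewhere
        return a[0] <= b[0] and a[1] <= b[1] and a != b
    archive = {(0,) * m: objs((0,) * m)}
    for _ in range(T):
        s = rng.choice(list(archive))
        child = tuple(b ^ (rng.random() < 1.0 / m) for b in s)
        oc = objs(child)
        if not any(dominates(o, oc) for o in archive.values()):
            archive = {k: o for k, o in archive.items() if not dominates(oc, o)}
            archive[child] = oc
    feasible = [s for s, o in archive.items()
                if o[1] <= budget and o[0] != float("inf")]
    return max(feasible, key=lambda s: -archive[s][0]) if feasible else (0,) * m
```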

2) Cell Selection Algorithm for the MCS Task:
The above two strategies compute an approximately optimal solution for only one selection. Considering the n cycles and m cells of our problem, we summarize the pseudocode of the proposed algorithm in Algorithm 3. When a new sensing cycle starts, the MCS server first needs to update the cost map. Then, the information of the unsensed cells is computed by QBC. Next, we set the cost budget of one selection to the maximal sample cost (or slightly larger). Consequently, we adopt different cell selection strategies to solve the subset selection problem with cost constraints. After that, the MCS server recruits participants to collect actual sensing data in the selected cells and aggregates the collected data to judge whether more cells need to be sensed. If the predefined quality requirement is not satisfied, steps 6-9 are repeated until it is satisfied; once satisfied, the MCS server can stop sensing in this cycle and move to the next one. The MCS server repeats the above steps until the (ε, p)-quality is satisfied in all time cycles. Finally, we deduce the unsensed data through CS based on the sensed data.

Algorithm 3 Cost-Quality Beneficial Cell Selection for an MCS Task
Input: The budget constraint, B^all_cost; the predefined quality requirement, (ε, p)-quality; the sensing matrix reconstruction algorithm, R; the cost map, C_{m×n}; the error metric, e.g., MAE or CE.
Output: The inferred full ground data matrix, Ĝ_{m×n}.
1: repeat
2: a new sensing cycle t starts; update the cost map for this cycle;
3: repeat
4: compute the informativeness of the unmeasured subareas through (6);
5: determine the cost budget for a batch of chosen cells in a selection;
6: solve the subset selection problem considering both information and sample costs through different cell selection strategies;
7: send participants to collect sensing data in the selected cells;
8: assess the task quality in this time cycle;
9: until the predefined quality requirement is satisfied;
10: until the predefined quality requirement is satisfied in all time cycles;
11: return the estimated full ground-truth matrix Ĝ_{m×n}.
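Putting the pieces together, a skeletal driver for Algorithm 3 might look as follows; it reuses the qbc_informativeness sketch from Section IV-A, and select, quality_ok, infer, and sense_cell are placeholder callables standing in for the components described above, not names from the paper.

```python
import numpy as np

def run_task(shape, cost_maps, committee, select, quality_ok, infer, sense_cell):
    """Skeleton of Algorithm 3 for an m x n sensing task.
    cost_maps[j]: length-m cost vector of cycle j (step 2).
    select(info, costs, budget, sensed): one cell selection, e.g., GCB or POS.
    quality_ok: the LOO-SA check; infer: CS-based completion;
    sense_cell(v, j): hypothetical call returning the real measurement.
    """
    m, n = shape
    G = np.zeros((m, n))                        # observed values (0 = unknown)
    S = np.zeros((m, n), dtype=int)             # selection mask
    for j in range(n):                          # each sensing cycle
        costs = cost_maps[j]
        while True:
            sensed = np.flatnonzero(S[:, j])
            info = qbc_informativeness(G[sensed, j], sensed, m, committee)
            budget = costs[S[:, j] == 0].max()  # dynamic per-selection budget
            for v in select(info, costs, budget, sensed):
                S[v, j] = 1
                G[v, j] = sense_cell(v, j)      # step 7: collect real data
            if quality_ok(G, S, j):             # steps 8-9: LOO-SA
                break
    return infer(G, S)                          # final CS inference
```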

3) Computation Complexity:
Since the cell selection strategies depend on informativeness modeling (QBC is selected in this article), QBC contributes much of the running time. Because the runtime of QBC is mainly spent on using the different inference algorithms to reconstruct the sensing matrix, the complexity of QBC can be formulated as O(Σ_l T_{R_l}) if the computational complexity of a reconstruction algorithm R_l is T_{R_l}. Besides, due to the characteristics of the different strategies, their runtime performances diverge widely. The greedy nature of the GCB-GREEDY algorithm makes it an efficient fixed-time algorithm, while POMC is an anytime randomized iterative algorithm and needs more time than the greedy algorithm to find the best feasible solutions. More specifically, the runtime of POMC depends on the setting of the parameter T.

V. ANALYSIS OF COST-QUALITY-AWARE SPARSE MCS-ASSISTED URBAN SENSING AND ACTUATION
For urban computing [45], [46], traditional practices usually depend on specialized infrastructure, e.g., surveillance cameras and air quality stations, which incurs high costs for deployment and maintenance. With the advent and development of seamless connections among machines, smart things, and humans, it is an emerging trend that a governor or a service initiator leverages the power of crowds, e.g., mobile users and smart things, to monitor what is happening in a city, understand how the city is evolving, and further take actions to enable a better quality of life [47]. In this article, we offer a governor the proposed cost-quality beneficial sparse MCS approach to sense the urban context in a more cost-beneficial way with high-quality sensed and inferred data.

A. Benefits to Urban Context Sensing
Nowadays, real-time information in a city, for instance, the shortage of parking, bothers managers and causes severe societal problems, such as traffic congestion and environmental pollution. In previous practice, a governor would employ dedicated staff and leverage expensive resources to monitor and report the parking occupancy situation, which incurs large operational costs. Meanwhile, note that for a large-scale target area, we usually have many subareas for a fine-grained result and need to recruit a large number of participants, which also costs a lot. Alternatively, we can leverage the cost-quality-aware cell selection approach proposed in this article to recruit only a small number of mobile users to collect real-time parking availability information in some subareas and report the collected information to the centralized server. The server then exploits CS or matrix completion techniques to recover the information in the unsensed subareas. Other examples, such as the passenger flows in a target area, the traffic situation, and the air quality, are also important issues to a governor as well as to the citizens, and can be sensed in our proposed crowd-powered way. Therefore, our proposed crowd-powered urban context sensing can fulfill the task of sensing large urban regions with less cost and higher efficiency.

B. Benefits to Urban Context Actuation
The further intention of a governor or a service initiator is to adopt measures or impose influence on the urban context by leveraging the collected and inferred information. The management mode of a city would be changed to optimize different smart systems (e.g., smart parking and intelligent transit) and enable a better quality of life (e.g., recommending a parking spot and rescheduling travel plans). For instance, due to the outbreak of COVID-19, citizens are required to maintain social distance for a certain period of time. If the collected passenger flow index of a subarea exceeds the predefined threshold, the local governor can suggest that citizens in other regions not travel to this region and can take strict isolation measures in this region to reduce the flow index. Other examples, such as engaging users to rebalance shared bikes, encouraging citizens and private cars to assist package delivery, and suggesting that vehicles take other routes when meeting traffic congestion, are also typical actuation applications leveraging the collected information. In this crowd-powered paradigm, the efficiency of current smart city systems will be largely improved, which reveals the importance of the information supported by our proposed crowd-powered sensing paradigm.

VI. PERFORMANCE EVALUATION
In this section, we evaluate our proposed strategies on four sensing projects, which contain various types of sensed data, including the parking occupancy rate, passenger flow index, traffic speed, and humidity data.

A. Data Sets and the Inherent Features
In this article, we adopt four real-life sensing data sets, Birmingham-Parking [48], DataFountain competitions, TaxiSpeed [49], and SensorScope [50], to evaluate the applicability of our proposed algorithms. The data sets contain various types of sensed data in representative IoT applications, such as the parking occupancy rate, flow index, traffic speed, and humidity. Though some of the data in these data sets are collected by sensor networks or static stations, mobile users can also sense such data by using smartphones, as shown in [3] and [51]. The detailed statistics of the four data sets are shown in Table II, and their distributions are shown in Fig. 5.

Traffic (Speed): The speed readings of taxis are collected for road segments in the TaxiSpeed project in Beijing. The project lasted for four days, from September 12, 2020 to September 15, 2020. Specifically, this data set contains more than 33 000 trajectories collected by GPS devices on taxis. Each sensing cycle lasts for 60 min. According to [49], we consider the road segments as the cells, and a target area with 100 road segments with valid sensed values is selected.
Humidity: The humidity readings are sensed in the SensorScope project, collected from the EPFL campus over an area of about 500 m × 300 m for seven days (from July 1, 2020 to July 7, 2020). Each sensing cycle lasts for 30 min. For our experiments, we divide the target area into 100 cells, each of size 50 m × 30 m. Since only 57 cells are deployed with valid sensors, we only utilize the sensed data of the cells with valid readings.
For these data sets, the MAE is chosen as the metric to evaluate the inference quality. The data sets used in this article come from publicly available data on the Internet; after careful checking by the authors, there are no user privacy issues.
The inherent features of urban sensing data are the prerequisite for spatial-temporal CS. To ensure the validity of our proposed models and algorithms, we conduct a set of experiments on these data sets to discover the strong spatial-temporal correlations, which are the basis and premise of this research. The results show that the urban sensing data matrix has a good low-rank approximation, certain temporal stability, and high spatial correlation.
1) Low-Rank Feature: Generally, there often exists an inherent correlated structure or redundancy in long-term urban sensing data. Thus, we apply SVD to examine whether the ground-truth sensing matrix has a good low-rank structure. Any ground-truth data matrix G_{m×n} can be decomposed as

$$G = U \Sigma V^{tr}$$

where V^{tr} is the transpose of V (an n × n unitary matrix), U is an m × m unitary matrix, and Σ is an m × n diagonal matrix with the diagonal elements σ_i (i.e., the singular values) organized in decreasing order. The rank of a matrix, denoted by r, is equal to the number of its nonzero singular values. Specifically, a low-rank matrix means that its rank satisfies r ≪ min{m, n}. According to principal component analysis, a low-rank matrix has the property that its top k singular values capture the total or near-total variance, i.e., Σ_{i=1}^{k} σ_i² ≈ Σ_{i=1}^{r} σ_i². Thus, we use the fraction of the total variance captured by the top k singular values

$$\sum_{i=1}^{k} \sigma_i^2 \Big/ \sum_{i=1}^{r} \sigma_i^2$$

as the evaluation metric. Fig. 6(a) plots this fraction as k varies for the different urban sensing data. We find that the top 1.75%-13.3% of singular values capture over 98% of the variance in the real data sets. The result indicates that the urban sensing data have a good low-rank approximation.
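This check takes only a few lines; below is a minimal NumPy sketch of the variance-fraction metric (loading of the data matrix is assumed).

```python
import numpy as np

def topk_variance_fraction(G, k):
    """Fraction of total variance captured by the top k singular values of G."""
    sv = np.linalg.svd(G, compute_uv=False)  # singular values, descending
    energy = sv ** 2
    return energy[:k].sum() / energy.sum()
```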
2) Temporal Stability Feature: Temporal stability indicates how the measured data changes over time. In urban sensing, some measured data, e.g., humidity and temperature, usually change slowly over consecutive time slots. But other urban context data may not have this feature. Thus, to reveal the natural phenomenon and check if this feature exists in different urban sensing data, we analyze the data sets in the time dimension between each pair of adjacent time measurements at a location.
The temporal stability feature at subarea i and time slot j is computed by the normalized difference between the values of adjacent time slots

$$tsf(i, j) = \frac{|G(i, j) - G(i, j-1)|}{\max_{1 \le i \le m,\ 2 \le j \le n} |G(i, j) - G(i, j-1)|}$$

where i varies from 1 to m, j varies from 2 to n (n is the number of time slots of interest), and the denominator is the maximal difference of the urban sensing data captured in any two consecutive time slots. The cumulative distribution function (CDF) of tsf(i, j) is plotted in Fig. 6(b), where the x-axis represents the normalized difference between values in two consecutive time slots, i.e., tsf(i, j), and the y-axis denotes the cumulative probability. It is observed that more than 60% of the tsf(i, j) values are very small (< 0.1) for the four different data sets. Even in the worst case, the traffic values between two consecutive time slots mostly (> 90%) change only a little (< 0.3). These findings indicate that the real urban sensing data are temporally stable.
3) Spatial Correlation Feature: Spatial correlation indicates the correlation of the sensing data between nearby locations: environments are often smooth in a small area, and thus environmental values are similar at nearby locations. In this article, we use the correlation coefficient to quantify this kind of correlation and dependence. Let G^{(i)} denote the ith row of matrix G; specifically, G^{(i)}, G^{(i′)} ∈ R^n represent the data vectors of locations i and i′. The metric scf(i, i′) (spatial correlation) between the data at locations i and i′ is formulated as

$$scf(i, i') = \frac{\left| \sum_{j=1}^{n} \left( G(i,j) - \bar{G}^{(i)} \right) \left( G(i',j) - \bar{G}^{(i')} \right) \right|}{\sqrt{\sum_{j=1}^{n} \left( G(i,j) - \bar{G}^{(i)} \right)^2} \sqrt{\sum_{j=1}^{n} \left( G(i',j) - \bar{G}^{(i')} \right)^2}}$$

where i and i′ vary from 1 to m, Ḡ^{(i)} = (1/n) Σ_{j=1}^{n} G(i, j), and Ḡ^{(i′)} = (1/n) Σ_{j=1}^{n} G(i′, j). To avoid negative values of scf(i, i′), the absolute value function is applied to the covariance term. Fig. 6(c) plots the CDF of scf(i, i′), with the x-axis being the values of scf(i, i′) and the y-axis the cumulative probability. We find that the urban sensing data exhibit high spatial correlations in general.

Fig. 7. Violin diagram of different cost maps. CT1, CT2, and CT3 denote the i.i.d. with dynamic cost map, the monotonic with dynamic cost map, and the spatial-correlated cost map, respectively; the three dotted lines mark the 25th percentile, median, and 75th percentile values in each violin plot, respectively.
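Both feature metrics can be computed directly from the ground-truth matrix; a minimal sketch, assuming G is loaded as an m × n NumPy array, follows.

```python
import numpy as np

def temporal_stability(G):
    """tsf(i, j): absolute differences between adjacent time slots,
    normalized by the largest such difference over the whole matrix."""
    diff = np.abs(np.diff(G, axis=1))   # |G(i,j) - G(i,j-1)| for j = 2..n
    return diff / diff.max()

def spatial_correlation(G):
    """scf(i, i'): absolute Pearson correlation between the rows of G."""
    return np.abs(np.corrcoef(G))       # m x m matrix of |corr| coefficients
```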
In brief, the inherent features discovered in the urban sensing data, i.e., the low-rank feature, the temporal stability feature, and the spatial correlation feature, allow us to perform spatial-temporal CS and quality assessment.

B. Cost Map
In this article, we estimate three different initial cost maps (i.e., the i.i.d. cost map, the spatial-correlated cost map, and the monotonic cost map) on the target data sets, respectively. In the meantime, a dynamic, time-variant factor, i.e., the perception cost, is considered: in our evaluation, we use c_b = B^{1−b} (B = 2, b ∈ [0, 1]) to denote the dynamic cost. Examples of the three different cost maps are given in Fig. 4, and the summary statistics of the three cost maps are shown in Fig. 7 (the unit of cost is CNY). In the violin diagram, each dot represents a sample cost and the height of the violin outline indicates the range of costs. Note that the range and standard deviation of CT1 are the largest among the three cost maps, while the mean and the minimal value of CT1 are the smallest; that is to say, CT1 has more sample costs with small values. This fact explains why our proposed algorithms select more cells to sense in CT1; more details are given in Section VI-D.

C. Baselines
Since this article addresses a practical sensing problem (usually a nonlinear system) with little historical monitoring data, we compare our cell selection strategies against two baselines: 1) SIMP-GREEDY and 2) QBC.

SIMP-GREEDY:
Since there is typically a conflict between the informativeness and the sample cost of a cell, the most straightforward strategy is to simply divide the informativeness by the sample cost. Thus, the selection strategy is

$$\upsilon^{*}=\arg\max_{\upsilon}\frac{f_1(\upsilon)}{f_2(\upsilon)}$$

where $f_1(\upsilon)$ denotes the informativeness of candidate cell $\upsilon$ and $f_2(\upsilon)$ its sample cost. This strategy transforms the biobjective problem into maximizing the single objective $f_1(\upsilon)/f_2(\upsilon)$ in each selection, which provides a simple solution for cost-quality beneficial selection, but it may fail when one of the two factors dominates the other [35]. Hence, SIMP-GREEDY is considered as a baseline.
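A minimal sketch of this baseline, under the assumption that per-cell informativeness $f_1$ and cost $f_2$ are given as arrays (the helper name and toy numbers are illustrative):

```python
import numpy as np

def simp_greedy_select(informativeness, cost, sensed):
    """Pick the unsensed cell maximizing informativeness / cost."""
    ratio = informativeness / cost
    ratio[sensed] = -np.inf          # exclude already-sensed cells
    return int(np.argmax(ratio))

# Toy usage: 5 cells, cells 0 and 2 already sensed.
info = np.array([0.9, 0.4, 0.7, 0.6, 0.2])
cost = np.array([3.0, 1.0, 2.0, 1.0, 0.5])
sensed = np.array([True, False, True, False, False])
print(simp_greedy_select(info, cost, sensed))   # -> 3 (ratio 0.6 is the largest)
```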
QBC: Previous works [13], [27] have proven the feasibility and satisfying performance of QBC for cell selection in sparse MCS applications. QBC maintains several "committee members" to determine which salient cell to sense in the next task. More specifically, the "committee" is formed by different data inference algorithms, such as spatiotemporal CS, CS, K-nearest neighbors, and SVD. QBC then chooses the unsensed cell where the inferred values of the committee members have the largest variance as the next cell to sense, without considering cost diversity. In other words, QBC tries to minimize the number of selected cells (and hence, under a uniform-cost assumption, the total cost) by always sensing the most uncertain unsensed cell. Therefore, QBC is a suitable baseline.
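The following is a hedged sketch of the QBC-style selection step; the committee is represented abstractly as a list of inference functions, so this illustrates the idea rather than reproducing the exact implementation.

```python
import numpy as np

def qbc_select(matrix, unsensed_cells, committee):
    """Select the unsensed cell where the committee's inferences disagree most.

    committee: list of inference functions, each mapping the partially
    observed matrix to a fully inferred matrix (e.g., CS, KNN, SVD based).
    unsensed_cells: list of (row, col) index pairs still unobserved.
    """
    inferred = np.stack([infer(matrix) for infer in committee])
    variance = inferred.var(axis=0)            # per-entry disagreement
    return max(unsensed_cells, key=lambda cell: variance[cell])

# Toy usage: two dummy "inference" functions that scale the matrix differently,
# so the disagreement (variance) grows with the entry's magnitude.
M = np.arange(12, dtype=float).reshape(3, 4)
committee = [lambda X: 0.9 * X, lambda X: 1.1 * X]
print(qbc_select(M, [(0, 1), (2, 3)], committee))   # -> (2, 3)
```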

1) Errors of Inferred Value:
We first compare the average inference error (MAE) brought by different cell selection strategies while varying the number of selected cells per cycle, without considering (ε, p)-quality. As exhibited in Fig. 8, similar tendencies are observed over the four types of sensing tasks. As the number of selected cells in each sensing cycle increases, the average inference error drops rapidly, implying that the additional information brought by the extra selected cells improves the accuracy of data inference. Note that the information modeling of our proposed strategies, i.e., POS and GCB-GREEDY, and of the baseline SIMP-GREEDY is based on QBC, so they should theoretically share comparable error levels. The experimental results confirm this, although the inference error of our proposed strategies is lower than that of the baselines in many circumstances. Next, we evaluate and discuss the performance of our cell selection strategies under (ε, p)-quality, which is more practical in real-world applications.
2) Number and Total Costs of Selected Cells: Then, we focus on our research objective: how much sample cost can our proposed algorithms save while gathering more informativeness to reduce the inference errors? The proposed strategies are compared to the baselines on four real-life data sets from three aspects: 1) costs; 2) selected cells; and 3) inference errors.
On the Parking occupancy rate sensing task, for the predefined (ε, p)-quality, we set the error bound ε to 0.1 and p to 0.9 or 0.95; in other words, we require the inference error to be smaller than 0.1 in around 90% or 95% of cycles. The average number of selected cells per cycle is shown in Fig. 9(a), where the baseline QBC always selects the fewest cells on the three cost maps, while GCB-GREEDY and POS select 0.5%-4.2% (on average 2.8%) and 0.6%-5.2% (on average 3.5%) more subareas than QBC, respectively. Except for the case of CT1 (95%), SIMP-GREEDY also selects slightly more cells (0.21%-0.9%, on average 0.5%). Note that on CT1 (i.i.d. with dynamic cost map), our proposed strategies select more cells. This can be explained by the statistics of the cost maps: since CT1 has more sample costs with small values, our proposed algorithms tend to choose more cells with small costs (cf. Fig. 3). In general, GCB-GREEDY and POS only need to select on average 6.67 (7.23) and 6.74 (7.26) out of 30 cells per sensing cycle to keep the inference error below 0.1 in 90% (95%) of cycles, respectively. Although more cells are selected by our proposed strategies, their total costs outperform those of the baselines. Generally, QBC costs the most while POS saves the most, as shown in Fig. 9(b). Specifically, GCB-GREEDY and POS spend 1.6%-9.1% (on average 4.7%) and 2.1%-11.2% (on average 5.7%) less cost than QBC, respectively. Meanwhile, our proposed strategies outperform SIMP-GREEDY on cost saving in practically all circumstances, and achieve their best performance on CT1. Due to its simple greedy heuristic, SIMP-GREEDY cannot guarantee full superiority over QBC; in the case of CT3 (90%), it even spends more than QBC. Next, we compare the inference errors. As shown in the Appendix (Table IV), our proposed strategies cover more subareas in each sensing cycle, which provides more information for data inference and thus enhances data accuracy compared to the baselines. For the Flow and Traffic data sets, we observe a similar tendency in Fig. 9(c)-(f). It is noteworthy that our proposed strategies achieve better performance than the baselines since they leverage a more elaborate mechanism to balance the sample cost and information. Specifically, POS and GCB-GREEDY select more cells and save more costs than in the parking sensing task since the average number of selected cells per cycle becomes larger; the inference error of our proposed strategies is also clearly reduced.
From the above results, our proposed strategies clearly outperform the baselines in decreasing both inference errors and sample costs. We now define a new indicator, cost per cell (CPC), to see which strategy performs best. The results are shown in Fig. 10. On the Parking sensing task, the CPC of POS is the best in all cases; similarly, on the Flow, Traffic, and Humidity data sets, POS shows its advantage over the other strategies in all circumstances. Thus, we conclude that POS is the best strategy by this measure. Finally, let us check whether all strategies achieve the predefined task quality requirement. As shown in the Appendix (Table V), most of the values are larger than their predefined p (all methods adopt LOO-SA as the stopping criterion), indicating that both our proposed strategies and the baselines satisfy the predefined quality requirement most of the time. Meanwhile, some results are slightly below the predefined p, e.g., 0.8992 < 0.9 and 0.9496 < 0.95, but the gap is small and acceptable. This is probably because CS and the Bayesian inference in our algorithms are intrinsically probabilistic and can cause minor errors. To improve the reliability of the results, each experiment was run five times; if time permits, more runs could further mitigate this randomness.
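Assuming CPC is the total sample cost divided by the number of selected cells, averaged over cycles (an assumption consistent with the indicator's name, not spelled out above), it reduces to a one-line computation:

```python
import numpy as np

def cost_per_cell(cycle_costs, cycle_cells):
    """Average cost per selected cell across sensing cycles.

    cycle_costs: total sample cost per cycle; cycle_cells: cells selected per cycle.
    """
    return np.mean(np.asarray(cycle_costs) / np.asarray(cycle_cells))

print(cost_per_cell([12.0, 9.5, 11.2], [7, 6, 7]))
```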
3) Results of Different Cost Budgets: Furthermore, to study how changing the cost budget of a selection influences the results, we take humidity sensing on CT1 as an example and conduct more experiments on the POS strategy, since POS has exhibited its superiority over the other strategies. In the previous experiments, the cost budget of a selection is set to $B_{\mathrm{one}}^{\mathrm{cost}}=\max\bigl(f_2(V_j)\bigr)$, i.e., it equals the maximal sample cost in cycle j, so that the cell selection strategies can consider and select any candidate subarea. We now vary the ratio of the cost budget to the maximal sample cost; the results are shown in Fig. 11. Generally, as the ratio rises, more subareas are selected and the total sample costs correspondingly increase. This may be because, when the ratio is less than 1, the POS strategy omits some subareas with high sample costs, whereas when the ratio is greater than 1, the POS strategy has a larger budget to select more cells. Note that although the total sample costs are reduced in the low-budget scenarios, the inference errors increase significantly; when the ratio is below 0.7, the results cannot even meet the predefined quality requirement. Thus, if the organizer cares more about cost reduction, the cost budget can be set slightly below the maximal sample cost, but to ensure inference performance, we suggest setting the ratio above 0.9.
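The per-selection budget then acts as a simple pre-filter on the candidate cells; below is a sketch under the assumption that $f_2$ is given as an array of per-cell sample costs (names are illustrative):

```python
import numpy as np

def feasible_candidates(costs, ratio):
    """Keep candidate cells whose sample cost fits the per-selection budget.

    Budget = ratio * max(costs); ratio < 1 excludes the most expensive cells.
    """
    budget = ratio * np.max(costs)
    return np.flatnonzero(costs <= budget)

costs = np.array([0.8, 2.1, 4.9, 1.3, 3.5])
print(feasible_candidates(costs, ratio=0.7))   # budget 3.43 -> cells [0, 1, 3]
```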

4) Results of Leave Some Percentage Out Cross Validation:
Though the number of selected cells in most cycles is much smaller than the total number of cells, there remain cycles in which more than half or two-thirds of the cells are chosen, since the data in these cycles are not necessarily accurately sensed due to sensor failures. In matrix completion, there is typically a threshold of the observation rate beyond which the performance is always satisfiable. Thus, in the above situation, the number of observed entries (measured cells) is likely to exceed this threshold, and the LOO evaluation may then always report satisfiable performance: leaving only one entry out is too few to reveal when the method works and when it does not. In [16] and [52], other cross-validation approaches are leveraged for quality assessment, such as K-fold cross validation and leave-P-out (LPO) cross validation; note that LOO is a special case of both K-fold and LPO. In this article, we conduct the evaluation by leaving various percentages (e.g., 10%, 20%, and 30%) of the data out. Suppose that we have sensed m′ cells out of all the m cells; the idea of leave some percentage out (LSPO) is that, each time, we leave some percentage (e.g., 10%) of the m′ observations out [i.e., leave P observations out, with P = ⌈m′ × percentage⌉] and infer them based on the remaining (m′ − P) observations. Here, we take the humidity sensing task as an example, set the error bound ε to 1.5% and p to 0.9, and experiment with percentages of 10%, 20%, 30%, 40%, and 50%, respectively. Each experiment is repeated five times (results are averaged), and the corresponding results are shown in Fig. 12 and the Appendix (Table VI).
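A minimal sketch of one LSPO round, assuming `infer` is any matrix-completion routine that maps the observed entries of the current cycle to estimates for all cells (the trivial mean-filling routine below is only a placeholder):

```python
import math
import numpy as np

def lspo_round(values, sensed_idx, infer, percentage, rng):
    """One leave-some-percentage-out round for the current sensing cycle.

    Hide P = ceil(m' * percentage) of the m' sensed entries, infer them
    from the rest, and return the mean absolute error on the hidden part.
    """
    m_prime = len(sensed_idx)
    P = math.ceil(m_prime * percentage)
    held_out = rng.choice(sensed_idx, size=P, replace=False)
    kept = np.setdiff1d(sensed_idx, held_out)
    estimates = infer(values, kept)          # infer all cells from kept entries
    return np.mean(np.abs(estimates[held_out] - values[held_out]))

# Toy usage with a trivial "inference" that fills unknowns with the kept mean.
rng = np.random.default_rng(0)
values = rng.random(30)
sensed = np.arange(12)                       # m' = 12 sensed cells
infer = lambda v, kept: np.full_like(v, v[kept].mean())
print(lspo_round(values, sensed, infer, percentage=0.2, rng=rng))
```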
It can be concluded that, compared to LOO, leveraging LSPO incurs slightly more selected cells and sensing costs while improving the inference results to some extent. Specifically, as the percentage increases, the average number of selected cells per cycle increases slightly, because leaving more cells out in the above situation reduces the observation rate and requires more cells to be sensed. In most cases, the corresponding sensing costs also increase slightly and the inference error on unsensed cells decreases a little. These findings reveal that LSPO handles the inaccurate sensing situation in some cycles better than LOO and avoids the degenerate assessment when the observation rate exceeds the matrix-completion threshold. However, exhaustive LSPO cross validation requires learning and validating up to $\binom{m'}{P}$ times, which becomes computationally infeasible as m′ grows. Therefore, in terms of computational efficiency, LOO is a good choice; considering the reduction of the inference error and the problem of the observation rate exceeding the threshold, LSPO is the better choice.

5) Running Time: Finally, since the baseline QBC has demonstrated feasible running time in real-life scenarios, we report the computation time of our proposed strategies and SIMP-GREEDY to check whether they also satisfy the runtime requirements. We ran the experiments on a desktop computer (Intel Core i7-8559U CPU @ 2.70 GHz, 16-GB RAM, Windows 10) with Python 3.7. Table III lists the running time of the different stages of the whole task assignment process. As shown in Table III, the "quality assessment" module costs the most since it needs to run the "data inference" module for many rounds to judge whether the sensing cycle can stop or continue. Although our proposed strategies use QBC as the basis to estimate the information in unsensed cells, the computation time of GCB-GREEDY is even lower, because the GCB-GREEDY algorithm has a bounded runtime and may select more than one cell in a single selection. Although POS is the most time-consuming, the total runtime for allocating a new task is no more than 15.4 s [i.e., estimating the task quality once and, if the predefined (ε, p)-quality is not met, finding the next sensing cell]. Therefore, we believe the efficiency of our proposed methods can satisfy most real-world applications.

6) Discussion: In this section, we summarize the experimental findings and discuss some limitations of this work.
Our proposed cell selection method, with its two strategies (POS and GCB-GREEDY), explicitly outperforms the baselines from three aspects: 1) less sample cost; 2) more selected cells with real sensing values; and 3) lower inference error, since we leverage a more elaborate mechanism to minimize the total cost and maximize the informativeness. In other words, we select more subareas on the premise of reducing, or at least not increasing, the overall cost; since more cells are sensed with real measurements, the inference error also improves. Whatever the sample cost type and sensing task type are, the POS strategy achieves the best performance; however, if the running time matters, GCB-GREEDY is a comparable alternative. The experimental results demonstrate the feasibility of the proposed cost-quality beneficial cell selection method.
Compared to the results on the monotonic cost map (CT2) and the spatial-correlated cost map (CT3), the POS and GCB-GREEDY strategies perform much better in sample cost reduction and inference error decline on the i.i.d. cost map (CT1). This is because CT1 has cost values with a larger range and standard deviation and more sample costs with small values, which gives POS and GCB-GREEDY more chances to select more than one subarea per selection. Accordingly, the average number of cells selected by POS and GCB-GREEDY on CT1 is greater than on CT2 and CT3. More selected cells mean more information for recovery, and thus the inference error reduction on CT1 is better. This finding implies that our proposed framework and cell selection strategies can handle various kinds of cost variability, especially when the cost map has a large range and standard deviation.
However, some drawbacks remain in the present work. First, we still leverage LOO-SA as the stopping criterion, in which the practical relationship between the statistical results and the stopping condition is simplified; when the observed entries exceed the threshold in a cycle, the performance may always appear satisfiable, so it is unclear whether LOO-SA works in this situation. LSPO-SA would be a better choice for quality assessment. Second, we use the QBC method to estimate the uncertainty of each unsensed cell, but the direct relationship between this uncertainty and the quality of reconstruction remains to be proven. Finally, the accuracy of data acquisition influences the overall performance of sparse MCS to some extent, which needs further discussion and analysis.
In a few cases, we observe that the naive greedy strategy even performs slightly better than GCB-GREEDY. This may be because CS and the Bayesian inference in our algorithms are intrinsically probabilistic and introduce minor errors; if time permits, more runs could mitigate this randomness. It may also be because certain measurement errors exist in the raw data, which affect the performance of our proposed strategies in some cases.
The practicality of this work may be limited by its simulation-based results without real-world deployments and practical experiments. The human factor is also avoided by a perfect-participant assumption; in practice, a participant may fail, deny, or be late in completing an assigned task, and the failure probability differs from participant to participant because their personalities differ. Thus, in the future, we will clarify how the proposed methods handle human factors in real-world applications by conducting real experiments.

VII. CONCLUSION
Crowdsensing, as a typical form of urban computing, has shown its advantage in pervasive sensing and knowledge discovery. However, the sample costs of high-quality data still hamper MCS from being utilized at large scale. Thus, in this article, we incorporated cost diversity into the cell selection process. To that end, a novel three-step cell selection approach (information modeling, cost estimation, and cost-quality beneficial selection) was proposed with the target of minimizing the total sample costs and maximizing the beneficial informativeness of the selected cells (further reducing the error of the inferred results). After a reasonable approximation of the cost and a discussion of the properties of the optimization goals, we proposed two selection strategies, namely, GCB-GREEDY and POS, to solve the optimization model. We evaluated the proposed strategies against two baselines, i.e., QBC and SIMP-GREEDY, on four real-life sensing data sets and three different cost maps. The results explicitly demonstrate the effectiveness and applicability of our strategies in real-world MCS systems: they save sample costs and reduce inference errors at the same time. We also analyzed the potential of applying the proposed cost-quality beneficial crowd-powered approach to real-life urban-scale sensing and actuation.
In the future, we will continue to improve this work in the following directions. First, we will incorporate sample cost diversity into the deep reinforcement learning-based cell selection method. Second, we will address the insufficient-participant scenario by considering users' historical mobility traces to relocate participants to new task locations. Finally, we will provide a mathematical proof of the relationship between the informativeness of a cell and the reconstruction performance in nonlinear system scenarios.

APPENDIX
See Tables IV-VI.