def landuse_agg_wrapper():
    mu_data_hill, mu_data_other, ac_data, ac_data_wt = data.aggregate_landuses(
        node_data,
        edge_data,
        node_edge_map,
        data_map,
        distances,
        betas,
        mixed_use_hill_keys=np.array([0, 1]),
        landuse_encodings=landuse_encodings,
        qs=qs,
        angular=False)
def compute_landuses(self, landuse_labels: list | tuple | np.ndarray, mixed_use_keys: list | tuple = None, accessibility_keys: list | tuple = None, cl_disparity_wt_matrix: list | tuple | np.ndarray = None, qs: list | tuple | np.ndarray = None, jitter_scale: float = 0.0, angular: bool = False): """ This method wraps the underlying `numba` optimised functions for aggregating and computing various mixed-use and land-use accessibility measures. These are computed simultaneously for any required combinations of measures (and distances), which can have significant speed implications. Situations requiring only a single measure can instead make use of the simplified [`DataLayer.hill_diversity`](#datalayerhill_diversity), [`DataLayer.hill_branch_wt_diversity`](#datalayerhill_branch_wt_diversity), and [`DataLayer.compute_accessibilities`](#datalayercompute_accessibilities) methods. See the accompanying paper on `arXiv` for additional information about methods for computing mixed-use measures at the pedestrian scale. import ArXivLink from '../../src/components/ArXivLink.vue' <ArXivLink arXivLink='https://arxiv.org/abs/2106.14048'/> The data is aggregated and computed over the street network relative to the `Network Layer` nodes, with the implication that mixed-use and land-use accessibility aggregations are generated from the same locations as for centrality computations, which can therefore be correlated or otherwise compared. The outputs of the calculations are written to the corresponding node indices in the same `NetworkLayer.metrics` dictionary used for centrality methods, and will be categorised by the respective keys and parameters. 
For example, if `hill` and `shannon` mixed-use keys; `shops` and `factories` accessibility keys are computed on a `Network Layer` instantiated with 800m and 1600m distance thresholds, then the dictionary would assume the following structure: ```python NetworkLayer.metrics = { 'mixed_uses': { # note that hill measures have q keys 'hill': { # here, q=0 0: { 800: [...], 1600: [...] }, # here, q=1 1: { 800: [...], 1600: [...] } }, # non-hill measures do not have q keys 'shannon': { 800: [...], 1600: [...] } }, 'accessibility': { # accessibility keys are computed in both weighted and unweighted forms 'weighted': { 'shops': { 800: [...], 1600: [...] }, 'factories': { 800: [...], 1600: [...] } }, 'non_weighted': { 'shops': { 800: [...], 1600: [...] }, 'factories': { 800: [...], 1600: [...] } } } } ``` Parameters ---------- landuse_labels A set of land-use labels corresponding to the length and order of the data points. The labels should correspond to descriptors from the land-use schema, such as "retail" or "commercial". This parameter is only required if computing mixed-uses or land-use accessibilities. mixed_use_keys An optional list of strings describing which mixed-use metrics to compute, containing any combination of `key` values from the following table, by default None. See **Notes** for additional information. accessibility_keys An optional `list` or `tuple` of land-use classifications for which to calculate accessibilities. The keys should be selected from the same land-use schema used for the `landuse_labels` parameter, e.g. "retail". The calculations will be performed in both `weighted` and `non_weighted` variants, by default None. cl_disparity_wt_matrix A pairwise `NxN` disparity matrix numerically describing the degree of disparity between any pair of distinct land-uses. This parameter is only required if computing mixed-uses using `hill_pairwise_disparity` or `raos_pairwise_disparity`. 
The number and order of land-uses should match those implicitly generated by [`encode_categorical`](#encode_categorical), by default None. qs The values of `q` for which to compute Hill diversity. This parameter is only required if computing one of the Hill diversity mixed-use measures, by default None. jitter_scale The scale of random jitter to add to shortest path calculations, useful for situations with highly rectilinear grids. `jitter_scale` is passed to the `scale` parameter of `np.random.normal`. Default of zero. angular Whether to use a simplest-path heuristic in lieu of a shortest-path heuristic when calculating aggregations and distances, by default False. Notes ----- | key | formula | notes | |-----|:-------:|-------| | hill | $\scriptstyle\big(\sum_{i}^{S}p_{i}^q\big)^{1/(1-q)}\ q\geq0,\ q\neq1 \\ \scriptstyle lim_{q\to1}\ exp\big(-\sum_{i}^{S}\ p_{i}\ log\ p_{i}\big)$ | Hill diversity: this is the preferred form of diversity metric because it adheres to the replication principle and uses units of effective species instead of measures of information or uncertainty. The `q` parameter controls the degree of emphasis on the _richness_ of species as opposed to the _balance_ of species. Over-emphasis on balance can be misleading in an urban context, for which reason research finds support for using `q=0`: this reduces to a simple count of distinct land-uses.| | hill_branch_wt | $\scriptstyle\big[\sum_{i}^{S}d_{i}\big(\frac{p_{i}}{\bar{T}}\big)^{q} \big]^{1/(1-q)} \\ \scriptstyle\bar{T} = \sum_{i}^{S}d_{i}p_{i}$ | This is a distance-weighted variant of Hill Diversity based on the distances from the point of computation to the nearest example of a particular land-use. It therefore gives a locally representative indication of the intensity of mixed-uses. $d_{i}$ is a negative exponential function where $\beta$ controls the strength of the decay. 
($\beta$ is provided by the `Network Layer`, see [`distance_from_beta`](/metrics/networks/#distance_from_beta).)| | hill_pairwise_wt | $\scriptstyle\big[ \sum_{i}^{S} \sum_{j\neq{i}}^{S} d_{ij} \big( \frac{p_{i} p_{j}}{Q} \big)^{q} \big]^{1/(1-q)} \\ \scriptstyle Q = \sum_{i}^{S} \sum_{j\neq{i}}^{S} d_{ij} p_{i} p_{j}$ | This is a pairwise-distance-weighted variant of Hill Diversity based on the respective distances between the closest examples of the pairwise distinct land-use combinations as routed through the point of computation. $d_{ij}$ represents a negative exponential function where $\beta$ controls the strength of the decay. ($\beta$ is provided by the `Network Layer`, see [`distance_from_beta`](/metrics/networks/#distance_from_beta).)| | hill_pairwise_disparity | $\scriptstyle\big[ \sum_{i}^{S} \sum_{j\neq{i}}^{S} w_{ij} \big( \frac{p_{i} p_{j}}{Q} \big)^{q} \big]^{1/(1-q)} \\ \scriptstyle Q = \sum_{i}^{S} \sum_{j\neq{i}}^{S} w_{ij} p_{i} p_{j}$ | This is a disparity-weighted variant of Hill Diversity based on the pairwise disparities between land-uses. This variant requires the use of a disparity matrix provided through the `cl_disparity_wt_matrix` parameter.| | shannon | $\scriptstyle -\sum_{i}^{S}\ p_{i}\ log\ p_{i}$ | Shannon diversity (or _information entropy_) is one of the classic diversity indices. Note that it is preferable to use Hill Diversity with `q=1`, which is effectively a transformation of Shannon diversity into units of effective species.| | gini_simpson | $\scriptstyle 1 - \sum_{i}^{S} p_{i}^2$ | Gini-Simpson is another classic diversity index. It can behave problematically because it does not adhere to the replication principle and places emphasis on the balance of species, which can be counter-productive for purposes of measuring mixed-uses. 
Note that where an emphasis on balance is desired, it is preferable to use Hill Diversity with `q=2`, which is effectively a transformation of Gini-Simpson diversity into units of effective species.| | raos_pairwise_disparity | $\scriptstyle \sum_{i}^{S} \sum_{j \neq{i}}^{S} d_{ij} p_{i} p_{j}$ | Rao diversity is a pairwise disparity measure and requires the use of a disparity matrix provided through the `cl_disparity_wt_matrix` parameter. It suffers from the same issues as Gini-Simpson. It is preferable to use disparity weighted Hill diversity with `q=2`.| :::tip Comment The available choices of land-use diversity measures may seem overwhelming. `hill_branch_wt` paired with `q=0` is generally the best choice for granular land-use data, or else `q=1` or `q=2` for increasingly crude land-use classification schemas. ::: A worked example: ```python from cityseer.metrics import networks, layers from cityseer.tools import mock, graphs # prepare a mock graph G = mock.mock_graph() G = graphs.nX_simple_geoms(G) # generate the network layer N = networks.NetworkLayerFromNX(G, distances=[200, 400, 800, 1600]) # prepare a mock data dictionary data_dict = mock.mock_data_dict(G, random_seed=25) # prepare some mock land-use classifications landuses = mock.mock_categorical_data(len(data_dict), random_seed=25) # generate a data layer L = layers.DataLayerFromDict(data_dict) # assign to the network L.assign_to_network(N, max_dist=500) # compute some metrics - here we'll use the full interface, see below for simplified interfaces # FULL INTERFACE # ============== L.compute_landuses(landuse_labels=landuses, mixed_use_keys=['hill'], qs=[0, 1], accessibility_keys=['c', 'd', 'e']) # note that the above measures can optionally be run individually using simplified interfaces, e.g. 
# SIMPLIFIED INTERFACES # ===================== # L.hill_diversity(landuses, qs=[0]) # L.compute_accessibilities(landuses, ['a', 'b']) # let's prepare some keys for accessing the computational outputs # distance idx: any of the distances with which the NetworkLayer was initialised distance_idx = 200 # q index: any of the invoked q parameters q_idx = 0 # a node idx node_idx = 0 # the data is available at N.metrics print(N.metrics['mixed_uses']['hill'][q_idx][distance_idx][node_idx]) # prints: 4.0 print(N.metrics['accessibility']['weighted']['d'][distance_idx][node_idx]) # prints: 0.019168843947614676 print(N.metrics['accessibility']['non_weighted']['d'][distance_idx][node_idx]) # prints: 1.0 ``` Note that the data can also be unpacked to a dictionary using [`NetworkLayer.metrics_to_dict`](/metrics/networks/#networklayermetrics_to_dict), or transposed to a `networkX` graph using [`NetworkLayer.to_networkX`](/metrics/networks/#networklayerto_networkx). :::danger Caution Be cognisant that mixed-use and land-use accessibility measures are sensitive to the classification schema that has been used. Meaningful comparisons from one location to another are only possible where the same schemas have been applied. ::: """ if self.Network is None: raise ValueError( 'Assign this data layer to a network prior to computing mixed-uses or accessibilities.' ) mixed_uses_options = [ 'hill', 'hill_branch_wt', 'hill_pairwise_wt', 'hill_pairwise_disparity', 'shannon', 'gini_simpson', 'raos_pairwise_disparity' ] # remember, most checks on parameter integrity occur in underlying method # so, don't duplicate here if len(landuse_labels) != len(self._data): raise ValueError( 'The number of landuse labels should match the number of data points.' 
) # get the landuse encodings landuse_classes, landuse_encodings = encode_categorical(landuse_labels) # if necessary, check the disparity matrix (np.ndim and np.shape also accept lists and tuples) if cl_disparity_wt_matrix is None: cl_disparity_wt_matrix = np.full((0, 0), np.nan) elif not isinstance(cl_disparity_wt_matrix, (list, tuple, np.ndarray)) or \ np.ndim(cl_disparity_wt_matrix) != 2 or \ np.shape(cl_disparity_wt_matrix)[0] != np.shape(cl_disparity_wt_matrix)[1] or \ len(cl_disparity_wt_matrix) != len(landuse_classes): raise TypeError( 'Disparity weights must be a square pairwise NxN matrix in list, tuple, or numpy.ndarray form. ' 'The number of edge-wise elements should match the number of unique class labels.' ) # default to an empty tuple if no qs provided if qs is None: qs = () if isinstance(qs, (int, float)): qs = (qs,) if not isinstance(qs, (list, tuple, np.ndarray)): raise TypeError( 'Please provide a float, list, tuple, or numpy.ndarray of q values.' ) # extrapolate the requested mixed use measures mu_hill_keys = [] mu_other_keys = [] if mixed_use_keys is not None: for mu in mixed_use_keys: if mu not in mixed_uses_options: raise ValueError( f'Invalid mixed-use option: {mu}. Must be one of {", ".join(mixed_uses_options)}.' ) idx = mixed_uses_options.index(mu) if idx < 4: mu_hill_keys.append(idx) else: mu_other_keys.append(idx - 4) if not checks.quiet_mode: logger.info( f'Computing mixed-use measures: {", ".join(mixed_use_keys)}' ) # figure out the corresponding indices for the landuse classes that are present in the dataset # these indices are passed as keys which will be matched against the integer landuse encodings acc_keys = [] if accessibility_keys is not None: for ac_label in accessibility_keys: if ac_label not in landuse_classes: logger.warning( f'No instances of accessibility label: {ac_label} present in the data.' 
) else: acc_keys.append(landuse_classes.index(ac_label)) if not checks.quiet_mode: logger.info( f'Computing land-use accessibility for: {", ".join(accessibility_keys)}' ) if not checks.quiet_mode: progress_proxy = ProgressBar(total=len(self.Network._node_data)) else: progress_proxy = None # call the underlying method mixed_use_hill_data, mixed_use_other_data, accessibility_data, accessibility_data_wt = \ data.aggregate_landuses(self.Network._node_data, self.Network._edge_data, self.Network._node_edge_map, self._data, distances=np.array(self.Network.distances), betas=np.array(self.Network.betas), landuse_encodings=np.array(landuse_encodings), qs=np.array(qs), mixed_use_hill_keys=np.array(mu_hill_keys), mixed_use_other_keys=np.array(mu_other_keys), accessibility_keys=np.array(acc_keys), cl_disparity_wt_matrix=np.array(cl_disparity_wt_matrix), jitter_scale=jitter_scale, angular=angular, progress_proxy=progress_proxy) if progress_proxy is not None: progress_proxy.close() # write the results to the Network's metrics dict # keys will check for pre-existing, whereas qs and distance keys will overwrite # unpack mixed use hill for mu_h_idx, mu_h_key in enumerate(mu_hill_keys): mu_h_label = mixed_uses_options[mu_h_key] if mu_h_label not in self.Network.metrics['mixed_uses']: self.Network.metrics['mixed_uses'][mu_h_label] = {} for q_idx, q_key in enumerate(qs): self.Network.metrics['mixed_uses'][mu_h_label][q_key] = {} for d_idx, d_key in enumerate(self.Network.distances): self.Network.metrics['mixed_uses'][mu_h_label][q_key][d_key] = \ mixed_use_hill_data[mu_h_idx][q_idx][d_idx] # unpack mixed use other for mu_o_idx, mu_o_key in enumerate(mu_other_keys): mu_o_label = mixed_uses_options[mu_o_key + 4] if mu_o_label not in self.Network.metrics['mixed_uses']: self.Network.metrics['mixed_uses'][mu_o_label] = {} # no qs for d_idx, d_key in enumerate(self.Network.distances): self.Network.metrics['mixed_uses'][mu_o_label][ d_key] = mixed_use_other_data[mu_o_idx][d_idx] # unpack 
accessibility data for ac_idx, ac_code in enumerate(acc_keys): ac_label = landuse_classes[ac_code] # ac_code is index of ac_label for k, ac_data in zip(['non_weighted', 'weighted'], [accessibility_data, accessibility_data_wt]): if ac_label not in self.Network.metrics['accessibility'][k]: self.Network.metrics['accessibility'][k][ac_label] = {} for d_idx, d_key in enumerate(self.Network.distances): self.Network.metrics['accessibility'][k][ac_label][ d_key] = ac_data[ac_idx][d_idx]
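# Illustrative sketch only: the `hill` formula from the docstring table above, written in
# plain numpy so the behaviour of `q` can be checked by hand. The function name
# `hill_diversity_sketch` is hypothetical and is not part of the library; the library's own
# numba-optimised implementation may differ in details such as input validation.

```python
import numpy as np


def hill_diversity_sketch(class_counts, q):
    """Hill diversity of order q from raw class counts (hypothetical helper)."""
    if q < 0:
        raise ValueError('q must be non-negative.')
    counts = np.asarray(class_counts, dtype=float)
    # classes with no instances contribute nothing to the proportions
    counts = counts[counts > 0]
    if counts.size == 0:
        return 0.0
    p = counts / counts.sum()
    if np.isclose(q, 1.0):
        # limit as q -> 1: exponential of Shannon entropy
        return float(np.exp(-np.sum(p * np.log(p))))
    return float(np.sum(p ** q) ** (1.0 / (1.0 - q)))


# with q=0 the measure reduces to a count of the distinct classes present
richness = hill_diversity_sketch([4, 4, 0, 2], 0)
# for perfectly balanced classes every order of q gives the number of classes
balanced_q1 = hill_diversity_sketch([1, 1, 1, 1], 1)
balanced_q2 = hill_diversity_sketch([1, 1, 1, 1], 2)
```

# For balanced counts every order of q agrees; the orders only diverge once the class
# proportions become uneven, which is where the choice of q matters.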
def test_aggregate_landuses_signatures(primal_graph): # generate node and edge maps node_uids, node_data, edge_data, node_edge_map = graphs.graph_maps_from_nX(primal_graph) # setup data data_dict = mock.mock_data_dict(primal_graph, random_seed=13) data_uids, data_map = layers.data_map_from_dict(data_dict) data_map = data.assign_to_network(data_map, node_data, edge_data, node_edge_map, 500) # set parameters betas = np.array([0.02, 0.01, 0.005, 0.0025]) distances = networks.distance_from_beta(betas) qs = np.array([0, 1, 2]) mock_categorical = mock.mock_categorical_data(len(data_map)) landuse_classes, landuse_encodings = layers.encode_categorical(mock_categorical) # check that empty land_use encodings are caught with pytest.raises(ValueError): data.aggregate_landuses(node_data, edge_data, node_edge_map, data_map, distances, betas, mixed_use_hill_keys=np.array([0])) # check that unequal land_use encodings vs data map lengths are caught with pytest.raises(ValueError): data.aggregate_landuses(node_data, edge_data, node_edge_map, data_map, distances, betas, landuse_encodings=landuse_encodings[:-1], mixed_use_other_keys=np.array([0])) # check that no provided metrics flags with pytest.raises(ValueError): data.aggregate_landuses(node_data, edge_data, node_edge_map, data_map, distances, betas, landuse_encodings=landuse_encodings) # check that missing qs flags with pytest.raises(ValueError): data.aggregate_landuses(node_data, edge_data, node_edge_map, data_map, distances, betas, mixed_use_hill_keys=np.array([0]), landuse_encodings=landuse_encodings) # check that problematic mixed use and accessibility keys are caught for mu_h_key, mu_o_key, ac_key in [ # negatives ([-1], [1], [1]), ([1], [-1], [1]), ([1], [1], [-1]), # out of range ([4], [1], [1]), ([1], [3], [1]), ([1], [1], [max(landuse_encodings) + 1]), # duplicates ([1, 1], [1], [1]), ([1], [1, 1], [1]), ([1], [1], [1, 1])]: with pytest.raises(ValueError): data.aggregate_landuses(node_data, edge_data, node_edge_map, 
data_map, distances, betas, landuse_encodings, qs=qs, mixed_use_hill_keys=np.array(mu_h_key), mixed_use_other_keys=np.array(mu_o_key), accessibility_keys=np.array(ac_key)) for h_key, o_key in (([3], []), ([], [2])): # check that missing matrix is caught for disparity weighted indices with pytest.raises(ValueError): data.aggregate_landuses(node_data, edge_data, node_edge_map, data_map, distances, betas, landuse_encodings=landuse_encodings, qs=qs, mixed_use_hill_keys=np.array(h_key), mixed_use_other_keys=np.array(o_key)) # check that non-square disparity matrix is caught mock_matrix = np.full((len(landuse_classes), len(landuse_classes)), 1) with pytest.raises(ValueError): data.aggregate_landuses(node_data, edge_data, node_edge_map, data_map, distances, betas, landuse_encodings=landuse_encodings, qs=qs, mixed_use_hill_keys=np.array(h_key), mixed_use_other_keys=np.array(o_key), cl_disparity_wt_matrix=mock_matrix[:-1])
def test_aggregate_landuses_categorical_components(primal_graph): # generate node and edge maps node_uids, node_data, edge_data, node_edge_map, = graphs.graph_maps_from_nX(primal_graph) # setup data data_dict = mock.mock_data_dict(primal_graph, random_seed=13) data_uids, data_map = layers.data_map_from_dict(data_dict) data_map = data.assign_to_network(data_map, node_data, edge_data, node_edge_map, 500) # set parameters betas = np.array([0.02, 0.01, 0.005, 0.0025]) distances = networks.distance_from_beta(betas) qs = np.array([0, 1, 2]) mock_categorical = mock.mock_categorical_data(len(data_map)) landuse_classes, landuse_encodings = layers.encode_categorical(mock_categorical) mock_matrix = np.full((len(landuse_classes), len(landuse_classes)), 1) # set the keys - add shuffling to be sure various orders work hill_keys = np.arange(4) np.random.shuffle(hill_keys) non_hill_keys = np.arange(3) np.random.shuffle(non_hill_keys) ac_keys = np.array([1, 2, 5]) np.random.shuffle(ac_keys) # generate mu_data_hill, mu_data_other, ac_data, ac_data_wt = data.aggregate_landuses(node_data, edge_data, node_edge_map, data_map, distances, betas, landuse_encodings=landuse_encodings, qs=qs, mixed_use_hill_keys=hill_keys, mixed_use_other_keys=non_hill_keys, accessibility_keys=ac_keys, cl_disparity_wt_matrix=mock_matrix, angular=False) # hill hill = mu_data_hill[np.where(hill_keys == 0)][0] hill_branch_wt = mu_data_hill[np.where(hill_keys == 1)][0] hill_pw_wt = mu_data_hill[np.where(hill_keys == 2)][0] hill_disp_wt = mu_data_hill[np.where(hill_keys == 3)][0] # non hill shannon = mu_data_other[np.where(non_hill_keys == 0)][0] gini = mu_data_other[np.where(non_hill_keys == 1)][0] raos = mu_data_other[np.where(non_hill_keys == 2)][0] # access non-weighted ac_1_nw = ac_data[np.where(ac_keys == 1)][0] ac_2_nw = ac_data[np.where(ac_keys == 2)][0] ac_5_nw = ac_data[np.where(ac_keys == 5)][0] # access weighted ac_1_w = ac_data_wt[np.where(ac_keys == 1)][0] ac_2_w = ac_data_wt[np.where(ac_keys == 
2)][0] ac_5_w = ac_data_wt[np.where(ac_keys == 5)][0] # test manual metrics against all nodes mu_max_unique = len(landuse_classes) # test against various distances for d_idx in range(len(distances)): dist_cutoff = distances[d_idx] beta = betas[d_idx] for src_idx in range(len(primal_graph)): reachable_data, reachable_data_dist, tree_preds = data.aggregate_to_src_idx(src_idx, node_data, edge_data, node_edge_map, data_map, dist_cutoff) # counts of each class type (array length per max unique classes - not just those within max distance) cl_counts = np.full(mu_max_unique, 0) # nearest of each class type (likewise) cl_nearest = np.full(mu_max_unique, np.inf) # aggregate a_1_nw = 0 a_2_nw = 0 a_5_nw = 0 a_1_w = 0 a_2_w = 0 a_5_w = 0 # iterate reachable for data_idx, (reachable, data_dist) in enumerate(zip(reachable_data, reachable_data_dist)): if not reachable: continue cl = landuse_encodings[data_idx] # double check distance is within threshold assert data_dist <= dist_cutoff # update the class counts cl_counts[cl] += 1 # if distance is nearer, update the nearest distance array too if data_dist < cl_nearest[cl]: cl_nearest[cl] = data_dist # aggregate accessibility codes if cl == 1: a_1_nw += 1 a_1_w += np.exp(-beta * data_dist) elif cl == 2: a_2_nw += 1 a_2_w += np.exp(-beta * data_dist) elif cl == 5: a_5_nw += 1 a_5_w += np.exp(-beta * data_dist) # assertions assert ac_1_nw[d_idx, src_idx] == a_1_nw assert ac_2_nw[d_idx, src_idx] == a_2_nw assert ac_5_nw[d_idx, src_idx] == a_5_nw assert ac_1_w[d_idx, src_idx] == a_1_w assert ac_2_w[d_idx, src_idx] == a_2_w assert ac_5_w[d_idx, src_idx] == a_5_w assert hill[0, d_idx, src_idx] == diversity.hill_diversity(cl_counts, 0) assert hill[1, d_idx, src_idx] == diversity.hill_diversity(cl_counts, 1) assert hill[2, d_idx, src_idx] == diversity.hill_diversity(cl_counts, 2) assert hill_branch_wt[0, d_idx, src_idx] == \ diversity.hill_diversity_branch_distance_wt(cl_counts, cl_nearest, 0, beta) assert hill_branch_wt[1, d_idx, src_idx] 
== \ diversity.hill_diversity_branch_distance_wt(cl_counts, cl_nearest, 1, beta) assert hill_branch_wt[2, d_idx, src_idx] == \ diversity.hill_diversity_branch_distance_wt(cl_counts, cl_nearest, 2, beta) assert hill_pw_wt[0, d_idx, src_idx] == \ diversity.hill_diversity_pairwise_distance_wt(cl_counts, cl_nearest, 0, beta) assert hill_pw_wt[1, d_idx, src_idx] == \ diversity.hill_diversity_pairwise_distance_wt(cl_counts, cl_nearest, 1, beta) assert hill_pw_wt[2, d_idx, src_idx] == \ diversity.hill_diversity_pairwise_distance_wt(cl_counts, cl_nearest, 2, beta) assert hill_disp_wt[0, d_idx, src_idx] == \ diversity.hill_diversity_pairwise_matrix_wt(cl_counts, mock_matrix, 0) assert hill_disp_wt[1, d_idx, src_idx] == \ diversity.hill_diversity_pairwise_matrix_wt(cl_counts, mock_matrix, 1) assert hill_disp_wt[2, d_idx, src_idx] == \ diversity.hill_diversity_pairwise_matrix_wt(cl_counts, mock_matrix, 2) assert shannon[d_idx, src_idx] == diversity.shannon_diversity(cl_counts) assert gini[d_idx, src_idx] == diversity.gini_simpson_diversity(cl_counts) assert raos[d_idx, src_idx] == diversity.raos_quadratic_diversity(cl_counts, mock_matrix) # check that angular is passed-through # actual angular tests happen in test_shortest_path_tree() # here the emphasis is simply on checking that the angular instruction gets chained through # setup dual data G_dual = graphs.nX_to_dual(primal_graph) node_labels_dual, node_data_dual, edge_data_dual, node_edge_map_dual = graphs.graph_maps_from_nX(G_dual) data_dict_dual = mock.mock_data_dict(G_dual, random_seed=13) data_uids_dual, data_map_dual = layers.data_map_from_dict(data_dict_dual) data_map_dual = data.assign_to_network(data_map_dual, node_data_dual, edge_data_dual, node_edge_map_dual, 500) mock_categorical = mock.mock_categorical_data(len(data_map_dual)) landuse_classes_dual, landuse_encodings_dual = layers.encode_categorical(mock_categorical) mock_matrix = np.full((len(landuse_classes_dual), len(landuse_classes_dual)), 1) mu_hill_dual, 
mu_other_dual, ac_dual, ac_wt_dual = data.aggregate_landuses(node_data_dual, edge_data_dual, node_edge_map_dual, data_map_dual, distances, betas, landuse_encodings_dual, qs=qs, mixed_use_hill_keys=hill_keys, mixed_use_other_keys=non_hill_keys, accessibility_keys=ac_keys, cl_disparity_wt_matrix=mock_matrix, angular=True) mu_hill_dual_sidestep, mu_other_dual_sidestep, ac_dual_sidestep, ac_wt_dual_sidestep = \ data.aggregate_landuses(node_data_dual, edge_data_dual, node_edge_map_dual, data_map_dual, distances, betas, landuse_encodings_dual, qs=qs, mixed_use_hill_keys=hill_keys, mixed_use_other_keys=non_hill_keys, accessibility_keys=ac_keys, cl_disparity_wt_matrix=mock_matrix, angular=False) assert not np.allclose(mu_hill_dual, mu_hill_dual_sidestep, atol=0.001, rtol=0) assert not np.allclose(mu_other_dual, mu_other_dual_sidestep, atol=0.001, rtol=0) assert not np.allclose(ac_dual, ac_dual_sidestep, atol=0.001, rtol=0) assert not np.allclose(ac_wt_dual, ac_wt_dual_sidestep, atol=0.001, rtol=0)
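# Illustrative sketch only: the manual accessibility aggregation that the test above performs
# inside its inner loop, condensed into a vectorised helper. The name `accessibility_sketch`
# and its arguments are hypothetical; it assumes the distances to each data point have
# already been computed (the test obtains them from `data.aggregate_to_src_idx`).

```python
import numpy as np


def accessibility_sketch(data_dists, data_classes, target_class, beta, dist_cutoff):
    """Return (non_weighted, weighted) accessibility for one class (hypothetical helper)."""
    dists = np.asarray(data_dists, dtype=float)
    classes = np.asarray(data_classes)
    # keep only data points of the target class within the distance threshold
    mask = (classes == target_class) & (dists <= dist_cutoff)
    # non-weighted form: a simple count of reachable instances
    non_weighted = float(np.count_nonzero(mask))
    # weighted form: each instance is discounted by a negative exponential of its distance
    weighted = float(np.sum(np.exp(-beta * dists[mask])))
    return non_weighted, weighted


nw, wt = accessibility_sketch(data_dists=[100.0, 250.0, 900.0, 50.0],
                              data_classes=[1, 1, 1, 2],
                              target_class=1,
                              beta=0.01,
                              dist_cutoff=400.0)
```

# Here two instances of class 1 fall within 400m, so `nw` is 2.0 and `wt` is the sum of
# exp(-0.01 * 100) and exp(-0.01 * 250).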
def test_compute_landuses(primal_graph): betas = np.array([0.01, 0.005]) distances = networks.distance_from_beta(betas) # network layer N = networks.NetworkLayerFromNX(primal_graph, distances=distances) node_map = N._node_data edge_map = N._edge_data node_edge_map = N._node_edge_map # data layer data_dict = mock.mock_data_dict(primal_graph) qs = np.array([0, 1, 2]) D = layers.DataLayerFromDict(data_dict) # check single metrics independently against underlying for some use-cases, e.g. hill, non-hill, accessibility... D.assign_to_network(N, max_dist=500) # generate some mock landuse data landuse_labels = mock.mock_categorical_data(len(data_dict)) landuse_classes, landuse_encodings = layers.encode_categorical( landuse_labels) # compute hill mixed uses D.compute_landuses(landuse_labels, mixed_use_keys=['hill_branch_wt'], qs=qs) # test against underlying method data_map = D._data mu_data_hill, mu_data_other, ac_data, ac_data_wt = data.aggregate_landuses( node_map, edge_map, node_edge_map, data_map, distances, betas, landuse_encodings, qs=qs, mixed_use_hill_keys=np.array([1])) for q_idx, q_key in enumerate(qs): for d_idx, d_key in enumerate(distances): assert np.allclose( N.metrics['mixed_uses']['hill_branch_wt'][q_key][d_key], mu_data_hill[0][q_idx][d_idx], atol=0.001, rtol=0) # gini simpson D.compute_landuses(landuse_labels, mixed_use_keys=['gini_simpson']) # test against underlying method data_map = D._data mu_data_hill, mu_data_other, ac_data, ac_data_wt = data.aggregate_landuses( node_map, edge_map, node_edge_map, data_map, distances, betas, landuse_encodings, mixed_use_other_keys=np.array([1])) for d_idx, d_key in enumerate(distances): assert np.allclose(N.metrics['mixed_uses']['gini_simpson'][d_key], mu_data_other[0][d_idx], atol=0.001, rtol=0) # accessibilities D.compute_landuses(landuse_labels, accessibility_keys=['c']) # test against underlying method data_map = D._data mu_data_hill, mu_data_other, ac_data, ac_data_wt = data.aggregate_landuses( node_map, 
edge_map, node_edge_map, data_map, distances, betas, landuse_encodings, accessibility_keys=np.array([landuse_classes.index('c')])) for d_idx, d_key in enumerate(distances): assert np.allclose( N.metrics['accessibility']['non_weighted']['c'][d_key], ac_data[0][d_idx], atol=0.001, rtol=0) assert np.allclose(N.metrics['accessibility']['weighted']['c'][d_key], ac_data_wt[0][d_idx], atol=0.001, rtol=0) # also check the number of returned types for a few assortments of metrics mixed_uses_hill_types = np.array([ 'hill', 'hill_branch_wt', 'hill_pairwise_wt', 'hill_pairwise_disparity' ]) mixed_use_other_types = np.array( ['shannon', 'gini_simpson', 'raos_pairwise_disparity']) ac_codes = np.array(landuse_classes) # mixed uses hill mu_hill_random = np.arange(len(mixed_uses_hill_types)) np.random.shuffle(mu_hill_random) # mixed uses other mu_other_random = np.arange(len(mixed_use_other_types)) np.random.shuffle(mu_other_random) # accessibility ac_random = np.arange(len(landuse_classes)) np.random.shuffle(ac_random) # mock disparity matrix mock_disparity_wt_matrix = np.full( (len(landuse_classes), len(landuse_classes)), 1) # not necessary to do all labels, first few should do for mu_h_min in range(3): mu_h_keys = np.array(mu_hill_random[mu_h_min:]) for mu_o_min in range(3): mu_o_keys = np.array(mu_other_random[mu_o_min:]) for ac_min in range(3): ac_keys = np.array(ac_random[ac_min:]) # in the final case, set accessibility to a single code otherwise an error would be raised if len(mu_h_keys) == 0 and len(mu_o_keys) == 0 and len( ac_keys) == 0: ac_keys = np.array([0]) # randomise order of keys and metrics mu_h_metrics = mixed_uses_hill_types[mu_h_keys] mu_o_metrics = mixed_use_other_types[mu_o_keys] ac_metrics = ac_codes[ac_keys] # prepare network and compute N_temp = networks.NetworkLayerFromNX(primal_graph, distances=distances) D_temp = layers.DataLayerFromDict(data_dict) D_temp.assign_to_network(N_temp, max_dist=500) D_temp.compute_landuses( landuse_labels, 
mixed_use_keys=list(mu_h_metrics) + list(mu_o_metrics), accessibility_keys=ac_metrics, cl_disparity_wt_matrix=mock_disparity_wt_matrix, qs=qs) # test against underlying method mu_data_hill, mu_data_other, ac_data, ac_data_wt = \ data.aggregate_landuses(node_map, edge_map, node_edge_map, data_map, distances, betas, landuse_encodings, qs=qs, mixed_use_hill_keys=mu_h_keys, mixed_use_other_keys=mu_o_keys, accessibility_keys=ac_keys, cl_disparity_wt_matrix=mock_disparity_wt_matrix) for mu_h_idx, mu_h_met in enumerate(mu_h_metrics): for q_idx, q_key in enumerate(qs): for d_idx, d_key in enumerate(distances): assert np.allclose( N_temp.metrics['mixed_uses'][mu_h_met][q_key] [d_key], mu_data_hill[mu_h_idx][q_idx][d_idx], atol=0.001, rtol=0) for mu_o_idx, mu_o_met in enumerate(mu_o_metrics): for d_idx, d_key in enumerate(distances): assert np.allclose( N_temp.metrics['mixed_uses'][mu_o_met][d_key], mu_data_other[mu_o_idx][d_idx], atol=0.001, rtol=0) for ac_idx, ac_met in enumerate(ac_metrics): for d_idx, d_key in enumerate(distances): assert np.allclose(N_temp.metrics['accessibility'] ['non_weighted'][ac_met][d_key], ac_data[ac_idx][d_idx], atol=0.001, rtol=0) assert np.allclose(N_temp.metrics['accessibility'] ['weighted'][ac_met][d_key], ac_data_wt[ac_idx][d_idx], atol=0.001, rtol=0) # most integrity checks happen in underlying method, though check here for mismatching labels length and typos with pytest.raises(ValueError): D.compute_landuses(landuse_labels[-1], mixed_use_keys=['shannon']) with pytest.raises(ValueError): D.compute_landuses(landuse_labels, mixed_use_keys=['spelling_typo']) # don't check accessibility_labels for typos - because only warning is triggered (not all labels will be in all data) # check that unassigned data layer flags with pytest.raises(ValueError): D_new = layers.DataLayerFromDict(data_dict) D_new.compute_landuses(landuse_labels, mixed_use_keys=['shannon'])
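# Illustrative sketch only: how a results array indexed as [q][distance][node] maps onto the
# nested `metrics` dictionary that the assertions above read from. The helper name
# `unpack_hill_results` is hypothetical and the array here is mock data, not library output.

```python
import numpy as np


def unpack_hill_results(metrics, label, qs, distances, results):
    """Write results of shape (len(qs), len(distances), n_nodes) into a nested dict."""
    label_dict = metrics.setdefault('mixed_uses', {}).setdefault(label, {})
    for q_idx, q_key in enumerate(qs):
        # q keys are overwritten, mirroring the behaviour of compute_landuses
        label_dict[q_key] = {}
        for d_idx, d_key in enumerate(distances):
            label_dict[q_key][d_key] = results[q_idx][d_idx]
    return metrics


qs = [0, 1]
distances = [800, 1600]
# mock results for two q values, two distances, and three nodes
results = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)
metrics = unpack_hill_results({}, 'hill', qs, distances, results)
# per-node values for q=1 at the 800m threshold
values = metrics['mixed_uses']['hill'][1][800]
```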