Space2Vec-sphereM

SphereMixScaleSpatialRelationLocationEncoder

Overview

The SphereMixScaleSpatialRelationLocationEncoder encodes spatial relationships between locations in two stages. It first applies the SphereMixScaleSpatialRelationPositionEncoder to the input coordinates, then passes the result through a multi-layer feed-forward neural network to produce high-dimensional spatial embeddings.

Features

  • Position Encoding (self.position_encoder): Uses the SphereMixScaleSpatialRelationPositionEncoder to encode spatial differences (deltaX, deltaY) with geometrically scaled sinusoidal functions.

  • Feed-Forward Neural Network (self.ffn): Transforms the position-encoded data through several neural network layers to produce high-dimensional spatial embeddings.

Configuration Parameters

  • spa_embed_dim: The dimensionality of the output spatial embeddings.

  • coord_dim: The dimensionality of the coordinate space, typically 2.

  • device: Specifies the computation device, e.g., ‘cuda’.

  • frequency_num: Number of frequency components used in positional encoding.

  • max_radius: The largest spatial context radius the model can handle.

  • min_radius: The smallest spatial context radius, which sets the finest scale the encoder can resolve.

  • freq_init: Initialization method for frequency calculation, set to ‘geometric’.

  • ffn_act: Activation function used in the MLP layers.

  • ffn_num_hidden_layers: Number of hidden layers in the feed-forward network.

  • ffn_dropout_rate: Dropout rate for regularization within the MLP.

  • ffn_hidden_dim: Dimension of each hidden layer within the MLP.

  • ffn_use_layernormalize: Boolean to enable layer normalization within the MLP.

  • ffn_skip_connection: Boolean to enable skip connections within the MLP, which can make deeper networks easier to train.

  • ffn_context_str: Context string for debugging and detailed logging within the network.
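The interplay of frequency_num, min_radius, max_radius and freq_init can be illustrated with a small sketch. It assumes that 'geometric' means the spatial scales grow by a constant factor from min_radius to max_radius and that each frequency is the reciprocal of its scale; the function name geometric_frequencies is hypothetical and the source does not confirm the exact formula.

```python
import math


def geometric_frequencies(frequency_num, min_radius, max_radius):
    """Return frequency_num frequencies for geometrically spaced scales.

    Assumed scheme: scales run from min_radius to max_radius with a
    constant ratio between neighbours; frequency = 1 / scale.
    """
    if frequency_num < 2:
        raise ValueError("frequency_num must be at least 2")
    # Constant step in log-space between consecutive scales.
    log_step = math.log(max_radius / min_radius) / (frequency_num - 1)
    scales = [min_radius * math.exp(i * log_step) for i in range(frequency_num)]
    return [1.0 / s for s in scales]


# Same values as the usage example in this document.
freqs = geometric_frequencies(frequency_num=16, min_radius=10, max_radius=10000)
```

Under this assumption the first frequency corresponds to min_radius and the last to max_radius, so widening the radius range spreads the same number of frequencies over more scales.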

Methods

forward(coords)

Processes input coordinates through the location encoder to generate final spatial embeddings.

  • Parameters:
    • coords (List or np.ndarray): Coordinates to process, formatted as (batch_size, num_context_pt, coord_dim).

  • Returns:
    • sprenc (Tensor): Spatial relation embeddings with a shape of (batch_size, num_context_pt, spa_embed_dim).
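The shape contract of forward can be sketched in plain Python. This is not the class's implementation: position_encode is a single-frequency stand-in for the position encoder, dense_relu is a single layer standing in for the feed-forward network, and the weights are random. Dropout, layer normalization and skip connections are left out.

```python
import math
import random


def position_encode(point):
    """Stand-in encoder: single-frequency sinusoidal code for one (x, y)."""
    x, y = point
    return [math.sin(x), math.cos(x), math.sin(y), math.cos(y)]


def dense_relu(vector, weights, biases):
    """One fully connected layer followed by ReLU."""
    return [
        max(0.0, sum(w * v for w, v in zip(row, vector)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(coords, weights, biases):
    """Map (batch, num_context_pt, 2) coords to (batch, num_context_pt, out_dim)."""
    return [
        [dense_relu(position_encode(pt), weights, biases) for pt in context]
        for context in coords
    ]


rng = random.Random(0)
spa_embed_dim, enc_dim = 8, 4  # illustrative sizes
weights = [[rng.uniform(-1, 1) for _ in range(enc_dim)] for _ in range(spa_embed_dim)]
biases = [0.0] * spa_embed_dim

# batch_size=2, num_context_pt=3, coord_dim=2
coords = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[1.0, 1.5], [2.0, 2.5], [3.0, 3.5]],
]
embeddings = forward(coords, weights, biases)
```

The first two dimensions of the input pass through unchanged; only the last dimension changes, from coord_dim to spa_embed_dim.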

SphereMixScaleSpatialRelationPositionEncoder

Overview

Transforms spatial coordinates into high-dimensional vectors by applying sinusoidal functions at multiple frequencies, so that the downstream network can distinguish spatial patterns at both fine and coarse scales.

Assumptions for Grid-Structured Data

Spatial Regularity

Grid data often comes in regular, evenly spaced intervals, such as pixels in images or cells in raster GIS data.

Two-Dimensional Structure

Most grid data is two-dimensional, requiring simultaneous encoding of both dimensions to capture spatial relationships effectively.

Formula Development

  • Base Sinusoidal Encoding

For each coordinate component \(x\) and \(y\), apply sinusoidal functions across multiple scales:

\[E(x, y) = \bigoplus_{i=0}^{L-1} \left[ \sin(\omega_i x), \cos(\omega_i x), \sin(\omega_i y), \cos(\omega_i y) \right]\]

Where:

  • \(\bigoplus\) denotes vector concatenation.

  • \(L\) is the number of different frequencies used.

  • \(\omega_i\) are the scaled frequencies.
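The base encoding can be written out directly from the formula. The sketch below is a minimal illustration for a single point; the function name base_encoding and the example frequencies are illustrative choices, not part of the encoder's API.

```python
import math


def base_encoding(x, y, omegas):
    """Concatenate [sin(w x), cos(w x), sin(w y), cos(w y)] for each w."""
    features = []
    for w in omegas:
        features.extend(
            [math.sin(w * x), math.cos(w * x), math.sin(w * y), math.cos(w * y)]
        )
    return features


omegas = [1.0, 2.0, 4.0]  # L = 3 illustrative frequencies
vec = base_encoding(0.5, -0.25, omegas)  # 4 * L = 12 values
```

Each frequency contributes four values, so the vector length is \(4L\).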

  • Frequency Scaling

Given the grid structure, frequency scaling might be adapted based on typical distances or resolutions encountered in grid data:

\[\omega_i = \pi \cdot \left(\frac{2^i}{\text{cell size}}\right)\]

This scaling method aligns the frequency increments with the spatial resolution of grid cells, allowing the encoder to capture variations within and between cells.
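A short sketch of this frequency schedule follows. The function name grid_frequencies and the cell size of 30 units are illustrative assumptions.

```python
import math


def grid_frequencies(num_freqs, cell_size):
    """Return omega_i = pi * 2**i / cell_size for i = 0 .. num_freqs - 1."""
    return [math.pi * (2 ** i) / cell_size for i in range(num_freqs)]


omegas = grid_frequencies(num_freqs=4, cell_size=30.0)
# Spatial period of each sinusoid, in the same units as cell_size.
periods = [2 * math.pi / w for w in omegas]
```

With this schedule the lowest frequency has a period of two cells, and each further frequency halves the period, so higher indices resolve progressively finer detail inside a cell.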

  • Enhanced Spatial Encoding

To account for the two-dimensional nature of grid data and potentially the interactions between grid cells, the encoding can be expanded to include mixed terms that combine \(x\) and \(y\) coordinates:

\[E_{\text{enhanced}}(x, y) = E(x, y) \oplus \bigoplus_{i=0}^{L-1} \left[ \sin(\omega_i x) \cdot \cos(\omega_i y), \cos(\omega_i x) \cdot \sin(\omega_i y) \right]\]

These mixed terms help to model cross-dimensional spatial interactions, which are critical in grid-like structures where horizontal and vertical relationships might influence the spatial analysis.

  • Output Dimensionality

The output dimensionality, considering the enhanced encoding, becomes:

\[\text{Output Dim} = 4L + 2L = 6L\]

Where \(4L\) comes from the original sinusoidal terms for \(x\) and \(y\), and \(2L\) from the mixed terms added for cross-dimensional interactions.
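The enhanced encoding and its \(6L\) output size can be checked with a small sketch. The function name enhanced_encoding and the choice of a unit cell size are assumptions made for illustration.

```python
import math


def enhanced_encoding(x, y, omegas):
    """Base sinusoidal terms (4 per frequency) followed by mixed terms (2 per frequency)."""
    base, mixed = [], []
    for w in omegas:
        sx, cx = math.sin(w * x), math.cos(w * x)
        sy, cy = math.sin(w * y), math.cos(w * y)
        base.extend([sx, cx, sy, cy])
        mixed.extend([sx * cy, cx * sy])
    return base + mixed


# L = 4 frequencies with omega_i = pi * 2**i (cell size of 1, assumed).
omegas = [math.pi * 2 ** i for i in range(4)]
vec = enhanced_encoding(0.3, 0.7, omegas)  # 6 * L = 24 values
```

By the angle-addition identity, each pair of mixed terms sums to \(\sin(\omega_i (x + y))\), which is one way to see that they carry information about the diagonal direction that the separate \(x\) and \(y\) terms do not expose directly.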

Features

  • Geometric Frequency Scaling: Employs a geometric progression of frequencies for sinusoidal encoding, capturing a broad range of spatial details.

  • Configurable Parameters: Supports adjustments in encoding dimensions, frequency range, and computational resources.

Configuration Parameters

  • coord_dim: The dimensionality of the space being encoded.

  • frequency_num: The number of frequencies used for encoding.

  • device: Specifies the computational device.

Methods

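cal_elementwise_angle(coord, cur_freq)

Computes the angle passed to the sinusoidal functions for one coordinate component at one frequency.

  • Parameters:
    • coord: A single coordinate difference, either deltaX or deltaY.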

    • cur_freq: The frequency index.

  • Returns:
    • The calculated angle for the sinusoidal transformation.

cal_coord_embed(coords_tuple)

Converts a batch of coordinates into sinusoidally-encoded vectors.

  • Parameters:
    • coords_tuple: Tuple of spatial differences.

  • Returns:
    • High-dimensional vector representing the encoded spatial relationships.

Usage Example

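import numpy as np

# SphereMixScaleSpatialRelationLocationEncoder must be imported from the
# module that defines it in your checkout.
encoder = SphereMixScaleSpatialRelationLocationEncoder(
    spa_embed_dim=64,
    coord_dim=2,
    device="cuda",  # requires a GPU
    frequency_num=16,
    max_radius=10000,
    min_radius=10,
    freq_init="geometric",
    ffn_act="relu",
    ffn_num_hidden_layers=1,
    ffn_dropout_rate=0.5,
    ffn_hidden_dim=256,
    ffn_use_layernormalize=True,
    ffn_skip_connection=True,
    ffn_context_str="SphereMixScaleSpatialRelationEncoder"
)

# Example coordinate data with shape (batch_size=1, num_context_pt=2, coord_dim=2),
# as required by forward().
coords = np.array([[[34.0522, -118.2437], [40.7128, -74.0060]]])
embeddings = encoder.forward(coords)  # shape (1, 2, 64)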