The hippocampus of the mammalian brain supports spatial navigation by building cognitive maps of the environments that the animal explores. Currently, there is little neurocomputational work investigating the encoding and decoding mechanisms of hippocampal neural representations in large-scale environments. We propose a biologically inspired hierarchical neural network architecture that learns to transform egocentric sensorimotor inputs into allocentric spatial representations for navigation. The hierarchical network is composed of two parallel subnetworks, mimicking the lateral entorhinal cortex (LEC) and the medial entorhinal cortex (MEC), and one convergent subnetwork mimicking the hippocampus. The LEC subnetwork relays time-related visual information, while the MEC subnetwork supplies space-related information in the form of multi-resolution grid codes obtained by integrating movement information. The convergent subnetwork integrates all information from the parallel subnetworks and predicts the position of the agent in the environment. Synaptic weights of the vision-to-place and grid-to-place connections are learned by stochastic gradient descent. Simulations in a large virtual maze demonstrate that hippocampal place units in the model form multiple, irregularly spaced place fields, similar to those observed in neurobiological experiments. The model accurately decodes the agent's position from the learned spatial representations. Moreover, the model can adapt to degraded visual inputs and is therefore robust to such perturbations. When motion inputs are removed, however, the model has difficulty localizing and its position predictions become less accurate.
This work is published in Neurocomputing, 453, 579-589.
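To make the described pipeline concrete, the snippet below is a minimal sketch, not the authors' published implementation: it assumes simple linear vision-to-place and grid-to-place connections converging on a place-unit layer, a 2D position readout, and SGD training on placeholder data. All names, layer sizes, and inputs here are hypothetical.

```python
# Hypothetical sketch of the LEC/MEC-to-hippocampus architecture described above.
# Layer sizes and the random training batch are placeholder assumptions.
import torch
import torch.nn as nn

class EntorhinalHippocampalNet(nn.Module):
    def __init__(self, n_vision=128, n_grid=96, n_place=200):
        super().__init__()
        # LEC-like pathway: visual (time-related) input onto place units
        self.vision_to_place = nn.Linear(n_vision, n_place)
        # MEC-like pathway: multi-resolution grid codes onto place units
        self.grid_to_place = nn.Linear(n_grid, n_place)
        # Readout: decode a 2D position from the convergent place representation
        self.place_to_position = nn.Linear(n_place, 2)

    def forward(self, vision, grid):
        # Hippocampal place units integrate both parallel streams
        place = torch.relu(self.vision_to_place(vision) + self.grid_to_place(grid))
        return self.place_to_position(place), place

model = EntorhinalHippocampalNet()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for step in range(1000):
    # Placeholder batch: random visual features, grid codes, and true positions
    vision = torch.randn(32, 128)
    grid = torch.randn(32, 96)
    position = torch.rand(32, 2)
    pred, _ = model(vision, grid)
    loss = loss_fn(pred, position)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In this sketch, zeroing out the grid input would correspond loosely to the motion-deprivation condition, and degrading the vision tensor to the visual-perturbation condition, but the actual experimental protocol is described in the paper itself.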