A. Arleo and W. Gerstner, Spatial cognition and neuro-mimetic navigation: a model of hippocampal place cell activity, Biological cybernetics, vol.83, issue.3, pp.287-299, 2000.

L. Aubin, M. Khamassi, and B. Girard, Prioritized sweeping neural DynaQ with multiple predecessors, and hippocampal replays, Conference on Biomimetic and Biohybrid Systems, pp.16-27, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01709275

A. G. Barto, Adaptive critics and the basal ganglia, Models of Information Processing in the Basal Ganglia, pp.215-232, 1995.

A. G. Barto, S. J. Bradtke, and S. P. Singh, Learning to act using real-time dynamic programming, Artificial intelligence, vol.72, issue.1-2, pp.81-138, 1995.

F. P. Battaglia, A. Peyrache, M. Khamassi, and S. I. Wiener, Spatial decisions and neuronal activity in hippocampal projection zones in prefrontal cortex and striatum, Hippocampal Place Fields: Relevance to Learning and Memory, pp.289-311, 2008.

K. Benchenane, A. Peyrache, M. Khamassi, P. L. Tierney, Y. Gioanni et al., Coherent theta oscillations and reorganization of spike timing in the hippocampal-prefrontal network upon learning, Neuron, vol.66, issue.6, pp.921-936, 2010.
URL : https://hal.archives-ouvertes.fr/hal-00554482

U. S. Bhalla, Dendrites, deep learning, and sequences in the hippocampus, Hippocampus, vol.29, issue.3, pp.239-251, 2019.

G. Buzsáki, Two-stage model of memory trace formation: A role for "noisy" brain states, Neuroscience, vol.31, issue.3, pp.551-570, 1989.

K. Caluwaerts, M. Staffa, S. N'Guyen, C. Grand, L. Dollé et al., A biologically inspired meta-control navigation system for the Psikharpax rat robot, Bioinspiration & biomimetics, vol.7, issue.2, p.025009, 2012.
URL : https://hal.archives-ouvertes.fr/hal-01000945

R. Cazé, M. Khamassi, L. Aubin, and B. Girard, Hippocampal replays under the scrutiny of reinforcement learning models, Journal of neurophysiology, vol.120, issue.6, pp.2877-2896, 2018.

P. Cisek, G. A. Puskas, and S. El-Murr, Decisions in changing conditions: the urgency-gating model, Journal of Neuroscience, vol.29, issue.37, pp.11560-11571, 2009.

V. Cutsuridis and M. Hasselmo, Spatial memory sequence encoding and replay during modeled theta and ripple oscillations, Cognitive Computation, vol.3, issue.4, pp.554-574, 2011.

N. D. Daw, Y. Niv, and P. Dayan, Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control, Nature neuroscience, vol.8, issue.12, p.1704, 2005.

K. Diba and G. Buzsáki, Forward and reverse hippocampal place-cell sequences during ripples, Nature neuroscience, vol.10, issue.10, p.1241, 2007.

L. Dollé, M. Khamassi, B. Girard, A. Guillot, and R. Chavarriaga, Analyzing interactions between navigation strategies using a computational model of action selection, International Conference on Spatial Cognition, pp.71-86, 2008.

L. Dollé, D. Sheynikhovich, B. Girard, R. Chavarriaga, and A. Guillot, Path planning versus cue responding: a bio-inspired model of switching between navigation strategies, Biological cybernetics, vol.103, issue.4, pp.299-317, 2010.

L. Dollé, R. Chavarriaga, A. Guillot, and M. Khamassi, Interactions of spatial strategies producing generalization gradient and blocking: A computational approach, PLoS computational biology, vol.14, issue.4, p.1006092, 2018.

D. Foster, R. Morris, and P. Dayan, A model of hippocampally dependent navigation, using the temporal difference learning rule, Hippocampus, vol.10, issue.1, pp.1-16, 2000.

D. J. Foster, Replay comes of age, Annual review of neuroscience, vol.40, pp.581-602, 2017.

D. J. Foster and M. A. Wilson, Reverse replay of behavioural sequences in hippocampal place cells during the awake state, Nature, vol.440, issue.7084, pp.680-683, 2006.

M. J. Frank and E. D. Claus, Anatomy of a decision: striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal, Psychological review, vol.113, issue.2, p.300, 2006.

P. W. Frankland and B. Bontempi, The organization of recent and remote memories, Nature reviews Neuroscience, vol.6, issue.2, pp.119-130, 2005.

G. Girardeau, K. Benchenane, S. I. Wiener, G. Buzsáki, and M. B. Zugaro, Selective suppression of hippocampal ripples impairs spatial memory, Nature neuroscience, vol.12, issue.10, pp.1222-1223, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00599372

A. Guazzelli, M. Bota, F. J. Corbacho, and M. A. Arbib, Affordances, motivations, and the world graph theory, Adaptive Behavior, vol.6, issue.3-4, pp.435-471, 1998.

A. S. Gupta, M. van der Meer, D. S. Touretzky, and A. D. Redish, Hippocampal replay is not a simple function of experience, Neuron, vol.65, issue.5, pp.695-705, 2010.

S. P. Jadhav, C. Kemere, P. W. German, and L. M. Frank, Awake hippocampal sharp-wave ripples support spatial memory, Science, vol.336, issue.6087, pp.1454-1458, 2012.

S. Jahnke, M. Timme, and R. M. Memmesheimer, A unified dynamic model for learning, replay, and sharp-wave/ripples, Journal of Neuroscience, vol.35, issue.49, pp.16236-16258, 2015.

A. Johnson and A. D. Redish, Hippocampal replay contributes to within session learning in a temporal difference reinforcement learning model, Neural Networks, vol.18, issue.9, pp.1163-1171, 2005.

A. Johnson and A. D. Redish, Neural ensembles in CA3 transiently encode paths forward of the animal at a decision point, Journal of Neuroscience, vol.27, issue.45, pp.12176-12189, 2007.

A. Johnson, M. A. van der Meer, and A. D. Redish, Integrating hippocampus and striatum in decision-making, Current opinion in neurobiology, vol.17, issue.6, pp.692-697, 2007.

J. L. Jones, G. R. Esber, M. A. McDannald, A. J. Gruber, A. Hernandez et al., Orbitofrontal cortex supports behavior and learning using inferred but not cached values, Science, vol.338, issue.6109, pp.953-956, 2012.

M. P. Karlsson and L. M. Frank, Awake replay of remote experiences in the hippocampus, Nature neuroscience, vol.12, issue.7, p.913, 2009.

M. Khamassi and M. D. Humphries, Integrating cortico-limbic-basal ganglia architectures for learning model-based and model-free navigation strategies, Frontiers in Behavioral Neuroscience, vol.6, p.79, 2012.
URL : https://hal.archives-ouvertes.fr/hal-01219958

M. Khamassi, R. Quilodran, P. Enel, P. Dominey, and E. Procyk, Behavioral regulation and the modulation of information coding in the lateral prefrontal and cingulate cortex, Cerebral Cortex, vol.25, issue.9, pp.3197-3218, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01219972

M. C. Klein-Flügge, H. C. Barron, K. H. Brodersen, R. J. Dolan, and T. Behrens, Segregated encoding of reward-identity and stimulus-reward associations in human orbitofrontal cortex, Journal of Neuroscience, vol.33, issue.7, pp.3202-3211, 2013.

C. S. Lansink, P. M. Goltstein, J. V. Lankelma, B. L. McNaughton, and C. Pennartz, Hippocampus leads ventral striatum in replay of place-reward information, PLoS Biology, vol.7, issue.8, 2009.

G. de Lavilléon, M. M. Lacroix, L. Rondi-Reig, and K. Benchenane, Explicit memory creation during sleep demonstrates a causal role of place cells in navigation, Nature neuroscience, vol.18, issue.4, pp.493-495, 2015.

A. K. Lee and M. A. Wilson, Memory of sequential experience in the hippocampus during slow wave sleep, Neuron, vol.36, issue.6, pp.1183-1194, 2002.

W. B. Levy, A sequence predicting CA3 is a flexible associator that learns and uses context to solve hippocampal-like tasks, Hippocampus, vol.6, issue.6, pp.579-590, 1996.

L. J. Lin, Self-improving reactive agents based on reinforcement learning, planning and teaching, Machine learning, vol.8, issue.3-4, pp.69-97, 1992.

N. Maingret, G. Girardeau, R. Todorova, M. Goutierre, and M. Zugaro, Hippocampo-cortical coupling mediates memory consolidation during sleep, Nature neuroscience, vol.19, issue.7, pp.959-964, 2016.
URL : https://hal.archives-ouvertes.fr/hal-02365552

M. G. Mattar and N. D. Daw, Prioritized memory access explains planning and hippocampal replay, Nature Neuroscience, vol.21, issue.11, p.1609, 2018.

M. van der Meer, Z. Kurth-Nelson, and A. D. Redish, Information processing in decision-making systems, The Neuroscientist, vol.18, issue.4, pp.342-359, 2012.

E. K. Miller and J. D. Cohen, An integrative theory of prefrontal cortex function, Annual review of neuroscience, vol.24, issue.1, pp.167-202, 2001.

A. W. Moore and C. G. Atkeson, Prioritized sweeping: Reinforcement learning with less data and less time, Machine learning, vol.13, issue.1, pp.103-130, 1993.

J. O'Keefe and J. Dostrovsky, The hippocampus as a spatial map: Preliminary evidence from unit activity in the freely-moving rat, Brain research, vol.34, issue.1, pp.171-175, 1971.

H. F. Ólafsdóttir, C. Barry, A. B. Saleem, D. Hassabis, and H. J. Spiers, Hippocampal place cells construct reward related sequences through unexplored space, eLife, vol.4, p.e06063, 2015.

H. F. Ólafsdóttir, D. Bush, and C. Barry, The role of hippocampal replay in memory and planning, Current Biology, vol.28, issue.1, pp.37-50, 2018.

S. Palminteri, G. Lefebvre, E. J. Kilford, and S. J. Blakemore, Confirmation bias in human reinforcement learning: Evidence from counterfactual feedback processing, PLoS computational biology, vol.13, issue.8, p.1005684, 2017.

A. E. Papale, M. C. Zielinski, L. M. Frank, S. P. Jadhav, and A. D. Redish, Interplay between Hippocampal Sharp-Wave-Ripple Events and Vicarious Trial and Error Behaviors in Decision Making, Neuron, vol.92, issue.5, pp.1-8, 2016.

S. A. Park, D. S. Miller, H. Nili, C. Ranganath, and E. D. Boorman, Map making: Constructing, combining, and navigating abstract cognitive maps, bioRxiv, p.810051, 2019.

A. Pasupathy and E. K. Miller, Different time courses of learning-related activity in the prefrontal cortex and striatum, Nature, vol.433, issue.7028, p.873, 2005.

J. Peng and R. J. Williams, Efficient learning and planning within the Dyna framework, Adaptive Behavior, vol.1, issue.4, pp.437-454, 1993.

A. Peyrache, M. Khamassi, K. Benchenane, S. I. Wiener, and F. P. Battaglia, Replay of rule-learning related neural patterns in the prefrontal cortex during sleep, Nature Neuroscience, vol.12, issue.7, pp.919-926, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00551868

G. Pezzulo, F. Rigoli, and F. Chersi, The mixed instrumental controller: using value of information to combine habitual choice and mental simulation, Frontiers in psychology, vol.4, 2013.

G. Pezzulo, M. van der Meer, C. S. Lansink, and C. Pennartz, Internally generated sequences in learning and executing goal-directed behavior, Trends in Cognitive Sciences, vol.18, issue.12, pp.647-657, 2014.

G. Pezzulo, C. Kemere, and M. A. van der Meer, Internally generated hippocampal sequences as a vantage point to probe future-oriented cognition, Annals of the New York Academy of Sciences, vol.1396, issue.1, pp.144-165, 2017.

B. E. Pfeiffer and D. J. Foster, Hippocampal place-cell sequences depict future paths to remembered goals, Nature, vol.497, issue.7447, p.74, 2013.

I. Pohl, Bi-directional search, Machine intelligence, vol.6, p.10, 1971.

A. D. Redish, Vicarious trial and error, Nature Reviews Neuroscience, vol.17, issue.3, pp.147-159, 2016.

E. Renaudo, B. Girard, R. Chatila, and M. Khamassi, Design of a control architecture for habit learning in robots, Conference on Biomimetic and Biohybrid Systems, pp.249-260, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01312443

C. Rennó-Costa, A. da Silva, W. Blanco, and S. Ribeiro, Computational models of memory consolidation and long-term synaptic plasticity during sleep, Neurobiology of learning and memory, vol.160, pp.32-47, 2019.

D. K. Roumis and L. M. Frank, Hippocampal sharp-wave ripples in waking and sleeping states, Current opinion in neurobiology, vol.35, pp.6-12, 2015.

V. Saravanan, D. Arabali, A. Jochems, A. X. Cui, L. Gootjes-Dreesbach et al., Transition between encoding and consolidation/replay dynamics via cholinergic modulation of CAN current: a modeling study, Hippocampus, vol.25, issue.9, pp.1052-1070, 2015.

W. Schultz, P. Dayan, and P. R. Montague, A neural substrate of prediction and reward, Science, vol.275, pp.1593-1599, 1997.

K. L. Stachenfeld, M. M. Botvinick, and S. J. Gershman, The hippocampus as a predictive map, Nature neuroscience, vol.20, issue.11, p.1643, 2017.

R. S. Sutton, Integrated architectures for learning, planning, and reacting based on approximating dynamic programming, Proceedings of the seventh international conference on machine learning, pp.216-224, 1990.

R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, 1998.

G. Viejo, M. Khamassi, A. Brovelli, and B. Girard, Modeling choice and reaction time during arbitrary visuomotor learning through the coordination of adaptive working memory and reinforcement learning, Frontiers in Behavioral Neuroscience, vol.9, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01215419

A. M. Wikenheiser and G. Schoenbaum, Over the river, through the woods: cognitive maps in the hippocampus and orbitofrontal cortex, Nature Reviews Neuroscience, vol.17, issue.8, pp.513-523, 2016.

M. A. Wilson and B. L. McNaughton, Reactivation of hippocampal ensemble memories during sleep, Science, vol.265, issue.5172, pp.676-679, 1994.

J. Zhou, M. Montesinos-Cartagena, A. M. Wikenheiser, M. P. Gardner, Y. Niv et al., Complementary task structure representations in hippocampus and orbitofrontal cortex during an odor sequence task, Current Biology, vol.29, issue.20, pp.3402-3409, 2019.