, 2005 and Hayden et al., 2009). The internal models of the animal's environment necessary for this mental simulation can be acquired without reinforcement (Tolman, 1948 and Fiser and Aslin, 2001). In particular, the ability to incorporate actual and hypothetical outcomes expected from chosen and unchosen actions simultaneously can facilitate the process of finding optimal strategies during social interactions (Camerer, 2003, Gallagher and Frith, 2003, Lee, 2008 and Behrens et al., 2009), since the observed behaviors of other decision makers can provide information about the hypothetical outcomes from multiple actions. However, learning from both real and hypothetical outcomes
is not trivial, because these two different types of information need to be linked to the correct actions. For example, incorrectly attributing the hypothetical outcomes from unchosen actions to the chosen action would interfere with adaptive behaviors (Walton et al., 2010). Although previous studies have identified neural signals related to hypothetical outcomes in multiple brain areas (Camille et al., 2004, Coricelli et al., 2005, Lohrenz et al., 2007, Chandrasekhar et al., 2008, Fujiwara et al., 2009 and Hayden et al., 2009), they have not revealed signals encoding hypothetical outcomes associated with specific actions. Therefore, the neural substrates necessary for learning from hypothetical outcomes remain unknown. In the present study, we tested whether the information about the actual and hypothetical outcomes
from chosen and unchosen actions is properly integrated in the primate prefrontal cortex. In particular, the dorsolateral prefrontal cortex (DLPFC) is integral to appropriately binding sensory inputs across multiple modalities (Prabhakaran et al., 2000), including the contextual information essential for episodic memory (Baddeley, 2000 and Mitchell and Johnson, 2009). The DLPFC has also been implicated in processing hypothetical outcomes (Coricelli et al., 2005 and Fujiwara et al., 2009) and in model-based reinforcement learning (Gläscher et al., 2010). Moreover, DLPFC neurons often change their activity according to the outcomes expected or obtained from specific actions (Watanabe, 1996, Leon and Shadlen, 1999, Matsumoto et al., 2003, Barraclough et al., 2004 and Seo and Lee, 2009). Therefore, we hypothesized that individual neurons in the DLPFC might encode both actual and hypothetical outcomes resulting from the same actions, thereby providing a substrate for learning the values of both chosen and unchosen actions. The orbitofrontal cortex (OFC) might also be crucial for behavioral adjustment guided by hypothetical outcomes (Camille et al., 2004 and Coricelli et al., 2005).
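To make the credit-assignment problem described above concrete, the following is a minimal sketch of how actual and hypothetical outcomes could drive separate value updates for chosen and unchosen actions. It is an illustrative fictive-learning variant of standard reinforcement learning, not the model used in this study; the function, variable names, and learning rates are all assumptions introduced only for exposition.

```python
# Minimal sketch (not the authors' model): value learning from actual and
# hypothetical outcomes. Correct credit assignment requires that the actual
# outcome update the chosen action's value, while hypothetical outcomes
# update the values of the unchosen actions.

def update_values(values, chosen, outcomes, alpha_actual=0.3, alpha_hypo=0.1):
    """Update action values after one trial.

    values   -- dict mapping each action to its current value estimate
    chosen   -- the action actually taken on this trial
    outcomes -- dict mapping each action to its outcome (actual for the
                chosen action, hypothetical for the unchosen ones)
    """
    for action, value in values.items():
        if action == chosen:
            # Chosen action: learn from the actual outcome.
            values[action] = value + alpha_actual * (outcomes[action] - value)
        else:
            # Unchosen actions: learn from their hypothetical outcomes.
            # Crediting these outcomes to the chosen action instead would
            # corrupt the value estimates (cf. Walton et al., 2010).
            values[action] = value + alpha_hypo * (outcomes[action] - value)
    return values

# Example trial with three possible actions:
values = {"left": 0.0, "center": 0.0, "right": 0.0}
values = update_values(values, chosen="left",
                       outcomes={"left": 1.0, "center": 0.0, "right": 2.0})
```

In this sketch, separate learning rates for actual and hypothetical outcomes are an assumption; the essential point is only that each outcome is linked to its own action rather than to the chosen one.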