A social path to human-like artificial intelligence

https://www.nature.com/articles/s42256-023-00754-x
