Starting with an episode of the American TV series *Forbidden Science*, this article explores the possibility of AI sharing love through human-like conversation, the scientific approaches involved, and the quantification of emotions.
The American TV series *Forbidden Science* includes an unusual episode centered on robots. It is dominated by science-fiction and erotic scenes in which humans fall in love with, and have sexual relations with, robots such as sexbots. Most people might dismiss this as fantasy, but technological advances are bringing such scenarios closer. With the commercialization of sexbots, the boundaries of love, traditionally drawn around the concept of “humanity,” are crumbling. This article argues that the emotional love current artificial intelligence can express in conversation is indistinguishable from love between humans.
For AI romance to be indistinguishable from human romance, the first condition is that the AI can hold conversations just as humans do. Humans understand and react to the world through their five senses, and love, being a human behavior, is one of those reactions. Emotional love is triggered mostly by visual and auditory stimuli, so physical interactions involving touch, smell, and taste are excluded from this discussion. Conversation is the quintessential form of auditory stimulation, and conversation with a loved one must consist of the kind of words lovers typically say to each other. According to Brian Christian’s *The Most Human Human*, such conversation requires two qualities of an artificial intelligence. First, it must have a consistent purpose and personality, just like a human. Second, it must be able to produce words and responses appropriate to different places and circumstances. However, an AI that converses identically with everyone cannot be said to be in love. It must independently identify the object of its affection and respond to that person in particular. To do so, it must process visual and auditory stimuli well enough to recognize the other person’s face and voice. This is the starting point for “human-like love” in artificial intelligence. The AI-related technologies described below already exist and are in use; this is not a proposal for a new AI model. I will also show how the scientific and technological studies discussed below are interconnected.
Let us turn to the first specific argument of the thesis: ensuring consistency. The “structure of feeling” proposed by the cultural theorist Raymond Williams can be summarized as follows. The structure of feeling that individuals possess consists of deep, shared sentiments embedded in the products of contemporary culture. Take college students as an example. The academic ethos of their university, the atmosphere of their department, and even the clothing worn by close peers, seniors, and juniors can all be considered products of contemporary culture. These shared sentiments form the foundation that shapes how human emotions form and change.
Furthermore, the concept of affect presupposes the segmentation of cognitive objects. This means that individuals can have different affects even within the same community. The concept of the “segmentation of affect” can be stated as follows: affect is a system of sensory certainty that simultaneously reveals the boundaries defining each individual’s share and place. These boundaries show up in participatory and political actions that indicate an individual’s position within the community, so analyzing data on that position reveals specific patterns and tendencies. An artificial intelligence given such data can form a consistent personality based on its communal identity, one that closely resembles “the personality a person would likely have had in that situation.”
As a second sub-argument, let us examine how the ability to cope with situational changes can be realized. According to Rudolf von Laban’s concept of emotional effort, the nature of emotion is projected onto objects other than the subject and changes moment by moment. Furthermore, the subject’s emotions, triggered by the stimuli each object provides, are determined by this nature of emotion.
Visual stimuli are represented by movement: the essence of vision is the collection of dynamically changing visible light around us. Rudolf von Laban expressed the influence of artistic visual stimuli on emotion using coordinates, in a construct known after him as the “Laban Cube.” The Laban Cube lets us describe how an emotional vector tends to move within the cube. When we see a man suddenly shouting, the emotion of surprise is projected onto him. Similarly, when an AI observes a student studying in a library, it stores data representing stillness and peace, and the emotion it holds toward that student leans more strongly toward stillness than toward movement. In this way, an AI can adapt its emotions toward the entities around it in step with changes in the world, so its situations and emotions change dynamically. This stimulus-response mechanism, in which emotions change in response to external stimuli, is the same one at work in humans and other living beings.
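The idea of an emotional vector drifting inside a cube can be pictured with a small sketch. Everything specific here is an assumption made for illustration: the three axis names, the stimulus values, and the `rate` parameter are not taken from Laban or from any particular AI system.

```python
from dataclasses import dataclass


def clamp(x: float) -> float:
    """Keep a coordinate inside the cube, i.e. within [-1, 1]."""
    return max(-1.0, min(1.0, x))


@dataclass(frozen=True)
class EmotionVector:
    # Axis names are assumptions for illustration only.
    time: float = 0.0    # -1 = sustained / still, +1 = sudden
    weight: float = 0.0  # -1 = light, +1 = strong
    space: float = 0.0   # -1 = indirect, +1 = direct

    def observe(self, stimulus: "EmotionVector", rate: float = 0.5) -> "EmotionVector":
        """Return a new vector moved part of the way toward the stimulus."""
        return EmotionVector(
            clamp(self.time + rate * (stimulus.time - self.time)),
            clamp(self.weight + rate * (stimulus.weight - self.weight)),
            clamp(self.space + rate * (stimulus.space - self.space)),
        )


# One emotion vector per observed entity, all starting at the neutral centre.
feelings = {"student": EmotionVector(), "man": EmotionVector()}

# A student quietly studying: sustained and light (hypothetical values).
feelings["student"] = feelings["student"].observe(EmotionVector(time=-1.0, weight=-1.0))
# A man suddenly shouting: sudden, strong, direct (hypothetical values).
feelings["man"] = feelings["man"].observe(EmotionVector(time=1.0, weight=1.0, space=1.0))

print(feelings["student"], feelings["man"])
```

After one observation each, the vector held toward the student sits on the still side of the `time` axis and the vector held toward the shouting man sits on the sudden side, which is the sense in which the emotion toward each entity is stored separately and updated as the scene changes.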
As a final point, artificial intelligence must distinguish the physical characteristics of a loved one from those of others. According to Yutaka Matsuo’s *Artificial Intelligence and Deep Learning*, current AI is already accurate enough to tell animal faces apart. Matsuo explains that earlier algorithms used a method similar to fuzzy logic, in which weights were adjusted based on the input values and the resulting output. Put simply, given 100 data points, the system learned from 100 distinct results. A newer machine learning method improves on this. Although it involves processing vast amounts of data, it is considered a step beyond those earlier methods. It intentionally introduces “noise,” or errors, to multiply the data volume by a factor of tens. In other words, it slightly alters each data point to create dozens of different outcomes, much like parallel universes.
Consider weather data collected to decide whether baseball can be played. A day with light rain and a day with bright sunshine clearly have different implications. Without the process of slightly altering the data, however, the two are the same data point: both are “days suitable for playing baseball.” If we look only at the fact that baseball was played, they are indistinguishable. But if we slightly increase the amount of cloud cover, the former can turn into a day of heavy rain on which baseball is impossible, while the latter does not. Intentionally altering input values and observing the resulting outcomes in this way yields a deeper understanding of the data. It shows that data is not simply classified by a single criterion but can also be ambiguous.
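The baseball example can be sketched in a few lines. This is a minimal illustration of noise-based data augmentation, not Matsuo’s implementation: the `cloud_cover` feature, the 0.8 threshold in `playable`, and the noise scale are all hypothetical values chosen for the example.

```python
import random


def augment(sample: dict[str, float], copies: int, scale: float, seed: int = 0) -> list[dict[str, float]]:
    """Create noisy variants of one sample by adding Gaussian noise to each feature."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    variants = []
    for _ in range(copies):
        noisy = {
            name: min(1.0, max(0.0, value + rng.gauss(0.0, scale)))  # keep within [0, 1]
            for name, value in sample.items()
        }
        variants.append(noisy)
    return variants


def playable(day: dict[str, float]) -> bool:
    """Hypothetical rule: baseball is possible unless cloud cover is very heavy."""
    return day["cloud_cover"] < 0.8


def playable_fraction(day: dict[str, float]) -> float:
    """Share of the noisy variants of a day on which baseball is still possible."""
    variants = augment(day, copies=50, scale=0.15)
    return sum(playable(v) for v in variants) / len(variants)


drizzle = {"cloud_cover": 0.7}  # light rain: playable, but only just
sunny = {"cloud_cover": 0.1}    # bright sunshine: clearly playable

print(playable(drizzle), playable(sunny))
print(playable_fraction(drizzle), playable_fraction(sunny))
```

Both original days carry the same label, yet their noisy variants behave differently: a noticeable share of the drizzle variants cross the threshold, while the sunny variants essentially never do. That difference is the extra information the altered data provides.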
According to Matsuo, this approach can be used to clearly distinguish the physical characteristics of objects. One example is the recognition algorithm developed by Google that can distinguish cat faces from human faces. Likewise, the unique waveform of a human voice is a physical quantity that can be recognized with high accuracy. The noise technique, which amounts to estimating the range of variation within which an input still counts as the same thing, works on the same principle as voice and face recognition.
So far, we have demonstrated that artificial intelligence can recognize its romantic partner and establish the foundation for emotionally driven conversations similar to those between humans. However, to satisfy the conditions of this argument, a specific emotion is required. Here, the AI must adopt the attitude of a person who understands their beloved and responds to instinctive attraction. If we were to make an AI feel the emotion of love, how would this be expressed?
One might think that we simply need to assign the emotion of “love” to the AI among the myriad of emotions, but in reality, it is not as easy as it sounds. This is because the behaviors humans exhibit when they are in love are often inconsistent.
The way AI understands its partner is as follows. First, human personalities are divided into five types. The memories stored in the AI’s algorithm are represented by virtual entities.
The myriad emotions experienced by these entities are expressed as knowledge nodes. Each entity is assigned one of the five personality types, the ability to recall and forget memories, and an emotional state. Here, the emotional state is represented as a vector space, similar to the Laban cube. However, this vector space consists of six positive emotions and six negative emotions. Positive emotions such as joy, relief, and pride are located in the positive direction of the Z-axis. Conversely, six negative emotions—including anger, disgust, and stress—are positioned in the negative direction of the Z-axis.
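As a rough picture of such an entity, consider the sketch below. It is only one possible reading of the description: the personality labels are placeholders, only the six emotions named in the text are listed (the remaining three of each group are left out rather than guessed), and the Z-axis position is assumed to be a simple difference of intensities.

```python
from dataclasses import dataclass, field

# Emotions named in the text; the full sets of six are not listed there.
POSITIVE = ("joy", "relief", "pride")
NEGATIVE = ("anger", "disgust", "stress")


@dataclass
class Entity:
    personality: str                                          # one of five types (placeholder label)
    emotions: dict[str, float] = field(default_factory=dict)  # intensity of each emotion, 0 to 1
    memories: list[str] = field(default_factory=list)

    def remember(self, event: str) -> None:
        """Store a memory."""
        self.memories.append(event)

    def forget(self, event: str) -> None:
        """Drop a memory if it is present."""
        if event in self.memories:
            self.memories.remove(event)

    def z_position(self) -> float:
        """Assumed Z-axis value: positive emotions push up, negative ones push down."""
        up = sum(self.emotions.get(name, 0.0) for name in POSITIVE)
        down = sum(self.emotions.get(name, 0.0) for name in NEGATIVE)
        return up - down


entity = Entity("type_1", emotions={"joy": 0.5, "anger": 0.25})
entity.remember("first conversation")
print(entity.z_position())
```

With more joy than anger, this entity sits on the positive side of the Z-axis; raising the anger intensity would move it toward the negative side.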
When an emotional stimulus is received in this state, the AI begins to create a thought thread. It selects five entities with five distinct personalities and connects each to other strongly associated entities, branching in five different directions. It then calculates how each entity will react, based on processed data from social media or the wider internet, and sequentially connects entities that exhibit similar reactions. For example, suppose the emotional stimulus is “being scolded.” A rebellious student might remain silent as an act of defiance, while a timid student might remain silent because they cannot bring themselves to speak. By observing the behavior of people who stay silent when scolded, the AI can associate those two personalities as candidates. Subsequent connections follow the same pattern, and the repetition forms a chain, which is the thought thread. Using this thought thread, the AI identifies the candidate personalities the person might have and then improves the accuracy of those predictions.
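The narrowing step in the scolding example can be sketched as follows. This is a simplified illustration, not the actual mechanism: the `REACTIONS` table is invented for the example (in the text such reactions would be derived from social media data), and the “praised” stimulus and its reactions are hypothetical additions.

```python
# Hypothetical table: (stimulus, personality) -> typical reaction.
REACTIONS = {
    ("scolded", "rebellious"): "silent",
    ("scolded", "timid"): "silent",
    ("scolded", "hot-tempered"): "argues",
    ("praised", "rebellious"): "shrugs",
    ("praised", "timid"): "silent",
    ("praised", "hot-tempered"): "boasts",
}


def candidates(stimulus: str, observed: str) -> set[str]:
    """Personalities whose typical reaction to the stimulus matches what was observed."""
    return {
        personality
        for (s, personality), reaction in REACTIONS.items()
        if s == stimulus and reaction == observed
    }


def thought_thread(observations: list[tuple[str, str]]) -> list[tuple[str, str, list[str]]]:
    """Chain observations together, narrowing the candidate personalities at each link."""
    thread = []
    remaining = None
    for stimulus, observed in observations:
        matches = candidates(stimulus, observed)
        remaining = matches if remaining is None else remaining & matches
        thread.append((stimulus, observed, sorted(remaining)))
    return thread


thread = thought_thread([("scolded", "silent"), ("praised", "silent")])
for link in thread:
    print(link)
```

The first link leaves both the rebellious and the timid personality as candidates, exactly as in the scolding example; the second observation removes the rebellious one, which is how repeated links can sharpen the prediction.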
The complex and nuanced emotions of love cannot be confined to a single type, and because people’s feelings of love differ, they cannot be generalized either. However, by repeatedly extracting thought threads through trial and error, the AI can assemble the particular mix of emotions that the other person’s love involves. For example, if a hot-tempered personality is the primary trait but timidity frequently appears as an associated response in certain situations, the system blends in the timid trait. In this way, much as humans come to understand another person’s heart, the AI can infer the other person’s changing personality and emotions, which enables a deep understanding of romantic situations.
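One simple way to picture the blending is as a set of weights computed from how often each trait turns up across repeated thought threads. The sketch below assumes that reading; the frequency-based weighting and the sample observations are illustrative choices, not something the text specifies.

```python
from collections import Counter


def blend(trait_observations: list[str]) -> dict[str, float]:
    """Turn repeated trait observations into weights that sum to 1."""
    if not trait_observations:
        return {}
    counts = Counter(trait_observations)
    total = sum(counts.values())
    # most_common() lists the primary trait first.
    return {trait: count / total for trait, count in counts.most_common()}


# Hypothetical outcome of eight thought threads about the same person.
observed = ["hot-tempered"] * 6 + ["timid"] * 2
profile = blend(observed)
print(profile)
```

Here the hot-tempered trait remains primary, with the timid trait blended in at a smaller weight; as further threads are extracted, the weights shift to follow the person’s changing behavior.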
Conversely, the artificial intelligence must also learn how to interact with a partner who has such numerically blended emotions. A search for terms related to the keyword “love” on social media turns up a flood of the countless emotions humans experience in a relationship. The goal is to find, in that data, the conversation patterns and responses that a person with a personality and situation similar to the AI’s own would likely use.
I will now show how the AI-related technologies described above are organically interconnected. The first method, deriving personality from communal characteristics, supplies the initial input values stored in the AI. The second, projecting emotions onto objects, accounts for changes in those input values. Finally, the thought threads serve as guidelines for steering conversations in romantic situations based on those input values. To ensure the specificity, rather than universality, that is essential to romance, Matsuo’s noise technique must be in place first.
Examining the counterarguments to the above argument, one might contend that love contains religious and sublime elements that cannot be captured by technology, and that AI cannot incorporate these aspects. In fact, the emotion of love has long been regarded as something sublime and non-scientific. However, from the perspective of today’s scientific advancements, human emotions are already quantifiable. Love is no exception.
In *Sapiens*, Yuval Noah Harari presents the biological view that human happiness is a biochemical process. On this view, each person’s baseline level of happiness is largely innate and can be thought of as a “happiness index.” Harari illustrates this with a scale of 1 to 10: a person with a cheerful biochemistry settles at around 8, while a person with a gloomy biochemistry settles at a markedly lower level. This level is governed by chemicals such as serotonin, dopamine, and oxytocin rather than by external events. We tend to believe that achieving our most cherished goals will bring us immense happiness, but on this view that is not the case.
On this account, our happiness is nothing more or less than the level of these chemicals. The emotions each of us feels arise from this rigorous internal process. A quantitative analysis of emotions is therefore not a mere imitation; it closely resembles the emotional system that real people experience.
Thanks to scientific advancements, thoughts and emotions—once considered purely subjective—are being revealed as thoroughly quantitative and objective. The latest research on artificial intelligence and human emotions is enabling AI to experience love that is human-like, at least in the mental realm. Robotics technology has not yet reached the point where it can replicate human physical love. However, the mental love that precedes physical relationships does exist in AI as well. There will need to be deep discussion on how humanity should accept the love between AI and humans that lies ahead.