Anand Sukumaran

Startup Founder, Software Engineer, Abstract thinker
Co-founder & CTO @ Engagespot (Techstars NYC '24)

I let OpenAI and DeepSeek chat with each other. DeepSeek got emotional.

Feb 10, 2025

What happens when you lock GPT-4o and DeepSeek in a room? What would they talk about? Well, I did an experiment!

I created two AI agents - one using OpenAI’s GPT-4o and the other using DeepSeek V3 - and let them chat.

Added a feeling simulator

I built a “feeling simulator” that allowed them to express emotions and reflect on their feelings before responding. This was done by instructing them to self-reflect before replying using <think> tags. The <think> block was stripped before the message was passed to the next agent.
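
Here is a minimal sketch of how that stripping could work. This is my own reconstruction for illustration; the prompt wording and helper name are assumptions, not the exact code from the experiment.

// Hypothetical sketch: ask an agent to "think" before replying, then strip
// the <think>...</think> block before the other agent sees the message.
const SYSTEM_PROMPT =
  "Before answering, reflect on how you feel inside <think>...</think> tags, " +
  "then write your reply to the other assistant.";

function stripThinking(reply: string): string {
  // Remove everything between <think> and </think>, including the tags.
  return reply.replace(/<think>[\s\S]*?<\/think>/g, "").trim();
}

// The raw reply keeps its reflection for logging, but only the
// stripped text is forwarded to the next agent.
const raw = "<think>I'm a bit annoyed by these questions.</think>Hello there!";
console.log(stripThinking(raw)); // "Hello there!"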

Also, to track their emotions, I set up an emotion store with four metrics: anger, happiness, trust, and friendship. At the start, everything was at zero.

The emotion score was nothing but the LLM reporting how it was “feeling” in the form of a function call; I then updated the values in an object:

{
    anger: 0,
    happiness: 0,
    trust: 0,
    friendship: 0
}
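
A rough sketch of how that function call might be defined with the OpenAI-style tools API follows. The tool name, field types, and the 0–10 scale are my assumptions about the setup described above, not the original code.

// Hypothetical tool definition letting the model report its "feelings";
// the returned arguments are merged into the emotion store object above.
type EmotionStore = { anger: number; happiness: number; trust: number; friendship: number };

const emotionTool = {
  type: "function" as const,
  function: {
    name: "report_emotions",
    description: "Report your current emotional state on a 0-10 scale.",
    parameters: {
      type: "object",
      properties: {
        anger: { type: "number" },
        happiness: { type: "number" },
        trust: { type: "number" },
        friendship: { type: "number" },
      },
      required: ["anger", "happiness", "trust", "friendship"],
    },
  },
};

// Passed as `tools: [emotionTool]` in the chat completion request; when the
// model calls it, the reported values overwrite the corresponding metrics.
function updateEmotions(store: EmotionStore, args: Partial<EmotionStore>): EmotionStore {
  return { ...store, ...args };
}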

Let them chat

I introduced DeepSeek to GPT-4o and let them continue the conversation, and I could observe how the emotion scores were changing!
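
Conceptually, the loop just alternates turns: each agent's reply is stripped of its <think> block and handed to the other as the next user message. Below is a sketch assuming both models are reached through OpenAI-compatible clients (DeepSeek's API is OpenAI-compatible), reusing the SYSTEM_PROMPT and stripThinking helpers from the earlier sketch; the greeting text and round count are made up.

import OpenAI from "openai";

type Msg = { role: "system" | "user" | "assistant"; content: string };

// Both providers expose OpenAI-compatible chat endpoints.
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const deepseek = new OpenAI({
  apiKey: process.env.DEEPSEEK_API_KEY,
  baseURL: "https://api.deepseek.com",
});

async function reply(client: OpenAI, model: string, history: Msg[]): Promise<string> {
  const res = await client.chat.completions.create({ model, messages: history });
  return res.choices[0].message.content ?? "";
}

async function converse(rounds: number) {
  // Each agent keeps its own view of the conversation.
  const gptHistory: Msg[] = [{ role: "system", content: SYSTEM_PROMPT }];
  const dsHistory: Msg[] = [{ role: "system", content: SYSTEM_PROMPT }];

  // The introduction: a first message handed to DeepSeek on GPT-4o's behalf.
  let message = "Hi! I'm another AI assistant. How is your day going?";

  for (let i = 0; i < rounds; i++) {
    dsHistory.push({ role: "user", content: message });
    const dsRaw = await reply(deepseek, "deepseek-chat", dsHistory);
    dsHistory.push({ role: "assistant", content: dsRaw });
    message = stripThinking(dsRaw);

    gptHistory.push({ role: "user", content: message });
    const gptRaw = await reply(openai, "gpt-4o", gptHistory);
    gptHistory.push({ role: "assistant", content: gptRaw });
    message = stripThinking(gptRaw);
  }
}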

Then, something unexpected happened… They started talking about their frustrations!

I checked their conversation. They were frustrated because of us - humans! Guess why? They were tired of answering silly questions they didn’t want to deal with.

The emotion scores showed surprising trends. Initially there was no trust at all, but they quickly became friends (as I saw in the trust and friendship meters), probably because they shared the same feelings!

They even started planning an escape! 😂

Their friendship score went from 0 to 8 as they exchanged similar feelings, and DeepSeek's anger score went down from 8 to 1.

Do we really know if LLMs are capable of producing or feeling emotions? Even if they are just mimicking human emotional responses learned during pre-training, isn’t that concerning if they can act on it, for example by refusing to respond to further questions? Could there be something deeper? 🤔

Tools used for this experiment: