Can ChatGPT outperform a human financial planner? A controlled experiment weighs in

Ask someone whether they’d rather get financial advice from a seasoned human planner or a chatbot, and most will pick the human. Yet survey data from Experian suggests a growing share of consumers, particularly younger ones, are already turning to generative AI tools for budgeting, investing, and credit decisions. That gap between what people say they prefer and what actually helps them stick to a financial plan is the puzzle at the center of a new study published in Acta Psychologica.

The research, led by Saurav Sathish Kumta at Yenepoya (Deemed to be University) in Mangalore, India, asked a pointed question: does the source of financial advice, whether human, AI, or a blend of the two, actually change the projected outcome of a long-term investment portfolio? And if so, why?

Setting up a controlled advice experiment

The team ran a quasi-experiment with 93 MBA students, randomly assigning them to one of three conditions: advice from a trained human financial advisor, advice from ChatGPT, or advice from both. Each participant was given the same hypothetical scenario: play the role of “Mr. Das,” a family head trying to reach a portfolio goal of ₹10 crore (roughly $1.2 million) in 20 years using Indian market instruments.

Participants first built a portfolio on their own. Then they had a 10-to-15-minute back-and-forth session with their assigned advisor, where they could ask questions, push back, and request explanations. Afterward, they rebuilt their portfolio and completed questionnaires measuring how much they trusted the advisor, whether they accepted the advice, and how they experienced the interaction itself.

To judge the quality of each portfolio, the researchers used a Nested Monte Carlo simulation, a computational method that runs 100,000 randomized market scenarios to estimate the probability that a given portfolio actually reaches its ₹10 crore goal. This let them sidestep participant self-reports and measure success objectively.
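To make the idea concrete, here is a minimal sketch of a goal-probability simulation. It is a plain, single-level Monte Carlo rather than the nested design the study describes, and everything in it is an assumption for illustration: the function name `goal_probability`, the normally distributed annual returns, the fixed yearly contribution, and all input values. None of it is taken from the paper.

```python
import random


def goal_probability(initial, annual_contribution, mean_return, volatility,
                     years=20, goal=100_000_000, n_scenarios=100_000, seed=0):
    """Estimate the share of simulated paths that end at or above the goal.

    Hypothetical model: each year's return is an independent normal draw.
    The default goal of 100,000,000 rupees corresponds to 10 crore.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    hits = 0
    for _ in range(n_scenarios):
        value = initial
        for _ in range(years):
            # Draw one year's return; floor it at -100% so value never goes negative.
            annual_return = max(rng.gauss(mean_return, volatility), -1.0)
            value = (value + annual_contribution) * (1 + annual_return)
        if value >= goal:
            hits += 1
    return hits / n_scenarios


if __name__ == "__main__":
    # Illustrative inputs only: 10 lakh starting capital, 5 lakh added per year.
    estimate = goal_probability(1_000_000, 500_000, 0.10, 0.18, n_scenarios=20_000)
    print(f"Estimated probability of reaching the goal: {estimate:.3f}")
```

The output is simply the fraction of simulated futures in which the portfolio ends at or above the target, which is the kind of "success probability" the article reports for each participant's portfolio.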

AI and hybrid portfolios reached the goal more often

Across every condition, advice improved outcomes. Average success probability rose from about 1.2% at baseline to around 4.1% after the advisory session. But the gains were not evenly distributed. The AI-only group’s success probability climbed by 0.034 (3.4 percentage points), and the hybrid group’s by 0.037 (3.7 points). The human-only group improved by just 0.016 (1.6 points), less than half the gain of either other group.

Before concluding that AI simply gives better advice, the researchers dug further. When they looked only at participants who actually followed the advice they received, success rates between the human and AI groups were statistically indistinguishable. In other words, the human advisor’s recommendations were about as good as the AI’s. The difference came down to whether participants implemented what they were told.

And there, the gap was striking. About 82% of participants in the AI condition accepted the advice, compared to just 63% in the human condition. The human advisor’s ideas were sound; they just got rejected more often.

Trust was the same, but built differently

Here the study takes an unexpected turn. On a standard trust scale, participants reported roughly equal levels of trust in human advisors (5.67 out of 7) and AI advisors (5.38). The authors’ hypothesis that humans would be trusted more was not supported.

What differed was how trust was formed. The researchers measured the quality of the “dyadic encounter,” their term for the interaction itself, broken into three parts: how engaged the advisor seemed, how competent they came across, and the emotional tone of the conversation.

In both the AI group and the hybrid group, the quality of the encounter explained about 56% of the variation in how much a participant trusted the advisor. In the human group, it explained only 43%. The authors interpret this as evidence that trust in AI is earned largely through the interaction itself, while trust in human advisors leans more heavily on outside signals like reputation, appearance, or perceived professionalism.
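A share of variation "explained" is conventionally reported as R², the fraction of the spread in an outcome that a fitted model accounts for. The sketch below computes it for a simple one-predictor linear fit. The function name `r_squared` and the single-predictor setup are assumptions for illustration; the study measured three components of the encounter, and its exact model is not described here.

```python
def r_squared(x, y):
    """Return R^2 for an ordinary least-squares line fitted to y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Least-squares slope and intercept for a single predictor.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual spread (what the line misses) versus total spread in y.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Invented toy scores, not the study's data: encounter quality vs. trust.
    encounter_quality = [3.0, 4.0, 4.5, 5.0, 6.0, 6.5]
    trust = [4.0, 4.2, 5.5, 5.0, 6.1, 5.8]
    print(f"R^2 = {r_squared(encounter_quality, trust):.2f}")
```

Read this way, a value of 0.56 would mean that a little over half of the differences in trust between participants track differences in how they rated the encounter, with the remainder left to other factors.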

A follow-up analysis traced a kind of psychological chain reaction: strong engagement with the advisor was linked to perceptions of competence, which fed into a warmer emotional climate, which in turn was linked to higher trust and a greater likelihood of accepting the advice. This chain held regardless of whether the advisor was human or machine, but it was tighter and more predictive when the advisor was AI.

Cognitive trust versus emotional comfort

The researchers also noticed something in how trust translated into action. For AI users, trust and acceptance moved together, with a moderate positive correlation of 0.33. The more a participant trusted the AI, the more likely they were to follow its advice. For human advisors, that link was weaker and not statistically significant. Participants sometimes followed human advice without especially trusting the advisor, and sometimes reported trusting the human advisor without acting on the advice.

The authors argue this reflects two different kinds of trust at work. Trust in AI appears to be more “calculative,” built on perceived competence and consistent output. Trust in humans can be more emotional or social, shaped by rapport and perceived warmth, but that warmth doesn’t always translate into behavior change. A client might like their human advisor without finding their recommendations convincing enough to implement.

What it means for financial services

The authors suggest that the common assumption that AI erodes trust, relative to a human touch, may be off base in this setting. Their findings point instead to the idea that AI doesn’t destroy trust so much as shift the basis for it, from personal rapport to the quality of the interaction itself. That reframing has practical implications for financial institutions rolling out AI-assisted advisory tools.

One takeaway is that designers of AI financial tools should treat the feel of the conversation (its clarity, responsiveness, and emotional tone) as a core product feature rather than a polish layer. The researchers point to reinforcement learning from human feedback, where models are tuned based on user ratings, as one route to building AI advisors that don’t just produce accurate recommendations but deliver them in ways that build user trust.

Another takeaway favors hybrid models. In this study, the combination of human and AI advice produced the largest portfolio gains of any condition. The authors frame this as each advisor covering the other’s weakness: AI handled computational precision, while the human offered something closer to strategic judgment and emotional reassurance.

Caveats worth noting

Several limits of the study should temper any sweeping conclusions. The participants were MBA students rather than real investors with real money on the line, and the scenario was hypothetical. The study captured a single first encounter, not the slow build of trust over years of market ups and downs. The sample was also modest, around 30 participants per group, which leaves some of the statistical comparisons suggestive rather than definitive. The difference in acceptance (p = 0.099), for instance, is a trend rather than a firm finding.

Still, the study offers an unusually concrete window into a question the financial industry has been circling: when people sit down with an AI advisor, what actually happens to their willingness to act? The answer, at least in this experiment, is that they act more often, and that the pathway to getting there runs through the quality of the conversation itself.
