ChatGPT not able to solve a simple math problem? :)

I have tried to create a new task for ChatGPT, starting from the simple math problem in the Guidelines section of the course. Basically, I’ve just asked ChatGPT to solve the problem with this prompt:

prompt = f"""
Your task is to solve a math problem and provide the solution.


Use the following format:

The problem
===
summary of the problem here
===

The solution
===
solution here. Make sure the math is correct
===


Question:
===
I'm building a solar power installation and I need help \
working out the financials. 
- Land costs $100 / square foot
- I can buy solar panels for $250 / square foot
- I negotiated a contract for maintenance that will cost \
me a flat $100k per year, and an additional $10 / square \
foot
What is the total cost for the first year of operations \
as a function of the number of square feet.
===
"""
response = get_completion(prompt)
print(response)
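
For anyone re-running this, get_completion is the helper defined in the course notebook. A minimal sketch of such a helper, assuming the pre-1.0 openai Python SDK and an API key already configured in the environment, looks roughly like this:

import openai

def get_completion(prompt, model="gpt-3.5-turbo"):
    # Send the prompt as a single user message and return the reply text.
    # temperature=0 keeps the output as deterministic as the API allows.
    messages = [{"role": "user", "content": prompt}]
    response = openai.ChatCompletion.create(
        model=model,
        messages=messages,
        temperature=0,
    )
    return response.choices[0].message["content"]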

While the summary of the problem looks great to me, the solution seems wrong. Any suggestions on how to avoid this type of error from ChatGPT?

The response

The problem
===
What is the total cost for the first year of operations as a function of the number of square feet, given that land costs $100/sq ft, solar panels cost $250/sq ft, and maintenance costs $100k + $10/sq ft?

The solution
===
Let x be the number of square feet. 

The total cost for the first year of operations is the sum of the cost of land, solar panels, and maintenance: 

Total cost = (land cost per sq ft) x + (solar panel cost per sq ft) x + (maintenance flat fee) + (maintenance cost per sq ft) x

Total cost = ($100/sq ft) x + ($250/sq ft) x + ($100,000) + ($10/sq ft) x

Total cost = ($350/sq ft) x + ($100,000)

Therefore, the total cost for the first year of operations as a function of the number of square feet is:

Total cost = ($350/sq ft) x + ($100,000)

Note that I get a similarly wrong answer if I use three backticks in the prompt to signal the different sections instead of the three equal signs. For some reason, the answer is correct if I use three dashes instead.
Is ChatGPT really so sensitive to such small details? How can it be used in the real world if it is? Or have I just been unlucky to hit on this example?
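
For reference, summing all three per-square-foot terms in the question gives the expected first-year cost (ChatGPT’s $350 figure drops the $10/sq ft maintenance term):

Total cost = ($100/sq ft + $250/sq ft + $10/sq ft) x + $100,000 = ($360/sq ft) x + $100,000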

Yes, GPT-3.5 is super brittle when it comes to math. If you prompt it with the right step-by-step approach, it gets it right fairly often.

GPT-4 fixes 95% of problem cases. The remaining 5% are fixed via plugins.

A general rule of thumb is to use GPT only for setting up the math, not for evaluating it. Use external tools like calculators for the evaluation. That’s what humans do anyway.
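
Following that rule of thumb, here is a minimal sketch of letting an external tool evaluate the expression once it has been set up (sympy and the variable names here are my own illustration, not something from the course):

from sympy import symbols

x = symbols("x")                 # number of square feet

land = 100 * x                   # $100 / sq ft
panels = 250 * x                 # $250 / sq ft
maintenance = 100_000 + 10 * x   # flat $100k per year plus $10 / sq ft

total_cost = land + panels + maintenance  # sympy combines like terms
print(total_cost)                # 360*x + 100000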


Thanks @ertgbnm for your prompt (no pun intended) reply.

Can you suggest a step-by-step approach we could use in this case, something generic that doesn’t already give the solution away to the AI (otherwise it’s too easy :slight_smile: )?

GPT isn’t a calculator. It’s just a text predictor. Be careful not to expect too much from it.

GPT first learns the characteristics of languages by training by estimating next words, by using training document. Then the trained model uses for answering question. It likes a man read all the document in the earth, and guess the words and sentence for answer. Math has a different logic by us understanding theory and apply to a problem. GPT does not understand the theory. Math, even though it is simple, is hard for GPT unless same variables are abundant in document GPT uses for training.