OK so this is a tricky one. I’m working on a generative app idea that will use input parameters representing spatial data (maps specifically). The data will be relatively simple, such as a 2D array that could be exported as a CSV (see image below).
I am attempting to refine a chat so that the model can comprehend the spatial data. It’s tricky but I have had limited success with a bit of trial & error.
I provided a map legend as part of the inputs, and then pasted the data in as a CSV array. The legend looked like this:
W = wheat field
R = road
F = forest
Number = building area
ChatGPT is able to roughly comprehend the data in the map. Here’s the type of answer ChatGPT provides, with the sorts of errors that occur:
There are 3 buildings in the location, joined by roads and rivers
I know that ChatGPT, as an LLM, is not optimised for comprehending this kind of information. I wanted to know if there is a better way to input spatial data to get better outputs. Would really appreciate any advice!!
Thanks,
Andrew.
A 2D map of a small village - I added colour to help visualise the spaces
Thanks so much for the response! I do understand that I’m really pushing the boundaries here given that an LLM isn’t designed for this use case. However, I also feel that it may be possible nonetheless.
Key to my use case is that detailed spatial understanding (for example, the model knowing the area of each space) is not essential. Only the broad spatial details need to be parsed, such as the number and identity of nodes and the connectivity between them.
Based on that, I suspect that the spatial data could be represented as language and then structured, e.g. as JSON. The JSON could then capture key data such as node information, connectivity to other nodes, etc. I think that this type of data is on a similar level of complexity to, for example, this message that I’m typing right now. ChatGPT seems to be OK at comprehending a body of text, so I’m hoping it may work in my use case. Who knows?
Turns out it just might work. After some back & forth exploring data formats, it does seem like JSON may in fact do the trick. I suspect for a larger space things could start to break down a little though.
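For anyone wanting to try the same thing, here is a rough Python sketch of one way the CSV grid could be turned into node-and-connectivity JSON before it goes into the prompt. It is not the exact format I settled on. The toy grid, the function names (`parse_grid`, `grid_to_graph`) and the JSON keys are all made up for illustration, and it assumes a rectangular grid, that cells sharing a number belong to the same building, and that two buildings count as connected when the same run of road cells touches both.

```python
import csv
import io
import json
from collections import deque

# Toy grid for illustration only (NOT the map from the original post).
# Legend: W = wheat field, R = road, F = forest, number = building area.
EXAMPLE_CSV = """\
W,W,F,F,F
1,R,R,R,2
1,R,W,W,2
W,R,W,F,F
W,3,3,F,F
"""


def parse_grid(csv_text):
    """Read CSV text into a list of rows of stripped cell strings."""
    reader = csv.reader(io.StringIO(csv_text))
    return [[cell.strip() for cell in row] for row in reader if row]


def neighbours(r, c, rows, cols):
    """Yield the up/down/left/right neighbours that lie inside the grid."""
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols:
            yield nr, nc


def grid_to_graph(grid):
    """Summarise the grid as building nodes plus road-based edges."""
    rows, cols = len(grid), len(grid[0])

    # Count how many cells each numbered building occupies.
    areas = {}
    for row in grid:
        for cell in row:
            if cell.isdigit():
                areas[cell] = areas.get(cell, 0) + 1

    seen, edges = set(), set()
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != "R" or (r, c) in seen:
                continue
            # Breadth-first search over one connected run of road cells,
            # noting every building that the run touches.
            touched = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in neighbours(cr, cc, rows, cols):
                    cell = grid[nr][nc]
                    if cell.isdigit():
                        touched.add(cell)
                    elif cell == "R" and (nr, nc) not in seen:
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            # Every pair of buildings on the same road run gets an edge.
            ordered = sorted(touched, key=int)
            for i, a in enumerate(ordered):
                for b in ordered[i + 1:]:
                    edges.add((a, b))

    return {
        "nodes": [
            {"id": f"building_{k}", "type": "building", "cells": n}
            for k, n in sorted(areas.items(), key=lambda kv: int(kv[0]))
        ],
        "edges": [
            {"from": f"building_{a}", "to": f"building_{b}", "via": "road"}
            for a, b in sorted(edges, key=lambda e: (int(e[0]), int(e[1])))
        ],
    }


graph = grid_to_graph(parse_grid(EXAMPLE_CSV))
print(json.dumps(graph, indent=2))
```

The printed JSON can be pasted into the chat in place of the raw CSV. Doing the counting and connectivity in code means the model only has to read the result rather than infer it from the grid, which is the part it seemed to struggle with.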
Yes, spatial data can to some extent be represented by language (e.g. "top of the hill, 135 degrees from the starting point and 10 m above the starting level"). But this could be cumbersome and may not always be possible. ChatGPT 4.0 is multimodal and can process images, which may be a better representation of spatial data.
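As a small illustration of the "spatial data as language" idea, something like the following Python sketch could generate that kind of phrase from coordinates. The function name `describe_offset` is hypothetical, and it assumes offsets in metres with the bearing measured clockwise from north, which the example above does not actually specify.

```python
import math


def describe_offset(east_m, north_m, up_m):
    """Describe a point relative to a starting point in plain language.

    Assumes the bearing is measured clockwise from north, in whole degrees.
    """
    bearing = round(math.degrees(math.atan2(east_m, north_m))) % 360
    distance = math.hypot(east_m, north_m)
    vertical = "above" if up_m >= 0 else "below"
    return (
        f"{distance:.0f} m away at a bearing of {bearing} degrees, "
        f"{abs(up_m):.0f} m {vertical} the starting level"
    )


# A point 50 m east, 50 m south and 10 m higher than the start.
print(describe_offset(50, -50, 10))
```

Generating one such sentence per feature gets verbose quickly on a larger map, which is the cumbersome part mentioned above.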