I was wondering, and please feel free to answer intuitively.
What do you mean by consumed?
Trained, ingested, consumed.
It’s not entirely clear exactly what you’re asking.
My take on the issue being raised here is that as LLMs proliferate, the percentage of web content that is generated by LLMs will inevitably rise. But LLMs are primarily trained on material scraped from the web, right? So what are the implications of LLMs being trained more and more on the output of other LLMs, as opposed to actual human-generated content? It seems like the case of a cat chasing its own tail, meaning it doesn't end well.
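The feedback loop described above can be made concrete with a toy simulation. This is a minimal sketch, not a model of any real LLM: it treats the "model" as a simple categorical distribution over tokens, repeatedly samples from it, and refits it on its own samples. The token names and parameters are invented for illustration; the point is that rare tokens drop out of the sample, can never return, and the distribution's diversity shrinks generation after generation.

```python
import math
import random

def entropy(p):
    """Shannon entropy (in nats) of a distribution given as {token: prob}."""
    return -sum(q * math.log(q) for q in p.values() if q > 0)

def retrain_on_own_output(p, n_samples, generations, seed=0):
    """Repeatedly sample from the model, then refit it on its own samples
    (maximum-likelihood re-estimation from the empirical counts)."""
    rng = random.Random(seed)
    for _ in range(generations):
        tokens = list(p)
        weights = [p[t] for t in tokens]
        sample = rng.choices(tokens, weights=weights, k=n_samples)
        counts = {}
        for t in sample:
            counts[t] = counts.get(t, 0) + 1
        # A token that was never sampled gets probability 0 and is gone forever.
        p = {t: c / n_samples for t, c in counts.items()}
    return p

# A stand-in "human" distribution: a few common tokens plus a long tail of rare ones.
p0 = {f"tok{i}": (0.5 / 2**i if i < 5 else 0.001) for i in range(100)}
total = sum(p0.values())
p0 = {t: w / total for t, w in p0.items()}

pN = retrain_on_own_output(p0, n_samples=500, generations=20)
```

After 20 generations the surviving vocabulary is much smaller than the original 100 tokens and the entropy has dropped: the tail of human-style variety is what gets lost first when a model eats its own output.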
Skynet. Not much can be expected to happen: the parameters/weights already produce that output. You are feeding the LLM its own output, the very thing it would emit as the label/prediction, to train it further, and it's already producing that.
It might shift the weights a little, but I wouldn't expect much change in its behaviour.
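The "not much changes" intuition can be checked with a small worked example, assuming standard cross-entropy training (an assumption; real LLM pipelines differ). For a softmax classifier, the gradient of the cross-entropy loss with respect to the logits is `p - y`. If the target `y` is the model's own soft output `p`, that gradient is exactly zero, so nothing moves. But if the target is a *sampled* hard label (which is what scraping generated text gives you), the gradient is nonzero, which is one way drift can still creep in across generations.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def ce_grad_wrt_logits(logits, target):
    """Gradient of cross-entropy loss w.r.t. the logits: p - y."""
    p = softmax(logits)
    return [pi - yi for pi, yi in zip(p, target)]

logits = [2.0, 0.5, -1.0]  # toy model with a 3-token vocabulary
p = softmax(logits)

# Target = the model's own soft distribution: the gradient vanishes.
grad_soft = ce_grad_wrt_logits(logits, p)

# Target = a single sampled token (one-hot): the gradient is nonzero,
# so training on sampled outputs can still move the weights.
grad_hard = ce_grad_wrt_logits(logits, [1.0, 0.0, 0.0])
```

So both posts above are partly right: training on the model's own *distribution* is a fixed point, but training on its *samples* is not.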
The more it eats its own tail, the larger its appetite becomes, the more it eats its own tail… singularity or sentience, whichever comes first.
The problem is that a "dog" can try but can never reach its own tail, because as it reaches from one side, it is pulling the tail away from the other side at the same time.