Hey all,
So this constitutes an idea I’ve kind of held ‘close to my heart’ for awhile now, yet without help realized I have little chance of scaling without assistance.
The thinking process goes that though many now use ‘agents’ to scale competition in LLMs there might be a more reasonable (or sensible) way, if only we had the right database (and this is not so crazy-- much of LLMs post-training consists of a large volume of RLHF, but maybe we are only chasing the tail end of the horse that has left the gate).
As first, a long time lover of literature, and first a student of Philosophy, I realize many in this field, are, well, perhaps not. Language belies a certain sensitivity to it.
In any case, I’ve become amazed at how much a merely ‘required’ Philosophy course (in symbolic logic) has altered my life. I recall, as a student, thinking how am I ever going to use this ? I was suggested at the time, well, it could be useful if I ever decided to become a lawyer… (And I had no desire to become a lawyer).
But, later on when I started to study/learn FPGA’s-- Ah ha ! There they are, your ‘truth tables’, your Boolean representation of a function, and I thought ‘lo’ho’, I have a leg up on this and it made me feel good.
I also did try to reach out to my old professor on this very topic in present times. He got back to me once which was very nice, but I have not heard back since.
His entire textbook is available free online though, if in an antiquated format:
https://courses.umass.edu/phil110-gmh/MAIN/IHome-5.htm
He is an excellent teacher though so I want no negative feedback towards him.
My thought was, seeing as sentences can be defined ‘situationally’ this is not the same as ‘symbolically’, and for what I will call ‘active attention’ this is what we would require.
A bit less of the ‘word salad’.
But composing such a database would take a (really) dedicated team and some time.
Yes, we’ve found you can scour the back ends of the internet and comprise something useful, and I confess I still have a bit of a ‘bias’-- Even your ‘best model’ is still always, always, always dependent on the data you put into it [there is, simply, never, no way around that], we ought to think more carefully about curation beyond FineWeb, etc.
Though, even considering present models I think you could build the symbolic logic into the positional encodings; Or at the start, not the end… If only you had the dataset.
And, I guess, I mean, I suppose I am still not really convinced this is the most efficient way to implement ‘all things’ as we are acting with a database methodology in Q,V,K… But at least it might help.
I did mention an earlier version of this thought on Discord’s @akakak1337 forum, but it got no traction, so perhaps I am wrong.
Felt though it was finally time I take it off my chest.