I was recently doing some development, using LM Studio to run my local models. I hit a ‘strange’ issue with one model: it was no better or worse than my standard go-to, Llama 3.2 Instruct, it was just very different in the output it was giving, interpreting prompts in a way unlike any other model I had tried.
I looked at the prompt structure in LM Studio and, to my surprise, found many Chinese characters there! So I checked the Hugging Face model card, which was also predominantly in Chinese, and yes, I confirmed the model was Chinese in origin.
The above led me to investigate LLMs from China more generally.
I was surprised to learn that close to 50% of open-source (free) LLMs originate from Chinese sources, see here. I don’t have 100% confidence in what I read on Reddit! So I investigated further and confirmed that what was being shown was probably true.
My mindset, like most people’s, is that I don’t generally trust things such as emails from places like China, Russia, etc. Rightly or wrongly, I became suspicious. My grandmother would often tell me, ‘nothing in life is free, and if someone is offering you something for free, then they will have a reason for doing so’.
By now I had drifted a long way from my original reason for checking out this model, but I was intrigued by the situation.
Think of the basics of how an LLM functions: it calculates the probability of what the next token will be. If we expand on this and, for simplicity, rephrase token as word, then we can say that the probability of each next word builds up a sentence, and so on.
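To make that concrete, here is a minimal sketch of next-token prediction. The vocabulary and the scores (logits) are entirely made up for illustration; a real model computes scores over tens of thousands of tokens, but the softmax step that turns scores into probabilities works the same way.

```python
import math

# Toy vocabulary and made-up model scores ("logits") for the next
# word, given some prompt. A real model computes these scores.
vocab = ["Paris", "London", "a", "the"]
logits = [4.0, 1.0, 0.5, 0.2]

def softmax(scores):
    """Convert raw scores into a probability distribution."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")
# The word with the highest score ("Paris") gets by far the
# highest probability, so it is the most likely next token.
```

Generation simply repeats this step: sample (or pick) a token from the distribution, append it to the prompt, and score again.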
What if, while building these free open-source LLMs, subtle changes are made to steer the outputs in a certain direction? You get my drift?
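The point is easy to demonstrate in miniature. Below, a three-word toy model starts out with no preference at all, and a small invented bias (the +0.5/−0.5 numbers are mine, purely for illustration) is added to the scores. In a real model such a nudge would come from the training data or fine-tuning, not a hand-written list, but the effect on the probabilities is the same kind of thing.

```python
import math

def softmax(scores):
    """Convert raw scores into a probability distribution."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# An unbiased toy model: every option equally likely.
vocab = ["good", "bad", "neutral"]
base_logits = [1.0, 1.0, 1.0]

# A hypothetical hidden nudge baked in during training:
# slightly favour "good", slightly suppress "bad".
bias = [0.5, -0.5, 0.0]
steered = [l + b for l, b in zip(base_logits, bias)]

base_probs = softmax(base_logits)
steered_probs = softmax(steered)
print(base_probs)     # all three options equally probable
print(steered_probs)  # "good" is now clearly favoured over "bad"
```

A shift this small is invisible in any single response, yet over millions of generated sentences it would consistently tilt the tone of the output.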
If this is so, then I’m sure that all nations, not just China, could see potential in manipulating opinion, learning and anything else that may be to their benefit. Remember, the USA is second in the list of top provider nations of free open-source LLMs. Living in the UK, I’m confident that ‘if’ such a thing is possible then even my own government would see the potential; this is not a China thing.
Given how much of our lives are becoming dependent on AI and LLMs, is the above something that should be considered for all LLMs?
While there may be little to no evidence at present, we must ask to what extent we are becoming reliant on LLMs. If a huge share of the market is being soaked up by free open-source LLMs, could that reliance, in time, be manipulated?
Am I being paranoid, or is this something that’s worth considering?