The other day a tweet from computer scientist Paul Graham came across my For You feed:
The tweet—and the implications for LLMs—was interesting enough that I remembered it this morning when I came across this article in the Guardian titled How cheap, outsourced labour in Africa is shaping AI English. Consistent with Graham’s tweet, the article notes that certain words appear more frequently in LLM outputs than one would expect given how commonly those words are used by US English speakers. In addition to “delve,” the article cites others who have suggested that LLMs such as ChatGPT also “overuse” words including Explore, Captivate, Tapestry, Leverage, Embrace, Resonate, Dynamic, Testament, and Elevate.
The obvious question is “why?” LLMs are massive statistical models trained on millions of examples of (predominantly) US English text scraped from the Internet and parsed into tokens. An LLM uses statistical inference to predict which token(s) should follow based on the preceding tokens; e.g., “They (i) predict the next word based on the preceding text provided as the input, (ii) append the predicted word back to the input, and (iii) repeat this process as needed.” For example:
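As a toy illustration of that predict-append-repeat loop, here is a minimal sketch in Python. The “model” is just a hand-written lookup table keyed on the most recent token; a real LLM conditions on the entire preceding context and learns its probabilities from training data, so every name, token, and probability below is made up purely for illustration.

```python
import random

# Hypothetical next-token probability tables, keyed on the last token seen.
# A real LLM learns billions of parameters rather than a hand-written table.
NEXT_TOKEN_PROBS = {
    "Let's": {"delve": 0.6, "look": 0.3, "dig": 0.1},
    "delve": {"into": 0.9, "deeper": 0.1},
    "deeper": {"into": 1.0},
    "into": {"the": 1.0},
    "the": {"details": 0.5, "topic": 0.5},
}

def predict_next(token: str) -> str:
    """(i) Predict the next token from the preceding token."""
    dist = NEXT_TOKEN_PROBS.get(token, {"<end>": 1.0})
    tokens, weights = zip(*dist.items())
    return random.choices(tokens, weights=weights)[0]

def generate(prompt: list[str], max_tokens: int = 10) -> str:
    tokens = list(prompt)
    for _ in range(max_tokens):
        next_token = predict_next(tokens[-1])  # (i) predict the next word
        if next_token == "<end>":
            break
        tokens.append(next_token)              # (ii) append it to the input
    return " ".join(tokens)                    # (iii) repeated as needed

print(generate(["Let's"]))  # e.g. "Let's delve into the details"
```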
So, if “delve” is not commonly used by English speakers and writers, why is it showing up so frequently in LLM outputs? The Guardian article thinks it has an answer: Africa!
After being initially trained on large amounts of data, LLMs such as ChatGPT go through “reinforcement learning from human feedback” (RLHF). Basically, the LLM is given the same prompt multiple times and a human chooses which output is the “best.” This process “teaches” the LLM how to fine-tune the probabilities it assigns to tokens and “nudges” it to weight certain tokens (words) more heavily. This, in turn, means those words will appear more frequently in the LLM’s output.1
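To make the “human chooses the best output” step concrete, here is a sketch of what a single comparison record might look like. The field names and example responses are invented for illustration; this is not OpenAI’s actual data format.

```python
from dataclasses import dataclass

@dataclass
class PreferenceRecord:
    """One human judgment: which of two model responses to the same prompt is better."""
    prompt: str
    response_a: str
    response_b: str
    chosen: str  # "a" or "b" -- the annotator's pick

record = PreferenceRecord(
    prompt="Summarise the article in one paragraph.",
    response_a="Let's delve into the key findings...",
    response_b="Here are the key findings...",
    chosen="a",
)
# If annotators consistently prefer the "delve" phrasing, many records like
# this one nudge the fine-tuned model toward that style.
```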
How exactly does “RLHF + Africa” add up to too much use of the word “delve”? OpenAI’s ChatGPT was trained in part on data that was annotated by workers at Sama, a data-annotation vendor with offices in Kenya and Uganda. Sama has recruited workers from other African countries to work on projects in Kenya, so some of those workers could have been from Nigeria. And, according to the Guardian, the word “delve” is used much more commonly in Nigeria than in the US or UK. The article speculates that these Nigerians could be the “humans” in the RLHF process who, when presented with various outputs from ChatGPT, might prefer those outputs that use “delve.”
As the article puts it:

> I said “delve” was overused by ChatGPT compared to the internet at large. But there’s one part of the internet where “delve” is a much more common word: the African web. In Nigeria, “delve” is much more frequently used in business English than it is in England or the US. So the workers training their systems provided examples of input and output that used the same language, eventually ending up with an AI system that writes slightly like an African.
I’m not sure I completely buy this argument, but it is definitely interesting!
As ChatGPT describes it:
Here's a breakdown of the process:
1. **Initial Training**: LLMs like ChatGPT are first trained on a large dataset in an unsupervised or self-supervised manner. This initial phase involves learning to predict the next word in a sequence, building a base understanding of language from the training data.
2. **Reinforcement Learning from Human Feedback (RLHF)**: After the initial training, the model undergoes further training using RLHF. This is a fine-tuning stage where the model is refined to better align with specific objectives or desired outputs.
3. **Human Interactions in RLHF**: In RLHF, the model is presented with various prompts or scenarios multiple times, and different potential responses are generated. Human trainers then assess these responses and select which ones they consider the "best" based on criteria like relevance, accuracy, safety, and alignment with ethical guidelines.
4. **Learning from Feedback**: The feedback from human trainers is used to adjust the model’s parameters. Specifically, the model learns to adjust the probabilities it assigns to different sequences of tokens (words). The human-selected responses help the model understand which types of responses are preferred, effectively "nudging" it to favor certain words or phrasing styles over others.
5. **Impact on Future Outputs**: The adjustments made during RLHF mean that words and phrases deemed more desirable or appropriate are more likely to appear in the model's outputs in future interactions. This selective weighting helps in shaping the model's responses to be more aligned with human values and expectations.
Thus, RLHF plays a crucial role in refining the behavior of an LLM by integrating human judgment into its learning process, leading to outputs that are generally more user-friendly and contextually appropriate.
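To put some numbers behind steps 3-5, here is a minimal sketch of the pairwise-preference objective commonly used to train a reward model in RLHF (a Bradley-Terry style loss). Real systems train a neural reward model and then fine-tune the LLM against it with an algorithm such as PPO; the scores below are invented just to show the direction of the nudge.

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """-log(sigmoid(reward_chosen - reward_rejected)): small when the chosen
    response already scores higher, large when it scores lower."""
    return -math.log(1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected))))

# Suppose the reward model currently scores a "delve"-flavoured response
# below a plainer one, but the human annotator chose the "delve" version.
r_delve, r_plain = 0.2, 0.5
print(round(preference_loss(r_delve, r_plain), 3))  # ~0.854, worse than the
                                                    # ~0.693 you'd get for a tie

# Gradient descent on this loss raises r_delve and lowers r_plain. Repeat
# over many comparisons and the reward model -- and the LLM fine-tuned
# against it -- ends up assigning more probability to "delve"-style prose.
```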
This could also be the reason why LLMs default to much more formal language than what we’re used to in the West. I’ve noticed that people in Africa and India tend to use a more formal style of English, and if RLHF was done by cheaper labor from countries with a higher power distance (which intuitively feels correlated), we’d get a more formal tone than the casual US or Canadian one. It would also explain why, when you do ask an LLM to be conversational, it goes all “dude” and “bro” on you. Again, people from a high power distance country who are used to formal English would find it hard to distinguish between different levels of informality.