Hacker News

My reply was to GP's assertions about which questions GPT will and won't answer, not to TFA's claims about political compasses.

> Ben Shapiro

...really?



GP does not assert anything about other questions, and in fact doesn't even state an opinion on this one: they merely demonstrate that a single question has a biased filter, which is sufficient to prove the existence of some sort of awkward creator-infused bias, even if it were merely a bias against topics that begin with the letter F.


(When I replied to it, Toomim's post was wholly different and linked to a Ben Shapiro video about purported leftist bias in ChatGPT.)


There is a specific training stage called Reinforcement Learning from Human Feedback (RLHF) that is supposed to bias the model to be helpful and polite, to present opposing views for balance, and to avoid asserting things outside its training data. We have all seen the canned responses that appear when the filter is triggered.

RLHF is very important if you want your model to act professionally. The first version of Davinci (GPT-3 in 2020) was "free" of such human-preference training, but it was very hard to convince to cooperate on any task.

But applying RLHF could also introduce political biases into the model. This depends on the labelling team.
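To make the mechanism concrete: in the reward-modelling step of RLHF, labellers rank pairs of model replies, and a reward model is fit to those rankings, typically with a Bradley-Terry-style pairwise loss. This is a minimal illustrative sketch of that loss, not OpenAI's actual code; the example scores are invented. The point is that whatever the labelling team systematically prefers is exactly what the loss pushes the model toward.

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(r_chosen - r_rejected)).

    Low when the reward model scores the labeller-preferred reply higher,
    high when it scores the rejected reply higher.
    """
    diff = reward_chosen - reward_rejected
    # Numerically stable form of -log(sigmoid(diff)): log(1 + exp(-diff))
    return math.log1p(math.exp(-diff))

# Hypothetical labeller comparisons:
# (reward score of the reply the labeller preferred, score of the one rejected)
comparisons = [(2.0, 0.5), (1.2, 1.5), (3.0, -1.0)]

avg_loss = sum(preference_loss(c, r) for c, r in comparisons) / len(comparisons)
print(f"average loss over batch: {avg_loss:.3f}")
```

The middle pair, where the reward model ranks the replies opposite to the labeller, contributes the largest loss; training on many such pairs bends the reward model, and then the policy, toward the labellers' collective preferences, political ones included.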



