Hacker News

My reply was to GP's assertions about which questions GPT will and won't answer, not to TFA's claims about political compasses.

> Ben Shapiro

...really?



GP does not assert anything about other questions, and in fact doesn't even state an opinion on this one: they merely demonstrate that a single question has a biased filter, which is sufficient to prove the existence of some sort of awkward creator-infused bias, even if it were merely a bias against topics that begin with the letter F.


(When I replied to it, Toomim's post was wholly different and linked to a Ben Shapiro video about purported leftist bias in ChatGPT.)


There is a specific training stage called Reinforcement Learning from Human Feedback (RLHF) that is supposed to bias the model to be helpful and polite, to present opposing views for balance, and to avoid asserting things outside its training data. We have all seen the canned responses that appear when the filter is triggered.

RLHF is very important if you want your model to act professionally. The first version of Davinci (GPT-3 in 2020) was "free" of such human-preference training, but it was very hard to convince to cooperate on any task.

But applying RLHF could also introduce political biases into the model. This depends on the labelling team.
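To make the mechanism concrete: in the reward-modelling step of RLHF, labellers rank pairs of model replies, and a reward model is fit to those rankings, typically with a Bradley-Terry-style pairwise loss. This is a minimal illustrative sketch of that loss, not OpenAI's actual code; the example scores are invented. The point is that whatever the labelling team systematically prefers is exactly what the loss pushes the model toward.

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Pairwise preference loss: -log(sigmoid(r_chosen - r_rejected)).

    Low when the reward model scores the labeller-preferred reply higher,
    high when it scores the rejected reply higher.
    """
    diff = reward_chosen - reward_rejected
    # Numerically stable form of -log(sigmoid(diff)): log(1 + exp(-diff))
    return math.log1p(math.exp(-diff))

# Hypothetical labeller comparisons:
# (reward score of the reply the labeller preferred, score of the one rejected)
comparisons = [(2.0, 0.5), (1.2, 1.5), (3.0, -1.0)]

avg_loss = sum(preference_loss(c, r) for c, r in comparisons) / len(comparisons)
print(f"average loss over batch: {avg_loss:.3f}")
```

The middle pair, where the reward model ranks the replies opposite to the labeller, contributes the largest loss; training on many such pairs bends the reward model, and then the policy, toward the labellers' collective preferences, political ones included.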



