RBF Larder: These Prisoners Are Training AI

D&I

Issue #362

17 Sep 2023

Reinforcement Learning from Human Feedback (RLHF) is an important component of fine-tuning AI, particularly for reducing hallucinations in GPT-style models. It mainly consists of humans clicking Y/N on model outputs, a mind-numbingly repetitive job which, in the case of Finland, is occasionally performed by prison labour. This raises an interesting ethical conundrum: at what point does providing prisoners with a legal route into the economy become the exploitation of slave labour?