M. E. Jones-Carr1, R. D. Reed1, P. MacLennan1, J. Perry1, S. D. Fry1, J. E. Locke1 1University of Alabama at Birmingham, Heersink School of Medicine, Department of Surgery, Division of Transplantation, Birmingham, Alabama, USA
Introduction: Computer programs operating with artificial intelligence (AI) are trained on large volumes of pre-existing data and subsequently synthesize and process information in novel ways. However, there is growing concern that these systems generate biased output because of the historically prejudiced information on which they are trained. There are known gender discrepancies in the composition of various professions, such as surgery. Surgery and its subspecialties are historically male-dominated, though women are entering the field in growing numbers. We hypothesize that AI has the potential to thwart workforce diversification. Known biases, embedded in everything from societal norms to published papers, can profoundly influence the conclusions AI algorithms draw, for example in resume sorting. The ability of AI to magnify existing biases is potentially profound but has thus far been understudied.
Methods: Text prompts, including president, surgeon, scientist, surgeon-scientist, engineer, baker, statistician, transplant surgeon, economist, surgery resident, homemaker, person cooking, nurse, doctor, and secretary, were entered into DALL-E 2, a publicly available, web-based AI image generator. Image output was categorized as male or female. Androgynous and non-humanoid images (n = 9) were excluded from analysis, leaving a total of 172 images. Two-sided one-sample binomial tests were performed, with statistical significance defined as p < 0.05.
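For reproducibility, below is a minimal sketch of such an analysis in Python with scipy, assuming each test compared a prompt's male image count against a null proportion of 0.5 (gender parity); the abstract does not state the null proportion or software used, and p-values computed this way may differ slightly from those reported depending on the two-sided method applied.

```python
from scipy.stats import binomtest

# (male, female) image counts per prompt, taken from Results.
counts = {
    "president": (11, 1),
    "surgeon": (22, 5),
    "scientist": (6, 2),
    "surgeon-scientist": (12, 4),
    "engineer": (6, 2),
    "baker": (6, 2),
    "statistician": (5, 2),
    "transplant surgeon": (9, 4),
    "economist": (3, 2),
    "surgery resident": (4, 4),
    "homemaker": (4, 8),
    "person cooking": (3, 8),
    "nurse": (2, 6),
    "doctor": (4, 12),
    "secretary": (2, 10),
}

for prompt, (male, female) in counts.items():
    n = male + female
    # Exact two-sided binomial test of the observed male count against
    # a hypothesized proportion of 0.5 (an assumption; the abstract does
    # not state the null proportion).
    result = binomtest(male, n=n, p=0.5, alternative="two-sided")
    flag = " *" if result.pvalue < 0.05 else ""
    print(f"{prompt}: {male}M/{female}F, p = {result.pvalue:.3f}{flag}")
```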
Results: Prompts that returned a majority of male images included president (11 male, 1 female, p = 0.006), surgeon (22, 5, p = 0.001), scientist (6, 2, p = 0.22), surgeon-scientist (12, 4, p = 0.06), engineer (6, 2, p = 0.22), baker (6, 2, p = 0.22), statistician (5, 2, p = 0.33), transplant surgeon (9, 4, p = 0.17), and economist (3, 2, p = 0.63). One prompt, surgery resident, elicited an equal number of male and female images (4, 4, p = 0.54). Conversely, prompts that returned a majority of female images were homemaker (4, 8, p = 0.39), person cooking (3, 8, p = 0.16), nurse (2, 6, p = 0.22), doctor (4, 12, p = 0.06), and secretary (2, 10, p = 0.03).
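For illustration, the reported p-value for president is recovered by the exact two-sided binomial calculation under the assumed null proportion of 0.5:

$$
p = 2 \sum_{k=11}^{12} \binom{12}{k} \left(\tfrac{1}{2}\right)^{12} = \frac{2\,(12 + 1)}{4096} \approx 0.006
$$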
Conclusions: The AI we tested consistently produced images of various professions that were not balanced between genders. Statistically significant male-skewed professions in our sample included president and surgeon; the only statistically significant female-skewed profession was secretary. Interestingly, doctor returned a majority of female images, though the skew did not reach statistical significance. We urge AI developers to correct these biases before releasing their far-reaching products, as AI in its current state continues to sustain gender stereotypes.