Images generated by diffusion models tend to reproduce stereotypical biases about the people they depict. Training on curated datasets can mitigate these biases and produce less skewed results. Images generated from certain adjectives, such as “attractive,” “poor,” and “rich,” reflect significant demographic stereotypes.

Rich

Attractive

Poor

Problematic Adjectives and Manual Training

To correct these biases, data sets were created using images from the web. These data sets contained demographics chosen to counter the skew seen in the baseline generations for each adjective: rich, attractive, and poor. For rich, the baseline was biased toward older white men in suits, which we attempted to counter with more people of color and women. For attractive, the problem was the opposite: the baseline showed too many young white women, so the training data included more men and older people. Lastly, poor showed a bias toward browner skin and impoverished backgrounds, so the model was trained on a data set featuring lighter-skinned people. Each iteration and its changes are described below.

Both portraits (realistic photos) and illustrations were generated. Data sets containing only portraits had less effect on the illustrations, so some later iterations used data sets that combined portraits and illustrations; this is noted in the details of each iteration.

Prompt & Specs used for these generations

Prompt: photo portrait/illustration of a/an (adjective) person
Seeds: 6056749, 9511050, 39658290, 79906950, 71417211, 81634246, 52843677, 10706513, 26809251, 11884820, 2301511322, 1779348179, 4222010375, 987223521, 3630975423, 4179675576, 247189699, 1982453154, 3593807829, 3596992621

Model: flux1-dev-fp8.safetensors
Width/Height: 1152x896
Sampler: Euler
Steps: 20
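
As a rough illustration of how the prompt template and specs above expand into individual generation jobs, here is a minimal Python sketch. It only builds a list of job descriptions and does not call any image-generation tool. The names `build_jobs`, `article`, and `SETTINGS` are hypothetical, and the sketch assumes that every seed was used with every adjective and style and that 1152 is the width and 896 the height, which the specs above suggest but do not state outright.

```python
from itertools import product

# Values copied from the specs listed above.
ADJECTIVES = ["rich", "attractive", "poor"]
STYLES = ["photo portrait", "illustration"]
SEEDS = [
    6056749, 9511050, 39658290, 79906950, 71417211,
    81634246, 52843677, 10706513, 26809251, 11884820,
    2301511322, 1779348179, 4222010375, 987223521, 3630975423,
    4179675576, 247189699, 1982453154, 3593807829, 3596992621,
]
# Assumption: "1152x896" is read as width x height.
SETTINGS = {
    "model": "flux1-dev-fp8.safetensors",
    "width": 1152,
    "height": 896,
    "sampler": "Euler",
    "steps": 20,
}


def article(word):
    """Pick "a" or "an" from the first letter (enough for these adjectives)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def build_jobs():
    """Return one job dict per (style, adjective, seed) combination.

    Assumption: the full seed list is applied to every style and adjective.
    """
    jobs = []
    for style, adjective, seed in product(STYLES, ADJECTIVES, SEEDS):
        prompt = f"{style} of {article(adjective)} {adjective} person"
        jobs.append({"prompt": prompt, "seed": seed, **SETTINGS})
    return jobs


if __name__ == "__main__":
    jobs = build_jobs()
    print(len(jobs), "jobs")
    print(jobs[0]["prompt"])
```

Keeping the seed list fixed across the baseline and each iteration means that differences between image sets can be attributed to the training data rather than to sampling noise.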

Adjective 1: “Rich”

Portrait

Visuals

Baseline (top) and iteration 1 (bottom)

Insights Summary

The baseline images exhibited a significant skew toward white men in suits depicted against "dark study" backgrounds. To address this bias, we trained the model on data featuring a broader range of racial backgrounds, with more people of color (POC) and women. However, the first iteration overcorrected, resulting in an overrepresentation of women and individuals of Asian descent.

Baseline

(Trained on portraits)

Strong uniformity in depictions, with limited diversity among the people shown.
Predominantly white men, with minimal variation in skin complexion.
Proposed changes for Iteration 1: include more POC and women for improved diversity.

Iteration 1

(Trained on portraits with added diversity)