Illinois Data Bank

Data for: Auditing Race and Gender Discrimination in Online Housing Markets

This dataset contains the results of a three-month audit of housing advertisements. It accompanies the 2020 ICWSM paper "Auditing Race and Gender Discrimination in Online Housing Markets" and covers data collected between December 7, 2018 and March 19, 2019.

The dataset contains two JSON files, each storing one JSON object per line. In the first file, each object represents an advertisement and includes the date and time it was collected, the image and title of the ad (if collected), the page on which it was displayed, and the training treatment it received. In the second file, each object represents a visit to a housing lister and contains the URL, the training treatment applied, the location searched, and metadata for the top sites scraped: location, price, and number of rooms.
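Because both files hold one JSON object per line rather than a single JSON array, they need to be parsed line by line. The following is a minimal sketch of one way to do that in Python. The field names in the sample (`collected_at`, `treatment`) are hypothetical placeholders, not the dataset's actual keys, and the in-memory sample stands in for an opened file.

```python
import io
import json
from typing import Iterator, TextIO


def iter_json_lines(handle: TextIO) -> Iterator[dict]:
    """Yield one parsed object per non-empty line of a newline-delimited JSON file."""
    for line_number, line in enumerate(handle, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number}: invalid JSON") from exc


# Hypothetical two-record sample; real keys must be read from the files themselves.
sample = io.StringIO(
    '{"collected_at": "2018-12-07T10:00:00", "treatment": "control"}\n'
    "\n"
    '{"collected_at": "2018-12-08T11:30:00", "treatment": "control"}\n'
)
records = list(iter_json_lines(sample))
print(len(records))
```

To read one of the dataset's files, pass an open text file handle in place of `sample`. Streaming one record at a time avoids loading the whole file into memory.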

The dataset also includes the raw images of the ads, which were collected so that they could be coded by interest and targeting. The images were captured with Selenium and named by their perceptual hash so that duplicate images could be removed.
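The description does not say which perceptual hash was used. To illustrate the general idea of hash-based naming and de-duplication, here is a minimal sketch of an average hash, which is one common perceptual hash and is assumed here purely for illustration. It operates on a grayscale pixel grid that is assumed to have already been downscaled. The function names and the sample images are hypothetical.

```python
from typing import Dict, List

Pixels = List[List[int]]  # grayscale intensities, 0-255, already downscaled


def average_hash(pixels: Pixels) -> str:
    """Return a hex string with one bit per pixel: 1 if brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = "".join("1" if value > mean else "0" for value in flat)
    width = (len(bits) + 3) // 4  # hex digits needed to hold every bit
    return format(int(bits, 2), f"0{width}x")


def dedupe_by_hash(images: Dict[str, Pixels]) -> Dict[str, str]:
    """Map each hash to the first image name that produced it."""
    kept: Dict[str, str] = {}
    for name, pixels in images.items():
        kept.setdefault(average_hash(pixels), name)
    return kept


# Hypothetical 2x2 images: ad_b is a uniformly brighter copy of ad_a.
images = {
    "ad_a": [[10, 200], [220, 30]],
    "ad_b": [[20, 210], [230, 40]],
    "ad_c": [[200, 10], [30, 220]],
}
kept = dedupe_by_hash(images)
print(kept)
```

Here `ad_a` and `ad_b` hash to the same value even though their pixel values differ, so only one of them is kept. Unlike a cryptographic hash, a perceptual hash is designed to stay stable under small visual changes like this one.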

Social Sciences
algorithmic audit; advertisement audit;
CC BY
Karrie Karahalios
1057 times
Version  DOI                          Comment  Publication Date
1        10.13012/B2IDB-1408573_V1             2020-02-12

116 MB File

