features_and_friends.csv
------------------------

This file contains 33 image features for 19,217 MySpace profile pictures.
Also included is the number of friends for each user in the sample.

The columns are (roughly):

n         - Number of brightness levels
pn        - A measure of the distribution of these levels; a higher number
            means there are more regions with the same brightness
h,s,v     - Average HSV color for the entire image
ch,cs,cv  - Average HSV for the central portion of the image
r,g,b     - Average RGB levels, as % of median luminosity
sr,sg,sba - Variance in RGB
h1..h4    - The quartiles for luminosity histogram
syx,syy   - Measures of symmetry in X and Y axes
csy       - Measure of symmetry in the central portion of the image
csyx      - Location of vertical axis of greatest symmetry in image center
smx,smy   - Measures of 1-pixel smoothness in X and Y directions
bytes     - Size of the compressed file (in original file format)
is_gif    - Was the original file a GIF?
is_jpg    - Was the original file a JPG?
x,y       - Dimensions of the image
ar        - Aspect ratio = x/y
px        - Number of pixels = x*y
ars       - Landscape or Portrait
friends   - Number of friends

myspace_faces.tar
-----------------

This file contains 64x64 pixel profile pictures taken from 250,000 MySpace
users. The data was collected in early May 2009. The number of files is
slightly less than the total number of users in the sample because some
downloaded images were corrupt.

The files are stored in 250 directories (000 to 249), each containing 1000
images. The images are renamed to have sequential numbers, with original
file names scrubbed to remove anything which may directly identify any
MySpace user account. A small percentage of the images include embedded
comments within metadata, but nothing should be personally identifiable.
If anything is, whoops.

Note that a significant number of the pictures are animated, and this
animation has been retained in the process of converting to 64x64 GIFs.
The aspect ratio of the original image has been lost in this process.
Sorry. Apart from resizing, no other image processing has taken place. The
file format conversion (from GIF and JPG) and resizing were performed by
ImageMagick.

I can't share how the users were chosen to be included in this sample.
Even without those details, it is safe to consider the selection as random
browsing through the friends lists of randomly chosen users.

This file is provided as a resource to researchers, and the original images
remain the property of whoever currently owns them. All images were accessed
via publicly available MySpace image servers and, as far as I see it, that
makes them usable by the general public.
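
As a rough illustration of the column list for features_and_friends.csv,
here is a minimal Python sketch of reading the file. It assumes the CSV is
comma-separated and has a header row using the column names listed above;
the list is only approximate, so check the real header first. The names
load_features, parse_value and derived_columns_agree are hypothetical
helpers, and the tolerance used for the aspect-ratio check is a guess about
rounding in the file.

```python
import csv
import io
import math


def parse_value(text):
    """Return a float when the cell is numeric, else the raw string.

    Columns such as ars (Landscape or Portrait) may not be numeric.
    """
    try:
        return float(text)
    except ValueError:
        return text


def load_features(handle, target="friends"):
    """Read an open CSV handle into a list of (features, friends) pairs.

    Assumes a header row that includes a 'friends' column.
    """
    rows = []
    for record in csv.DictReader(handle):
        friends = int(float(record.pop(target)))
        features = {name: parse_value(value) for name, value in record.items()}
        rows.append((features, friends))
    return rows


def derived_columns_agree(features, rel_tol=1e-2):
    """Check the documented relations ar = x/y and px = x*y for one row."""
    x, y = features["x"], features["y"]
    return (math.isclose(features["ar"], x / y, rel_tol=rel_tol)
            and features["px"] == x * y)


# Tiny made-up sample standing in for the real file (values are invented).
sample = io.StringIO("x,y,ar,px,ars,friends\n64,32,2.0,2048,L,150\n")
rows = load_features(sample)
features, friends = rows[0]
print(friends, derived_columns_agree(features))
```

For the real data, pass an open file instead of the sample, for example
open("features_and_friends.csv", newline="").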
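
Since myspace_faces.tar is described as 250 directories of 1000 images,
with slightly fewer files than users overall, a quick per-directory count
is one way to see where images are missing. This is only a sketch: it
assumes each image's immediate parent directory inside the archive is one
of the names 000 to 249, which the description above implies but does not
spell out. The function name count_images_per_directory is hypothetical.

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath


def count_images_per_directory(tar_path):
    """Count regular files in the archive, grouped by parent directory name."""
    counts = Counter()
    with tarfile.open(tar_path) as archive:
        for member in archive:
            if member.isfile():
                # Assumed layout: .../<directory>/<number>.gif
                counts[PurePosixPath(member.name).parent.name] += 1
    return counts


# Example use against the real archive (not run here):
# counts = count_images_per_directory("myspace_faces.tar")
# short = {name: n for name, n in counts.items() if n < 1000}
```

Directories that come back with fewer than 1000 entries would be the ones
affected by the corrupt downloads.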