OK, Last Names. I probably should have started a new thread for this one, as this file can be used for any era.
Again, the data comes from the old names.mongabay.com website. Unfortunately, as part of their reorganization, it appears certain tables are no longer on the site.
The data I pulled has just shy of 50,000 names, and provides statistics to indicate whether the name is used by people who primarily consider themselves white, black, asian/pacific, north american native, hispanic, or a combination of the above. This allows easy extrapolation of the data into subsets by ethnicity. Because my league starts so far back in history, I've not yet created subfiles for anything but caucasian, but can do so if anyone wants them before they become necessary for my universe.
Unlike the first names data, this data is sourced solely from the 2000 U.S. census, but separating the data by ethnicity does enable reasonable replication of a historic nameset. For caucasian names, certain sub-ethnicities, such as Italian, may need to be filtered further to promote greater realism in the era. I have not done this.
To keep non-caucasian names in my inaugural file to a minimum, I only used names where at least 30% of the respondents identified themselves as non-hispanic white. This is not a guarantee, and further cleanup of the datafile will be necessary, a problem unlikely to exist when the subsets are created for african-american, asian, and hispanic.
In any event, this filtering still yields over 43,000 caucasian names, which have been incorporated into ethnicity id 0 in the attached file.
The african-american names, under ethnicity 39, include about 3300 names, and the hispanic names, under a NEW ethnicity 41 (modification of the world_default.xml file required), include about 3600 names. Given the size of the file, only these three namesets are included in the attachment, which still had to be compressed before it could be uploaded to this board.
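For anyone who wants to repeat the 30% cut outside of Excel, here is a rough Python sketch of the idea. It is not the spreadsheet I actually used. The column names (`name`, `pctwhite`, `pctblack`, `pcthispanic`), the helper `filter_surnames`, and the sample rows are all made up for illustration, so adjust them to whatever headers your copy of the data has. It also assumes that some cells may hold a non-numeric marker instead of a percentage, and simply skips those rows.

```python
import csv
import io

# Assumption: percentages are stored as 0-100 values, one column per group.
WHITE_THRESHOLD = 30.0


def filter_surnames(rows, column="pctwhite", threshold=WHITE_THRESHOLD):
    """Return the names whose share in `column` is at least `threshold`."""
    kept = []
    for row in rows:
        try:
            share = float(row[column])
        except (KeyError, ValueError):
            # Missing column or non-numeric cell: skip the row.
            continue
        if share >= threshold:
            kept.append(row["name"])
    return kept


# Placeholder rows, NOT real census figures.
SAMPLE = """name,pctwhite,pctblack,pcthispanic
EXAMPLEA,72.5,20.1,3.0
EXAMPLEB,4.9,1.2,91.0
EXAMPLEC,30.0,65.0,2.0
EXAMPLED,(S),(S),(S)
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
white_names = filter_surnames(rows)
print(white_names)
```

The same function can build the other subsets by pointing it at a different column and threshold, e.g. `filter_surnames(rows, "pcthispanic", 50.0)`. Note that a name like EXAMPLEC passes the 30% cut even though most of its bearers are in another group, which is exactly why the resulting file still needs manual cleanup.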
As with the first name file, I'm happy to share the Excel file I used to build and adjust this nameset, and will update this file from time to time as I tweak it.
Comments and suggestions welcome.
Last edited by cbbl; 05-23-2021 at 08:26 PM.