Revisions Made to IPUMS-98

This page contains a history of all significant data fixes, documentation and extract system enhancements, updates, and other changes made to IPUMS-98 since the data first became available in January 1998.  See Changes from IPUMS-95 to IPUMS-98 for an abbreviated list of critical changes to the IPUMS data and documentation before 1998.  Revisions are categorized as primarily Data, Documentation, or Extract System changes, and each section is organized in reverse chronological order, with the most recent revisions listed first. 

DATA
July 1, 1999.  Released new versions of 1850, 1860, 1870, 1880, 1900, and 1910 samples, containing the following enhancements and corrections: 
  • New geographic variables (METDIST, MDSTATUS, MCIVDIV, INCPLACE, INCORP, URBAREA) were added to 1850, 1880, and 1910 samples.  
  • Minor fixes to OCC1950, IND1950, CITIZEN, LIT, COUNTY, SEA, GQTYPE, GQFUNDS, NATIVITY, VOTE, MARRINYR, NAMEFRST, and NAMELAST.
  • Missing age allocation procedures fixed to allow age 0 to be allocated.  Improved rules for spouse imputation (IMPSP).  
  • Added cases from Bradley County, TN, that had been inadvertently dropped from the 1850 sample.  PERWT adjusted slightly. 
January 22, 1999. Major error in the November 25 version of the 1860 and 1870 samples corrected.  The 1860 and 1870 samples had an error in SURSIM, which in turn created errors in all the family interrelationship variables (IMPMOM, IMPPOP, IMPSP) and in the variables constructed from them (NCHILD, NCHLT, FAMSIZE, ELDCH, and so on).  The error could also have implications for missing data allocation; we recommend discarding any previous versions of the 1860 and 1870 samples. 

November 25, 1998 -- PERWT, NUMHHTAK, and GQFUNDS fixed in the 1860 and 1870 samples. 

November 6, 1998 -- Revised preliminary samples of the 1860 and 1870 census released.  Two versions of both the 1860 and 1870 PUMS are now available: (1) a flat 1-in-200 sample of all dwellings, and (2) a black oversample containing a 1-in-100 sample of dwellings containing one or more blacks and a 1-in-200 sample of all other dwellings. 
The sample weights in both the flat and black oversamples of the preliminary 1860 and 1870 PUMS have been adjusted to be representative of the total population. 
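
As a rough illustration of why the oversample needs weights, the sketch below assumes that a person's weight is simply the inverse of the dwelling's sampling fraction (100 for 1-in-100 dwellings, 200 for 1-in-200 dwellings). This is an assumption made for the example; the PERWT values actually distributed may include further adjustments. The function names and the input format are hypothetical.

```python
# Hypothetical sketch: inverse-probability weights for a two-stratum sample.
# Assumption: weight = 1 / sampling fraction. Actual PERWT values may differ.

def design_weight(dwelling_has_black_resident: bool) -> int:
    """Return the assumed weight for one sampled person in the oversample."""
    return 100 if dwelling_has_black_resident else 200


def weighted_population(persons) -> int:
    """Sum the assumed weights over an iterable of per-person stratum flags."""
    return sum(design_weight(flag) for flag in persons)


# Three sampled persons from oversampled dwellings, two from other dwellings.
sample = [True, True, True, False, False]
estimate = weighted_population(sample)  # 3 * 100 + 2 * 200
```

Without weights, the three persons from oversampled dwellings would count for 60 percent of this toy sample; with weights they account for 300 of an estimated 700 persons.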

August 20, 1998 -- Revised IPUMS-98 database released. 

August 20, 1998 -- AGE Allocations 1850-1920. There was an error in the missing data allocation procedure for AGE affecting all pre-1940 samples. Since age is used as a predictor in many other allocations, constructed variables, and universe checks, the frequencies for many variables in the earlier samples have changed slightly from the original iteration of IPUMS-98. 

August 20, 1998 -- Split YRSINUSA into two separate variables--YRSUSA1 and YRSUSA2-- to enhance compatibility over time. YRSUSA1 (columns 145-146 in the raw data files) contains the unrecoded continuous measure of years in the U.S. from the 1900-1920 samples. YRSUSA2 recodes 1900-1920 and 1970-1990 into five intervals compatible among all sample intervals. Users desiring greater detail on the original 1970-1990 intervals can refer to YRIMMIG, which retains all of the original detail recorded in the variable discussion. Documentation change: the universe for 1980 should have excluded foreign-born persons who were citizens at birth. 
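
For users reading the raw files directly, the column reference above can be turned into a slice. This is a minimal sketch that assumes the raw files are fixed-width text records and that the stated columns are 1-based and inclusive; the function name is hypothetical, and the value is returned as an uninterpreted two-character string because the coding of the field is not described here.

```python
# Hypothetical sketch: pull the YRSUSA1 field out of one raw record.
# Assumption: fixed-width text, columns 145-146 counted from 1, inclusive.

YRSUSA1_START = 145  # first column, 1-based
YRSUSA1_END = 146    # last column, 1-based, inclusive


def read_yrsusa1(record: str) -> str:
    """Return the two raw characters stored in columns 145-146."""
    if len(record) < YRSUSA1_END:
        raise ValueError("record is too short to contain columns 145-146")
    # Convert 1-based inclusive columns to a 0-based half-open slice.
    return record[YRSUSA1_START - 1:YRSUSA1_END]


# A made-up record: 144 filler characters, then the field, then more filler.
example_record = "x" * 144 + "12" + "y" * 20
value = read_yrsusa1(example_record)
```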

August 20, 1998 -- OCCSCORE, SEI. In 1850-1870, laborers who were changed via logical edit to farm laborers (i.e., they lived on a farm) continued to receive the OCCSCORE and SEI for laborers. They will now receive the scores for farm laborers. The original 1900 sample incorrectly classified many domestics as "service workers, nec" in their original 1950 occupation classification. The IPUMS fixed the occupational code, but neglected to assign the appropriate SEI and OCCSCORE values for the new occupation. This has been rectified. 

August 20, 1998 -- RACE. In 1990, persons who indicated Hispanic origin were recoded out of "other race, nec" in the race variable into the category "Spanish write-in." Persons of Mexican origin were mistakenly excluded from this recode. This is now fixed. 

August 20, 1998 -- PERWT and HHWT in 1990. Previously, the IPUMS adjusted the 1990 weights so that the total weighted sample would yield the same population count as the published census returns. We removed this programming, since users could not reverse this change if they desired to, and because there seemed no reason to assert the accuracy of the 1990 count at this level of detail. 

August 20, 1998 -- CITYPOP, SIZEPL. In 1980, households in New York City received the code for "not identifiable" (codes 00000, 00) in the city population variables. New York can be identified, and we have changed the population codes accordingly. 

August 20, 1998 -- ANCESTR1 and ANCESTR2. An error in the 1990 PUMS documentation slipped into the IPUMS. Anyone with a code of 0324 (West German) should have been coded 0460 (Greek). This is now fixed. 

August 20, 1998 -- MBPL, FBPL. In 1970, recoded "U.S. possessions, n.s." to match the documentation (code 12091); it was incorrectly coded 13000 in the data. 

August 20, 1998 -- YRIMMIG documentation change: the universe for 1980 excludes foreign-born persons who were citizens at birth. Changed 969 code to 970; it refers to 1965-1970, not 1965-1969. Added 914, which refers to the period before 1915 in the 1970 sample. We also changed the data, recoding 969 to 970. 

August 20, 1998 -- EDUCREC and HIGRADE. In 1980, N/A (under age 3) and "no schooling" were combined. We have separated them. 

August 20, 1998 -- BPL. In 1850, some persons with a birthplace of Iowa should have been coded as being born in Indiana (a confusion over the interpretation of the abbreviation "IA"). We have added programming to separate these codes. 

August 20, 1998 -- CLASSWKR. Removed new workers (persons looking for work but who have never obtained their first job) from the universe for 1940 and 1950 in order to increase compatibility. In 1990, reassigned unemployed persons who last worked over five years ago to the N/A category. In all years, the relevant information is preserved in other variables (EMPSTAT and YRSLASTWK). 

August 20, 1998 -- IND1950. The original 1940 sample contained an undocumented industry category. We determined that this is the category for "miscellaneous machinery" (code 358). The IPUMS had coded this category to "office and store machines" (code 357); we have recoded it to 358. In addition, the IND (contemporary industry classification) appendix for 1940 did not document this category. It has been added to the documentation. 

May 20, 1998 -- OCC, OCC1950, FARM. Fixed a significant error in occupation coding in the 1860 sample (which also affected 1870, though to a much lesser degree). The missing data allocation procedure changed most persons with a blank response (no occupation) to having an occupation. This greatly overstated female occupational responses in 1860, particularly for married women. Since FARM status is inferred from occupation, and many of the allocated cases were farmers, the 1860 and 1870 samples overstated the number of farms. Both the 1860 and 1870 samples have been reconstructed to rectify this problem. 

Early March, 1998 -- Changed weights in "small" and "tiny" samples to be representative of total population. 

Early March, 1998 -- Created a new Flat 1990 sample. 

February 17, 1998 -- Changed the weights in the 1860 and 1870 files to account for oversample of blacks. 

January, 1998 -- IPUMS-98 is available.  For prior revisions, see Changes from IPUMS-95 to IPUMS-98.
 

DOCUMENTATION

August 6, 1999 -- New version of the downloadable hypertext documentation available. This version replicates nearly all of the pages of the web site (current as of August 6).

March 23, 1999 -- New version of the downloadable hypertext documentation available. 

September 16, 1998 -- Created a new version of the downloadable hypertext documentation.  In addition to many new hypertext links in Volume 1: User's Guide, this version includes previously unavailable sections of Volume 2: User's Guide Supplement and Volume 3: Counting the Past.

August 20, 1998 -- Split YRSINUSA into two separate variables--YRSUSA1 and YRSUSA2-- to enhance compatibility over time (see Data section).  Documentation change: the universe for 1980 should have excluded foreign-born persons who were citizens at birth. 

August 20, 1998 -- YRIMMIG documentation change: the universe for 1980 excludes foreign-born persons who were citizens at birth. Changed 969 code to 970; it refers to 1965-1970, not 1965-1969. Added 914, which refers to the period before 1915 in the 1970 sample. We also changed the data, recoding 969 to 970. 

August 20, 1998 -- MIGTYPE5 and PWTYPE were incorrectly labeled as being available in the 1990 5% sample (only a documentation issue). 

March 25, 1998 -- Changed the METRO availability box in the variable description. The variable METRO is available only in the 1970 "State" samples, not the "Metro" (county group) samples, as indicated in the variable availability box on page 1.11.13. 

March 19, 1998 -- Created first version of downloadable hypertext documentation system. 
 

EXTRACT SYSTEM
March 24, 1998 -- Made a significant, if somewhat subtle, change to the way the extraction system works. Altered the extraction system to zero out any variables that were "stacked" in the same column location as a requested variable. Previously, if you selected a variable that was not available in every sample chosen for extraction, the system would include whatever other variable was located in those columns in the raw IPUMS data files. For example, if you selected 1880 along with more modern samples and requested the variable Migration Status, 5 Years, the system would include the alphabetic data from the 1880 variable Last Name in those same extract columns. This caused considerable confusion among users.
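
The behavior described above can be sketched roughly as follows. This is not the extract system's actual code: the layout table, column positions, variable names, and the choice of 1990 as the "more modern" sample are all hypothetical, chosen only to show two variables stacked in the same columns and the zero-fill applied when the requested variable is unavailable in a sample.

```python
# Hypothetical sketch of zeroing out "stacked" variables during extraction.
# Assumption: every name, column position, and sample list below is invented.

LAYOUT = {
    # Both variables occupy the same raw columns (0-based, half-open slice).
    "migration_status_5yr": {"cols": (0, 5), "samples": {"1990"}},
    "last_name": {"cols": (0, 5), "samples": {"1880"}},
}


def extract_field(record: str, sample: str, variable: str) -> str:
    """Return the requested field, or zeros if the sample lacks the variable."""
    start, end = LAYOUT[variable]["cols"]
    if sample not in LAYOUT[variable]["samples"]:
        # New behavior: do not leak whatever other variable shares the columns.
        return "0" * (end - start)
    return record[start:end]


# An 1880 record holds a last name in the shared columns.
old_record = "SMITH"
leaked_before_fix = old_record[0:5]
zeroed_after_fix = extract_field(old_record, "1880", "migration_status_5yr")
```

In this toy case the earlier behavior would have returned the name characters in the migration columns, while the revised behavior returns a zero-filled field of the same width.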