Census 2000 TIGER/Line(R) Files Technical Documentation

Chapter 5: Data Quality

This section provides detailed information on the lineage, positional accuracy, attribute accuracy, logical consistency, and completeness of the TIGER/Line(R) files. Data users can use this information to help evaluate the adequacy and applicability of this geographic file for a particular use.

Lineage

Geometric Properties

Source codes that specify the original digital source of complete chains in the TIGER/Line files are listed in the Sources section of this chapter. These codes cover the source categories in the Census TIGER(R) data base: initial source, pre-1990 computer operations, office operations, enumerator operations, local official updates, post-1990 census updates, and pre-2000 computer operations.

The initial sources used to create the Census TIGER data base were the USGS 1:100,000-scale Digital Line Graph (DLG), USGS 1:24,000-scale quadrangles, the U.S. Census Bureau's 1980 geographic base files (GBF/DIME-Files), and a variety of miscellaneous maps for selected areas outside the contiguous 48 states. The DLG coverage is extensive, albeit of variable currency, and comprises most of the rural, small city, and suburban area of the TIGER/Line files. GBF/DIME-File coverage areas were updated through 1987 with the manual translation of features from the most recent aerial photography available to the U.S. Census Bureau.

In order to maintain a current geographic data base from which to extract the TIGER/Line files, the U.S. Census Bureau uses various internal and external procedures to update the Census TIGER data base. While it has made a reasonable and systematic attempt to gather the most recent information available about the features this file portrays, the U.S.
Census Bureau cautions users that the files are no more complete than the source documents used in their compilation, the vintage of those source documents, and the translation of the information on those source documents.

The U.S. Census Bureau has added, to the Census TIGER data base, the enumerator updates compiled during Census 2000 operations. The updates came from map annotations made by enumerators as they attempted to locate living quarters by traversing every street feature in their assignment area. The U.S. Census Bureau digitized the enumerator updates directly into the Census TIGER data base without geodetic controls or the use of aerial photography to confirm the features' locational accuracy.

The U.S. Census Bureau also made other corrections and updates, supplied by local participants in various U.S. Census Bureau programs, to the Census TIGER data base. Local updates originated from map reviews by local government officials or their liaisons and local participants in U.S. Census Bureau programs. Maps were sent to participants for use in various census programs, and some maps were returned with update annotations and corrections. The U.S. Census Bureau generally added the updates to the Census TIGER data base without extensive checks. Changes made by local officials do not have geodetic control.

Projection and Datum

The TIGER/Line data are not in a mapping projection even though most of the features were scanned directly from source maps (usually the U.S. Geological Survey (USGS) 1:100,000-scale topographic quads) that were in a projection.

For the lower 48 states, most information in the TIGER data base outside the urban centers was derived from the USGS 1:100,000-scale digital line graphs, which were vectorized from the digital scanning of the original artwork. The original artwork was in Universal Transverse Mercator (UTM) projection.
After the map sheets were scanned, the coordinates were transformed from UTM into projectionless geographic coordinates of latitude and longitude.

For most urban centers, the information in TIGER was derived from the GBF/DIME files produced for the 1980 Census. The coordinates in the GBF/DIME files were based on the Census Bureau's Metropolitan Map Series (MMS) map sheets, originally developed for the 1970 Census, and subsequently updated by local planning agencies as well as the U.S. Census Bureau. The MMS map sheets developed after the 1970 Census were based on USGS 7.5-minute topographic quadrangles, enlarged to 1:19,200 and rescribed.

There were a variety of other sources used in creating the Census TIGER data base. The features from those sources also were stored as latitude and longitude coordinates. Subsequent updates to the Census TIGER data base also came from a variety of sources, including paper maps annotated in the field and subsequently digitized without rigorous adherence to a projection or coordinate system.

The information in TIGER for Puerto Rico originally was derived by digitizing the USGS 1:20,000-scale topographic quadrangles. The information in TIGER for Hawaii was based on the GBF/DIME files and available USGS maps for the state. The information in TIGER for Alaska and the Island Areas originally was developed by digitizing USGS 1:24,000 and 1:63,360 topographic quadrangles and other available sources, including some developed for use in World War II.

In the 1995 and later TIGER/Line files, NAD83 is the coordinate datum used for the 48 contiguous states, the District of Columbia, Alaska, Puerto Rico, and the Virgin Islands of the United States. Regional datums are used for Hawaii and the Pacific Island Areas. NAD27 was the coordinate datum used for the 1994 and earlier versions of the TIGER/Line files except in Hawaii and the Pacific Island Areas, where regional datums were used. Because the datum used was not relevant to the U.S.
Census Bureau's purposes for creating maps, the documentation did not record the specific datum of the source material for Hawaii and the Pacific Island Areas.

Sources

In the TIGER/Line files, there is a one-character alphanumeric source code for complete chain and landmark features. Source codes identify the original (or final, if historical) operation that created the geographic object and its geometric properties. The U.S. Census Bureau has revised the source codes appearing in the Census 2000 TIGER/Line files to better describe for data users when a feature was introduced into the Census TIGER data base.

Source Codes

Value   Description
blank   Not Documented Elsewhere
A       Updated 1980 GBF/DIME-File
B       USGS 1:100,000-Scale DLG-3 File
C       Other USGS Map
J       Pre-1990 Census Updates
K       Post-1990 Census Updates (1990-1994)
L       Pre-Census 2000 Local Official Updates (1995-Census 2000)
M       Pre-Census 2000 Field Operations (1995-Census 2000)
N       Pre-Census 2000 Office Update Operations (1995-Census 2000)
O       Post-Census 2000 (2000-2002)

Source Code Record Locations

Record Type   Field Name   Description
1             SOURCE       Linear Segment Source Code
7             SOURCE       Source or First Source Code to Update
9             SOURCE       Source or First Source Code to Update
H             HIST         History or Last Source Code to Update
H             SOURCE       Source or First Source Code to Update

Address Ranges and ZIP Codes

The TIGER/Line files contain potential address ranges and ZIP Codes for most areas of the United States where house number-street name style address ranges exist. Residential addresses from the 1990 decennial census master list of addresses, the Address Control File (ACF), were converted to address ranges and matched into TIGER using an address range creation formula for all counties. The original TIGER address ranges were matched, then merged with the ACF-derived address ranges, producing a single set of integrated address ranges in the TIGER data base.
Subsequently, during the 1990 ACF Match/Merge operation, the ranges were integrated and many address range conflicts were resolved. Further address range edits eliminated or isolated additional overlaps.

For Census 2000, the U.S. Census Bureau compared the address information in the Master Address File (MAF) to the existing address ranges in Census TIGER, expanding, creating, or modifying the TIGER address ranges where necessary. Updated address information also was obtained from the U.S. Postal Service (USPS), Census 2000 field operations, and Census 2000 local participant programs and inserted into Census TIGER.

ZIP Codes were originally derived from two sources: those already existing in the Census TIGER data base and those derived from the 1990 ACF. Address ranges created from the ACF may have non-city delivery ZIP Codes. This situation typically occurs in smaller places where structure numbers exist and appear in the ACF, but are not used in mail delivery.

The U.S. Census Bureau updated and corrected ZIP Codes in the early 1990s by matching the Census TIGER data base with an updated USPS ZIP+4 file for the 50 states and the District of Columbia. The 5-digit ZIP Code and street name were used as keys to match address ranges from the TIGER data base to corresponding address ranges in the ZIP+4 file. Where a match occurred, the ZIP Add-On (Plus 4) code was added to the TIGER address range record. Clerical updates improved five-digit ZIP Code coverage and eliminated the illegal five-digit ZIP Codes and three-digit ZIP Codes. Additional matching between the ZIP+4 file and the Census TIGER data base occurs during the normal course of operations to maintain the address range and five-digit ZIP Codes in Census TIGER.

Census Feature Class Codes

All generic CFCCs (A10, A20, A30, and A40) were changed to more descriptive CFCCs.
For example, an A40 (local, neighborhood, and rural road, major category used alone when the minor category could not be determined) was changed to the more descriptive CFCC of A41 (unseparated local, neighborhood, and rural road).

The census feature classifications of roads were redefined to agree more closely with customary use and to be more useful to transportation planners. Thus, all road classifications were reduced to a local or neighborhood road unless the road had a highway route number. The classification was then based on the highway route number.

Feature Identifiers

Highway Route Numbers

The U.S. Census Bureau updated the feature identifiers (FIDs) and census feature class codes (CFCCs) for all interstates, limited access roads, U.S. highways, and state highways in all counties in the United States. The FIDs of highways were entered in the Census TIGER data base using the following rules:

- If an interstate also was known by a local name, the interstate route number was entered as the primary name of the interstate and the local name was entered as the alternate name.
- If a U.S. highway or state highway was known by a route number as well as by a local name, the local name was entered as the primary name, and the highway route number was entered as the alternate name.

Military Installation Names

The U.S. Census Bureau standardized most military installation names to match Department of Defense information.

National Park Service Area Names

The U.S. Census Bureau used information to standardize the names of all areas within the jurisdiction of the National Park Service, most importantly, the complete set of National Parks and National Monuments.

Positional Accuracy

The U.S. Census Bureau's mission to count and profile the Nation's people and institutions does not require very high levels of positional accuracy in its geographic products. Its files and maps are designed to show only the relative positions of elements.
Coordinates in the TIGER/Line files are in decimal degrees and have six implied decimal places. The positional accuracy of these coordinates is not as great as the six decimal places suggest. The positional accuracy varies with the source materials used, but at best meets the established National Map Accuracy standards (approximately +/- 167 feet) where 1:100,000-scale maps from the USGS are the source. The U.S. Census Bureau cannot specify the accuracy of feature updates added by its field staff or of features derived from the GBF/DIME-Files or other map or digital sources. Thus, the level of positional accuracy in the TIGER/Line files is not suitable for high-precision measurement applications such as engineering problems, property transfers, or other uses that might require highly accurate measurements of the earth's surface.

Despite the fact that TIGER/Line data positional accuracy is not as high as the coordinate values imply, the six-decimal place precision is useful when producing maps. This precision allows you to place features that are next to each other on the ground in the correct position, relative to each other, on the map without overlap.

Attribute Accuracy

Topological Properties

The attribute accuracy of the TIGER/Line files is as precise as the source used during the creation or update of the Census TIGER data base. Accuracy statements on the Census TIGER data base are based on deductive estimates; no specific field tests for attribute accuracy have been conducted on the files. However, updates or corrections resulting from normal U.S. Census Bureau field operations are entered into the Census TIGER data base. In addition, quality checks are conducted to verify clerical transcription of data from source materials. Based on past experience, attribute codes match the source materials with less than a two-percent error.

The feature network of complete chains (as represented by Record Types 1 and 2) is complete for census purposes.
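As an illustration of the coordinate format described under Positional Accuracy, the following minimal Python sketch decodes a value with six implied decimal places and compares the size of the last stored digit with the stated accuracy of approximately +/- 167 feet. The function name, the sample values, and the assumption that a coordinate arrives as signed integer text are hypothetical, and the meters-per-degree figure is a rough approximation rather than a value from this documentation.

```python
# Sketch only: assumes a coordinate is stored as signed integer text with
# six implied decimal places, e.g. "-77036530" meaning -77.036530 degrees.
# The sample values below are invented for illustration.

FEET_PER_METER = 3.28084
METERS_PER_DEGREE_LAT = 111_320.0   # rough approximation (assumption)
STATED_ACCURACY_FEET = 167.0        # "approximately +/- 167 feet" from the text


def implied_to_degrees(field: str, implied_places: int = 6) -> float:
    """Convert signed integer text with implied decimals to decimal degrees."""
    return int(field.strip()) / 10 ** implied_places


# Ground distance (north-south) represented by one unit in the sixth decimal.
step_feet = 1e-6 * METERS_PER_DEGREE_LAT * FEET_PER_METER

longitude = implied_to_degrees("-77036530")
latitude = implied_to_degrees("+38889900")

print(f"longitude = {longitude:.6f}, latitude = {latitude:.6f}")
print(f"one unit in the last stored digit is about {step_feet:.2f} feet")
print(f"stated accuracy is about {STATED_ACCURACY_FEET / step_feet:.0f} times coarser")
```

Under these assumptions the last stored digit corresponds to roughly a third of a foot on the ground, which is far finer than the stated positional accuracy. The stored precision is therefore useful for keeping neighboring features in the correct relative position, not as an indication of measured accuracy.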
Data users should be aware that on occasion they may not be able to trace a specific feature by name or by census feature class code (CFCC) as a continuous line throughout the TIGER/Line files without making additional edits. For example, State Highway 32 may cross the entire county. The TIGER/Line files will contain complete chains in the file at the location of State Highway 32, but the complete chains may individually have one of a collection of local names such as S Elm Street, or Smallville Highway, with or without State Highway 32 as an alternate. The most frequent CFCC for a state highway is A21, but the complete chains at the location of State Highway 32 may have a variety of class codes such as A01, A41, or A21. Recent edits have reduced this problem, but not eliminated it.

Boundaries and Geographic Entity Codes

The U.S. Census Bureau collects and tabulates information for both legal and statistical entities. Record Type 1 mainly identifies the boundaries and codes for the legal entities reported to the U.S. Census Bureau to be legally in effect as of the Census 2000 Boundary and Annexation Survey. Record Types 3 and A generally contain the 1990 census tabulation geographic boundaries and codes for those entities.

Most legal boundaries are based on the annotations made by local officials in response to the U.S. Census Bureau's Boundary and Annexation Surveys. The boundary information in the TIGER/Line files is for statistical data collection and tabulation purposes only; its depiction and designation for statistical purposes does not constitute a determination of jurisdictional authority or rights of ownership or entitlement.

Local data users generally define and delineate statistical entities following U.S. Census Bureau guidelines. However, there are several exceptions:

- The U.S. Census Bureau defines Urbanized Areas (UAs) based strictly on technical considerations.
- The U.S.
Census Bureau defines ZIP Code Tabulation Areas (ZCTAs) through an automated process utilizing addresses in the TIGER data base and the Master Address File (MAF).
- State Departments of Education delineate school districts.
- The designated liaison for the Redistricting Data Program supplies Voting Districts (VTDs) and State Legislative Districts (SLDs).
- Metropolitan Planning Organizations or State Departments of Transportation define Traffic Analysis Zones (TAZs).

The USGS maintains the file that is published as FIPS 55. The U.S. Census Bureau uses the file for coding American Indian/Alaska Native Areas, county subdivisions, consolidated cities, places, and sub-MCDs. In preparation for Census 2000, the U.S. Census Bureau and the USGS cooperatively edited the FIPS 55 file to ensure alphabetical sorting and data consistency. As a result, changes were made to the FIPS 55 codes and related class codes. These changes, plus codes for new Census 2000 entities, appear in Record Type C.

Other attribute data in the TIGER/Line files were gathered from many sources. The U.S. Census Bureau's staff linked the attribute information to the spatial framework of features. Most procedures for gathering the needed attributes were clerical. The quality of these attributes was ensured by various tests conducted before, during, and after the time that the attribute information was entered into the Census TIGER data base. Tests included source material selection and evaluation checks, quality control checks on staff work, independent reviews by local and tribal leaders of maps produced from the Census TIGER data base, and staff reviews of computer-performed operations.

Address Ranges and ZIP Codes

The conversion from the GBF/DIME-Files to the TIGER format involved neither verification of previously existing address ranges nor any significant updates or corrections.
Prior to the release of the 1992 TIGER/Line files, the address ranges for an area were generally the same as those in the corresponding 1980 GBF/DIME-File. The 1992 TIGER/Line files included ACF address ranges for existing and new features identified during census operations.

Address ranges and ZIP Codes were verified and coverage extended for Census 2000 through the use of the Master Address File (MAF). The MAF is closely linked to the Census TIGER data base. Local address lists and addresses from the U.S. Postal Service supplement the MAF. Through an automated matching process, addresses in the MAF were compared to existing address ranges in the Census TIGER data base, creating or modifying the TIGER address ranges where necessary.

Feature Identifiers

A national consistency review of all feature names in the Census TIGER data base was performed by running a revised name standardizer on all feature identifiers. An additional benefit was the removal of nonstandard characters and punctuation from the names. To improve accuracy, road names in the Census TIGER data base were compared with street names in the ZIP+4 file from the U.S. Postal Service. Errors in feature directionals or feature types were corrected in the Census TIGER data base.

Logical Consistency

Node-line-area relationships satisfy topological requirements. These requirements include the following:

- Complete chains must begin and end at nodes.
- Complete chains must connect to each other at nodes.
- Complete chains do not extend through nodes.
- Left and right polygons are defined for each complete chain element and are consistent for complete chains connecting at nodes.
- Complete chains representing the limits of a file are free from gaps.

The U.S. Census Bureau performed automated tests to ensure logical consistency and limits of file. Some polygons in the TIGER/Line files are so small that the polygon internal point has been manually placed on a node that defines the polygon perimeter. The U.S.
Census Bureau uses its internally developed Geographic Update System to enhance and modify spatial and attribute data in the Census TIGER data base.

The Census TIGER data base has two generations of currency in geographic areas. These are the 1990 census areas and the Census 2000 areas. The boundaries of geographic areas are affected by the location, type, and number of areas. To prepare for Census 2000, those features used only as boundaries in the 1980 census were deleted. The deletions lowered the overall count of complete chains and polygons.

Standard geographic codes, such as FIPS codes for states, counties, municipalities, and places, are used when encoding spatial entities. The U.S. Census Bureau performed spatial data tests for logical consistency of the codes during the compilation of the original Census TIGER data base files. Most of the codes themselves were provided to the U.S. Census Bureau by the U.S. Geological Survey (USGS), the agency responsible for maintaining FIPS 55.

Completeness

The GBF/DIME-Files and the USGS's DLG were the two main sources of spatial attribute data. Data for a given category contain attribute codes that reflect the information portrayed on the original source. The TIGER/Line files also use the U.S. Census Bureau's internal coding scheme, which in some cases parallels the FIPS codes.

The feature network of complete chains is complete for census purposes. For the 1990 census and Census 2000, census enumerators identified new and previously unreported street features for the entire Nation during a series of decennial census operations. In some areas, local officials reviewed the census maps and identified new features and feature changes.

The TIGER/Line files contain limited point and area landmark data. The enumerator updates for decennial censuses do not stress landmark features. Computer file matching and automated updates from the Economic and Agriculture censuses added landmarks and key geographic locations (KGLs).
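Two of the node-line-area requirements listed under Logical Consistency can be illustrated with a small check. The Python sketch below is not the U.S. Census Bureau's test procedure; the `Chain` structure, its field names, and the convention that `None` marks the area outside the file are all assumptions made for illustration. It verifies that every complete chain begins and ends at a known node, and that the chains forming the limits of the file close up without gaps (every node on the file limit is touched by an even number of limit chains).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Chain:
    """Hypothetical, simplified complete chain (not a TIGER/Line record layout)."""
    chain_id: int
    start_node: Optional[int]
    end_node: Optional[int]
    left_poly: Optional[int]    # None = outside the file (assumed convention)
    right_poly: Optional[int]


def check_chains(chains: Iterable[Chain], nodes: Set[int]) -> List[str]:
    """Return a list of problem descriptions; an empty list means both checks pass."""
    chains = list(chains)
    problems = []

    # Requirement: complete chains must begin and end at nodes.
    for c in chains:
        for endpoint in (c.start_node, c.end_node):
            if endpoint is None or endpoint not in nodes:
                problems.append(
                    f"chain {c.chain_id}: endpoint {endpoint!r} is not a known node")

    # Requirement: chains on the limits of the file are free from gaps.
    # A limit chain has exactly one side outside the file. In a closed
    # limit, each node is touched by an even number of limit-chain ends.
    degree = Counter()
    for c in chains:
        if (c.left_poly is None) != (c.right_poly is None):
            for endpoint in (c.start_node, c.end_node):
                if endpoint is not None:
                    degree[endpoint] += 1
    for node, count in degree.items():
        if count % 2:
            problems.append(f"node {node}: gap in the file limit at this node")

    return problems


# A square file limit around polygon 1, made of four chains.
nodes = {1, 2, 3, 4}
square = [
    Chain(1, 1, 2, None, 1),
    Chain(2, 2, 3, None, 1),
    Chain(3, 3, 4, None, 1),
    Chain(4, 4, 1, None, 1),
]

print(check_chains(square, nodes))      # closed limit
print(check_chains(square[:3], nodes))  # one limit chain missing
```

Dropping one of the four chains leaves two nodes where the limit does not close, which the second check reports. The remaining requirements, such as left and right polygon consistency around each node, need the ordering of chains around nodes and are not covered by this sketch.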