Census 2000 TIGER/Line(R) Files Technical Documentation

Chapter 1: Overview and Geographic Concepts

Overview

What Is TIGER(R)?

The U.S. Census Bureau's Census TIGER System automates the mapping and related geographic activities required to support the decennial census and sample survey programs of the U.S. Census Bureau starting with the 1990 decennial census. The Census TIGER System provides support for the following:

  - Creation and maintenance of a digital geographic data base that includes complete coverage of the United States, Puerto Rico, the Virgin Islands of the United States, and the Pacific Island Areas
  - Production of maps from the Census TIGER data base for all U.S. Census Bureau enumeration and publication programs
  - Ability to assign individual addresses to geographic entities and census blocks based on polygons formed by features such as roads and streams

The design of the Census TIGER data base adapts the theories of topology, graph theory, and associated fields of mathematics to provide a disciplined, mathematical description for the geographic structure of the United States and its territories. The topological structure of the Census TIGER data base defines the location and relationship of streets, rivers, railroads, and other features to each other and to the numerous geographic entities for which the U.S. Census Bureau tabulates data from its censuses and sample surveys. It is designed to ensure that there is no duplication of features or areas.

The building of the Census TIGER data base involved a variety of encoding techniques such as automated map scanning, manual map digitizing, standard data keying, and sophisticated computer file matching. The goal was to provide automated access to, and retrieval of, relevant geographic information about the United States and its territories.

TIGER Data Base Extracts

In order for others to use the information in the Census TIGER data base in a geographic information system (GIS) or for other geographic applications, the U.S. Census Bureau releases periodic extracts of the data base, including the TIGER/Line(R) files, to the public. Various versions of the TIGER/Line files have been released; previous versions include the 1990, 1992, 1994, 1995, 1997, 1998, 1999, and Redistricting Census 2000 TIGER/Line files.

The 1992 TIGER/Line files were produced to satisfy a requirement of the U.S. Department of Education and incorporated all of the updates and revisions since the production of the 1990 TIGER/Line files. The 1994 TIGER/Line files were produced to support the programs of the U.S. Department of Transportation, Bureau of Transportation Statistics. The 1995 TIGER/Line files were originally produced to support Phase I of the Census 2000 Redistricting Data Program. The 1997 TIGER/Line files were originally produced to support the Phase I Verification of the Census 2000 Redistricting Data Program and the Census 2000 Participant Statistical Areas Program Delineation. The 1998 TIGER/Line files were originally produced to support the Census 2000 Redistricting Data Program, Phase 2, the Voting District Project (VTDP) and the Census 2000 Traffic Analysis Zone (TAZ) Program. The original purpose of the 1999 TIGER/Line files was to support the Phase 2 Verification of the Census 2000 Redistricting Data Program and the verification of the Census 2000 Participant Statistical Areas and Census 2000 Traffic Analysis Zone (TAZ) Programs. The Redistricting Census 2000 version of the TIGER/Line files was the official version of the TIGER/Line files delivered to the official recipients under Public Law 94-171 and to redistricting officials in the District of Columbia and the Commonwealth of Puerto Rico.

Relationship of TIGER/Line to Census 2000 Statistical Data

What makes the TIGER extract products particularly valuable in the GIS environment and to the data user community is the direct linkage between the Census 2000 decennial census data products and the Census TIGER data base extracts. The digital description in the TIGER data base of the Nation's legal and statistical entities includes Federal Information Processing Standards (FIPS) codes and, for selected geographic entities, U.S. Census Bureau codes so entities can be easily matched with the Census 2000 census data.

Census 2000 TIGER/Line Files

The Census 2000 TIGER/Line files include files for all counties and statistically equivalent entities in the United States as well as files for Puerto Rico and the Island Areas. The Census 2000 TIGER/Line files consist of line segments that represent physical features, and legal and statistical boundaries. The files consist of 17 separate record types, including the basic data record, the shape coordinate points (feature shape records), and geographic entity codes that can be used with appropriate software to prepare maps.

Related Files

Summary Files (SFs) provide Census 2000 statistical data for a wide range of subject headings and geographic entities compatible with the TIGER/Line files. These files are available on the Internet and CD-ROM.

Census 2000 Redistricting Data Summary Files provide selected Census 2000 population data for small area geography (state, county, county subdivision, place, census tract, block group, and block) and are compatible with the TIGER/Line files. These files are available on the Internet and CD-ROM.

County-Based Files

The geographic coverage for a TIGER/Line file is a county or statistically equivalent entity. See Appendix A for a list of state and county codes and Chapter 4 for a description of counties and statistically equivalent entities.
The county files have a coverage area based on the legal boundaries obtained in response to the U.S. Census Bureau's Census 2000 Boundary and Annexation Survey (BAS). Even though the Census TIGER data base represents a seamless national file with no overlaps or gaps between parts, the county-based TIGER/Line files are designed to stand alone as an independent data set. The files can be combined to cover the whole Nation and its territories (see the Single-Side Flags and County Boundaries section in Chapter 3).

The Data Content of the TIGER/Line Files

The TIGER/Line files contain data describing three major types of features:

Line features
  1) Roads
  2) Railroads
  3) Hydrography
  4) Miscellaneous transportation features and selected power lines and pipelines
  5) Boundaries

Landmark features
  1) Point landmarks such as schools and churches
  2) Area landmarks such as parks and cemeteries
  3) Key geographic locations (KGLs) such as shopping centers and factories

Polygon features
  1) Geographic entity codes for areas used to tabulate the Census 2000 statistical data
  2) Locations of area landmarks
  3) Locations of KGLs

The line feature and polygon information form the majority of data in the TIGER/Line files. Some of the data describing the lines include coordinates, feature identifiers (names), feature classification codes, address ranges, and geographic entity codes. Chapter 3 details these data items; Chapter 4 defines the geographic entities and codes.

The TIGER/Line files contain point and area labels that describe landmark features. These features provide locational references for field staff and map users. Area landmarks consist of a feature name or label and feature type assigned to a polygon or group of polygons. Landmarks may overlap or refer to the same set of polygons. See Chapter 3 for more information on landmark data.

Topology and Spatial Objects in the TIGER/Line Files

Spatial Objects in the TIGER/Line Files

The Census TIGER data base uses a collection of spatial objects (points, lines, and polygons) to model or describe real-world geography. The U.S. Census Bureau uses these spatial objects to represent features such as streets, and assigns attributes to these features to identify and describe specific features such as the 500 block of Market Street in Philadelphia, Pennsylvania.

The TIGER/Line files contain information about the spatial objects distributed over a series of record types. Users of the TIGER/Line files may need to link information from several record types to find all the attributes of interest that belong to one spatial object. The final section of this chapter includes a description of the record types.

Topology

Topology explains how points, lines, and areas relate to each other and is used as the foundation for organizing spatial objects in the Census TIGER data base. The Census TIGER data base uses points, lines, and areas to provide a disciplined, mathematical description of the features of the earth's surface.

Spatial objects in the Census TIGER data base are interrelated. A sequence of points defines line segments, and line segments connect to define polygons. Topology provides a basic language for describing geographic features. The Census TIGER data base relates information to points or 0-cells, lines or 1-cells, and polygons or 2-cells. The number preceding the cell identifies the dimensionality of the object; for instance, a line segment has a single dimension, length. Each of these objects builds on the others to form higher-level objects. The 0-cells form the end points of 1-cells. The 1-cells connect at 0-cells and form closed figures that partition space into polygons or 2-cells.
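
The way 1-cells connect at 0-cells to close a figure can be illustrated with a minimal Python sketch. The Chain class, the trace_ring function, and the sample node names are hypothetical and are not part of the TIGER/Line files; the sketch assumes the chains have already been read into memory and only checks whether they link node to node into one closed figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chain:
    """A 1-cell: a line that starts and ends at 0-cells (nodes)."""
    chain_id: int
    start: str  # identifier of the start node (0-cell), hypothetical
    end: str    # identifier of the end node (0-cell), hypothetical


def trace_ring(chains):
    """Walk the chains node to node.

    Returns the sequence of nodes visited if the chains close a figure
    (the boundary of a 2-cell), otherwise None.
    """
    if not chains:
        return None
    remaining = list(chains)
    first = remaining.pop(0)
    ring = [first.start, first.end]
    while remaining:
        current = ring[-1]
        for i, chain in enumerate(remaining):
            # A chain may be traversed in either direction.
            if chain.start == current:
                ring.append(chain.end)
            elif chain.end == current:
                ring.append(chain.start)
            else:
                continue
            remaining.pop(i)
            break
        else:
            return None  # no remaining chain continues from this node
    return ring if ring[0] == ring[-1] else None


# Hypothetical sample data: three nodes joined by three chains.
triangle = [Chain(1, "A", "B"), Chain(2, "C", "B"), Chain(3, "C", "A")]
print(trace_ring(triangle))      # ['A', 'B', 'C', 'A']
print(trace_ring(triangle[:2]))  # None: two chains leave the figure open
```

The walk ignores chain direction on purpose: a chain's start and end nodes fix its left and right sides, not the order in which a boundary must be traced.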

Terminology

The terms point, line segment, and polygon are familiar but general terms that may have different meanings to data users working with a variety of different applications and data sets. The TIGER/Line file documentation uses the terminology from the Spatial Data Transfer Standard (SDTS). Since the first release of the TIGER/Line files, the U.S. Geological Survey (USGS) has coordinated the development and release of the SDTS, now part of the Federal Information Processing Standards (FIPS). The SDTS specifies a series of terms and definitions for spatial objects.

Why use the SDTS terminology? Even though the TIGER/Line files do not follow the SDTS format, the TIGER/Line documentation will use these terms and definitions in order to promote a common language for describing geographic data and to facilitate the transition to the SDTS.

The spatial objects in TIGER/Line belong to the "Geometry and Topology" (GT) class of objects in SDTS. The definitions are from FIPS Publication 173, Spatial Data Transfer Standard (SDTS) (August 28, 1992) Section 2-2, "Classification and Intended Use of Objects," pp. 11-20.

Node
  "A zero-dimensional object that is a topological junction of two or more links or chains, or an end point of a link or chain."

Entity Point
  "A point used for identifying the location of point features (or areal features collapsed to a point), such as towers, buoys, buildings, places, etc."

Complete Chain
  "A chain [a sequence of non-intersecting line segments] that explicitly references left and right polygons and start and end nodes." The shape points combine with the nodes to form the segments that make a complete chain.

Network Chain
  "A chain that explicitly references start and end nodes and not left and right polygons."

GT-Polygon
  "An area that is an atomic two-dimensional component of a two-dimensional manifold, [which is defined as] one and only one planar graph and its two-dimensional objects."
  GT-polygons are elementary polygons that are mutually exclusive and completely exhaust the surface.

Spatial Objects

The spatial objects in the TIGER/Line files embody both geometry (coordinate location and shape) and topology (the relationship between points, line objects, and polygons) and therefore belong to the geometry and topology (GT) class of objects in the SDTS.

In the SDTS, nodes represent point objects (0-cells) that identify the start and end position of lines or 1-dimensional objects (1-cells) called chains. The chains in the TIGER/Line files are complete chains because they form polygon boundaries and intersect other chains only at nodes. Topological chains that do not contain polygon information are network chains. Data users may choose not to use the polygon or geographic entity codes and consider the TIGER/Line files a source of network chain data.

Two or more complete chains can form a central road with a start and end node defining each complete chain. Complete chains may consist of one or more line segments that describe the shape and position of the complete chain. Complete chains that meet at an intersection share the same node. Shape points define the line segments and are not part of the topology of the TIGER/Line files. Shape points and the resulting line segments are attributes of the complete chains.

When complete chains link node to node and form a closed figure (a 2-cell), a GT-polygon results. GT-polygons are elementary units; they are not subdivided into smaller polygons. The polygons completely encompass the area they represent and there is no gap or overlap between adjacent polygons. The geographic entities and area landmarks in the TIGER/Line files are associated with one GT-polygon or a set of GT-polygons.

The TIGER/Line files contain point landmark data that are not included in the Census TIGER data base topology.
Point landmarks are entity points that mark the location of points of interest and are not connected to complete chains or GT-polygons.

The following table summarizes the terms for spatial objects in the TIGER/Line files:

                Point (0-cell)    Line (1-cell)                      Polygon (2-cell)
  Topology      Node              Complete Chain or Network Chain    GT-polygon
  Non-topology  Entity Point
  Attribute     Shape Point

Features

The Census TIGER data base uses the term feature to informally describe spatial objects more complex than nodes, complete chains, or GT-polygons. For instance, Main Street is a feature that may consist of a series of complete chains with the same name. The Census TIGER data base contains complete chains, but does not contain features or link complete chains to features.

Left- and Right-Side Data Fields

If one is standing on a complete chain at the start node facing the end node, data listed in the fields carrying a right qualifier would be found to the right of the complete chain. Notice the position of the start and end nodes for the road in the central section of Figure 1-1; the right side of the complete chain corresponds to GT-polygon 1 and the left side corresponds to GT-polygon 2. From the information contained in this basic record, data users can collect the complete chains necessary to construct intersecting polygons and features.

Single-Layer Topology

All spatial objects in the TIGER/Line files exist in a single data layer that includes roads, hydrography, railroads, boundary lines, and miscellaneous features; they are topologically linked. For instance, nodes mark the intersections of roads and rivers. Subsurface features such as tunnels or above-surface features such as bridges also create nodes when they cross surface features even though there is no direct real-world connection.

Introduction to the TIGER/Line File Structure

The Census 2000 TIGER/Line files are extracts of selected information from the Census TIGER data base, organized as topologically consistent networks.
The records in these TIGER/Line files represent features traditionally found on a paper map. Each complete chain is classified by codes that describe the type of feature it represents.

The Census 2000 TIGER/Line files consist of 17 record types that collectively contain geographic information (attributes) such as address ranges and ZIP Codes and their Add-On codes for street complete chains, names, feature classification codes, codes for legal and statistical entities, latitude/longitude coordinates of linear and point features, landmark features, area landmarks, key geographic locations, and area and polygon boundaries.

Some counties or statistically equivalent entities do not require all of the 17 record types and therefore have fewer than 17 files. If the types of data contained in Record Types 4, 6, 7, 8, 9, and Z are not appropriate for a given county or statistically equivalent entity, then the U.S. Census Bureau does not include files for those record types.

The file for each county (or statistically equivalent entity) is identified by the state and county FIPS code after the "tgr" in the file name (for example, tgr42107.rt1). The suffixes used for the record type files have been changed to make it easier to identify each record type file (when working with uncompressed versions of the county files). The suffix is consistently .rtn, where n is the record type.

The TIGER/Line data dictionary in Chapter 6 contains a complete list of all the fields in the 17 record types. Separate chapters cross-list the fields by feature attribute and geographic entity type. The next section provides a summary of Census 2000 TIGER/Line file record types.

Census 2000 TIGER/Line File Record Types

Record Type 1 - Complete Chain Basic Data Record

Record Type 1 provides a single record for each unique complete chain in the TIGER/Line files. The basic data record contains the end nodes for the complete chain.
This record also contains address ranges and ZIP Codes (for most areas of the country where a street name/house numbering system existed at the time of data extraction from the Census TIGER data base) and the Census 2000 census geographic entity codes for each side of the complete chain. Additional feature identifier, address range, and ZIP Code data related to Record Type 1 are found on Record Types 4, 5, 6, and Z. Additional Census 2000 and 1990 geographic entity codes related to Record Type 1 are found on Record Type 3.

Record Type 2 - Complete Chain Shape Coordinates

Record Type 2 provides an additional series of latitude and longitude coordinate values describing the shape of each complete chain in Record Type 1 that is not a straight line segment. That is, not all complete chains in Record Type 1 have shape points and therefore not all have an associated Record Type 2. Where a complete chain in Record Type 1 is not a straight line, Record Type 2 may have a many-to-one relationship with Record Type 1.

Record Type 3 - Complete Chain Geographic Entity Codes

Record Type 3 includes the Census 2000 U.S. Census Bureau geographic area codes for the American Indian/Alaska Native areas. It also includes 1990 geographic codes for a variety of geographic area types. Record Type 3 has a one-to-one relationship with Record Type 1.

Record Type 4 - Index to Alternate Feature Identifiers

Record Type 4 provides an index to alternate feature names associated with the complete chain (Record Type 1). A Record Type 4 will not exist for a Record Type 1 that has only one name. A complete chain can have more than one alternate name. Record Type 4 has a many-to-one relationship with Record Type 1 and a many-to-one relationship with Record Type 5.

Record Type 5 - Complete Chain Feature Identifiers

Record Type 5 contains a list of all unique feature names for complete chains in the TIGER/Line files. Each name (or feature identifier) has an identification code number (FEAT).
Record Type 5 has a one-to-many relationship with Record Type 4 and a one-to-many relationship with Record Type 9.

Record Type 6 - Additional Address Range and ZIP Code Data

Record Type 6 provides additional address range information for a street complete chain when the information cannot be presented as a single address range (for example, the house/building numbers are not uniformly arranged to form an address range). Record Type 6 appears only for those counties that have address ranges and ZIP Code information in the Census TIGER data base. There is no assurance that the address ranges provided on Record Type 6 will cover fewer addresses than the address ranges appearing on Record Type 1. Data users must use Record Type 6 to obtain the entire picture of the potential address ranges along a complete chain. The address ranges used for geocoding along corporate corridors and corporate offset limits appear only in Record Type 6. Record Type 6 can have a one-to-one or a many-to-one relationship with Record Type 1 and with Record Type Z.

Record Type 7 - Landmark Features

Record Type 7 contains the area and point landmarks from the Census TIGER data base. If Record Type 7 represents an area landmark rather than a point landmark, then a one-to-one relationship exists with Record Type 8. If a county file has no landmarks, Record Types 7 and 8 will not exist for that county. Record Type 7 excludes all key geographic locations (KGLs) that contain an imputed address and have a ZIP+4 Add-On code. These appear in Record Type 9.

Record Type 8 - Polygons Linked to Area Landmarks

Record Type 8 links the polygon identification codes with the area landmark identification codes. If a county file does not have any area landmarks, then there will not be a Record Type 7 or a Record Type 8 for that county. Record Type 8 can have a one-to-one, one-to-many, many-to-one, or many-to-many relationship with Record Type P.
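
The link from an area landmark to its polygons can be sketched in Python as follows. This is an illustrative sketch only: it assumes the records have already been parsed into dictionaries, and the name and internal_point keys, the CENID values, and the sample rows are hypothetical. Only the LAND, CENID, and POLYID link fields come from the record type descriptions above.

```python
from collections import defaultdict

# Hypothetical, already-parsed records. Real TIGER/Line files are text
# files whose field layout is given in the data dictionary (Chapter 6).
rt7 = [
    {"LAND": 1, "name": "City Park"},
    {"LAND": 2, "name": "Oak Cemetery"},
]
rt8 = [
    {"LAND": 1, "CENID": "C0001", "POLYID": 10},
    {"LAND": 1, "CENID": "C0001", "POLYID": 11},
    {"LAND": 2, "CENID": "C0001", "POLYID": 11},
]
rtp = {
    ("C0001", 10): {"internal_point": (-75.1600, 39.9500)},
    ("C0001", 11): {"internal_point": (-75.1700, 39.9600)},
}


def polygons_by_landmark(rt7_rows, rt8_rows, rtp_index):
    """Map each landmark name to the (CENID, POLYID) keys of its polygons."""
    by_land = defaultdict(list)
    for row in rt8_rows:
        key = (row["CENID"], row["POLYID"])
        if key in rtp_index:  # keep only polygons present in Record Type P
            by_land[row["LAND"]].append(key)
    # A point landmark has no Record Type 8 rows and maps to an empty list.
    return {row["name"]: by_land.get(row["LAND"], []) for row in rt7_rows}


links = polygons_by_landmark(rt7, rt8, rtp)
print(links["City Park"])     # [('C0001', 10), ('C0001', 11)]
print(links["Oak Cemetery"])  # [('C0001', 11)]
```

In the sample data polygon 11 belongs to both landmarks, which mirrors the many-to-many relationship that Record Type 8 can have with Record Type P.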

Record Type 9 - Key Geographic Location Features

Record Type 9 consists only of Key Geographic Locations (KGLs) in the Census TIGER data base that have an imputed address and a ZIP+4 Add-On code. This record type lists the names of special geocoding addresses such as shopping centers and airports. To determine the street name associated with the KGL, use the FEAT field to link Record Type 9 to Record Type 5. Use the CENID and POLYID fields to link the KGL to the GT-polygons on Record Types A or S. The KGLs contained in this record type are not included in Record Types 7 or 8, and have no LAND (landmark identification number). Record Type 9 has a one-to-one or many-to-one relationship with Record Type P.

Record Type A - Polygon Geographic Entity Codes

Record Type A contains a record for each polygon represented by Record Type P in the TIGER/Line files. The U.S. Census Bureau provides the basic 1990 geographic entity codes (state, county, county subdivision, place, American Indian/Alaska Native Area/Hawaiian Home Land, census tract, block) on this record type to assist data users who are interested only in polygon information. Record Type A also includes the school district codes and fields for the 106th and 108th Congressional Districts (the 108th field is blank for this release).

Record Type C - Geographic Entity Names

Record Type C provides a unique list of all geographic codes, their associated name, and some entity attributes in a flat (nonhierarchical) file. It contains a Data Year field that may have three values: 1990 for geographic names and codes valid for the 1990 census, 2000 when the geographic names and codes reference Census 2000 geographic entities, or blank when the geographic names and codes for Census 2000 are the same as for 1990. Multiple records for the same geographic entity show its change between 1990 and Census 2000. Record Type C is linked to other record types (1, 3, A, S) through geographic entity codes.
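
The three Data Year values on Record Type C can be interpreted with a small helper such as the following Python sketch. The function name census_years is hypothetical, and the sketch assumes the field has already been extracted from the record as a string, with a blank field arriving as spaces or an empty string.

```python
def census_years(data_year):
    """Return the set of censuses for which a Record Type C entry is valid.

    data_year is the Data Year field as text: "1990", "2000", or blank.
    """
    value = data_year.strip()
    if value == "":
        # Blank: the Census 2000 name and code are the same as for 1990.
        return {1990, 2000}
    if value in ("1990", "2000"):
        return {int(value)}
    raise ValueError(f"unexpected Data Year value: {data_year!r}")


print(census_years("1990"))
print(census_years("    "))
```

A filter such as `2000 in census_years(field)` then selects every name and code that applies to Census 2000 geography, including the unchanged entities whose Data Year field is blank.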

Record Type H - TIGER/Line ID History

Record Type H provides the history of each TIGER/Line ID when complete chains (Record Type 1) are split or merged. Record Type H shows the TLIDs of the complete chains in existence after the split or prior to the merge.

Record Type I - Link Between Complete Chains and Polygons

Record Type I links Record Type 1, the complete chain basic data, to Record Type P, the polygon internal point. The Record Type I to Record Type 1 link (TLID) may be used to link complete chain attributes and other data record types (2, 3, 4, 6, H, and Z) to each other. The Record Type I to Record Type P link (CENID and POLYID) may be used to link polygon attributes and other data record types (8, 9, A, and S) to each other. Record Type I has a one-to-one relationship with Record Type 1, but a many-to-one relationship with Record Type P. When Record Type I is linked to a single-sided Record Type 1 (county boundary), it will provide only the left- or the right-polygon identifier.

Record Type P - Polygon Internal Point

There is a Record Type P for every polygon in the TIGER/Line files. Record Type P has a one-to-one relationship with Record Types A and S and a one-to-many relationship with Record Type I and identifies the internal point coordinates for each polygon. See the Internal Points section in Chapter 3. The TIGER/Line files include all complete chains and polygons in the Census TIGER data base. The topology of the Census TIGER data base ensures that a one-to-one relationship exists between the polygons constructed from Record Types 1 and 2 and Record Type P.

Record Type R - TIGER/Line ID Record Number Range

Record Type R contains the range of unique complete chain record numbers (TLIDs) assigned to a census file in a nationwide scheme. Record Type R has the lowest (minimum allowable), and the highest (maximum allowable) record numbers for the range. Numbers are assigned to complete chains beginning at the lowest value.
The current number is the highest record number for the census file used. Each TIGER/Line file consists of an entire county or statistically equivalent entity. In the Census TIGER data base, the county or statistically equivalent entity may be split into many partitions. The U.S. Census Bureau assigns permanent record numbers to each of these partitions. These record numbers are found in Record Type R. Record Type R is not directly linked to any other record type.

Record Type S - Polygon Additional Geographic Entity Codes

Record Type S contains a record for each polygon represented by Record Type P in the TIGER/Line files. Record Type S contains geographic entity codes that identify polygons. The geographic entity codes reflect Census 2000 geography.

Record Type Z - ZIP+4 Codes

Record Type Z provides Postal +4 Add-On codes that make ZIP+4 codes out of the ZIP Codes on Type 1 and Type 6 records. Record Type Z has a one-to-one or many-to-one relationship with Record Type 1 and with Record Type 6.

The Relationship Between Spatial Objects and TIGER/Line Record Types

The TIGER/Line files do not have specific record types for each spatial object. Nodes, for example, do not have a separate record type; node coordinates appear with other data in Record Type 1, the complete chain basic data record. Defining a complete chain requires information from Record Types 1, the complete chain basic data record; 2, the complete chain shape coordinates record; and I, the link between complete chains and polygons record. Record Types 1, the complete chain basic data record, and 2, the complete chain shape coordinates record, alone describe the set of network chains. GT-polygons require the combined information of Record Types 1, the complete chain basic data record; 2, the complete chain shape coordinates record; I, the link between complete chains and polygons record; and P, the polygon internal point record.
See Chapter 3 for a discussion on how to link data using different types of spatial objects.

Linkages Between Record Types

All the record types except Record Type R, the TIGER/Line record number range record, contain fields (such as TLID, FEAT, CENID, POLYID, LAND, or a geographic entity code) that are used to link together data from the record types. Chapter 2 discusses the TLID, CENID, POLYID, and LAND identification codes in detail. When different record types have a common key with the same data, a linkage can be made between the records. Some of the links are direct, while others are indirect and require a connection through an intermediate record type. An entire TIGER/Line file can be navigated using the record linkage keys.

The TLID field provides a direct link between Record Type 1, the complete chain basic data record, and Record Type 2, the complete chain shape coordinates record. The Record Type 3 complete chain geographic entity codes, the index to alternate feature identifiers of Record Type 4, the additional address range and ZIP Code data of Record Type 6, the TIGER/Line ID history on Record Type H, the link between complete chains and polygons of Record Type I, and the ZIP+4 codes on Record Type Z link directly using the TLID field. These record types link indirectly to Record Type 5, the complete chain feature identifiers record, by using the FEAT field on Record Type 5 to link to Record Type 4, the index to alternate feature identifiers record. The TLID field appearing on Record Type 4 then provides a link to other record types.

Record Type 8, the polygons linked to area landmarks record; Record Type 9, the key geographic location features record; Record Type A, the polygon geographic entity codes record; Record Type I, the link between complete chains and polygons record; Record Type P, the polygon internal point record; and Record Type S, the polygon additional geographic entity codes record, link directly using the CENID and POLYID fields.
An indirect link can be made from these record types to Record Type 5, the complete chain feature identifiers record, by using the FEAT code which appears on both Record Types 5 and 9, the key geographic location features record.

Record Type A, the polygon geographic entity codes record; Record Type P, the polygon internal point record; and Record Type S, the polygon additional geographic entity codes record, can be linked indirectly to Record Type 7, the landmark features record, by using the LAND field appearing on Record Type 7 to link to Record Type 8, the polygons linked to area landmarks record. Record Type 8 contains the CENID and POLYID fields which link to Record Type A, the polygon geographic entity codes record; Record Type P, the polygon internal point record; and Record Type S, the polygon additional geographic entity codes record.

Using the geographic entity codes as the key, the geographic entity names on Record Type C, the geographic entity names record, link to Record Type 1, the complete chain basic data record; Record Type 3, the complete chain geographic entity codes record; Record Type A, the polygon geographic entity codes record; and Record Type S, the polygon additional geographic entity codes record.

Linkages may be made to data external to a TIGER/Line file. Record Types 1, the complete chain basic data record; 3, the complete chain geographic entity codes record; A, the polygon geographic entity codes record; and S, the polygon additional geographic entity codes record, contain geographic entity code keys (the Census 2000 or 1990 census geographic entity codes) that may be linked to the U.S. Census Bureau's statistical data (the Census 2000 Redistricting data and the several Summary Files or SFs). For the 1990 Redistricting data and Summary Tape Files (STFs) based on 1990 census data, one must use Record Type 3, the complete chain geographic entity codes record, or Record Type A, the polygon geographic entity codes record.
With geographic information systems for processing and display, data users can use the geographic entity codes to link data tabulations with the geographic data.
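
The direct and indirect linkages described in this section can be sketched in Python as follows. The sketch is illustrative only: it assumes the records have already been parsed into dictionaries, and the sample values, the name and internal_point keys, and the left and right keys used for the two sides of a chain are hypothetical. Only the TLID, FEAT, CENID, and POLYID link fields are taken from the text.

```python
# Hypothetical, already-parsed records keyed by the link fields.
rt1 = {1001: {"name": "Market St"}, 1002: {"name": "Broad St"}}  # by TLID
rt4 = [{"TLID": 1001, "FEAT": 7}, {"TLID": 1002, "FEAT": 7}]
rt5 = {7: {"name": "State Route 3"}}                             # by FEAT
rti = {                                                          # by TLID
    1001: {"left": ("C0001", 10), "right": ("C0001", 11)},
    1002: {"left": ("C0001", 11), "right": None},  # single-sided chain
}
rtp = {                                            # by (CENID, POLYID)
    ("C0001", 10): {"internal_point": (-75.1600, 39.9500)},
    ("C0001", 11): {"internal_point": (-75.1700, 39.9600)},
}


def chains_with_alternate_name(feat, rt4_rows, rt1_index):
    """Indirect link: Record Type 5 -> 4 (FEAT), then 4 -> 1 (TLID)."""
    return [row["TLID"] for row in rt4_rows
            if row["FEAT"] == feat and row["TLID"] in rt1_index]


def polygons_for_chain(tlid, rti_index, rtp_index):
    """Direct links: Record Type 1 -> I (TLID), then I -> P (CENID, POLYID)."""
    sides = rti_index.get(tlid, {})
    keys = [sides.get("left"), sides.get("right")]
    return [key for key in keys if key is not None and key in rtp_index]


for tlid in chains_with_alternate_name(7, rt4, rt1):
    print(rt1[tlid]["name"], "also known as", rt5[7]["name"],
          "borders", polygons_for_chain(tlid, rti, rtp))
```

Keeping each record type in a dictionary keyed by its link field makes every hop a constant-time lookup, so an indirect link costs one lookup per intermediate record type.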