官术网_书友最值得收藏!

Joining a shapefile attribute table to a CSV file

Joining attribute tables to other database tables allows you to use a spatial dataset in order to reference a dataset without any geometry, using a common key between the data tables. A very common use case for this is to join a vector dataset of census attributes to a more detailed census attribute dataset. The use case we will demonstrate here links a US census track file to a detailed CSV file that contains more in-depth information.

Getting ready

For this recipe, you will need a census tract shapefile and a CSV file containing the appropriate census data for the shapefile. You can download the sample data set from https://geospatialpython.googlecode.com/svn/census.zip.

Extract this data to a directory named /qgis_data/census.

How to do it...

The join operation is quite involved. We'll perform this operation and save the layer as a new shapefile with the joined attributes. Then we'll load the new layer and compare the field count to the original layer to ensure that the join occurred. We'll use the terms target layer and join layer. The target layer will be the shapefile, and the join layer will be a CSV with some additional fields we want to add to the shapefile. To do this, perform the following steps:

  1. Start QGIS.
  2. From the Plugins menu, select Python Console.
  3. First, load the county's census track layer and validate it:
    vectorLyr = QgsVectorLayer('/qgis_data/census/hancock_tracts.shp', 'Hancock' , "ogr")
    vectorLyr.isValid()
    
  4. Now, load the CSV file as a layer and validate it as well:
    infoLyr = QgsVectorLayer('/qgis_data/census/ACS_12_5YR_S1901_with_ann.csv', 'Census' , "ogr")
    infoLyr.isValid()
    
  5. Once this is done, you must add both the layers to the map registry for the two layers to interact for the join. However, set the visibility to False, so the layers do not appear on the map:
    QgsMapLayerRegistry.instance().addMapLayers([vectorLyr,infoLyr], False)
    
  6. Next, you must create a special join object:
    info = QgsVectorJoinInfo()
    
  7. The join object needs the layer ID of the CSV file:
    info.joinLayerId = infoLyr.id()
    
  8. Next, specify the key field from the CSV file whose values correspond to the values in the shapefile:
    info.joinFieldName = "GEOid2"
    
  9. Then, specify the corresponding field in the shapefile:
    info.targetFieldName = "GEOID"
    
  10. Set the memoryCache property to True in order to speed up access to the joined data:
    info.memoryCache = True
    
  11. Add the join to the layer now:
    vectorLyr.addJoin(info)
    
  12. Next, write out the joined shapefile to a new file on disk:
    QgsVectorFileWriter.writeAsVectorFormat(vectorLyr, "/qgis_data/census/joined.shp", "CP120", None, "ESRI Shapefile")
    
  13. Now, load the new shapefile back in as a layer for verification:
    joinedLyr = QgsVectorLayer('/qgis_data/census/joined.shp', 'Joined' , "ogr")
    
  14. Verify that the field count in the original layer is 12:
    vectorLyr.dataProvider().fields().count()
    
  15. Finally, verify that the new layer has a field count of 142 from the join:
    joinedLyr.dataProvider().fields().count()
    

How it works...

This recipe reaches out to the very edge of the PyQGIS API, forcing you to use some workarounds. Most recipes for data manipulation can be performed programmatically without writing data to disk or loading layers onto the map, but joins are different. Because the QgsVectorJoinInfo object needs the layer ID of the CSV layer, we must add both the layers to the map layer registry. Fortunately, we can do this without making them visible, if we are just trying to write a data manipulation script. A join is designed to be a temporary operation to query a dataset. Oddly, PyQGIS lets you create the join, but you cannot query it. This limitation is the reason why if you want to work with the joined data, you must write it to a new shapefile and reload it. Fortunately, PyQGIS allows you to do that.

There's more...

You can find an alternate method that works around the PyQGIS limitation in a Processing Toolbox script, which manually matches the joined data in Python, at https://github.com/rldhont/Quantum-GIS/blob/master/python/plugins/processing/algs/qgis/JoinAttributes.py.

主站蜘蛛池模板: 凤城市| 德兴市| 泰来县| 凤城市| 德保县| 沁源县| 梁山县| 竹溪县| 平利县| 青阳县| 霞浦县| 巴楚县| 山西省| 府谷县| 吴江市| 子洲县| 资阳市| 恭城| 咸丰县| 凌海市| 平舆县| 什邡市| 太谷县| 嘉荫县| 靖州| 梁河县| 荔波县| 东乡族自治县| 社旗县| 抚宁县| 洪泽县| 海盐县| 任丘市| 托克逊县| 宝坻区| 囊谦县| 建水县| 罗田县| 门头沟区| 黄梅县| 佛冈县|