Assessing the Attribute Accuracy of Volunteered Geographic Information

Abstract:
Since the concept of Volunteered Geographic Information (VGI) emerged, data quality has been regarded as its greatest weakness. The issue has therefore been addressed frequently in the literature, and researchers have tried to evaluate the quality of VGI. Spatial data quality has several important elements, including positional accuracy, completeness, lineage, resolution, and temporal accuracy. Attribute accuracy, however, has received less attention in the literature than these other elements, despite its important role in a variety of spatial analyses and applications of VGI.
This study examines the attribute accuracy of volunteered geographic features by comparing them with reference data, using a novel method that combines the Levenshtein algorithm with text pre-processing. The Levenshtein algorithm measures the difference between two strings by counting the minimum number of single-character edits (insertions, deletions, or substitutions) needed to change one string into the other; this count is known as the Levenshtein distance.
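The edit-distance idea can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `normalize` and `levenshtein` are hypothetical, and the pre-processing shown (trimming, collapsing whitespace, case folding) is only an assumption, since the abstract does not say which pre-processing steps were used.

```python
def normalize(name: str) -> str:
    """Hypothetical pre-processing: collapse whitespace and ignore case.

    The abstract does not specify the pre-processing steps; this is an assumption.
    """
    return " ".join(name.split()).casefold()


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,                      # delete char_a
                    current[j - 1] + 1,                   # insert char_b
                    previous[j - 1] + substitution_cost,  # substitute or keep
                )
            )
        previous = current
    return previous[-1]
```

For example, `levenshtein("kitten", "sitting")` requires two substitutions and one insertion, and names that differ only in case or spacing compare as identical once passed through `normalize`.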
The first step of the proposed method is to find corresponding features in the two data sets, which form the basis of the comparison. This is done with an automatic data matching algorithm consisting of five stages, each applied to either the reference data set or the VGI data set.
After data matching, each VGI feature is compared with its match in the reference data set, and the Levenshtein distance between the “name” attributes of the two features is calculated. Features are then categorized as having correct (accurate), approximately correct, or incorrect names based on this distance, assuming that the names of the reference features are correct. For VGI features without a match in the reference data set, a search distance is defined, within which reference features with exactly the same name as the VGI feature are sought.
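The categorization and the fallback search could be sketched as follows. Everything specific here is an assumption made for illustration: the abstract gives neither the distance thresholds nor the search distance, so `approx_max`, the `Feature` type, the use of projected coordinates in metres, and the function names are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    x: float  # assumed: projected easting in metres
    y: float  # assumed: projected northing in metres


def classify(distance: int, approx_max: int = 2) -> str:
    """Map a Levenshtein distance to one of the three categories.

    approx_max is a hypothetical threshold; the abstract does not state one.
    """
    if distance == 0:
        return "correct"
    if distance <= approx_max:
        return "approximately correct"
    return "incorrect"


def has_same_name_nearby(vgi: Feature, references: list[Feature],
                         search_distance: float) -> bool:
    """For an unmatched VGI feature, report whether any reference feature
    within search_distance carries exactly the same name."""
    return any(
        ref.name == vgi.name
        and math.hypot(ref.x - vgi.x, ref.y - vgi.y) <= search_distance
        for ref in references
    )
```

A fixed threshold is the simplest choice; a threshold relative to the name length would treat short and long names more evenly, but the abstract does not indicate which approach the study took.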
The study area is the city of Tehran, Iran. A data set produced by the Municipality of Tehran serves as the reference data set, and OpenStreetMap data as the VGI data set. According to the results, 47 percent of VGI features have a name attribute; of these, 33 percent have a correct name, 44 percent have an approximately correct name, and the remaining 23 percent have an incorrect name. The overall attribute accuracy of the VGI data set used in this study is thus 77 percent, meaning that 77 percent of the features with a name attribute have either a correct or an approximately correct name. A future line of research, based on these findings, could be to develop methods for evaluating the attribute accuracy of a data set without comparing it with a reference data set.
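The reported percentages can be checked with a few lines of arithmetic. The last figure below is not reported in the abstract; it is derived here on the assumption that the stated percentages are exact and that the 77 percent applies to all named features.

```python
# Percentages as stated in the abstract.
named_pct = 47        # VGI features that have a name attribute
correct_pct = 33      # of the named features
approx_pct = 44       # of the named features
incorrect_pct = 23    # of the named features

# The three categories should partition the named features.
assert correct_pct + approx_pct + incorrect_pct == 100

# Overall attribute accuracy: correct plus approximately correct names.
overall_accuracy_pct = correct_pct + approx_pct

# Derived (not stated in the abstract): share of ALL VGI features that
# have a correct or approximately correct name.
share_of_all_features_pct = named_pct * overall_accuracy_pct / 100

print(overall_accuracy_pct)
print(share_of_all_features_pct)
```

Under these assumptions, the 77 percent accuracy among named features corresponds to roughly 36 percent of all VGI features, since more than half of the features carry no name at all.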
Language:
Persian
Published:
Journal of Geomatics Science and Technology, Volume 5, Issue 3, 2016
Pages:
49 to 64
https://magiran.com/p1527277  