Extraction of Effective Textual and Semantic Features in Learning to Rank for Web Document Retrieval

Article Type:
Research/Original Article (accredited)
Abstract:

Ranking algorithms, the core of web search systems, are responsible for finding and ranking the documents in the crawled and indexed corpus that are most relevant to a user's information need. With the ever-increasing amount of available training data, ranking technologies are moving towards Machine Learning methods known as Learning to Rank algorithms. Basic Learning to Rank systems have mainly used textual features while ignoring semantic features. With the advent of the Semantic Web, there is growing interest in developing and using semantic features for Learning to Rank systems. An important challenge is that there is currently no comprehensive study on the combined use of textual and semantic features in Learning to Rank systems. In this paper, we first define and implement four new sets of semantic features based on Knowledge Graph, Entity Repetition, Textual Fields, and Vector Representation of Words and Texts. For experimental analysis, we used the MQ-2007 dataset from LETOR 4, which includes a set of textual features. The results of running six standard Learning to Rank algorithms show that using semantic features, either in isolation or in combination with textual features, significantly increases performance. The increase is even more significant when the tests are limited to hard queries. We also implemented an existing Feature Selection algorithm to test whether it could improve the results further. It improved the results for some Learning to Rank algorithms but not for others.
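To make the idea of combining textual and semantic features concrete, the following is a minimal sketch, not the paper's implementation. It assumes one plausible form of a "Vector Representation of Words and Texts" feature: the cosine similarity between averaged word vectors of the query and the document. The toy vectors and the names `TOY_VECTORS`, `semantic_similarity_feature` and `combine_features` are hypothetical and chosen only for illustration.

```python
import math

# Toy 3-dimensional word vectors. These are illustrative assumptions,
# not embeddings used in the paper.
TOY_VECTORS = {
    "web": [1.0, 0.0, 0.0],
    "search": [0.8, 0.2, 0.0],
    "ranking": [0.6, 0.4, 0.0],
    "cooking": [0.0, 0.0, 1.0],
}


def average_vector(tokens, vectors):
    """Average the vectors of known tokens; return None if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_similarity_feature(query, document, vectors=TOY_VECTORS):
    """Hypothetical semantic feature: cosine of averaged word vectors."""
    q = average_vector(query.lower().split(), vectors)
    d = average_vector(document.lower().split(), vectors)
    if q is None or d is None:
        return 0.0
    return cosine(q, d)


def combine_features(textual_features, semantic_features):
    """Concatenate textual and semantic features into one feature vector."""
    return list(textual_features) + list(semantic_features)
```

A combined vector built this way for each query-document pair could then be passed to any Learning to Rank algorithm in place of the textual features alone.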

Language:
Persian
Published:
Journal of Information Processing and Management, Volume:36 Issue: 4, 2021
Pages:
1081 to 1112
magiran.com/p2296475  