Benefiting from Structured Resources to Present a Computationally Efficient Word Embedding Method

Article Type:
Research/Original Article (holds an accredited ranking)
Abstract:

In recent years, new word embedding methods have clearly improved the accuracy of NLP tasks. A review of their progress shows that the complexity of these models and the number of their trainable parameters keep growing. There is therefore a need for methodological innovation in word embedding. Most current word embedding methods train the semantic vectors of words on a large corpus of unstructured data. This paper explores the idea of exploiting the structure of structured data to build embedding vectors, so that the need for high processing power, large amounts of memory, and long processing time can be addressed by using that structure and the conceptual knowledge that lies in it. For this purpose, a new embedding vector, Word2Node, is proposed. It uses a well-known structured resource, WordNet, as its training corpus, under the hypothesis that the graph structure of WordNet contains valuable linguistic knowledge that should be exploited rather than ignored in order to produce cost-effective, small-sized embedding vectors. The Node2Vec graph embedding method makes it possible to benefit from this powerful linguistic resource. Evaluation on two tasks, word similarity and text classification, shows that this method performs the same as or better than the word embedding method embedded in it (Word2Vec). This result is achieved while the required training data is reduced by about 50,000,000%. These results give a view of the capacity of structured data to improve the quality of existing embedding methods and the resulting vectors.
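
The pipeline described in the abstract, a lexical graph fed to Node2Vec, can be illustrated with a minimal sketch. Node2Vec generates biased second-order random walks over a graph and then trains a skip-gram model (Word2Vec) on those walks as if they were sentences. The sketch below covers only the walk-generation step. The toy edge list, the function names, and the values of `p`, `q`, walk length, and walks per node are all assumptions made for illustration; the abstract does not state how the WordNet graph is built or which hyperparameters the authors use.

```python
import random

# Hypothetical toy graph standing in for WordNet. These edges are
# illustrative only and are not taken from the real resource.
TOY_EDGES = [
    ("animal", "dog"), ("animal", "cat"), ("dog", "puppy"),
    ("cat", "kitten"), ("animal", "bird"), ("bird", "sparrow"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map {node: sorted neighbor list}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


def biased_walk(adj, start, length, p=1.0, q=1.0, rng=random):
    """One node2vec-style second-order walk of up to `length` nodes.

    p controls the tendency to return to the previous node,
    q controls the tendency to move further away from it.
    """
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = adj[cur]
        if not nbrs:
            break
        if len(walk) == 1:
            # First step has no previous node, so sample uniformly.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)   # step back to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)       # stay at distance 1 from prev
            else:
                weights.append(1.0 / q)   # move to distance 2 from prev
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


def generate_walks(adj, walks_per_node=5, length=6, p=1.0, q=1.0, seed=0):
    """Start `walks_per_node` walks from every node, in shuffled order."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adj)
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(biased_walk(adj, node, length, p, q, rng))
    return walks


adj = build_adjacency(TOY_EDGES)
walks = generate_walks(adj)
```

Each walk is a list of word nodes that plays the role of a sentence. In a full pipeline these walks would be passed to a skip-gram trainer to obtain the vectors, and the toy edge list would be replaced by relations extracted from WordNet.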

Language:
English
Published:
Journal of Artificial Intelligence and Data Mining, Volume:10 Issue: 4, Autumn 2022
Pages:
505 to 514
magiran.com/p2519431  