Semantic Textual Similarity of Persian-English sentences using deep learning

Author(s):

Mohammad Abdous, Behrouz Minaei Bidgoli *

Article Type:

Research/Original Article (accredited ranking)

Abstract:

Semantic textual similarity is a natural language processing task that has attracted extensive research in recent years. Measuring the semantic similarity between words, sentences, paragraphs, and documents plays an important role in natural language processing and computational linguistics, with applications in question answering, fraud detection, machine translation, and information retrieval, among others. The task is to compute the degree of similarity between two documents, paragraphs, or sentences, in either a monolingual or a cross-lingual setting. In this article, we use a parallel corpus to present, for the first time, a cross-lingual semantic similarity model for Persian-English sentences, and we test and compare it with the Multilingual BERT model. The results show that parallel corpora can improve the quality of sentence embeddings across two different languages. With the proposed method, the Pearson correlation computed from the cosine similarity between Multilingual BERT sentence vectors increases from 65% to 73.77%. The proposed method was also tested on the Arabic-English language pair, where the results show that it is superior to Multilingual BERT.
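The evaluation criterion named in the abstract (Pearson correlation over cosine similarities of sentence vectors) can be sketched as follows. This is only a minimal illustration, not the authors' code: the function names, the toy two-dimensional vectors, and the gold scores are all assumptions made up for the example, and in practice the vectors would come from a sentence encoder such as Multilingual BERT.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def evaluate(embedding_pairs, gold_scores):
    """Correlate predicted cosine similarities with human similarity scores."""
    predicted = [cosine_similarity(src, tgt) for src, tgt in embedding_pairs]
    return pearson(predicted, gold_scores)


# Hypothetical data: each pair stands in for (Persian vector, English vector).
toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # same direction
    ([1.0, 0.0], [1.0, 1.0]),  # partially aligned
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal
]
toy_gold = [5.0, 3.0, 0.0]  # made-up human ratings on a 0-5 scale

score = evaluate(toy_pairs, toy_gold)
print(f"Pearson correlation: {score:.4f}")
```

A higher correlation means the cosine similarities rank sentence pairs more like the human ratings do, which is the sense in which the abstract reports an improvement from 65% to 73.77%.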

Keywords:

Natural Language Processing, Semantic Similarity, Cross-lingual, Deep Learning

Language:

Persian

Published:

Journal of Soft Computing and Information Technology, Volume:11 Issue: 1, 2022

Pages:

18 to 31

magiran.com/p2450093

Approved scientific journal

Journal of Soft Computing and Information Technology

Technical and engineering quarterly in Persian and English

ISSN: 2383-1006 eISSN: 2588-4913

License holder:

Babol Noshirvani University of Technology

Editor-in-chief:

Dr. Ali Aghagolzadeh

Journal phone: 011-35501460
