Table of Contents


Journal of Artificial Intelligence and Data Mining
Volume:7 Issue: 1, Winter 2019

  • Publication date: 1397/11/21 (Solar Hijri)
  • Number of articles: 18
  • Seyed M. Ghazali, Y. Baleghi * Pages 1-16
    Observation in absolute darkness and in daytime under any atmospheric situation is one of the advantages of thermal imaging systems. Despite the increasing use of these systems, analysing thermal images is still difficult due to the variable appearance of pedestrians and changing atmospheric situations. In this paper, an efficient method is proposed for detecting pedestrians in outdoor thermal images that adapts to variable atmospheric situations. In the first step, the type of atmospheric situation is estimated based on the global features of the thermal image. Then, for each situation, a relevant algorithm is applied for pedestrian detection. To do this, thermal images are divided into three classes of atmospheric situations: a) fine, such as sunny weather; b) bad, such as rainy and hazy weather; and c) hot, such as hot summer days where pedestrians are darker than the background. Then the 2-Dimensional Double Density Dual Tree Discrete Wavelet Transform (2D DD DT DWT) is computed on the input images over three levels, and the energy of the low-frequency coefficients at the third level is calculated as the discriminating feature for atmospheric situation identification. A feed-forward neural network (FFNN) classifier is trained on this feature vector to determine the category of atmospheric situation. Finally, a predetermined algorithm relevant to that category is applied for pedestrian detection. The proposed method achieves high performance, with pedestrian detection accuracy above 99% on two popular databases.
    Keywords: Outdoor thermal images, Atmospheric situations, Artificial Neural Network, Wavelet Transform, Pedestrian detection
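    A minimal sketch of the atmospheric-situation classification step, under two assumptions not stated in the abstract: a standard 2D wavelet decomposition from PyWavelets stands in for the 2D double-density dual-tree DWT (not available in common Python packages), and scikit-learn's MLPClassifier plays the role of the FFNN; images, labels and parameters are illustrative only.
    ```python
    import numpy as np
    import pywt
    from sklearn.neural_network import MLPClassifier

    def lowband_energy(image, wavelet="db2", level=3):
        """Energy of the level-3 low-frequency (approximation) coefficients."""
        coeffs = pywt.wavedec2(image, wavelet, level=level)
        approx = coeffs[0].astype(np.float64)          # cA_3
        return float(np.sum(approx ** 2))

    def train_situation_classifier(images, labels):
        """Train an FFNN on the single energy feature (one value per image)."""
        X = np.array([[lowband_energy(img)] for img in images])
        clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
        return clf.fit(X, labels)

    # Toy usage with random stand-in "thermal images" for the three situations
    # (0 = fine, 1 = bad, 2 = hot); real data would come from a thermal camera.
    rng = np.random.default_rng(0)
    imgs = [rng.random((64, 64)) * s for s in (0.3, 0.6, 1.0) for _ in range(10)]
    y = [c for c in (0, 1, 2) for _ in range(10)]
    model = train_situation_classifier(imgs, y)
    print(model.predict([[lowband_energy(imgs[0])]]))
    ```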
  • M. Rezaei, V. Derhami * Pages 17-25
    Nonnegative Matrix Factorization (NMF) algorithms have been utilized in a wide range of real applications. NMF has been adopted by several researchers owing to its part-based representation property, especially for the facial expression recognition problem. It decomposes a face image into its essential parts (e.g. nose, lips, etc.), but all previous attempts neglect the fact that not all features obtained by NMF are needed for the recognition problem. For example, some facial parts carry no useful information for facial expression recognition. To address the challenge of defining and calculating the contribution of each part, the Shapley value is used. It is applied to identify the contribution of each feature to the classification problem; then, the less influential features are removed. Experiments on the JAFFE dataset and the MUG Facial Expression Database, as benchmark facial expression datasets, demonstrate the effectiveness of our approach.
    Keywords: Non-negative Matrix Factorization (NMF), Shapley value, Game theory
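    A minimal sketch of the feature-contribution idea, assuming scikit-learn's NMF, a k-NN classifier as a stand-in evaluator, and a Monte Carlo permutation-sampling approximation of the Shapley value (the abstract does not specify how the value is computed); data and thresholds are illustrative only.
    ```python
    import numpy as np
    from sklearn.decomposition import NMF
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import cross_val_score

    def shapley_feature_values(features, y, n_perm=20, seed=0):
        """Monte Carlo Shapley estimate of each NMF feature's contribution to
        classification accuracy; excluded features are zeroed out."""
        rng = np.random.default_rng(seed)
        n_feat = features.shape[1]
        phi = np.zeros(n_feat)

        def accuracy(active):
            if not active:
                return 0.0
            X = np.zeros_like(features)
            X[:, list(active)] = features[:, list(active)]
            return cross_val_score(KNeighborsClassifier(3), X, y, cv=3).mean()

        for _ in range(n_perm):
            active, prev = set(), 0.0
            for f in rng.permutation(n_feat):
                active.add(int(f))
                curr = accuracy(active)
                phi[f] += curr - prev           # marginal contribution of feature f
                prev = curr
        return phi / n_perm

    # Toy usage: rows are vectorised (non-negative) face images, y the expressions.
    rng = np.random.default_rng(1)
    X = np.abs(rng.random((60, 100)))
    y = rng.integers(0, 3, 60)
    W = NMF(n_components=8, random_state=0, max_iter=500).fit_transform(X)
    phi = shapley_feature_values(W, y)
    print("kept NMF components:", np.argsort(phi)[-5:])   # drop the least influential
    ```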
  • M. H. Khosravi * Pages 27-34
    Image segmentation is an essential and critical process in image processing and pattern recognition. In this paper, we propose a texture-based method to segment an input image into regions. In our method, an entropy-based texture map of the image is extracted, followed by a histogram equalization step to discriminate different regions. Then, with the aim of eliminating unnecessary details and achieving more robustness against unwanted noise, a low-pass filtering technique is used to smooth the image. As the next step, the appropriate pixons are extracted and delivered to a fuzzy c-means clustering stage to obtain the final image segments. The results of applying the proposed method to several different images indicate its better segmentation performance compared to other methods.
    Keywords: Image segmentation, image texture, pixon
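    A minimal sketch of the pipeline described above (entropy texture map, histogram equalization, low-pass smoothing, fuzzy clustering), assuming scikit-image and SciPy and clustering the smoothed texture feature directly with a small hand-written fuzzy c-means; the pixon-extraction step is omitted and all parameters are illustrative.
    ```python
    import numpy as np
    from scipy.ndimage import gaussian_filter
    from skimage.exposure import equalize_hist
    from skimage.filters.rank import entropy
    from skimage.morphology import disk
    from skimage.util import img_as_ubyte

    def fuzzy_cmeans(x, c=3, m=2.0, iters=50, seed=0):
        """Minimal fuzzy c-means on a 1-D feature vector (one value per pixel)."""
        rng = np.random.default_rng(seed)
        u = rng.random((len(x), c)); u /= u.sum(axis=1, keepdims=True)
        for _ in range(iters):
            um = u ** m
            centers = (um * x[:, None]).sum(0) / um.sum(0)
            d = np.abs(x[:, None] - centers[None, :]) + 1e-9
            u = 1.0 / d ** (2.0 / (m - 1.0))
            u /= u.sum(axis=1, keepdims=True)
        return u.argmax(axis=1)

    def segment(image, n_regions=3, radius=5, sigma=2.0):
        """Entropy texture map -> histogram equalization -> smoothing -> FCM."""
        tex = entropy(img_as_ubyte(image), disk(radius))   # local-entropy texture
        tex = equalize_hist(tex)                           # spread texture contrast
        tex = gaussian_filter(tex, sigma=sigma)            # low-pass noise removal
        return fuzzy_cmeans(tex.ravel(), c=n_regions).reshape(image.shape)

    img = np.random.default_rng(0).random((64, 64))        # stand-in input image
    print(np.unique(segment(img)))
    ```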
  • Z. Mirzamomen *, Kh. Ghafooripour Pages 35-45
    Multi-label classification has many applications in text categorization, biology, and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. As there are often relationships between the labels, extracting these relationships and taking advantage of them during the training or prediction phases can bring about significant improvements. In this paper, we introduce positive, negative, and hybrid relationships between class labels for the first time, and we propose a method to extract these relations for a multi-label classification task and, consequently, to use them to improve the predictions made by a multi-label classifier. We have conducted extensive experiments to assess the effectiveness of the proposed method. The obtained results advocate the merits of the proposed method in improving multi-label classification results.
    Keywords: Multi-label classification, Label Relationships, Association rule, Positive relation, Negative relation
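    A minimal sketch of how pairwise label relations could be extracted from label co-occurrence and used to adjust a classifier's scores; the thresholds, the score-nudging rule, and the omission of hybrid relations are assumptions for illustration, not the paper's exact procedure.
    ```python
    import numpy as np

    def label_relations(Y, pos_thr=0.8, neg_thr=0.05):
        """From a binary label matrix Y (instances x labels), call (i, j) a
        positive relation if P(j|i) >= pos_thr and negative if P(j|i) <= neg_thr."""
        counts = Y.T @ Y                                  # label co-occurrence counts
        support = np.diag(counts).astype(float)
        conf = counts / np.maximum(support[:, None], 1)   # estimate of P(j | i)
        L = Y.shape[1]
        pos = [(i, j) for i in range(L) for j in range(L)
               if i != j and conf[i, j] >= pos_thr]
        neg = [(i, j) for i in range(L) for j in range(L)
               if i != j and support[i] > 0 and conf[i, j] <= neg_thr]
        return pos, neg

    def adjust_scores(scores, pos, neg, delta=0.2):
        """Nudge per-label scores using the extracted relations."""
        out = scores.copy()
        for i, j in pos:
            out[:, j] += delta * scores[:, i]             # i strongly implies j
        for i, j in neg:
            out[:, j] -= delta * scores[:, i]             # i rarely co-occurs with j
        return np.clip(out, 0.0, 1.0)

    Y = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1], [1, 1, 0]])
    pos, neg = label_relations(Y)
    print(pos, neg)
    print(adjust_scores(np.array([[0.9, 0.4, 0.1]]), pos, neg))
    ```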
  • M. Ghazanfari , A. Badiee *, M. Shamsollahi Pages 47-58
    Heart disease is one of the major causes of morbidity in the world. Currently, a large proportion of healthcare data is not processed properly and thus fails to be used effectively for decision-making purposes. The risk of heart disease may be predicted by investigating heart disease risk factors coupled with data mining knowledge. This paper presents a model developed using combined descriptive and predictive data mining techniques that aims to help specialists in the healthcare system effectively predict patients with Coronary Artery Disease (CAD). To achieve this objective, several clustering and classification techniques are used. First, the number of clusters is determined using clustering indexes. Next, several types of decision tree methods and an Artificial Neural Network (ANN) are applied to each cluster in order to predict CAD patients. Finally, the results obtained show that the C&RT decision tree method performs best on all data used in this study, with an error of 0.074. All data used in this study are real and were collected from a heart clinic database.
    Keywords: data mining, coronary heart disease, Clustering, Classification, decision tree
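    A minimal sketch of the cluster-then-classify idea on toy data, assuming scikit-learn, the silhouette index for choosing the number of clusters, and a CART-style decision tree per cluster; the real clinical features, the ANN branch, and the reported error figure are not reproduced here.
    ```python
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score
    from sklearn.tree import DecisionTreeClassifier

    def cluster_then_classify(X, y, k_range=range(2, 6), seed=0):
        """Pick the cluster count with a clustering index, then fit one tree per cluster."""
        best_k = max(k_range, key=lambda k: silhouette_score(
            X, KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)))
        km = KMeans(n_clusters=best_k, n_init=10, random_state=seed).fit(X)
        trees = {c: DecisionTreeClassifier(max_depth=4, random_state=seed)
                    .fit(X[km.labels_ == c], y[km.labels_ == c])
                 for c in range(best_k)}
        return km, trees

    def predict(km, trees, X_new):
        clusters = km.predict(X_new)
        return np.array([trees[c].predict(x[None, :])[0]
                         for c, x in zip(clusters, X_new)])

    # Toy stand-in for the heart-clinic data: 6 risk-factor columns, binary CAD label.
    rng = np.random.default_rng(0)
    X = rng.random((120, 6))
    y = (X[:, 0] + X[:, 1] > 1.0).astype(int)
    km, trees = cluster_then_classify(X, y)
    print("accuracy on the toy data:", (predict(km, trees, X) == y).mean())
    ```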
  • N. Emami *, A. Pakzad Pages 59-68
    Breast cancer has become a widespread disease around the world in young women. Expert systems, developed using data mining techniques, are valuable tools for the diagnosis of breast cancer and can help physicians in the decision-making process. This paper presents a new hybrid data mining approach to classify two groups of breast cancer patients (malignant and benign). The proposed approach, AP-AMBFA, consists of two phases. In the first phase, the Affinity Propagation (AP) clustering method is used as an instance-reduction technique that can find and eliminate noisy instances. In the second phase, feature selection and classification are conducted using the Adaptive Modified Binary Firefly Algorithm (AMBFA) to select the predictor variables most related to the target variable, with the Support Vector Machine (SVM) technique as the classifier. This can reduce the computational complexity and speed up the data mining process.
    Experimental results on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset show higher predictive accuracy. The obtained classification accuracy is 98.606%, a very promising result compared to the current state-of-the-art classification techniques applied to the same database. Hence, this method will help physicians make a more accurate diagnosis of breast cancer.
    Keywords: Breast Cancer, Affinity Propagation, Feature Selection, Binary Firefly Algorithm, Support Vector Machine
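    A minimal sketch of the first (instance-reduction) phase followed by an SVM, assuming scikit-learn (whose load_breast_cancer is the WDBC data) and a simple "disagrees with its cluster's majority label" rule for flagging noisy instances; the Adaptive Modified Binary Firefly feature-selection step is omitted.
    ```python
    import numpy as np
    from sklearn.cluster import AffinityPropagation
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    def remove_noisy_instances(X, y):
        """Cluster with Affinity Propagation and drop instances whose label
        disagrees with the majority label of their cluster (treated as noise)."""
        clusters = AffinityPropagation(random_state=0).fit_predict(X)
        keep = np.ones(len(y), dtype=bool)
        for c in np.unique(clusters):
            idx = np.where(clusters == c)[0]
            majority = np.bincount(y[idx]).argmax()
            keep[idx[y[idx] != majority]] = False
        return X[keep], y[keep]

    X, y = load_breast_cancer(return_X_y=True)            # WDBC
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    X_red, y_red = remove_noisy_instances(X_tr, y_tr)
    clf = SVC(kernel="rbf", gamma="scale").fit(X_red, y_red)
    print("kept %d of %d training instances, test accuracy %.3f"
          % (len(y_red), len(y_tr), clf.score(X_te, y_te)))
    ```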
  • R. Yarinezhad *, A. Sarabi Pages 69-76
    Vehicular ad hoc networks (VANETs) are a particular type of mobile ad hoc network (MANET) in which the vehicles are considered as nodes. Rapid topology changes and frequent disconnections make it difficult to design an efficient routing protocol for routing data among vehicles. In this paper, a new routing protocol based on the glowworm swarm optimization algorithm is proposed. Using the glowworm algorithm, the proposed protocol detects the optimal routes between three-way junctions and intersections. Then, the packets are delivered along the selected routes. Using the glowworm swarm optimization algorithm, which is a distributed heuristic algorithm, the proposed protocol assigns a value to each route from a source to the destination. A route with a higher value is then selected to send messages from the source to the destination. The simulation results show that the proposed algorithm performs better than similar algorithms.
    Keywords: Vehicular Ad Hoc Network, Routing, Glowworm Swarm Optimization, Urban Environments, Data Delivery Delay
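    Reproducing the routing protocol itself would require a VANET simulator, so the sketch below only illustrates the underlying glowworm swarm optimization metaheuristic on a toy continuous objective (standing in for a route-quality score); all parameter values are illustrative.
    ```python
    import numpy as np

    def gso_maximize(objective, dim=2, n=40, iters=100, rho=0.4, gamma=0.6,
                     step=0.03, beta=0.08, n_desired=5, r_max=1.0, seed=0):
        """Minimal glowworm swarm optimization over [0, 1]^dim: glowworms update
        luciferin from the objective, move toward brighter neighbours within a
        dynamic decision range, and the best position found is returned."""
        rng = np.random.default_rng(seed)
        x = rng.random((n, dim))
        luciferin = np.full(n, 5.0)
        r = np.full(n, r_max)
        for _ in range(iters):
            luciferin = (1 - rho) * luciferin + gamma * np.array([objective(p) for p in x])
            for i in range(n):
                dist = np.linalg.norm(x - x[i], axis=1)
                nbrs = np.where((dist < r[i]) & (luciferin > luciferin[i]))[0]
                if len(nbrs):
                    w = luciferin[nbrs] - luciferin[i]
                    j = rng.choice(nbrs, p=w / w.sum())       # brighter neighbour
                    d = x[j] - x[i]
                    x[i] = np.clip(x[i] + step * d / (np.linalg.norm(d) + 1e-12), 0, 1)
                r[i] = min(r_max, max(0.0, r[i] + beta * (n_desired - len(nbrs))))
        return x[np.argmax([objective(p) for p in x])]

    # Toy objective with its peak at (0.7, 0.3); in the protocol this role is
    # played by the value assigned to a candidate route.
    print(gso_maximize(lambda p: -np.sum((p - np.array([0.7, 0.3])) ** 2)))
    ```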
  • Seyed M. H. Hasheminejad *, M. Sarvmili Pages 77-96
    Nowadays, new methods are required to take advantage of the rich and extensive gold mine of data created, in particular, by educational systems. Data mining algorithms have been used in educational systems, especially e-learning systems, due to the broad usage of these systems. Providing a model to predict final student results in an educational course is one reason for using data mining in educational systems. In this paper, we propose a novel rule-based classification method, called S3PSO (Students’ Performance Prediction based on Particle Swarm Optimization), to extract hidden rules that can be used to predict students’ final outcomes. The proposed S3PSO method is based on the Particle Swarm Optimization (PSO) algorithm in discrete space. The particle encoding in S3PSO yields rules that are more interpretable, even for ordinary users such as instructors. In S3PSO, support, confidence, and comprehensibility criteria are used to calculate the fitness of each rule. Comparing the results obtained from S3PSO with those of other rule-based classification methods such as CART, C4.5, and ID3 reveals that S3PSO improves the fitness value by 31% on the Moodle data set. Additionally, comparing the results obtained from S3PSO with those of other classification methods such as SVM, KNN, Naïve Bayes, Neural Network, and APSO reveals that S3PSO improves accuracy by 9% on the Moodle data set and yields promising results for predicting students’ final outcomes.
    Keywords: Educational Data Mining, Particle Swarm Optimization, Rule-Based Classification
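    A minimal sketch of the rule-fitness computation (support, confidence, comprehensibility) that S3PSO is described as optimizing; the weighting of the three terms and the toy data are assumptions, and the discrete-PSO search loop itself is not shown.
    ```python
    import numpy as np

    def rule_fitness(X, y, conditions, predicted_class,
                     w_support=0.3, w_conf=0.5, w_compr=0.2):
        """Fitness of one IF-THEN rule on a discrete data set. `conditions` maps
        feature index -> required value, and the rule predicts `predicted_class`;
        fewer conditions are treated as more comprehensible."""
        covered = np.ones(len(y), dtype=bool)
        for feat, val in conditions.items():
            covered &= (X[:, feat] == val)
        n_cov = int(covered.sum())
        support = n_cov / len(y)
        confidence = (y[covered] == predicted_class).mean() if n_cov else 0.0
        comprehensibility = 1.0 - len(conditions) / X.shape[1]
        return w_support * support + w_conf * confidence + w_compr * comprehensibility

    # Toy e-learning records: columns = (forum activity level, assignment score band).
    X = np.array([[2, 1], [2, 2], [0, 0], [1, 1], [2, 2], [0, 1]])
    y = np.array([1, 1, 0, 1, 1, 0])                    # 1 = pass, 0 = fail
    print(rule_fitness(X, y, {0: 2}, predicted_class=1))
    ```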
  • A. Asilian Bidgoli *, H. Ebrahimpour-Komleh, M. Askari, Seyed J. Mousavirad Pages 97-108
    This paper parallelizes an implementation of the spatial pyramid match kernel (SPK). SPK is one of the most widely used kernel methods and, together with a support vector machine classifier, achieves high accuracy in object recognition. The MATLAB Parallel Computing Toolbox has been used to parallelize SPK. In this implementation, MATLAB Message Passing Interface (MPI) functions and features included in the toolbox help us obtain good performance through both task-parallel and data-parallel schemes. The parallel SPK algorithm was run on a cluster of computers and achieved a lower run time. A speedup of 13 is obtained for a configuration with up to 5 quad-core processors.
    Keywords: Object recognition, Spatial pyramid match kernel, Parallel computing, Cluster of computers, Support Vector Machine
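    The paper's implementation uses the MATLAB Parallel Computing Toolbox; the sketch below shows the same data-parallel idea in Python's multiprocessing, computing rows of the histogram-intersection kernel (the kernel SPK is built on) in parallel for precomputed pyramid histograms, which are random stand-ins here.
    ```python
    import numpy as np
    from multiprocessing import Pool

    def intersection_row(args):
        """One row of the histogram-intersection kernel matrix."""
        i, H = args
        return np.minimum(H, H[i]).sum(axis=1)

    def parallel_kernel(H, n_workers=4):
        """Each worker computes whole rows of the kernel matrix (data parallelism)."""
        with Pool(n_workers) as pool:
            rows = pool.map(intersection_row, [(i, H) for i in range(len(H))])
        return np.vstack(rows)

    if __name__ == "__main__":
        # H: one (already concatenated and level-weighted) pyramid histogram per image.
        H = np.random.default_rng(0).random((200, 4200))
        K = parallel_kernel(H)
        print(K.shape)        # (200, 200); usable with an SVM and kernel="precomputed"
    ```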
  • A. Ahmadi Tameh, M. Nassiri *, M. Mansoorizadeh Pages 109-119
    WordNet is a large lexical database of the English language in which nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is mainly used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose several automatic methods to extract Information and Communication Technology (ICT)-related data from Princeton WordNet. We then add these extracted data to our Persian WordNet. The advantage of automated methods is that they reduce the interference of human factors and accelerate the development of our bilingual ICT WordNet.
    In our first proposed method, based on a small subset of ICT words, we use the definition of each synset to decide whether that synset is ICT-related. The second mechanism extracts synsets that are in a semantic relation with ICT synsets. We also use two similarity criteria, namely LCS and S3M, to measure the similarity between a synset definition in WordNet and the definition of any word in the Microsoft dictionary. Our last method verifies the coordinate synsets of ICT synsets. Results show that our proposed mechanisms are able to extract ICT data from Princeton WordNet with a good level of accuracy.
    Keywords: WordNet, semantic relation, synset, part of speech, Information and Communication Technology
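    A minimal sketch of the first two extraction mechanisms against Princeton WordNet via NLTK; the seed word list is a small illustrative assumption, and the similarity-based (LCS/S3M) and coordinate-synset checks are not shown.
    ```python
    # Requires: pip install nltk, then nltk.download("wordnet")
    from nltk.corpus import wordnet as wn

    ICT_SEEDS = {"computer", "network", "software", "internet", "protocol", "database"}

    def definition_based_ict(seeds=ICT_SEEDS):
        """First mechanism: mark a synset as ICT-related if its definition (gloss)
        mentions one of a small set of seed ICT words."""
        hits = []
        for syn in wn.all_synsets(pos=wn.NOUN):
            gloss = syn.definition().lower().split()
            if any(seed in gloss for seed in seeds):
                hits.append(syn)
        return hits

    def relation_based_ict(ict_synsets):
        """Second mechanism: expand the ICT set through semantic relations
        (hyponyms, hypernyms and part meronyms of already-identified ICT synsets)."""
        expanded = set(ict_synsets)
        for syn in ict_synsets:
            expanded.update(syn.hyponyms())
            expanded.update(syn.hypernyms())
            expanded.update(syn.part_meronyms())
        return expanded

    ict = definition_based_ict()
    print(len(ict), "synsets found by gloss matching")
    print(len(relation_based_ict(ict[:50])), "after expanding a 50-synset sample")
    ```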
  • N. Nazari, M. A. Mahdavi * Pages 121-135
    Text summarization endeavors to produce a summary version of a text while maintaining its original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of information. Text summarization systems intend to assist with content reduction by keeping the relevant information and filtering out the non-relevant parts of the text. In terms of the input, there are two fundamental approaches among text summarization systems. The first approach summarizes a single document: the system takes one document as input and produces a summary version as its output. The alternative approach takes several documents as input and produces a single summary document as output. In terms of the output, summarization systems are also categorized into two major types. One approach extracts exact sentences from the original document in order to build the summary output. The alternative is a more complex approach, in which the rendered text is a rephrased version of the original document. This paper offers an in-depth introduction to automatic text summarization. We also describe some techniques for evaluating the quality of automatic text summarization.
    Keywords: automatic text summarization, multiple document summarization, single document summarization, summarization evaluation technique
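    Since the entry above surveys the field rather than one system, the sketch below only illustrates the simplest extractive, single-document case: sentences are scored by the document frequency of their words and the top-scoring ones are kept in order; this is a generic example, not a method from the survey.
    ```python
    import re
    from collections import Counter

    def extractive_summary(text, n_sentences=2):
        """Minimal frequency-based extractive summarizer for a single document."""
        sentences = re.split(r"(?<=[.!?])\s+", text.strip())
        freq = Counter(re.findall(r"[a-z']+", text.lower()))
        def score(s):
            tokens = re.findall(r"[a-z']+", s.lower())
            return sum(freq[t] for t in tokens) / (len(tokens) or 1)
        ranked = sorted(range(len(sentences)), key=lambda i: -score(sentences[i]))
        return " ".join(sentences[i] for i in sorted(ranked[:n_sentences]))

    doc = ("Text summarization produces a short version of a document. "
           "Extractive systems select sentences from the original text. "
           "Abstractive systems rephrase the content in new sentences. "
           "Evaluation compares system summaries against human references.")
    print(extractive_summary(doc))
    ```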
  • M. Asadolahzade Kermanshahi, M. M. Homayounpour * Pages 137-147
    Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using a deep neural network (DNN) in speech recognition systems significantly improves their performance. DNN-based phoneme recognition systems have two phases: training and testing. Most previous research attempted to improve the training phase, for example through training algorithms, different types of networks, network architectures, and feature types. In this study, however, we focus on the test phase, which concerns generating the phoneme sequence and is also essential for achieving good phoneme recognition accuracy. Past research used the Viterbi algorithm on a hidden Markov model (HMM) to generate phoneme sequences. We address an important problem associated with this method: to deal with the geometric distribution of state durations implied by the HMM, we use a realistic duration probability distribution for each phoneme with the aid of a hidden semi-Markov model (HSMM). We also represent each phoneme with only one state so that phoneme duration information can be used simply in the HSMM. Furthermore, we investigate the performance of a post-processing method that corrects the phoneme sequence obtained from the neural network based on our knowledge of phonemes. Experimental results on the Persian FarsDat corpus show that using the extended Viterbi algorithm on the HSMM achieves phoneme recognition accuracy improvements of 2.68% and 0.56% over conventional methods using Gaussian mixture model-hidden Markov models (GMM-HMMs) and Viterbi on the HMM, respectively. The post-processing method further increases the accuracy.
    Keywords: Phoneme Recognition, Deep Neural Network, Hidden Markov Model, Hidden Semi-Markov Model, Extended Viterbi Algorithm
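    A minimal sketch of an extended (segmental) Viterbi decoder for an HSMM with one state per phoneme, taking per-frame phoneme log-likelihoods (as a DNN would produce) plus explicit duration distributions; the inputs here are random placeholders and the real system's details (FarsDat, the post-processing step) are not reproduced.
    ```python
    import numpy as np

    def hsmm_viterbi(frame_lp, trans_lp, dur_lp, init_lp, max_dur):
        """frame_lp[t, s]: log-likelihood of phoneme s at frame t;
        dur_lp[s, d-1]: log-probability that phoneme s lasts exactly d frames;
        trans_lp[s', s]: transition log-probability (set its diagonal to -inf to
        forbid self-transitions). Returns the best (phoneme, duration) segments."""
        T, S = frame_lp.shape
        cum = np.vstack([np.zeros(S), np.cumsum(frame_lp, axis=0)])   # prefix sums
        V = np.full((T + 1, S), -np.inf)      # best score of a segment ending at t
        back = {}
        for t in range(1, T + 1):
            for s in range(S):
                for d in range(1, min(max_dur, t) + 1):
                    emit = cum[t, s] - cum[t - d, s]
                    if d == t:                            # first segment
                        prev, score = None, init_lp[s] + dur_lp[s, d - 1] + emit
                    else:
                        prev = int(np.argmax(V[t - d] + trans_lp[:, s]))
                        score = (V[t - d, prev] + trans_lp[prev, s]
                                 + dur_lp[s, d - 1] + emit)
                    if score > V[t, s]:
                        V[t, s], back[(t, s)] = score, (prev, d)
        s, t, segments = int(np.argmax(V[T])), T, []
        while t > 0:                                       # backtrack
            prev, d = back[(t, s)]
            segments.append((s, d))
            t -= d
            s = prev if prev is not None else s
        return segments[::-1]

    rng = np.random.default_rng(0)
    T, S = 12, 3
    segs = hsmm_viterbi(np.log(rng.dirichlet(np.ones(S), T)),
                        np.log(np.full((S, S), 1.0 / S)),
                        np.log(np.full((S, 4), 0.25)),
                        np.log(np.full(S, 1.0 / S)),
                        max_dur=4)
    print(segs)
    ```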
  • A. Torkaman, R. Safabakhsh * Pages 149-159
    Opponent modeling is a key challenge in Real-Time Strategy (RTS) games, as the environment in these games is adversarial and the player cannot predict the future actions of her opponent. Additionally, the environment is partially observable due to the fog of war. In this paper, we propose an opponent model that is robust to the observation noise caused by the fog of war. In order to cope with the uncertainty inherent in these games, we design a Bayesian network whose parameters are learned from an unlabeled dataset of game logs, so it does not require a human expert’s knowledge. We evaluate our model on StarCraft, which is considered a unified test-bed in this domain. The model is compared with that proposed by Synnaeve and Bessiere. Experimental results on recorded games of human players show that the proposed model predicts the opponent’s future decisions more effectively. Using this model, it is possible to create an adaptive game intelligence algorithm applicable to RTS games in which the concept of build order (the order of building construction) exists.
    Keywords: Bayesian Network, Opponent modeling, Real-Time Strategy games, StarCraft
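    The paper's full Bayesian network needs real game logs, so the sketch below is only a heavily simplified stand-in: a Laplace-smoothed naive Bayes table that predicts the opponent's next build from the buildings observed so far; the tiny hand-written "logs" are purely illustrative.
    ```python
    import numpy as np
    from collections import defaultdict

    class NaiveBuildPredictor:
        """Predict the opponent's next build from observed buildings (naive Bayes
        over binary "building seen" features, with Laplace smoothing)."""
        def fit(self, games):
            # games: list of (observed_buildings: set, next_build: str) pairs
            self.builds = sorted({nb for _, nb in games})
            self.prior = defaultdict(lambda: 1.0)                 # count(nb) + 1
            self.cond = defaultdict(lambda: defaultdict(lambda: 1.0))
            for obs, nb in games:
                self.prior[nb] += 1
                for b in obs:
                    self.cond[nb][b] += 1
            return self

        def predict(self, observed):
            scores = {}
            for nb in self.builds:
                s = np.log(self.prior[nb])                        # unnormalised prior
                s += sum(np.log(self.cond[nb][b] / (self.prior[nb] + 1))
                         for b in observed)
                scores[nb] = s
            return max(scores, key=scores.get)

    logs = [({"barracks"}, "factory"), ({"barracks", "gas"}, "factory"),
            ({"gateway"}, "cybernetics_core"), ({"gateway", "gas"}, "stargate")]
    print(NaiveBuildPredictor().fit(logs).predict({"barracks", "gas"}))
    ```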
  • M. Abtahi * Pages 161-168
    This paper proposes an intelligent approach for dynamic identification of vehicles. The proposed approach is based on data-driven identification and uses a high-performance local model network (LMN) to estimate the vehicle’s longitudinal velocity, lateral acceleration, and yaw rate. The proposed LMN requires no predefined standard vehicle model and uses measurement data to identify the vehicle’s dynamics. The LMN is trained by the hierarchical binary tree (HBT) learning algorithm, which results in a network with maximum generalizability and the best linear or nonlinear structure. The proposed approach is applied to a measurement dataset obtained from a Volvo V70 vehicle to estimate its longitudinal velocity, lateral acceleration, and yaw rate. The identification results reveal that the LMN can accurately identify the vehicle’s dynamics. Furthermore, a comparison of the LMN results with those of a multi-layer perceptron (MLP) neural network demonstrates the far better performance of the proposed approach.
    Keywords: local model network, hierarchical binary tree, vehicle’s dynamics, Identification, neural network
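    A minimal sketch of a local model network on a toy nonlinear system: Gaussian validity functions blend a few local affine models fitted by weighted least squares; the centres are fixed by hand here, whereas the paper learns the partition with a hierarchical binary tree, and no real vehicle data are used.
    ```python
    import numpy as np

    class LocalModelNetwork:
        """Gaussian validity functions around fixed operating points blend
        several local affine models (a simplified LMN)."""
        def __init__(self, centres, width=1.0):
            self.centres = np.atleast_2d(centres).astype(float)
            self.width = width

        def _validities(self, X):
            d2 = ((X[:, None, :] - self.centres[None, :, :]) ** 2).sum(-1)
            w = np.exp(-d2 / (2.0 * self.width ** 2))
            return w / w.sum(axis=1, keepdims=True)

        def fit(self, X, y):
            Phi = np.hstack([X, np.ones((len(X), 1))])          # affine local models
            W = np.sqrt(self._validities(X))                    # weighted least squares
            self.theta = [np.linalg.lstsq(Phi * W[:, [j]], y * W[:, j], rcond=None)[0]
                          for j in range(len(self.centres))]
            return self

        def predict(self, X):
            Phi = np.hstack([X, np.ones((len(X), 1))])
            local = np.column_stack([Phi @ th for th in self.theta])
            return (self._validities(X) * local).sum(axis=1)

    # Toy system y = sin(x); two local models cover the two operating regimes.
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 2 * np.pi, (200, 1))
    y = np.sin(X[:, 0])
    lmn = LocalModelNetwork(centres=[[np.pi / 2], [3 * np.pi / 2]], width=1.5).fit(X, y)
    print("mean absolute error:", np.abs(lmn.predict(X) - y).mean())
    ```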
  • M. YousefiKhoshbakht, N. Mahmoodi Darani * Pages 169-179
    The Open Vehicle Routing Problem (OVRP) is one of the most important extensions of the Vehicle Routing Problem (VRP) and has many applications in industry and services. In the VRP, we are given a set of customers with specified demands for goods and a depot where a fleet of identical capacitated vehicles is located. We are also given the ‘‘traveling costs’’ between the depot and all the customers, and between each pair of customers. In the OVRP, in contrast to the VRP, vehicles are not required to return to the depot after completing service. Because the VRP and OVRP are NP-hard problems, an efficient hybrid elite ant system, called EACO, is proposed in this paper for solving them. In this algorithm, a modified tabu search (TS), a new state transition rule, and a modified pheromone updating rule are used to further improve solutions. These modifications prevent the proposed algorithm from becoming trapped in local optima and allow it to explore different parts of the solution space. Computational results on fourteen standard benchmark instances for the VRP and OVRP show that EACO finds the best known solutions for most of the instances and is comparable, in terms of solution quality, to the best performing metaheuristics published in the literature.
    Keywords: Vehicle Routing Problem, Open Vehicle Routing Problem, Elite Ant System, Tabu Search, NP-hard Problems
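    A minimal sketch of an elitist ant system for a tiny capacitated VRP instance (node 0 is the depot, and each demand is assumed to fit in one vehicle); the paper's tabu-search improvement, modified state transition rule and modified pheromone update, and the OVRP variant are all omitted.
    ```python
    import numpy as np

    def elitist_ant_vrp(dist, demand, capacity, n_ants=20, iters=100,
                        alpha=1.0, beta=2.0, rho=0.1, elite=5.0, seed=0):
        """Ants build capacity-feasible routes with pheromone/visibility-biased
        choices; only the best-so-far solution deposits pheromone (elitism)."""
        rng = np.random.default_rng(seed)
        n = len(dist)
        tau = np.ones((n, n))
        eta = 1.0 / (dist + np.eye(n))                    # visibility
        best_routes, best_len = None, np.inf
        for _ in range(iters):
            for _ in range(n_ants):
                unvisited, routes, length = set(range(1, n)), [], 0.0
                while unvisited:
                    route, load, cur = [], 0.0, 0
                    while True:
                        feas = [j for j in unvisited if load + demand[j] <= capacity]
                        if not feas:
                            break
                        w = np.array([tau[cur, j] ** alpha * eta[cur, j] ** beta
                                      for j in feas])
                        nxt = int(rng.choice(feas, p=w / w.sum()))
                        length += dist[cur, nxt]
                        route.append(nxt); load += demand[nxt]
                        unvisited.discard(nxt); cur = nxt
                    length += dist[cur, 0]                # closed VRP: back to depot
                    routes.append(route)
                if length < best_len:
                    best_len, best_routes = length, routes
            tau *= (1 - rho)                              # evaporation
            for route in best_routes:                     # elitist reinforcement
                prev = 0
                for j in route + [0]:
                    tau[prev, j] += elite / best_len
                    prev = j
        return best_routes, best_len

    pts = np.random.default_rng(1).random((8, 2))
    dist = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
    demand = np.array([0, 2, 3, 2, 4, 3, 2, 1])
    print(elitist_ant_vrp(dist, demand, capacity=6))
    ```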
  • E. Shahsavari, S. Emadi * Pages 181-191
    Service-oriented architecture facilitates run-time interactions by means of business integration over networks. Currently, web services are considered the best option for providing Internet services. Due to the increasing number of web users and the complexity of users’ queries, simple, atomic services are not able to meet users’ needs; providing complex services therefore requires service composition. Web service composition, as an effective approach to integrating the plans of business institutions, has gained significant momentum. Nowadays, web services are created and updated constantly, so in the real world there are many services that may not be composable under the conditions and constraints of the user’s preferred choice. In the proposed method for automatic service composition, the main requirements of the user, including available inputs, expected outputs, quality of service, and their priorities, are initially and explicitly specified by the user, and service composition is performed using this information. In the proposed approach, because there is a large number of services with the same functionality, the candidate services are first reduced using the quality-of-service-based Skyline method, and then all possible solutions are produced using a graph-search-based algorithm. Finally, the user’s semantic constraints are applied to the service compositions, and the best composition is offered according to the user’s requests. The results of this study show that the proposed method is more scalable and efficient, and it offers a better solution by considering the user’s semantic constraints.
    Keywords: Service composition, Web services, Skyline services, Qualitative parameters, Service
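    A minimal sketch of the QoS-based Skyline step that prunes the candidate services: a service is kept only if no other service is at least as good on every QoS attribute and strictly better on one; the attribute set and values are illustrative.
    ```python
    def dominates(a, b):
        """a dominates b if it is no worse in every attribute and better in one
        (all attributes here are oriented so that higher is better)."""
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

    def skyline(services):
        """Keep only the non-dominated services (the QoS-based Skyline)."""
        return {name: q for name, q in services.items()
                if not any(dominates(q2, q)
                           for n2, q2 in services.items() if n2 != name)}

    # QoS vectors: (availability, reliability, 1/response_time).
    candidates = {
        "svcA": (0.99, 0.95, 0.50),
        "svcB": (0.90, 0.90, 0.40),    # dominated by svcA, so pruned
        "svcC": (0.95, 0.99, 0.30),
        "svcD": (0.80, 0.85, 0.90),
    }
    print(sorted(skyline(candidates)))  # ['svcA', 'svcC', 'svcD']
    ```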
  • M. Moradi Zirkohi * Pages 191-200
    In this paper, a high-performance optimal fractional emotional intelligent controller for an Automatic Voltage Regulator (AVR) in a power system is proposed using the Cuckoo Optimization Algorithm (COA). The AVR is the main controller within the excitation system and preserves the terminal voltage of a synchronous generator at a specified level. The proposed control strategy is based on brain emotional learning, a self-tuning controller known as the brain emotional learning based intelligent controller (BELBIC), which operates on sensory inputs and emotional cues. The major contribution of this paper is that, to exploit the merits of fractional-order PID (FOPID) controllers, a FOPID controller is employed to formulate the stimulant input (SI) signal. This is a distinct advantage over published works in the literature, in which a PID controller is used to generate the SI. Another remarkable feature of the proposed approach is that it is a model-free controller. The proposed control strategy can be a promising controller in terms of simplicity of design, ease of implementation, and low time consumption. In addition, in order to enhance the performance of the proposed controller, its parameters are tuned by the COA. To design the BELBIC controller for the AVR system, a multi-objective optimization problem involving overshoot, settling time, rise time, and steady-state error is formulated. Simulation studies confirm that, compared to the classical PID and FOPID controllers introduced in the literature, the proposed controller shows superior performance with respect to model uncertainties. With the proposed controller, the rise time and settling time are improved by 47% and 57%, respectively.
    Keywords: Brain emotional learning based intelligent controller, Cuckoo optimization algorithm, fractional order PID, Automatic Voltage Regulator
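    A minimal sketch of the FOPID part only (the SI-shaping idea), not the BELBIC or COA tuning: both fractional operators are approximated with Grünwald-Letnikov sums over the error history and applied to a toy first-order plant standing in for the AVR loop; all gains, orders and plant constants are illustrative.
    ```python
    import numpy as np

    def gl_weights(order, n):
        """Grünwald-Letnikov coefficients (-1)^j * C(order, j) for j = 0..n."""
        w = np.empty(n + 1); w[0] = 1.0
        for j in range(1, n + 1):
            w[j] = w[j - 1] * (1.0 - (order + 1.0) / j)
        return w

    def fopid_step_response(Kp=1.0, Ki=0.5, Kd=0.2, lam=0.9, mu=0.8,
                            h=0.01, T=5.0, plant_tau=0.5, plant_gain=1.0):
        """Closed-loop unit-step response of a first-order plant under
        u = Kp*e + Ki*D^(-lam) e + Kd*D^(mu) e (discretized with GL sums)."""
        n = int(T / h)
        wi, wd = gl_weights(-lam, n), gl_weights(mu, n)   # integral / derivative
        e_hist = np.zeros(n + 1)
        y, ref, out = 0.0, 1.0, []
        for k in range(n):
            e = ref - y
            e_hist[k] = e
            hist = e_hist[:k + 1][::-1]                   # e(k), e(k-1), ..., e(0)
            u = (Kp * e
                 + Ki * h ** lam * np.dot(wi[:k + 1], hist)    # fractional integral
                 + Kd * h ** -mu * np.dot(wd[:k + 1], hist))   # fractional derivative
            y += h * (plant_gain * u - y) / plant_tau     # Euler step of the plant
            out.append(y)
        return np.array(out)

    resp = fopid_step_response()
    print("final value %.3f, overshoot %.3f" % (resp[-1], max(resp.max() - 1.0, 0.0)))
    ```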
  • S. Ahmadkhani, P. Adibi *, A. Ahmadkhani Pages 201-210
    In this paper, several two-dimensional extensions of the principal component analysis (PCA) and linear discriminant analysis (LDA) techniques have been applied in a lossless dimensionality reduction framework for the face recognition application. In this framework, the benefits of dimensionality reduction were used to improve the performance of the predictive model, which was a support vector machine (SVM) classifier. At the same time, the loss of useful information was minimized using the projection penalty idea. Well-known face databases were used to train and evaluate the proposed methods. The experimental results indicated that the proposed methods had a higher average classification accuracy in general compared to classification based on Euclidean distance, and also compared to methods that first extracted features based on dimensionality reduction techniques and then used an SVM classifier as the predictive model.
    Keywords: Lossless Dimensionality Reduction, Face recognition, Support Vector Machine, (2D)2PCA, (2D)2LDA
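    A minimal sketch of one of the two-dimensional extensions mentioned above, basic 2DPCA feature extraction feeding an SVM, on toy image data; the projection-penalty mechanism and the (2D)2LDA variant are not reproduced.
    ```python
    import numpy as np
    from sklearn.svm import SVC

    def fit_2dpca(images, n_components=4):
        """2DPCA: eigenvectors of G = mean_i (A_i - mean)^T (A_i - mean);
        an image A is then represented by the projection Y = (A - mean) W."""
        A = np.stack(images).astype(float)
        mean = A.mean(axis=0)
        G = sum((a - mean).T @ (a - mean) for a in A) / len(A)
        vals, vecs = np.linalg.eigh(G)
        W = vecs[:, np.argsort(vals)[::-1][:n_components]]      # top eigenvectors
        return mean, W

    def transform_2dpca(images, mean, W):
        A = np.stack(images).astype(float)
        return np.stack([(a - mean) @ W for a in A]).reshape(len(A), -1)

    # Toy 20x15 "face" images in 3 classes with a class-dependent gradient.
    rng = np.random.default_rng(0)
    y = np.repeat([0, 1, 2], 20)
    imgs = [rng.random((20, 15)) + 0.5 * c * np.linspace(0, 1, 15)[None, :] for c in y]
    mean, W = fit_2dpca(imgs)
    X = transform_2dpca(imgs, mean, W)
    clf = SVC(kernel="linear").fit(X, y)
    print("training accuracy: %.2f" % clf.score(X, y))
    ```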