A comparative analysis of deep neural network architectures for sentence classification using genetic algorithm

Rogers, Brendan; Noman, Nasimul; Chalup, Stephan; Moscato, Pablo

Title: A comparative analysis of deep neural network architectures for sentence classification using genetic algorithm
Creator: Rogers, Brendan; Noman, Nasimul; Chalup, Stephan; Moscato, Pablo
Relation: Evolutionary Intelligence Vol. 17, p. 1933-1952
Publisher Link: http://dx.doi.org/10.1007/s12065-023-00874-8
Publisher: Springer
Resource Type: journal article
Date: 2024
Description: Because of the number of different architectures, the numerous settings of their hyperparameters and the disparity among their sizes, it is difficult to equitably compare various deep neural network (DNN) architectures for sentence classification. Evolutionary algorithms are emerging as a popular method for the automatic selection of architectures and hyperparameters for DNNs, whose generalisation performance is heavily impacted by such settings. Most of the work in this area is done in the image domain, leaving text analysis, another prominent application domain of deep learning, largely unexplored. Besides, the literature presents conflicting claims regarding the superiority of one DNN architecture over others in the context of sentence classification. To address this issue, we propose a genetic algorithm (GA) for optimising the architectural and hyperparameter settings in different DNN types for sentence classification. To enable the representation of the wide variety of architectures and hyperparameters utilised in DNNs, we employed a generalised and flexible encoding scheme in our GA. Our study involves optimising two convolutional and three recurrent architectures to ensure a fair and unbiased evaluation of their performance. Furthermore, we explore the effects of using F1 score versus accuracy as a performance metric during evolutionary optimisation of those architectures. Our results, using ten datasets, show that, in general, the architectures and hyperparameters evolved using the F1 score tended to outperform those evolved using accuracy, and in the case of CNN and BiLSTM the differences were statistically significant. Of the five architectures considered, the GA-evolved gated recurrent unit (GRU) performed the strongest overall, achieving good generalisation performance while using relatively few trainable parameters, establishing GRU as the preferred architecture for the sentence classification task. The optimised architectures exhibited performance comparable to the state of the art, given the large difference in trainable parameters.
Subject: sentence classification; hyperparameter optimisation; deep neural network (DNN); genetic algorithm
Identifier: http://hdl.handle.net/1959.13/1505115
Identifier: uon:55626
Identifier: ISSN:1864-5909
Language: eng
Reviewed
