Metaheuristic Optimization Algorithms in Artificial Intelligence: A Comprehensive Systematic Review of Neural Architecture Search, Hyperparameter Optimization, and Intelligent Feature Engineering
Abstract
The intersection of metaheuristic optimization algorithms and Artificial Intelligence (AI) has emerged as a transformative research frontier, yielding significant advances in the automated design and tuning of Machine Learning (ML) models. This paper presents a comprehensive systematic review, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, of 347 peer-reviewed studies published between 2015 and 2025 and drawn from five major scholarly databases: Scopus, Web of Science (WoS), IEEE Xplore, ACM Digital Library, and arXiv. The review investigates three critical domains of AI optimization where metaheuristic algorithms have demonstrated exceptional efficacy: 1) Neural Architecture Search (NAS), encompassing convolutional, recurrent, and transformer architecture design; 2) Hyperparameter Optimization (HPO), covering learning rate tuning, batch size selection, regularization parameter calibration, and optimizer configuration; and 3) intelligent feature engineering, including wrapper-based feature selection, feature construction, and dimensionality reduction. Our analysis reveals that, across the reviewed studies, evolutionary algorithms (Genetic Algorithms (GAs), Differential Evolution (DE)) and swarm intelligence methods (Particle Swarm Optimization (PSO), Grey Wolf Optimizer (GWO), Whale Optimization Algorithm (WOA)) consistently outperform traditional grid search and random search, achieving average accuracy improvements of 2.3%–5.8% while reducing computational cost by 40%–75%. Furthermore, hybrid metaheuristic–AI approaches demonstrate synergistic performance gains exceeding those of standalone methods.
The review also provides a bibliometric analysis, identifies key research trends, highlights methodological challenges (including computational overhead, scalability limitations, and reproducibility concerns), and proposes eight future research directions spanning federated optimization, quantum-inspired metaheuristics, and Large Language Model (LLM) architecture search. This work serves as a comprehensive reference for researchers and practitioners seeking to leverage metaheuristic intelligence for automated AI model optimization.
Keywords:
Metaheuristic algorithms, Artificial intelligence, Neural architecture search, Hyperparameter optimization, Feature selectionReferences
- [1] LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
- [2] Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255–260. https://doi.org/10.1126/science.aaa8415
- [3] Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press Cambridge. https://mitpress.mit.edu/9780262035613/deep-learning/
- [4] Talbi, E. G. (2009). Metaheuristics: From design to implementation. Wiley. https://www.wiley.com/en-us/Metaheuristics%3A+From+Design+to+Implementation+-p-9780470278581
- [5] Yang, X. S. (2020). Nature-inspired optimization algorithms. Elsevier. https://shop.elsevier.com/books/nature-inspired-optimization-algorithms/yang/978-0-12-821986-7
- [6] Boussaïd, I., Lepagnot, J., & Siarry, P. (2013). A survey on optimization metaheuristics. Information sciences, 237, 82–117. https://doi.org/10.1016/j.ins.2013.02.041
- [7] Karimi-Mamaghan, M., Mohammadi, M., Meyer, P., Karimi-Mamaghan, A. M., & Talbi, E. G. (2022). Machine learning at the service of meta-heuristics for solving combinatorial optimization problems: A state-of-the-art. European journal of operational research, 296(2), 393–422. https://doi.org/10.1016/j.ejor.2021.04.032
- [8] Handoko, S. D., Nguyen, D. T., Yuan, Z., & Lau, H. (2014). Reinforcement learning for adaptive operator selection in memetic search applied to quadratic assignment problem. GECCO comp ’14: Proceedings of the companion publication of the 2014 annual conference on genetic and evolutionary computation (pp. 193–194). ACM Digital Library. https://doi.org/10.1145/2598394.2598451
- [9] Dokeroglu, T., Canturk, D., & Kucukyilmaz, T. (2024). A survey on pioneering metaheuristic algorithms between 2019 and 2024. https://doi.org/10.48550/arXiv.2501.14769
- [10] Zoph, B., & Le, Q. V. (2016). Neural architecture search with reinforcement learning. https://doi.org/10.48550/arXiv.1611.01578
- [11] Elsken, T., Metzen, J. H., & Hutter, F. (2019). Neural architecture search: A survey. Journal of machine learning research, 20(55), 1–21. http://jmlr.org/papers/v20/18-598.html
- [12] Feurer, M., & Hutter, F. (2019). Hyperparameter optimization. In Automated machine learning (pp. 3–33). Springer, Cham. https://doi.org/10.1007/978-3-030-05318-5_1
- [13] Yu, T., & Zhu, H. (2020). Hyper-parameter optimization: A review of algorithms and applications. https://doi.org/10.48550/arXiv.2003.05689
- [14] Xue, B., Zhang, M., Browne, W. N., & Yao, X. (2016). A survey on evolutionary computation approaches to feature selection. IEEE transactions on evolutionary computation, 20(4), 606–626. https://doi.org/10.1109/TEVC.2015.2504420
- [15] Li, J., Cheng, K., Wang, S., Morstatter, F., Trevino, R. P., Tang, J., & Liu, H. (2017). Feature selection: A data perspective. ACM computing surveys (CSUR), 50(6), 1–45. https://doi.org/10.1145/3136625
- [16] Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feedforward neural networks. Proceedings of the thirteenth international conference on artificial intelligence and statistics (pp. 249-256). JMLR Workshop and Conference Proceedings. https://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf
- [17] Loshchilov, I., & Hutter, F. (2017). SGDR: Stochastic gradient descent with warm restarts. 5th international conference on learning representations (ICLR 2017) (PP. 1-11). OpenReview. https://researchr.org/publication/LoshchilovH17
- [18] Li, H., Xu, Z., Taylor, G., Studer, C., & Goldstein, T. (2018). Visualizing the loss landscape of neural nets. Advances in neural information processing systems 31 (NeurIPS 2018) (pp. 6389–6399). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2018/hash/a41b3bb3e6b050b6c9067c67f663b915-Abstract.html
- [19] Zhou, Z. H. (2012). Ensemble methods: Foundations and algorithms. CRC Press. https://doi.org/10.1201/b12207
- [20] Holand, J. H. (1975). Adaptation in natural and artificial systems. The MIT Press. https://mitpress.mit.edu/9780262082136/adaptation-in-natural-and-artificial-systems/
- [21] Storn, R., & Price, K. (1997). Differential evolution – A simple and efficient heuristic for global optimization over continuous spaces. Journal of global optimization, 11(4), 341–359. https://doi.org/10.1023/A:1008202821328
- [22] Beyer, H. G., & Schwefel, H. P. (2002). Evolution strategies – A comprehensive introduction. Natural computing, 1(1), 3–52. https://doi.org/10.1023/A:1015059928466
- [23] Kennedy, J., & Eberhart, R. (1995). Particle swarm optimization. Proceedings of ICNN’95 - international conference on neural networks (pp. 1942–1948). IEEE. https://doi.org/10.1109/ICNN.1995.488968
- [24] Dorigo, M., Maniezzo, V., & Colorni, A. (1996). Ant system: Optimization by a colony of cooperating agents. IEEE transactions on systems, man, and cybernetics, part b (cybernetics), 26(1), 29–41. https://doi.org/10.1109/3477.484436
- [25] Karaboga, D. (2005). An idea based on honey bee swarm for numerical optimization. https://abc.erciyes.edu.tr/pub/tr06_2005.pdf
- [26] Mirjalili, S., Mirjalili, S. M., & Lewis, A. (2014). Grey wolf optimizer. Advances in engineering software, 69, 46–61. https://doi.org/10.1016/j.advengsoft.2013.12.007
- [27] Mirjalili, S., & Lewis, A. (2016). The whale optimization algorithm. Advances in engineering software, 95, 51–67. https://doi.org/10.1016/j.advengsoft.2016.01.008
- [28] Heidari, A. A., Mirjalili, S., Faris, H., Aljarah, I., Mafarja, M., & Chen, H. (2019). Harris hawks optimization: Algorithm and applications. Future generation computer systems, 97, 849–872. https://doi.org/10.1016/j.future.2019.02.028
- [29] Mirjalili, S., Gandomi, A. H., Mirjalili, S. Z., Saremi, S., Faris, H., & Mirjalili, S. M. (2017). Salp swarm algorithm: A bio-inspired optimizer for engineering design problems. Advances in engineering software, 114, 163–191. https://doi.org/10.1016/j.advengsoft.2017.07.002
- [30] Kirkpatrick, S., Gelatt, C. D., & Vecchi, M. P. (1983). Optimization by simulated annealing. Science, 220(4598), 671–680. https://doi.org/10.1126/science.220.4598.671
- [31] Mirjalili, S. (2016). SCA: A sine cosine algorithm for solving optimization problems. Knowledge-based systems, 96, 120–133. https://doi.org/10.1016/j.knosys.2015.12.022
- [32] Mirjalili, S. (2015). Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowledge-based systems, 89, 228–249. https://doi.org/10.1016/j.knosys.2015.07.006
- [33] Osaba, E., Del Ser, J., Sadollah, A., Bilbao, M. N., & Camacho, D. (2018). A discrete water cycle algorithm for solving the symmetric and asymmetric traveling salesman problem. Applied soft computing, 71, 277–290. https://doi.org/10.1016/j.asoc.2018.06.047
- [34] Bergstra, J., & Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of machine learning research, 13(2), 281–305. http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf
- [35] Snoek, J., Larochelle, H., & Adams, R. (2012). Practical Bayesian optimization of machine learning algorithms. Advances in neural information processing systems (Vol. 25, pp. 2951–2959). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2012/file/05311655a15b75fab86956663e1819cd-Paper.pdf
- [36] Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2011). Sequential model-based optimization for general algorithm configuration. In Learning and intelligent optimization (pp. 507–523). Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-25566-3_40
- [37] Miikkulainen, R., & Forrest, S. (2021). A biological perspective on evolutionary computation. Nature machine intelligence, 3(1), 9–15. https://doi.org/10.1038/s42256-020-00278-8
- [38] Fister Jr, I., Yang, X.-S., Fister, I., Brest, J., & Fister, D. (2013). A brief review of nature-inspired algorithms for optimization. Elektrotehniski vestnik, 80(3), 116–122. https://www.researchgate.net/publication/249645112
- [39] Abdel-Basset, M., Abdel-Fatah, L., & Sangaiah, A. K. (2018). Metaheuristic algorithms: A comprehensive review. In Computational intelligence for multimedia big data on the cloud with engineering applications (pp. 185–231). Academic Press. https://doi.org/10.1016/B978-0-12-813314-9.00010-4
- [40] Goldberg, D. E. (1989). Genetic algorithms in search, optimization & machine learning. Addison-Wesley. https://www.amazon.fr/Algorithms-Optimization-Learning-Goldberg-published/dp/B00E31KI3G
- [41] Xie, L., & Yuille, A. (2017). Genetic CNN. Proceedings of the IEEE international conference on computer vision (ICCV 2017) (pp. 1379–1388). IEEE. https://doi.org/10.1109/ICCV.2017.154
- [42] Das, S., & Suganthan, P. N. (2011). Differential evolution: A survey of the state-of-the-art. IEEE transactions on evolutionary computation, 15(1), 4–31. https://doi.org/10.1109/TEVC.2010.2059031
- [43] Mafarja, M., Aljarah, I., Faris, H., Hammouri, A. I., Al-Zoubi, A. M., & Mirjalili, S. (2019). Binary grasshopper optimisation algorithm approaches for feature selection problems. Expert systems with applications, 117, 267–286. https://doi.org/10.1016/j.eswa.2018.09.015
- [44] Bischl, B., Richter, J., Becker, M., Binder, M., & Pielok, T. (2023). Hyperparameter optimization: Foundations, algorithms, best practices, and open challenges. WIREs data mining and knowledge discovery, 13(2), 1–43. https://doi.org/10.1002/widm.1484
- [45] White, C., Neiswanger, W., & Savani, Y. (2021). BANANAS: Bayesian optimization with neural architectures for neural architecture search. Proceedings of the AAAI conference on artificial intelligence, (Vol. 35, No. 12, PP. 10293-10301). https://doi.org/10.1609/aaai.v35i12.17233
- [46] Liu, H., Simonyan, K., & Yang, Y. (2018). Darts: Differentiable architecture search. https://doi.org/10.48550/arXiv.1806.09055
- [47] Pham, H., Guan, M., Zoph, B., Le, Q., & Dean, J. (2018). Efficient neural architecture search via parameters sharing. Proceedings of the 35th international conference on machine learning (pp. 4095–4104). Proceedings of Machine Learning Research (PMLR). https://proceedings.mlr.press/v80/pham18a.html
- [48] Zoph, B., Vasudevan, V., Shlens, J., & Le, Q. V. (2018). Learning transferable architectures for scalable image recognition. 2018 IEEE/CVF conference on computer vision and pattern recognition (CVPR) (pp. 8697–8710). IEEE. https://doi.org/10.1109/CVPR.2018.00907
- [49] Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of machine learning research, 3, 1157–1182. https://www.jmlr.org/papers/v3/guyon03a.html
- [50] Eiben, A. E., & Smith, J. E. (2015). Introduction to evolutionary computing. Springer Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44874-8
- [51] Črepinšek, M., Liu, S. H., & Mernik, M. (2013). Exploration and exploitation in evolutionary algorithms: A survey. ACM computing surveys (CSUR), 45(3), 1–33. https://doi.org/10.1145/2480741.2480752
- [52] Alba, E., & Dorronsoro, B. (2008). Cellular genetic algorithms. Springer New York, NY. https://doi.org/10.1007/978-0-387-77610-1
- [53] Morales-Castañeda, B., Zaldívar, D., Cuevas, E., Fausto, F., & Rodríguez, A. (2020). A better balance in metaheuristic algorithms: Does it exist? Swarm and evolutionary computation, 54, 100671. https://doi.org/10.1016/j.swevo.2020.100671
- [54] White, C., Zela, A., Ru, R., Liu, Y., & Hutter, F. (2021). How powerful are performance predictors in neural architecture search? Advances in neural information processing systems (pp. 28454–28469). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2021/hash/ef575e8837d065a1683c022d2077d342-Abstract.html
- [55] Karafotias, G., Hoogendoorn, M., & Eiben, A. E. (2015). Parameter control in evolutionary algorithms: Trends and challenges. IEEE transactions on evolutionary computation, 19(2), 167–187. https://doi.org/10.1109/TEVC.2014.2308294
- [56] Salmani Pour Avval, S., Eskue, N. D., Groves, R. M., & Yaghoubi, V. (2025). Systematic review on neural architecture search. Artificial intelligence review, 58(3), 73. https://doi.org/10.1007/s10462-024-11058-w
- [57] Baker, B., Gupta, O., Raskar, R., & Naik, N. (2017). Accelerating neural architecture search using performance prediction. https://doi.org/10.48550/arXiv.1705.10823
- [58] Li, L., & Talwalkar, A. (2020). Random search and reproducibility for neural architecture search. Proceedings of the 35th uncertainty in artificial intelligence conference (pp. 367–377). Proceedings of Machine Learning Research (PMLR). https://proceedings.mlr.press/v115/li20c.html
- [59] Liu, Y., Sun, Y., Xue, B., Zhang, M., Yen, G. G., & Tan, K. C. (2023). A survey on evolutionary neural architecture search. IEEE transactions on neural networks and learning systems, 34(2), 550–570. https://doi.org/10.1109/TNNLS.2021.3100554
- [60] Real, E., Aggarwal, A., Huang, Y., & Le, Q. V. (2019). Regularized evolution for image classifier architecture search. Proceedings of the AAAI conference on artificial intelligence (pp. 4780–4789). AAAI Press. https://doi.org/10.1609/aaai.v33i01.33014780
- [61] Ren, P., Xiao, Y., Chang, X., Huang, P. Y., Li, Z., Chen, X., & Wang, X. (2021). A comprehensive survey of neural architecture search: Challenges and solutions. ACM computing surveys (CSUR), 54(4), 1–34. https://doi.org/10.1145/3447582
- [62] Page, M. J., McKenzie, J. E., Bossuyt, P. M., Boutron, I., Hoffmann, T. C., Mulrow, C. D., … ., & Moher, D. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. British medical journal, 372. https://doi.org/10.1136/bmj.n71
- [63] Real, E., Liang, C., So, D., & Le, Q. (2020). AutoML-zero: Evolving machine learning algorithms from scratch. Proceedings of the 37th international conference on machine learning (pp. 8007–8019). Proceedings of Machine Learning Research (PMLR). https://proceedings.mlr.press/v119/real20a.html
- [64] Sun, Y., Xue, B., Zhang, M., & Yen, G. G. (2020). Evolving deep convolutional neural networks for image classification. IEEE transactions on evolutionary computation, 24(2), 394–407. https://doi.org/10.1109/TEVC.2019.2916183
- [65] Liu, H., Simonyan, K., Vinyals, O., Fernando, C., & Kavukcuoglu, K. (2017). Hierarchical representations for efficient architecture search. https://doi.org/10.48550/arXiv.1711.00436
- [66] Junior, F. E. F., & Yen, G. G. (2019). Particle swarm optimization of deep neural networks architectures for image classification. Swarm and evolutionary computation, 49, 62–74. https://doi.org/10.1016/j.swevo.2019.05.010
- [67] Brodzicki, A., Piekarski, M., & Jaworek-Korjakowska, J. (2021). The whale optimization algorithm approach for deep neural networks. Sensors, 21(23), 8003. https://doi.org/10.3390/s21238003
- [68] Singh, T., Solanki, A., Sharma, S. K., Jhanjhi, N. Z., & Ghoniem, R. M. (2023). Grey wolf optimization-based CNN-LSTM network for the prediction of energy consumption in smart home environment. IEEE access, 11, 114917-114935. https://doi.org/10.1109/ACCESS.2023.3311751
- [69] Lu, Z., Whalen, I., Boddeti, V., Dhebar, Y., Deb, K., Goodman, E., & Banzhaf, W. (2019). NSGA-net: Neural architecture search using multi-objective genetic algorithm. Proceedings of the genetic and evolutionary computation conference (pp. 419–427). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3321707.3321729
- [70] Lu, Z., Whalen, I., Dhebar, Y., Deb, K., Goodman, E. D., Banzhaf, W., & Boddeti, V. N. (2021). Multiobjective evolutionary design of deep convolutional neural networks for image classification. IEEE transactions on evolutionary computation, 25(2), 277–291. https://doi.org/10.1109/TEVC.2020.3024708
- [71] Gu, H., Wang, H., & Jin, Y. (2022). Surrogate-assisted differential evolution with adaptive multi-subspace search for large-scale expensive optimization. IEEE transactions on evolutionary computation, 27(6), 1765 - 1779. https://doi.org/10.1109/TEVC.2022.3226837
- [72] Ghosh, A., Jana, N. D., & Ghosh, S. (2025). Automated CNN architecture design with enhanced particle swarm optimization. Journal of heuristics, 31(4), 35. https://doi.org/10.1007/s10732-025-09570-5
- [73] Faramarzi, A., Heidarinejad, M., Mirjalili, S., & Gandomi, A. H. (2020). Marine predators algorithm: A nature-inspired metaheuristic. Expert systems with applications, 152, 113377. https://doi.org/10.1016/j.eswa.2020.113377
- [74] Franceschi, L., Donini, M., Perrone, V., Klein, A., Archambeau, C., Seeger, M., … ., & Frasconi, P. (2025). Hyperparameter optimization in machine learning. Foundations and trends in machine learning, 18(6), 975–1109. https://doi.org/10.1561/2200000088
- [75] Lorenzo, P. R., Nalepa, J., Kawulok, M., Ramos, L. S., & Pastor, J. R. (2017). Particle swarm optimization for hyper-parameter selection in deep neural networks. Proceedings of the genetic and evolutionary computation conference (pp. 481–488). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3071178.3071208
- [76] Xue, B., Zhang, M., & Browne, W. N. (2014). Particle swarm optimisation for feature selection in classification: Novel initialisation and updating mechanisms. Applied soft computing, 18, 261–276. https://doi.org/10.1016/j.asoc.2013.09.018
- [77] Ibrahim, M. Q., Hussein, N. K., Guinovart, D., & Qaraad, M. (2025). Optimizing convolutional neural networks: A comprehensive review of hyperparameter tuning through metaheuristic algorithms. Archives of computational methods in engineering, 32(8), 5123–5160. https://doi.org/10.1007/s11831-025-10292-x
- [78] Emary, E., Zawbaa, H. M., & Hassanien, A. E. (2016). Binary grey wolf optimization approaches for feature selection. Neurocomputing, 172, 371–381. https://doi.org/10.1016/j.neucom.2015.06.083
- [79] Albelwi, S., & Mahmood, A. (2017). A framework for designing the architectures of deep convolutional neural networks. Entropy, 19(6), 1–20. https://doi.org/10.3390/e19060242
- [80] Chen, K., & Xie, J. (2025). Hybrid adaptive Wolf-Particle swarm optimization algorithm and its application in CNN neural network hyperparameters optimization. Discover computing, 28(1), 319. https://doi.org/10.1007/s10791-025-09878-7
- [81] Al-Tashi, Q., Abdulkadir, S. J., Rais, H. M., Mirjalili, S., & Alhussian, H. (2020). Binary optimization using hybrid grey wolf optimization for feature selection. IEEE access, 7(1), 39496-39508. https://doi.org/10.1109/ACCESS.2019.2906757
- [82] Ibrahim, R. A., Elaziz, M. A., & Lu, S. (2018). Chaotic opposition-based grey-wolf optimization algorithm based on differential evolution and disruption operator for global optimization. Expert systems with applications, 108, 1–27. https://doi.org/10.1016/j.eswa.2018.04.028
- [83] Yang, X. S. (2010). A new metaheuristic bat-inspired algorithm. In Nature inspired cooperative strategies for optimization (NICSO 2010) (pp. 65–74). Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-12538-6_6
- [84] Coppola, C., Papa, L., Boresta, M., Amerini, I., & Palagi, L. (2024). Tuning parameters of deep neural network training algorithms pays off: A computational study. Transactions in operations research (TOP), 32(3), 579–620. https://doi.org/10.1007/s11750-024-00683-x
- [85] Probst, P., Boulesteix, A. L., & Bischl, B. (2019). Tunability: Importance of hyperparameters of machine learning algorithms. Journal of machine learning research, 20(53), 1–32. http://jmlr.org/papers/v20/18-444.html
- [86] Golovin, D., Solnik, B., Moitra, S., Kochanski, G., Karro, J., & Sculley, D. (2017). Google vizier: A service for black-box optimization. Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining (KDD ’17) (pp. 1487–1495). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3097983.3098043
- [87] Bischl, B., Casalicchio, G., Feurer, M., Gijsbers, P., Hutter, F., Lang, M., ... & Vanschoren, J. (2017). Openml benchmarking suites. https://doi.org/10.48550/arXiv.1708.03731
- [88] Chandrashekar, G., & Sahin, F. (2014). A survey on feature selection methods. Computers & electrical engineering, 40(1), 16–28. https://doi.org/10.1016/j.compeleceng.2013.11.024
- [89] Kaur, A., Chhabbra, A., & Shivani. (2024). A comprehensive review of feature selection techniques with metaheuristic algorithms (2019–2024). International conference on information and communication technology for competitive strategies (pp. 401-417). Singapore: Springer Nature Singapore. https://doi.org/10.1007/978-981-96-4142-0_34
- [90] Hussain, K., Mohd Salleh, M. N., Cheng, S., & Shi, Y. (2019). Metaheuristic research: A comprehensive survey. Artificial intelligence review, 52(4), 2191–2233. https://doi.org/10.1007/s10462-017-9605-z
- [91] Nguyen, B. H., Xue, B., & Zhang, M. (2020). A survey on swarm intelligence approaches to feature selection in data mining. Swarm and evolutionary computation, 54, 100663. https://doi.org/10.1016/j.swevo.2020.100663
- [92] Mafarja, M., & Mirjalili, S. (2018). Whale optimization approaches for wrapper feature selection. Applied soft computing, 62, 441–453. https://doi.org/10.1016/j.asoc.2017.11.006
- [93] Xue, B., Zhang, M., & Browne, W. N. (2013). Particle swarm optimization for feature selection in classification: A multi-objective approach. IEEE transactions on cybernetics, 43(6), 1656–1671. https://doi.org/10.1109/TSMCB.2012.2227469
- [94] Too, J., & Mirjalili, S. (2021). A hyper learning binary dragonfly algorithm for feature selection: A COVID-19 case study. Knowledge-based systems, 212, 106553. https://doi.org/10.1016/j.knosys.2020.106553
- [95] Siedlecki, W., & Sklansky, J. (1989). A note on genetic algorithms for large-scale feature selection. Pattern recognition letters, 10(5), 335–347. https://doi.org/10.1016/0167-8655(89)90037-8
- [96] Faris, H., Mafarja, M. M., Heidari, A. A., Aljarah, I., Al-Zoubi, A. M., Mirjalili, S., & Fujita, H. (2018). An efficient binary Salp Swarm algorithm with crossover scheme for feature selection problems. Knowledge-based systems, 154, 43–67. https://doi.org/10.1016/j.knosys.2018.05.009
- [97] Cui, X., Luo, Q., Zhou, Y., Deng, W., & Yin, S. (2022). Quantum-inspired moth-flame optimizer with enhanced local search strategy for cluster analysis. Frontiers in bioengineering and biotechnology, 10, 908356. https://doi.org/10.3389/fbioe.2022.908356
- [98] Nenavath, H., & Jatoth, R. K. (2018). Hybridizing sine Cosine algorithm with differential evolution for global optimization and object tracking. Applied soft computing, 62, 1019–1043. https://doi.org/10.1016/j.asoc.2017.09.039
- [99] Abd Elaziz, M., Ewees, A. A., Yousri, D., Abualigah, L., & Al-qaness, M. A. A. (2022). Modified marine predators algorithm for feature selection: Case study metabolomics. Knowledge and information systems, 64(1), 261–287. https://doi.org/10.1007/s10115-021-01641-w
- [100] Hancer, E., Xue, B., & Zhang, M. (2018). Differential evolution for filter feature selection based on information theory and feature ranking. Knowledge-based systems, 140, 103–119. https://doi.org/10.1016/j.knosys.2017.10.028
- [101] Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection for Cancer classification using support vector machines. Machine learning, 46(1), 389–422. https://doi.org/10.1023/A:1012487302797
- [102] Mirjalili, S. (2019). Genetic algorithm. In Evolutionary algorithms and neural networks (pp. 43-55). Springer International Publishing. https://www.springerprofessional.de/en/genetic-algorithm/15882800
- [103] Koza, J. R. (1992). Genetic programming on the programming of computers by means of natural selection. MIT Press. https://mitpress.mit.edu/9780262527910/genetic-programming/
- [104] Tran, B., Xue, B., & Zhang, M. (2019). Variable-length particle swarm optimization for feature selection on high-dimensional classification. IEEE transactions on evolutionary computation, 23(3), 473–487. https://doi.org/10.1109/TEVC.2018.2869405
- [105] Ruder, S. (2016). An overview of gradient descent optimization algorithms. https://doi.org/10.48550/arXiv.1609.04747
- [106] Ding, S., Li, H., Su, C., Yu, J., & Jin, F. (2013). Evolutionary artificial neural networks: A review. Artificial intelligence review, 39(3), 251–260. https://doi.org/10.1007/s10462-011-9270-6
- [107] Smith, L. N. (2017). Cyclical learning rates for training neural networks. 2017 IEEE winter conference on applications of computer vision (WACV) (pp. 464–472). IEEE. https://doi.org/10.1109/WACV.2017.58
- [108] Liang, J., Meyerson, E., Hodjat, B., Fink, D., Mutch, K., & Miikkulainen, R. (2019). Evolutionary neural autoML for deep learning. Proceedings of the genetic and evolutionary computation conference (pp. 401–409). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3321707.3321721
- [109] Such, F. P., Madhavan, V., Conti, E., Lehman, J., Stanley, K. O., & Clune, J. (2017). Deep neuroevolution: Genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning. https://doi.org/10.48550/arXiv.1712.06567
- [110] de Campos Souza, P. V., & Sayyadzadeh, I. (2025). GWO-FNN: Fuzzy neural network optimized via grey wolf optimization. Mathematics, 13(7), 1–48. https://doi.org/10.3390/math13071156
- [111] Ingber, L. (1993). Simulated annealing: Practice versus theory. Mathematical and computer modelling, 18(11), 29–57. https://doi.org/10.1016/0895-7177(93)90204-C
- [112] Aljarah, I., Faris, H., & Mirjalili, S. (2018). Optimizing connection weights in neural networks using the whale optimization algorithm. Soft computing, 22(1), 1–15. https://doi.org/10.1007/s00500-016-2442-1
- [113] Stanley, K. O., Clune, J., Lehman, J., & Miikkulainen, R. (2019). Designing neural networks through neuroevolution. Nature machine intelligence, 1(1), 24–35. https://doi.org/10.1038/s42256-018-0006-z
- [114] Kuncheva, L. I. (2014). Combining pattern classifiers: Methods and algorithms. Wiley Online Library. https://doi.org/10.1002/9781118914564
- [115] Brown, G. (2011). Ensemble learning. In Encyclopedia of machine learning (pp. 312–320). Springer. https://doi.org/10.1007/978-0-387-30164-8_252
- [116] Zhou, Z. H., Wu, J., & Tang, W. (2002). Ensembling neural networks: Many could be better than all. Artificial intelligence, 137(1), 239–263. https://doi.org/10.1016/S0004-3702(02)00190-X
- [117] Oliveira, L. S., Sabourin, R., Bortolozzi, F., & Suen, C. Y. (2003). A methodology for feature selection using multiobjective genetic algorithms for handwritten digit string recognition. International journal of pattern recognition and artificial intelligence, 17(06), 903–929. https://doi.org/10.1142/S021800140300271X
- [118] LeDell, E., & Poirier, S. (2020). H2o autoML: Scalable automatic machine learning. 7th ICML workshop on automated machine learning (pp. 1-16). International Machine Learning Society (IMLS). https://www.automl.org/wp-content/uploads/2020/07/AutoML_2020_paper_61.pdf
- [119] Mirjalili, S. (2015). How effective is the grey wolf optimizer in training multi-layer perceptrons. Applied intelligence, 43(1), 150–161. https://doi.org/10.1007/s10489-014-0645-7
- [120] Gharehchopogh, F. S., & Gholizadeh, H. (2019). A comprehensive survey: Whale optimization algorithm and its applications. Swarm and evolutionary computation, 48, 1–24. https://doi.org/10.1016/j.swevo.2019.03.004
- [121] Heidari, A. A., & Pahlavani, P. (2017). An efficient modified grey wolf optimizer with Lévy flight for optimization tasks. Applied soft computing, 60, 115–134. https://doi.org/10.1016/j.asoc.2017.06.044
- [122] Sutton, R. S., & Barto, A. G. (1999). Reinforcement learning: An introduction. MIT Press. https://mitpress.mit.edu/9780262039246/reinforcement-learning/
- [123] Stanley, K. O., & Miikkulainen, R. (2002). Evolving neural networks through augmenting topologies. Evolutionary computation, 10(2), 99–127. https://doi.org/10.1162/106365602320169811
- [124] Salimans, T., Ho, J., Chen, X., Sidor, S., & Sutskever, I. (2017). Evolution strategies as a scalable alternative to reinforcement learning. https://doi.org/10.1162/106365602320169811
- [125] Hansen, N. (2016). The CMA evolution strategy: A tutorial. https://doi.org/10.48550/arXiv.1604.00772
- [126] Jaderberg, M., Dalibard, V., Osindero, S., Czarnecki, W. M., Donahue, J., Razavi, A., … ., & Kavukcuoglu, K. (2017). Population based training of neural networks. https://doi.org/10.48550/arXiv.1711.09846
- [127] Demšar, J. (2006). Statistical comparisons of classifiers over multiple data sets. Journal of machine learning research, 7, 1–30. https://www.researchgate.net/publication/220320196
- [128] García, S., Fernández, A., Luengo, J., & Herrera, F. (2010). Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power. Information sciences, 180(10), 2044–2064. https://doi.org/10.1016/j.ins.2009.12.010
- [129] Shahriari, B., Swersky, K., Wang, Z., Adams, R. P., & de Freitas, N. (2016). Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1), 148–175. https://doi.org/10.1109/JPROC.2015.2494218
- [130] Falkner, S., Klein, A., & Hutter, F. (2018). BOHB: Robust and efficient hyperparameter optimization at scale. Proceedings of the 35th international conference on machine learning (pp. 1437–1446). Proceedings of Machine Learning Research (PMLR). https://proceedings.mlr.press/v80/falkner18a.html
- [131] Agrawal, T., & Choudhary, P. (2022). Metaheuristic optimization algorithms. Morgan Kaufmann. https://www.sciencedirect.com/book/edited-volume/9780443139253/metaheuristic-optimization-algorithms
- [132] Jin, Y. (2011). Surrogate-assisted evolutionary computation: Recent advances and future challenges. Swarm and evolutionary computation, 1(2), 61–70. https://doi.org/10.1016/j.swevo.2011.05.001
- [133] Greff, K., Srivastava, R. K., Koutník, J., Steunebrink, B. R., & Schmidhuber, J. (2017). LSTM: A search space odyssey. IEEE transactions on neural networks and learning systems, 28(10), 2222–2232. https://doi.org/10.1109/TNNLS.2016.2582924
- [134] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … Polosukhin, I. (2017). Attention is all you need. Advances in neural information processing systems (Vol. 30, pp. 5998–6008). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html
- [135] Tekkali, C., & Natarajan, K. (2023). Smart fraud detection in E-transactions using synthetic minority oversampling and binary Harris Hawks optimization. Computers, materials, & continua, 75(2), 3171. https://doi.org/10.32604/cmc.2023.036865
- [136] Ma, L., Liu, Y., Zhang, X., Ye, Y., Yin, G., & Johnson, B. A. (2019). Deep learning in remote sensing applications: A meta-analysis and review. ISPRS journal of photogrammetry and remote sensing, 152, 166–177. https://doi.org/10.1016/j.isprsjprs.2019.04.015
- [137] Litjens, G., Kooi, T., Bejnordi, B. E., Setio, A. A. A., Ciompi, F., Ghafoorian, M., … Sánchez, C. I. (2017). A survey on deep learning in medical image analysis. Medical image analysis, 42, 60–88. https://doi.org/10.1016/j.media.2017.07.005
- [138] Liashchynskyi, P., & Liashchynskyi, P. (2019). Grid search, random search, genetic algorithm: A big comparison for NAS. https://doi.org/10.48550/arXiv.1912.06059
- [139] Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE transactions on evolutionary computation, 1(1), 67–82. https://doi.org/10.1109/4235.585893
- [140] Zhang, D., Mishra, S., Brynjolfsson, E., Etchemendy, J., Ganguli, D., Grosz, B., … Perrault, R. (2024). The 2024 AI index report. https://hai.stanford.edu/ai-index/2024-ai-index-report?hl=en-US
- [141] Wistuba, M., Rawat, A., & Pedapati, T. (2019). A survey on neural architecture search. https://doi.org/10.48550/arXiv.1905.01392
- [142] Cobo, M. J., López-Herrera, A. G., Herrera-Viedma, E., & Herrera, F. (2011). Science mapping software tools: Review, analysis, and cooperative study among tools. Journal of the American society for information science and technology, 62(7), 1382–1402. https://doi.org/10.1002/asi.21525
- [143] Osaba, E., Yang, X.-S., & Del Ser, J. (2020). Traveling salesman problem: A perspective review of recent research and new results with bio-inspired metaheuristics. In Nature-inspired computation and swarm intelligence (pp. 135–164). Academic Press. https://doi.org/10.1016/B978-0-12-819714-1.00020-8
- [144] He, X., Zhao, K., & Chu, X. (2021). AutoML: A survey of the state-of-the-art. Knowledge-based systems, 212, 106622. https://doi.org/10.1016/j.knosys.2020.106622
- [145] Cai, H., Zhu, L., & Han, S. (2018). ProxylessNAS: Direct neural architecture search on target task and hardware. https://doi.org/10.48550/arXiv.1812.00332
- [146] Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and policy considerations for deep learning in NLP. Proceedings of the 57th annual meeting of the association for computational linguistics (pp. 3645–3650). Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1355
- [147] Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., … Amodei, D. (2020). Language models are few-shot learners. Advances in neural information processing systems (Vol. 33, pp. 1877–1901). Neural Information Processing Systems Foundation. https://dl.acm.org/doi/abs/10.5555/3495724.3495883
- [148] Ying, C., Klein, A., Christiansen, E., Real, E., Murphy, K., & Hutter, F. (2019). NAS-Bench-101: Towards reproducible neural architecture search. Proceedings of the 36th international conference on machine learning (pp. 7105–7114). Proceedings of Machine Learning Research (PMLR). https://proceedings.mlr.press/v97/ying19a.html
- [149] Zela, A., Siems, J., & Hutter, F. (2020). NAS-Bench-1Shot1: Benchmarking and dissecting one-shot neural architecture search. https://doi.org/10.48550/arXiv.2001.10422
- [150] Wong, C., Houlsby, N., Lu, Y., & Gesmundo, A. (2018). Transfer learning with neural AutoML. Advances in neural information processing systems (pp. 8356–8365). Neural Information Processing Systems Foundation. https://proceedings.neurips.cc/paper_files/paper/2018/hash/bdb3c278f45e6734c35733d24299d3f4-Abstract.html
- [151] Henderson, P., Islam, R., Bachman, P., Pineau, J., Precup, D., & Meger, D. (2018). Deep reinforcement learning that matters. Proceedings of the AAAI conference on artificial intelligence (pp. 3207–3214). Association for the Advancement of Artificial Intelligence (AAAI). https://doi.org/10.1609/aaai.v32i1.11694
- [152] Lindauer, M., & Hutter, F. (2020). Best practices for scientific research on neural architecture search. Journal of machine learning research, 21(243), 1–18. http://jmlr.org/papers/v21/20-056.html
- [153] He, J., & Yao, X. (2001). Drift analysis and average time complexity of evolutionary algorithms. Artificial intelligence, 127(1), 57–85. https://doi.org/10.1016/S0004-3702(01)00058-3
- [154] Dong, X., & Yang, Y. (2020). NAS-Bench-201: Extending the scope of reproducible neural architecture search. https://doi.org/10.48550/arXiv.2001.00326
- [155] Siems, J., Zimmer, L., Zela, A., Lukasik, J., Keuber, M., & Hutter, F. (2021). NAS-Bench-301 and the case for surrogate benchmarks for neural architecture search. International conference on learning representations (pp. 1–11). OpenReview. https://ml.informatik.uni-freiburg.de/wp-content/uploads/papers/20-NIPS_WML-NB301.pdf
- [156] Mellor, J., Turner, J., Storkey, A., & Crowley, E. J. (2021). Neural architecture search without training. Proceedings of the 38th international conference on machine learning (pp. 7588–7598). Proceedings of Machine Learning Research (PMLR). https://proceedings.mlr.press/v139/mellor21a.html
- [157] Chen, W., Gong, X., & Wang, Z. (2021). Neural architecture search on ImageNet in four GPU hours: A theoretically inspired perspective. https://doi.org/10.48550/arXiv.2102.11535
- [158] Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M. A., Lacroix, T., … Lample, G. (2023). LLaMA: Open and efficient foundation language models. https://doi.org/10.48550/arXiv.2302.13971
- [159] Hu, E. J., Shen, Y., Wallis, P., Allen-Zhu, Z., Li, Y., Wang, S., … Chen, W. (2022). LoRA: Low-rank adaptation of large language models. International conference on learning representations (ICLR). https://arxiv.org/pdf/2106.09685v1
- [160] McMahan, B., Moore, E., Ramage, D., Hampson, S., & Agüera y Arcas, B. (2017). Communication-efficient learning of deep networks from decentralized data. Proceedings of the 20th international conference on artificial intelligence and statistics (AISTATS 2017) (pp. 1273–1282). Proceedings of Machine Learning Research (PMLR). https://proceedings.mlr.press/v54/mcmahan17a.html
- [161] Schwartz, R., Dodge, J., Smith, N. A., & Etzioni, O. (2020). Green AI. Communications of the ACM, 63(12), 54–63. https://doi.org/10.1145/3381831
- [162] Zhang, G. (2011). Quantum-inspired evolutionary algorithms: A survey and empirical study. Journal of heuristics, 17(3), 303–351. https://doi.org/10.1007/s10732-010-9136-0
- [163] Aleti, A., & Moser, I. (2016). A systematic literature review of adaptive parameter control methods for evolutionary algorithms. ACM computing surveys, 49(3), 1–35. https://doi.org/10.1145/2996355
- [164] Li, K., Fialho, Á., Kwong, S., & Zhang, Q. (2014). Adaptive operator selection with bandits for a multiobjective evolutionary algorithm based on decomposition. IEEE transactions on evolutionary computation, 18(1), 114–130. https://doi.org/10.1109/TEVC.2013.2239648
- [165] Gaier, A., & Ha, D. (2019). Weight agnostic neural networks. Advances in neural information processing systems (Vol. 32, pp. 5365–5379). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2019/hash/e98741479a7b998f88b8f8c9f0b6b6f1-Abstract.html
- [166] Thornton, C., Hutter, F., Hoos, H. H., & Leyton-Brown, K. (2013). Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms. Proceedings of the 19th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 847–855). New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/2487575.2487629
- [167] Feurer, M., Klein, A., Eggensperger, K., Springenberg, J., Blum, M., & Hutter, F. (2015). Efficient and robust automated machine learning. Advances in neural information processing systems (Vol. 28, pp. 2962–2970). Curran Associates, Inc. https://proceedings.neurips.cc/paper/2015/hash/11d0e6287202fced83f79975ec59a3a6-Abstract.html