A new AI-based method for clustering survey responses

Special Issue 5/2023, vol. 54

Jan Franciszek Laskowski ¹, Paweł Tomiło ¹

¹ Lublin University of Technology

Submission date: 2023-07-25

Acceptance date: 2023-12-01

Publication date: 2023-12-18

Corresponding author

Jan Franciszek Laskowski

Lublin University of Technology

JoMS 2023;54(Special Issue 5):355-377

DOI: https://doi.org/10.13166/jms/176171

KEYWORDS

Survey data analysis

artificial intelligence

variational autoencoder (VAE)

machine learning

pattern discovery

exploratory data analysis

TOPICS

Management science

ABSTRACT

Objectives:
Many research projects, particularly in the social sciences, depend on clustering survey responses. Traditional clustering algorithms have several drawbacks when applied to survey data, and recent developments in artificial intelligence (AI) and machine learning (ML) have made it possible to analyze such data more effectively. The aim of this article is to present a new, AI-based method of clustering survey responses using a Variational Autoencoder (VAE).
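
For orientation, a VAE is generally trained by maximizing the evidence lower bound (ELBO). The abstract does not state the architecture or the exact loss used in the article, so the expression below is the standard textbook formulation, not necessarily the authors' objective. The notation is assumed here: x is a respondent's encoded answers, z is the latent code, q_φ is the encoder, p_θ is the decoder, and p(z) is a standard normal prior.

```latex
\mathcal{L}(\theta,\phi;x)
  = \mathbb{E}_{q_\phi(z\mid x)}\bigl[\log p_\theta(x\mid z)\bigr]
  - D_{\mathrm{KL}}\bigl(q_\phi(z\mid x)\,\|\,p(z)\bigr),
\qquad p(z)=\mathcal{N}(0, I)
```

The first term rewards accurate reconstruction of the answers and the second keeps the latent codes close to the prior. Clustering would then typically operate on the latent representation z; how the article derives cluster assignments from the VAE is not stated in the abstract.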

Material and methods:
To determine the effectiveness of grouping, the new VAE clustering method was compared with three alternatives: K-means, PCA followed by K-means, and Agglomerative Hierarchical Clustering. The comparison used three metrics: the Silhouette score, the Calinski-Harabasz score, and the Davies-Bouldin score.
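
The three metrics can be illustrated with a minimal pure-Python sketch based on their standard definitions. The toy data `X` and the two labelings `good` and `bad` are hypothetical and are not taken from the article, and the article's own implementation is not described in the abstract. Any clustering method assigns one label per respondent, so the same functions can score each method's output. Higher is better for the Silhouette and Calinski-Harabasz scores; lower is better for the Davies-Bouldin score.

```python
import math
from collections import defaultdict


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def group_by_label(X, labels):
    """Map each cluster label to the list of points assigned to it."""
    clusters = defaultdict(list)
    for x, label in zip(X, labels):
        clusters[label].append(x)
    return clusters


def silhouette_score(X, labels):
    """Mean of (b - a) / max(a, b) over all points; needs >= 2 clusters."""
    clusters = group_by_label(X, labels)
    total = 0.0
    for x, label in zip(X, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes a silhouette of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(x, y) for y in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(sum(math.dist(x, y) for y in pts) / len(pts)
                for other, pts in clusters.items() if other != label)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(X)


def calinski_harabasz(X, labels):
    """Between-cluster dispersion over within-cluster dispersion."""
    clusters = group_by_label(X, labels)
    n, k = len(X), len(clusters)
    overall = centroid(X)
    cents = {label: centroid(pts) for label, pts in clusters.items()}
    between = sum(len(pts) * math.dist(cents[label], overall) ** 2
                  for label, pts in clusters.items())
    within = sum(math.dist(x, cents[label]) ** 2
                 for label, pts in clusters.items() for x in pts)
    return (between / (k - 1)) / (within / (n - k))


def davies_bouldin(X, labels):
    """Mean over clusters of the worst (s_i + s_j) / d(c_i, c_j) ratio."""
    clusters = group_by_label(X, labels)
    cents = {label: centroid(pts) for label, pts in clusters.items()}
    # s_i: mean distance of a cluster's points to its own centroid
    scatter = {label: sum(math.dist(x, cents[label]) for x in pts) / len(pts)
               for label, pts in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                 for j in clusters if j != i)
             for i in clusters]
    return sum(worst) / len(worst)


# Hypothetical toy data: two well-separated groups of four points each.
X = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
good = [0, 0, 0, 0, 1, 1, 1, 1]  # labels that match the two groups
bad = [0, 1, 0, 1, 0, 1, 0, 1]   # labels that cut across both groups

for name, labels in (("good", good), ("bad", bad)):
    print(name,
          silhouette_score(X, labels),
          calinski_harabasz(X, labels),
          davies_bouldin(X, labels))
```

On this toy data the labeling that follows the two groups scores better on all three metrics than the labeling that mixes them, which is the kind of comparison the metrics are used for.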

Results:
For the Silhouette score, the developed VAE method achieved a 69% higher average effectiveness in clustering survey responses than the other methods. For the Calinski-Harabasz score and the Davies-Bouldin score, the VAE method outperformed the other methods by 164% and 111%, respectively.

Conclusions:
Of the methods compared, the VAE method grouped respondents' answers most effectively and made it possible to capture complex relationships and patterns in the data. In addition, the method is suitable for analyzing different types of survey data (continuous, categorical, and mixed data) and is robust to noise and missing data.

License

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA) license.


eISSN: 2391-789X
ISSN: 1734-2031
