Approach to Textual Data Analysis

  • O. Babomuradov Executive director of the Kazan Federal University branch in Jizzakh, Jizzakh, Uzbekistan
  • O. Turakulov Tashkent University of Information Technologies named after Muhammad al-Khwarizmi, Tashkent, Uzbekistan
  • Sh. Karaxanova Tashkent University of Information Technologies named after Muhammad al-Khwarizmi, Tashkent, Uzbekistan
Keywords: textual data

Abstract

This manuscript considers approaches to processing textual data, on the basis of which models and algorithms for the classification and analysis of textual data are proposed. The developed algorithms serve to improve the efficiency of classifying and analyzing textual data. A core algorithm for analyzing textual documents, a modification of a dictionary search algorithm, and algorithms A1, A2, and A3 for classification and analysis have been developed. Software was built on the basis of these algorithms and used in experimental research, with a study sample of 2000 words. The knowledge base is dynamic and expands during the training process.

Published
2023-10-27
How to Cite
Babomuradov, O., Turakulov, O., & Karaxanova, S. (2023). Approach to Textual Data Analysis. Central Asian Journal of Theoretical and Applied Science, 4(10), 170-180. Retrieved from https://cajotas.centralasianstudies.org/index.php/CAJOTAS/article/view/1314