Amarasinghe, L. W. and Nawarathna, R. D. (2021) Automatic Generation of Scripts for Database Creation from Scenario Descriptions. Asian Journal of Research in Computer Science, 7 (3). pp. 34-48. ISSN 2581-8260
Abstract
Aims: Database creation is the most critical component of the design and implementation of any software application. Generally, the process of creating the database from the requirement specification of a software application is believed to be extremely hard. This study presents a method to automatically generate database scripts from a given scenario description of the requirement specification.
Study Design: The method is developed based on a set of natural language processing (NLP) techniques and a few algorithms. Standard database scenario descriptions presented in popular textbooks on Database Design are used for the validation of the method.
Place and Duration of Study: Department of Statistics and Computer Science, Faculty of Science, University of Peradeniya, Sri Lanka, between December 2019 and December 2020.
Methodology: The description of the problem scenario is processed using NLP operations such as tokenization, complex word handling, basic group handling, complex phrase handling, structure merging, and template construction to extract the information required for the entity-relationship model. New algorithms are proposed to automatically convert the entity-relationship model to the logical schema and finally to the database script. The system can generate scripts for relational databases (RDB), object-relational databases (ORDB) and Not Only SQL (NoSQL) databases. The proposed method is integrated into a web application where users can type the scenario in natural or free text. The user selects the type of database used in their system (one of RDB, ORDB or NoSQL), and the application generates the corresponding scripts.
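The last step of this pipeline, going from an entity-relationship model to a relational script, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the `Attribute`, `Entity` and `Relationship` classes, the `to_sql` function, and the Department/Employee scenario are all hypothetical names chosen for illustration. The sketch assumes the NLP stage has already produced the entities, and it covers only one-to-many relationships for the RDB case.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    name: str
    sql_type: str = "VARCHAR(255)"
    primary_key: bool = False


@dataclass
class Entity:
    name: str
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class Relationship:
    # Hypothetical one-to-many link: each `many` row references one `one` row.
    one: str
    many: str


def to_sql(entities: List[Entity], relationships: List[Relationship]) -> str:
    """Emit one CREATE TABLE statement per entity (illustrative only).

    Entities are emitted in the order given, so the caller should list
    referenced (parent) entities before the entities that reference them.
    """
    by_name = {e.name: e for e in entities}
    statements = []
    for entity in entities:
        columns = [f"{a.name} {a.sql_type}" for a in entity.attributes]
        constraints = []
        keys = [a.name for a in entity.attributes if a.primary_key]
        if keys:
            constraints.append(f"PRIMARY KEY ({', '.join(keys)})")
        # The "many" side of each relationship receives the foreign key.
        for rel in relationships:
            if rel.many != entity.name:
                continue
            parent = by_name[rel.one]
            parent_keys = [a for a in parent.attributes if a.primary_key]
            fk_cols = [f"{parent.name.lower()}_{a.name}" for a in parent_keys]
            columns += [f"{c} {a.sql_type}" for c, a in zip(fk_cols, parent_keys)]
            constraints.append(
                f"FOREIGN KEY ({', '.join(fk_cols)}) "
                f"REFERENCES {parent.name} ({', '.join(a.name for a in parent_keys)})"
            )
        body = ",\n  ".join(columns + constraints)
        statements.append(f"CREATE TABLE {entity.name} (\n  {body}\n);")
    return "\n\n".join(statements)


# Assumed toy scenario: "A department has many employees."
department = Entity("Department", [Attribute("id", "INTEGER", True), Attribute("name")])
employee = Entity("Employee", [Attribute("id", "INTEGER", True), Attribute("name")])
script = to_sql([department, employee], [Relationship("Department", "Employee")])
print(script)
```

Placing the foreign key on the "many" side is the usual textbook mapping for one-to-many relationships. Many-to-many relationships, ORDB types and NoSQL collections would each need their own mapping rules, which this sketch does not attempt.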
Results: The proposed method was evaluated on 10 scenario descriptions drawn from 10 different domains (e.g., company, university, airport) for all three types of databases. It achieved accuracies of 82.5%, 84.0% and 83.5% for RDB, ORDB and NoSQL scripts, respectively.
Conclusion: This study focuses on the automatic generation of SQL scripts from scenario descriptions in the requirement specification of a software system. Overall, the developed method helps to speed up the database development process. Further, the developed web application provides a learning environment for novices in database technology.
Item Type: | Article |
---|---|
Subjects: | STM Digital Library > Computer Science |
Depositing User: | Unnamed user with email support@stmdigitallib.com |
Date Deposited: | 24 Feb 2023 07:42 |
Last Modified: | 07 May 2024 05:06 |
URI: | http://archive.scholarstm.com/id/eprint/122 |