XML Database: An Introduction and Analysis Essay Example

  • Pages: 17 (4643 words)
  • Published: August 1, 2018
  • Type: Case Analysis

XML (eXtensible Markup Language) is a widely used data format and internet standard for information exchange. To store and retrieve the large volumes of data held in numerous XML documents efficiently, it becomes essential to use an RDBMS (Relational Database Management System).

There are two popular methods for converting an XML document into a relational DBMS: SAX and DOM parsing. This research explores both approaches and compares their performance. Additionally, different ways of structuring and tagging data from RDBMS tables as hierarchical XML documents have been studied. The goal is to determine the best alternative for capturing and querying XML data using RDBMS that offers optimal performance.
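The difference between the two parsing styles can be sketched in a few lines. The example below is a minimal illustration in Python rather than the C# used in the project; the sample document and the `TitleHandler` class are hypothetical. SAX reacts to events while the document streams past, whereas DOM loads the whole tree into memory before it is navigated.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document; the element names are illustrative only.
XML_TEXT = """<dvds>
  <dvd id="1"><title>Alpha</title></dvd>
  <dvd id="2"><title>Beta</title></dvd>
</dvds>"""


class TitleHandler(xml.sax.ContentHandler):
    """SAX: react to parse events as they stream past; no tree is built."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect it until the tag closes.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


handler = TitleHandler()
xml.sax.parseString(XML_TEXT.encode("utf-8"), handler)
sax_titles = handler.titles

# DOM: load the whole document into memory, then navigate the tree.
dom = minidom.parseString(XML_TEXT)
dom_titles = [node.firstChild.data for node in dom.getElementsByTagName("title")]
```

Both approaches yield the same titles here; the trade-off is that SAX keeps memory use low for large documents, while DOM makes random access to the parsed tree easier.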

It is important to execute the request query and save the process in an XML document when loading a large amount of data. One approach to achieve this is by using a native XML database system. However, this method has two weaknesses:

The XML database system is insufficient for storing data and cannot handle complex queries like a relational DBMS.

It is not possible for users to access XML documents and data stored in a relational DBMS.

To address the mentioned limitations, XML data techniques for querying and storing are employed using RDBMS. The approach involves the following steps:

1. Data or an XML document can be saved by implementing relational table design.

2. In the presented table, XML data is divided by separating it into columns.

3. SQL queries are used to access XML documents stored in RDBMS data format.
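The three steps can be sketched as follows. This is a minimal illustration under stated assumptions: Python's built-in `sqlite3` stands in for the relational DBMS, and the sample document, the `dvd` table, and its columns are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input; sqlite3 stands in for the relational DBMS.
XML_TEXT = """<dvds>
  <dvd><title>Alpha</title><retailPrice>12</retailPrice></dvd>
  <dvd><title>Beta</title><retailPrice>8</retailPrice></dvd>
</dvds>"""

conn = sqlite3.connect(":memory:")

# Step 1: a relational table designed to hold the XML data.
conn.execute("CREATE TABLE dvd (title TEXT NOT NULL, retailPrice INTEGER NOT NULL)")

# Step 2: split each XML record into column values.
rows = [
    (dvd.findtext("title"), int(dvd.findtext("retailPrice")))
    for dvd in ET.fromstring(XML_TEXT).iter("dvd")
]
conn.executemany("INSERT INTO dvd (title, retailPrice) VALUES (?, ?)", rows)

# Step 3: query the stored data with ordinary SQL.
cheap_titles = [
    title
    for (title,) in conn.execute(
        "SELECT title FROM dvd WHERE retailPrice < ? ORDER BY title", (10,)
    )
]
```

Once the data sits in columns, filtering and sorting are expressed in SQL rather than by walking the XML tree.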

The process of transferring data from XML to a database involves utilizing C# as middleware. This can be accomplished either with the SAX parsing technique or by employing the XML Tree Class.

The reverse conversion, from a relational SQL Server 2008 database to an XML document, relies on tagging and structuring, with a C# script acting as middleware. Any alternative in which tagging and structuring happen outside the database engine means that part of the task was completed independently of the relational database engine.
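Tagging and structuring outside the database engine might look like the sketch below. It assumes Python in place of the C# middleware and `sqlite3` in place of SQL Server; the `dvd` table and its rows are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dvd (dvdId INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO dvd VALUES (?, ?)", [(1, "Alpha"), (2, "Beta")])

# Structuring: one <dvd> element per row, under a single root element.
root = ET.Element("dvds")
cursor = conn.execute("SELECT dvdId, title FROM dvd ORDER BY dvdId")
columns = [description[0] for description in cursor.description]
for row in cursor:
    dvd = ET.SubElement(root, "dvd")
    # Tagging: each column value is wrapped in an element named after its column.
    for column, value in zip(columns, row):
        ET.SubElement(dvd, column).text = str(value)

xml_text = ET.tostring(root, encoding="unicode")
```

Here the database only returns rows; the hierarchy and the tag names are decided entirely by the middleware.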

The following tasks were performed to compare how quickly the XML document and the RDBMS load in the browser:

The task of searching the data in the XML document is accomplished by utilizing the DATA binding technique.

The DOM Tree method is used to present XML data from an SQL database, which was originally stored in a relational database management system (RDBMS). The XML document is searched for information and the query result is saved as another XML Document.
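Searching an XML document and keeping the matches as a second XML document can be sketched as below. The sample data and the `result` root element are hypothetical, and Python's `ElementTree` (which supports only a subset of XPath) is used in place of the project's C# code.

```python
import xml.etree.ElementTree as ET

SOURCE = """<dvds>
  <dvd><title>Alpha</title><studio>North</studio></dvd>
  <dvd><title>Beta</title><studio>South</studio></dvd>
  <dvd><title>Gamma</title><studio>North</studio></dvd>
</dvds>"""

tree = ET.fromstring(SOURCE)

# The predicate keeps <dvd> elements whose <studio> child has the text 'North'.
matches = tree.findall("./dvd[studio='North']")

# Collect the matches under a new root so they form a document of their own.
result = ET.Element("result")
result.extend(matches)
result_xml = ET.tostring(result, encoding="unicode")
matched_titles = [dvd.findtext("title") for dvd in result]
```

To persist the result, `ET.ElementTree(result).write(path)` would write it to a file of the caller's choosing.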

XML documents can be challenging to update.

Constraints cannot be enforced on them.

Optimizing the XML database poses a difficult challenge.

Maintaining consistency is challenging.

Data storage and data transmission are distinct concepts. Information is stored in data storage and its validity is verified during storage. In contrast, data transmission refers to transferring data between systems. XML is often used for facilitating data transmission procedures.

Feature               RDBMS               XML
Metadata definition   CREATE TABLE        XML Schema Definition
Persistence           INSERT and UPDATE   CREATE XML DOCUMENT
Query                 SELECT              XPath and XQuery

Creating databases with objects provides a major speed advantage. In an Object Oriented Database Management System (OODBMS), data is stored as objects instead of the rows and columns commonly found in a relational DBMS. This distinction allows OODBMS to outperform relational DBMS in terms of speed.

When it comes to certain tasks, OODBMS is a more suitable option compared to RDBMS. This is because OODBMS utilizes navigational interfaces for performing various operations, while RDBMS relies on declarative interfaces. Moreover, OODBMS efficiently implements navigational access to data by utilizing pointers.

One disadvantage of using RDBMS is the requirement for a relational mapping layer to match the entire model for application objects to the database object model. In contrast, OODBMS does not have this necessity. The presence of mapping in RDBMS can lead to an impedance mismatch, which is not an issue in OODBMS. Furthermore, OODBMS has the added advantage of improved performance.

OODBMS has a few disadvantages including:

As mentioned earlier, the use of pointers in OODBMS to facilitate navigational access to data is considered a drawback. This is because it often results in slower query processing and increased complexity compared to relational databases.

One disadvantage of OODBMS is that it lacks a mathematical foundation, unlike RDBMS. This results in OODBMS being weaker in query support compared to RDBMS.

The distinction between data-base centric thinking and OOP lies in their perspectives on the world. Data-base centric thinking approaches the world from a declarative and attribute-driven viewpoint, while OOP approaches it from a behavioral standpoint. This distinction is a key difference between databases and OOP. Database technology is often deemed unsuccessful, leading to research and industrial efforts to integrate database functionality into object programming languages.

When discussing database design, there are various approaches to consider, such as the data model approach and the design approach. For now, let's concentrate on the data model approach:

During project implementation, it is crucial to complete database design within the specified timeframe. It is vital to maintain an economical approach during the development phase. When there are changes in data design, constructing and updating a data model becomes necessary as every application requires data storage. This applies to both developers and users.

The various normal forms encompass:

It is necessary to store similar groups in separate tables and each table should be assigned a primary key to identify its columns.

To ensure data redundancy is avoided, including a foreign key is crucial.

The primary key is essential as it establishes connections between each column in the table, causing them to depend on the primary key. Any field that fails to meet this criterion should be stored in a separate table with its own key.

Removing the independent relationship from the relational database is crucial.

Many to many relationships can be logically related, which is why they are also known as "Exists in never-never land."

Next comes the design approach, which is described below, after the data model approach:

When developing large-scale applications with potential for improvement, it is recommended to use the second and third form designs. It is crucial to consider scalability at the design stage of the application.

The third form requires you to create multiple tables with fewer entities, reducing data duplication across the tables. During a conversation with another developer, I learned that they chose to use the third form because they were confident it would not cause any dependency issues for their application. They achieved this by using commas to distinguish primary values in a specified field of a table.

When creating a class diagram for an application, it is important to design classes that can be used as data objects. These classes should contain properties that describe the quality and description of an entity element. The properties are introduced using getter and setter methods. In order to create effective and understandable database tables, certain constraints and relationships need to be incorporated into the table. These constraints and relationships are explained in the Constraints section of the database design, utilizing query tools for clarity.
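A data object with getter and setter methods might be sketched as follows. The `Dvd` class is hypothetical and written in Python rather than C#; the range rule in the setter is an assumption that mirrors the kind of check a table constraint would apply.

```python
class Dvd:
    """Data object for one DVD; the attribute names are illustrative."""

    def __init__(self, title, running_time):
        self.title = title
        self.running_time = running_time  # goes through the setter below

    @property
    def running_time(self):
        """Getter: running time in minutes."""
        return self._running_time

    @running_time.setter
    def running_time(self, minutes):
        # Assumed rule, mirroring a CHECK-style constraint on the table.
        if not 10 < minutes < 5000:
            raise ValueError("running time must be between 10 and 5000")
        self._running_time = minutes


dvd = Dvd("Alpha", 95)
dvd.running_time = 100
```

Routing assignments through the setter means an invalid value is rejected in the application before it ever reaches the database.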

LO2 Design and implementation:

The purpose of this assignment is to learn about database design concepts, implement the designed database, and execute queries using SQL Server. It also involves creating a client-side application using C#.Net to modify, update, and view the results.

Deeveedeezee, the online DVD shop, stores customer and DVD information. This includes the title, genres, studio, classification, actors, directors, and other details. Users can rate DVDs on a scale of 1 to 5 stars and create their own wish lists.

Our goal is to comprehend the maintenance process of the DEEVEEDEEZEE database, by practically implementing it using SQL Server. Additionally, we aim to create a client end application that interacts with SQL Server, enabling users to view and add details.

The project title is WindowsFormApplication1. To run the client application, open the solution in MS Visual Studio and debug the code. This can be done by clicking on Build->Start Debugging.

Deeveedeezee.com offers customers access to a variety of DVD details, including reviews, ratings, and synopses. Customers can also create a personal wish list. To effectively manage this database and ensure its accuracy, an administrator or database team must continuously monitor and assess the necessary information.

Each user is given a unique user-id and password to access information about DVDs. They can also leave comments, reviews, and ratings under their user-id. To handle the large amount of information, an Administrator or a team is needed to manage and maintain it. The admin and team members have their own user-ids and passwords to login and carry out assigned tasks. The focus of the discussion is on database design, implementation, and development of a client application that interacts with SQL server for easier data access.

Thus, the backend database includes a range of data regulations, definitions, and limitations that must be managed. The database stores unique information and is accessed by various user types through different tables and views that have appropriate fields and attributes. The provided database design diagram illustrates the overall structure of the database, including the tables, their attributes, and their respective properties.

The following diagram illustrates the tables and their attributes. This design is then enhanced and refined using the Entity Relationship Diagram. The report's Entity Relationship diagram is included below.

Entities are represented by rectangle boxes.

The ovals are considered to be attributes.

The rhombus is a representation of the relationships between entities.

The diagram depicts the entities, attributes, and relations.

It represents the relationships visually.

The ER diagram is an effective way to comprehend the connections between entities and attributes.

Within the diagram provided, users can be identified uniquely based on attributes such as userId, email, firstName, and more.

User entity is connected to DVD through Reviews.

The movie ratings given by viewers are recorded using Ratings.

Reviews from viewers are called comments.

However, a DVD can be uniquely identified by its attributes such as dvdId, title, studio, and so on.

Role links DVD and People together.

Individuals with various roles in the film industry, including actors, producers, and directors, have the ability to undertake a range of tasks.

The attributes of individuals include peopleId and peopleName.

[Figure: database design diagram. Attributes shown include title, retailPrice, releaseDate, runningTime, synopsis, genres, studio, initials, firstName, surname, dateOfBirth, address, email, mobileNo, telephoneNo, password, timestamp, peopleId, dvdId, role, userId, wishlist, classification, description, peopleFirstName, review, rating.]

[Figure: entity relationship diagram. Entities: People (peopleId, peopleName), DVD (dvdId, title, retailPrice, releaseDate, runningTime, synopsis, genres, studio), User (userId, initials, firstName, surName, date of birth, address, email, mobileNo, telephoneNo). Relationships: Role, Review (Rating, Comment).]

The database design diagram and entity relationship diagram were used to design the database. The resulting design was then implemented in SQL using MS Visual Studio.

The naming conventions for attributes and tables involve using camel case and singular table names. Screenshots are utilized to capture the output tables, contents, and query results shown below:

The recorded details of the users include their firstName, email, DOB, address, and contact details.

This table contains Primary key constraints and Not Null constraints.

Different data types for each attribute are being implemented according to the given data rules shown above.

Comments:

It stores all the details of a DVD, such as the running time, release date, and title, and creates a unique DVD ID.

In this table, there are Primary key constraints and Not Null constraints present.

The given data rules are being implemented to assign the datatypes for each attribute, as shown above.

It collects and retains the user's reviews and rating, specifically the number of stars given to the DVD.

The table includes several constraints, such as not null and check constraints. The check constraint ensures that the rating field is within the range of 0 to 5. Furthermore, there is a foreign key constraint where the 'email' attribute refers to the primary key of the users table.

The primary key is a composite of 'userId' and 'dvdId'.

Comment:

Each movie or DVD includes a list of individuals involved and assigns a unique identifier, 'peopleId', to each person.

The stored information includes the roles of each person involved in the movie, such as the director, producer, actor, etc.

The foreign key 'dvdId' in the dvdDetails table refers to the primary key 'id', which is an identity with a seed of 1.

The wishlist for each user is stored as a comma-separated list of values.

The table has the following properties: 'id' is set as the identity with seed 1, 'userId' is defined as the primary key, and 'email' is specified as the foreign key.

The DVD's classification and a brief description are stored in the information.

'dvdId' is the foreign key, while 'id' serves as both the primary key and an identity with a seed value of 1.

The user is unable to enter null values or leave them blank due to this constraint.

The user must input the required information for that particular field.

It is crucial to have essential input from the user.

The fields 'id', 'email', and 'userId' in the provided example should be filled out. If these details are not provided, an error will occur. The field 'wishlist' can remain empty.

CREATE TABLE wishList
(id int NOT NULL,
email varchar(100) NOT NULL,
userId int NOT NULL,
wishList varchar(MAX))
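The effect of the NOT NULL constraint can be demonstrated with a small sketch. It assumes SQLite (via Python's `sqlite3`) as a stand-in for SQL Server, with `TEXT` replacing `varchar(MAX)`; the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE wishList (
           id INTEGER NOT NULL,
           email VARCHAR(100) NOT NULL,
           userId INTEGER NOT NULL,
           wishList TEXT          -- nullable: may be left empty
       )"""
)

# All required fields supplied; wishList is omitted, so it is stored as NULL.
conn.execute("INSERT INTO wishList (id, email, userId) VALUES (1, 'a@example.com', 1)")

try:
    # email is missing, which violates its NOT NULL constraint.
    conn.execute("INSERT INTO wishList (id, userId) VALUES (2, 2)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM wishList").fetchone()[0]
```

The first insert succeeds with an empty wish list, while the second is refused because a required field is missing.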

The goal of this restriction is to guarantee that each row in the table possesses a distinct identifier.

Each row contains a field that is exclusive to it and is not found in any other rows.

The primary key of a table can be used as a reference or foreign key in another table.

The table in the given example will have a unique identifier called 'userId' as its primary key, guaranteeing that each row has a distinct userId.

The given figure clearly shows that the userId is unique for each row.

A unique constraint ensures that all values in a particular column are distinct.

The diagram shows that the userId field has a unique constraint and is an identity with a seed of 1. When a new row is inserted, the value of the userId is automatically incremented to ensure no duplicate values in the userId column.

The figure above shows that the 'userId' column does not have any repeated values and is subject to a unique constraint.
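Auto-incremented identifiers and uniqueness can be sketched as below. SQLite is assumed as a stand-in: its `INTEGER PRIMARY KEY AUTOINCREMENT` is only a rough counterpart to a SQL Server identity column with seed 1, and the `users` table here is a simplified, hypothetical version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# INTEGER PRIMARY KEY AUTOINCREMENT approximates an identity column with seed 1.
conn.execute(
    """CREATE TABLE users (
           userId INTEGER PRIMARY KEY AUTOINCREMENT,
           email TEXT NOT NULL UNIQUE
       )"""
)
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
conn.execute("INSERT INTO users (email) VALUES ('b@example.com')")
user_ids = [uid for (uid,) in conn.execute("SELECT userId FROM users ORDER BY userId")]

try:
    # A repeated email violates the UNIQUE constraint on that column.
    conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The database assigns the identifiers itself, so the application never has to guess the next free value.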

When a foreign key constraint is present, the database enforces referential integrity.

Foreign keys are used to reference the primary key of another table.

CREATE TABLE wishList
(id int,
userId int PRIMARY KEY,
email varchar(100) REFERENCES users(email),
wishList varchar(MAX))

In the provided figure, the 'email' serves as a foreign key that references the 'email', which is the primary key of the users table.
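A foreign key rejecting a row that has no matching parent can be sketched as follows. SQLite is assumed in place of SQL Server (note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`), and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE users (email VARCHAR(100) PRIMARY KEY)")
conn.execute(
    """CREATE TABLE wishList (
           id INTEGER,
           userId INTEGER PRIMARY KEY,
           email VARCHAR(100) REFERENCES users(email),
           wishList TEXT
       )"""
)
conn.execute("INSERT INTO users VALUES ('a@example.com')")
conn.execute("INSERT INTO wishList VALUES (1, 1, 'a@example.com', NULL)")

try:
    # No user has this email, so the child row would be an orphan.
    conn.execute("INSERT INTO wishList VALUES (2, 2, 'nobody@example.com', NULL)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The second insert fails because the email does not exist in the parent table, which is the behavior referential integrity is meant to guarantee.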

The check constraint guarantees that all values in a column satisfy certain criteria.

CREATE TABLE review
(reviewId int NOT NULL,
userId int NOT NULL,
dvdId int NOT NULL,
review varchar(200) NOT NULL,
rating float CHECK (rating >= 0 AND rating <= 5),
timestamp timestamp)

This check constraint ensures that the star rating is between 0 and 5.

Or

CREATE TABLE dvdDetail
(retailPrice int NOT NULL,
title varchar NOT NULL,
synopsis varchar NOT NULL,
runningTime int CHECK(runningTime>10 AND runningTime<5000) NOT NULL,
genres varchar NOT NULL,
studio varchar NOT NULL)

This check constraint guarantees that the duration of the running time falls within the range of 10 and 5000.
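How a check constraint behaves at insert time can be sketched as below. SQLite stands in for SQL Server, the table is a reduced, hypothetical version of the review table, and `try_rating` is an illustrative helper, not part of the project.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE review (
           userId INTEGER NOT NULL,
           dvdId INTEGER NOT NULL,
           rating REAL CHECK (rating >= 0 AND rating <= 5),
           PRIMARY KEY (userId, dvdId)
       )"""
)


def try_rating(user_id, dvd_id, rating):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO review VALUES (?, ?, ?)", (user_id, dvd_id, rating))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_rating(1, 1, 4.5)      # inside the allowed range
out_of_range = try_rating(1, 2, 7)    # violates the CHECK constraint
```

Placing the rule in the table definition means every client, not only this application, is held to the same range.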

LO3: Utilization of Manipulation and Query tools

Manipulation and Query tools in the database consist of commands and statements that assist with data manipulation. SQL follows specific standards, including DML (Data Manipulation Language), which includes important SQL commands such as INSERT, UPDATE, DELETE, and others. These tools and query commands help in modifying data and performing regular updates. SQL Server 2008 provides a platform for accessing databases and offers a query window to execute different DML commands. These tools greatly simplify data modification and information updates. The following sections provide a detailed understanding of these tools and their functionalities.

Find all DVDs in the Romantic Comedy genre and arrange them by price.

SELECT dvdId, title, retailPrice, releaseDate, runningTime, synopsis, genres, studio

FROM dvdDetails

WHERE (genres = 'Romantic Comedy')

ORDER BY retailPrice

Executing the query would yield the following table:

Query to display all DVDs owned by Universal Pictures UK Studio.

Below is the initial data in the dvdDetails table:

SELECT dvdId, title, retailPrice, releaseDate, runningTime, synopsis, genres, studio

FROM dvdDetails

WHERE (studio = 'Universal Pictures UK Studio')

Table: Result of the query

Query for viewing all DVDs featuring either Johnny Depp or Leonardo DiCaprio as actors.

Table: Initial data in the dvdDetails table

Query:

SELECT dvdDetails.dvdId, title, retailPrice, releaseDate, runningTime, synopsis, genres, studio, choose.dvdId AS Expr1
FROM dvdDetails
INNER JOIN
(SELECT DISTINCT roles.dvdId
FROM roles
INNER JOIN people ON people.peopleId = roles.peopleId
WHERE (people.peopleFirstName = 'Shahrukh') OR (people.peopleFirstName = 'Amir khan')) AS choose ON choose.dvdId = dvdDetails.dvdId

Table: Result of the query

The DVDs directed by Steven Spielberg can be viewed.

Table: Initial data in the dvdDetails table

SELECT dvdDetails.dvdId, title, retailPrice, releaseDate, runningTime, synopsis, genres, studio, choose.dvdId AS Expr1
FROM dvdDetails
INNER JOIN
(SELECT DISTINCT [NATURAL].dvdId
FROM roles AS [NATURAL]
INNER JOIN people ON people.peopleId = [NATURAL].peopleId
WHERE ([NATURAL].role = 'Director') AND (people.peopleFirstName = 'Farah Khan')) AS choose ON choose.dvdId = dvdDetails.dvdId
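The join-through-a-subquery pattern used for the people and roles queries can be tried on a small data set. The sketch below assumes SQLite in place of SQL Server and uses hypothetical rows and names; the role and the person's name are combined with AND so that only DVDs directed by the named person are returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dvdDetails (dvdId INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE people (peopleId INTEGER PRIMARY KEY, peopleFirstName TEXT);
    CREATE TABLE roles (peopleId INTEGER, dvdId INTEGER, role TEXT);
    INSERT INTO dvdDetails VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO roles VALUES (1, 1, 'Director'), (1, 2, 'Actor'),
                             (2, 2, 'Director'), (1, 3, 'Director');
    """
)

QUERY = """
SELECT dvdDetails.dvdId, dvdDetails.title
FROM dvdDetails
INNER JOIN (
    SELECT DISTINCT roles.dvdId
    FROM roles
    INNER JOIN people ON people.peopleId = roles.peopleId
    WHERE roles.role = ? AND people.peopleFirstName = ?
) AS choose ON choose.dvdId = dvdDetails.dvdId
ORDER BY dvdDetails.dvdId
"""
# Parameters are bound separately, which also avoids SQL injection.
directed_by_ann = conn.execute(QUERY, ("Director", "Ann")).fetchall()
```

Ann also acts in 'Beta', but that DVD is excluded because the filter requires the Director role and her name on the same row.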

Query to view DVDs suitable as a Valentine's Day gift.

Table: Initial data in the dvdDetails table

SELECT dvdId, title, retailPrice, releaseDate, runningTime, synopsis, genres, studio

FROM dvdDetails

WHERE (genres = 'Romantic')

Table: Result of the query

View all DVDs appropriate for children under 10 to watch.

Table: Initial data in the dvdDetails table

SELECT dvdId, title, retailPrice, releaseDate, runningTime, synopsis, genres, studio
FROM dvdDetails
WHERE (genres = 'under 10')

Table: Result of the query

DVDs that are on the wishlist for a specific customer.

Table: Initial data in wishlist table

SELECT wishList

FROM wishlist

WHERE (userId = 1)

Table: Result of the query

Arranged in chronological order, here are the DVDs listed based on their release dates.

Table: Initial data in the dvdDetails table

SELECT dvdId, title, retailPrice, releaseDate, runningTime, synopsis, genres, studio

FROM dvdDetails

ORDER BY releaseDate

Table displaying the outcome of the query.

All DVDs from a certain studio are arranged in a specific order.

The data displayed in the dvdDetails table below represents the initial information.

SELECT dvdId, title, retailPrice, releaseDate, runningTime, synopsis, genres, studio
FROM dvdDetails
ORDER BY studio

Table: Result of the query

To include a demonstration DVD.

Table: Initial data in the dvdDetails table

INSERT INTO dvdDetails
(retailPrice, releaseDate, genres, studio, runningTime, title)
VALUES (99, CONVERT(DATETIME, '1990-05-02 00:00:00', 102), 'under10', 'Coke Studio', 1100, 'Tom')

Table: Result of the query

Table showing the data in the dvdDetails table after executing the query.

An update query is used to modify the review star rating given by a customer for a DVD.

Table: Data initially present in the review table

UPDATE review

SET rating = 3.5

WHERE (userId = 1)

Table: The result of the query

Table: Data in review table after executing query

There are two categories of queries: simple and complex. Simple queries can be directly included in the code, while complex queries are converted into stored procedures, views, and triggers that can be utilized later in the code. Stored procedures serve as pre-compiled methods for fetching data from the database and can encompass different types of queries such as select, insert, update, etc. In case an error arises during query execution, we revert to the previous state and disregard the result of that particular query.
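Reverting to the previous state when a query fails can be sketched with a transaction. This example assumes SQLite through Python's `sqlite3` rather than SQL Server stored procedures, and the table contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE review (userId INTEGER, dvdId INTEGER, rating REAL CHECK (rating <= 5))"
)
conn.execute("INSERT INTO review VALUES (1, 1, 3.0)")
conn.commit()

try:
    # `with conn` commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute("UPDATE review SET rating = 4.0 WHERE userId = 1 AND dvdId = 1")
        conn.execute("INSERT INTO review VALUES (2, 1, 9.0)")  # violates the CHECK
except sqlite3.IntegrityError:
    pass

rating_after = conn.execute(
    "SELECT rating FROM review WHERE userId = 1 AND dvdId = 1"
).fetchone()[0]
```

Because the failing insert aborts the whole transaction, the earlier update is undone as well and the rating keeps its original value.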

Accurate data results can only be achieved if there is a proper correlation between the tables.

LO4 includes the utilization, documentation, and execution of a relational database management system.

The accurate results from the queries in section 3.2 illustrate the correct design of the relational database. Accurate output can only be produced by queries when table relationships are accurate. Furthermore, both the view section and desired client application view also produce accurate results, further confirming the proper implementation of the relational database.

Microsoft Visual Studio was utilized for creating the client application in C#.NET with the objective of enabling users to add and view data from the database. This ensured accurate updates and delivered pertinent information via deeveedeezee.com's online website.

These are the requirements for using this application:

MS Visual Studio is installed in the computer system with C#.NET and SQL server.

To import the MDF file or database into the application, you must utilize MS Studio and navigate to Data > Add new datasources > Database.

The location of the database needs to be updated in the SqlConnection string path after it is loaded.

Each user of this client application is given a distinctive username and password, which they use to sign in. The application will only show the designated features for viewing and adding items. The image displayed below illustrates how the Client Application appears after a successful login.

Fig: Client application after login (client app.JPG)

By clicking the respective buttons, the user can view and add data. They will be directed to the forms displayed below:

view and add.jpg

Users have the ability to access a variety of information on the online website, such as DVD details, customer reviews, ratings, wish lists, and user data.

Details about a DVD can be added to display to customers, including information about the roles played by actors, actresses, and other staff involved in that specific product. It can also include the rating based on reviews and the rating given by customers, as well as new user data.

The subsequent sets of information outline the process of how the data and different specifics are incorporated into the database through the client application:

add dvddetails.JPG Fig: Add dvdDetails

Upon clicking the submit button in the coding section, the query is run to insert the DVD details such as the DVD name or title, price, genres, date of release, studio, and synopsis. This query updates the dvdDetails table as displayed in the above query section.

add ratings.JPG

The client application allows users to access and view customer reviews and ratings online. Users can rate a DVD by providing its ID and selecting a rating from 1 to 5. When the submit button is clicked, the query will run in the background and update the corresponding table.

add roles1.JPG

This information assists the customer by providing details about the product's cast.

add users.JPG

Adding users into the database helps validate their identity.

The user can see the following output by using the same approach.

The dvdDetails table holds data about DVDs, such as the title, price, release date, and extra features. The dvdID is automatically generated and acts as the primary key. This information can be used to categorize DVDs by genre and studio when necessary.

This table is used for updating the online site with the roles played by different individuals involved in the creation of this product. Each person is identified by a unique id.

Customers can update their wishlist online, and this information is stored in the database. The wishlist provides useful information about user preferences for future reference.

This table provides the personal details and contact information of both the users or customers using deeveedeezee.com online and the staff members using this application. This information can be utilized for various purposes.

According to online viewers' reviews, users of this application can include a modified rating using data gathered from various sources.

The GRIDVIEW tool is used in all the view forms mentioned above to display tables of results based on user preferences. This improves the visual presentation and provides a more user-friendly and convenient interface for users.

A relational database is made up of a collection of relations, which are represented by 2D tables. These tables consist of rows and columns and act as the main storage component for a relational database. This type of database can have multiple tables, each with their own unique rows and columns. The benefits of using a relational database include its accessibility, association, and flexibility. To ensure that data is reliable and consistent, there are rules in place that enforce entity integrity and referential integrity. Entity integrity guarantees that each tuple (a row in a table) has a distinct identity while also preventing null values in any other field that is part of the primary key. Maintaining referential integrity at the database level depends on the utilization of primary and foreign keys to prevent inconsistent updates or deletions. Microsoft Access serves as an example of a true relational database.

Referential Integrity is the process of establishing and enforcing relationships between entities or tables using foreign keys. These relationships serve as logical connections and help define business rules. It ensures that the value in the child table matches a primary key value in the parent table. The foreign key values are derived from the primary key table.

There are different types of relationships that entities can have, such as one-to-one, one-to-many, and many-to-many relationships.

Entities are related through relationships, mainly using foreign keys. In this association, the value of the parent table serves as the primary key and is referred to as a foreign key in other tables.

The main implications of a relational database management system involve:

Provides independence between the physical data storage and the logical database structure.

Accessing all data is a simple task.

Offers versatility in the design of databases.

The duplication of information is reduced.

RDBMSs are primarily designed for performing CRUD operations, specifically creating, reading, updating, and deleting data.

In the previous section, a comprehensive explanation is given regarding the implementation, testing, and documentation processes for RDBMS. The database creation for deeveedeezee specifically utilizes SQL Server 2008.

Verification in database is the process of ensuring that data entry into the database is accurate and free from transcription errors. This is especially important when transferring data between tables or converting it to a different format. These processes are susceptible to errors during data entry. To address this issue, unique and primary keys are utilized in this project to verify and validate the accuracy of entered data. These keys are essential in confirming that the correct information is transcribed accurately and matches the original data during table transfers.
