akash verma: assignment of Bsc IT 1st sem database management systems

[ Fall 2014 ] ASSIGNMENT

PROGRAM	BSc IT
SEMESTER	FIRST
SUBJECT CODE & NAME	BT0066, Database Management Systems

Q. No.1 Explain the concept of data independence and explain why it is required?

Answer: The concept of the data independence is similar in many respects to the concept of abstract data type in modern languages like C++. Both hide implementation details from the users. This allows users to concentrate on the general structure rather than low-level implementation details.

Data independence is advantageous in the database environment, since it allows for changes at on level of the database, without affecting others levels. These changes are absorbed by the mappings between the levels. Logical data independence is more difficult to achieve than physical independence. Since application programs are heavily dependent on the logical structure of the data they access.

Q. No.2 Draw and explain the diagram of SQL server 2000 architecture.

Answer:

SQL server 2000 architecture

The above diagram displays the architecture of SQL Server, it contains many memory components. The data storage needs of a modern corporation or government organization and very complex. Some examples are:

· Online Transaction Processing (OLTP) system must be capable of handling thousands of orders placed at the same time.

· Increasing numbers of corporations are implementing large web sites as a mechanism for their customers to enter orders, contact the service department, get information about products, and for many other tasks that previously required contact with employees. These sites require data storage that is secure, yet tightly integrated with the web.

· Organization are implementing off-the-shelf software packages for critical services such as human resources planning, manufacturing resources planning, and inventory control. These system require database capable of sorting large amount of data and supporting large numbers of users.

· Organization have many users who must continue working when they do not have access to the network. Example are mobile disconnected users, such as traveling sales representative or regional inspectors. These users must synchronize the data on a notebook or laptop with the current data in the corporate system, disconnect from the network, record the result of their work while in the field, and then finally reconnect with the corporate network and merge the result of their Fieldwork into the corporate data store.

· Managers and Marketing personnel need increasingly sophisticated analysis of trends recorded in corporate data. They need rebust Online Analytical Processing (OLAP) systems easily built from OLTP data and support sophisticated data analysis.

· Independent Software Vendors (ISVs) must be able to distribute data storage capabilities with application targeted at individuals or small workgroups. This means the data storage mechanism must be transparent to the users who purchase the application. This requires a data storage system that can be configured by the application and then tune itself automatically so that the users do not need to dedicate database administrators to constantly monitor and tune the application.

Q. No.3 What do you mean by indexed sequential file organization?

Explain with the help of an example.

Answer: In an indexed sequential file, the file contains data records and pointers to the records. Data records and record pointers are arranged in buckets, which consist of an integral number of physically contiguous 512-byte disk blocks. Individual records within the file are located by the specification of the keys (indexes) associated with the records. Each file must have a primary key-that is, a field within the record that has a unique value to distinguish it from all other records in the file. An indexed sequential file can also have up to 254 alternate keys, which need not have unique values.
RMS writes records to an indexed file in collating sequence according to the primary key, putting them in buckets that are chained together. Thus, an individual file can be accessed sequentially with any key.
The following figure illustrates an indexed sequential file with a single key:

An Indexed Sequential File

The records in the file illustrated in above figure consist of address data that might have been defined in a PL/I structure as follows:

DECLARE 1

ADDRESS_FILE,

2 EMPLOYEE_NAME CHARACTER(30),

2 ADDRESS,

3 STREET CHARACTER(20),

3 ZIP_CODE CHARACTER(5);

In this file, the key is the employee name.

When RMS writes records to an indexed sequential file, it builds and maintains a tree-like structure of key values and location pointers. When records are accessed by key, RMS uses the tree to locate individual records. Thus, when a PL/I program accesses the record whose key value is JONES, RMS traverses the indexes to locate the record.

When new records are added to an indexed sequential file, a data bucket may not have enough room to accommodate a new record. In this case, RMS performs what is called bucket splitting: it inserts a new bucket in the chain of data buckets and moves enough records from the previous bucket to preserve the primary key sequence. Bucket splitting is transparent to the PL/I program; the program knows only that it has added a record to the file.

Q. No.4 What is the system catalog in RDBMS? Also explain what information is stored in the system catalog.

Answer: We can store a relationship using one of several alternatives file structure, and we can create one or more indexes as a file on every relation. Conversely, in a relational DBMS, every file contains either the tuples in a relation or the entries in an index. The collection of files corresponding to user’s relation and indexes represent the data in the database.

A fundamental property of a database system is that it maintains a description of all the data that it contains. A relational DBMS maintains information about every relation and index that it contains. The DBMS also maintains information about views is stored and used to compute that tuples that belong in the view when the view is queried. This information is stored in a collection of a relations, maintained by the system, called the catalog relation; the catalog relation are also called the system catalog, the catalog, or the data dictionary. The system catalog, is sometimes referred to as metadata; that is, not data, but descriptive information about the data. The information in the system catalog is used extensively for query optimization.

Information stored in the System Catalog

Let us consider what is stored in the system catalog. At a minimum we have system wide information, such as the size of the buffer poo and the page size, and the following information about individual relations, indexes, and views:

For each relation:

· Its relation name, the file name(or some identifier), and the file structure(e.g., heap file) of the file in which it is stored.

· The attribute name and type of each of its attributes.

· The index name of each index on the relation.

· The integrity constraints(e.g., primary key and foreign key constraints) on the relation.

For each index:

· The index name and the structure of the index.

· The search key attributes.

For each view:

· Its view name and definition.

In addition, statistic about relation and indexes are stored in the system catalogs and updated periodically. The following information is commonly stored:

Cardinality: The number of tuples NTuples(R) for each relation R.

Size: The number of pages NPages(R) for each relation R.

Index Size: The number of pages INPages(I) for each index I.

Index Height: The number of nonleaf levels IHeight(I) for tree index I.

Index Range: The minimum present key value ILow(I) and the maximum present key value IHigh(I) for each index I.

Q. No.5 What do you mean by semantics of TRC queries? Give an example of TRC queries.

Answer: What does a TRC query mean? More precisely, what is the set of answer tuples for a given TRC query? The answer to a TRC query {T | p(T)}, as we noted earlier, is the set of all tuples t for which the formula p(T ) evaluates to true with variable T assigned the tuple value t. To complete this definition, we must state which assignments of tuple values to the free variables in a formula make the formula evaluate to true.

A query is evaluated on a given instance of the database. Let each free variable in a formula F be bound to a tuple value. For the given assignment of tuples to variables, with respect to the given database instance, F evaluates to (or simply ‘is’ ) true if one of the following holds:

· F is an atomic formula R summation Rel, and R is assigned a tuple in the instance of relation Rel.

· F is a comparison R.a op S.b, R.a op constant, or constant op R.a, and the tuples assigned to R and S have field values R.a and S.b that make the comparison true.

· F is of the form p, and p is not true; or of the form p ^q, and both p and q are true; or of the form p q, and one of them is true, or of the form p) q and q is true whenever p is true.

· F is of the form for any R(p(R)), and there is some assignment of tuples to the free variables in p(R), including the variable R that makes the formula p(R) true.

F is of the form R(p(R)), and there is some assignment of tuples to the free variables in p(R) that makes the formula p(R) true no matter what tuple is assigned to R.

Example of TRC Queries

student(rollNo, name, degree, year, sex, deptNo, advisor)

department(deptId, name, hod, phone)

Obtain the rollNo, name of all girl students in the Maths Dept (deptId= 2)

{s.rollNo,s.name| student(s)^ s.sex=‘F’^ s.deptNo=2}

attributes required in the result

This predicate is true whenever value of s is a tuple from the student relation, false otherwise, In general, if tis a tuple variable with range relation r, r(t) is taken as a predicate which is true if and only if the value of tis a tuple in r .

Obtain the rollNo, name of all girl students in the MathsDept

{s.rollNo,s.name | student(s) ^ s.sex=‘F’^ (∃d)(department(d) ^ d.name=‘Maths’^ d.deptId= s.deptNo)}

s: free tuple variable d: existentially bound tuple variable

Existentially or universally quantified tuple variables can be used on the RHS of the vertical bar to specify query conditions

Attributes of free (or unbound ) tuple variables can be used on LHS of vertical bar to specify attributes required in the results.

Example Relational Scheme

student (rollNo, name, degree, year, sex, deptNo, advisor)

department (deptId, name, hod, phone)

professor (empId, name, sex, startYear, deptNo, phone)

course (courseId, cname, credits, deptNo)

enrollment (rollNo, courseId, sem, year, grade)

teaching (empId, courseId, sem, year, classRoom)

preRequisite(preReqCourse,courseID)

Example queries in TRC (1/5)

1) Determine the departments that do not have any girl students

student (rollNo, name, degree, year, sex, deptNo, advisor)

department (deptId, name, hod, phone)

{d.name|department(d) ^

¬(∃s)(student(s) ^

s.sex=‘F’^ s.deptNo= d.deptId)

Examples queries in TRC (2/5)Schema

2) Obtain the names of courses enrolled by student named Mahesh

{c.name | course(c) ^

(∃s) (∃e) ( student(s) ^ enrollment(e)

^ s.name = “Mahesh”

^ s.rollNo= e.rollNo

^ c.courseId = e.courseId}

Examples queries in TRC (3/5)

3) Get the names of students who have scored ‘S’ in all subjects they have enrolled. Assume that every student is enrolled in at least one course.

{s.name| student(s)

(∀e)(( enrollment(e) ^

e.rollNo= s.rollNo) →e.grade=‘S’)}

person P with all S grades:

for enrollment tuples not having her roll number, LHS is false for enrollment tuples having her roll number, LHS is true, RHS also true so the implication is true for all e tuples

person Q with some non-S grades: for enrollment tuples not having her roll number, LHS is false for enrollment tuples having her roll number, LHS is true, but RHS is false for at least one tuple.

So the implication is not true for at least one tuple.

Examples queries in TRC (4/5)

4)Get the names of students who have taken at least one course taught by their advisor

{s.name | student(s) ^

(∃e)(∃t)(enrollment(e) ^ teaching(t) ^

e.courseId= t.courseId ^

e.rollNo= s.rollNo^

t.empId= s.advisor}

5) Display the departments whose HODs are teaching at least one course in the current semester

{d.name | department(d) ^(∃t)(teaching(t) ^

t.empid= d.hod

^ t.sem= ‘odd’^ t.year= ‘2008’)}

Examples queries in TRC (5/5)

6)Determine the students who are enrolled for every course taught by Prof Ramanujam. Assume that Prof Ramanujam teaches at least one course.

1. {s.rollNo| student (s) ^

2. (∀c)(course (c) ^

3. ((∃t),(∃p)( teaching(t) ^ professor(p) ^

4. t.courseId= c.courseId^

5. p.name= “Ramanujam”^

6. p.empId= t.empId)) →

7. (∃e) (enrollment(e) ^

8. e.courseId= c.courseId^

9. e.rollNo= s.rollNo)

10. )

11. }

Q. No.6 Explain BCNF with an example.

Answer: Boyce-Codd Normal Form is a higher version of the third normal form. This from deals with certain type of the anamoly that is not handled by 3NF. A 3NF table which does not have multiple overlapping candidate keys is said to be in BCNF.

BCNF was developed by Raymond Boyce and E.F. Codd; the latter is widely considered the father of relational database design.

BCNF is really an extension of 3rd Normal Form (3NF). For this reason it is frequently termed 3.5NF. 3NF states that all data in a table must depend only on that table’s primary key, and not on any other field in the table. At first glance it would seem that BCNF and 3NF are the same thing. However, in some rare cases it does happen that a 3NF table is not BCNF-compliant. This may happen in tables with two or more overlapping composite candidate keys.

Assume that a relation has more than one possible key. Assumes further that the composite keys have a common attribute. If an attribute of a composite key is dependent on an attribute of the other composite key, a normalization called BCNF is needed.

􀂄 A relation R(X) is in Boyce–Codd Normal Form if for every non-trivial functional dependency Y → Z defined on it, Y contains a key K of R(X). That is, Y is a superkey for R(X).

􀂄 Example: Person1(SI#, Name, Address)

The only FD is SI# → Name, Address

Since SI# is a key, Person1 is in BCNF

􀂄 Anomalies and redundancies, as discussed earlier, do not occur in databases with relations in BCNF.

Usually tables that are in Third Normal Form are already in Boyce Codd Normal Form. Boyce Codd Normal Form (BCNF) is considered a special condition of third Normal form. A table is in BCNF if every determinant is a candidate key. A table can be in 3NF but not in BCNF. This occurs when a non key attribute is a determinant of a key attribute. The dependency diagram may look like the one below

The table is in 3NF. A and B are the keys and C and D depend on both A and B. There are no transitive dependencies existing between the non key attributes, C and D.

The table is not in BCNF because a dependency exists between C and B. In other words if we know the value of C we can determine the value of B.

akash verma

Wednesday, 8 July 2015

assignment of Bsc IT 1st sem database management systems

1 comment: