saravanan velu

Sunday, 29 January 2012

BASICS OF DATABASE

1. What is a database?
Organised collection of interrelated data and the data in the database is integrated, can be shared and can be accessed concurrently.
2. What is database management system (DBMS)?
DBMS is a collection of interrelated files and a set of programs that allow user to access and modify this data.
3. What is the main aim of using a DBMS?
The main aim of using DBMS is to provide a convenient and efficient way to store, retrieve and modify information
4. Define master file and Transaction file.
A master file stores relatively static data about an entity and changes rarely and where as transaction file stores relatively transient data about a particular data processing task and changes more frequently as transactions happen more frequently and in large numbers
5. Disadvantages of traditional file system to information processing or advantages of DBMS.
Data security, data redundancy, data isolation, program/ data dependency, lack of flexibility
and concurrent access anomalies
6. Define database schema.
The overall design of database is called database schema.
7. What is three level architecture for a DBMS?
The 3 levels of DBMS architecture are: External/ view level, conceptual/ logical level and internal/ physical level
8. State different users of database?
End user, application programmer and database administrator.
9. What are the different functions of DBA?
1. Defining conceptual schema
2. Defining the internal schema
3. Liaising with users
4. Granting of authorization for data access
5. Defining integrity constraints
10. What is data model? Name two widely used data models?
A data model is a conceptual tool to describe data, data relationships, data semantics and consistency constraints. Two widely used data models are
1. Object based logical model (E-R model).
2. Record based logical model.

11. Define the terms entity, attribute, relation.
Entity is a common word anything real or abstract, about which we want to store data.
Attribute is a characteristic property of an entity. An entity would have multiple attributes.
Relationship is a natural association that exists between one or more entities.
12. What is cardinality of a relationship? What are the different types of cardinality relationship?
Cardinality of a relationship defines the type of relationship between two participating entities. There are 4 types of cardinality relationship. Those are

1. One to one relationship
2. One to many relationship
3. Many to one relationship
4. Many to many relationship

13. What is normalisation? What is the need of it?
Data base based on the E-R model may have some amount of inconsistency, ambiguity and redundancy. To resolve these issues some amount of refinement is required. This refinement process is called as normalisation.
This normalisation technique is based on a mathematical foundation.
We need to refine our design so that we make an efficient database in terms of storage space and insert, update and delete operations
14. What are the types of normalisation?
The following are the types of normal forms:
First Normal Form (1NF): A relation R is said to be in the first normal form if and only if all the attributes of the relation R are atomic in nature.
Second Normal Form (2NF): A relation is said to be in Second Normal form if and only if, it is in the first normal form and no partial dependency exists between non-key attributes and key attributes.
Third Normal Form (3NF): A relation R is said to be in the third normal form if and only if, it is in 3NF and no transitive dependency exists between non-key attributes and key attributes.
Boyce Codd Normal Form (BCNF): A relation is said to be in BCNF if and only if the determinants are candidate keys. BCNF relation is a strong 3NF, but not every 3NF relation is BCNF.
15. What are the merits and demerits of normalization?
Merits: 1. Normalization is based on mathematical foundation. 2. Removes the redundancy to the greater extent. After 3NF, data redundancy is minimised to the extent of foreign keys. 3. Removes the anomalies present in Inserts, Updates and Deletes.
Demerit: 1. Data retrieval (Select) operation performance will be severely affected. 2. Normalization may not always represent real world scenarios.
16. What is a determinant?
Attribute X can be defined as determinant if it uniquely defines the attribute value Y in given relationship or entity.
17. What is RDBMS?
Relational Database: Any database for which the logical organization is based on relational data model.
RDBMS: A DBMS that manages the relational database.
An RDBMS is type of DBMS that stores data in the form of related tables.
18. Name some popular RDBMS packages.
Oracle- By Oracle Corp.
Sybase- Sybase Inc.
Informix- Informix Software Inc.
DB2- IBM
Ingres- Computer Associates International Inc.
SQL Server- Microsoft
19. Where do we use RDBMS?
Database are widely used in real life applications such as:
Airlines, Banking, Universities, Telecommunications and Sales.
SQL
20. What is SQL?
SQL stands for Structured Query Language. It is a query language used to communicate with a database
21. What is a database query?
The process of requesting data from the database and receiving back the results is called a database query.
22. What are the difference between char and varchar2 data types?
Ex. A. Name varchar2 (20): B. Name char (20)
If you store string of length of 10 characters the varchar2 uses 10 character spaces and remaining memory is given to the OS where as in case of char the remaining spaces will be wasted.
23. What are the SQL statements?

Data definition language (create, alter, drop, truncate)
Data manipulation language (insert, delete, update)
Data control language (grant, revoke)
Data querying/ retrieving language (select)
Transaction control language (commit, rollback)
24. What is a constraint? What are the different constraints available?
Constraint is defined as imposing predefined rules on table columns
There mainly two types of constraints. Those are
1. Column constraint: It is specified as part of a column definition and applies only to that column
2. Table constraint: It is declared independently from a column definition and can apply to more than one column in a table
These constraints must be used when constraint is applied for more than one column of a table

25. What is a key?
Key is a pre defined constraint. There are different keys are available. Those are

1. Unique
2. Not null
3. Primary key
4. Foreign key
5. Check
6. Default

26. What is Candidate key?

A candidate key is a set of one or more attributes that can uniquely identify a row in a given table.

27. What is Simple Candidate Key?

A candidate key comprising of one attribute only.
28. What is Composite Candidate key?

A candidate key comprising of two or more attributes.

29. What is invalid candidate key?

A candidate key should be comprised of a set of attributes that can uniquely identify a row. A subset of the attributes should not posses the unique identification property.

30. What is Primary key?

During the creation of the table, the database designer chooses one of the candidate key from amongst the several available, to uniquely identify rows in the table. The candidate key so chosen is called the primary key. The primary key of a table is always not null and unique.

31. What is foreign key?

A foreign key is a set of attribute(s) the values of which are required to match the values of a candidate key in the same or another table. The foreign key attribute(s) can have duplicate or null values.

32. What is super key?

Any superset of a candidate key is a super key.

33. What is non-key attributes?

The attributes other than the primary key attributes in a table/relation are called non-key attributes.

34. What is an index?
An index is a structure that provides rapid access to the rows of a table based on the values of one or more columns
35. What are the advantages of index?
1. It speeds up the execution of SQL statements with search conditions that refer to the indexed column(s).
2. It is most appropriate when retrieval of data from tables is more frequent than inserts and updates.
36. What are the disadvantages of having an index?
(a) It consumes additional disk space. (b) The index table must be updated every time a row is added to the table and every time the indexed column is updated in existing row. This imposes additional overhead on insert and update statements for the table.
37. How do we suppress (avoid) duplicate rows in select statement?
Distinct is a key word. It is used to suppress duplicate rows in select statement output.
38. Why do we use “where” clause?
“Where” is a clause, it used to specify a search condition that limits the number of rows retrieved.
39. What are the different operators that can be used within where clause?
The operators used with where are

1. Comparison operators (=, <>, <, <=, >, >=)
2. Logical operators (and, or ,not)
3. Between, in, like
4. Is null, is not null

40. What is a NULL value?
A null value is used to indicate the absence of a value. It is not a zero or blank character. Null cannot be compared to any other value.
41. Why do we use is null, is not null operators?
If it is needed to find out the rows, then we have to use is null operator so that we get the rows whose one of the column values is null and in order to retrieve the rows which does not have null as any column value.
42. What are aggregate functions / column functions?
A SQL column function takes an entire column of data as its arguments and produces a single data item that summarizes the column.
Sum(), avg(), min(), max(), count() and count(*)
43. What is the difference between count() and count(*)?
Count () counts the non-null values in a column and where as count(*) counts rows of query results and does not depend on the presence or absence of null values in a column. If there are no rows, it returns a value of zero
44. What are the different clauses available in SQL?
There are 4 clauses. Those are

1. where
2. group by
3. having
4. order by

45. How do we use ‘group by’ clause?
The group by clause is used in select statement to collect data across multiple records and group the results by one or more columns.
46. How is ‘having’ clause is used?
The having clause is used along with the group by clause. The having clause can be used to select and reject row groups. The having clause specifies condition for groups.
47. What is union?
The union operation combines the rows from two sets of query results. By default, the union operation eliminates duplicate rows as part of its processing.
48. What is the difference between union and union all?
By using Union all we retrieve all the records from both tables (it may have some duplicates) where as by using union all the records from both tables and there will not be any duplicates in query result.
49. How does intersect (operation) work?
The intersect operation selects the common row from two sets of query results.
50. What is a sub query?
A query within a query is called sub query. Inner query is written in the where clause and it is evaluated first and based on the result of that outer query is evaluated.
51. How does co-related sub query work?
In co-related sub-queries, SQL performs a sub query, once for each row of the main query. The column(s) from the table of the outer query is always referred in the inner clause.
JOINS
52. What is a join?
Joins are used in queries to explain how different tables are related. Joins also let you select data from a table depending upon data from another table.
Types of joins: INNER JOINs, OUTER JOINs, CROSS JOINs. OUTER JOINs are further classified as LEFT OUTER JOINS, RIGHT OUTER JOINS and FULL OUTER JOINS.
53. What is a Cartesian join/ cross join?
Cross joins return all rows the first table. Each row from the first table is combined with all rows from the second table. Cross joins are also known as the Cartesian product of two tables
54. What is a self join?
Joining a table itself is known as self join
55. What is an inner join?
An inner join between two(more) tables is the Cartesian product that satisfies the join condition in the where clause. Inner join uses comparison operators to match rows from two tables based on the values in common columns from each table.
56. What is an equi join?
Equi join is also an inner join in which the joining condition is based on equality condition between values in the common columns
57. What is an outer join?
An outer join is used to retrieve the rows with an unmatched value in the relevant column
58. What are the different outer joins available?
There are 3 outer joins. Those are
1. Left outer join
2. Right outer join
3. Full outer join
59. How does a left join work?
Left outer join includes null extended copies of the unmatched rows from the first (left) table but does not include unmatched rows from the second (right) table.
60. What does a right outer join include?
The right outer join includes null extended copies of the unmatched rows all the rows from the second (right) table and only common rows from first table
61. What is the use of keyword Exists?
The exists checks whether a sub query produces any row(s) of results. If a query following exists returns at least one row, the exists returns true and stops further execution of the inner select statement. The outer query will be executed only if the exists returns true. If the inner query produces no rows, the exists returns false and the outer query will not be executed. The exists cannot produce a null value.
VIEWS
62. What is a view?
A view is virtual table in the database defined by a query. A view does not exist in the database as a stored set of data values.
63. What is a horizontal view?
Horizontal view restricts a user’s access to selected rows of a table.
64. What is vertical view?
Vertical view restricts a user’s access to select columns of a table.
65. What is a joined view?
A joined view draws its data from two or three different tables and presents the query results as a single virtual table. Once the view is defined, one can use a single table query against the view for the requests that would otherwise each require a two or three table join.
66. What is the use of ‘check option ‘in case of views?
If a view is created on specific a deptno, and while inserting values into view it should check whether the deptno is correct deptno for which the present view working for. This will be taken care if the view is created with check option.
67. What are the advantages of views?
Security: security is provided to the data base to the user to a specific no. of rows of a table.
Query simplicity: by using joined views data can be accessed from different tables.
Data integrity: if data is accessed and entered through a view, the DBMS can automatically check the data to ensure that it meets specified integrity constraints.
68. What are the disadvantages of views?
Performance: The DBMS the query against the view into queries against the underlying source table. If a table is defined by a multi table query, then even a simple query against a view becomes a complicated join, and it may take a long time to complete. This is reference to insert, delete and update operations
Update restrictions: when a user tries to update rows of a view, the DBMS must translate the request into an update into an update on rows of the underlying source table. This is possible for simple views, but more complicated views cannot be updated.
69. Why do we use data control language statements? What are those?
DCL statements are used to control access to the data base and the data. It is used to enforce data security. The DCL statements are Grant, revoke.
70. What is the use of grant statement?
The grant statement is used to grant security privileges on data base objects to specific users. By using grant you can give permissions like insert, delete and update.
71. What is a grant statement with a grant option?
A grant statement with grant option clause conveys along with the specified privileges, the right to grant those privileges to the other user.
72. How do we cancel the privileges on the table?
By using revoke statement we can revoke the privileges previously granted with the grant statement.
EMBEDDED SQL
73. What is the purpose of embedded SQL?
To blend SQL statements directly into a program written in a host programming language. Such as C, Pascal, Cobol, Fortran use embedded SQL.
74. Why do we use embedded SQL?
SQL has the following limitations

1. No provision to declare variable
2. No unconditional branching
3. No if statement to test conditions
4. No for loop, do or while statements to construct loops
5. No block structure

75. What is EXEC SQL?
Every embedded SQL statement begins with an introducer that flags it as a SQL statement. The IBM SQL products use the introducer EXEC SQL for most host languages
76. What is SQLCA?
The SQLCA (the SQL communication area) is data structure that contains error variables and status indicator. By examining SQLCA, the application program can determine the success or failure of its embedded SQL statements.
77. Define host variable?
A host variable is a program variable. It is defined using the data types of the programming language like C and manipulated by the programming language. To identify the host variable, the variable is prefixed by a colon (:) when it appears in embedded SQL statements.
78. What is the use of host variables?
The uses of host language variables are
1. To store data into the database
2. To retrieve data from the database

79. What are the indicator variables?
To store null values in the database or retrieve null values from the database, embedded SQL allows each variable to have a companion host indicator variable.
80. What are the different values associated with indicator variable?
In the embedded SQL statement, the host variable and Indicator variable together specify a single SQL-style value as follows
1. An indicator value of zero indicates that the host variable contains a valid value
2. A negative indicator value indicates that the host variable should be assumed to have a null value; the actual value of the host variable is irrelevant and should be disregarded
3. A positive indicator value indicates that the host variable contains a valid value which may have been rounded off or truncated
TRANSACTIONS
81. What is a transaction? Explain the terms associated with it?
A transaction is nothing but an interaction between different users, or different systems or user and system.
Transaction is a logical unit of work which takes database from one consistent state to another consistent state.
A transaction cannot be divided into smaller tasks
The successful completion of a transaction is called as the commit state. After this changes are permanent and irreversible
If one step fails, the complete transaction fails and the system is taken back to the original state that existed before the start of the transaction. This process of going back to the original state is called as rollback.
If the transaction rolls back, then the transaction is reaches the abort state.
82. What is a transaction processing system? What are the different transaction systems are there?
The transaction processing (TP) systems which mimic the real life system like salary processing, library, banking, air line, defence missile systems are basically divided into three categories. Those are
1. Batch transaction Systems
2. Online transaction processing systems
3. Real time transaction processing systems
83. What is a batch transaction processing systems?
In the batch transaction processing system, a set of application programs will be working on a set of input data to produce desired output. In this process there will be absolutely no human interaction.
Ex. Salary slip generation
84. What is online transaction processing system (OLTP)?
In the OLTP systems, the user will be continuously interacting with system through a computer or a terminal on a regular basis.
Ex. Air line reservation, banking ATM
85. What is Real time transaction processing system?
This system is the most complicated among all the transaction systems. It is capable of handling unexpected inputs to unexpected output
Ex. Air traffic control systems or missile defence system
And even these are systems are capable of handling a sudden change in the air pressure, the temperature, the wind direction, the target speed and the direction and can change their output based on these.
86. What are the properties of a transaction?
Every transaction must these properties. Those are

1. Atomicity
2. Consistency
3. Isolation
4. Durability

Most of the times we call these as ACID properties.
87. What is durability in case of a transaction?
Once a transaction completes (commits), the changes made to the database permanent and are available to all the transactions that follow it.
88. What are the requirements for an OLTP system?
In addition to ACID properties, OLTP systems require integrity and concurrency.
89. What is integrity? What are the different integrity checks?
All the data entered into the system must be validated for its correctness and adherence to the organisation’s business rules. This is implemented in RDBMS using three types of integrity checks. Those are
1. Domain integrity
2. Entity integrity
3. Referential integrity
90. What is domain integrity check?
It involves the implementation of business domain specific rules.
91. How is an entity integrity check implemented?
Entity integrity check is implemented using the primary key constraint. Basically entity integrity refers to the fact that a particular attribute uniquely indentifies the physical entity.
92. How is referential integrity implemented?
It is implemented using the relationships between primary keys and foreign keys of tables within a database. This ensures data consistency. Referential integrity requires that the value of every foreign key in every table be matched by the value of a primary key in another table. This relationship is called parent-child relationship.
93. What are the different restrictions that we can put on the foreign key at the time of creation?
1. On delete restrict: do not allow delete the parent table data if it is referred in the child table.
2. On delete set null: on delete the parent table data, set null value in the child table wherever the deleted data is referred.
3. On delete set default: set the null values to child rows on deletion of parent records.
4. On delete set cascade: delete all the child table record from the parent table on the deletion of parent record in the parent table.
94. What is concurrency?
Concurrency means allowing different users transactions to execute simultaneously.
95. What is locking?
Locking is a mechanism to have a controlled access to the resources like database, Table space, table, rows and columns. While these resources are put under lock by some transaction, other transactions have very restricted or no access to these resources, depending on the locking mode.
96. What is the primary aim of implementing locks on a table?
To achieve consistency while supporting concurrency of transactions.
97. What is a shared lock?
When a particular table is locked in the shared mode by one transaction, all the other transactions can perform the read operation on the locked resource, but no updates or modifications are possible by other transactions.
98. What is a shared lock?
This is most restrictive lock. Once a transaction puts the x lock on a particular resource, no other transaction can put any kind of lock on this resource. This resource is exclusively reserved for first transaction. This x lock allows the least concurrency.
Usually insert/ update/ delete operations put the x lock on resources before writing /modifying /deleting operations.
99. What is granularity of locking?
Granularity of refers to the granular at which a resource can be locked. In RDBMS application is capable of locking a table explicitly, then the granularity of locking at field level. If it can lock only up to the row level, the granularity of that is RDBMS product is row level.
100. What is intent locking?
In the intent locking only the intention of locking is expressed at the ancestor node of the required resource and the resource at the lower level is locked explicitly only when required.
The intent locking mechanism not only increases concurrency but also stops the implicit locking of ancestral resources.
101. How many ways is intent locking classified?
Intent locking is classified as intent shared lock and intent exclusive lock.
102. What is shared intent lock?
The combination of shared and intent exclusive lock is referred to as shared intent exclusive lock or SIX lock.
A share and intent exclusive lock indicates an S lock at the current level plus an intention to insert, delete, update data at lower level of granularity.
103. What is a Dead lock?
Dead lock is situation where one transaction is waiting for another transaction to release the resource it needs, and vice versa. Each transaction will be waiting for the other for the other to release the resource.
104. What is time stamping?
This is one of the concurrency management techniques. Every resource in database will be associated with last successful read and last successful write timestamp (time of occurrence up to milliseconds)
105. What is the biggest achievement of time stamping?
The biggest advantage of time stamping is it leads to no dead lock condition as no resources are locked.
106. Why is time stamping not used in RDBMS always?
Time stamping technique leads to large number of rollbacks. Due to this reason time stamping technique is not implemented as the concurrency control mechanism in most of the commercial RDBMS application.
107. Which one is the widely used concurrency control method?
Almost all the commercial RDMBS packages use a locking technique as the concurrency controlling mechanism while maintaining the consistency in the system.
108. How do we implement security in RDBMS packages?
Security is implemented in RDBMS packages using:
1. User id and pass word to restrict the users from acquiring an unauthorised access.
2. Grant and revoke statements to provide restrict access control to resources like tables.
3. Database views to restrict access to sensitive data.
4. Encryption of data to avoid unauthorised access.
109. When does a data base go into an inconsistent state?
A data base may go in an inconsistent state by
1. An application error
2. Power failure
3. OS or database failure
110. What is recovery?
If the database is in inconsistent state, it is necessary to restore it to a consistent state. Recovery process can be achieved either using files or backups of the database.
111. What are the different backup techniques available?
1. Dumping
2. Cold backup
3. Hot backup
112. What is an instance failure in RDBMS?
1. Un-planned power shut down
2. Sudden break down in OS or database
3. Memory failure
113. What is a transaction log?
Transactional log or the journal log or redo log is physical life. Instance failures can be taken handled by making use of transactional log or redo files.
ONLINE ANALYTICAL PROCESSING (OLAP)
114. What is an online analytical processing (OLAP)?
An organisation’s success also depends on its ability to analyze data and to make intelligent decisions that would potentially affect its future. Systems that facilitate such analysis are called OLAP.
115. What is the difference between OLTP and OLAP?
In OLAP we take the historical data stored to enable trend analysis and future predictions. We will de normalize databases to facilitate queries and analysis. We won’t have frequent updates. Joins will be simple as tables are de normalized.
In OLTP Old data is purged or archived. We use normalized databases to facilitate insertion, deletion, and updation. Updates are more common and the joins are more as the tables are normalized.
116. What is a data ware house?
A data ware house is repository which stores integrated information for efficient querying and analysis. Data ware house has data collected from multiple, disparate sources of an organisation.
117. What is the need for going data ware housing?
1. Analysis requires millions of records of data which are historical in nature.
2. Data is collected from heterogeneous sources
3. Need to make quick and effective strategic decisions
118. What are the characteristics of data ware house?
According to Bill Inmon, known as the father of data ware housing, a data ware house is a subject oriented, integrated, time variant, non volatile collection of data in support of management decisions.
119. What is ETL in data ware housing?
ETL stands for Extraction, transformation and loading. It is very important step in data ware housing. The definition of ETL: it is described as the process of selecting, migrating, transforming, cleaning and converting mapped data from the operational environment to data warehousing environment.
120. What is a fact table?
Each data ware house or data mart includes one or more fact tables. A fact table is central to a star or snowflake schema, and captures the data that measures the organisations business operations. Fact tables generally contain large number of rows.
121. What is a dimension table?
Dimension table contains attributes that describe fact records in the fact table. Some of these attributes provide descriptive information.
122. What is cube? And what is the use of it?
The OLAP tools allows you to turn data stored in relational databases into meaningful, easy to navigate business information by creating data cube. The dimensions of data a cube represent distinct categories for analyzing business data.
MISCELLANEOUS QUESTIONS:
1. Define the "Integrity rules"
There are two Integrity rules.
Entity Integrity: States that "Primary key cannot have NULL value"
Referential Integrity: States that "Foreign Key can be either a NULL value or should be Primary Key value of other relation.
2. What is System R? What are its two major subsystems?
System R was designed and developed over a period of 1974-79 at IBM San Jose Research Centre. It is a prototype and its purpose was to demonstrate that it is possible to build a Relational System that can be used in a real life environment to solve real life problems, with performance at least comparable to that of existing system.
Its two subsystems are
Research Storage and System Relational Data System.
3. What is Data Independence?
Data independence means that "the application is independent of the storage structure and access strategy of data". In other words, the ability to modify the schema definition in one level should not affect the schema definition in the next higher level.
Two types of Data Independence:
Physical Data Independence: Modification in physical level should not affect the logical level.
Logical Data Independence: Modification in logical level should affect the view level.
NOTE: Logical Data Independence is more difficult to achieve
4. What is a view? How it is related to data independence?
A view may be thought of as a virtual table, that is, a table that does not really exist in its own right but is instead derived from one or more underlying base table. In other words, there is no stored file that direct represents the view instead a definition of view is stored in data dictionary.
Growth and restructuring of base tables is not reflected in views. Thus the view can insulate users from the effects of restructuring and growth in the database. Hence accounts for logical data independence.
5. What is Weak Entity set?
An entity set may not have sufficient attributes to form a primary key, and its primary key compromises of its partial key and primary key of its parent entity, then it is said to be Weak Entity set.
6. What is an attribute?
It is a particular property, which describes the entity.
7. What are Relation Schema and a Relation?
A relation Schema denoted by R (A1, A2, An) is made up of the relation name R and the list of attributes Ai that it contains. A relation is defined as a set of tuples. Let r be the relation which contains set tuples (t1, t2, t3, tn). Each tuple is an ordered list of n-values t= (v1, v2, vn).
8. What is Query evaluation engine?
It executes low-level instruction generated by compiler.
9. What is Functional Dependency?
A Functional dependency is denoted by X Y between two sets of attributes X and Y that are subsets of R specifies a constraint on the possible tuple that can form a relation state r of R. The constraint is for any two tuples t1 and t2 in r if t1[X] = t2[X] then they have t1[Y] = t2[Y]. This means the value of X component of a tuple uniquely determines the value of component Y.
10. What is Multivalued dependency?
Multivalued dependency denoted by X Y specified on relation schema R, where X and Y are both subsets of R, specifies the following constraint on any relation r of R: if two tuples t1 and t2 exist in r such that t1[X] = t2[X] then t3 and t4 should also exist in r with the following properties
Ø t3[x] = t4[X] = t1[X] = t2[X]
Ø t3[Y] = t1[Y] and t4[Y] = t2[Y]
Ø t3 [Z] = t2 [Z] and t4 [Z] = t1 [Z]
where [Z = (R-(X U Y))]
11. What is Lossless join property?
It guarantees that the spurious tuple generation does not occur with respect to relation schemas after decomposition.
12. What is Fully Functional dependency?
It is based on concept of full functional dependency. A functional dependency X Y is fully functional dependency if removal of any attribute A from X means that the dependency does not hold any more.
13. What is Domain-Key Normal Form?
A relation is said to be in DKNF if all constraints and dependencies that should hold on the constraint can be enforced by simply enforcing the domain constraint and key constraint on the relation.
14. What are partial, alternate, artificial, compound and natural key?
Partial Key:
It is a set of attributes that can uniquely identify weak entities and that are related to same owner entity. It is sometime called as Discriminator.
Alternate Key:
All Candidate Keys excluding the Primary Key are known as Alternate Keys.
Artificial Key:
If no obvious key either stands alone or compound is available, then the last resort is to simply create a key, by assigning a unique number to each record or occurrence. Then this is known as developing an artificial key.
Compound Key:
If no single data element uniquely identifies occurrences within a construct, then combining multiple elements to create a unique identifier for the construct is known as creating a compound key.
Natural Key:
When one of the data elements stored within a construct is utilized as the primary key, then it is called the natural key.
15. What is indexing and what are the different kinds of indexing?
Indexing is a technique for determining how quickly specific data can be found.
Types:

Ø Binary search style indexing
Ø B-Tree indexing
Ø Inverted list indexing
Ø Memory resident table
Ø Table indexing
16. What is meant by query optimization?
The phase that identifies an efficient execution plan for evaluating a query that has the least estimated cost is referred to as query optimization.
17. What do you mean by atomicity and aggregation?
Atomicity:
Either all actions are carried out or none are. Users should not have to worry about the effect of incomplete transactions. DBMS ensures this by undoing the actions of incomplete transactions.
Aggregation:
A concept which is used to model a relationship between a collection of entities and relationships. It is used when we need to express a relationship among relationships.
18. What is a checkpoint and When does it occur?
A Checkpoint is like a snapshot of the DBMS state. By taking checkpoints, the DBMS can reduce the amount of work to be done during restart in the event of subsequent crashes.
19. What is "transparent DBMS"?
It is one, which keeps its Physical Structure hidden from user.
20. What is RDBMS KERNEL?
Two important pieces of RDBMS architecture are the kernel, which is the software, and the data dictionary, which consists of the system-level data structures used by the kernel to manage the database
You might think of an RDBMS as an operating system (or set of subsystems), designed specifically for controlling data access; its primary functions are storing, retrieving, and securing data. An RDBMS maintains its own list of authorized users and their associated privileges; manages memory caches and paging; controls locking for concurrent resource usage; dispatches and schedules user requests; and manages space usage within its table-space structures.
21. How do you communicate with an RDBMS?
You communicate with an RDBMS using Structured Query Language (SQL)
22. Define SQL and state the differences between SQL and other conventional programming Languages
SQL is a nonprocedural language that is designed specifically for data access operations on normalized relational database structures. The primary difference between SQL and other conventional programming languages is that SQL statements specify what data operations should be performed rather than how to perform them.
23. What is an Oracle Instance?
The Oracle system processes, also known as Oracle background processes, provide functions for the user processes-functions that would otherwise be done by the user processes themselves
Oracle database-wide system memory is known as the SGA, the system global area or shared global area. The data and control structures in the SGA are shareable, and all the Oracle background processes and user processes can use them.
The combination of the SGA and the Oracle background processes is known as an Oracle instance
24. What is ROWID?
The ROWID is a unique database-wide physical address for every row on every table. Once assigned (when the row is first inserted into the database), it never changes until the row is deleted or the table is dropped.
The ROWID consists of the following three components, the combination of which uniquely identifies the physical storage location of the row.
Ø Oracle database file number, which contains the block with the rows
Ø Oracle block address, which contains the row
Ø The row within the block (because each block can hold many rows)
The ROWID is used internally in indexes as a quick means of retrieving rows with a particular key value. Application developers also use it in SQL statements as a quick way to access a row once they know the ROWID
25. What is database Trigger?
A database trigger is a PL/SQL block that can defined to automatically execute for insert, update, and delete statements against a table. The trigger can e defined to execute once for the entire statement or once for every row that is inserted, updated, or deleted. For any one table, there are twelve events for which you can define database triggers. A database trigger can call database procedures that are also written in PL/SQL.
26. Name two utilities that Oracle provides, which are use for backup and recovery.
Along with the RDBMS software, Oracle provides two utilities that you can use to back up and restore the database. These utilities are Export and Import.
The Export utility dumps the definitions and data for the specified part of the database to an operating system binary file. The Import utility reads the file produced by an export, recreates the definitions of objects, and inserts the data
If Export and Import are used as a means of backing up and recovering the database, all the changes made to the database cannot be recovered since the export was performed. The best you can do is recover the database to the time when the export was last performed.
27. What are stored-procedures? And what are the advantages of using them.
Stored procedures are database objects that perform a user defined operation. A stored procedure can have a set of compound SQL statements. A stored procedure executes the SQL commands and returns the result to the client. Stored procedures are used to reduce network traffic.
28. Does PL/SQL support "overloading"? Explain
The concept of overloading in PL/SQL relates to the idea that you can define procedures and functions with the same name. PL/SQL does not look only at the referenced name, however, to resolve a procedure or function call. The count and data types of formal parameters are also considered.
PL/SQL also attempts to resolve any procedure or function calls in locally defined packages before looking at globally defined packages or internal functions. To further ensure calling the proper procedure, you can use the dot notation. Prefacing a procedure or function name with the package name fully qualifies any procedure or function reference.
29. What are cursors give different types of cursors?
PL/SQL uses cursors for all database information accesses statements. The language supports the use two types of cursors
Implicit, and Explicit.
30. What is de-normalization and when would you go for it?
As the name indicates, de-normalization is the reverse process of
normalization. It's the controlled introduction of redundancy in to
the database design. It helps improve the query performance as the
number of joins could be reduced.
31. What's the difference between a primary key and a unique key?
Both primary key and unique enforce uniqueness of the column on which
they are defined. But by default primary key creates a clustered index
on the column, where are unique creates a non-clustered index by
default. Another major difference is that, primary key doesn't allow
NULLs, but unique key allows one NULL only.
32. What are user defined data types and when you should go for them?
User defined data types let you extend the base SQL Server data types by
providing a descriptive name, and format to the database. Take for
example, in your database, there is a column called Flight_Num which
appears in many tables. In all these tables it should be varchar(8).
In this case you could create a user defined data type called
Flight_num_type of varchar(8) and use it across all your tables.
33. What are defaults? Is there a column to which a default can't be bound?
A default is a value that will be used by a column, if no value is
supplied to that column while inserting data. IDENTITY columns and
timestamp columns can't have defaults bound to them.
34. What is a transaction and what are ACID properties?
A transaction is a logical unit of work in which, all the steps must
be performed or none. ACID stands for Atomicity, Consistency,
Isolation, Durability. These are the properties of a transaction.
35. What's the difference between DELETE TABLE and TRUNCATE TABLE commands?
DELETE TABLE is a logged operation, so the deletion of each row gets
logged in the transaction log, which makes it slow. TRUNCATE TABLE
also deletes all the rows in a table, but it won't log the deletion of
each row, instead it logs the deallocation of the data pages of the
table, which makes it faster. Of course, TRUNCATE TABLE can be rolled
back.
36. What are constraints? Name different types of constraints.
Constraints enable the RDBMS enforce the integrity of the database
automatically, without needing you to create triggers, rule or defaults.
Types of constraints: NOT NULL, CHECK, UNIQUE, PRIMARY KEY, FOREIGN KEY
37. What are the steps you will take to improve performance of a poor
performing query?
There could be a lot of reasons
behind the poor performance of a query. But some general issues that
you could talk about would be: No indexes, table scans, missing or out
of date statistics, blocking, excess recompilations of stored
procedures, procedures and triggers without SET NOCOUNT ON, poorly
written query with unnecessarily complicated joins, too much
normalization, excess usage of cursors and temporary tables.
Some of the tools/ways that help you troubleshooting performance
problems are: SET SHOWPLAN_ALL ON, SET SHOWPLAN_TEXT ON, SET
STATISTICS IO ON, SQL Server Profiler, Windows NT /2000 Performance
monitor, Graphical execution plan in Query Analyzer.
38. What are cursors? Explain different types of cursors. What are the
disadvantages of cursors? How can you avoid cursors?
Cursors allow row-by-row processing of the result sets.
Types of cursors: Static, Dynamic, Forward-only, Keyset-driven.
Disadvantages of cursors: Each time you fetch a row from the cursor,
it results in a network roundtrip; where as a normal SELECT query
makes only one roundtrip, however large the result set is. Cursors are
also costly because they require more resources and temporary storage
(results in more IO operations). Further, there are restrictions on
the SELECT statements that can be used with some types of cursors.
Most of the times, set based operations can be used instead of
cursors. Here is an example:
If you have to give a flat hike to your employees using the following
criteria:
Salary between 30000 and 40000 -- 5000 hike
Salary between 40000 and 55000 -- 7000 hike
Salary between 55000 and 65000 -- 9000 hike
In this situation many developers tend to use a cursor, determine each
employee's salary and update his salary according to the above
formula. But the same can be achieved by multiple update statements or
can be combined in a single UPDATE statement as shown below:
UPDATE tbl_emp SET salary =
CASE WHEN salary BETWEEN 30000 AND 40000 THEN salary + 5000
WHEN salary BETWEEN 40000 AND 55000 THEN salary + 7000
WHEN salary BETWEEN 55000 AND 65000 THEN salary + 10000
END
Another situation in which developers tend to use cursors: You need to
call a stored procedure when a column in a particular row meets
certain condition. You don't have to use cursors for this. This can be
achieved using WHILE loop, as long as there is a unique key to
identify each row. For examples of using WHILE loop for row by row
processing, check out the 'My code library' section of my site or
search for WHILE.
39. What are triggers? How many triggers you can have on a table? How to
invoke a trigger on demand?
Triggers are special kind of stored procedures that get executed
automatically when an INSERT, UPDATE or DELETE operation takes place
on a table.
In SQL Server 6.5 you could define only 3 triggers per table, one for
INSERT, one for UPDATE and one for DELETE. From SQL Server 7.0
onwards, this restriction is gone, and you could create multiple
triggers per each action. But in 7.0 there's no way to control the
order in which the triggers fire. In SQL Server 2000 you could specify
which trigger fires first or fires last using sp_settriggerorder
Triggers can't be invoked on demand. They get triggered only when an
associated action (INSERT, UPDATE, DELETE) happens on the table on
which they are defined.
Triggers are generally used to implement business rules, auditing.
Triggers can also be used to extend the referential integrity checks,
but wherever possible, use constraints for this purpose, instead of
triggers, as constraints are much faster.
Till SQL Server 7.0, triggers fire only after the data modification
operation happens. So in a way, they are called post triggers. But in
SQL Server 2000 you could create pre triggers also. Search SQL Server
2000 books online for INSTEAD OF triggers.
Also check out books online for 'inserted table', 'deleted table' and
COLUMNS_UPDATED ()
There is a trigger defined for INSERT operations on a table, in an
OLTP system. The trigger is written to instantiate a COM object and
pass the newly inserted rows to it for some custom processing. What
do you think of this implementation? Can this be implemented better?
Instantiating COM objects is a time consuming process and since you
are doing it from within a trigger, it slows down the data insertion
process. Same is the case with sending emails from triggers. This
scenario can be better implemented by logging all the necessary data
into a separate table, and have a job which periodically checks this
table and does the needful.
40. What is aggregation, decomposition, generalization in RDBMS?
Aggregation: Selecting the data in group of records is called aggregation.

There are five aggregate system functions they are viz. Sum, Min, Max, Avg, Count. They all have their own purpose.
Decomposition: Selecting all data without any grouping and aggregate functions is called Decomposition. The data is selected, as it is present in the table.

Generalization: while generalization seems to be simplification of data, i.e. to bring the data from Un-normalized form to normalized form.
41. What is the difference between view and join?
View is a virtual table where data is not stored physically but gives the convenient method to retrieve and manipulate the information as needed. It is done for SECURITY REASONS.

Join is joining of two related tables.
42. Is it possible to have primary key and foreign key in one table if yes so what is the use of foreign key?
Yes, we can have.

Foreign key is used to make relationship with another table. We can think foreign key as a copy of primary key from another relational table.
43. What is the difference between Primary Key and Aggregate Key
What is the difference between varchar and varchar2?
Primary Key is a much similar to unique key. Only difference is that unique key can be null but primary key cannot be null. Primary key is used to avoid duplication of data.
A primary key consists of more than one column. Also known as a concatenated key or Aggregate Key. it is also called as composite key.
Example for varchar & varchar2()
Emp_name varchar(10) - if you enter value less than 10 then remaining space cannot be deleted. it used total 10 spaces.
Emp_name varchar2(10) - if you enter value less than 10 then remaining space is automatically deleted.

UYTRE44. What is difference between RDBMS and DBMS?
DBMS includes the theoretical part that how data’s are stored in a table. It does not relate tables with another. While RDBMS is the procedural way that includes SQL syntaxes for relating tables with another and handling data’s stored in tables. DBMS doesn't show the relation while RDBMS show the relation and moreover DBMS is for small organisations where RDBMS for large amount of data In DBMS all the tables are treated as different entities. There is no relation established among these entities. But the tables in RDBMS are dependent and the user can establish various integrity constraints on these tables so that the ultimate data used by the user remains correct. In DBMS there are entity sets in the form of tables but relationship among them is not defined while in RDBMS in each entity is well defined with a relationship set so as retrieve our data fast and easy.
45. Explain Codd's rule.
In 1985, Edgar Codd published a set of 13 rules which he defined as an evaluation scheme for a product which claimed to be a Relational DBMS. And they are:
Rule 1 : The information Rule.
"All information in a relational data base is represented explicitly at the logical level and in exactly one way - by values in tables."
Everything within the database exists in tables and is accessed via table access routines.
Rule 2 : Guaranteed access Rule.
"Each and every datum (atomic value) in a relational data base is guaranteed to be logically accessible by resorting to a combination of table name, primary key value and column name."
To access any data-item you specify which column within which table it exists, there is no reading of characters 10 to 20 of a 255 byte string.
Rule 3 : Systematic treatment of null values.
"Null values (distinct from the empty character string or a string of blank characters and distinct from zero or any other number) are supported in fully relational DBMS for representing missing information and inapplicable information in a systematic way, independent of data type."
If data does not exist or does not apply then a value of NULL is applied, this is understood by the RDBMS as meaning non-applicable data.
Rule 4 : Dynamic on-line catalog based on the relational model.
"The data base description is represented at the logical level in the same way as-ordinary data, so that authorized users can apply the same relational language to its interrogation as they apply to the regular data."
The Data Dictionary is held within the RDBMS, thus there is no-need for off-line volumes to tell you the structure of the database.
Rule 5 : Comprehensive data sub-language Rule.
"A relational system may support several languages and various modes of terminal use (for example, the fill-in-the-blanks mode). However, there must be at least one language whose statements are expressible, per some well-defined syntax, as character strings and that is comprehensive in supporting all the following items
• Data Definition
• View Definition
• Data Manipulation (Interactive and by program).
• Integrity Constraints
• Authorization.
Every RDBMS should provide a language to allow the user to query the contents of the RDBMS and also manipulate the contents of the RDBMS.
Rule 6 : .View updating Rule
"All views that are theoretically updatable are also updatable by the system."
Not only can the user modify data, but so can the RDBMS when the user is not logged-in.
Rule 7 : High-level insert, update and delete.
"The capability of handling a base relation or a derived relation as a single operand applies not only to the retrieval of data but also to the insertion, update and deletion of data."
The user should be able to modify several tables by modifying the view to which they act as base tables.
Rule 8 : Physical data independence.
"Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representations or access methods."
The user should not be aware of where or upon which media data-files are stored
Rule 9 : Logical data independence.
"Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind that theoretically permit un-impairment are made to the base tables."
User programs and the user should not be aware of any changes to the structure of the tables (such as the addition of extra columns).
Rule 10 : Integrity independence.
"Integrity constraints specific to a particular relational data base must be definable in the relational data sub-language and storable in the catalog, not in the application programs."
If a column only accepts certain values, then it is the RDBMS which enforces these constraints and not the user program, this means that an invalid value can never be entered into this column, whilst if the constraints were enforced via programs there is always a chance that a buggy program might allow incorrect values into the system.
Rule 11 : Distribution independence.
"A relational DBMS has distribution independence."
The RDBMS may spread across more than one system and across several networks, however to the end-user the tables should appear no different to those that are local.
Rule 12 : Non-subversion Rule.
"If a relational system has a low-level (single-record-at-a-time) language, that low level cannot be used to subvert or bypass the integrity Rules and constraints expressed in the higher level relational language (multiple-records-at-a-time)."
SAMPLE QUERIES:
1.DDL STATEMENTS.
1. Create: it is used to create any database object.
Ex. Create table emp (empno number(4), ename varchar2(20), sal number(8,2), dept number(2));
2. Drop: it is used to delete database objects permanently.
Ex. Drop table emp;
3. Alter: it is used to modify the data base objects. Like increasing column size, adding a new column and deleting a column.
Ex. Alter table emp modify empno number(5);
Alter table emp add loc varchar2(20);
Alter table emp drop commission;
2. DML STATEMENTS:
1. Insert : it is used to insert the values (or rows )into database.
a. Insert into emp values(1001,’yugander’,50000,10);
b. insert into(empno, ename, sal)values(1002,rahat,50000);
(here in above case null will be inserted for dept column)
c. insert into emp values(&empno,’&ename’,&sal,&deptno);
Above statement is used insert more values into table. It asks values from the keyboard and after entering entire row press ‘\’ so that we can insert another row. After entering values, it should be committed.
2. Delete:
It is used to delete data(rows) from the database objects. Mainly deals with data(rows)
Delete from emp;
Delete from emp where deptno=10;
( delete rows from deptno 10)
3. Update:
It is used to update the table data. We can update single column or any number of columns using a single update statement.
Update emp set sal=sal+3000 where empno =1001;
If the where condition is not mentioned then whole table will be updated.
Update emp set sal=sal+2000, deptno=10 where deptno=11;
It updates sal, deptno whose deptno is 10
3. TCL STATEMENTS:
Tcl statements are used to save or cancel the database transactions
1. Commit:
It is used to save the transactions permanently to the database.
Ex. Dml statements
Commit;
Above dml statements will be saved permanently.
2. Rollback:
it is used to cancel the dml statements.
Ex. Dml statements
Rollback;
Above dml statements will be cancelled
4. where clause:
There are many operators used in where clause to specify the condition.
Consider this SQL statement and corresponding where clause explanation follows.
Select * from emp where[*********]
1. comparator operators
Ex. where empno=1001;(= is comparison operator)( we can have <, >, <>,>=,<=)
2. Special operators.
These are used to retrieve data very fast. These are
1. in, not in 2. Between, not between
3. like, not like 4. Is null, is not null
Ex. 1. Where empno in (1001,1005,1111)
Above statement retrieves data from all the three records if three record found else retrieves from the records found from the specified empno.s.’ not in’ is just opposite of this
We can use ‘in’ statement with numbers, dates and characters.
2. Where joining _date between 12-aug-07 and 31-dec-07;sss
where sal between 10000 and 20000;
In the above statements between is used with numbers and dates. Between takes both the extremes. And not between is opposite to between.
3.where ename like ‘s%’;
Above statement retrieves all the names which start with ‘s’
Above statement retrieves patterns. Like has got Meta characters i.e. _ and %
_ represents single character
% represents 0 or more characters
and ‘like’ always deals with characters only.
4. where sal is null;
Above statement retrieves the records whose sal is a null value . Is null operator is used to compare null values. We can’t compare a null value with another null value. Where as ‘is not null’ is 0pposite of ‘is null’ .
5. Using pre-defined functions
1. Arithmetic functions :
select ename ,abs(commission-sal) from emp;
select power (2,4), sin(10), ln (10),log (10,100),exp(2),sign(5),round (123.25.2),trunc(123.25.2),
mod (5,2) from dual;
In the above statement, many arithmetic functions have been used. These are already pre defined or built ins.
2. Character functions:
Select initcap(S), lower(S), upper(S), length(S), reverse(S) from dual;
In the above statement S stands for a string. Initcap function gives initials as the caps and remaining letters as small letters
Ex. If input is yugander jetty and the output will be Yugander Jetty
And length function gives length of string.
Select ascii(c)(i.e. c is a character), chr(n), concat(s1,s2) from dual;s
The above statement ascii function gives the equivalent ascii value, chr fuction takes input as a number and gives its equivalent ascii character and concat is fuction used to concat 2 strings
3. Date functions:
These are applicable to dates only.
Select sysdate from the dual;
Select add_months(d,+ or – n), months_between(d1,d2) from dual;
In the above statements sysdate is pseudo column it gives the today’s date. Add_months () is a function used to add n months to the given date d. Months_between () gives no.of months between given dates
4. General functions:
Select least(100,10,7,99), greatest(100,10,7,99) from dual;
In above statement, least() gives least value in the given list and greatest() gives biggest of the list values.
Select vsize(ename) from dual;
Gives the memory occupied by that coloumn.
Select ename, job, deptno, decode( deptno, 10,sal*0.20,20, sal*0.35, 30, sal*0.40) bonus from emp;
Decode () in the above statement, for dept=10, 20, 30 calculates the bonus and generates a report on each person with his name, job, deptno and bonus.
It is a very important function to check multiple conditions while manipulating and retrieving it is equivalent to if statement in C
6. Group by clause:
it is used to group the rows based on specified column.
Select deptno,sum(sal) from emp group by deptno;
Above statements groups the rows based on deptno and gives the total sal dept. Wise
7. Having clause:
Select job, count(*), sum (sal), avg(sal) from emp where deptno=30 group by job,
having count(*) > 1 ;
This select statement gives the output as , first it takes deptno=30 and then groups according to job and checks for count after the grouping is done.
Having clause is used to check on the grouped results.
8. Order by:
Select * from emp order by sal desc;
This statement gives sal in descending order. Default order is ascending for order by clause.
9. Joins:
1. Select empno, ename, sal, emp.deptno, dname, loc from emp, dept where emp.deptno=dept.deptno;
This is an inner join (equi join). Here deptno is present in both the tables. It is needed to mention table name in front of column name. This retrieves the rows from the both table which ever satisfy the condition mentioned.
2. select fname, cname from faculty, course;
Here faculty, course are the two table with a single column and fname, cname are the columns. fname is in faculty and cname is in course.
Above statement is a join called Cartesian join. It is used to retrieve data from more than one table without any condition. It retrieves all the possible combinations of rows.
3. select empno,ename, sal, emp.deptno, dname ,loc from emp, dept where emp.deptno=dept.deptno(+);
This is called left outer join all the rows from the dept table and common rows from the emp table
In the above case If + is on the left side then it is called right outer join then common rows from the dept table and all the rows from the emp table will be in output.
4.left+ right outer join will give you full outer join here ‘+’ will be union operator.
5. select worker .ename “subordinate” , manager.ename “superior” from emp worker,emp superior where worker.mgr=superior.mgr;
Here emp table is joined to itself . it is acting once as worker table and once as superior
Set operators
Select job from emp where deptno=10
Intersect
Select job from emp where deptno=20;
This query gives the output as common job from10 and 20 dept.s from emp table.
Different set operators are union, union all, minus and intersect.
10. DATA CONTROL LANGUAGE:
1. Grant insert, delete on emp to user1;
this statements gives the permissions to user1;
2. grant all on emp to user1;
all means insert, delete, update permissions.
3. Revoke all on emp from user1;
Above statement takes back the permissions from user1;
4. grant all on emp to user1 with grant option;
User1 can also give this table to other users
Creating users
DBA creates users
Create user yugander identified by hello;
Here user name is yugander and hello is his pass word and after this he should be given resource to connect . so it will be as follows
Grant connect, resource to yugander;
11. SOME SPECIAL DDL COMMANDS
1. rename emp to employ;
This query changes the table name emp to employ. This change is permanent.
2. alter table emp rename column mgr to mgrno;
This changes the column name permanetntly.
3. truncate table emp;
It deletes entire table data and just structure remains.
12. SUBQUERIS
1. Select * from emp where empno in(select distinct empno from incr);
Here first inner select will be executed and then based on that result, outer query is evaluated.
2. select * from emp where sal > any (select sal from emp where deptno=10);
Or
Select * from emp where sal>(select min(sal) from the emp);
It lists all the employees whose sal is more than lowest sal of deptno.
In above query replace ‘any’ with ‘all’ and in the next query min() with max() we will get max sal
13. CO-RELATED SUB QUERIES
Select empno, ename , sal , job, deptno from emp where sal>(select avg(sal) from emp where deptno=e.deptno);
This lists the employees whose sal is more than their average salary.
14. IMPORTANT QUERIES
1. Top highly paid employees
Select rownum, empno,ename, job,sal from (select rownum, empno,ename, job,sal from emp order by sal desc where rownum<=5);
2. N th max. Salary
Select rownum, empno,ename, job,sal from (select rownum, empno,ename, job,sal from emp order by sal desc ) group by rownum, empno,ename,job, sal having rownum=&n;
3. Retrieve Alternate Rows
Select rownum, empno,ename, job,sal from emp group by rownum, empno,ename,job, sal having mod( rownum,2)=0;/* for even numbered rows*/
Select rownum, empno,ename, job,sal from emp group by rownum, empno,ename,job, sal having mod( rownum,2)=1;/* odd numbered rows*/
4. Removing duplicate rows
Delete from emp where rowid not in (select min(rowid) from emp group by empno);
15. CONSTRAINTS
1. Imposing all the key constraints (Column level) at the time of table creation.
Create table emp(empno number(4) primary key, ename varchar2(20) not null, sex char(1) check( sex in(‘m’, ‘n’)), job varchar2(20), hire_date default sysdate, e-mail unique, deptno references dept(deptno));
Here in above query the key constraints are are imposed at the time of creation. It is a good practice to give name to the imposed constraints. So this example after naming the constraints.
2. Write a query to activate default constraint
Here consider that dept is table, at the time of creation of table we have provided loc as Hyderabad.
So the query as follows
Insert into dept values(30, ‘import’, default);
3. Write a query to explain referential integrity
On delete cascade clause:
Consider table emp the attributes are empno, ename, deptno and this deptno references to deptno of dept table.
Take one more table called incr, the attributes are empno, amount . empno of incr references to empno of emp table.
First if i want to delete emp table it won’t be deleted because child rows will be existing. The same thing happens if dept table is deleted.
In order to avoid these things it is needed to include on delete cascade clause with reference key at the time of creating a table.
Ex. Create table dept(deptno number(2), dname varchar(20));
Create table emp(empno number(4) primary key, ename varchar2(20), deptno number(2) references dept(deptno) on delete cascade);
Create table incr(empno number(4) references emp( empno) on delete cascade, amount number(8,2));
If the table are defined as shown above, there won’t be problem if you delete a table which has got children. It automatically removes all the child rows whenever parent record is removed.
Cascade constraints:
If this constraint is applied with reference key constraint, it automatically removes foreign key constraint when parent table is dropped.
Child rows will not be removed
Table constraints
The constraints defined next to table definition are called table constraints.
Create table customer_details(
Cust_id_number(5) constraint Nnull Not Null,
Cust_name varchar2(20) constraint Nnull2 Not Null,
Account_num number(5),
Account_type varchar2(10) constraint Nnull3 Not Null,
Bank_branch varchar2(25) constraint Nnull4 Not Null,
Cust_email varchar2(30),
Constraint Pkey1 primary key(cust_id,account_num));
16. Miscellenious Queries
1. Generate serial numbers for a table rows.
Select rownum, empno,ename ,sal from emp;
Here this numbering is not permanent. And this numbering is done with the help of rownum.it is pseudo column.
2. what is rowed ? write a query on it?
Row id is a unique stored permanently in database. It is automatically generated for every row inserted into the database. It comprises of 18 bit value holds object id, block id, file id, record id.
Select rowid, ename from emp;
3. How do we create a view?
Create view view_emp as select* from emp where deptno=10;
View is stored select statement. It will not hold any data. It is virtual table.
4. How do delete a view?
Drop view view_emp;
5. What is a force view?
A view can be created without having a table. That view is called force view. Example is
Create force view view_emp as select * from emp;
6. What is materialized view ? write a query?
It is a static view. It holds data in it. No DML is allowed on it. DML on table will not reflect in view.
It is created DBA, the query as follows..
Create materialized view mvi as select* from emp;
7. What is a synonym? Write a query on it?
It is used to hide the original name and owner of the database object. It provides security while sharing by hiding identity of the component. DML on synonym are reflected in the table.
Synonyms are two types.
1. Private synonyms. 2. Public synonyms
Create synonym esyn for emp;/* private synonym acts as view only*/
Create public synonym stu_info for student;
After this creation we can give permission to users for accessing. The user will not know where it has come from because with the same name it will be accessed any where.
8. What is sequence? Write a query on it?
It is used to generate numbers automatically. Not related to any table. It uses two pseudo columns
1. next val 2. Curr val
Create sequence s1 incremented by 1;
Select s1.next val from dual;
Select s1.curr val from dual;
9. What is an index? Write a query on it?
It is a pointer it locates physical address of data. It will improve performance while retrieving or manipulating data from the table. It is automatically activated when indexed column is referred in where clause.
Create index idx1 on emp( job);
We can have composite index.
Create index idx2 on emp( deptno, job);
10. What is ‘role’? write a query on it?
It is created by DBA.
It holds the collection of permissions on default database objects to stored.
For example there are three tables.
Create role rahat_role;
Grant all on emp to rahat_role;
Grant select, insert on dept to rahat_role;
Grant all on student to rahat_role ;
Grant role rahat_role to ugy;
Here rahat_role is a role i.e collection of permissions on different tables. Ugy is a user
Rahat_role is given to the user ugy by the user rahat.
To delete a role
Drop role rahat_role;
Revoke all on emp from rahat_role;
The query gets back the permissions from the role rahat_role

Manual Testing

Testing is process execution of application in controlled manner with the intent of

finding the errors. It is nothing but “Detection”.

The Quality Assurance involves through out the software development process to

monitor and improve the process, making sure that agreed upon standards and

procedures are followed and ensuring that problems are found and dealt with it. It

is oriented for “Prevention”.

Solving problems is a high-visibility process; preventing problems is low-visibility.

This is illustrated by an old parable.

Software Industry

 In India itself, Software industry growth has been phenomenal.

 IT field has enormously grown in the past 50 years.

 IT industry in India is expected to touch 10,000 crores of which software share is

dramatically increasing.

Software Crisis

 Software cost/schedules are grossly inaccurate.

Cost overruns of several times, schedule slippage’s by months, or even years are

common.

 Productivity of people has not kept pace with demand

Added to it is the shortage of skilled people.

 Quality of software is than desired

Error rates of released software leave customer dissatisfied…Threatening the very

business.

Software Myths

 Management Myths

 Software Management is different.

 Why change or approach to development?

 We have provided the state-of-the-art hardware.

 Problems are technical

 If project is late, add more engineers.

 We need better people.

 Developers Myths

 We must start with firm requirements

 Why bother about Software Engineering techniques, I will go to terminal and

code it.

 Once coding is complete, my job is done.

 How can you measure the quality..it is so intangible.

 Customer’s Myth

 A general statement of objective is good enough to produce software.

 Anyway software is “Flexware”, it can accommodate my changing needs.

 What do we do ?

 Use Software Engineering techniques/processes.

 Institutionalize them and make them as part of your development culture.

 Adopt Quality Assurance Frameworks : ISO, CMM

 Choose the one that meets your requirements and adopt where necessary.

Software Engineering:

RequiQreumicekn Dtse sGiganthering

Refine Requirements Build Prototype

Customer evaluation of the prototype

Design

Implement

MaTienstat in

Requirements Analysis

 Software Engineering is an engineering discipline concerned with the

practical problems of developing large software.

 Software Engineering discipline tracks both technical & non-technical

problems associated with software development.

 Challenge for Software Engineers is to produce high quality software

with finite amount of resources & within a predicted schedule.

 Apply Engineering Concepts to developing Software

 Apply Engineering Concepts to removing crisis.

Software Engineering Process

Consists of Three generic Phases:

 Definition, Development, and Maintenance.

Definition (What)

Customer Contact, Planning, Requirement Analysis.

Development Phase (How)

Design, Coding, Testing.

Maintenance Phase (Change)

Correction, Adaptation, Enhancement, Reengineering.

Support Activities

Quality Assurance, Configuration Management.

Software Life Cycle Models

 Prototyping Model

 Waterfall Model – Sequential

 Spiral Model

 V Model - Sequential

Prototyping Model of Software Development

A prototype is a toy implementation of a system; usually exhibiting limited functional capabilities,

low reliability, and inefficient performance. There are several reasons for developing a prototype.

An important purpose is to illustrate the input data formats, messages, reports and the interactive

dialogues to the customer. This a valuable mechanism for gaining better understanding of the

customer’s needs. Another important use of the prototyping model is that it helps critically

examine the technical issues associated with the product development.

Customer

Suggestions

Acceptance by Customer

Planning

Validation

Installation Operation and Maintenance

Testing and Integration

Detailed Design

Verification

System

Design

Verification

Coding

Verification

Classic Waterfall Model

In a typical model, a project begins with feasibility analysis. On successfully demonstrating the

feasibility of a project, the requirements analysis and project planning begins. The design

starts after the requirements analysis is complete, and coding begins after the design is

complete. Once the programming is completed, the code is integrated and testing is done. On

successful completion of testing, the system is installed. After this, the regular operation and

maintenance of the system takes place.

Feasibility

Report

Requirement Document

And Project Plan

System Design

Document

Detailed

Design Document

Programs

Test Plan, Test Report

And Manuals

Installation

Report

Typical Spiral Model

Process Feedback

Project Process

Process Review

Study

RFP

Proposal Proposal

Study Review

Working System

Acceptance Customer

Test Review

Actual

Proposal

Requirement

Study Requirement

Review

Deliverable System

System

Testing Certification

Approved

S.R.S

System Functional

Analysis Specification

Review

Integrated Tested Code

Integration Integration

Approved

S.R.S

Design Design

Document Review

Unit Tested Code

Unit Test Test Audit

Approved

S.D.S

Detailed Program

Design Specification

Entry Criteria

Task Validation

Approved

S.P.S

Coding Code

Inspection

Design

Typical Spiral Model

Developed by Barry Boehm in 1988. it provides the potential for rapid development of

incremental versions of the software. In the spiral model, software is developed in a series of

incremental releases. During early iterations , the incremental release might be a paper

model or prototype.

Each iteration consists of

Planning, Risk Analysis, Engineering, Construction & Release & Customer Evaluation.

Customer Communication: Tasks required to establish effective communication between

developer and customer.

Planning: Tasks required to define resources, timelines, and other project related information.

Risk Analysis: Tasks required to assess both technical and management risks.

Engineering: Tasks required to build one or more representatives of the application.

Construction & Release: Tasks required to construct, test, install and provide user support

(e.g., documentation and training)

Customer evaluation: Tasks required to obtain customer feedback based on evaluation of the

software representations created during the engineering stage and implemented during the

installation state.

Project

Closure Report

Accepted

Proposal

Delivered

System

Requirement

Specification

Executable

System

Functional

Specification

Unit Tested

Code

Design

Specification

Code

Design

Specification

Exit Criteria

V – Process Model

Project Process

(Project Plan)

Acceptance

Test Plan

System

Test Plan

Int.Test

Plan.

Unit

T.P

Legend

Chaos model

From Wikipedia, the free encyclopedia.

In computing, the Chaos model is a structure of software development that extends the spiral

model and waterfall model. Raccoon defined the chaos model.

The chaos model notes that the phases of the life cycle apply to all levels of a project, from

the whole project to individual lines of code.

 The whole project must by defined, implemented, and integrated.

 Systems must by defined, implemented, and integrated.

 Modules must be defined, implemented, and integrated.

 Functions must be defined, implemented, and integrated.

 Lines of code are defined, implemented and integrated.

There are several tie-ins with chaos theory.

 The chaos model may help explain why software is so unpredictable.

 It explains why high-level concepts like architecture cannot be treated independently

of low-level lines of code.

 It provides a hook for explaining what do next, in terms of the chaos strategy.

Chaos strategy

From Wikipedia, the free encyclopedia.

The chaos strategy is an approach to software development that extends other strategies

(such as step-wise refinement), and it works with the chaos model.

The main rule is always resolve the most important issue first.

 An issue is an incomplete programming task.

 The most important issue is a combination of big, urgent, and robust.

• Big issues provide value to users as working functionality.

• Urgent issues are timely that they would otherwise hold up other work.

• Robust issues are trusted and tested. Developers can then safely focus their

attention elsewhere.

 To resolve means to bring it to a point of stability.

The chaos strategy resembles the way that programmers work toward the end of a project,

when they have a list of bugs to fix and features to create. Usually someone prioritizes the

remaining tasks, and the programmers fix them one at a time. The chaos strategy states that

this is the only valid way to do the work.

The chaos strategy was inspired by Go strategy.

Top-down and bottom-up design

(Redirected from Top-Down Model)

Top-down and Bottom-up are approaches to the software development process, and by

extension to other procedures, mostly involving software.

In the Top-Down Model an overview of the system is formulated, without going into detail for

any part of it. Designing it in more detail then refines each part of the system. Each new part

may then be refined again, defining it in yet more detail until the entire specification is

detailed enough to begin development.

By contrast in bottom-up design individual parts of the system are specified in detail, and may

even be coded. The parts are then linked together to form larger components, which are in

turn linked until a complete system is arrived at.

Top down approaches emphasise planning, and a complete understanding of the system. It is

inherent that no coding can begin until a sufficient level of detail has been reached on at least

some part of the system. Bottom up emphasises coding, which can begin as soon as the first

module has been specified. However bottom-up coding runs the risk that modules may be

coded without having a clear idea of how they link to other parts of the system, and that such

linking may not be as easy as first thought.

Modern software design approaches usually combine both of these approaches. Although an

understanding of the complete system is usually considered necessary for good design,

leading theoretically to a top-down approach, most software projects attempt to make use of

existing code to some degree. Pre-existing modules give designs a 'bottom-up' flavor. Some

design approaches also use an approach where a partially functional system is designed and

coded to completion, and this system is then expanded to fulfill all the requirements for the

project.

Iterative and Incremental development

From Wikipedia, the free encyclopedia.

Iterative and Incremental development is a software development process, one of the

practices used in Extreme programming.

The basic idea behind iterative enhancement is to develop a Software system incrementally,

allowing the Developer to take advantage of what was being learned during the development

of earlier, incremental, deliverable versions of the system. Learning comes from both the

development and use of the system, where possible. Key steps in the process were to start

with a simple implementation of a subset of the software requirements and iteratively enhance

the evolving sequence of versions until the full system is implemented. At each iteration,

design modifications are made along with addition new functional capabilities.

The Procedure itself consists of the Initialization step, the Iteration step, and the Project

Control List. The initialization step creates a base version of the system. The goal for this

initial implementation is to create a product to which the user can react. It should offer a

sampling of the key aspects of the problem and provide an solution that is simple enough to

understand and implement easily. To guide the iteration process, a project control list is

created that contains a record of all tasks that need to be preformed. It includes such items as

new features to be implemented and areas of redesign of the exiting solution. The control list

is constantly being revised as a result of the analysis phase.

The iteration step involves the redesign and implementation of a task from project control list,

and the analysis of the current version of the system. The goal for the design and

implementation of any iteration is to be simple, straightforward, and modular, supporting

redesign at that stage or at as a task added to the project control list. The code represents the

major source of documentation of the system. The analysis of iteration is based upon user

feedback and the program analysis facilities available. It involves analysis of the structure,

modularity, Usability, reliability, efficiency, and achievement of goals. The project control list is

modified in the of the analysis results.

Guidelines the drive the implementation and analysis include:

 Any difficulty in design, coding and testing a modification should signal then need for

redesign or re-coding.

 Modifications should fit easily into isolated and easy-to-find- modules. If they do not,

some redesign is needed.

 Modifications to tables should be especially easy to make. If any table modification is

not quickly and easily done, redesign is indicated.

 Modifications should be come easier to make as the iterations progress. If they are

not, there is a basic problem such as a design flaw or a proliferation of patches.

 Patches should normally be allowed to exist for only one or two iterations. Patches

may be necessary to avoid redesigning during an implementation phase.

 The existing implementation should be analyzed frequently to determine how well it

measures up to project goals.

 Program analysis facilities should be used whenever available to aid in the analysis

of partial implementations

 User reaction should be solicited and analyzed for indications of deficiencies in the

current implementation.

Iterative Enhancement was successfully applied to the development of an extendable family of

compilers for a family of programming languages on a variety of hardware architectures. A set

of 17 versions of the system was developed at one site generating 17 thousand source lines of

high-level language (6500 lines of executable code). The system was further developed at two

different sites, leading to two different versions of the base language: one version essentially

focused on mathematical applications, adding real numbers and various mathematical

functions, and the other adding compiler writing capabilities. Each iteration was analyzed from

the user's point of view (the language capabilities were determined in part by the user's

needs) and the developer's point of view (the compiler design evolved to be more easily

modified for characteristics like adding new data types). Measurement such as coupling and

modularization were tracked over multiple versions.

Using analysis and measurement as drivers of the enhancement process is one major

difference between iterative enhancement and the current Agile processes. It provides support

for determining the effectiveness of the processes and the quality of product. It allows one to

study, and therefore improve and tailor, the processes for the particular environment. This

measurement and analysis activity can be added to existing agile development methods.

In fact, the context of multiple iterations provides advantages in the use of measurement.

Measures are sometimes difficult to understand in the absolute but the relative changes in

measures over the evolution of the system can be very informative as they provide a basis for

comparison. For example, a vector of measures, m1, m2, ... mn, can be defined to

characterize various aspects of the product at some point in time, e.g., effort to date, changes,

defects, logical, physical, and dynamic attributes, environmental considerations. Thus an

observer can tell how product characteristics like size, complexity, coupling, and cohesion are

increasing or decreasing over time. One can monitor the relative change of the various aspects

of the product or can provide bounds for the measures to signal potential problems and

anomalies.

Testing

 An examination of the behavior of a program by executing on sample data sets.

 Testing comprises of set of activities to detect defects in a produced material.

 To unearth & correct defects

 To detect defects early & to reduce cost of defect fixing

 To avoid user detecting problems

 To ensure that product works as users expected it to.

Why Testing?

• To unearth and correct defects.

• To detect defects early and to reduce cost of defect fixing.

• To ensure that product works as user expected it to.

• To avoid user detecting problems.

What is the 'software life cycle'?

The life cycle begins when an application is first conceived and ends when it is no

longer in use. It includes aspects such as

Initial concept, Test planning, Maintenance,

Requirements analysis, Coding, Updates,

Functional design, Document preparation, Retesting,

Internal design, Integration, Phase-out, and

Documentation planning, Testing, Other aspects.

Test Life Cycle

Identify Test Candidates 5- Evaluate Results

Test Plan 6- Document Test Results

Design Test Cases 7- Casual Analysis/ Preparation of Validation Reports

Execute Tests 8- Regression Testing / Follow up on reported bugs

Types of Tests

Contract Other Tests

System Req. Spec. System Tests

Functional Spec. Functional Tests

H.LD. Integration Testing

L.L.D. Unit Testing

What is software 'quality'?

Quality software is reasonably bug-free, delivered on time and within budget, meets

requirements and/or expectations, and is maintainable. However, quality is obviously a

subjective term. It will depend on who the 'customer' is and their overall influence in the

scheme of things. A wide-angle view of the 'customers' of a software development project

might include end-users, customer acceptance testers, customer contract officers, customer

management, the development organization's management/accountants/testers/salespeople,

future software maintenance engineers, stockholders, magazine columnists, etc. Each type of

'customer' will have their own slant on 'quality' - the accounting department might define

quality in terms of profits while an end-user might define quality as user-friendly and bug-free.

What is 'Software Quality Assurance'?

Software QA involves the entire software development PROCESS,

monitoring and improving the process,

making sure that any agreed-upon standards and procedures are followed, and

ensuring that problems are found and dealt with. It is oriented to 'prevention'. ( OR )

The purpose of Software Quality Assurance is to provide management with appropriate

visibility into the process being used by the software project and of the products being built.

Software Quality Assurance involves reviewing and auditing the software products and

activities to verify that they comply with the applicable procedures and standards and

providing the software project and other appropriate managers with the results of these

reviews and audits.

(See the Bookstore section's 'Software QA' category for a list of useful books on Software

Quality Assurance.)

What is Quality Control (QC)?

QC is the series of inspections, reviews, and tests used throughout the development cycle to

ensure that each work product meets the requirements placed upon it. QC includes a feedback

loop to the process that created the work product. The combination of measurement and

feedback allows us to tune the process when the work products created fail to meet their

specification. These approach views QC as part of the manufacturing process QC activities may

be fully automated, manual or a combination of automated tools and human interaction. An

essential concept of QC is that all work products have defined and measurable specification to

which we may compare the outputs of each process the feedback loop is essential to minimize

the defect produced.

What is 'Software Testing'?

Testing involves operation of a system or application under controlled conditions and

evaluating the results (eg, 'if the user is in interface A of the application while using hardware

B, and does C, then D should happen'). The controlled conditions should include both normal

and abnormal conditions. Testing should intentionally attempt to make things go wrong to

determine if things happen when they shouldn't or things don't happen when they should. It is

oriented to 'detection'. (See the Bookstore section's 'Software Testing' category for a list of

useful books on Software Testing.)

Organizations vary considerably in how they assign responsibility for QA and testing.

Sometimes they're the combined responsibility of one group or individual. Also common are

project teams that include a mix of testers and developers who work closely together, with

overall QA processes monitored by project managers. It will depend on what best fits an

organization's size and business structure.

Principles of Good Testing:

COMPLETE TESTING ISN'T POSSIBLE

 No matter how much you test, it is impossible to achieve total confidence

 The only exhaustive test is one that leaves the tester exhausted!

TEST WORK IS CREATIVE AND DIFFICULT

 Understand and probe what the system is supposed to do

 Understand and stress the limitations and constraints

 Understand the domain and application in depth.

TESTING IS RISK-BASED

 We can't identify all risks of failure.

 Risk assessments indicate how much to test and what to focus on.

ANALYSIS, PLANNING, AND DESIGN ARE IMPORTANT

 Test objectives must be identified and understood

 Tests must be planned and designed systematically

 Without a road map, you will get lost

MOTIVATION IS IMPORTANT

 You cannot be effective if you don't care about the job

 You must want to find problems and enjoy trying to break the system

TIME AND RESOURCES ARE IMPORTANT

 You can't be effective if you don't have the time or resources to do the job

TIMING OF TEST PREPARATION MATTERS A LOT

 Early test preparation leads to an understanding of project requirements and

design.

 Early test preparation uncovers and prevents problems.

 Early tests improve the effectiveness of subsequent reviews and inspections

MEASURING AND TRACKING COVERAGE IS ESSENTIAL

 You need to know what requirements, design, and code have and have not been

covered

 Complex software is too difficult to cover without systematic measurement

Three Major Concerns in Multiplatform Testing:

 The platform in the test lab will not be representatives of the platform in the

real world. This can happen because the platform in the test lab may not be

updated to the current specifications or may be configured in a manner that

is not representative of the typical configuration for the platform.

 The software will be expected to work on platform not included in the test

labs. By implications users may expect the software to work on platform that

has not been included in testing.

 The supporting software on various platforms is not comprehensive. User

platform may contain software that is not the same as used in the platform

in Test lab, for example: a different database management system and so

forth.

What are some recent major computers system failures caused by software bugs?

In January of 2001 newspapers reported that a major European railroad was hit by the

aftereffects of the Y2K bug. The company found that many of their newer trains would not run

due to their inability to recognize the date '31/12/2000'; altering the control system’s date

settings started the trains.

News reports in September of 2000 told of a software vendor settling a lawsuit with a

large mortgage lender; the vendor had reportedly delivered an online mortgage processing

system that did not meet specifications, was delivered late, and didn't work.

In early 2000, major problems were reported with a new computer system in a large

suburban U.S. public school district with 100,000+ students; problems included 10,000

erroneous report cards and students left stranded by failed class registration systems; the

district's CIO was fired. The school district decided to reinstate it's original 25-year old system

for at least a year until the bugs were worked out of the new system by the software vendors.

In October of 1999 the $125 million NASA Mars Climate Orbiter spacecraft was

believed to be lost in space due to a simple data conversion error. It was determined that

spacecraft software used certain data in English units that should have been in metric units.

Among other tasks, the orbiter was to serve as a communications relay for the Mars Polar

Lander mission, which failed for unknown reasons in December 1999. Several investigating

panels were convened to determine the process failures that allowed the error to go

undetected.

Bugs in software supporting a large commercial high-speed data network affected

70,000 business customers over a period of 8 days in August of 1999. Among those affected

was the electronic trading system of the largest U.S. futures exchange, which was shut down

for most of a week as a result of the outages.

In April of 1999 a software bug caused the failure of a $1.2 billion military satellite

launch, the costliest unmanned accident in the history of Cape Canaveral launches. The failure

was the latest in a string of launch failures, triggering a complete military and industry review

of U.S. space launch programs, including software integration and testing processes.

Congressional oversight hearings were requested. A small town in Illinois received an

unusually large monthly electric bill of $7 million in March of 1999. This was about 700 times

larger than it's normal bill. It turned out to be due to bugs in new software that had been

purchased by the local Power Company to deal with Y2K software issues.

In early 1999 a major computer game company recalled all copies of a popular new

product due to software problems. The company made a public apology for releasing a product

before it was ready. The computer system of a major online U.S. stock trading service failed

during trading hours several times over a period of days in February of 1999 according to

nationwide news reports. The problem was reportedly due to bugs in a software upgrade

intended to speed online trade confirmations. In April of 1998 a major U.S. data

communications network failed for 24 hours, crippling a large part of some U.S. credit card

transaction authorization systems as well as other large U.S. bank, retail, and government

data systems. The cause was eventually traced to a software bug.

January 1998 news reports told of software problems at a major U.S.

telecommunications company that resulted in no charges for long distance calls for a month

for 400,000 customers. The problem went undetected until customers called up with questions

about their bills.

In November of 1997 the stock of a major health industry company dropped 60% due

to reports of failures in computer billing systems, problems with a large database conversion,

and inadequate software testing. It was reported that more than $100,000,000 in receivables

had to be written off and that multi-million dollar fines were levied on the company by

government agencies.

A retail store chain filed suit in August of 1997 against a transaction processing system

vendor (not a credit card company) due to the software's inability to handle credit cards with

year 2000 expiration dates.

In August of 1997 one of the leading consumer credit reporting companies reportedly

shut down their new public web site after less than two days of operation due to software

problems. The new site allowed web site visitors instant access, for a small fee, to their

personal credit reports. However, a number of initial users ended up viewing each other’s

reports instead of their own, resulting in irate customers and nationwide publicity. The

Problem was attributed to "...unexpectedly high demand from consumers and faulty

software that routed the files to the wrong computers."

In November of 1996, newspapers reported that software bugs caused the 411-

telephone information system of one of the U.S. RBOC's to fail for most of a day. Most of the

2000 operators had to search through phone books instead of using their 13,000,000-listing

database. The bugs were introduced by new software modifications and the problem software

had been installed on both the production and backup systems. A spokesman for the software

vendor reportedly stated that 'It had nothing to do with the integrity of the software.

It was human error.' On June 4 1996 the first flight of the European Space Agency's

new Ariane 5 rocket failed shortly after launching, resulting in an estimated uninsured loss of a

half billion dollars. It was reportedly due to the lack of exception handling of a floating-point

error in a conversion from a 64-bit integer to a 16-bit signed integer.

Software bugs caused the bank accounts of 823 customers of a major U.S. bank to be

credited with $924,844,208.32 each in May of 1996, according to newspaper reports. The

American Bankers Association claimed it was the largest such error in banking history. A bank

spokesman said the programming errors were corrected and all funds were recovered.

Software bugs in a Soviet early-warning monitoring system nearly brought on nuclear

war in 1983, according to news reports in early 1999. The software was supposed to filter out

false missile detentions caused by Soviet satellites picking up sunlight reflections off cloudtops,

but failed to do so. Disaster was averted when a Soviet commander, based on a what he

said was a '...funny feeling in my gut', decided the apparent missile attack was a false alarm.

The filtering software code was rewritten.

Why is it often hard for management to get serious about quality assurance?

Solving problems is a high-visibility process; preventing problems is low-visibility. This is

illustrated by an old parable:

In ancient China there was a family of healers, one of whom was known throughout the

land and employed as a physician to a great lord. The physician was asked which of his family

was the most skillful healer. He replied, "I tend to the sick and dying with drastic and dramatic

treatments, and on occasion someone is cured and my name gets out among the lords."

"My elder brother cures sickness when it just begins to take root, and his skills are known

among the local peasants and neighbors." "My eldest brother is able to sense the spirit of

sickness and eradicate it before it takes form. His name is unknown outside our home."

How can new Software QA processes be introduced in an existing organization?

A lot depends on the size of the organization and the risks involved. For large

organizations with high-risk (in terms of lives or property) projects, serious management buyin

is required and a formalized QA process is necessary.

Where the risk is lower, management and organizational buy-in and QA implementation

may be a slower, step-at-a-time process. QA processes should be balanced with productivity

so as to keep bureaucracy from getting out of hand. For small groups or projects, a more adhoc

process may be appropriate, depending on the type of customers and projects. A lot will

depend on team leads or managers, feedback to developers, and ensuring adequate

communications among customers, managers, developers, and testers. In all cases the most

value for effort will be in requirements management processes, with a goal of clear, complete,

testable requirement specifications or expectations.

What are 5 common problems in the software development process?

Poor requirements - if requirements are unclear, incomplete, too general, or not testable,

there will be problems.

Unrealistic schedule - if too much work is crammed in too little time, problems are

inevitable.

Inadequate testing - no one will know whether or not the program is any good until the

customer complains or systems crash.

Futurities - requests to pile on new features after development is underway; extremely

common.

Miscommunication - if developers don't know what's needed or customers have erroneous

expectations, problems are guaranteed.

What are 5 common solutions to software development problems?

Solid requirements - clear, complete, detailed, cohesive, attainable, testable requirements

that are agreed to by all players. Use prototypes to help nail down requirements.

Realistic schedules - allow adequate time for planning, design, testing, bug fixing, retesting,

changes, and documentation; personnel should be able to complete the project

without burning out.

Adequate testing - starts testing early on, re-test after fixes or changes, plan for adequate

time for testing and bug fixing.

Stick to initial requirements as much as possible - be prepared to defend against

changes and additions once development has begun, and be prepared to explain

consequences. If changes are necessary, they should be adequately reflected in related

schedule changes. If possible, use rapid prototyping during the design phase so that customer

can see what to expect. This will provide them a higher comfort level with their requirement

decisions and minimize changes later on.

Communication - requires walkthroughs and inspections when appropriate; make extensive

use of group

Communication tools - e-mail, GroupWare, networked bug-tracking tools and change

management tools Intranet capabilities, etc.; insure that documentation is available and upto-

date - preferably electronic, not paper; promote teamwork and cooperation; use prototypes

early on so that customers' expectations are clarified.

Why does software have bugs?

Miscommunication or no communication

Software complexity

Programming errors

Changing requirements

Time pressures

Poorly documented code

Software development tools

What is Verification? Validation?

Verification typically involves reviews and meetings to evaluate documents, plans, code,

requirements, and specifications. This can be done with checklists, issues lists, walkthroughs,

and inspection meetings. Validation typically involves actual testing and takes place after

verifications are completed.

What is a 'walkthrough'?

A 'walkthrough' is an informal meeting for evaluation or informational purposes. Little or no

preparation is usually required.

What's an 'inspection'?

An inspection is more formalized than a 'walkthrough', typically with 3-8 people including a

moderator, reader, and a recorder to take notes. The subject of the inspection is typically a

document such as a requirements spec or a test plan, and the purpose is to find problems and

see what's missing, not to fix anything.

What kinds of testing should be considered?

Black box testing - not based on any knowledge of internal designs or code. Tests are based

on requirements and functionality.

White box testing - based on knowledge of the internal logic of an application's code. Tests

are based on coverage of code statements, branches, paths, and conditions.

Unit testing - the most 'micro' scale of testing; to test particular functions or code modules.

Typically done by the programmer and not by testers, as it requires detailed knowledge of the

internal program design and code. Not always easily done unless the application has a welldesigned

architecture with tight code; may require developing test driver modules or test

harnesses.

Incremental integration testing - continuous testing of an application as new functionality

is added; requires that various aspects of an application's functionality be independent enough

to work separately before all parts of the program are completed, or that test drivers be

developed as needed; done by programmers or by testers.

Integration testing - testing of combined parts of an application to determine if they

function together correctly. The 'parts' can be code modules, individual applications, client and

server applications on a network, etc. This type of testing is especially relevant to client/server

and distributed systems.

Functional testing - black box type testing geared to functional requirements of an

application; this type of testing should be done by testers. This doesn't mean that the

programmers shouldn't check that their code works before releasing it.

System testing - black box type testing that is based on overall requirements specifications;

covers all combined parts of a system.

end-to-end testing - similar to system testing; the 'macro' end of the test scale; involves

testing of a complete application environment in a situation that mimics real-world use, such

as interacting with a database, using network communications, or interacting with other

hardware, applications, or systems if appropriate.

Sanity testing - Typically an initial testing effort to determine if a new software version is

performing well enough to accept it for a major testing effort. For example, if the new

software is crashing systems every 5 minutes, bogging down systems to a crawl, or destroying

databases, the software may not be in a 'sane' enough condition to warrant further testing in

its current state.

Regression testing - re-testing after fixes or modifications of the software or its

environment. It can be difficult to determine how much re-testing is needed, especially near

the end of the development cycle. Automated testing tools can be especially useful for this

type of testing.

Acceptance testing - final testing based on specifications of the end-user or customer, or

based on use by end-users/customers over some limited period of time.

Load testing - testing an application under heavy loads, such as testing of a web site under a

range of loads to determine at what point the system's response time degrades or fails.

Stress testing - term often used interchangeably with 'load' and 'performance' testing. Also

used to describe such tests as system functional testing while under unusually heavy loads,

heavy repetition of certain actions or inputs, input of large numerical values, large complex

queries to a database system, etc.

Performance testing - term often used interchangeably with 'stress' and 'load' testing.

Ideally 'performance' testing (and any other 'type' of testing) is defined in requirements

documentation or QA or Test Plans.

Usability testing - testing for 'user-friendliness'. Clearly this is subjective, and will depend on

the targeted end-user or customer. User interviews, surveys, video recording of user sessions,

and other techniques can be used. Programmers and testers are usually not appropriate as

usability testers.

Install / uninstall testing - testing of full, partial, or upgrade install/uninstall processes.

Recovery testing - testing how well a system recovers from crashes, hardware failures, or