Title: Multi-RQP Generating Test Databases for the Functional Testing of OLTP Applications
1Multi-RQP Generating Test Databases for the
Functional Testing of OLTP Applications
- Carsten Binnig
- Joint work with Donald Kossmann, Eric Lo
DBTest Workshop, SIGMOD 2008, Vancouver
2Motivation
- Todays testing techniques are not very efficient
- 20-70 of the costs of a SW project are spent for
testing - ? Costs caused by software errors in the US in
2000 20-60 bn - Test automation is not a trivial problem
- Writing test programs which verify the
application behavior - ? Maintainability / Quality of test programs is
not easy - Test automation for DB apps is even harder
- Testing a certain behavior needs a particular
database state - Existing tools generate test databases
independent from test cases - How to generate relevant test databases for OLTP
apps?(other work ICDE07, SIGMOD07)
3Example Online-Library (1)
- Use Case Reservation of a book
- User enters the inventory number of the book
- System shows the details of the book
- Exception 1 Book belongs to the closed stacks
- User enter her account information
- System checks account info and finishes
reservation - Exception 2 User account is disabled
- Exception 3 User account that has overdue fines
4Example Online-Library (2)
- Test Case (Expected Behavior) Test Database
- Test Case 1 (Exception 1)
- A book that belongs to the closed stack
- Test Case 2 (Exception 2)
- A book that does not belong to the closed stacks
- A user account that is disabled
- Test Case 3 (Exception 3)
- Test Case 4 (Successful Reservation)
- A book that does not belong to the closed stacks
- A valid user account with no overdue fines
5Outline
- Introduction
- State of the Art / Requirements
- (Multi-) Reverse Query Processing (RQP)
- Conclusions / Future Work
6State of the Art
- General-purpose Database Generators gt Random
data over database schema (size of tables, data
distributions) - Low test coverage gt Data does not enable
execution of all test cases - High maintenance costs gt Manual adaptation of
data necessary - Script-based Database Generators gt
Application-specific data (e.g., a bunch of SQL
INSERT statements) - High initial costs gt Writing code to generate
test database - Hard to extend gt Hard to analyze side-effects
- Data Extractors gt Extract data from existing
applications - Test coverage strongly depends on existing data
- High initial effort gt Test data needs to be
anonymized
7Requirements for the Generation of Test
Databases
- Specify a test database individually per test
case - High test coverage is possible
- Good extensibility for new test cases
- Allow a declarative specification of the test
database - Maintainability of the specification is good
(automatic evolution?) - Data generation can be optimized (runtime, amount
of data) - Specify only relevant the data for each test case
- Initial costs to specify the test database are
low - Changeability of specification is good
- Enable logical data independence of test data
specification - Database schema can be changed without changing
the specification of the test database
8Outline
- Introduction
- State of the Art / Requirements
- (Multi-) Reverse Query Processing (RQP)
- Conclusions / Future Work
9Reverse Query Processing (RQP)
- Problem Statement
- Given SQL Query Q, Result R, Database Schema S
- Output Database D with Q(D)R and D satisfies S
- Example (Test Case 1) There exists at least one
book which does belong to the closed stacks
S CREATE TABLE book ( b_id INTEGER PRIMARY
KEY, b_title VARCHAR (20), b_closed BOOLEAN NOT
NULL, ... )
Application
D
RQP
Q SELECT COUNT() AS cnt FROM book WHERE
b_closed 1
Test Case 1
R lt gt1 gt
10RQP Basic Idea
- Query Processing
- Input Database D, Query Q
- Output Result R
- Reverse Query Processing
- Input Query Q, Result R
- Output Database D
Result
Query
Query Processor
Reverse Query Processor
Database
Query
gt RQP can generate many different databases
11Query Processing (Simplified)
SQL Query
Query Result
Query Plan (Relational Algebra)
R
?COUNT() as cnt
Q SELECT COUNT() FROM book WHERE
b_closed 1
sb_closed 1
D
book
12Reverse Query Processing (Query Compilation)
Reverse Query Plan(Reverse Relational Algebra)
SQL Query
?-1COUNT() as cnt
Q SELECT COUNT() AS cnt FROM book WHERE
b_closed 1
s-1b_closed 1
Data flow
book
13Reverse Query Processing (Top-Down Data
Generation)
R lt gt1 gt
R
?-1COUNT() as cnt
Q
s-1b_closed 1
Data flow
book
D
S CREATE TABLE book ( b_id INTEGER PRIMARY
KEY, b_title VARCHAR (20) NOT NULL,
b_closed BOOLEAN NOT NULL, ... )
14Multi Reverse Query Processing
- Problem
- One query result are often not sufficient to
specify a test database for more complex test
cases - Multiple queries result are necessary
- Example (Test Case 4)
- Book that does not belong to the closed stacks
(Q1 und R1) and - Valid user account without overdue fines (Q2 und
R2) - Idea Restrict input query classes such that
- MRQP can be solved efficiently by using RQP
- User can still specify any test database
15RQP-disjoint Queries
- Idea Q1/R1 and Q2/R2 specify disjoint data sets
- Example (Test Case 4)
- Book that does not belong to the closed stacks
(Q1 and R1) and - Valid user account without overdue fines (Q2 and
R2)
Q1 SELECT COUNT() AS cnt FROM book
WHERE b_closed0 R1 lt 1 gt
Q2 SELECT COUNT() AS cnt FROM user
WHERE u_fines0 R2 lt 1 gt
D1
D2
Table book
Table user
16Query-Refinement
- Idea Q1/R1 specifies a subset of Q2/R2
- Example (Test Case for Use Case Book Search)
- Ten books of author Grisham (Q1 and R1)
- One of these books should belong to the closed
stacks (Q2 and R2)
Q1 SELECT COUNT() AS cnt FROM book
WHERE b_author'Grisham' R1 lt 10 gt
Q2 SELECT COUNT() AS cnt FROM book
WHERE b_author'Grisham' AND b_closed 1 R2
lt 1 gt
Q1 SELECT COUNT() AS cnt FROM book
WHERE b_author'Grisham' AND b_closed ltgt
1 R1 lt 9 gt
D2
D1
Table book
Table book
17Outline
- Introduction
- State of the Art / Requirements
- (Multi-) Reverse Query Processing (RQP)
- Conclusions / Future Work
18Conclusions / Future Work
- Problems of existing test database generators
- Test databases are generated independent of test
cases - Low test coverage, high maintainability costs,
- (M)RQP to specify and generate test databases
- Test data specification declarative, minimal,
- Data generation based on database techniques
(e.g., algebra operators, ) - Open Research Problems
- Evolution of test databases
- Study usability of MRQP
19 ? snoitseuQ
Questions ?