Title: Data Structures for SAT Solvers: The 2-Literal Representation
1. Data Structures for SAT Solvers: The 2-Literal Representation
- Gábor Kusper, gkusper_at_aries.ektf.hu
- Eszterházy Károly College
- Eger, Hungary
2. Boolean Satisfiability (SAT)
- Identify a truth assignment that satisfies a Boolean formula, or prove that none exists
- Well-known NP-complete problem
3. Outline
- Notation
- Data structures used by SAT solvers
- Literal matrix (Scherzo)
- Adjacency lists (GRASP, ...)
- Head/tail lists (SATO)
- Watched literals (Chaff)
- New data structure
- 2-Literal Matrix
4. Conjunctive Normal Form (CNF)
φ = (a ∨ c) ∧ (b ∨ c) ∧ (¬a ∨ ¬b ∨ ¬c)
5. Literal and Clause Classification
φ = (a ∨ ¬b) ∧ (¬a ∨ b ∨ ¬c) ∧ (a ∨ c ∨ d) ∧ (¬a ∨ ¬b ∨ ¬c)
6. Additional Definitions
- Resolution
- Example: ω1 = (¬a ∨ b ∨ c), ω2 = (a ∨ b ∨ d)
- Resolution: res(ω1, ω2, a) = (b ∨ c ∨ d)
- Unit Propagation
- An unresolved clause is unit if it has exactly one unassigned literal
- Example: φ = (a ∨ c) ∧ (b ∨ c) ∧ (¬a ∨ ¬b ∨ ¬c), with a = 1 and b = 1
- A unit clause has exactly one option for being satisfied
- Here the third clause is unit, so c must be set to 0.
- Boolean Constraint Propagation: iterated application of unit propagation
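The two definitions above can be sketched in a few lines of Java. This is only an illustration, not the data structure the slides go on to propose. The class name `BcpSketch`, the method names, and the encoding of literals as signed integers (+v for a variable, -v for its negation, as in the DIMACS format) are assumptions made for the example. The usage note also assumes the example formula reads (a OR c)(b OR c)(NOT a OR NOT b OR NOT c), with a, b, c numbered 1, 2, 3.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Sketch only: literals are non-zero ints, +v for variable v and -v for NOT v. */
final class BcpSketch {

    /** Resolvent of c1 and c2 on variable v; assumes v occurs with opposite signs in c1 and c2. */
    static List<Integer> resolve(List<Integer> c1, List<Integer> c2, int v) {
        List<Integer> out = new ArrayList<>();
        for (int lit : c1) if (Math.abs(lit) != v && !out.contains(lit)) out.add(lit);
        for (int lit : c2) if (Math.abs(lit) != v && !out.contains(lit)) out.add(lit);
        return out;
    }

    /**
     * Boolean constraint propagation: applies unit propagation until nothing changes.
     * Extends the assignment in place; returns false if some clause becomes unsat.
     */
    static boolean propagate(List<List<Integer>> cnf, Map<Integer, Boolean> assignment) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (List<Integer> clause : cnf) {
                int unassigned = 0;
                int lastFree = 0;
                boolean satisfied = false;
                for (int lit : clause) {
                    Boolean value = assignment.get(Math.abs(lit));
                    if (value == null) {
                        unassigned++;
                        lastFree = lit;
                    } else if (value == (lit > 0)) {
                        satisfied = true;
                        break;
                    }
                }
                if (satisfied) continue;
                if (unassigned == 0) return false;   // every literal is false: unsat clause
                if (unassigned == 1) {               // unit clause: its last free literal is forced
                    assignment.put(Math.abs(lastFree), lastFree > 0);
                    changed = true;
                }
            }
        }
        return true;
    }
}
```

Calling `propagate` on the clauses {1, 3}, {2, 3}, {-1, -2, -3} with variables 1 and 2 already set to true leaves the third clause unit and forces variable 3 to false, which is the "c must be set to 0" step of the slide.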
7. Data Structures
- Literal matrix (Scherzo)
- View CNF formula as a matrix, where the rows denote the clauses and the columns the variables
- 2-Literal matrix, NEW
- Adjacency lists (most SAT solvers)
- Counter-based state maintenance
- Keep counters of sat, unsat and unassigned (free) literals for each clause
- Lazy data structures
- Head/Tail lists (SATO)
- Watched literals (Chaff)
8. State-of-the-art SAT Solvers
- MiniSAT solver: http://www.cs.chalmers.se/Cs/Research/FormalMethods/MiniSat/
- Java SAT solver: http://www.sat4j.org/
- A paper about data structures: "Efficient data structures for backtrack search SAT solvers", Inês Lynce and João Marques-Silva
9. Literal Matrix
- View CNF formula as a matrix, where the rows denote the clauses and the columns the variables
- Assigned variables result in unsat literals
- Satisfied clauses result in sat clauses
- Each clause is an array of bits
- Each clause contains counters of sat, unsat and unassigned literals
- Used in the past in Binate Covering algorithms
- E.g. Scherzo, by Coudert et al., DAC'95 and DAC'96
10. 1-Literal Matrix Representation
- We can call the Literal Matrix the 1-Literal Matrix
- We encode combinations of 1-clauses; each 1-clause corresponds to a bit: 10 = a, 01 = ¬a
- The representation: 00 = sat, 10 = a, 01 = ¬a, 11 = unsat
11. 1-Literal Matrix
12. 1-Literal Matrix
a assigned 0
b assigned 1
13. Definition of k-clause
- A k-clause has k literals.
- Example: φ = (a ∨ c) ∧ (b ∨ c) ∧ (¬a ∨ ¬b ∨ ¬c)
- 3-clauses in this formula are
- (¬a ∨ ¬b ∨ ¬c)
- 2-clauses in this formula are
- (a ∨ c)
- (b ∨ c)
- There is no unit, i.e., 1-clause in this example.
14. 2-Literal Matrix Representation
- We encode combinations of 2-clauses. Each 2-clause corresponds to a bit: 1000 = a ∨ e, 0100 = a ∨ ¬e, 0010 = ¬a ∨ e, 0001 = ¬a ∨ ¬e
- It can encode every Boolean function of two variables.
- The representation:
  0 0000 sat        8 1000 a ∨ e
  1 0001 ¬a ∨ ¬e    9 1001 a ⊕ e
  2 0010 ¬a ∨ e     A 1010 e
  3 0011 ¬a         B 1011 ¬a ∧ e
  4 0100 a ∨ ¬e     C 1100 a
  5 0101 ¬e         D 1101 a ∧ ¬e
  6 0110 a ↔ e      E 1110 a ∧ e
  7 0111 ¬a ∧ ¬e    F 1111 unsat
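The claim that four bits cover every Boolean function of two variables can be checked mechanically. The sketch below reads a 4-bit code as the conjunction of the 2-clauses whose bits are set. It assumes the bit order 1000 = (a OR e), 0100 = (a OR NOT e), 0010 = (NOT a OR e), 0001 = (NOT a OR NOT e). The class name `TwoLiteralCode` and its method names are hypothetical, chosen only for this example.

```java
import java.util.HashSet;
import java.util.Set;

/** Sketch only: evaluates the Boolean function denoted by a 4-bit 2-literal code. */
final class TwoLiteralCode {
    // Assumed bit masks for the four 2-clauses over the variables a and e.
    static final int A_OR_E = 0b1000;
    static final int A_OR_NOT_E = 0b0100;
    static final int NOT_A_OR_E = 0b0010;
    static final int NOT_A_OR_NOT_E = 0b0001;

    /** A code stands for the conjunction of the 2-clauses whose bits are set. */
    static boolean eval(int code, boolean a, boolean e) {
        boolean result = true;                       // 0000: empty conjunction, i.e. sat
        if ((code & A_OR_E) != 0) result &= (a || e);
        if ((code & A_OR_NOT_E) != 0) result &= (a || !e);
        if ((code & NOT_A_OR_E) != 0) result &= (!a || e);
        if ((code & NOT_A_OR_NOT_E) != 0) result &= (!a || !e);
        return result;
    }

    /** Truth table as a 4-character string for (a, e) = 00, 01, 10, 11. */
    static String truthTable(int code) {
        StringBuilder sb = new StringBuilder();
        boolean[] values = {false, true};
        for (boolean a : values) {
            for (boolean e : values) {
                sb.append(eval(code, a, e) ? '1' : '0');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Set<String> distinct = new HashSet<>();
        for (int code = 0; code < 16; code++) {
            String bits = String.format("%4s", Integer.toBinaryString(code)).replace(' ', '0');
            System.out.println(Integer.toHexString(code).toUpperCase() + " " + bits + " -> " + truthTable(code));
            distinct.add(truthTable(code));
        }
        System.out.println(distinct.size() + " distinct truth tables");
    }
}
```

Each 2-clause rules out exactly one of the four assignments to (a, e), so the sixteen codes yield sixteen different truth tables, from always true (0000) to always false (1111).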
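15. 2-Literal Matrix
[Figure: 2-literal matrix. Legend: ++ = 1000, +- = 0100, -+ = 0010, -- = 0001, xx = 1111]
16. 2-Literal Matrix
[Figure: 2-literal matrix. Legend: ++ = 1000, +- = 0100, -+ = 0010, -- = 0001, xx = 1111]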
(?a ?c) assigned 1
a assigned 1
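17. Unit Propagation
// The fields nLiterals, unSatLit, subsumed and numberOfEffectiveLiterals
// belong to the enclosing class (not shown); BitSet is java.util.BitSet.
public void unitPropagation(int column, BitSet unitToProp) {
    // A cell that is already an unsat literal cannot change any further.
    if (nLiterals[column].equals(unSatLit))
        return;
    // If every bit of the cell is also set in the unit, the unit implies the cell.
    BitSet clone = (BitSet) nLiterals[column].clone();
    clone.and(unitToProp);
    if (clone.equals(nLiterals[column]))
        subsumed = true;
    // Conjoin the unit with the cell; a cell of all ones is an unsat literal.
    nLiterals[column].or(unitToProp);
    if (nLiterals[column].equals(unSatLit))
        numberOfEffectiveLiterals--;
}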
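To see the method above at work, the following sketch wraps it in a small class. The slide shows only the method. The class `TwoLiteralRow`, its constructor, the `bits` helper and the way `numberOfEffectiveLiterals` is initialised are therefore assumptions added for illustration. The cell codes follow the convention 1100 = a, 0011 = NOT a, 1111 = unsat.

```java
import java.util.BitSet;

/** Hypothetical holder for one clause row of a 2-literal matrix (sketch only). */
final class TwoLiteralRow {
    final BitSet[] nLiterals;                 // one 4-bit cell per column
    final BitSet unSatLit = bits("1111");     // the all-ones cell: unsat literal
    boolean subsumed = false;
    int numberOfEffectiveLiterals = 0;        // assumed: number of cells that are not yet unsat

    TwoLiteralRow(String... cells) {
        nLiterals = new BitSet[cells.length];
        for (int i = 0; i < cells.length; i++) {
            nLiterals[i] = bits(cells[i]);
            if (!nLiterals[i].equals(unSatLit)) numberOfEffectiveLiterals++;
        }
    }

    /** Parses a code such as "1100"; character i of the string maps to bit index i. */
    static BitSet bits(String s) {
        BitSet b = new BitSet(s.length());
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == '1') b.set(i);
        }
        return b;
    }

    /** Same logic as the method on the slide. */
    void unitPropagation(int column, BitSet unitToProp) {
        if (nLiterals[column].equals(unSatLit)) return;
        BitSet clone = (BitSet) nLiterals[column].clone();
        clone.and(unitToProp);
        if (clone.equals(nLiterals[column])) subsumed = true;   // the unit implies the cell
        nLiterals[column].or(unitToProp);
        if (nLiterals[column].equals(unSatLit)) numberOfEffectiveLiterals--;
    }

    public static void main(String[] args) {
        BitSet unitA = bits("1100");                             // the unit a
        TwoLiteralRow row = new TwoLiteralRow("0011", "0010");   // cells NOT a and (NOT a OR e)
        row.unitPropagation(0, unitA);
        row.unitPropagation(1, unitA);
        System.out.println("subsumed=" + row.subsumed
                + ", effective literals=" + row.numberOfEffectiveLiterals);
    }
}
```

Propagating the unit 1100 into a cell holding 0011 yields 1111, so the cell becomes an unsat literal and the counter drops. Propagating it into 0010 yields 1110, which is neither sat nor unsat. Propagating it into 1000 sets `subsumed`, because every bit of the cell is already contained in the unit.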
18. n-Literal Matrix Representation
- We encode combinations of n-clauses; each n-clause corresponds to a bit.
- It can encode every Boolean function of n variables.
- We need 2^n bits.
- The 1-literal and the 2-literal matrix have the same size.
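A short worked count may make the size claim concrete. It assumes that the V variables of a formula are split into V/n column groups of n variables each, and that every group costs one cell per clause row. The grouping and the symbol V are assumptions introduced here; the slides state only the result.

```latex
\[
\text{bits per clause row} \;=\; \frac{V}{n}\cdot 2^{n},
\qquad
n = 1:\ \frac{V}{1}\cdot 2 = 2V,
\qquad
n = 2:\ \frac{V}{2}\cdot 4 = 2V,
\qquad
n = 3:\ \frac{V}{3}\cdot 8 = \tfrac{8}{3}\,V .
\]
```

Under this assumption the 1-literal and 2-literal matrices both use 2V bits per row, and the cost per variable only starts to grow from n = 3 onwards.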
19. 1-Literal vs. 2-Literal Matrix
- 1-Literal Matrix
- Advantages
- Easy to implement
- Unit propagation results either in a sat clause or an unsat literal
- Disadvantages
- Wasteful: with 4 bits we store only 9 different pieces of information
20. 1-Literal vs. 2-Literal Matrix
- 2-Literal Matrix
- Advantages
- Economical: with 4 bits we store 15 different pieces of information
- One can propagate more (1110) or less (1000) information at once than a normal unit (1100)
- Disadvantages
- Unit propagation by a 2-literal does not necessarily result in a sat clause or an unsat literal
21. Standard CNF Representation
- Adjacency list representation
- Each clause contains
- A list of literals
- Counters of sat, unsat and unassigned (free) literals
- Each variable x keeps a list with all clauses with literals on x
- Number of references kept in variables equals total number of literals, L
- Used in some SAT solvers
- GRASP
- rel-sat (some versions)
- POSIT
- etc.
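The counter-based adjacency list scheme can be sketched as follows. This is a minimal illustration and is not taken from GRASP, rel-sat or POSIT. The class name `CounterClauses`, its fields and its methods are hypothetical, literals are signed integers (+v / -v), and each variable is assumed to occur at most once per clause.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: adjacency lists with sat / unsat / free counters per clause. */
final class CounterClauses {
    final int[][] clauses;                    // literals as +v / -v
    final int[] satCount;
    final int[] unsatCount;
    final int[] freeCount;
    final List<List<Integer>> occurrences;    // occurrences.get(v): clauses containing v or NOT v

    CounterClauses(int numVars, int[][] clauses) {
        this.clauses = clauses;
        satCount = new int[clauses.length];
        unsatCount = new int[clauses.length];
        freeCount = new int[clauses.length];
        occurrences = new ArrayList<>();
        for (int v = 0; v <= numVars; v++) occurrences.add(new ArrayList<>());
        for (int c = 0; c < clauses.length; c++) {
            freeCount[c] = clauses[c].length;
            // One reference per literal, so the references add up to L in total.
            for (int lit : clauses[c]) occurrences.get(Math.abs(lit)).add(c);
        }
    }

    /** Assigns var := value and returns the clauses that are unit after the update. */
    List<Integer> assign(int var, boolean value) {
        List<Integer> newUnits = new ArrayList<>();
        for (int c : occurrences.get(var)) {
            for (int lit : clauses[c]) {
                if (Math.abs(lit) != var) continue;
                freeCount[c]--;
                if ((lit > 0) == value) satCount[c]++; else unsatCount[c]++;
            }
            if (satCount[c] == 0 && freeCount[c] == 1) newUnits.add(c);
        }
        return newUnits;
    }

    /** Undoes assign(var, value): every touched counter is restored on backtracking. */
    void unassign(int var, boolean value) {
        for (int c : occurrences.get(var)) {
            for (int lit : clauses[c]) {
                if (Math.abs(lit) != var) continue;
                freeCount[c]++;
                if ((lit > 0) == value) satCount[c]--; else unsatCount[c]--;
            }
        }
    }

    boolean isUnsat(int c) {
        return satCount[c] == 0 && freeCount[c] == 0;
    }
}
```

Every assignment and every backtrack step visits all clauses in which the variable occurs. That per-occurrence cost is what the lazy data structures on the following slides try to avoid.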
22. Lazy Data Structures
- Head/Tail Lists
- Each clause contains a list of literals
- Each unresolved clause is only referenced in two unassigned variables (but possibly in several assigned variables)
- Each time a variable is assigned, referenced clauses either become unit, sat, unsat or a new reference becomes associated with another of the clause's unassigned variables
- Unit and unsat clauses can then be identified in constant time
- Clause can be declared unit/unsat by inspection of two references
- When backtracking, previous references are recovered
- Knowledge of the order of literal assignments is maintained and it is essential
23. Examples of Lazy Structures
[Figure: clause literals and the literal references kept in variables. Legend: unassigned literal; satisfied literal; unsatisfied literal; @d marks a literal assigned at search decision depth d (the figure shows @1, @2, @3 and @4).]
- Largest number of literal references in variables: L
- Smallest number of literal references in variables: 2C
24. Head/Tail Lists
25. Lazy Data Structures
- Watched Literals
- Each unresolved clause is only referenced in two unassigned variables (and not in any assigned variables)
- Each time a variable is assigned, referenced clauses either become unit, sat, unsat or, of the two clause references, one becomes associated with another of the clause's unassigned variables
- Unit and unsat clauses can only be identified in linear time
- Must visit all literals to confirm that clause is unit or unsat
- When backtracking, do nothing
- Knowledge of the order of literal assignments in clause is not (and cannot be) maintained
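The watched-literal bullets above can be sketched as follows. This is a simplified illustration in the spirit of the scheme, not Chaff's actual code. The class name `WatchedLiterals` and all member names are hypothetical, literals are signed integers (+v / -v), and every clause is assumed to have at least two literals, with its two watched literals kept in positions 0 and 1.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Sketch only: two watched literals per clause, kept in positions 0 and 1. */
final class WatchedLiterals {
    final int[][] clauses;                // assumed: every clause has at least two literals
    final int[] value;                    // per variable: 0 = unassigned, +1 = true, -1 = false
    final List<List<Integer>> watchers;   // watchers.get(index(lit)): clauses watching lit

    WatchedLiterals(int numVars, int[][] clauses) {
        this.clauses = clauses;
        value = new int[numVars + 1];
        watchers = new ArrayList<>();
        for (int i = 0; i < 2 * numVars + 2; i++) watchers.add(new ArrayList<>());
        for (int c = 0; c < clauses.length; c++) {
            watchers.get(index(clauses[c][0])).add(c);
            watchers.get(index(clauses[c][1])).add(c);
        }
    }

    static int index(int lit) { return lit > 0 ? 2 * lit : -2 * lit + 1; }

    int valueOf(int lit) { return lit > 0 ? value[lit] : -value[-lit]; }

    /** Makes lit true and propagates all implied units; returns false on a conflict. */
    boolean assign(int lit) {
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(lit);
        while (!queue.isEmpty()) {
            int p = queue.poll();
            if (valueOf(p) == -1) return false;          // contradicts an earlier assignment
            if (valueOf(p) == 1) continue;
            value[Math.abs(p)] = p > 0 ? 1 : -1;
            List<Integer> watching = watchers.get(index(-p));   // clauses watching the false literal
            List<Integer> kept = new ArrayList<>();
            for (int i = 0; i < watching.size(); i++) {
                int c = watching.get(i);
                int[] cl = clauses[c];
                if (cl[0] == -p) { cl[0] = cl[1]; cl[1] = -p; }  // keep the false literal at position 1
                if (valueOf(cl[0]) == 1) { kept.add(c); continue; }
                boolean moved = false;
                for (int k = 2; k < cl.length; k++) {    // linear scan for a replacement watch
                    if (valueOf(cl[k]) != -1) {
                        cl[1] = cl[k];
                        cl[k] = -p;
                        watchers.get(index(cl[1])).add(c);
                        moved = true;
                        break;
                    }
                }
                if (moved) continue;
                kept.add(c);                             // no replacement: clause is unit or unsat
                if (valueOf(cl[0]) == -1) {
                    kept.addAll(watching.subList(i + 1, watching.size()));
                    watchers.set(index(-p), kept);
                    return false;
                }
                queue.add(cl[0]);
            }
            watchers.set(index(-p), kept);
        }
        return true;
    }

    /** Backtracking only clears the value; the watches themselves are left untouched. */
    void unassign(int var) { value[var] = 0; }
}
```

The inner `for` loop over `k` is the linear scan the slide refers to: the clause is known to be unit or unsat only after every remaining literal has been inspected. In return, `unassign` does not touch the watch lists at all.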
26. Watched Literals
27. HT vs. WL
- Head/Tail Lists
- Advantages
- Order relation between the two (H and T) references
- More efficient identification of unit and unsat clauses
- When one reference attempts to visit the other, clause is either unit or unsat
- Better accuracy in characterizing the dynamic size of clauses
- Disadvantages
- Larger overhead during backtracking
- Worst-case number of references for each clause equals number of literals
- Total (worst-case): L
- Similar to adjacency lists in the worst-case
28. HT vs. WL
- Watched Literals (WL)
- Advantages
- Smaller overhead
- Constant number (2) of references for each clause
- Total (worst-case): 2C
- Twice the number of clauses, and C << L
- Disadvantages
- Lack of order relation between the two (W) references
- Identification of new unit or unsat clauses is always linear in clause size
- Worse accuracy in characterizing the dynamic size of clauses
29. Matrix vs. Lazy Data Structures
- Matrix data structures
- Each clause is an array of bits
- Lazy data structures
- Each clause is a list of literals
- Matrix data structures
- Advantages
- Can identify not only unit clauses but also binary and ternary ones
- Disadvantages
- It also needs space for literals that do not occur in the clause
- Unit propagation takes time proportional to C, the number of clauses
- Backtracking takes time proportional to C
30. Matrix vs. Lazy Data Structures
- Lazy data structures
- Advantages
- Unit propagation is a P + N time method
- P + N < C
- Disadvantages
- We don't know the size of the clause, so we can identify only unit clauses