Title: Parallel Algorithms for general Galois lattices building
1Parallel Algorithms for general Galois lattices building
- Fatma BAKLOUTI, Gérard LEVY
- CERIA
- fatma.baklouti_at_dauphine.fr, gerardlevy_at_dauphine.fr
Workshop WAS 2003
2Plan
- Knowledge Discovery in Databases (KDD)
- One tool for data mining: Galois Lattices
- Problems and solutions
- Row-sharing
- Column-sharing
- Conclusion and perspectives
3Knowledge Discovery in Databases (KDD)
- Knowledge Discovery in Databases (KDD) or Data Mining (DM)
- Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information (knowledge) or patterns from data in large databases or other information repositories
- Fayyad et al., 1996
4- Factors behind the emergence of DM
- Large database volumes: from gigabytes to terabytes
- Customer relations
- Example
- Market-basket analysis in mass retail
- Which groups or sets of products are frequently bought together by a customer during one visit to a shop?
- Arrangement of products on shelves.
- Example: milk and bread
- When a customer buys milk, does he buy bread too?
5- Various applications
- Medicine, finance, retail, telecommunications
- Fields of research around KDD
- Databases, statistics, human-computer interaction (IHM), learning, information science, etc.
6KDD General Process
- Data sources: text, picture, sound, data
- Data acquisition
- Data preparation
- Selection, cleaning, integration
- Transformations, editing, construction of attributes
- Model, concept
7- Books
- Data Mining, Han and Kamber (Morgan Kaufmann Pubs, 2001)
- Mastering Data Mining, Berry and Linoff (Wiley Computer Publishing, 2000)
- Interesting sites
- http://www.kddnuggets.com
- http://www.crisp-dm.org CRoss-Industry Standard Process for Data Mining, a standardization effort
8Galois Lattices
- Using Galois lattices (a mathematical structure) to solve Data Mining problems.
- References
- Birkhoff's Lattice Theory 1940, 1973
- Barbut and Monjardet 1970
- Wille 1982
- Chein, Norris, Ganter, Bordat,
- Diday, Duquenne,
- Emilion, Lévy, Diday, Lambert
- Basic Concepts
- Context, Galois connection, Concept.
9Galois Lattices - Definition
- Context $(O, A, I)$
- $O$: finite set of examples
- $A$: finite set of attributes
- $I$: binary relation between $O$ and $A$ ($I \subseteq O \times A$)
- Example
[Table: example context, one row per element of $O$; not transcribed]
10Galois Lattices - Definition
- Galois connection
- For $O_i \subseteq O$ and $A_i \subseteq A$, we define $f$ and $g$ as follows:
- $f : P(O) \to P(A)$, $f(O_i) = \{ a \in A \mid (o,a) \in I, \ \forall o \in O_i \}$ (intention)
- $g : P(A) \to P(O)$, $g(A_i) = \{ o \in O \mid (o,a) \in I, \ \forall a \in A_i \}$ (extension)
- $f$ and $g$ are decreasing maps
- $h = g \circ f$ and $k = f \circ g$ are
- Increasing: $O_1 \subseteq O_2 \Rightarrow h(O_1) \subseteq h(O_2)$
- Extensive: $O_1 \subseteq h(O_1)$
- Idempotent: $h(O_1) = h \circ h(O_1)$
- $h$ and $k$ are closure operators.
- $(f, g)$ is a Galois connection between $P(O)$ and $P(A)$
11Galois Lattices - Definition
- Galois connection: Example
- $O_1 = \{6,7\} \Rightarrow f(O_1) = \{a,c\}$ (intention)
- $A_1 = \{a,c\} \Rightarrow g(A_1) = \{1,2,3,4,6,7\}$ (extension)
- Remark: $h(O_1) = g \circ f(O_1) = g(A_1) \neq O_1$
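The maps $f$, $g$ and the closure $h = g \circ f$ can be sketched in Python as follows. The context table is not part of this transcript, so the relation `I` below is hypothetical: it is chosen only so that it reproduces the two values quoted in the example, and the attributes `b` and `d` as well as the function names are assumptions made for illustration.

```python
# Hypothetical context (object -> set of attributes); NOT the table from the slide.
I = {
    1: {"a", "c"},
    2: {"a", "b", "c"},
    3: {"a", "c", "d"},
    4: {"a", "b", "c"},
    5: {"b", "d"},
    6: {"a", "c"},
    7: {"a", "c", "d"},
}
O = set(I)                       # set of examples
A = set().union(*I.values())     # set of attributes

def f(objects):
    """Intention: attributes shared by every object of the given set."""
    return {a for a in A if all(a in I[o] for o in objects)}

def g(attributes):
    """Extension: objects that possess every attribute of the given set."""
    return {o for o in O if attributes <= I[o]}

def h(objects):
    """Closure operator on P(O): h = g o f."""
    return g(f(objects))

O1 = {6, 7}
print(sorted(f(O1)))   # intention of O1
print(sorted(h(O1)))   # closure of O1, which strictly contains O1 here
```

With this relation, `h(O1)` differs from `O1`, which is the situation described in the remark.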
12Galois Lattices - Definition
- Concept
- For $O_i \subseteq O$ and $A_i \subseteq A$,
- $(O_i, A_i)$ is a concept iff $O_i$ is the extension of $A_i$ and $A_i$ is the intention of $O_i$
- $O_i = g(A_i)$ and $A_i = f(O_i)$
- $L = \{ (O_i, A_i) \in P(O) \times P(A) \mid O_i = g(A_i) \text{ and } A_i = f(O_i) \}$: the set of concepts.
- $L$ is ordered by the relation $\leq$
- $(O_1, A_1) \leq (O_2, A_2)$ iff $O_1 \subseteq O_2$ (or $A_2 \subseteq A_1$).
- Galois Lattice
- $T = (L, \leq)$: the ordered set of concepts.
13Galois Lattices - Definition
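- Concept: Example
- $O_1 = \{6,7\} \Rightarrow f(O_1) = \{a,c\}$
- $A_1 = \{a,c\} \Rightarrow g(A_1) = \{1,2,3,4,6,7\}$
- Remark: $h(O_1) = g \circ f(O_1) = g(A_1) \neq O_1$
- $(\{6,7\}, \{a,c\}) \notin L$
- $(\{1,2,3,4,6,7\}, \{a,c\}) \in L$
- Because
- $h(\{1,2,3,4,6,7\}) = g \circ f(\{1,2,3,4,6,7\}) = g(\{a,c\}) = \{1,2,3,4,6,7\}$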
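To make the definition of the concept set $L$ concrete, here is a minimal Python sketch that enumerates every concept of a small context by closing each subset of objects. The relation `I` is a hypothetical stand-in (the slide's table is not in the transcript), chosen so that the two pairs discussed in the example behave as stated. The brute-force enumeration is exponential and is only meant as an illustration, not as one of the lattice-building algorithms cited in the references.

```python
from itertools import combinations

# Hypothetical context (object -> set of attributes); NOT the table from the slide.
I = {
    1: {"a", "c"}, 2: {"a", "b", "c"}, 3: {"a", "c", "d"}, 4: {"a", "b", "c"},
    5: {"b", "d"}, 6: {"a", "c"}, 7: {"a", "c", "d"},
}
O = sorted(I)
A = sorted(set().union(*I.values()))

def f(objects):
    """Intention of a set of objects."""
    return frozenset(a for a in A if all(a in I[o] for o in objects))

def g(attributes):
    """Extension of a set of attributes."""
    return frozenset(o for o in O if attributes <= I[o])

def concepts():
    """All pairs (Oi, Ai) with Oi = g(Ai) and Ai = f(Oi)."""
    found = set()
    for r in range(len(O) + 1):
        for subset in combinations(O, r):
            Ai = f(subset)           # intention of the subset
            found.add((g(Ai), Ai))   # (closure of the subset, its intention)
    return found

def leq(c1, c2):
    """(O1, A1) <= (O2, A2) iff O1 is included in O2."""
    return c1[0] <= c2[0]

L = concepts()
for extent, intent in sorted(L, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

On this hypothetical context the enumeration yields 8 concepts, and `leq` agrees with the dual inclusion of intentions, as the "(or $A_2 \subseteq A_1$)" part of the definition requires.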
15Generalized Galois Lattices
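- Context $\langle I, F, d \rangle$
- $T = \langle F, \vee, \wedge, \leq \rangle$
- $T_j = \langle F_j, \vee_j, \wedge_j, \leq_j \rangle$ for all $j \in J$, $J = \{1, \dots, n\}$
- $d : I \to F$
- $d(i) = (d_{i1}, \dots, d_{ij}, \dots, d_{in})$: description of the individual $i$ relative to the attributes $j$ of $J$.
[Table: individuals $1, \dots, i, \dots, k$ of $I$ in rows, attributes $1, 2, \dots, j, \dots, n$ in columns]
- $\forall x \subseteq I$
- $f(x) = \bigwedge \{ d(i) \mid i \in x \}$ (intention)
- $\forall z \in F$
- $g(z) = \{ i \in I \mid z \leq d(i) \}$ (extension)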
16General Galois Lattice - Example
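- $F = F_1 \times F_2 \times F_3$
- $F_1$, Size: short, medium, tall ($1 < 2 < 3$)
- $F_2$, Weight: thin, fat ($0 < 1$)
- $F_3$, Age: child, adolescent, adult ($1 < 2 < 3$)
[Table: individuals $I$ in rows, columns $F_1$, $F_2$, $F_3$; not transcribed]
- $f(\{\text{Cedric}, \text{Carine}\}) = (1, 1, 2)$
- $g((1, 1, 2)) = \{\text{Cedric}, \text{Carine}\}$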
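The generalized maps can be sketched in Python for a product of chains, where the meet is a componentwise minimum and the order is componentwise. The table of individuals is not in the transcript, so the descriptions below are hypothetical values picked to reproduce the result quoted for Cedric and Carine; `P1` and `P2` are invented individuals, and taking the greatest element of $F$ as the intention of the empty set is a convention assumed here.

```python
# Scales from the slide: size 1<2<3, weight 0<1, age 1<2<3.
TOP = (3, 1, 3)  # greatest element of F = F1 x F2 x F3 (assumed intention of the empty set)

# Hypothetical descriptions d(i) = (size, weight, age); NOT the slide's table.
d = {
    "Cedric": (1, 1, 2),
    "Carine": (3, 1, 2),
    "P1": (2, 0, 1),   # invented individual
    "P2": (2, 0, 3),   # invented individual
}

def meet(z1, z2):
    """Infimum in the product of chains: componentwise minimum."""
    return tuple(min(a, b) for a, b in zip(z1, z2))

def leq(z1, z2):
    """Product order: z1 <= z2 on every component."""
    return all(a <= b for a, b in zip(z1, z2))

def f(x):
    """Intention: meet of the descriptions of the individuals in x."""
    z = TOP
    for i in x:
        z = meet(z, d[i])
    return z

def g(z):
    """Extension: individuals whose description dominates z."""
    return {i for i in d if leq(z, d[i])}

z = f({"Cedric", "Carine"})
print(z, sorted(g(z)))
```

Because `g(f(x))` returns exactly `{"Cedric", "Carine"}` here, the pair formed by these two individuals and $(1, 1, 2)$ is closed in this hypothetical table.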
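17Nodes of the example lattice, written as $(X, z)$
- $(\emptyset, (3,1,3))$
- $(\{4\}, (3,1,2))$
- $(\{3\}, (2,0,3))$
- $(\{3,4\}, (2,0,2))$
- $(\{2,4\}, (1,1,2))$
- $(\{1,3,4\}, (2,0,1))$
- $(\{2,3,4\}, (1,0,2))$
- $(\{1,2,3,4\}, (1,0,1))$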
18Problems
- Large data volume
- Partition data on different server nodes
- Process in parallel locally
- Group results on one (client) node
- Post-process
- Our tool
- SDDS (Scalable Distributed Data Structures)
19Solutions
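- Column-sharing
- Row-sharing
[Figure: a context C with columns 1 2 3, split either by columns (columns 1 2 and column 3) or by rows (each part keeping columns 1 2 3); sub-contexts labelled C1, C2, C3, C4]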
20Row-sharing
- M1: C1, T1 = GL(C1)
- M2: C2, T2 = GL(C2)
- M: T = GL(C)
21Example
[Tables: context C and its two row blocks C1 and C2, with T1 = GL(C1) and T2 = GL(C2)]
Is T = GL(C) equal to the horizontal product of the lattices T1 = GL(C1) and T2 = GL(C2)?
22We apply an algorithm (here Bordat's algorithm) to the contexts C1 and C2 to build the lattices T1 = GL(C1) and T2 = GL(C2), respectively.
[Figure: graph of lattice T1 = GL(C1)]
[Figure: graph of lattice T2 = GL(C2)]
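23Total number of closed pairs $(X, z)$ of lattice T1 = GL(C1): 12.
- pair(1): $X = \emptyset$, $z = (2,3,3)$
- pair(2): $X = \{1\}$, $z = (1,0,2)$
- pair(3): $X = \{2\}$, $z = (2,1,0)$
- pair(4): $X = \{3\}$, $z = (0,3,1)$
- pair(5): $X = \{4\}$, $z = (1,1,1)$
- pair(6): $X = \{1,4\}$, $z = (1,0,1)$
- pair(7): $X = \{2,4\}$, $z = (1,1,0)$
- pair(8): $X = \{3,4\}$, $z = (0,1,1)$
- pair(9): $X = \{1,2,4\}$, $z = (1,0,0)$
- pair(10): $X = \{1,3,4\}$, $z = (0,0,1)$
- pair(11): $X = \{2,3,4\}$, $z = (0,1,0)$
- pair(12): $X = \{1,2,3,4\}$, $z = (0,0,0)$
Total number of closed pairs of T2 = GL(C2): 5.
- pair(1): $X = \emptyset$, $z = (2,3,3)$
- pair(2): $X = \{5\}$, $z = (0,1,3)$
- pair(3): $X = \{7\}$, $z = (2,0,0)$
- pair(4): $X = \{5,6\}$, $z = (0,0,2)$
- pair(5): $X = \{5,6,7\}$, $z = (0,0,0)$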
24 $X = X_1 \cup X_2$, $z = z_1 \wedge z_2$
Horizontal product of lattices T1 = GL(C1) and T2 = GL(C2)
25We apply Bordat's algorithm to the full context C.
[Figure: graph of lattice T = GL(C)]
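26Total number of closed pairs $(X, z)$ of T = GL(C): 15.
- pair(1): $X = \emptyset$, $z = (2,3,3)$
- pair(2): $X = \{1\}$, $z = (1,0,2)$
- pair(3): $X = \{2\}$, $z = (2,1,0)$
- pair(4): $X = \{3\}$, $z = (0,3,1)$
- pair(5): $X = \{4\}$, $z = (1,1,1)$
- pair(6): $X = \{5\}$, $z = (0,1,3)$
- pair(7): $X = \{1,4\}$, $z = (1,0,1)$
- pair(8): $X = \{1,5,6\}$, $z = (0,0,2)$
- pair(9): $X = \{2,4\}$, $z = (1,1,0)$
- pair(10): $X = \{2,7\}$, $z = (2,0,0)$
- pair(11): $X = \{3,4,5\}$, $z = (0,1,1)$
- pair(12): $X = \{1,2,4,7\}$, $z = (1,0,0)$
- pair(13): $X = \{1,3,4,5,6\}$, $z = (0,0,1)$
- pair(14): $X = \{2,3,4,5\}$, $z = (0,1,0)$
- pair(15): $X = \{1,2,3,4,5,6,7\}$, $z = (0,0,0)$
T = GL(C) is the horizontal product of the lattices T1 = GL(C1) and T2 = GL(C2).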
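The row-sharing result can be checked with a short Python sketch. Several things here are assumptions rather than the authors' code: the descriptions `D` are inferred from the closed pairs listed on the slides (the context tables themselves are not in the transcript), a brute-force closure of every subset stands in for Bordat's algorithm, and the way each candidate $(X_1 \cup X_2, z_1 \wedge z_2)$ is completed to a closed pair (by taking, in each local lattice, the largest extent whose intent dominates $z$) is one possible reading of the horizontal product.

```python
from itertools import combinations

# Descriptions d(i) inferred from the listed closed pairs (assumption).
D = {1: (1, 0, 2), 2: (2, 1, 0), 3: (0, 3, 1), 4: (1, 1, 1),
     5: (0, 1, 3), 6: (0, 0, 2), 7: (2, 0, 0)}
TOP = (2, 3, 3)  # greatest element of F, intent of the empty set

def meet(z1, z2):
    return tuple(min(a, b) for a, b in zip(z1, z2))

def leq(z1, z2):
    return all(a <= b for a, b in zip(z1, z2))

def galois_lattice(d):
    """Closed pairs (X, z) of context d, found by closing every subset of rows."""
    pairs = set()
    for r in range(len(d) + 1):
        for rows in combinations(sorted(d), r):
            z = TOP
            for i in rows:
                z = meet(z, d[i])
            X = frozenset(i for i in d if leq(z, d[i]))
            pairs.add((X, z))
    return pairs

def extent(T, z):
    """Largest extent of lattice T whose intent is >= z."""
    return max((X for X, zz in T if leq(z, zz)), key=len)

def horizontal_product(T1, T2):
    """Combine two row-block lattices using only their closed pairs."""
    result = set()
    for _, z1 in T1:
        for _, z2 in T2:
            z = meet(z1, z2)                                 # z = z1 ^ z2
            result.add((extent(T1, z) | extent(T2, z), z))   # X = X1 u X2
    return result

C1 = {i: D[i] for i in (1, 2, 3, 4)}   # rows held by the first node
C2 = {i: D[i] for i in (5, 6, 7)}      # rows held by the second node
T1, T2, T = galois_lattice(C1), galois_lattice(C2), galois_lattice(D)
print(len(T1), len(T2), len(T))          # 12 5 15
print(horizontal_product(T1, T2) == T)   # True
```

Note that `horizontal_product` reads only the two local lattices, never the raw rows, which matches the idea of grouping the results of the server nodes on one client node.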
27Column-sharing
- M1: C1, T1 = GL(C1)
- M2: C2, T2 = GL(C2)
- M: T = GL(C)
28Example
[Tables: context C and its two column blocks C1 and C2, with T1 = GL(C1) and T2 = GL(C2)]
Is T = GL(C) equal to the vertical product of the lattices T1 = GL(C1) and T2 = GL(C2)?
29[Figure: graph of lattice T1 = GL(C1)]
[Figure: graph of lattice T2 = GL(C2)]
30- Total number of closed pairs $(X, z)$ of lattice T1 = GL(C1): 8.
- pair(1): $X = \emptyset$, $z = (2,3)$
- pair(2): $X = \{2\}$, $z = (2,1)$
- pair(3): $X = \{3\}$, $z = (0,3)$
- pair(4): $X = \{2,4\}$, $z = (1,1)$
- pair(5): $X = \{2,7\}$, $z = (2,0)$
- pair(6): $X = \{2,3,4,5\}$, $z = (0,1)$
- pair(7): $X = \{1,2,4,7\}$, $z = (1,0)$
- pair(8): $X = \{1,2,3,4,5,6,7\}$, $z = (0,0)$
- Total number of closed pairs $(X, z)$ of lattice T2 = GL(C2): 4.
- pair(1): $X = \{5\}$, $z = (3)$
- pair(2): $X = \{1,5,6\}$, $z = (2)$
- pair(3): $X = \{1,3,4,5,6\}$, $z = (1)$
- pair(4): $X = \{1,2,3,4,5,6,7\}$, $z = (0)$
31 $X = X_1 \cap X_2$, $z = (z_1, z_2)$
T = GL(C) is the vertical product of the lattices T1 = GL(C1) and T2 = GL(C2).
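The column-sharing case can be sketched the same way in Python. As before, the descriptions `D` are inferred from the closed pairs on the slides, and brute-force closure stands in for Bordat's algorithm. Grouping the candidates $(X_1 \cap X_2, (z_1, z_2))$ by extent and keeping the componentwise largest intent is an assumption about how the vertical product is completed, since several candidates can share the same extent.

```python
from itertools import combinations

# Descriptions d(i) inferred from the listed closed pairs (assumption).
D = {1: (1, 0, 2), 2: (2, 1, 0), 3: (0, 3, 1), 4: (1, 1, 1),
     5: (0, 1, 3), 6: (0, 0, 2), 7: (2, 0, 0)}

def leq(z1, z2):
    return all(a <= b for a, b in zip(z1, z2))

def galois_lattice(d, top):
    """Closed pairs (X, z) of context d; top is the intent of the empty set."""
    pairs = set()
    for r in range(len(d) + 1):
        for rows in combinations(sorted(d), r):
            z = top
            for i in rows:
                z = tuple(min(a, b) for a, b in zip(z, d[i]))
            X = frozenset(i for i in d if leq(z, d[i]))
            pairs.add((X, z))
    return pairs

def vertical_product(T1, T2):
    """Combine two column-block lattices using only their closed pairs."""
    best = {}
    for X1, z1 in T1:
        for X2, z2 in T2:
            X = X1 & X2          # X = X1 n X2
            z = z1 + z2          # z = (z1, z2), tuple concatenation
            old = best.get(X)
            # Keep the largest intent seen for this extent.
            best[X] = z if old is None else tuple(max(a, b) for a, b in zip(old, z))
    return set(best.items())

C1 = {i: z[:2] for i, z in D.items()}   # columns 1 and 2
C2 = {i: z[2:] for i, z in D.items()}   # column 3
T1 = galois_lattice(C1, (2, 3))
T2 = galois_lattice(C2, (3,))
T = galois_lattice(D, (2, 3, 3))
print(len(T1), len(T2), len(T))        # 8 4 15
print(vertical_product(T1, T2) == T)   # True
```

For instance, the pairs $(\{2,3,4,5\}, (0,1))$ of T1 and $(\{1,3,4,5,6\}, (1))$ of T2 combine into $(\{3,4,5\}, (0,1,1))$, one of the 15 closed pairs of the full context.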
32Conclusion and perspectives
- Generalized Galois Lattices.
- The problem of large databases can perhaps be solved with our approach.
- Splitting the context into two sub-contexts.
- Possibility of building different architectures for networks of workstations.
33Thank you for your attention
- Fatma Baklouti, Gérard LEVY
- fbaklouti_at_excite.com, gerardlevy_at_dauphine.fr