Title: Oaktable
1Oaktable
- Jonathan Lewis and ORACLE_TRACE
- Oracle_Trace crashes my Database
- I start the SGA attach by searching every offset
- Anjo Kolk says James Morle wrote a program using x$ksmmem
- I show James my first draft using x$ksmmem
- James is baffled by why I'm hard-coding offsets
- James says the offsets are in some X$ table
- I search, turn up a mail by Jonathan Lewis on x$kqfco
- Goldmine: all the offsets
- Thanks Mogens Nogard!
- Thanks to Tom Kyte's Decimal to Hex
3http://oraperf.sourceforge.net
4Direct Oracle SGA Memory Access
- Reading data directly from Oracle's shared memory segment using C code
- Tuesday, October 27, 2009
5SGA on UNIX
[Diagram labels: processes Snnn, SMON, Pnnn, Dnnn, PMON, CKPT, DBWR, ARCH, LGWR; SGA regions Redo Log Buffer, Shared Pool, Database Buffer Cache; Machine Memory; oracle; sqlplus]
6SGA on NT
[Diagram labels: Dnnn, Snnn, Pnnn, CKPT, SMON; Machine Memory; SGA regions Redo Log Buffer, Shared Pool, Database Buffer Cache; oracle Process Space; sqlplus]
7What is the SGA
- Memory Cache
- Often Used Data
- Rapid Access
- Shareable
- Concurrent Access
8SGA 4 main regions
- Fixed information
- Users info
- Database statistics
- X$dual
- etc
- Data block cache
- SQL cache ( library cache/shared pool)
- Redo log buffer
9How is the SGA info Used?
- Automatically
- data blocks cached
- Log buffer
- Sql cache
- Updates of system and user statistics
- User Queries
- User info: v$session
- System info: v$parameter
- Performance statistics: v$sysstat, v$latch, v$system_event
- Buffer cache headers: x$bh
10Why Direct Access with C?
- Reading Hidden Information
- Sort info on version 7
- OPS locking info version 8
- Contents of data blocks (only the headers are visible in X$)
- Access while Database is Hung
- High Speed Access
- Sampling User Waits, catch ephemeral data
- Scan LRU chain in X$bh
- Statistically approximate statistics
- SQL statistics per user
- Low overhead
11Database Slow or Hung
- Often happens at the largest sites when cutting edge support is expected.
- Shared Pool errors ORA-04031
- Archiver or Log file Switch Hangs
- Hang Bugs
- Library Cache Latch contention
- ORA-00379 no free buffers available in buffer pool DEFAULT
12Statistical Sampling
By Rapidly Sampling SQL statistics and the users
who have the statistics open, one can see how
much work a particular user does with a
particular SQL statement
13Low Overhead
- Marketing Appeal
- Clients are sensitive about their production databases
- Heisenberg uncertainty effect: the less overhead, the less effect monitoring has on the performance we are monitoring
14SGA made visible through X$ tables
- Most of the SGA is not visible
- X$KSMMEM Exception, Raw Dump of SGA
- Information Externalized through X$ tables
- Useful or Necessary information is Externalized
- Externalized publicly through V$ Tables
15Machine Memory
0x80000000
16Graphic SGA
[Diagram: the SGA at 0x80000000, divided into Fixed Area, Buffer Cache, Shared Pool and Log Buffer]
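17Fixed Area
[Diagram: X$KSUSECST (user waits) at 0x85251EF4 inside the SGA, which starts at 0x80000000]
18X$KSUSECST
[Diagram: 170 records of 2328 bytes each; Row 1 starts at 0x85251EF4, followed by Row 2, Row 3, ...]
19X$KSUSECST Record
[Diagram: one record in X$KSUSECST, with offset 1276 marked]
20X$KSUSECST Fields
Offset  Field
1276    Seq
1278    Event
1280    p1
1284    p2
1288    p3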
21Externalization of C structs: X$ tables
If structure foo were externalized in an X$ table:
SQL> describe x$foo
Column Name                    Type
------------------------------ --------
ADDR                           RAW(8)
INDX                           NUMBER
ID                             NUMBER
B                              NUMBER
22SGA is One Large C Struct
- struct foo
- {
- int id;
- int A;
- int B;
- int C;
- };
- struct foo foo[N];
23Struct C code
- #include <stdio.h>
- #include <string.h> /* memset */
- #include <fcntl.h>
- #include <unistd.h> /* write */
- #define N 20
- /* structure definition */
- struct foo
- {
- int id;
- int a;
- int b;
- int c;
- };
- /* end structure definition */
24Struct Record
- int main()
- {
- struct foo foo[20];
- int fptr;
- /* zero out memory of struct */
- memset(foo, 0, sizeof(foo));
- foo[0].id = 1; /* row 0 */
- foo[0].a = 12;
- foo[0].b = 13;
- foo[0].c = 13;
25Struct Write to File
- foo[1].id = 2; /* row 1 */
- foo[1].a = 22;
- foo[1].b = 23;
- foo[1].c = 24;
- /* write to file, simulate SGA */
- if ((fptr = open("foo.out", O_WRONLY | O_CREAT, 0777)) < 0)
- return -1;
- write(fptr, foo, sizeof(foo));
- return 0;
- }
26Simulate SGA with a File
write(fp,foo,sizeof(foo))
27Simulate SGA with a File
[Diagram: Row 0 and Row 1 laid out in the file as consecutive ID, A, B, C fields, shown as bits, bytes, hex bytes and octal bytes]
28Struct File Contents
- ./foo
- ls -l foo.out
- -rw-r--r-- joe dba 320 Feb 10 19:41 foo.out
- int = 32 bits
- int = 4 bytes
- 20 entries * 4 ints * 4 bytes/int = 320 bytes
29od octal dump
- od -l foo.out
- 0000000 1 12 13 13
- 0000020 2 22 23 24
- 0000040 0 0 0 0
- *
- 0000500
30Struct File Contents
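Column 1 is the address, which od prints in octal (0000020 is 16 bytes, one row; 0000500 is 320 bytes, the file size). Column 2 is the ID, column 3 is field A, column 4 is field B, and column 5 is field C.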
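The file can be read back the same way an SGA reader walks the real structure: seek to row number times record size, plus the field offset. The following is a minimal sketch, assuming the `struct foo` layout from the preceding slides; the helper names `write_foo_file` and `read_foo_field` are illustrative and not part of the original program.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

#define N 20

struct foo { int id; int a; int b; int c; };

/* Write N records to path, filling rows 0 and 1 as the slides do.
   Returns 0 on success, -1 on failure. */
int write_foo_file(const char *path)
{
    struct foo foo[N];
    FILE *fp;

    memset(foo, 0, sizeof(foo));
    foo[0].id = 1; foo[0].a = 12; foo[0].b = 13; foo[0].c = 13;
    foo[1].id = 2; foo[1].a = 22; foo[1].b = 23; foo[1].c = 24;

    fp = fopen(path, "wb");
    if (fp == NULL)
        return -1;
    if (fwrite(foo, sizeof(foo), 1, fp) != 1) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Read one int field: position = row * record size + field offset.
   Returns 0 on success, -1 on failure. */
int read_foo_field(const char *path, long row, size_t offset, int *out)
{
    FILE *fp = fopen(path, "rb");
    int ok;

    if (fp == NULL)
        return -1;
    ok = fseek(fp, row * (long)sizeof(struct foo) + (long)offset, SEEK_SET) == 0
         && fread(out, sizeof(*out), 1, fp) == 1;
    fclose(fp);
    return ok ? 0 : -1;
}
```

Calling `read_foo_field("foo.out", 1, offsetof(struct foo, b), &v)` should leave 23 in `v`, matching the second line of the od output.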
31X$ tables?
- OK, x$foo = foo[20]
- How do I get a list of X$ tables?
- Where is each X$ table located?
- V$Fixed_Table
32V$Fixed_Table: list of X$ tables
SQL> desc v$fixed_table
Name                 Null?    Type
-------------------- -------- -----------------
NAME                          VARCHAR2(30)
OBJECT_ID                     NUMBER
TYPE                          VARCHAR2(5)
TABLE_NUM                     NUMBER
33Graphic X$ Addresses
SGA 0x80000000
0x8???????? X$????
34V$Fixed_Table
- spool addr.sql
- select
- 'select addr, '''||name||''' from '||name||' where rownum < 2;'
- from
- v$fixed_table
- where
- name like 'X$%'
- /
- spool off
- @addr.sql
35Example finding the address
- select
- a.addr,
- 'X$KSUSE'
- from
- X$KSUSE a
- where
- rownum < 2
36X$ layout
6802B244 X$KSLEMAP
6802B7EC X$KSLEI
6820B758 X$KSURU
6820B758 X$KSUSE - v$session
6820B758 X$KSUSECST - v$session_wait
6820B758 X$KSUSESTA - v$sesstat
6820B758 X$KSUSIO
6826FBD0 X$KSMDD
6831EA0C X$KSRCHDL
37What's in these X$ tables
- V$ views are documented
- V$ views are often based on X$ tables
- The map from V$ to X$ is described in
- V$Fixed_View_Definition
38V$Fixed_View_Definition
- SQL> desc V$Fixed_View_Definition
- Name Type
- ----------------------------------- --------------
- VIEW_NAME VARCHAR2(30)
- VIEW_DEFINITION VARCHAR2(4000)
39Definition of V$Session_Wait
- SQL> select
- VIEW_DEFINITION
- from
- V$FIXED_VIEW_DEFINITION
- where
- view_name = 'GV$SESSION_WAIT';
- VIEW_DEFINITION
- -----------------------------------------------------------------------
- select s.inst_id, s.indx, s.ksussseq, e.kslednam,
- e.ksledp1, s.ksussp1, s.ksussp1r,
- e.ksledp2, s.ksussp2, s.ksussp2r,
- e.ksledp3, s.ksussp3, s.ksussp3r,
- decode(s.ksusstim, 0, 0, -1, -1, -2, -2,
- decode(round(s.ksusstim/10000), 0, -1, round(s.ksusstim/10000))),
- s.ksusewtm,
- decode(s.ksusstim, 0, 'WAITING', -2, 'WAITED UNKNOWN TIME',
- -1, 'WAITED SHORT TIME', 'WAITED KNOWN TIME')
- from x$ksusecst s, x$ksled e
- where bitand(s.ksspaflg,1) != 0 and bitand(s.ksuseflg,1) != 0
- and s.ksussseq != 0 and s.ksussopc = e.indx
40The Fields in X$ tables
- OK, I've picked an X$ table
- I've got the starting address
- Now, how do I get the fields?
41X$KQFTA
- Kernel Query Fixed_view Table
- INDX: use to find column information
- KQFTANAM: X$ table names
42X$KQFCO
- Kernel Query Fixed_view Column
- KQFCOTAB: join with X$KQFTA.INDX
- KQFCONAM: column name
- KQFCOOFF: offset from beginning of the row
- KQFCOSIZ: column size in bytes
43X$KSUSECST Fields
Address (offset)  Field  Bytes
1276              Seq    2
1278              Event  2
1280              p1     4
1284              p2     4
1288              p3     4
44SGA Contents in Summary
In summary, Oracle takes the C structure defining the SGA and maps it onto a shared memory segment. Oracle provides access to some of the SGA contents via X$ tables.
45 Procedure
- Choose a V$ view
- Find base X$ tables for the V$ view
- Map X$ fields to V$ fields
- Get address of X$ table in SGA
- Get the size of each record in X$ table
- Get the number of records in X$ table
- Get offsets for each desired field in X$ table
- Get the base address of SGA
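Once the table address, record size and field offset are known, locating a field is plain arithmetic. A small sketch of that calculation follows; the function name `field_address` is illustrative, and 64-bit integers are used only so the same code works for 32-bit and 64-bit SGA addresses.

```c
#include <stdint.h>

/* Address of one field:
   table start + row number * record size + offset of the field in the row. */
uint64_t field_address(uint64_t table_addr, uint32_t record_size,
                       uint32_t row, uint32_t field_offset)
{
    return table_addr + (uint64_t)row * record_size + field_offset;
}
```

With the numbers used later in this deck (table at 0x85251EF4, records of 2328 bytes, sequence field at offset 1276), `field_address(0x85251EF4, 2328, 0, 1276)` gives 0x852523F0 for the first row.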
461) V$SESSION_WAIT Example
- List of all users waiting
- Detailed information on the waits
- Data is ephemeral
- Useful in Bottleneck diagnostics
- High sampling rate candidate
- Event 10046 captures this info
- Good table for SGA sampling
47V$SESSION_WAIT Description
SQL> desc v$session_wait
Name              Type
----------------- --------------
SID               NUMBER
SEQ#              NUMBER
EVENT             VARCHAR2(64)
P1TEXT            VARCHAR2(64)
P1                NUMBER
P1RAW             RAW(4)
P2TEXT            VARCHAR2(64)
P2                NUMBER
P2RAW             RAW(4)
P3TEXT            VARCHAR2(64)
P3                NUMBER
P3RAW             RAW(4)
WAIT_TIME         NUMBER
SECONDS_IN_WAIT   NUMBER
STATE             VARCHAR2(19)
48V$SESSION_WAIT Short
SQL> desc v$session_wait
Name              Type
----------------- --------------
SID               NUMBER
SEQ#              NUMBER
EVENT             VARCHAR2(64)
P1                NUMBER
P2                NUMBER
P3                NUMBER
49V$FIXED_VIEW_DEFINITION
- Gives mappings of V$ views to X$ tables
- SQL> select
- VIEW_DEFINITION
- from
- V$FIXED_VIEW_DEFINITION
- where
- view_name = 'V$SESSION_WAIT';
50V$SESSION_WAIT View Definition
VIEW_DEFINITION
---------------------------------------------------------------------
select s.inst_id, s.indx, s.ksussseq, e.kslednam,
e.ksledp1, s.ksussp1, s.ksussp1r,
e.ksledp2, s.ksussp2, s.ksussp2r,
e.ksledp3, s.ksussp3, s.ksussp3r,
round(s.ksusstim / 10000), s.ksusewtm,
decode(s.ksusstim, 0, 'WAITING', -2, 'WAITED UNKNOWN TIME',
-1, 'WAITED SHORT TIME', 'WAITED KNOWN TIME')
from x$ksusecst s, x$ksled e
where bitand(s.ksspaflg,1) != 0 and bitand(s.ksuseflg,1) != 0
and s.ksussseq != 0 and s.ksussopc = e.indx
51View Definition Short
VIEW_DEFINITION
---------------------------------------------------------------------
select s.indx, s.ksussseq, e.kslednam, s.ksussp1, s.ksussp2, s.ksussp3
from x$ksusecst s, x$ksled e
where s.ksussopc = e.indx
522) V$SESSION_WAIT Based on X$KSUSECST
VIEW_DEFINITION
----------------------------------------------------
select indx, ksussseq, ksussopc, ksussp1, ksussp2, ksussp3
from x$ksusecst
53Equivalent SQL Statements
select indx, ksussseq, ksussopc, ksussp1, ksussp2, ksussp3 from x$ksusecst
select sid, seq#, event, p1, p2, p3 from v$session_wait
Note: x$ksusecst.ksussopc is the event number. x$ksled.kslednam is the list of event names, where x$ksled.indx = x$ksusecst.ksussopc
543) V$ to X$ Field Mapping
554) Get base SGA address for X$ table
Find the location of X$KSUSECST in the SGA
SQL> select addr from x$ksusecst where rownum < 2;
ADDR
--------
85251EF4
565) Find the Size of Each Record
- SQL> select
- ((to_dec(e.addr)-to_dec(s.addr))) row_size
- from
- (select addr from x$ksusecst where rownum < 2) s,
- (select max(addr) addr from x$ksusecst where rownum < 3) e;
- ROW_SIZE
- ----------------
- 2328
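The same subtraction can be done outside the database once two consecutive ADDR values are in hand. This is a sketch only: `hex_to_dec` plays the role of the `to_dec` SQL function used above, and both function names are illustrative.

```c
#include <stdlib.h>

/* Convert a hex ADDR string such as "85251EF4" to a number. */
unsigned long long hex_to_dec(const char *hex)
{
    return strtoull(hex, NULL, 16);
}

/* Record size = distance between the ADDR values of two consecutive rows. */
unsigned long long record_size(const char *first_addr, const char *second_addr)
{
    return hex_to_dec(second_addr) - hex_to_dec(first_addr);
}
```

For example, if the second row were reported at 8525280C (a value derived here for illustration by adding 2328 to 85251EF4, not taken from a real instance), `record_size("85251EF4", "8525280C")` would return 2328.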
576) Find the Number of Records in the structure
- SQL> select count(*) from x$ksusecst;
- COUNT(*)
- --------------
- 170
58Get Offsets for Each Desired Field in X$ table
- SQL> select c.kqfconam field_name,
- c.kqfcooff offset,
- c.kqfcosiz sz
- from
- x$kqfco c,
- x$kqfta t
- where
- t.indx = c.kqfcotab and
- t.kqftanam = 'X$KSUSECST'
- order by
- offset;
59X$KQFTA - X$ Table Names
- List of X$ tables
- INDX: use to find column information
- KQFTANAM: X$ table names
- To get column information join with X$KQFCO
- X$KQFTA.INDX = X$KQFCO.KQFCOTAB
60X$KQFCO - X$ Table Columns
- List of all the columns in X$ tables
- KQFCOTAB: join with X$KQFTA.INDX
- KQFCONAM: column name
- KQFCOOFF: offset from beginning of the row
- KQFCOSIZ: column size in bytes
61Field Offsets
FIELD_NAME                         OFFSET         SZ
------------------------------ ---------- ----------
ADDR                                    0          4
INDX                                    0          4
KSUSEWTM                                0          4
INST_ID                                 0          4
KSSPAFLG                                1          1
KSUSSSEQ                             1276          2
KSUSSOPC                             1278          2
KSUSSP1                              1280          4
KSUSSP1R                             1280          4
KSUSSP2                              1284          4
KSUSSP2R                             1284          4
KSUSSP3                              1288          4
KSUSSP3R                             1288          4
KSUSSTIM                             1292          4
KSUSENUM                             1300          2
KSUSEFLG                             1308          4
62What are all the fields at OFFSET 0?
- These are all calculated values and not stored explicitly in the SGA.
- ADDR: memory address
- INDX: record number, like rownum
- INST_ID: database instance ID
- KSUSEWTM: calculated field
63Unexposed Fields
- What happens between OFFSET 1 and 1276?
- Unexposed Fields
- Sometimes exposed elsewhere, in our case
- V$SESSION
- V$SESSTAT
64Fields at Same Address
- Why do some fields start at the same address?
- KSUSSP1
- KSUSSP1R
- Are at the same address
- Equivalent of
- V$SESSION_WAIT.P1
- V$SESSION_WAIT.P1RAW
- These are the same data, just exposed as
- Hex
- Decimal
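To make the point concrete, the sketch below formats one 4-byte value both ways. The function name `format_p1` is illustrative, and the eight-digit zero padding of the hex form is an assumption modelled on a RAW(4) column.

```c
#include <stdio.h>
#include <stddef.h>

/* Format one 4-byte value as decimal (like P1) and as hex (like P1RAW). */
void format_p1(unsigned int value, char *dec, size_t dec_len,
               char *raw, size_t raw_len)
{
    snprintf(dec, dec_len, "%u", value);
    snprintf(raw, raw_len, "%08X", value);
}
```

A value of 300 comes out as "300" and "0000012C": the same bytes in memory, only the presentation differs.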
657) Offsets of Fields
668) Get Base SGA Address
- SQL> select addr from x$ksmmem where rownum < 2;
- ADDR
- --------------
- 80000000
67Results X$KSUSECST
73Attaching to the SGA
- UNIX System Call shmat
- To attach to shared memory, UNIX has a system call
- void *shmat(int shmid,
- const void *shmaddr,
- int shmflg);
74ID and Address arguments to shmat
- The arguments are
- shmid: the shared memory identifier
- shmaddr: the starting address of the shared memory
- shmflg: flags
- The argument shmflg can be set to SHM_RDONLY. To avoid any possible data corruption, the SGA should only be attached read-only.
- The arguments shmid and shmaddr need to be set to Oracle's SGA id and address.
75Finding Oracle SGA's ID and Address
- Use ORADEBUG to find the SGA id
- SQL> oradebug setmypid
- Statement processed.
- SQL> oradebug ipc
- Information written to trace file.
76Finding Trace File
- SQL> show parameters user_dump
- NAME VALUE
- ----------------------- --------------------------------
- user_dump_dest /u02/app/oracle/admin/V901/udump
- SQL> exit
- cd /u02/app/oracle/admin/V901/udump
- ls -ltr | tail -1
- -rw-r----- usupport dba Aug 24 18:01 v901_ora_23179.trc
77Finding SHMID in Trace File
- vi v901_ora_23179.trc
- Total size 004456c Minimum Subarea size 00000000
- Area Subarea Shmid Stable Addr Actual Addr
- 0 0 34401 0080000000 0080000000
78Attaching to the SGA
-
- shmid: 34401
- shmaddr: 0x80000000
- shmflg: SHM_RDONLY
- The SGA attach call in C would be
- shmat(34401, (void *)0x80000000, SHM_RDONLY)
- This call needs to be executed as a UNIX user who has read permission to the Oracle SGA
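The attach call can be wrapped so that failure is reported as NULL rather than `(void *)-1`. This is a minimal sketch using only the standard System V calls `shmat` and `shmdt`; the wrapper names `attach_readonly` and `detach_segment` are illustrative, and passing NULL as the address (letting the kernel choose) is only suitable for experiments, since the Oracle SGA needs to be mapped at its own address.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Attach an existing shared memory segment read-only at the requested
   address (NULL lets the kernel choose). Returns NULL on failure. */
const void *attach_readonly(int shmid, const void *addr)
{
    void *p = shmat(shmid, addr, SHM_RDONLY);
    return (p == (void *)-1) ? NULL : p;
}

/* Detach a segment attached with attach_readonly(); returns 0 on success. */
int detach_segment(const void *p)
{
    return shmdt(p);
}
```

With the values from the trace file, the call would be `attach_readonly(34401, (void *)0x80000000)`.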
79C Code Headers
- #include <stdio.h>
- #include <stdlib.h> /* atoi, exit */
- #include <unistd.h> /* sleep */
- #include <sys/ipc.h>
- #include <sys/shm.h>
- #include <errno.h>
- #include "event.h"
- event.h is for translating the event #s into event names
80event.h
- spool event.h
- select 'char event[][100]={' from dual;
- select '"'||name||'",' from v$event_name;
- select ' "" };' from dual;
- spool off
81Define Base Addresses and Sizes
- /* SGA BASE ADDRESS */
- #define SGA_BASE 0x80000000
- /* START ADDR of KSUSECST (V$SESSION_WAIT) */
- #define KSUSECST_ADDR 0x85251EF4
- /* NUMBER of ROWS/RECORDS in KSUSECST */
- #define SESSIONS 150
- /* SIZE in BYTES of a ROW in KSUSECST */
- #define RECORD_SZ 2328
82Define Offsets to Fields
-
- #define KSUSSSEQ 1276 /* sequence # */
- #define KSUSSOPC 1278 /* event # */
- #define KSUSSP1R 1280 /* p1 */
- #define KSUSSP2R 1284 /* p2 */
- #define KSUSSP3R 1288 /* p3 */
83Set Up Variables
- int main(int argc, char **argv)
- {
- void *addr;
- int shmid;
- unsigned long shmaddr;
- void *current_addr;
- long p1r, p2r, p3r;
- unsigned int i, seq, tim, flg, evn;
84Attach to SGA
- /* ATTACH TO SGA */
- shmid = atoi(argv[1]);
- shmaddr = SGA_BASE;
- if (
- (void *)shmat(
- shmid,
- (void *)shmaddr,
- SHM_RDONLY)
- == (void *)-1 ) {
- printf("shmat error attaching to SGA\n");
- exit(1);
- }
85Set Up Sampling Loop
- /* LOOP OVER ALL SESSIONS until CANCEL */
- while (1) {
- /* set current address to beginning of Table */
- current_addr = (void *)KSUSECST_ADDR;
- sleep(1);
- printf("\033[H\033[J"); /* clear screen */
- /* print page heading */
- printf("%4s %8s %-20.20s %10s %10s %10s \n",
- "sid", "seq", "wait", "p1", "p2", "p3");
86Loop over all Sessions
- for ( i = 0; i < SESSIONS; i++ ) {
- seq = *(unsigned short *)((char *)current_addr + KSUSSSEQ);
- evn = *(short *)((char *)current_addr + KSUSSOPC);
- p1r = *(unsigned int *)((char *)current_addr + KSUSSP1R);
- p2r = *(unsigned int *)((char *)current_addr + KSUSSP2R);
- p3r = *(unsigned int *)((char *)current_addr + KSUSSP3R);
- if ( evn != 0 )
- printf("%4u %8u %-20.20s %10lX %10lX %10lX \n",
- i, seq, event[evn], p1r, p2r, p3r);
- /* advance to the next record */
- current_addr = (void *)((char *)current_addr + RECORD_SZ);
- }
- }
- }
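The decoding step of the loop can be tried without an Oracle instance by pointing it at an ordinary buffer laid out like the table. The sketch below assumes the offsets and record size shown on the earlier slides (they are specific to that version and platform); the `wait_sample` struct and the function names are illustrative, and fields are copied out with `memcpy` instead of pointer casts.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values from the slides; they differ by Oracle version and platform. */
#define RECORD_SZ 2328
#define KSUSSSEQ  1276
#define KSUSSOPC  1278
#define KSUSSP1R  1280
#define KSUSSP2R  1284
#define KSUSSP3R  1288

struct wait_sample {
    uint16_t seq;    /* sequence # */
    uint16_t event;  /* event # */
    uint32_t p1, p2, p3;
};

/* Decode the wait fields of row number `row`, where `table` points at
   the first record. */
struct wait_sample read_wait(const unsigned char *table, unsigned row)
{
    const unsigned char *rec = table + (size_t)row * RECORD_SZ;
    struct wait_sample w;

    memcpy(&w.seq,   rec + KSUSSSEQ, sizeof w.seq);
    memcpy(&w.event, rec + KSUSSOPC, sizeof w.event);
    memcpy(&w.p1,    rec + KSUSSP1R, sizeof w.p1);
    memcpy(&w.p2,    rec + KSUSSP2R, sizeof w.p2);
    memcpy(&w.p3,    rec + KSUSSP3R, sizeof w.p3);
    return w;
}

/* Count the rows whose event number is non-zero, as the loop above prints. */
unsigned count_waiting(const unsigned char *table, unsigned rows)
{
    unsigned row, n = 0;

    for (row = 0; row < rows; row++)
        if (read_wait(table, row).event != 0)
            n++;
    return n;
}
```

Against a live SGA, `table` would be `(const unsigned char *)KSUSECST_ADDR` after a successful attach.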
87Output
- sga_read_session_wait 34401
- sid seq wait p1 p2 p3
- 0 40582 pmon timer 12C 0 0
- 1 40452 rdbms ipc message 12C 0 0
- 2 43248 rdbms ipc message 12C 0 0
- 3 24706 rdbms ipc message 12C 0 0
- 4 736 smon timer 12C 0 0
- 5 88 rdbms ipc message 2BF20 0 0
- 8 178 SQL*Net message from 6265710 1 0
88Pitfalls
- Byte Swapping
- 32 bit vs 64 bit
- Multiple Shared Memory Segments
- Segmented Memory
- Addresses are "unsigned int"
- Misaligned Access
89Little Endian vs Big Endian
- Are the low-order bytes stored first, or the high-order bytes?
- A byte is 8 bits
- 00000000-11111111 bits, 0 - 255 dec, 0x0 - 0xFF hex
- Big Endian is "normal", highest byte first
- In ASCII, the word "byte" is stored as
- b = 62, y = 79, t = 74, e = 65
- echo 'byte' | od -x
- b y t e
- 62 79 74 65
- Little Endian, i.e. byte swapped (Linux, OSF, Sequent, ?)
- y b e t
- 79 62 65 74
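Rather than listing platforms, a program can check the byte order of the machine it runs on. A small sketch, with illustrative function names:

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int host_is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;

    memcpy(&first, &probe, 1);   /* look at the byte at the lowest address */
    return first == 1;
}

/* Swap the two bytes of a 16-bit value. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```

Applied to the example above, `swap16(0x6279)` returns 0x7962, which is the "b y" pair shown as "y b".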
90Byte Swap Example
- short = 2 bytes, i.e. 16 bits
- Goal: get the flag in the "second" byte
- #ifdef __linux
- uflg = (*(short *)((int)sga_address)) >> 8;
- #else
- uflg = *(short *)((int)sga_address);
- #endif
91Byte Swap
- Big Endian
- 00 00 00 00 00 00 00 01
- Little Endian
- 00 00 00 01 00 00 00 00
- Solution: push the value over 8 places to the right, i.e. >> 8
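If the flag is simply the byte at the higher address of the two-byte field, which is what the shift on little-endian hosts implies, it can also be read as a single byte with no #ifdef at all. The sketch below shows both routes; the function names are illustrative and the interpretation of "second byte" is an assumption drawn from the slide.

```c
#include <stdint.h>
#include <string.h>

/* Byte at the higher address of a 2-byte field, read directly. */
unsigned char second_byte(const void *field)
{
    return ((const unsigned char *)field)[1];
}

/* Same byte via the slide's approach: load 16 bits, then shift right
   by 8 on a little-endian host or keep the low 8 bits on a big-endian one. */
unsigned char second_byte_by_shift(const void *field)
{
    uint16_t v, probe = 1;
    unsigned char low;

    memcpy(&v, field, sizeof v);
    memcpy(&low, &probe, 1);      /* low == 1 on a little-endian host */
    return (unsigned char)(low ? (v >> 8) : (v & 0xFF));
}
```

Both functions return the same value on either byte order; the single-byte read avoids the question entirely.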
9264 bit vs 32 bit
- SQL> desc x$ksmmem
- Name Type
- ------------------------------------- ---------
- ADDR RAW(4)
- INDX NUMBER
- INST_ID NUMBER
- KSMMMVAL RAW(4)
- RAW(4) -> 32 bit
- RAW(8) -> 64 bit
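A reader program built for 32 bits cannot hold a RAW(8) address in a pointer, so it is worth checking the width up front. A rough sketch with illustrative helper names, assuming the address arrives as the hex string SQL*Plus prints:

```c
#include <stddef.h>
#include <string.h>

/* Bytes needed for an address printed as hex:
   8 hex digits -> 4 bytes (RAW(4)), 16 hex digits -> 8 bytes (RAW(8)). */
size_t raw_width(const char *hex_addr)
{
    return (strlen(hex_addr) + 1) / 2;
}

/* 1 if this program's pointers are wide enough for such an address. */
int pointer_fits(size_t raw_bytes)
{
    return sizeof(void *) >= raw_bytes;
}
```

For example, `pointer_fits(raw_width("80000000"))` is true for both 32-bit and 64-bit builds, while a 16-digit address only fits in a 64-bit build.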
93Segmented Memory
- x$ksuse can be discontiguous
- Workaround
- select 'int users[]={' from dual;
- select '0x'||addr||',' from x$ksuse;
- select '0x0};' from dual;
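With a generated address list, the reader indexes rows through the array instead of adding a fixed record size. The sketch below assumes a list terminated by 0, as the spooled file produces; it uses `uintptr_t` rather than `int` so the same code works on 64-bit builds, and the function names are illustrative.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of entries in a 0-terminated address list such as users[]. */
size_t count_rows(const uintptr_t *addrs)
{
    size_t n = 0;

    while (addrs[n] != 0)
        n++;
    return n;
}

/* Address of the field at `offset` in row `row`,
   or 0 if `row` is past the end of the list. */
uintptr_t row_field_addr(const uintptr_t *addrs, size_t row, size_t offset)
{
    return row < count_rows(addrs) ? addrs[row] + offset : 0;
}
```

Each row is then located independently, so gaps between rows no longer matter.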
94 Misaligned Access
- Some platforms seg fault when addressing misaligned bytes; reads need to start on even byte addresses or on multiples of 4 bytes, depending on the platform
[Diagram: eight bytes numbered 1 through 8]
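One common way to sidestep alignment faults is to copy the bytes into a properly aligned local variable instead of dereferencing a cast pointer. A minimal sketch, with illustrative helper names:

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from any byte address without a misaligned access. */
uint32_t load_u32(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof v);
    return v;
}

/* Read a 16-bit value from any byte address. */
uint16_t load_u16(const void *p)
{
    uint16_t v;

    memcpy(&v, p, sizeof v);
    return v;
}
```

These helpers work at odd addresses too, for example `load_u32(buf + 1)`, at the cost of a small copy that compilers usually turn into a plain load where that is safe.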
95x$ksusecst Record: What's Missing?
[Diagram: one record in X$KSUSECST; the area before offset 1276 is marked ???]
96Select Addr from X$? where Rownum < 2
6802B244 X$KSLEMAP
6802B7EC X$KSLEI
6820B758 X$KSURU
6820B758 X$KSUSE - v$session
6820B758 X$KSUSECST - v$session_wait
6820B758 X$KSUSESTA - v$sesstat
6820B758 X$KSUSIO
6826FBD0 X$KSMDD
6831EA0C X$KSRCHDL
97x$ksuse Record Contains x$ksusecst
[Diagram: one x$ksuse record (v$session) containing the x$ksusesta (v$sesstat) and x$ksusecst (v$session_wait) fields, with offsets 236 and 1276 marked]
98Getting v$sesstat addresses
- select '#define '||
- upper(translate(s.name,' -()/''','________'))||' '||
- to_char(c.kqfcooff + STATISTIC# * 4)
- from
- x$kqfco c,
- x$kqfta t,
- v$statname s
- where
- t.indx = c.kqfcotab
- and ( t.kqftanam = 'X$KSUSESTA' ) and c.kqfconam = 'KSUSESTV'
- and kqfcooff > 0
- order by
- c.kqfcooff
- /
99User Drilldown Query 4 joins
- select
- w.sid sid,
- w.seq# seq,
- w.event event,
- w.p1raw p1,
- w.p2raw p2,
- w.p3raw p3,
- w.SECONDS_IN_WAIT ctime,
- s.sql_hash_value sqlhash,
- s.prev_hash_value psqlhash,
- st.value cpu
- from
- v$session s,
- v$sesstat st,
- v$statname sn,
- v$session_wait w
- where
- w.sid = s.sid and
- st.sid = s.sid and
100Other Fun Stuff
- The next example is output from an SGA program that follows the LRU of the Buffer Cache
- The program demonstrates the
- insertion point of LRU
- cold end of LRU
- hot end of the LRU
- Full Table Scan Insertion Point
101LRU HOT
102LRU COLD