Title: MongoDB ??
1MongoDB ??
? ? ? (madvirus_at_madvirus.net)
2????
3NoSQL
4NoSQL is a movement promoting a loosely defined
class of non-relational data stores that break
with a long history of relational databases.
These data stores may not require fixed table
schemas, usually avoid join operations and
typically scale horizontally.
- WIKIPEDIA
5NoSQL
- ?? ??
- schema-free
- replication
- ??? API
- no relation / no join
- eventual consistency (not ACID)
- scalable / distributed
- NoSQL? ??
- Document store
- MongoDB, CouchDB, Jackrabbit, Lotus Notes
- Key/Value store
- Amazon SimpleDB, MemcacheDB,
- Column ??
- Hadoop/HBase, Cassandra, Hypertable,
- Graph DB
- Neo4j, HyperGraphDB,
6??? ?? MongoDB
- ???? ??
- MongoDB? ?? ?? ?? ?? ??
- TOC
- Document DB? MongoDB
- ? MongoDB?
- MongoDB ???
7Document DB MongoDB
8Document
9Document Self-Contained
10Document Schema Free
11Document DB
- ?? ???
- ?? ??
- <key, value> ???? ?? ??
- ??? JSON ?? ??
- No Join
- Schema Free
- No SQL
- ? ??, ?? ???, HA (replication), ?? ??
- ?? ??
- MongoDB, CouchDB, Riak, ...
12Document ??
{ ts : "20100415110001.001",
  user : { nickname : "madvirus", id : 12345 },
  site : { domain : "cc2.wemade.com", uri : "/main/main" } }
{ ts : "20100415110001.005",
  user : { nickname : "wemade", id : 1 },
  site : { domain : "cc2.wemade.com", uri : "/main/main" },
  referer : { domain : "www.wemade.com", uri : "/main/main" } }
13MongoDB
- humongous DB
- ?? ?? 1.4.0
- ??
- Document-oriented storage
- index ??
- geospatial ??? ??, multikey
- ?? ???? (no row lock, no table lock)
- Replication
- Sharding (alpha)
- MapReduce
- ???? ???? ?? ???? ??
- ??, ??? ?
- ?? ??, ??? ?? ?!
- ??? ???? ??
- ?? ?? C, C++, Java, Perl, PHP, Python, Ruby
- ???? ?? C#/.NET, Erlang, Go, Groovy/Scala/Clojure
14MongoDB ????
?? ???? ??
?? ??? ??????
??? ???, ????? ??? ????
?? ???? ??? ????
??? ???, ???, ??, ??? ????
?? ???? ???
??? ?? ????
15mongoDB ??
16MongoDB? ???? - Collection, Document
- Database
- Collection
- RDBMS? Table? ??
- Document? ??
- Document
- RDBMS? Row? ??
- Schema Free
- ???? ?? ? ?? ?? (_id)
- ???? ?? ?? mongoDB? ?? ??
17?? ? ??
- http://www.mongodb.org ?? ????
- ?? ??
- ??DB?? /data/db (c:\data\db), ?? 27017
- ?? ??
- mongo ?? ??? shutdownServer() ??
- Ctrl+C, kill -2 PID, kill -15 PID
- kill -9 PID ??? ??? ?? ? ??
$ mongod
$ mongod --dbpath /wemade/db --port 90912
$ mongod --fork --logpath /wemade/log/db/mongodb.log
$ mongo
> db.shutdownServer()
18??? ??? ??
- bin/mongo ?? ???? ??
- ?????? ??? ??
$ mongo
> use weblog
switched to db weblog
> db.pageview_minute.find()
{ "_id" : ObjectId("4bbc35128319000000001984"), "hour" : "201004071632", "pageviews" : 4000, "site" : "funpc.wemade.com" }
{ "_id" : ObjectId("4bbc355d8319000000001985"), "hour" : "201004071633", "pageviews" : 601, "site" : "funpc.wemade.com" }
{ "_id" : ObjectId("4bbc35688319000000001986"), "hour" : "201004071634", "pageviews" : 3194, "site" : "funpc.wemade.com" }
{ "_id" : ObjectId("4bbc35a48319000000001987"), "hour" : "201004071635", "pageviews" : 3210, "site" : "funpc.wemade.com" }
{ "_id" : ObjectId("4bbc35e08319000000001988"), "hour" : "201004071636", "pageviews" : 2142, "site" : "funpc.wemade.com" }
{ "_id" : ObjectId("4bbc361c8319000000001989"), "hour" : "201004071637", "pageviews" : 853, "site" : "funpc.wemade.com" }
{ "_id" : ObjectId("4bbc3740831900000000198a"), "hour" : "201004071641", "pageviews" : 1026, "site" : "funpc.wemade.com" }
{ "_id" : ObjectId("4bbc3748831900000000198b"), "hour" : "201004071642", "pageviews" : 7979, "site" : "funpc.wemade.com" }
{ "_id" : ObjectId("4bbc3784831900000000198c"), "hour" : "201004071643", "pageviews" : 995, "site" : "funpc.wemade.com" }
>
19????? ??? ??
Mongo mongo = new Mongo("localhost", 27017);
DB db = mongo.getDB("weblog");
DBCollection collection = db.getCollection("pageview_minute");

// Query: the document for this minute and site
BasicDBObject q = new BasicDBObject();
SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmm");
q.put("hour", format.format(time));
q.put("site", site);

// Update: { $inc : { pageviews : 1 } }
BasicDBObject o = new BasicDBObject();
BasicDBObject incVal = new BasicDBObject();
incVal.put("pageviews", new Integer(1));
o.put("$inc", incVal);

// update(query, update, upsert, multi)
collection.update(q, o, true, true);
20DB/??? ??/??
- Lazy Creation
- ?? ???? ??? ? DB/??? ???
- ???? DB/??? ?? ??
- ??? ??
- db.collName.drop()
21?? ?? ?? API
- ?? ??
- db.collName.save( { name : "mongo" } )
- ?? ??
- db.collName.find()
- db.collName.find( { name : "mongo" } )
- db.collName.find( {}, { name : 1, ssn : 1 } )
- db.collName.find( {} ).sort( { userid : 1 } )
- db.collName.find( {}, {}, 10, 20 )
- db.collName.count()
- ?? ??
- db.collName.update( { userid : "madvirus" }, { lastupts : val }, false )
- ?? ??
- db.collName.remove( {} )
- db.collName.remove( { userid : "madvirus" } )
- ??? ??
- db.collectionName.ensureIndex( { userid : 1, regts : 1 } )
- ??
- group(), min(), max(), $in, $where (or ??),
22? ????
23MongoDB? ?? ??
- Replication
- Sharding
- MapReduce
- Atomic ? ??
- Capped Collection
- Geospatial Index
24Replication
- ?? ? Replication ??
- High Availability (Failover)
- Read Throughput ??
- ??
Master-Slave
Replica Pair
Master
25Sharding
- ??? ??? ?? ??? ??? ??? ???? ???? ??
- MongoDB? ??? ??? ???? ??
[Figure: Client, mongos, config server, and four Shards, each holding Chunks]
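The routing role of mongos in the figure can be sketched in plain JavaScript. This is only an in-memory illustration, not MongoDB code: the chunk boundaries, the shard names, and the `routeToShard` helper are all assumptions made up for the example. It shows the general idea that each chunk covers a contiguous range of shard-key values and lives on one shard.

```javascript
// Hypothetical chunk table: each chunk owns the key range [min, max).
// null means "unbounded" on that side.
const chunks = [
  { min: null, max: "g", shard: "shard0" },
  { min: "g", max: "n", shard: "shard1" },
  { min: "n", max: "t", shard: "shard2" },
  { min: "t", max: null, shard: "shard3" },
];

// Find the shard whose chunk range contains the given shard-key value.
function routeToShard(key) {
  const chunk = chunks.find(
    (c) => (c.min === null || key >= c.min) && (c.max === null || key < c.max)
  );
  return chunk.shard;
}

console.log(routeToShard("madvirus"));
console.log(routeToShard("wemade"));
```

In this sketch a client never needs to know the chunk table; it asks the router, which is the part mongos plays together with the config server.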
26MapReduce ??
- MapReduce
- Google? ??
- ??? ??? ??? ?? ???? ?? ??
- 2?? ??
- Map?? / Reduce ??
- ??
- ??? ????? ??
- Collective Intelligence
- ? ?? ?? ??
- ??? ??/???
- ???? ??
27MapReduce ?? ??
?? http://www.slideshare.net/spirosd/mapreduce-distributed-computing-on-large-commodity-clusters
28MapReduce ?
{ tags : ["dog", "cat"] }  { tags : ["cat"] }  { tags : ["mouse", "cat", "dog"] }  { tags : [] }

map = function() {
  this.tags.forEach(
    function(tag) { emit( tag, 1 ); }
  );
}

dog : [1, 1]  cat : [1, 1, 1]  mouse : [1]

reduce = function( key, values ) {
  var total = 0;
  for ( var i = 0; i < values.length; i++ )
    total += values[i];
  return total;
}

{ "_id" : "cat", "value" : 3 }  { "_id" : "dog", "value" : 2 }  { "_id" : "mouse", "value" : 1 }
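The tag-count example can be replayed without a database. The sketch below is plain JavaScript, not the MongoDB API: `runMapReduce` is a hypothetical helper, and `emit` is passed to `map` as an argument here (in the mongo shell it is provided for you). The empty `tags` array for the fourth document is an assumption.

```javascript
// Minimal in-memory map/reduce driver (hypothetical helper, not MongoDB's).
function runMapReduce(docs, map, reduce) {
  const groups = new Map(); // key -> array of emitted values
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  // Map phase: call map once per document, with `this` bound to the document.
  for (const doc of docs) map.call(doc, emit);
  // Reduce phase: collapse each key's emitted values into a single value.
  const out = [];
  for (const [key, values] of groups) {
    out.push({ _id: key, value: reduce(key, values) });
  }
  // Sort by key so the output order is stable.
  return out.sort((a, b) => (a._id < b._id ? -1 : a._id > b._id ? 1 : 0));
}

const docs = [
  { tags: ["dog", "cat"] },
  { tags: ["cat"] },
  { tags: ["mouse", "cat", "dog"] },
  { tags: [] }, // assumed empty
];

const map = function (emit) {
  this.tags.forEach((tag) => emit(tag, 1));
};

const reduce = function (key, values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) total += values[i];
  return total;
};

const tagCounts = runMapReduce(docs, map, reduce);
console.log(tagCounts);
```

The intermediate `groups` map corresponds to the middle row of the slide (dog, cat, mouse with their lists of 1s).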
29MapReduce Sharding
- ? Shard? ?? ??? ?? ? ??
- ????? ??? ?? ??? ??
- Shard ? ???? ??? ?? ?? ?? ??/??? ??
- ?? ??? ??/??? ??
[Figure: Client, mongos, and four Shards. MapReduce runs on each Shard, and the per-Shard Reduce results are reduced again.]
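A rough sketch of that two-stage flow, in plain JavaScript with shards modeled as arrays. Nothing here is MongoDB code; `mapReduceOnShard` and `mergeShardResults` are hypothetical names, and the split of documents across two shards is made up for illustration.

```javascript
// Stage 1 (per shard): count tags for the documents held by one shard.
function mapReduceOnShard(docs, reduce) {
  const groups = {};
  for (const doc of docs) {
    for (const tag of doc.tags) {
      (groups[tag] = groups[tag] || []).push(1); // same as emit(tag, 1)
    }
  }
  const partial = {};
  for (const key of Object.keys(groups)) partial[key] = reduce(key, groups[key]);
  return partial;
}

// Stage 2 (router): feed the per-shard results through reduce once more.
function mergeShardResults(partials, reduce) {
  const groups = {};
  for (const partial of partials) {
    for (const key of Object.keys(partial)) {
      (groups[key] = groups[key] || []).push(partial[key]);
    }
  }
  const merged = {};
  for (const key of Object.keys(groups)) merged[key] = reduce(key, groups[key]);
  return merged;
}

const sum = (key, values) => values.reduce((a, b) => a + b, 0);

const shards = [
  [{ tags: ["dog", "cat"] }, { tags: ["cat"] }], // documents on the first shard
  [{ tags: ["mouse", "cat", "dog"] }],           // documents on the second shard
];

const partials = shards.map((docs) => mapReduceOnShard(docs, sum));
const merged = mergeShardResults(partials, sum);
console.log(partials, merged);
```

The sketch only works because `sum` can consume its own output: a reduce function used this way has to give the same answer whether it sees raw emitted values or already-reduced partial results.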
30Atomic ? ??
- upsert? inc? ??? Atomic ? ??
- ??? ?? ??? ??? ??
- ?? ? ???, 10?? PV, ?? ?? ?? ?
c.update( { hour : "20100415100001", site : "abc" }, { $inc : { pageviews : 1 } }, { upsert : true } )
c.update( { hour : "20100415100001", site : "abc" }, { $inc : { pageviews : 1 } }, { upsert : true } )
c.update( { hour : "20100415100001", site : "abc" }, { $inc : { pageviews : 1 } }, { upsert : true } )
c.update( { hour : "20100415100002", site : "abc" }, { $inc : { pageviews : 1 } }, { upsert : true } )

??? ??: { hour : "20100415100001", site : "abc", pageviews : 1 }
pageviews ? ??: { hour : "20100415100001", site : "abc", pageviews : 2 }
pageviews ? ??: { hour : "20100415100001", site : "abc", pageviews : 3 }
??? ??: { hour : "20100415100002", site : "abc", pageviews : 1 }
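The effect of the four updates can be traced with a small in-memory sketch. It is plain JavaScript over an array, not the MongoDB API, and `upsertInc` is a hypothetical helper. It shows only the upsert-plus-increment logic; it says nothing about atomicity, which the server provides and single-threaded JavaScript cannot demonstrate.

```javascript
// If a document matching `query` exists, increment `field`;
// otherwise insert a new document built from the query with field = amount.
function upsertInc(collection, query, field, amount) {
  const match = collection.find((doc) =>
    Object.keys(query).every((k) => doc[k] === query[k])
  );
  if (match) {
    match[field] = (match[field] || 0) + amount;
    return match;
  }
  const created = Object.assign({}, query, { [field]: amount });
  collection.push(created);
  return created;
}

const c = [];
upsertInc(c, { hour: "20100415100001", site: "abc" }, "pageviews", 1); // inserts
upsertInc(c, { hour: "20100415100001", site: "abc" }, "pageviews", 1); // increments
upsertInc(c, { hour: "20100415100001", site: "abc" }, "pageviews", 1); // increments
upsertInc(c, { hour: "20100415100002", site: "abc" }, "pageviews", 1); // inserts
console.log(c);
```

The caller never checks whether the counter document exists first, which is what makes this pattern convenient for per-minute page-view counters.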
31Capped Collection
- ?? ??? ?? ???
- ??
- db.createCollection( "mycoll", { capped : true, size : 1000, max : 100 } )
- ?? ??
- ??? ???? ??, ? ??? ?? ? ?? ??? ?? ??
- ?? ??? ??? ?? ??
- ??
- ?? ?? ???, ?? ???? ? ?? ?? ??
- ?? ?? ??
- 32bit ???? 10? ??? (? 950M) ?? ??
- 64bit ??? ??? ????
- ??
- ??
- LRU ??? ??? ??
- ??
- ??? ?? ?? ?????? ??? ?? ??
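The behavior of a capped collection can be sketched with a small fixed-size buffer. This is an in-memory illustration only: `CappedList` is a hypothetical class, and it models just the `max` document limit, not the byte-based `size` limit or any storage detail.

```javascript
// Keeps at most `max` documents; inserting past the limit drops the oldest.
class CappedList {
  constructor(max) {
    this.max = max;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.max) this.docs.shift(); // evict the oldest
  }

  // Documents come back in insertion order.
  find() {
    return this.docs.slice();
  }
}

const log = new CappedList(3);
for (let i = 1; i <= 5; i++) log.insert({ seq: i });
console.log(log.find());
```

Because old entries age out on their own, this shape suits logs and caches where only the most recent data matters.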
32Geospatial Index
33Geospatial Index
- ??? ??? ??? ??
- db.position.ensureIndex( { loc : "2d" }, { min : -500, max : 500 } )
- { loc : [30, 30] }, { loc : { x : 50, y : 30 } }
- ??
- ??? ?? ? ?? (?? ??)
- db.place.find( { loc : { $near : { x : 50, y : 30 } } } )
- db.place.find( { loc : { $near : { x : 50, y : 30 } } } ).limit(3)
- ?? ?? ??? ? ??
- db.place.find( { loc : { $within : { $box : [[0, 0], [10, 10]] } } } )
- db.place.find( { loc : { $within : { $center : [[50, 50], 20] } } } )
- ?? ???
- ??? ?? ?? ??
- db.place.ensureIndex( { loc : "2d", cat : 1 } )
- db.place.find( { loc : { $near : [10, 10] }, cat : "bank" } ).limit(10)
- ??
- ??(?) ?? ?? (? ?? ??? ??)
- ??? ???????? ??
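What the near and box queries return can be illustrated with brute-force functions over an array. This is a sketch in plain JavaScript, not how the 2d index works internally: `near` and `withinBox` are hypothetical helpers, the sample places are invented, distance is flat Euclidean, and box edges are treated as inclusive (an assumption).

```javascript
// Return the `limit` places closest to `point`, nearest first.
function near(places, point, limit) {
  const dist = (p) => Math.hypot(p.loc[0] - point[0], p.loc[1] - point[1]);
  return places.slice().sort((a, b) => dist(a) - dist(b)).slice(0, limit);
}

// Return the places inside the box [[minX, minY], [maxX, maxY]].
function withinBox(places, box) {
  const [[minX, minY], [maxX, maxY]] = box;
  return places.filter(
    (p) => p.loc[0] >= minX && p.loc[0] <= maxX && p.loc[1] >= minY && p.loc[1] <= maxY
  );
}

// Invented sample data.
const places = [
  { name: "A", cat: "bank", loc: [10, 10] },
  { name: "B", cat: "cafe", loc: [12, 9] },
  { name: "C", cat: "bank", loc: [50, 30] },
  { name: "D", cat: "bank", loc: [0, 0] },
];

const nearest = near(places, [10, 10], 2).map((p) => p.name);
const inBox = withinBox(places, [[0, 0], [10, 10]]).map((p) => p.name);
// Combined condition: nearest places that are also banks.
const banks = places.filter((p) => p.cat === "bank");
const nearestBanks = near(banks, [10, 10], 2).map((p) => p.name);
console.log(nearest, inBox, nearestBanks);
```

A real index avoids scanning every document; the sketch only pins down the expected result set for each kind of query.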
34??
35MongoDB? ???
- ??? ??
- ? App ??? ????
- ?? ???
- ?? ???, ???, ??, ???? ?
- ??? ??
- Atomic Inc? ??? ??? ?? (??? ??/??? PV)
- ??? ?? ?? ??
- Sharding MapReduce ?? ??? ?? ??/??? ??
- ??
- <key, value>, Schema Free ???? ?? ?? ??
- Capped Collection
- ?? ?? ??? ??
- ??? ?? ??
- ????? ??? ???
- ?, ??? ???, ?? ???, ?? ???
36???
- NoSQL DB? ?? ?? ??
- Document ?? ??
- No Relation, No Join, No SQL, Schema Free
- MapReduce ??
- ?? ??, no group by
- RDBMS? Document DB? ??? ??
37????
- MapReduce
- Wikipedia: http://en.wikipedia.org/wiki/MapReduce
- MapReduce: distributed computing on large commodity clusters: http://www.slideshare.net/spirosd/mapreduce-distributed-computing-on-large-commodity-clusters
- NoSQL
- NoSQL ??: http://en.wikipedia.org/wiki/NoSQL
- MongoDB ?? ??
- MongoDB ???: http://www.mongodb.org/display/DOCS/DeveloperZone
- Fast Updates with MongoDB: http://blog.mongodb.org/post/248614779/fast-updates-with-mongodb-update-in-place
- MongoDB is Fantastic for Logging: http://blog.mongodb.org/post/172254834/mongodb-is-fantastic-for-logging
- Using MongoDB for Real-time Analytics: http://blog.mongodb.org/post/171353301/using-mongodb-for-real-time-analytics
38Q&A