MongoDB

Google's BigTable (http://en.wikipedia.org/wiki/BigTable) predates MongoDB. NoSQL friends include HBase, Cassandra, Redis, MongoDB, Voldemort, CouchDB, Dynomite, Hypertable, ...

MongoDB stores distributed collections of collections of dictionaries, kept as JSON-like documents in BSON (binary JSON) form.

From the MongoDB docs:
  • A Mongo system (see deployment above) holds a set of databases
  • database holds a set of collections
  • collection holds a set of documents
  • document is a set of fields
  • field is a key-value pair
  • key is a name (string)
  • value is one of
    • a basic type like string, integer, float, timestamp, binary, etc.,
    • a document, or
    • an array of values
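
The hierarchy above maps directly onto JavaScript object literals. The following is a small sketch in plain JavaScript (no MongoDB driver involved); the `user` document and the `kindOf` and `describe` helpers are made up for illustration.

```javascript
// A hypothetical document: a set of fields (key-value pairs).
const user = {
  name: { first: "John", last: "Doe" }, // value is a nested document
  age: 42,                              // value is a basic type
  phones: ["1234", "0101"],             // value is an array of values
};

// Classify a field's value as one of the three kinds listed above.
function kindOf(value) {
  if (Array.isArray(value)) return "array";
  if (value instanceof Date) return "basic"; // timestamps count as basic
  if (value !== null && typeof value === "object") return "document";
  return "basic";
}

// Summarize each field of a document as "key: kind".
function describe(doc) {
  return Object.keys(doc).map((key) => `${key}: ${kindOf(doc[key])}`);
}

console.log(describe(user));
```

Because values can themselves be documents or arrays, a single document can hold what a relational schema would spread over several tables.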
Tradeoffs:
NoSQL databases typically scale out across machines more easily than an RDBMS and can be much faster for simple lookups. A "shard key" dictates how documents get distributed across boxes. They are also simpler to operate and often don't require a dedicated db administrator. Are they as reliable?
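
To make the shard key idea concrete, here is a rough sketch of routing a document to a box by hashing its shard key value. This is only an illustration of the concept, not MongoDB's actual distribution logic; the `shards` list, `hashString`, and `shardFor` are all hypothetical names.

```javascript
// Hypothetical cluster of three boxes.
const shards = ["box0", "box1", "box2"];

// Simple deterministic string hash (for illustration only).
function hashString(s) {
  let h = 0;
  for (let i = 0; i < s.length; i++) {
    h = (h * 31 + s.charCodeAt(i)) >>> 0; // keep as unsigned 32-bit
  }
  return h;
}

// Pick the box that owns a document, based on its shard key field.
function shardFor(doc, shardKey) {
  const keyValue = String(doc[shardKey]);
  return shards[hashString(keyValue) % shards.length];
}

// Documents with the same shard key value always land on the same box.
console.log(shardFor({ name: "parrt", passwd: "secret" }, "name"));
console.log(shardFor({ name: "tombu", passwd: "sort of secret" }, "name"));
```

The choice of shard key matters: a key with few distinct values would pile most documents onto a few boxes.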

Note:
  • collections are SCHEMA-LESS, can be heterogeneous
  • each doc has a unique _id, auto-generated if you don't supply one
  • can index keys ("columns")
  • atomic operations on single documents
  • javascript in mongo shell
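  • queries use json too:
    • {name: {first: 'John', last: 'Doe'}} (exact match on the whole embedded document)
    • {"name.last": 'Doe'} (dotted keys reach into embedded documents and must be quoted)
    • {"name.last": /^D/}
    • supports equality, regular expressions, ranges, geospatial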
  • Joins must be done MANUALLY in memory, in application code (Java etc.). (MongoDB 3.2 later added a $lookup aggregation stage.)
  • Embedded documents and arrays reduce need for joins, at the cost of data duplication and having to update multiple locations when things change.
  • I see NO transaction model across documents, except for one implemented manually! (Multi-document transactions arrived later, in MongoDB 4.0.)
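
As a way to see what a query document means, the sketch below re-implements a tiny subset of query matching in plain JavaScript: equality and regular expressions on dotted keys. It is an approximation for illustration, not MongoDB's matcher (it ignores arrays, ranges, and geospatial queries); `getPath`, `matches`, and the sample `users` array are hypothetical.

```javascript
// Follow a dotted path such as "name.last" into nested documents.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// True if every condition in the query holds for the document.
// Supports only equality on basic values and regular expressions.
function matches(doc, query) {
  return Object.entries(query).every(([path, condition]) => {
    const value = getPath(doc, path);
    if (condition instanceof RegExp) {
      return typeof value === "string" && condition.test(value);
    }
    return value === condition;
  });
}

// Hypothetical in-memory "collection".
const users = [
  { name: { first: "John", last: "Doe" } },
  { name: { first: "Jane", last: "Smith" } },
];

// Same shape as the shell query {"name.last": /^D/}.
const found = users.filter((u) => matches(u, { "name.last": /^D/ }));
console.log(found);
```

Note that the dotted key has to be quoted, since `name.last` is not a valid bare key in a JavaScript object literal.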

Install


$ sudo mkdir -p /var/lib/mongodb
$ sudo mongod --dbpath /var/lib/mongodb


Mongo Shell



> show collections;
system.indexes
users

> db.users.find();
{ "_id" : ObjectId("5075cc673004f76b80219e33"), "info" : "lives to code.", "name" : "parrt", "passwd" : "secret2" }
{ "_id" : ObjectId("5075ccec3004425635986fc4"), "name" : "parrt", "passwd" : "secret", "info" : "lives to code." }
...

> db.users.find({"name":"tombu"})
{ "_id" : ObjectId("5075d0203004d9e3e6c52c2e"), "name" : "tombu", "passwd" : "sort of secret", "phones" : [ "1234", "0101" ] }
{ "_id" : ObjectId("5075d0743004248cf1603657"), "name" : "tombu", "passwd" : "sort of secret", "phones" : [ "1234", "0101" ] }
...

> t = db.users.findOne({"name":"tombu"});
{
"_id" : ObjectId("5075d0203004d9e3e6c52c2e"),
"name" : "tombu",
"passwd" : "sort of secret",
"phones" : [
"1234",
"0101"
]
}
> t.passwd = "shhhh!"
shhhh!
> db.users.save(t)
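
The findOne/modify/save pattern above relies on save replacing the stored document that has the same _id, or inserting when there is none. A minimal in-memory sketch of that behavior in plain JavaScript follows; the `save` function, the `nextId` counter (a stand-in for generated ObjectIds), and the `store` array are hypothetical and only mimic the shell's semantics.

```javascript
let nextId = 1; // stand-in for auto-generated ObjectIds

// Insert the document if it is new, otherwise replace the stored copy
// that has the same _id.
function save(collection, doc) {
  if (doc._id === undefined) {
    doc._id = nextId++;
    collection.push(doc);
    return;
  }
  const i = collection.findIndex((d) => d._id === doc._id);
  if (i >= 0) {
    collection[i] = doc;
  } else {
    collection.push(doc);
  }
}

const store = [];
save(store, { name: "tombu", passwd: "sort of secret" });

// Fetch a copy, change a field, and save it back.
const t = Object.assign({}, store.find((d) => d.name === "tombu"));
t.passwd = "shhhh!";
save(store, t);

console.log(store);
```

The whole document is written back, so any field changed by someone else between the find and the save would be overwritten.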

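To create a new collection using the shell, just save a document into it; MongoDB creates collections (and databases) lazily on first insert: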

> db.pages.find()
> db.pages.save({"url":"http://cnn.com", "html":"..."})
> db.pages.find()
{ "_id" : ObjectId("5075de9b0627457e04f1d3ac"), "url" : "http://cnn.com", "html" : "..." }

Documents referring to other documents


Manual references (storing another document's _id in a field) serve as foreign keys; the alternative is to embed, and thus duplicate, subdocuments.
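
Resolving such references is the manual, in-memory join mentioned earlier. The sketch below shows one way it might look in plain JavaScript; the `owners` and `pages` arrays, the `ownerId` field, and the example.com URL are invented for illustration.

```javascript
// Hypothetical collections: each page stores the _id of its owner.
const owners = [
  { _id: 1, name: "parrt" },
  { _id: 2, name: "tombu" },
];
const pages = [
  { _id: 10, url: "http://cnn.com", ownerId: 1 },
  { _id: 11, url: "http://example.com", ownerId: 2 },
  { _id: 12, url: "http://example.com/orphan", ownerId: 99 }, // dangling reference
];

// Index owners by _id once, then resolve each reference by lookup.
function joinPagesWithOwners(pageDocs, ownerDocs) {
  const byId = new Map(ownerDocs.map((o) => [o._id, o]));
  return pageDocs.map((p) => ({
    url: p.url,
    owner: byId.get(p.ownerId) || null, // null when the reference is dangling
  }));
}

const joined = joinPagesWithOwners(pages, owners);
console.log(joined);
```

Nothing enforces referential integrity here, so the application has to decide what a dangling reference means.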

It turns out _id uniqueness is enforced only within a collection, so object IDs are technically not unique across collections.

Java interface


Attached examples (Terence Parr, Oct 10, 2012):
  • Find.java
  • Insert.java
  • Test.java
  • Update.java