Abstract of
MongoDB
. The name comes from "humongous"
. Document-oriented database, not relational
. Schema-free
. Manages hierarchical collections of BSON ("bee-son") documents
. Written in C++
. Has an official C# driver, supported by 10gen
. Scales horizontally with high performance
. Designed to address today's workloads
. BASE rather than ACID compliant
. Replication
. Part of the "NoSQL" class of DBMS
. Website with a list of all features: http://www.mongodb.org/
Why are these interesting?
. New requirements are arising in environments that combine higher data volumes and high operation rates with agile development and cloud computing. Applications are becoming more interactive, networked and social, which drives more requests to the database and favors high-performance DBMSs such as MongoDB.
. Not requiring a schema or migration scripts before you add data fits well with agile development approaches. Each time you complete new features, the schema of your database often needs to change, and if the database is large, that change can be a slow process.
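The schema-free point can be illustrated with a minimal sketch. It assumes plain Python dicts as stand-ins for BSON documents and a list as a stand-in for a collection; no MongoDB driver is involved, and the names `users` and `city_of` are hypothetical.

```python
# Plain dicts stand in for BSON documents; a list stands in for a collection.
# No MongoDB driver is used here; this is only an illustration.
users = []

# Version 1 of the application stores a name and an email.
users.append({"name": "Ada", "email": "ada@example.com"})

# Version 2 adds a nested "address" field.
# Older documents are left as they are: no migration script is run.
users.append({
    "name": "Grace",
    "email": "grace@example.com",
    "address": {"city": "Arlington", "country": "US"},
})


def city_of(doc):
    """Return the city if the document has one, else None."""
    return doc.get("address", {}).get("city")


# Readers have to tolerate documents that lack the newer field.
cities = [city_of(doc) for doc in users]
```

The cost of skipping the migration moves into the reading code, which has to cope with both document shapes.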
ACID
. Relational databases make the ACID promise:
  - Atomicity - a transaction is all or nothing
  - Consistency - only valid data is written to the database
  - Isolation - concurrent transactions behave as if they ran serially, so the data stays correct
  - Durability - once a write is committed, it is not lost
. The problem is that ACID can give you too much: it trips you up when you try to scale a system across multiple nodes.
. Downtime is unacceptable, so your system needs to be reliable. Reliability requires multiple nodes to handle machine failures.
. To build scalable systems that can handle very large numbers of reads and writes, you need many more nodes.
. Once you try to scale ACID across many machines, you hit problems with network failures and delays. The algorithms do not work in a distributed environment at an acceptable speed.
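The "all or nothing" and "only valid data" promises can be shown with a toy single-process sketch. It is not how a real database implements transactions; the `transfer` function and the account names are hypothetical, and a snapshot copy stands in for a proper transaction log.

```python
import copy


def transfer(accounts, src, dst, amount):
    """Apply both balance updates or neither (a toy all-or-nothing transaction)."""
    snapshot = copy.deepcopy(accounts)  # state to restore on failure
    try:
        accounts[src] -= amount
        if accounts[src] < 0:
            # Consistency rule for this toy example: no negative balances.
            raise ValueError("insufficient funds")
        accounts[dst] += amount  # raises KeyError if dst does not exist
    except (KeyError, ValueError):
        accounts.clear()
        accounts.update(snapshot)  # roll back every partial change
        return False
    return True


accounts = {"alice": 100, "bob": 0}
ok = transfer(accounts, "alice", "bob", 30)        # both updates applied
too_much = transfer(accounts, "alice", "bob", 500)  # rolled back
no_target = transfer(accounts, "alice", "carol", 10)  # rolled back
```

On a single machine the rollback is cheap. The difficulty described above appears when `src` and `dst` live on different nodes and the network between them can fail partway through.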
CAP
. If you can't have all of the ACID guarantees, it turns out you can have two of the following three characteristics:
  - Consistency - your data is correct all the time. What you write is what you read.
  - Availability - you can read and write your data all the time
  - Partition tolerance - if one or more nodes fail, the system still works and becomes consistent again when the failed nodes come back on-line.
. In distributed systems, network partitioning is inevitable and must be tolerated, so essentially CAP means that we cannot have both consistency and 100% availability.
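The trade-off can be sketched with a toy two-replica cluster. This is an illustration only, not MongoDB's replication protocol; the `Replica` class, the `write` function and the "CP"/"AP" mode labels are assumptions made for the example.

```python
class Replica:
    """A node holding its own copy of the data."""

    def __init__(self):
        self.data = {}


def write(replicas, key, value, partitioned, mode):
    """Write to a two-replica toy cluster.

    mode "CP": refuse the write during a partition (stay consistent).
    mode "AP": accept it on the reachable replica only (stay available).
    """
    reachable, unreachable = replicas
    if not partitioned:
        reachable.data[key] = value
        unreachable.data[key] = value
        return True
    if mode == "CP":
        return False  # give up availability, replicas stay identical
    reachable.data[key] = value  # give up consistency, replicas diverge
    return True


a, b = Replica(), Replica()
healthy = write((a, b), "x", 1, partitioned=False, mode="CP")
cp_result = write((a, b), "x", 2, partitioned=True, mode="CP")
cp_state = (a.data["x"], b.data["x"])
ap_result = write((a, b), "x", 2, partitioned=True, mode="AP")
ap_state = (a.data["x"], b.data["x"])
```

In "CP" mode the write is rejected and both replicas keep the old value. In "AP" mode the write succeeds but the replicas disagree until the partition heals and they are reconciled.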
" If the network is broken, your database won't work. "