Ruby driver for MongoDB
Go to file
2009-06-02 11:24:52 -04:00
bin drop database before doing benchmarking 2009-03-24 12:01:08 -04:00
examples update types in example 2009-03-12 15:33:13 -04:00
ext/cbson BUG RUBY-15 don't check key names on create_index operations 2009-06-02 09:38:31 -04:00
lib API CHANGE: better, less redundant API for index_information 2009-06-02 11:24:52 -04:00
tests API CHANGE: better, less redundant API for index_information 2009-06-02 11:24:52 -04:00
.gitignore checkpoint - beginnings of c encoder 2009-03-03 17:07:22 -05:00
LICENSE.txt Add file w/ Apache License v2.0 2009-02-15 08:24:50 -05:00
mongo-extensions.gemspec BUMP 0.7 mainly adding support for negative integers 2009-05-15 11:20:51 -04:00
mongo-ruby-driver.gemspec BUMP 0.8 Collection#save, DB#previous_error and DB#reset_error_history 2009-05-26 15:28:13 -04:00
Rakefile remove entire output dir, not just a single version 2009-04-23 15:39:57 -04:00
README.rdoc credits 2009-06-01 09:21:59 -04:00

= Introduction

This is a Ruby driver for the 10gen Mongo DB. For more information about
Mongo, see http://www.mongodb.org.

Start by reading the XGen::Mongo::Driver::Mongo and XGen::Mongo::Driver::DB
documentation, then move on to XGen::Mongo::Driver::Collection and
XGen::Mongo::Driver::Cursor.

Here is a quick code sample. See the files in the "examples" subdirectory for
many more.

  require 'mongo'

  include XGen::Mongo::Driver

  db = Mongo.new('localhost').db('sample-db')
  coll = db.collection('test')

  coll.clear
  3.times { |i| coll.insert({'a' => i+1}) }
  puts "There are #{coll.count()} records. Here they are:"
  coll.find().each { |doc| puts doc.inspect }

This driver also includes an implementation of a GridStore class, a Ruby
interface to Mongo's GridFS storage. NOTE: the GridStore code may be moved to
a separate project.

= Installation

Install the "mongo" gem by typing

  $ gem sources -a http://gems.github.com
  $ sudo gem install mongodb-mongo

The first line tells RubyGems to add the GitHub gem repository. You only need
to run this command once.

=== From the GitHub source

The source code is available at http://github.com/mongodb/mongo-ruby-driver.
You can either clone the git repository or download a tarball or zip file.
Once you have the source, you can use it from wherever you downloaded it or
you can install it as a gem from the source by typing

  $ rake gem:install

Note: when you install the gem this way it is called "mongo", not
"mongodb-mongo". In either case, you "require 'mongo'" in your source code.

=== Optional C Extension

There is a separate gem containing optional C extensions that will increase the
performance of the driver. To use the optional extensions just install the gem
by typing

  $ sudo gem install mongodb-mongo_ext

To install from source type this instead

  $ rake gem:install_extensions

That's all there is to it!

= Examples

There are many examples in the "examples" subdirectory. Samples include using
the driver and using the GridFS class GridStore. Mongo must be running for
these examples to work, of course.

Here's how to start mongo and run the "simple.rb" example:

  $ cd path/to/mongo
  $ ./mongod run
  ... then in another window ...
  $ cd path/to/mongo-ruby-driver
  $ ruby examples/simple.rb

See also the test code, especially tests/test_db_api.rb.


= The Driver

Here is some simple example code:

  require 'rubygems'        # not required for Ruby 1.9
  require 'mongo'

  include XGen::Mongo::Driver
  db = Mongo.new.db('my-db-name')
  things = db.collection('things')

  things.clear
  things.insert('a' => 42)
  things.insert('a' => 99, 'b' => Time.now)
  puts things.count                               # => 2
  puts things.find('a' => 42).next_object.inspect # {"a"=>42}


= GridStore

The GridStore class is a Ruby implementation of Mongo's GridFS file storage
system. An instance of GridStore is like an IO object. See the rdocs for
details, and see examples/gridfs.rb for code that uses many of the GridStore
features like metadata, content type, rewind/seek/tell, etc.

Note that the GridStore class is not automatically required when you require
'mongo'. You need to require 'mongo/gridfs'.

Example code:

  GridStore.open(database, 'filename', 'w') { |f|
    f.puts "Hello, world!"
  }
  GridStore.open(database, 'filename, 'r') { |f|
    puts f.read         # => Hello, world!\n
  }
  GridStore.open(database, 'filename', 'w+') { |f|
    f.puts "But wait, there's more!"
  }
  GridStore.open(database, 'filename, 'r') { |f|
    puts f.read         # => Hello, world!\nBut wait, there's more!\n
  }



= Notes

== String Encoding

The BSON ("Binary JSON") format used to communicate with Mongo requires that
strings be UTF-8 (http://en.wikipedia.org/wiki/UTF-8).

Ruby 1.9 has built-in character encoding support. All strings sent to Mongo
and received from Mongo are converted to UTF-8 when necessary, and strings
read from Mongo will have their character encodings set to UTF-8.

When used with Ruby 1.8, the bytes in each string are written to and read from
Mongo as-is. If the string is ASCII all is well, because ASCII is a subset of
UTF-8. If the string is not ASCII then it may not be a well-formed UTF-8
string.

== Primary Keys

The field _id is a primary key. It is treated specially by the database, and
its use makes many operations more efficient.

The value of an _id may be of any type. (Older versions of Mongo required that
they be XGen::Mongo::Driver::ObjectID instances.)

The database itself inserts an _id value if none is specified when a record is
inserted.

The driver automatically sends the _id field to the database first, which is
how Mongo likes it. You don't have to worry about where the _id field is in
your hash record, or worry if you are using an OrderedHash or not.

=== Primary Key Factories

A primary key factory is a class you supply to a DB object that knows how to
generate _id values. Primary key factories are no longer necessary because
Mongo now inserts an _id value for every record that does not already have
one. However, if you want to control _id values or even their types, using a
PK factory lets you do so.

You can tell the Ruby Mongo driver how to create primary keys by passing in
the :pk option to the Mongo#db method.

  include XGen::Mongo::Driver
  db = Mongo.new.db('dbname', :pk => MyPKFactory.new)

A primary key factory object must respond to :create_pk, which should take a
hash and return a hash which merges the original hash with any primary key
fields the factory wishes to inject. NOTE: if the object already has a primary
key, the factory should not inject a new key; this means that the object is
being used in a repsert but it already exists. The idea here is that when ever
a record is inserted, the :pk object's +create_pk+ method will be called and
the new hash returned will be inserted.

Here is a sample primary key factory, taken from the tests:

  class TestPKFactory
    def create_pk(row)
      row['_id'] ||= XGen::Mongo::Driver::ObjectID.new
      row
    end
  end

Here's a slightly more sophisticated one that handles both symbol and string
keys. This is the PKFactory that comes with the MongoRecord code (an
ActiveRecord-like framework for non-Rails apps) and the AR Mongo adapter code
(for Rails):

  class PKFactory
    def create_pk(row)
      return row if row[:_id]
      row.delete(:_id)      # in case it exists but the value is nil
      row['_id'] ||= XGen::Mongo::Driver::ObjectID.new
      row
    end
  end

A database's PK factory object may be set either when a DB object is created
or immediately after you obtain it, but only once. The only reason it is
changeable at all is so that libraries such as MongoRecord that use this
driver can set the PK factory after obtaining the database but before using it
for the first time.

== The DB Class

=== Primary Key factories

See the section on "Primary Keys" above.

=== Strict mode

Each database has an optional strict mode. If strict mode is on, then asking
for a collection that does not exist will raise an error, as will asking to
create a collection that already exists. Note that both these operations are
completely harmless; strict mode is a programmer convenience only.

To turn on strict mode, either pass in :strict => true when obtaining a DB
object or call the :strict= method:

  db = XGen::Mongo::Driver::Mongo.new.db('dbname', :strict => true)
  # I'm feeling lax
  db.strict = false
  # No, I'm not!
  db.strict = true

The method DB#strict? returns the current value of that flag.

== Cursors

Random cursor fun facts:

- Cursors are enumerable.

- The query doesn't get run until you actually attempt to retrieve data from a
  cursor.

- Cursors have a to_a method.



= Testing

If you have the source code, you can run the tests.

  $ rake test

The tests assume that the Mongo database is running on the default port. You
can override the default host (localhost) and port (Mongo::DEFAULT_PORT) by
using the environment variables MONGO_RUBY_DRIVER_HOST and
MONGO_RUBY_DRIVER_PORT.

The project mongo-qa (http://github.com/mongodb/mongo-qa) contains many more
Mongo driver tests that are language independent. To run thoses tests as part
of the "rake test" task, download the code "next to" this directory. So, after
installing the mongo-qa code you would have these two directories next to each
other:

  $ ls
  mongo-qa
  mongo-ruby-driver
  $ rake test

The tests run just fine if the mongo-qa directory is not there.

Additionally, the script bin/validate is used by the mongo-qa project's
validator script.


= Documentation

This documentation is available online at http://mongo.rubyforge.org. You can
generate the documentation if you have the source by typing

  $ rake rdoc

Then open the file html/index.html.


= Release Notes

See the git log comments.


= To Do

* Tests for update and repsert.

* Add a way to specify a collection of databases on startup (a simple array of
  IP address/port numbers, perhaps, or a hash or something). The driver would
  then find the master and, on each subsequent command, ask that machine if it
  is the master before proceeding.

* Introduce optional per-database and per-collection PKInjector.

* More tests.

== Optimizations

* Only update message sizes once, not after every write of a value. This will
  require an explicit call to update_message_length in each message subclass.

* ensure_index commands should be cached to prevent excessive communication
  with the database. (Or, the driver user should be informed that ensure_index
  is not a lightweight operation for the particular driver.)


= Credits

Adrian Madrid, aemadrid@gmail.com
* bin/mongo_console
* examples/benchmarks.rb
* examples/irb.rb
* Modifications to examples/simple.rb
* Found plenty of bugs and missing features.
* Ruby 1.9 support.
* Gem support.
* Many other code suggestions and improvements.

Aman Gupta, aman@tmm1.net
* Collection#save

Jon Crosby, jon@joncrosby.me
* Some code clean-up

= License

 Copyright 2008-2009 10gen Inc.

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.