collection – Collection level operations

Collection level utilities for Mongo.

pymongo.ASCENDING
Ascending sort order.
pymongo.DESCENDING
Descending sort order.
class pymongo.collection.Collection(database, name, options=None)

Get / create a Mongo collection.

Raises TypeError if name is not an instance of (str, unicode). Raises InvalidName if name is not a valid collection name. Raises TypeError if options is not an instance of dict. If options is non-empty a create command will be sent to the database. Otherwise the collection will be created implicitly on first use.

Parameters:
  • database: the database to get a collection from
  • name: the name of the collection to get
  • options: dictionary of collection options. See pymongo.database.Database.create_collection for details.
count()

Get the number of documents in this collection.

To get the number of documents matching a specific query use pymongo.cursor.Cursor.count().

create_index(key_or_list, direction=None, unique=False, ttl=300)

Creates an index on this collection.

Takes either a single key or a list of (key, direction) pairs. The key(s) must be an instance of (str, unicode), and the directions must be one of (ASCENDING, DESCENDING). Returns the name of the created index.

Parameters:
  • key_or_list: a single key or a list of (key, direction) pairs specifying the index to create
  • direction (optional): DEPRECATED this option will be removed
  • unique (optional): should this index guarantee uniqueness?
  • ttl (optional): time window (in seconds) during which this index will be recognized by subsequent calls to ensure_index() - see documentation for ensure_index() for details
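
The two accepted shapes of key_or_list can be illustrated without a server. The following is a plain-Python sketch (written for Python 3, so only str is checked), not PyMongo's own code: normalize_index_spec is a hypothetical helper, the values 1 and -1 for the sort constants are assumptions, and treating a bare key as ascending is a choice made for this sketch.

```python
ASCENDING = 1     # assumed value of pymongo.ASCENDING
DESCENDING = -1   # assumed value of pymongo.DESCENDING


def normalize_index_spec(key_or_list):
    """Hypothetical helper: return a list of (key, direction) pairs."""
    if isinstance(key_or_list, str):
        # A single key is treated as one ascending pair in this sketch.
        return [(key_or_list, ASCENDING)]
    if not isinstance(key_or_list, list):
        raise TypeError("key_or_list must be a str or a list of pairs")
    pairs = []
    for key, direction in key_or_list:
        if not isinstance(key, str):
            raise TypeError("index keys must be strings")
        if direction not in (ASCENDING, DESCENDING):
            raise ValueError("direction must be ASCENDING or DESCENDING")
        pairs.append((key, direction))
    return pairs


single = normalize_index_spec("user_id")
compound = normalize_index_spec([("user_id", ASCENDING),
                                 ("created", DESCENDING)])
```

The field names user_id and created are illustrative only.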
database

The Database that this Collection is a part of.

Changed in version 1.3: database is now a property rather than a method. The database() method is deprecated.

distinct(key)

Get a list of distinct values for key among all documents in this collection.

Raises TypeError if key is not an instance of (str, unicode).

To get the distinct values for a key in the result set of a query use pymongo.cursor.Cursor.distinct().

Parameters:
  • key: name of key for which we want to get the distinct values

Note

Requires server version >= 1.1.0

New in version 1.1.1.

drop_index(index_or_name)

Drops the specified index on this collection.

Can be used on non-existent collections or collections with no indexes. Raises OperationFailure on an error. index_or_name can be either an index name (as returned by create_index), or an index specifier (as passed to create_index). An index specifier should be a list of (key, direction) pairs. Raises TypeError if index_or_name is not an instance of (str, unicode, list).

Parameters:
  • index_or_name: index (or name of index) to drop
drop_indexes()

Drops all indexes on this collection.

Can be used on non-existent collections or collections with no indexes. Raises OperationFailure on an error.

ensure_index(key_or_list, direction=None, unique=False, ttl=300)

Ensures that an index exists on this collection.

Takes either a single key or a list of (key, direction) pairs. The key(s) must be an instance of (str, unicode), and the direction(s) must be one of (ASCENDING, DESCENDING).

Unlike create_index(), which attempts to create an index unconditionally, ensure_index() takes advantage of some caching within the driver such that it only attempts to create indexes that might not already exist. When an index is created (or ensured) by PyMongo it is “remembered” for ttl seconds. Repeated calls to ensure_index() within that time limit will be lightweight - they will not attempt to actually create the index.

Care must be taken when the database is being accessed through multiple connections at once. If an index is created using PyMongo and then deleted using another connection, any call to ensure_index() within the cache window will fail to re-create the missing index, because the driver still remembers it as existing.

Returns the name of the created index if an index is actually created. Returns None if the index already exists.

Parameters:
  • key_or_list: a single key or a list of (key, direction) pairs specifying the index to ensure
  • direction (optional): DEPRECATED this option will be removed
  • unique (optional): should this index guarantee uniqueness?
  • ttl (optional): time window (in seconds) during which this index will be recognized by subsequent calls to ensure_index()
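
The caching behaviour described for ensure_index() can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the driver's implementation: IndexCache is a hypothetical class, the index name "user_id_1" is illustrative, and an injected fake clock replaces real time so the run is deterministic.

```python
import time


class IndexCache:
    """Hypothetical in-memory model of a ttl-based "remembered index" cache."""

    def __init__(self, clock=time.time):
        self._expires = {}    # index name -> time at which it is forgotten
        self._clock = clock

    def ensure(self, name, ttl=300):
        """Return name if a create would be attempted, None if still cached."""
        now = self._clock()
        if self._expires.get(name, 0) > now:
            return None       # remembered: no create is attempted
        self._expires[name] = now + ttl
        return name           # not remembered: the index would be created here


# A fake clock keeps the example deterministic.
fake_now = [1000.0]
cache = IndexCache(clock=lambda: fake_now[0])

first = cache.ensure("user_id_1")     # not cached yet: create is attempted
second = cache.ensure("user_id_1")    # inside the ttl window: skipped
fake_now[0] += 301                    # let the cache window pass
third = cache.ensure("user_id_1")     # window expired: create is attempted again
```

In this model second is None even if something else removed the index in the meantime, which is the multi-connection hazard noted above.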
find(spec=None, fields=None, skip=0, limit=0, slave_okay=None, timeout=True, snapshot=False, tailable=False, _sock=None, _must_use_master=False)

Query the database.

The spec argument is a prototype document that all results must match. For example:

>>> db.test.find({"hello": "world"})

only matches documents that have a key “hello” with value “world”. Matches can have other keys in addition to “hello”. The fields argument is used to specify a subset of fields that should be included in the result documents. By limiting results to a certain subset of fields you can cut down on network traffic and decoding time.

Raises TypeError if any of the arguments are of improper type. Returns an instance of Cursor corresponding to this query.

Parameters:
  • spec (optional): a SON object specifying elements which must be present for a document to be included in the result set
  • fields (optional): a list of field names that should be returned in the result set (“_id” will always be included)
  • skip (optional): the number of documents to omit (from the start of the result set) when returning the results
  • limit (optional): the maximum number of results to return
  • slave_okay (optional): DEPRECATED this option is deprecated and will be removed - see the slave_okay parameter to pymongo.Connection.__init__.
  • timeout (optional): if True, any returned cursor will be subject to the normal timeout behavior of the mongod process. Otherwise, the returned cursor will never timeout at the server. Care should be taken to ensure that cursors with timeout turned off are properly closed.
  • snapshot (optional): if True, snapshot mode will be used for this query. Snapshot mode ensures that objects present at both the start and the end of the query’s execution are returned exactly once, with no duplicates and no omissions. For details, see the snapshot documentation.
  • tailable (optional): if True, the result of this find call will be a tailable cursor. Tailable cursors aren’t closed when the last data is retrieved; they are kept open, and the cursor’s location marks the final document’s position. If more data is received, iteration of the cursor continues from the last document received. For details, see the tailable cursor documentation.

New in version 1.1: The tailable parameter.
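
The roles of spec, fields, skip and limit can be sketched over a plain list of dicts. The functions below are hypothetical stand-ins, not PyMongo code: they model only exact-equality matching (no query operators), and treating limit=0 as "no limit" is an assumption of the sketch.

```python
def matches(document, spec):
    """True if every key in spec is present in document with an equal value."""
    return all(key in document and document[key] == value
               for key, value in spec.items())


def project(document, fields):
    """Keep only the named fields; "_id" is always kept when present."""
    if fields is None:
        return dict(document)
    wanted = set(fields) | {"_id"}
    return {k: v for k, v in document.items() if k in wanted}


def find(documents, spec=None, fields=None, skip=0, limit=0):
    """Hypothetical stand-in for Collection.find over a plain list."""
    hits = [d for d in documents if matches(d, spec or {})]
    hits = hits[skip:]
    if limit:                       # 0 means "no limit" in this sketch
        hits = hits[:limit]
    return [project(d, fields) for d in hits]


docs = [
    {"_id": 1, "hello": "world", "lang": "en"},
    {"_id": 2, "hello": "mundo", "lang": "es"},
    {"_id": 3, "hello": "world", "lang": "fr"},
]
result = find(docs, {"hello": "world"}, fields=["lang"], skip=1)
```

Two documents match the spec; skipping the first leaves the one with _id 3, reduced to its "_id" and "lang" keys.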

find_one(spec_or_object_id=None, fields=None, slave_okay=None, _sock=None, _must_use_master=False)

Get a single object from the database.

Raises TypeError if the argument is of an improper type. Returns a single SON object, or None if no result is found.

Parameters:
  • spec_or_object_id (optional): a SON object specifying elements which must be present for a document to be returned OR an instance of ObjectId to be used as the value for an _id query
  • fields (optional): a list of field names that should be included in the returned document (“_id” will always be included)
  • slave_okay (optional): DEPRECATED this option is deprecated and will be removed - see the slave_okay parameter to pymongo.Connection.__init__.
full_name

The full name of this Collection.

The full name is of the form database_name.collection_name.

Changed in version 1.3: full_name is now a property rather than a method. The full_name() method is deprecated.

group(keys, condition, initial, reduce, finalize=None, command=True)

Perform a query similar to an SQL group by operation.

Returns an array of grouped items.

Parameters:
  • keys: list of fields to group by
  • condition: specification of rows to be considered (as a find query specification)
  • initial: initial value of the aggregation counter object
  • reduce: aggregation function as a JavaScript string
  • finalize (optional): function to be called on each object in the output list
  • command (optional): DEPRECATED if True, run the group as a command instead of in an eval - this option is deprecated and will be removed in favor of running all groups as commands

Changed in version 1.3: The command argument now defaults to True and is deprecated.
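
How keys, condition, initial, reduce and finalize fit together may be easier to see in an in-memory model. This sketch is not how the server runs a group: the real reduce and finalize arguments are JavaScript, whereas here they are Python callables, condition is treated as an exact-match spec, and the group function itself is a hypothetical stand-in.

```python
import copy


def group(documents, keys, condition, initial, reduce, finalize=None):
    """Hypothetical in-memory model of a group-by over a list of dicts."""
    buckets = {}
    for doc in documents:
        # condition is treated as an exact-match spec in this sketch
        if any(k not in doc or doc[k] != v for k, v in condition.items()):
            continue
        bucket_key = tuple(doc.get(k) for k in keys)
        if bucket_key not in buckets:
            out = dict(zip(keys, bucket_key))
            out.update(copy.deepcopy(initial))   # fresh counter per group
            buckets[bucket_key] = out
        reduce(doc, buckets[bucket_key])         # reduce mutates the aggregate
    results = list(buckets.values())
    if finalize is not None:
        for out in results:
            finalize(out)
    return results


orders = [
    {"customer": "ann", "status": "paid", "total": 10},
    {"customer": "bob", "status": "paid", "total": 5},
    {"customer": "ann", "status": "paid", "total": 7},
    {"customer": "ann", "status": "open", "total": 99},
]


def add_total(doc, agg):
    agg["sum"] += doc["total"]
    agg["count"] += 1


grouped = group(orders, ["customer"], {"status": "paid"},
                {"sum": 0, "count": 0}, add_total)
```

Each group starts from its own copy of initial, so counters are not shared between groups.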

index_information()

Get information on this collection’s indexes.

Returns a dictionary where the keys are index names (as returned by create_index()) and the values are lists of (key, direction) pairs specifying the index (as passed to create_index()).

insert(doc_or_docs, manipulate=True, safe=False, check_keys=True)

Insert a document(s) into this collection.

If manipulate is set the document(s) are manipulated using any SONManipulators that have been added to this database. Returns the _id of the inserted document or a list of _ids of the inserted documents. If the document(s) does not already contain an ‘_id’ one will be added. If safe is True then the insert will be checked for errors, raising OperationFailure if one occurred. Safe inserts wait for a response from the database, while normal inserts do not.

Parameters:
  • doc_or_docs: a SON object or list of SON objects to be inserted
  • manipulate (optional): manipulate the documents before inserting?
  • safe (optional): check that the insert succeeded?
  • check_keys (optional): check if keys start with ‘$’ or contain ‘.’, raising pymongo.errors.InvalidName in either case

Changed in version 1.1: Bulk insert works with any iterable
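
The rule behind check_keys can be written out as a small validator. This is a sketch only: InvalidName here is a local stand-in for pymongo.errors.InvalidName, check_keys and is_valid are hypothetical helpers, and descending into nested documents is an assumption of the sketch.

```python
class InvalidName(ValueError):
    """Local stand-in for pymongo.errors.InvalidName."""


def check_keys(document):
    """Raise InvalidName for keys that start with '$' or contain '.'."""
    for key, value in document.items():
        if key.startswith("$"):
            raise InvalidName("key %r must not start with '$'" % key)
        if "." in key:
            raise InvalidName("key %r must not contain '.'" % key)
        if isinstance(value, dict):
            check_keys(value)     # assumption: nested documents are checked too


def is_valid(document):
    """Return True if check_keys accepts the document, False otherwise."""
    try:
        check_keys(document)
    except InvalidName:
        return False
    return True


ok = is_valid({"name": "ann", "address": {"city": "Oslo"}})
bad_dollar = is_valid({"$name": "ann"})
bad_dot = is_valid({"first.name": "ann"})
```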

map_reduce(map, reduce, full_response=False, **kwargs)

Perform a map/reduce operation on this collection.

If full_response is False (default) returns a Collection instance containing the results of the operation. Otherwise, returns the full response from the server to the map reduce command.

Parameters:
  • map: map function (as a JavaScript string)

  • reduce: reduce function (as a JavaScript string)

  • full_response (optional): if True, return full response to this command - otherwise just return the result collection

  • **kwargs (optional): additional arguments to the map reduce command may be passed as keyword arguments to this helper method, e.g.:

    >>> db.test.map_reduce(map, reduce, limit=2)
    

Note

Requires server version >= 1.1.1

New in version 1.2.
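
For readers new to the pattern, the flow of a map/reduce pass can be sketched in plain Python. This is an in-memory illustration, not the server's algorithm: the real map and reduce arguments are JavaScript strings, while the hypothetical map_reduce below takes Python callables, always calls reduce_fn for every key, and returns a list rather than a Collection.

```python
from collections import defaultdict


def map_reduce(documents, map_fn, reduce_fn):
    """Hypothetical in-memory model of a map/reduce pass."""
    emitted = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):    # map_fn yields (key, value) pairs
            emitted[key].append(value)
    # One output document per distinct key, in sorted key order.
    return [{"_id": key, "value": reduce_fn(key, values)}
            for key, values in sorted(emitted.items())]


posts = [
    {"tags": ["db", "python"]},
    {"tags": ["python"]},
    {"tags": ["db", "nosql", "python"]},
]


def emit_tags(doc):
    for tag in doc["tags"]:
        yield tag, 1


def sum_values(key, values):
    return sum(values)


counts = map_reduce(posts, emit_tags, sum_values)
```

Here counts holds one entry per tag, with the number of posts carrying that tag as its value.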

name

The name of this Collection.

Changed in version 1.3: name is now a property rather than a method. The name() method is deprecated.

options()

Get the options set on this collection.

Returns a dictionary of options and their values - see pymongo.database.Database.create_collection for more information on the options dictionary. Returns an empty dictionary if the collection has not been created yet.

remove(spec_or_object_id=None, safe=False)

Remove a document(s) from this collection.

Warning

Calls to remove() should be performed with care, as removed data cannot be restored.

Raises TypeError if spec_or_object_id is not an instance of (dict, ObjectId). If safe is True then the remove operation will be checked for errors, raising OperationFailure if one occurred. Safe removes wait for a response from the database, while normal removes do not.

If no spec_or_object_id is given all documents in this collection will be removed. This is not equivalent to calling drop_collection(), however, as indexes will not be removed.

Parameters:
  • spec_or_object_id (optional): a dict or SON instance specifying which documents should be removed; or an instance of ObjectId specifying the value of the _id field for the document to be removed
  • safe (optional): check that the remove succeeded?

Changed in version 1.2: The spec_or_object_id parameter is now optional. If it is not specified all documents in the collection will be removed.

New in version 1.1: The safe parameter.
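
The three ways of calling remove() (with an ObjectId, with a spec, or with no argument) can be modelled over a plain list. Everything here is hypothetical: ObjectId is a tiny local stand-in for the real class, the spec is matched by exact equality only, and the return value (a count of removed documents) is a convenience of the sketch, not something the text above promises.

```python
class ObjectId:
    """Local stand-in for the driver's ObjectId, wrapping any hashable value."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, ObjectId) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


def remove(documents, spec_or_object_id=None):
    """Hypothetical model of remove(); returns how many documents were dropped."""
    if spec_or_object_id is None:
        spec = {}                                  # no argument: match everything
    elif isinstance(spec_or_object_id, ObjectId):
        spec = {"_id": spec_or_object_id}          # ObjectId: match on _id
    elif isinstance(spec_or_object_id, dict):
        spec = spec_or_object_id
    else:
        raise TypeError("spec_or_object_id must be a dict or an ObjectId")
    kept = [d for d in documents
            if any(k not in d or d[k] != v for k, v in spec.items())]
    removed = len(documents) - len(kept)
    documents[:] = kept                            # modify the list in place
    return removed


docs = [
    {"_id": ObjectId(1), "kind": "a"},
    {"_id": ObjectId(2), "kind": "b"},
    {"_id": ObjectId(3), "kind": "a"},
    {"_id": ObjectId(4), "kind": "c"},
]
by_id = remove(docs, ObjectId(2))        # removes the single matching _id
by_spec = remove(docs, {"kind": "a"})    # removes every document of kind "a"
everything = remove(docs)                # removes whatever is left
```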

rename(new_name)

Rename this collection.

If operating in auth mode, client must be authorized as an admin to perform this operation. Raises TypeError if new_name is not an instance of (str, unicode). Raises InvalidName if new_name is not a valid collection name.

Parameters:
  • new_name: new name for this collection
save(to_save, manipulate=True, safe=False)

Save a document in this collection.

If to_save already has an ‘_id’ then an update (upsert) operation is performed and any existing document with that _id is overwritten. Otherwise an ‘_id’ will be added to to_save and an insert operation is performed. Returns the _id of the saved document.

Raises TypeError if to_save is not an instance of dict. If safe is True then the save will be checked for errors, raising OperationFailure if one occurred. Safe saves wait for a response from the database, while normal saves do not.

Parameters:
  • to_save: the SON object to be saved
  • manipulate (optional): manipulate the SON object before saving it
  • safe (optional): check that the save succeeded?
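
The insert-or-overwrite decision made by save() can be modelled with a dict keyed by _id. This is a sketch under stated assumptions, not the driver's code: the save function below is a hypothetical stand-in, and a simple integer counter takes the place of ObjectId generation.

```python
import itertools

_next_id = itertools.count(1)    # stand-in for ObjectId generation


def save(store, to_save):
    """Hypothetical model of save() over a dict keyed by _id."""
    if not isinstance(to_save, dict):
        raise TypeError("to_save must be an instance of dict")
    if "_id" not in to_save:
        to_save["_id"] = next(_next_id)    # insert path: _id is added in place
    # upsert path: any stored document with this _id is overwritten
    store[to_save["_id"]] = dict(to_save)
    return to_save["_id"]


store = {}
doc = {"name": "ann"}
first_id = save(store, doc)      # no _id yet: inserted, _id added to doc
doc["name"] = "ann b."
second_id = save(store, doc)     # _id present: replaces the stored copy
```

Both calls return the same _id, and the store ends up holding a single document with the updated name.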
update(spec, document, upsert=False, manipulate=False, safe=False, multi=False)

Update a document(s) in this collection.

Raises TypeError if either spec or document is not an instance of dict or upsert is not an instance of bool. If safe is True then the update will be checked for errors, raising OperationFailure if one occurred. Safe updates wait for a response from the database, while normal updates do not.

There are many useful update modifiers which can be used when performing updates. For example, here we use the "$set" modifier to modify some fields in a matching document:

>>> db.test.insert({"x": "y", "a": "b"})
ObjectId('...')
>>> list(db.test.find())
[{u'a': u'b', u'x': u'y', u'_id': ObjectId('...')}]
>>> db.test.update({"x": "y"}, {"$set": {"a": "c"}})
>>> list(db.test.find())
[{u'a': u'c', u'x': u'y', u'_id': ObjectId('...')}]

Parameters:
  • spec: a dict or SON instance specifying elements which must be present for a document to be updated
  • document: a dict or SON instance specifying the document to be used for the update or (in the case of an upsert) insert - see docs on MongoDB update modifiers
  • upsert (optional): perform an upsert operation
  • manipulate (optional): manipulate the document before updating?
  • safe (optional): check that the update succeeded?
  • multi (optional): update all documents that match spec, rather than just the first matching document. The default value for multi is currently False, but this might eventually change to True. It is recommended that you specify this argument explicitly for all update operations in order to prepare your code for that change.

New in version 1.1.1: The multi parameter.
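
The effect of multi is easiest to see side by side. The function below is a hypothetical in-memory model, not PyMongo code: it handles only the "$set" modifier, matches spec by exact equality, takes "first matching document" to mean first in list order, and returns a count of changed documents purely for illustration.

```python
def update(documents, spec, document, multi=False):
    """Hypothetical model of update() over a list of dicts; only "$set" is modelled."""
    changed = 0
    for doc in documents:
        if any(k not in doc or doc[k] != v for k, v in spec.items()):
            continue                       # does not match spec
        doc.update(document["$set"])       # apply the "$set" fields in place
        changed += 1
        if not multi:
            break                          # default: only the first match is touched
    return changed


rows = [{"x": "y", "a": "b"}, {"x": "y", "a": "d"}]

only_first = update(rows, {"x": "y"}, {"$set": {"a": "c"}})
after_single = [r["a"] for r in rows]

every_match = update(rows, {"x": "y"}, {"$set": {"a": "z"}}, multi=True)
after_multi = [r["a"] for r in rows]
```

Without multi only the first row changes; with multi=True both rows do, which is why passing the argument explicitly keeps behaviour stable if the default ever changes.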
