PyMongo is thread-safe and even provides built-in connection pooling for threaded applications.
Every Connection instance has built-in connection pooling. By default, each thread gets its own socket reserved on its first operation. Those sockets are held until end_request() is called by that thread.
Calling end_request() allows the socket to be returned to the pool and reused by other threads, so that they do not have to create new sockets. Judicious use of this method is important for applications with many threads, or with long-running threads that make few calls to PyMongo operations.
Alternatively, a Connection created with auto_start_request=False will share sockets (safely) among all threads.
When disconnect() is called by any thread, all sockets are closed. PyMongo will create new sockets as needed.
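As a rough sketch of the end_request() pattern described above, the helper below runs one query and then releases the calling thread's socket. The helper name and its parameters are hypothetical; it assumes only that the object passed in behaves like a Connection, meaning it can be indexed by database and collection name and exposes end_request().

```python
def count_and_release(connection, db_name, collection_name):
    """Count documents, then return this thread's socket to the pool.

    `connection` is assumed to be a pymongo Connection (or any object
    exposing the same interface). The function name is hypothetical.
    """
    try:
        return connection[db_name][collection_name].count()
    finally:
        # Release the socket even if the query raised, so that other
        # threads can reuse it instead of opening a new one.
        connection.end_request()
```

Putting end_request() in a finally clause is a design choice for this sketch: a long-running thread that hits an exception would otherwise keep its socket reserved.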
Starting with version 2.2 PyMongo supports Python 3.x where x >= 1. See the Python 3 FAQ for details.
The only async framework that PyMongo fully supports is Gevent.
Currently there is no great way to use PyMongo in conjunction with Tornado or Twisted. PyMongo provides built-in connection pooling, so some of the benefits of those frameworks can be achieved just by writing multi-threaded code that shares a Connection.
There are asynchronous MongoDB drivers in Python: AsyncMongo for Tornado and TxMongo for Twisted. Compared to PyMongo, however, these projects are less stable, lack features, and are less actively maintained.
It is possible to use PyMongo with Tornado, if some precautions are taken to avoid blocking the event loop:
Cursors in MongoDB can time out on the server if they’ve been open for a long time without any operations being performed on them. This can lead to an OperationFailure exception being raised when attempting to iterate the cursor.
MongoDB doesn’t support custom timeouts for cursors, but cursor timeouts can be turned off entirely. Pass timeout=False to find().
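A minimal sketch of passing timeout=False to find(), for a loop whose per-document work may be slow. The function name and the handle callback are hypothetical, and `collection` is assumed to be a PyMongo Collection of the version this FAQ describes.

```python
def process_all(collection, handle):
    """Apply `handle` to every document with the cursor timeout disabled.

    Returns the number of documents processed. The function name and
    the `handle` callback are hypothetical.
    """
    # timeout=False asks the server not to time out the idle cursor.
    cursor = collection.find(timeout=False)
    processed = 0
    for document in cursor:
        handle(document)  # may take a long time per document
        processed += 1
    return processed
```

Because such a cursor is not timed out by the server, it is worth making sure the loop runs to completion so the cursor is exhausted rather than left open.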
MongoDB only supports IEEE 754 floating-point numbers, the same as the Python float type. The only way PyMongo could store Decimal instances would be to convert them to this standard, so you’d really only be storing floats anyway. We force users to do this conversion explicitly so that they are aware that it is happening.
The database representation is 9.99 as an IEEE floating point (which is common to MongoDB and Python as well as most other modern languages). The problem is that 9.99 cannot be represented exactly with a double precision floating point - this is true in some versions of Python as well:
>>> 9.99
9.9900000000000002
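The explicit Decimal conversion and the inexact representation of 9.99 described above can be illustrated with the standard library alone. The integer-cents and string fields are assumptions about how an application might keep exact values; they are not PyMongo features.

```python
from decimal import Decimal

price = Decimal("9.99")

# Option 1: explicit, lossy conversion to an IEEE 754 double.
as_float = float(price)

# Option 2 (an application-level convention, assumed here): keep the
# exact value as an integer number of cents or as a string.
as_cents = int(price * 100)
as_text = str(price)

document = {"price": as_float, "price_cents": as_cents, "price_text": as_text}

# Decimal(float) exposes the exact binary value actually stored,
# which is close to, but not equal to, 9.99.
print(Decimal(as_float))
```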
This request has come up a number of times but we’ve decided not to implement anything like this. The relevant jira case has some information about the decision, but here is a brief summary:
Prior to PyMongo version 1.7, the correct way is to only save naive datetime instances, and to save all dates as UTC. In versions >= 1.7, the driver will automatically convert aware datetimes to UTC before saving them. By default, datetimes retrieved from the server (no matter what version of the driver you’re using) will be naive and represent UTC. In newer versions of the driver you can set the Connection tz_aware parameter to True, which will cause all datetime instances returned from that Connection to be aware (UTC). This setting is recommended, as it can force application code to handle timezones properly.
Be careful not to save naive datetime instances that are not UTC (e.g. the result of calling datetime.datetime.now()).
Something like pytz can be used to convert dates to localtime after retrieving them from the database.
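The round trip described above (convert to UTC before saving, convert to local time after retrieval) might look like the following sketch. It uses a fixed UTC-5 offset from the standard library's datetime.timezone (Python 3.2+) as a stand-in for a real pytz zone; the helper names and the chosen offset are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical local zone: a fixed UTC-5 offset standing in for a pytz zone.
LOCAL = timezone(timedelta(hours=-5))

def to_naive_utc(aware):
    """Convert an aware datetime to a naive datetime representing UTC."""
    return aware.astimezone(timezone.utc).replace(tzinfo=None)

def to_local(naive_utc, tz=LOCAL):
    """Treat a naive datetime from the database as UTC, then localize it."""
    return naive_utc.replace(tzinfo=timezone.utc).astimezone(tz)

local_time = datetime(2012, 5, 1, 9, 30, tzinfo=LOCAL)
stored = to_naive_utc(local_time)   # what would be saved
restored = to_local(stored)         # what the application displays
```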
PyMongo doesn’t support saving datetime.date instances, since there is no BSON type for dates without times. Rather than having the driver enforce a convention for converting datetime.date instances to datetime.datetime instances for you, any conversion should be performed in your client code.
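One possible client-side convention, sketched below, is to store a date as midnight of that day and drop the time component when reading it back. The midnight convention and both helper names are assumptions, not something the driver prescribes.

```python
from datetime import date, datetime, time

def date_to_datetime(d):
    """Represent a date as midnight of that day (an assumed convention)."""
    return datetime.combine(d, time.min)

def datetime_to_date(dt):
    """Recover the date from a datetime stored with the convention above."""
    return dt.date()

birthday = date(2012, 5, 1)
stored = date_to_datetime(birthday)
```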
Django is a popular Python web framework. Django includes an ORM, django.db. Currently, there’s no official MongoDB backend for Django.
django-mongodb-engine is an unofficial, actively developed MongoDB backend that supports Django aggregations, (atomic) updates, embedded objects, Map/Reduce and GridFS. It allows you to use most of Django’s built-in features, including the ORM, admin, authentication, site and session frameworks and caching through django-mongodb-cache.
However, it’s easy to use MongoDB (and PyMongo) from Django without using a Django backend. Certain features of Django that require django.db (admin, authentication and sessions) will not work using just MongoDB, but most of what Django provides can still be used.
We have written a demo Django + MongoDB project. The README for that project describes some of what you need to do to use MongoDB from Django. The main point is that your persistence code will go directly into your views, rather than being defined in separate models. The README also gives instructions for how to change settings.py to disable the features that won’t work with MongoDB.
One project which should make working with MongoDB and Django easier is mango. Mango is a set of MongoDB backends for Django sessions and authentication (bypassing django.db entirely).
mod_wsgi is a popular Apache module used for hosting Python applications conforming to the WSGI specification. There is a potential issue when deploying PyMongo applications with mod_wsgi involving PyMongo’s C extension and mod_wsgi’s multiple sub interpreters.
One tricky issue that we’ve seen when deploying PyMongo applications with mod_wsgi is documented here, in the Multiple Python Sub Interpreters section. When running PyMongo with the C extension enabled it is possible to see strange failures when encoding due to the way mod_wsgi handles module reloading with multiple sub interpreters. There are several possible ways to work around this issue:
The json module won’t work out of the box with all documents from PyMongo as PyMongo supports some special types (like ObjectId and DBRef) that are not supported in JSON. We’ve added some utilities for working with json and simplejson in the json_util module.
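To show the mechanism involved, the sketch below uses the json module's default and object_hook parameters with a stand-in class. FakeObjectId and both hook functions are hypothetical and written only for illustration; with PyMongo installed, the functions provided by json_util would be passed to the same two parameters instead.

```python
import json

class FakeObjectId(object):
    """Hypothetical stand-in for ObjectId, for illustration only."""
    def __init__(self, hex_string):
        self.hex_string = hex_string
    def __str__(self):
        return self.hex_string

def default(obj):
    """Fallback encoder: called by json for types it cannot serialize."""
    if isinstance(obj, FakeObjectId):
        return {"$oid": str(obj)}
    raise TypeError("%r is not JSON serializable" % (obj,))

def object_hook(dct):
    """Decoder hook: turn {"$oid": ...} dictionaries back into ids."""
    if "$oid" in dct:
        return FakeObjectId(dct["$oid"])
    return dct

document = {"_id": FakeObjectId("4f8d1e2b3c4a5b6c7d8e9f00"), "x": 1}
encoded = json.dumps(document, default=default, sort_keys=True)
decoded = json.loads(encoded, object_hook=object_hook)
```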
On Unix systems, dates are represented as seconds from 1 January 1970 and usually stored in the C time_t type. On most 32-bit operating systems time_t is a signed 4 byte integer, which means it can’t handle dates after 19 January 2038; this is known as the year 2038 problem. Neither MongoDB nor Python uses time_t to represent dates internally, so they do not suffer from this problem. However, Python’s datetime.datetime.fromtimestamp(), which is used by PyMongo’s Python implementation of bson, does, which means that implementation is susceptible. Therefore, on 32-bit systems you may get an error retrieving dates after 2038 from MongoDB using PyMongo with the Python version of bson.
The C implementation of bson also used to suffer from this problem but it was fixed in commit 566bc9fb7be6f9ab2604 (10 May 2010).
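For application code that converts timestamps itself, one way to sidestep fromtimestamp() and its reliance on time_t is to add a timedelta to a fixed epoch. This is a sketch of that general technique, not a description of how either bson implementation works; the constant and function names are hypothetical.

```python
from datetime import datetime, timedelta

# Naive datetime representing the Unix epoch in UTC.
EPOCH_NAIVE = datetime(1970, 1, 1)

def millis_to_datetime(millis):
    """Convert milliseconds since the Unix epoch to a naive UTC datetime.

    Uses pure datetime arithmetic, so it does not go through the
    platform's time_t.
    """
    return EPOCH_NAIVE + timedelta(milliseconds=millis)

# One second past the largest value a signed 32-bit time_t can hold.
after_2038 = millis_to_datetime(2**31 * 1000)
```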