3.3. Mango Queries

In addition to map/reduce views, CouchDB supports an expressive query system called Mango.

Mango consists of two major concepts:

  • Selectors, which are the queries, and are passed to the mango endpoints

  • Indexes, which are a specialization of design docs used in mango queries

There are a few important endpoints for interacting with these concepts:

3.3.1. Selectors

Selectors are expressed as a JSON object describing documents of interest. Within this structure, you can apply conditional logic using specially named fields.

Whilst selectors have some similarities with MongoDB query documents, these arise from a similarity of purpose and do not necessarily extend to commonality of function or result.

Warning

While CouchDB will happily store just about anything JSON, Mango has limitations about what it can work with:

  • Empty field names ("") cannot be queried (“One or more conditions is missing a field name.”).

  • Field names starting with $ must be escaped with \ (eg, \$foo) (“Invalid operator: $”).

3.3.1.1. Selector Basics

Elementary selector syntax requires you to specify one or more fields, and the corresponding values required for those fields. This selector matches all documents whose "director" field has the value "Lars von Trier".

{
    "director": "Lars von Trier"
}

A simple selector, inspecting specific fields:

"selector": {
    "title": "Live And Let Die"
},
"fields": [
    "title",
    "cast"
]

3.3.1.2. Selector with 2 fields

This selector matches any document with a name field containing "Paul", and that also has a location field with the value "Boston".

{
    "name": "Paul",
    "location": "Boston"
}

3.3.1.3. Subfields

A more complex selector enables you to specify the values for field of nested objects, or subfields. For example, you might use a standard JSON structure for specifying a field and subfield.

Example of a field and subfield selector, using a standard JSON structure:

{
    "imdb": {
        "rating": 8
    }
}

An abbreviated equivalent uses a dot notation to combine the field and subfield names into a single name.

{
    "imdb.rating": 8
}

3.3.1.4. Operators

Operators are identified by the use of a dollar sign ($) prefix in the name field.

There are two core types of operators in the selector syntax:

  • Combination operators

  • Condition operators

In general, combination operators are applied at the topmost level of selection. They are used to combine conditions, or to create combinations of conditions, into one selector.

Every explicit operator has the form:

{
    "$operator": argument
}

A selector without an explicit operator is considered to have an implicit operator. The exact implicit operator is determined by the structure of the selector expression.

3.3.1.5. Implicit Operators

There are two implicit operators:

  • Equality

  • And

In a selector, any field containing a JSON value, but that has no operators in it, is considered to be an equality condition. The implicit equality test applies also for fields and subfields.

Any JSON object that is not the argument to a condition operator is an implicit $and operator on each field.

In the below example, we use an operator to match any document, where the "year" field has a value greater than 2010:

{
    "year": {
        "$gt": 2010
    }
}

In this next example, there must be a field "director" in a matching document, and the field must have a value exactly equal to "Lars von Trier".

{
    "director": "Lars von Trier"
}

You can also make the equality operator explicit.

{
    "director": {
        "$eq": "Lars von Trier"
    }
}

In the next example using subfields, the required field "imdb" in a matching document must also have a subfield "rating" and the subfield must have a value equal to 8.

Example of implicit operator applied to a subfield test:

{
    "imdb": {
        "rating": 8
    }
}

Again, you can make the equality operator explicit.

{
    "imdb": {
        "rating": { "$eq": 8 }
    }
}

An example of the $eq operator used with database indexed on the field "year":

{
  "selector": {
    "year": {
      "$eq": 2001
    }
  },
  "sort": [
    "year"
  ],
  "fields": [
    "year"
  ]
}

In this example, the field "director" must be present and contain the value "Lars von Trier" and the field "year" must exist and have the value 2003.

{
    "director": "Lars von Trier",
    "year": 2003
}

You can make both the $and operator and the equality operator explicit.

Example of using explicit $and and $eq operators:

{
    "$and": [
        {
            "director": {
                "$eq": "Lars von Trier"
            }
        },
        {
            "year": {
                "$eq": 2003
            }
        }
    ]
}

It is entirely up to you whether you use the implicit or explicit form. The implicit form is a little easier to write if you do that by hand. The explicit form is a little easier if you programatically contract your selectors. The end result will be the same.

3.3.1.6. Explicit Operators

All operators, apart from ‘Equality’ and ‘And’, must be stated explicitly.

3.3.1.7. Combination Operators

Combination operators are used to combine selectors. In addition to the common boolean operators found in most programming languages, there are three combination operators ($all, $elemMatch, and $allMatch) that help you work with JSON arrays and one that works with JSON maps ($keyMapMatch).

A combination operator takes a single argument. The argument is either another selector, or an array of selectors.

The list of combination operators:

Operator

Argument

Purpose

$and

Array

Matches if all the selectors in the array match.

$or

Array

Matches if any of the selectors in the array match. All selectors must use the same index.

$not

Selector

Matches if the given selector does not match.

$nor

Array

Matches if none of the selectors in the array match.

$all

Array

Matches an array value if it contains all the elements of the argument array.

$elemMatch

Selector

Matches and returns all documents that contain an array field with at least one element that matches all the specified query criteria.

$allMatch

Selector

Matches and returns all documents that contain an array field with all its elements matching all the specified query criteria.

$keyMapMatch

Selector

Matches and returns all documents that contain a map that contains at least one key that matches all the specified query criteria.

$text

String

Perform a text search

3.3.1.7.1. The $and operator

$and operator used with two fields:

{
  "selector": {
    "$and": [
      {
        "title": "Total Recall"
      },
      {
        "year": {
          "$in": [1984, 1991]
        }
      }
    ]
  },
  "fields": [
      "year",
      "title",
      "cast"
  ]
}

The $and operator matches if all the selectors in the array match. Below is an example using the primary index (_all_docs):

{
    "$and": [
        {
            "_id": { "$gt": null }
        },
        {
            "year": {
                "$in": [2014, 2015]
            }
        }
    ]
}

3.3.1.7.2. The $or operator

The $or operator matches if any of the selectors in the array match. Below is an example used with an index on the field "year":

{
    "year": 1977,
    "$or": [
        { "director": "George Lucas" },
        { "director": "Steven Spielberg" }
    ]
}

3.3.1.7.3. The $not operator

The $not operator matches if the given selector does not match. Below is an example used with an index on the field "year":

{
    "year": {
        "$gte": 1900,
        "$lte": 1903
    },
    "$not": {
        "year": 1901
    }
}

3.3.1.7.4. The $nor operator

The $nor operator matches if the given selector does not match. Below is an example used with an index on the field "year":

{
    "year": {
        "$gte": 1900.
        "$lte": 1910
    },
    "$nor": [
        { "year": 1901 },
        { "year": 1905 },
        { "year": 1907 }
    ]
}

3.3.1.7.5. The $all operator

The $all operator matches an array value if it contains all the elements of the argument array. Below is an example used with the primary index (_all_docs):

{
    "_id": {
        "$gt": null
    },
    "genre": {
        "$all": ["Comedy","Short"]
    }
}

3.3.1.7.6. The $elemMatch operator

The $elemMatch operator matches and returns all documents that contain an array field with at least one element matching the supplied query criteria. Below is an example used with the primary index (_all_docs):

{
    "_id": { "$gt": null },
    "genre": {
        "$elemMatch": {
            "$eq": "Horror"
        }
    }
}

3.3.1.7.7. The $allMatch operator

The $allMatch operator matches and returns all documents that contain an array field with all its elements matching the supplied query criteria. Below is an example used with the primary index (_all_docs):

{
    "_id": { "$gt": null },
    "genre": {
        "$allMatch": {
            "$eq": "Horror"
        }
    }
}

3.3.1.7.8. The $keyMapMatch operator

The $keyMapMatch operator matches and returns all documents that contain a map that contains at least one key that matches all the specified query criteria. Below is an example used with the primary index (_all_docs):

{
    "_id": { "$gt": null },
    "cameras": {
        "$keyMapMatch": {
            "$eq": "secondary"
        }
    }
}

3.3.1.7.9. The $text operator

The $text operator performs a text search using either a search or nouveau index. The specifics of the query follow either search syntax or nouveau syntax (which both use Lucene and implement the same syntax).

{
    "_id": { "$gt": null },
    "$text": "director:George"
}

Warning

Queries cannot contain more than one $text

3.3.1.8. Condition Operators

Condition operators are specific to a field, and are used to evaluate the value stored in that field. For instance, the basic $eq operator matches when the specified field contains a value that is equal to the supplied argument.

Note

For a condition operator to function correctly, the field must exist in the document for the selector to match. As an example, $ne means the specified field must exist, and is not equal to the value of the argument.

The basic equality and inequality operators common to most programming languages are supported. Strict type matching is used.

In addition, some ‘meta’ condition operators are available. Some condition operators accept any valid JSON content as the argument. Other condition operators require the argument to be in a specific JSON format.

Operator type

Operator

Argument

Purpose

(In)equality

$lt

Any JSON

The field is less than the argument.

$lte

Any JSON

The field is less than or equal to the argument.

$eq

Any JSON

The field is equal to the argument.

$ne

Any JSON

The field is not equal to the argument.

$gte

Any JSON

The field is greater than or equal to the argument.

$gt

Any JSON

The field is greater than the argument.

Object

$exists

Boolean

Check whether the field exists or not, regardless of its value.

$type

String

Check the document field’s type. Valid values are "null", "boolean", "number", "string", "array", and "object".

Array

$in

Array of JSON values

The document field must exist in the list provided.

$nin

Array of JSON values

The document field not must exist in the list provided.

$size

Integer

Special condition to match the length of an array field in a document. Non-array fields cannot match this condition.

Miscellaneous

$mod

[Divisor, Remainder]

Divisor is a non-zero integer, Remainder is any integer. Non-integer values result in a 404. Matches documents where field % Divisor == Remainder is true, and only when the document field is an integer.

$regex

String

A regular expression pattern to match against the document field. Only matches when the field is a string value and matches the supplied regular expression. The matching algorithms are based on the Perl Compatible Regular Expression (PCRE) library. For more information about what is implemented, see the Erlang Regular Expression.

$beginsWith

String

Matches where the document field begins with the specified prefix (case-sensitive). If the document field contains a non-string value, the document is not matched.

Warning

Regular expressions do not work with indexes, so they should not be used to filter large data sets. They can, however, be used to restrict a partial index.

3.3.1.9. Creating Selector Expressions

We have seen examples of combining selector expressions, such as using explicit $and and $eq operators.

In general, whenever you have an operator that takes an argument, that argument can itself be another operator with arguments of its own. This enables us to build up more complex selector expressions.

However, only operators that define a contiguous range of values such as $eq, $gt, $gte, $lt, $lte, and $beginsWith (but not $ne) can be used as the basis of a query that can make efficient use of a json index. You should include at least one of these in a selector, or consider using a text index if greater flexibility is required.

For example, if you try to perform a query that attempts to match all documents that have a field called afieldname containing a value that begins with the letter A, this will trigger a warning because no index could be used and the database performs a full scan of the primary index:

Request

POST /movies/_find HTTP/1.1
Accept: application/json
Content-Type: application/json
Content-Length: 112
Host: localhost:5984

{
    "selector": {
        "afieldname": {"$regex": "^A"}
    }
}

Response:

HTTP/1.1 200 OK
Cache-Control: must-revalidate
Content-Type: application/json
Date: Thu, 01 Sep 2016 17:25:51 GMT
Server: CouchDB (Erlang OTP)
Transfer-Encoding: chunked

{
    "warning":"no matching index found, create an index to optimize query time",
    "docs":[
    ]
}

Warning

It is always recommended that you create an appropriate index when deploying in production.

Most selector expressions work exactly as you would expect for the given operator. But it is not always the case: for example, comparison of strings is done with ICU and can can give surprising results if you were expecting ASCII ordering. See Views Collation for more details.

3.3.2. Indexes

Indexes are like indexes in most other database systems: they spend a little extra space to improve the performance of queries.

They primarily consist of a list of fields to index, but can also contain a selector to create a partial index.

Note

Mango indexes have a type, currently json, text, nouveau. The majority of this document covers json indexes. text and nouveau are related to the Search and Nouveau systems, respectively. (See Text Indexes.)

You will also occasionally find reference to the special index type. This represents synthetic indexes produced by CouchDB itself and refers exclusively to _all_docs.

3.3.2.1. Index Definitions

Index definitions are JSON objects with the following fields:

  • ddoc (string): ID of the design document the index belongs to. This ID can be used to retrieve the design document containing the index, by making a GET request to /{db}/ddoc, where ddoc is the value of this field.

  • name (string): Name of the index.

  • partitioned (boolean): Partitioned (true) or global (false) index.

  • type (string): Type of the index. Can be "json", "text", "nouveau", or sometimes "special".

  • def/index (object): Definition of the index, depending on the type (see below). Which name is used depends on the context.

3.3.2.2. JSON Indexes

JSON Indexes are you standard structural indexes, used by the majority of selector operators.

Their definition consists of:

  • fields (array): Array of field names following the sort syntax. Nested fields are also allowed, e.g. “person.name”.

  • partial_filter_selector (object): A selector to apply to documents at indexing time, creating a partial index. Optional

Example:

{
    "type" : "json",
    "index": {
        "fields": ["foo"]
    }
}

3.3.2.3. Partial Indexes

Partial indexes allow documents to be filtered at indexing time, potentially offering significant performance improvements for query selectors that do not map cleanly to a range query on an index.

Let’s look at an example query:

{
    "selector": {
        "status": {
            "$ne": "archived"
        },
        "type": "user"
    }
}

Without a partial index, this requires a full index scan to find all the documents of "type":"user" that do not have a status of "archived". This is because a normal index can only be used to match contiguous rows, and the "$ne" operator cannot guarantee that.

To improve response times, we can create an index which excludes documents where "status": { "$ne": "archived" } at index time using the "partial_filter_selector" field:

POST /db/_index HTTP/1.1
Content-Type: application/json
Content-Length: 144
Host: localhost:5984

{
  "index": {
    "partial_filter_selector": {
      "status": {
        "$ne": "archived"
      }
    },
    "fields": ["type"]
  },
  "ddoc" : "type-not-archived",
  "type" : "json"
}

Partial indexes are not currently used by the query planner unless specified by a "use_index" field, so we need to modify the original query:

{
    "selector": {
        "status": {
            "$ne": "archived"
        },
        "type": "user"
    },
    "use_index": "type-not-archived"
}

Technically, we do not need to include the filter on the "status" field in the query selector - the partial index ensures this is always true - but including it makes the intent of the selector clearer and will make it easier to take advantage of future improvements to query planning (e.g. automatic selection of partial indexes).

Note

An index with fields is only used, when the selector includes all of the fields indexed. For instance, if an index contains ["a", "b"] but the selector only requires field ["a"] to exist in the matching documents, the index would not be valid for the query. All indexes, however, can be treated as if they include the special fields _id and _rev. They never need to be specified in the query selector.

3.3.2.4. Text Indexes

Mango can also interact with the Search and Nouveau search systems, using the $text selector and the appropriate index. These indexes can be queried using either $text or GET /{db}/_design/{ddoc}/_search/{index} / GET /{db}/_design/{ddoc}/_nouveau/{index}.

Example index:

{
    "type": "nouveau",
    "index": {
        "fields": [
            {"name": "foo", "type": "string"},
            {"name": "bar", "type": "number"},
            {"name": "baz", "type": "string"},
        ],
        "default_analyzer": "keyword",
    }
}

A Text or Nouveau index definition consists of:

  • fields: The list of fields to index. "all_fields" or list of objects:

    • name (string): not blank

    • type (string): one of "text", "string", "number", "boolean"

  • default_analyzer (string): Analyzer to use, defaults to "keyword" Optional

  • default_field: Enables the “default field” index, boolean or object of enabled and analyzer Optional

  • partial_filter_selector (object): A selector, causing this to be a partial index Optional

  • selector (object): A selector Optional

3.3.2.5. Indexes and Design Documents

Ultimately, indexes are stored using design documents, using the same view systems under the hood. If you go looking, you can find the design documents backing mango indexes. However, exactly how mango indexes map to design documents is an implementation detail, and users are encouraged to manage their indexes using the /{db}/_index family of endpoints.