
What is CouchDB?
The first sentence of CouchDB's definition (as defined by http://couchdb.apache.org/) is as follows:
CouchDB is a document database server, accessible through the RESTful JSON API.
Let's dissect this sentence to fully understand what it means. Let's start with the term database server.
Database server
CouchDB employs a document-oriented database management system that serves a flat collection of documents with no schema, grouping, or hierarchy. This is a concept that the NoSQL movement has popularized, and it is a big departure from relational databases (such as MySQL), where you would expect to see tables, relationships, and foreign keys. Most developers have worked on a project where they had to force a relational database schema onto data that really didn't require the rigidity of tables and complex relationships. This is where CouchDB does things differently; it stores all of the data for an object in a self-contained document with no set schema. The following diagram will help to illustrate this:

In the previous example, we might want to allow many users to belong to many groups. To handle this in a relational database (such as MySQL), we would create a users table, a groups table, and a link table called users_groups that maps many users to many groups. This practice is common to most web applications.
Now look at the CouchDB documents. There are no tables or link tables, just documents. These documents contain all of the data pertaining to a single object.
Note
This diagram is very simplified. If we wanted to create more logic around the groups in CouchDB, we would have to create group documents, with a simple relationship between the user documents and group documents. We'll touch on how to handle this type of relationship as we get deeper into the book.
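To make the contrast concrete, here is a minimal Python sketch (not part of the original chapter) that models the same data both ways. An in-memory SQLite database stands in for MySQL, and the column names, sample values, and the groups field on the document are all illustrative assumptions.

```python
import json
import sqlite3

# Relational version: three tables, with users_groups as the link table.
# SQLite (in memory) stands in for MySQL; all names and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE "groups" (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE users_groups (
        user_id INTEGER REFERENCES users(id),
        group_id INTEGER REFERENCES "groups"(id)
    );
    INSERT INTO users VALUES (1, 'Tim');
    INSERT INTO "groups" VALUES (1, 'admins'), (2, 'authors');
    INSERT INTO users_groups VALUES (1, 1), (1, 2);
""")

# Finding a user's groups requires joining through the link table.
rows = conn.execute("""
    SELECT g.name
    FROM users u
    JOIN users_groups ug ON ug.user_id = u.id
    JOIN "groups" g ON g.id = ug.group_id
    WHERE u.name = 'Tim'
    ORDER BY g.name
""").fetchall()
relational_groups = [name for (name,) in rows]

# Document version: the user and the memberships live in one object.
user_doc = {"name": "Tim", "groups": ["admins", "authors"]}
document_groups = sorted(user_doc["groups"])

print(relational_groups)
print(document_groups)
print(json.dumps(user_doc))
```

Both approaches yield the same list of groups; the difference is that the document needs no join to get there.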
We saw the term document quite a bit in this section. So let's dig further into what documents are and how CouchDB uses them.
Documents
To illustrate how you might use documents, first imagine that you are physically filling out the paper form of a job application. This form has information about you, your address, and past addresses. It also has information about many of your past jobs, education, certifications, and much more. A document would save all of this data exactly in the way you would see it in the physical form - all in one place, without any unnecessary complexity.
In CouchDB, documents are stored as JSON objects that contain key-value pairs. Each document has reserved fields for metadata such as its ID (_id), revision (_rev), and deleted flag (_deleted). Besides the reserved fields, documents are 100 percent schema-less, meaning that each document can be formatted and treated independently, with as many different variations as you might need.
Let's take a look at an example of what a CouchDB document might look like for a blog post:
{
  "_id": "431f956fa44b3629ba924eab05000553",
  "_rev": "1-c46916a8efe63fb8fec6d097007bd1c6",
  "title": "Why I like Chicken",
  "author": "Tim Juravich",
  "tags": [
    "Chicken",
    "Grilled",
    "Tasty"
  ],
  "body": "I like chicken, especially when it's grilled."
}
The first thing you might notice is the strange markup of the document, which is JavaScript Object Notation (JSON). JSON is a lightweight data-interchange format based on JavaScript syntax and is extremely portable. CouchDB uses JSON for all of its communication, so you'll get very familiar with it through the course of this book.
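Because JSON is so portable, most languages can read a document like this one directly. As a quick sketch (in Python, which is our choice for illustration and not something this chapter itself relies on), here is the blog post document parsed with the standard json module. The second, differently shaped comment document is a hypothetical example, added only to show what schema-less looks like in practice.

```python
import json

raw = """
{
  "_id": "431f956fa44b3629ba924eab05000553",
  "_rev": "1-c46916a8efe63fb8fec6d097007bd1c6",
  "title": "Why I like Chicken",
  "author": "Tim Juravich",
  "tags": ["Chicken", "Grilled", "Tasty"],
  "body": "I like chicken, especially when it's grilled."
}
"""

post = json.loads(raw)

# Simple key-value pairs come back as plain strings.
print(post["title"])

# The embedded array becomes a list.
print(post["tags"])

# Schema-less: another document is free to carry different fields.
# This comment document is hypothetical and not taken from the book.
comment = {"author": "A reader", "text": "Agreed."}
shared_keys = sorted(set(post) & set(comment))
print(shared_keys)
```

The two documents only have the author key in common, and nothing in the format forces them to share more than that.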
The next thing that you might notice is that there is a lot of information in this document. There are key-value pairs that are simple to understand, such as "title", "author", and "body", but you'll also notice that "tags" is an array of strings. CouchDB lets you embed as much information as you want directly into a document. This is a concept that might be new to relational database users who are used to normalized and structured databases.
We mentioned reserved fields earlier on. Let's look at the two reserved fields that you saw in the previous example document: _id and _rev.
_id is the unique identifier of the document. This means that _id is mandatory, and no two documents in the same database can have the same value. If you don't define an _id on creation of a document, CouchDB will choose a unique one for you.
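The following sketch mimics that rule on the client side: keep the _id if one is supplied, otherwise generate one. The with_id helper is hypothetical, and using a random UUID in hex form is our assumption for illustration; it is not a reproduction of how CouchDB generates its own identifiers.

```python
import uuid


def with_id(doc):
    """Return a copy of doc that is guaranteed to have an _id.

    Hypothetical helper for illustration only. CouchDB's own ID
    generation is not reproduced here.
    """
    if "_id" in doc:
        return dict(doc)
    return {"_id": uuid.uuid4().hex, **doc}


named = with_id({"_id": "my-first-post", "title": "Hello"})
generated = with_id({"title": "Hello again"})

print(named["_id"])           # the supplied value is kept
print(len(generated["_id"]))  # a 32-character hex string was generated
```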
_rev is the revision of the document and is the field that drives CouchDB's version control system. Each time you update an existing document, you must supply its current revision so that CouchDB can tell whether you are working from the newest version. This is required because CouchDB does not use a locking mechanism: if two people update the same document at the same time, the first one to save wins, and the second save is rejected as a conflict. One of the unique things about CouchDB's revision system is that each time a document is saved, the original is not overwritten. Instead, a new revision is created with the new data, and the previous revisions are kept in their original form. Old revisions remain available until the database is compacted or some other cleanup action occurs.
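To see why the revision matters when there are no locks, consider this toy in-memory model of the idea. It is a sketch only: ToyStore and ConflictError are hypothetical names, and the way the revision string is computed (a counter plus a truncated SHA-256 of the body) is an assumption that merely imitates the look of a _rev value, not CouchDB's actual algorithm.

```python
import hashlib
import json


class ConflictError(Exception):
    """Raised when a save carries a stale or missing revision."""


class ToyStore:
    """In-memory stand-in for a document store with revision checks."""

    def __init__(self):
        self._docs = {}  # _id -> list of revisions, oldest first

    def save(self, doc):
        history = self._docs.get(doc["_id"], [])
        current_rev = history[-1]["_rev"] if history else None
        # The caller must present the revision that is currently newest.
        if doc.get("_rev") != current_rev:
            raise ConflictError(f"expected {current_rev}, got {doc.get('_rev')}")
        body = {k: v for k, v in doc.items() if k != "_rev"}
        encoded = json.dumps(body, sort_keys=True).encode("utf-8")
        digest = hashlib.sha256(encoded).hexdigest()[:32]
        # "N-hash" imitates the look of _rev; the recipe is an assumption.
        new_rev = f"{len(history) + 1}-{digest}"
        # Older revisions are kept rather than overwritten.
        self._docs[doc["_id"]] = history + [{**body, "_rev": new_rev}]
        return new_rev

    def get(self, doc_id):
        return dict(self._docs[doc_id][-1])


store = ToyStore()
rev1 = store.save({"_id": "post-1", "title": "Draft"})

# Two writers fetch the same revision.
alice = store.get("post-1")
bob = store.get("post-1")

alice["title"] = "Alice's title"
rev2 = store.save(alice)  # the first save wins

bob["title"] = "Bob's title"
try:
    store.save(bob)  # still carries rev1, so it is rejected
    bob_saved = True
except ConflictError:
    bob_saved = False

print(rev1.split("-")[0], rev2.split("-")[0], bob_saved)
```

In this model the second writer has to fetch the latest revision and reapply the change. CouchDB itself signals the same situation with an HTTP 409 Conflict response.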
The last piece of the definition sentence is the RESTful JSON API. So, let's cover that next.
RESTful JSON API
In order to understand REST, let's first define HyperText Transfer Protocol (HTTP). HTTP is the underlying protocol of the World Wide Web; it defines how messages are formatted and transmitted and how services should respond when using a variety of methods. The four main methods, or verbs, are GET, PUT, POST, and DELETE. In order to fully understand how HTTP methods function, let's first define REST.
Representational State Transfer (REST) is a stateless architectural style for accessing addressable resources through HTTP methods. Stateless means that each request contains all of the information necessary to completely understand and process it, and addressable resources means that you can access an object via a URL.
That might not mean a lot in itself, but, by putting all of these ideas together, it becomes a powerful concept. Let's illustrate the power of REST by looking at two examples:

By looking at the table, you can see that each resource is in the form of a URL. The first resource is collection, and the second resource is abc123, which lives inside of collection. Each of these resources responds differently when you pass different methods to them. This is the beauty of REST and HTTP working together.
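As a small sketch of this pairing of resource and method, the following Python snippet builds (but does not send) one request per combination using the standard urllib.request module. The http://localhost host is an assumption for illustration; only the collection and abc123 resource names come from the discussion above.

```python
from urllib.request import Request

# Hypothetical URLs; nothing is sent over the network in this sketch.
COLLECTION = "http://localhost/collection"
ITEM = COLLECTION + "/abc123"

# One request object per (resource, method) combination.
requests = [
    Request(url, method=method)
    for url in (COLLECTION, ITEM)
    for method in ("GET", "PUT", "POST", "DELETE")
]

for req in requests:
    print(req.get_method(), req.full_url)
```

The same URL appears four times in the output, once per method, which is exactly the point: the resource stays fixed and the method decides what happens to it.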
Notice the bold words I used in the table: Read, Update, Create, and Delete. These words together form another concept, which, of course, has its own term: CRUD. The unflattering term CRUD stands for Create, Read, Update, and Delete, and REST uses it to describe what happens to a resource when an HTTP method is applied to that resource's URL. So, if you were to boil all of this down, you would come to the following diagram:

This diagram means:
- In order to CREATE a resource, you can use either the POST or PUT method
- In order to READ a resource, you need to use the GET method
- In order to UPDATE a resource, you need to use the PUT method
- In order to DELETE a resource, you need to use the DELETE method
As you can see, the concept of CRUD makes it clear which method you need to use when you want to perform a specific action.
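The mapping in the list above is small enough to write down as a lookup table. Here is a minimal sketch in Python; the names CRUD_TO_HTTP, methods_for, and actions_for are hypothetical and exist only for this illustration.

```python
# The CRUD-to-HTTP mapping from the list above, as a lookup table.
CRUD_TO_HTTP = {
    "create": ("POST", "PUT"),
    "read": ("GET",),
    "update": ("PUT",),
    "delete": ("DELETE",),
}


def methods_for(action):
    """Return the HTTP methods that can perform a CRUD action."""
    return CRUD_TO_HTTP[action.lower()]


def actions_for(method):
    """Return the CRUD actions that an HTTP method can perform."""
    return [
        action
        for action, methods in CRUD_TO_HTTP.items()
        if method.upper() in methods
    ]


print(methods_for("create"))
print(actions_for("PUT"))
```

Looking the table up in reverse shows that PUT is the one method that serves two actions, create and update.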
Now that we've looked at what REST means, let's move on to the term API, which means Application Programming Interface. While there are a lot of different use cases and concepts of APIs, an API is what we'll use to programmatically interact with CouchDB.
Now that we have defined all of the terms, the RESTful JSON API could be defined as follows: we have the ability to interact with CouchDB by issuing an HTTP request to the CouchDB API with a defined resource, HTTP method, and any additional data. Combining all of these things means that we are using REST. After CouchDB processes our REST request, it returns a JSON-formatted response containing the result of the request.
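Putting the three ingredients together (resource, method, and data), a request and its JSON response might look like the following sketch. The build_request helper and the blog database and my-first-post document names are hypothetical. Port 5984 is CouchDB's default port, and the sample response imitates the shape of a reply to a successful document save, with made-up values. The request is only constructed here, never sent.

```python
import json
from urllib.request import Request


def build_request(base_url, resource, method, data=None):
    """Assemble an HTTP request from a resource, a method, and optional data.

    Hypothetical helper for illustration; it does not contact any server.
    """
    body = None if data is None else json.dumps(data).encode("utf-8")
    headers = {"Content-Type": "application/json"} if body else {}
    return Request(f"{base_url}/{resource}", data=body, headers=headers, method=method)


# 5984 is CouchDB's default port; the database and document names are made up.
req = build_request(
    "http://localhost:5984",
    "blog/my-first-post",
    "PUT",
    {"title": "Why I like Chicken"},
)
print(req.get_method(), req.full_url)

# A sample reply in the shape of a successful save, with illustrative values.
sample_response = (
    b'{"ok": true, "id": "my-first-post", '
    b'"rev": "1-c46916a8efe63fb8fec6d097007bd1c6"}'
)
result = json.loads(sample_response)
print(result["ok"], result["id"], result["rev"])
```

With a CouchDB server running, passing req to urllib.request.urlopen would actually send it; we leave that out so the sketch runs anywhere.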
All of this background knowledge will start to make sense as we play with CouchDB's RESTful JSON API, by going through each of the HTTP methods, one at a time.
We will use curl (which we learned to use in the previous chapter) to explore each of the HTTP methods by issuing raw HTTP requests.