[[document]]
=== What Is a Document?
Most entities or objects in most applications can be serialized into a JSON object, with keys and values.((("objects")))((("JSON", "objects")))((("keys and values"))) A key is the name of a field or property, and a value can ((("values")))be a string, a number, a Boolean, another object, an array of values, or some other specialized type such as a string representing a date or an object representing a geolocation:
[source,js]
--------------------------------------------------
{
    "name": "John Smith",
    "age": 42,
    "confirmed": true,
    "join_date": "2014-06-01",
    "home": {
        "lat": 51.5,
        "lon": 0.1
    },
    "accounts": [
        {
            "type": "facebook",
            "id": "johnsmith"
        },
        {
            "type": "twitter",
            "id": "johnsmith"
        }
    ]
}
--------------------------------------------------
Often, we use the terms object and document interchangeably. However, there is a distinction.((("objects", "documents versus")))((("documents", "objects versus"))) An object is just a JSON object--similar to what is known as a hash, hashmap, dictionary, or associative array. Objects may contain other objects. In Elasticsearch, the term document has a specific meaning. It refers to the top-level, or root object that((("root object"))) is serialized into JSON and stored in Elasticsearch under a unique ID.
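To make the distinction concrete, here is a minimal sketch in Python (the language and the variable names are our choice for illustration; the text does not prescribe them). The root dictionary plays the role of the document, while the nested `home` value is just an object inside it.

```python
import json

# The root object: this is what would be stored as a document.
user = {
    "name": "John Smith",
    "age": 42,
    "confirmed": True,
    "join_date": "2014-06-01",
    "home": {"lat": 51.5, "lon": 0.1},  # a nested object, not a document
    "accounts": [
        {"type": "facebook", "id": "johnsmith"},
        {"type": "twitter", "id": "johnsmith"},
    ],
}

# Serialize the root object into JSON text.
document_json = json.dumps(user)

# Parsing the JSON text gives back an equivalent structure.
restored = json.loads(document_json)
```

Only the root object as a whole would be stored under a unique ID; the nested objects travel with it as part of its data.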
=== Document Metadata
A document doesn't consist only of its data.((("documents", "metadata"))) It also has metadata—information about the document.((("metadata, document"))) The three required metadata elements are as follows:
`_index`::
   Where the document lives

`_type`::
   The class of object that the document represents

`_id`::
   The unique identifier for the document
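One way to picture how the metadata sits alongside the data is the following Python sketch. The `StoredDocument` class, its field names, the ID `123`, and the sample title are all hypothetical and chosen only for illustration; `website` and `blog` are the index and type names used in this chapter.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class StoredDocument:
    """Hypothetical container pairing a document's data with its metadata."""
    index: str  # _index: where the document lives
    type: str   # _type: the class of object the document represents
    id: str     # _id: the unique identifier for the document
    source: Dict[str, Any] = field(default_factory=dict)  # the document's data

    def metadata(self) -> Dict[str, str]:
        """Return the three required metadata elements under their usual names."""
        return {"_index": self.index, "_type": self.type, "_id": self.id}


# Illustrative values only.
doc = StoredDocument(
    index="website",
    type="blog",
    id="123",
    source={"title": "My first blog entry"},
)
```

Calling `doc.metadata()` returns only the three metadata elements and leaves the data in `source` untouched.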
==== _index
An index is like a database in a relational database system; it's the place where we store and index related data.((("indices", "_index, in document metadata")))
[TIP]
====
Actually, in Elasticsearch, our data is stored and indexed in shards, while an index is just a logical namespace that groups together one or more shards.((("shards", "grouped in indices"))) However, this is an internal detail; our application shouldn't care about shards at all. As far as our application is concerned, our documents live in an index. Elasticsearch takes care of the details.
====
We cover how to create and manage indices ourselves in <website
as our index name.
==== _type
In applications, we use objects to represent things such as a user, a blog
post, a comment, or an email. Each object belongs to a class that defines
the properties or data associated with an object. Objects in the `user` class
may have a name, a gender, an age, and an email address.
In a relational database, we usually store objects of the same class in the same table, because they share the same data structure. For the same reason, in Elasticsearch we use the same type for ((("types", "_type, in document metadata")))documents that represent the same class of thing, because they share the same data structure.
Every type has its own <
We show how to specify and manage mappings in <
A `_type` name can be lowercase or uppercase, but shouldn't begin with an
underscore or contain commas.((("types", "names of"))) We will use `blog`
for our type name.
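The two naming rules stated above can be checked with a small helper. This is a sketch only: the function name `type_name_problems` is hypothetical, and it covers just the two rules mentioned in the text, not every constraint Elasticsearch itself may enforce.

```python
from typing import List


def type_name_problems(name: str) -> List[str]:
    """Return the rules from the text that a type name breaks (empty if none)."""
    problems = []
    if name.startswith("_"):
        problems.append("begins with an underscore")
    if "," in name:
        problems.append("contains a comma")
    return problems
```

A name such as `blog` or `Blog` yields an empty list, while `_blog` or `blog,user` is reported with the rule it breaks.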
==== _id
The ID is a string that,((("id", "_id, in document metadata"))) when combined with the `_index` and `_type`,
uniquely identifies a document in Elasticsearch. When creating a new document,
you can either provide your own `_id` or let Elasticsearch generate one for
you.
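The idea that the three elements together identify a document can be sketched with an in-memory dictionary in Python. Everything here is an illustrative assumption rather than how Elasticsearch works internally: the `store` dictionary, the `put_document` function, the `comment` type, the sample data, and the use of `uuid4` as a stand-in for ID generation.

```python
import uuid
from typing import Any, Dict, Optional, Tuple

# In-memory stand-in for a document store, keyed by (_index, _type, _id).
store: Dict[Tuple[str, str, str], Dict[str, Any]] = {}


def put_document(index: str, doc_type: str, source: Dict[str, Any],
                 doc_id: Optional[str] = None) -> str:
    """Store source under the given ID, generating one when none is supplied."""
    if doc_id is None:
        # Assumption: uuid4 is only a placeholder for whatever scheme
        # Elasticsearch actually uses to generate IDs.
        doc_id = uuid.uuid4().hex
    store[(index, doc_type, doc_id)] = source
    return doc_id


# Provide our own ID.
own_id = put_document("website", "blog", {"title": "First"}, doc_id="123")

# Let the store generate an ID for us.
generated_id = put_document("website", "blog", {"title": "Second"})

# The same _id under a different type is a different document.
put_document("website", "comment", {"text": "Nice"}, doc_id="123")
```

Because the key is the full combination, the blog entry and the comment that share the ID `123` do not collide.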
==== Other Metadata
There are several other metadata elements, which are presented in
<