May. 10, 2019

Record Specifications

An Example of a Typical Record

[
  {
    "objectID": 42,               // record identifier
    "title": "Breaking Bad",      // string attribute
    "episodes": [                 // array of strings attribute
      "Crazy Handful of Nothin'",
      "Gray Matter"
    ],
    "like_count": 978,            // integer attribute
    "avg_rating": 1.23456,        // float attribute
    "air_date": 1356846157,       // date as timestamp
    "featured": true,             // boolean attribute
    "actors": [                   // nested objects attribute
      {
        "name": "Walter White",
        "portrayed_by": "Bryan Cranston"
      },
      {
        "name": "Skyler White",
        "portrayed_by": "Anna Gunn"
      }
    ],
    "_tags": "tv series, drugs"
  }
]

Accepted Datatypes for your Attributes

Records in Algolia are modeled with JSON, which makes them easier to configure and more flexible.

You can index attributes in any of the following formats:

  • string: "foo"
  • integer/float: 12 or 12.34
  • boolean: true
  • nested objects: { "sub": { "a1": "v1", "a2": "v2" } }
  • array (of strings, integers, floats, booleans, nested objects, arrays): ["foo", "bar"]
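As a rough illustration of the list above, the following Python sketch checks whether a record uses only the listed datatypes before you send it. The helper name `is_indexable` is hypothetical and not part of any Algolia client. It also rejects `None`, on the assumption that only the types listed above are wanted.

```python
from typing import Any


def is_indexable(value: Any) -> bool:
    """Return True if value uses only the datatypes listed above."""
    # Strings, booleans, integers, and floats are accepted as-is.
    if isinstance(value, (str, bool, int, float)):
        return True
    # Arrays are accepted when every element is itself accepted.
    if isinstance(value, list):
        return all(is_indexable(item) for item in value)
    # Nested objects are accepted when keys are strings and values are accepted.
    if isinstance(value, dict):
        return all(
            isinstance(key, str) and is_indexable(item)
            for key, item in value.items()
        )
    # Anything else (set, bytes, datetime, None, ...) is rejected by this sketch.
    return False


record = {
    "objectID": 42,
    "title": "Breaking Bad",
    "episodes": ["Crazy Handful of Nothin'", "Gray Matter"],
    "avg_rating": 1.23456,
    "featured": True,
    "actors": [{"name": "Walter White", "portrayed_by": "Bryan Cranston"}],
}
record_is_valid = is_indexable(record)
```

Running a check like this before indexing can catch values such as sets or date objects, which have no direct JSON representation.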

Unique Identifier - objectID

The engine requires every object to be identified by a unique objectID. We recommend you set your own internal IDs. When you don’t, Algolia generates them for you (like “228506501”), and you can retrieve them by browsing the index. Later, you’ll need to use the objectIDs for updates and deletes.

Please note that when you retrieve objects, objectIDs are always returned as string values, even if you’ve provided a custom objectID as an integer. If you wish to use only integers in your application, and if you are confident that you only have integer values as objectIDs, you can safely cast every objectID as an integer after retrieving the objects.
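The cast described above might look like the following Python sketch. The `hits` list is a hand-written stand-in for retrieved objects, not the output of a real API call, and `with_int_object_ids` is a hypothetical helper name.

```python
def with_int_object_ids(hits):
    """Return copies of the hits with objectID cast from string to int."""
    # int() raises ValueError if an objectID is not a valid integer,
    # so only use this when every objectID is known to be an integer.
    return [{**hit, "objectID": int(hit["objectID"])} for hit in hits]


# Stand-in for retrieved objects: objectID comes back as a string.
hits = [{"objectID": "42", "title": "Breaking Bad"}]
converted = with_int_object_ids(hits)
```

The function returns new dictionaries and leaves the original hits untouched, so the string form stays available for later update and delete calls.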

Because the objectID is used as a unique identifier for your objects, it gets special treatment by Algolia:

  • It can be searched by declaring it in searchableAttributes.
  • It can’t be highlighted or snippeted. If objectID is declared in attributesToHighlight or attributesToSnippet, it will be ignored.
  • It cannot be excluded from the results. If objectID is declared in unretrievableAttributes or omitted from attributesToRetrieve, it will still be returned.
  • It can be used as a facet filter, but it can’t be faceted. If objectID is declared in attributesForFaceting, it will be ignored. Faceting on a unique identifier makes little sense anyway, since every facet count would be equal to one.

Dates

Date attributes should be formatted as Unix timestamps (e.g., 1435735848) if you want to filter or sort by date. The Algolia engine doesn’t interpret dates as ISO 8601 strings, so you must convert your dates into numeric values.
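One way to do this conversion in Python is sketched below. The helper name `to_unix_timestamp` is hypothetical, and treating dates without a UTC offset as UTC is an assumption you should adapt to your data. Note that `datetime.fromisoformat` only accepts a trailing `Z` from Python 3.11 onward, so the example uses an explicit `+00:00` offset.

```python
from datetime import datetime, timezone


def to_unix_timestamp(iso_date: str) -> int:
    """Convert an ISO 8601 date string into a Unix timestamp in seconds."""
    parsed = datetime.fromisoformat(iso_date)
    if parsed.tzinfo is None:
        # Assumption: dates without an offset are expressed in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


record = {
    "title": "Breaking Bad",
    # Stored as a number so it can be used for filtering and sorting.
    "air_date": to_unix_timestamp("2015-07-01T07:30:48+00:00"),  # 1435735848
}
```

If you also need to display the date, you can keep the original string in a separate attribute next to the numeric one.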

Reserved Attribute Names

The Algolia API uses underscore prefixes to identify reserved attribute names.

Records

In a record, you can use attribute names like _tags or _geoloc, but they must follow an imposed schema. All other attribute names are schema-agnostic.

Note: Reserved attributes are not searchable by default. To search in _tags and/or _geoloc, add them to your searchableAttributes. Keep in mind, however, that once you set searchableAttributes, you need to list every attribute you wish to search in.

Search Response

In the search response, Algolia returns attribute names like _highlightResult, _snippetResult, _rankingInfo, or _distinctSeqID. These are reserved Algolia words that are tied to specific features. If you use them as attribute names in your records, they will conflict with those features.
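A simple guard against such conflicts is to check records for these names before indexing. The Python sketch below covers only the four names mentioned above and only top-level attributes. The helper name `conflicting_attributes` is hypothetical.

```python
# The response attribute names mentioned above; the list may not be exhaustive.
RESERVED_RESPONSE_ATTRIBUTES = {
    "_highlightResult",
    "_snippetResult",
    "_rankingInfo",
    "_distinctSeqID",
}


def conflicting_attributes(record: dict) -> list:
    """Return the top-level attribute names that clash with response attributes."""
    return sorted(RESERVED_RESPONSE_ATTRIBUTES.intersection(record))


bad_record = {"title": "Breaking Bad", "_rankingInfo": {"score": 1}}
conflicts = conflicting_attributes(bad_record)
```

An empty list means that none of the listed names appear at the top level of the record.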

Data Sanitization

Algolia accepts any data you send it, without alteration. The same applies to the response: Algolia returns all data in your index as is. This means it saves and returns HTML/XML tags and their properties.

You therefore need to sanitize or escape this data yourself. Otherwise, you run the risk of an XSS attack.
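As a minimal sketch of one mitigation, the Python snippet below escapes an attribute value with the standard library's `html.escape` before it is inserted into a page. The `hit` dictionary is a hand-written stand-in for a search result, and escaping at display time is only one possible approach. Depending on your application, you may prefer to sanitize data before indexing it.

```python
import html


def escape_for_html(value: str) -> str:
    """Escape &, <, >, and quotes so the value renders as text, not markup."""
    return html.escape(value, quote=True)


# Stand-in for a search hit containing untrusted markup.
hit = {"title": '<script>alert("xss")</script>'}
safe_title = escape_for_html(hit["title"])
# safe_title == '&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;'
```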

Terminology

Object = record

We use the words “object” and “record” interchangeably, sometimes within the same sentence. While they can mean different things in computer science, for us they are the same. Don’t place any significance on which one we use:

  • indices contain “objects” or “records”.
  • JSON contains “objects” or “records”.

Indexes = indices

We use these words interchangeably. The former is the American spelling, while the API often uses the British spelling.

In our documentation, we try to always use “indices”.

Don’t place any significance on their usage either. They mean the same thing.

Attributes

All objects and records contain attributes. Sometimes we refer to them as fields or elements.

Within the search and indexing contexts, we often speak of settings and parameters. Again, these terms are mostly interchangeable.

Some attributes are plain key/value pairs. Others can be more complex, as in Java or C#, where they can be an object, an array, nested objects, or a collection.

Operations

An operation is an atomic action performed on our engine. There are two types of operations: indexing and search operations.

  • Indexing is the process of adding, updating, deleting, or manipulating the data within the index.
  • Searching is the process of querying the data stored in the index to return relevant search results.
