Generate a Schema

A schema is a JSON dictionary that describes the data points you have for your users and for the objects those users interact with on the platform.
To start the learning process, you’ll need to inform Begin’s learning algorithms about the users and their interactions with the data points available in your application.
This article walks you through the guidelines for building your own schema, using a bookstore as the example.

Schema Breakdown

Each schema is broken into different parts:

1. Schema Flag

The initial key, schema, flags this document as a schema file for our training algorithm. This is mandatory and must be the only top-level item in the schema.
{ "schema": {} }
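The rule above can be checked locally before a schema file is submitted. The following is a minimal Python sketch, not part of Begin's tooling; the function name has_only_schema_flag is hypothetical and only the standard json module is used.

```python
import json


def has_only_schema_flag(document: str) -> bool:
    """Return True if the JSON text has "schema" as its only top-level key."""
    parsed = json.loads(document)
    # The root must be an object whose single key is "schema".
    return isinstance(parsed, dict) and list(parsed.keys()) == ["schema"]


print(has_only_schema_flag('{ "schema": {} }'))               # True
print(has_only_schema_flag('{ "schema": {}, "extra": {} }'))  # False
```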
After the schema flag, define the following objects at the root level:

2. User

Defines the aspects of a user relevant to the learning model. (For example, user names, date of birth, location, etc.)
{ "schema": { "user": {} } }

3. Other Objects (e.g. Song, Book, Album, Shelf, Product, etc.)

You can optionally define other objects in your schema that the user interacts with (books, magazines, libraries, events, etc.). Defining these objects lets you use them as data sources for your machine learning algorithms.
For example, a bookseller using our recommendation algorithm may define a book or magazine object, with titles, genres, author, theme, reading level, and year of publication.
In this guide we will give our custom object the name book.
{ "schema": { "user": {}, "book": {} } }

4. Interactions

A mandatory structure within the schema. Interactions describe the actions and sentiments that happen between the user and different objects of your schema.
{ "schema": { "user": {}, "book": {}, "interactions": {} } }
You can have as many keys and objects as you need, but each must be unique in order to avoid collisions.
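Key collisions are easy to miss because most JSON parsers silently keep the last value when a key is repeated. As an illustration only, here is a small Python sketch that rejects duplicate keys while loading; the helper names reject_duplicate_keys and load_schema are hypothetical and not part of Begin's API.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises if a JSON object repeats a key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen


def load_schema(text: str) -> dict:
    """Parse schema text, failing on duplicate keys at any nesting level."""
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)


schema = load_schema('{ "schema": { "user": {}, "book": {}, "interactions": {} } }')
print(sorted(schema["schema"]))  # ['book', 'interactions', 'user']
```

Calling load_schema on text that defines "book" twice inside "schema" raises a ValueError instead of quietly dropping the first definition.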

Let’s build a user object!

With our basic structure outlined, let’s add some attributes to the user object.
  1. Labels: If your objective is to classify data into labels, list all of them in an array.
    1. Here we aim to classify the user object as fake or not_fake:
      { "schema": { "user": { "labels": ["fake", "not_fake"] }, "book": {}, "interactions": {} } }
  1. Attributes: Select the attributes for the user. Example: date_of_birth
    1. { "schema": { "user": { "labels": ["fake", "not_fake"], "date_of_birth": {} }, "book": {}, "interactions": {} } }
      • Set a type, which can be any of the following
        • number
        • text
        • category
        • lat_lng_location
        • date
        • boolean
        • Here we set the type to date, since we are dealing with a date of birth:
          { "schema": { "user": { "labels": ["fake", "not_fake"], "date_of_birth": { "type": "date" } }, "book": {}, "interactions": {} } }
      • Select other relevant information. Example: min_date and map_to
        • { "schema": { "user": { "labels": ["fake", "not_fake"], "date_of_birth": { "type": "date", "min_date": "1960-01-01", "map_to": "user_date_of_birth" } }, "book": {}, "interactions": {} } }
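The list of attribute types above lends itself to a quick local check. The sketch below is an assumption-laden illustration in Python, not Begin's own validation: the function name invalid_attribute_types and the sample attribute age are hypothetical, and the only facts taken from this guide are the six type names and the labels array.

```python
ALLOWED_TYPES = {"number", "text", "category", "lat_lng_location", "date", "boolean"}


def invalid_attribute_types(obj: dict) -> dict:
    """Map attribute name -> offending type for attributes with an unlisted type."""
    problems = {}
    for name, spec in obj.items():
        # "labels" is an array of class names, not a typed attribute;
        # any other non-dictionary value is skipped as well.
        if name == "labels" or not isinstance(spec, dict):
            continue
        if spec.get("type") not in ALLOWED_TYPES:
            problems[name] = spec.get("type")
    return problems


user = {
    "labels": ["fake", "not_fake"],
    "date_of_birth": {
        "type": "date",
        "min_date": "1960-01-01",
        "map_to": "user_date_of_birth",
    },
    "age": {"type": "integer"},  # hypothetical attribute with an unlisted type
}
print(invalid_attribute_types(user))  # {'age': 'integer'}
```

An attribute that has no type yet (an empty dictionary) is reported with None as its type.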
You can read more about object attributes in Toggle Attributes, linked below.

Onto the Book object

  1. Our book object exists at the same level as the user object. We have added the Book-Title attribute to it with the type "text".
    1. { "schema": { "user": { "labels": ["fake", "not_fake"], "date_of_birth": { "type": "date", "min_date": "1960-01-01", "map_to": "user_date_of_birth" } }, "book": { "Book-Title": { "type": "text" } }, "interactions": {} } }


There are important rules to follow when building interactions:
  1. They must be on the same level as the user object.
  1. They must contain at least one element whose key follows the _with_* pattern (for example, _with_user, _with_recipe, _with_article, etc.).
    1. { "schema": { "user": { "labels": ["fake", "not_fake"], "date_of_birth": { "type": "date", "min_date": "1960-01-01", "map_to": "user_date_of_birth" } }, "book": { "Book-Title": { "type": "text" } }, "interactions": { "_with_book": {} } } }
      HEADS UP: You can include as many _with_* objects as you need, provided they’re defined in the schema as a high-level object (for example, _with_user).
  1. sentiment: Specific to interaction keys, a sentiment weighs the interaction on a scale from very negative (dead) to extremely positive (super_positive).
The sentiment options available are:
  • Super Positive (super_positive)
  • Positive (positive)
  • Neutral (neutral)
  • Negative (negative)
  • Super Negative (super_negative)
  • Dead (dead) (The worst form of interaction possible)
{
  "schema": {
    "user": {
      "labels": ["fake", "not_fake"],
      "date_of_birth": {
        "type": "date",
        "min_date": "1960-01-01",
        "map_to": "user_date_of_birth"
      }
    },
    "book": {
      "Book-Title": { "type": "text" }
    },
    "interactions": {
      "_with_book": {
        "like": { "sentiment": "positive" }
      }
    }
  }
}
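The interaction rules in this section can be sketched as a small local check. This is only an illustrative Python example under stated assumptions: check_interactions is a hypothetical helper, not a Begin function, and it assumes that the part of a key after _with_ names an object defined next to user in the schema.

```python
SENTIMENTS = {"super_positive", "positive", "neutral", "negative", "super_negative", "dead"}


def check_interactions(schema_doc: dict) -> list:
    """Return a list of human-readable problems found in the interactions block."""
    schema = schema_doc["schema"]
    interactions = schema.get("interactions", {})
    problems = []
    if not any(key.startswith("_with_") for key in interactions):
        problems.append("interactions needs at least one _with_* element")
    for key, actions in interactions.items():
        if not key.startswith("_with_"):
            continue  # only _with_* elements are checked in this sketch
        target = key[len("_with_"):]
        if target not in schema:
            problems.append(f"{key}: object '{target}' is not defined in the schema")
        for action, spec in actions.items():
            if spec.get("sentiment") not in SENTIMENTS:
                problems.append(f"{key}.{action}: unknown sentiment {spec.get('sentiment')!r}")
    return problems


doc = {
    "schema": {
        "user": {},
        "book": {"Book-Title": {"type": "text"}},
        "interactions": {
            "_with_book": {"like": {"sentiment": "positive"}},
            # Deliberately wrong: "shelf" is undefined and "meh" is not a sentiment.
            "_with_shelf": {"browse": {"sentiment": "meh"}},
        },
    }
}
for problem in check_interactions(doc):
    print(problem)
```

The sample reports two problems for _with_shelf and none for _with_book.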

Final result

Below is an example of a complete interaction object schema:
{
  "schema": {
    "user": {
      "labels": ["fake", "not_fake"],
      "date_of_birth": { ... },
      "joining_date": { ... },
      "instructions": [{ ... }]
    },
    "book": {
      "Book-Title": { "type": "text" }
    },
    "interactions": {
      "_with_book": {
        "like": { "sentiment": "positive" }
      },
      "_with_user": {
        "followed": { "sentiment": "positive" },
        "report": { "sentiment": "super_negative" }
      }
    }
  }
}

Next Step

Create a Project
Data processing

Further Reading:

Check out to enhance and improve your machine learning performance