In this post I'll walk you through building a Memri Pod plugin that interacts with the Pod API using Python. We'll install all the prerequisites you need to get started and I'll walk you through every step. At the end of this tutorial you'll have a small test plugin up and running and you'll know how to implement the most important interactions with the Pod. Furthermore, you'll know where to find the documentation for everything else. Let's get started!

I assume here that you have Python 3 installed and that you are on some type of POSIX (Linux, macOS) system (sorry Windows users, I hope you'll be able to translate).

Installing Memri Pod

The first thing we'll do is install and run a local copy of the Memri Pod. The Pod, or Personal Online Datastore, is a server written in Rust with a SQLite database that exposes a Graph API to store data safely and securely. It encrypts the database with encryption keys that need to be sent with every API request. Furthermore, it sports a micro-services architecture using Docker that enables plugins to be dynamically started based on triggers: adding certain items to the Pod will trigger a plugin to be run. More on this later. For now we'll follow the instructions in the Pod repo here to get started.

1. Install Rust and sqlcipher

  • On MacOS: brew install rust sqlcipher
  • On ArchLinux: pacman -S --needed rust sqlcipher base-devel
  • On Ubuntu and Debian:
apt-get install sqlcipher build-essential libsqlcipher-dev python3-dev
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

2. Clone the Pod to your local disk

cd path/to/development/folder
git clone https://gitlab.memri.io/memri/pod.git
cd pod

3. Run the Pod in development mode

Run the pod with this command: ./examples/run_development.sh

It will first install all the dependencies using cargo (Rust's package manager). When it's done the output should look something like this:

Now that you have the pod running we can move on to create our first plugin!

Creating a Plugin

In order to create a plugin we will start by creating a new directory (or "folder" what is the modern nomenclature?) and initializing a git repo. We'll then install the pyMemri dependency and create the equivalent of a Hello World plugin.

  1. Initializing the git repo
cd path/to/development/folder
mkdir demo-plugin
cd demo-plugin
git init .

This should set you up with a new empty git repo in the demo-plugin directory. Please note that as a practice we at Memri no longer use master as the default branch name, as this may be associated with a culture of oppression. We therefore set the default branch to dev before calling git init . for the first time, using the following command: git config --global init.defaultBranch dev.

2. Installing pyMemri

The pyMemri library is available on PyPI (the Python package repository) and we use it as a client that gives us an easy-to-use interface to the pod. Installation is very simple using pip.

pip install pymemri

3. A hello world plugin

When the pod runs a plugin it will run it in a separate Docker container, isolated from other plugins and from the pod server itself. During development this is not needed and we will simply run the plugin via Python in our default environment.

When in production, the pod will pass the plugin a secret that's encoded in a binary blob and describes what access the plugin has to the Pod API. This is an important security measure following the Principle of Least Privilege: it ensures that if, in a worst-case scenario, a plugin is malicious or becomes compromised, the amount of damage it can do is extremely limited. For instance, an importer plugin may only need access to the last item it imported and only have the ability to create new items.

All this logic is handled by the PodClient implementation in the pyMemri package. We will simply use this client to access the Pod API. If the plugin does not have the rights to use one of the APIs the call will return an error. In development all APIs are available to us plugin creators.

In order to include the pyMemri library and expose it to our plugin let's create a new file called plugin.py in your favorite editor and add the following lines.

import pymemri
from pymemri.data.schema import *
from pymemri.pod.client import *
from pymemri.data.photo import *

From here we can now create the client. For testing purposes we'll include some variables that tell the client how to connect to the pod. In production the database_key and owner_key will be passed through environment variables from the Pod. While testing we'll use "key" as a temporary value for both.

DEFAULT_POD_ADDRESS = "http://localhost:3030"
POD_VERSION = "v3"

client = PodClient(database_key="key", owner_key="key")
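Once the plugin runs under the Pod, the keys come from environment variables instead of being hardcoded. Below is a minimal sketch of reading them with a fallback to the test key. The variable names POD_DATABASE_KEY and POD_OWNER_KEY are placeholders made up for this illustration, so check the Pod documentation for the real names.

```python
import os

# Hypothetical variable names, used only for illustration.
DATABASE_KEY_ENV = "POD_DATABASE_KEY"
OWNER_KEY_ENV = "POD_OWNER_KEY"
TEST_KEY = "key"  # the temporary key used in this tutorial


def read_pod_keys(env=None):
    """Return (database_key, owner_key), falling back to the test key."""
    env = os.environ if env is None else env
    database_key = env.get(DATABASE_KEY_ENV, TEST_KEY)
    owner_key = env.get(OWNER_KEY_ENV, TEST_KEY)
    return database_key, owner_key
```

You could then write database_key, owner_key = read_pod_keys() and pass both values to PodClient as in the line above.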

Alright, now we're ready to start interacting with the pod. You can use the following code to test if everything is working. This code defines a new type called Dog and adds that to the schema, then creates a 'dog' and lastly fetches that 'dog' from the pod to test if the create call did its job.

class Dog(Item):
    def __init__(self, name, age, id=None, deleted=None):
        super().__init__(id=id, deleted=deleted)
        self.name = name
        self.age = age
    
    @classmethod
    def from_json(cls, json):
        id = json.get("id", None)
        name = json.get("name", None)
        age = json.get("age", None)
        return cls(id=id,name=name,age=age)
        
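# Register the Dog type with the schema using an example instance
dog = Dog("max", 2)
client.add_to_schema(dog)

# Create a second dog and store it in the pod
dog2 = Dog("bob", 3)
client.create(dog2)

# Fetch it back to check that the create call did its job
dog_from_db = client.get(dog2.id, expanded=False)

print(dog_from_db)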

You can now run this via your editor or on the command line. The output should look something like this:

% python plugin.py
Dog (#9c97feace45561c1cc9c0eec302bfd07) 

If you are seeing this then you successfully created your first plugin. Yay!

Debugging

If the connection to the pod is not working you can check the following things to make sure your configuration is set up correctly:

  • Make sure the pod is on the dev branch
  • Check the pod output to see if a connection is being made
  • Try curl-ing the pod in order to see if you can connect to it curl http://localhost:3030/
  • Is there any other process running on port 3030?
  • Do a git pull to check that you have the latest version of the dev branch
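If you prefer to run the connectivity check from Python, here is a small sketch using only the standard library. It assumes the pod listens on localhost:3030 as in this tutorial, and the function name is my own.

```python
import socket


def pod_is_reachable(host="localhost", port=3030, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Note that a True result only tells you that something is listening on the port, not that it is the pod. If another process occupies port 3030 this check will still succeed.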

More advanced use cases

To make building plugins easy and straightforward the PodClient API supports various use cases. In this section we'll go through some of those use cases. For a complete overview of all the API calls that are supported, please check out the pyMemri documentation. You can also go one level deeper and check out the REST API for the Pod here.

Authentication

In order to access an external service your plugin generally needs to authenticate. The credentials for authentication can be read from the Pod. You can assume that these credentials are stored in an Account item in the Pod. In production you would want to fetch the Account associated with the Importer item that is tied to your plugin. The relationship looks like this in the Pod.

ImporterRun -> Importer -> Account

Where each edge is depicted as an ->. Your plugin will be started with the id of the ImporterRun item in an environment variable. You can use this to get a reference to the item and traverse from there to the Account to fetch the login details.

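from os import environ

# RUN_UID_ENV must hold the name of the environment variable set by the Pod
importer_run_id = int(environ[RUN_UID_ENV])
importer_run = client.get(importer_run_id, expanded=False)
# Traverse edges from here (TODO)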
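Until the traversal code is filled in, the idea can be illustrated in plain Python. This is not the pyMemri API: the dictionary is a toy stand-in for the graph in the Pod, and the ids and edge names ("importer", "account") are assumptions for the sake of the example.

```python
# Toy stand-in for the graph in the Pod: each item id maps to its outgoing
# edges as {edge_type: target_id}. Ids and edge names are made up.
EDGES = {
    "run-1": {"importer": "importer-1"},
    "importer-1": {"account": "account-1"},
    "account-1": {},
}


def follow(edges, start_id, path):
    """Follow a sequence of edge types from start_id and return the final id.

    Returns None if any hop along the path is missing.
    """
    current = start_id
    for edge_type in path:
        current = edges.get(current, {}).get(edge_type)
        if current is None:
            return None
    return current


account_id = follow(EDGES, "run-1", ["importer", "account"])
print(account_id)  # account-1
```

Because edges are directional, following the same path starting from the Account returns None.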

Authentication Workaround

In order to keep things straightforward while developing we recommend hardcoding the authentication details in your plugin. If you are accessing an API through OAuth or need a 2FA authentication flow, simply implement this as an interactive command-line flow in which you block execution of the application while you ask the user for the additional information.

% python plugin.py
Please go to the following url for oAuth https://oauth.example.com/ABCDEF
Enter the code: ******
Authentication completed

This tutorial will be updated in the future when we update the way to implement the authentication process with the Pod.
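A blocking command-line flow like the one shown above could be sketched as follows. The function and parameter names are my own. exchange_code stands for whatever call your external service needs to turn the pasted code into a session token, and it is assumed to return None when the code is rejected.

```python
def interactive_oauth(auth_url, exchange_code, ask=input, show=print):
    """Block until the user pastes a code, then exchange it for a token.

    exchange_code is a callable supplied by the plugin that turns the
    pasted code into a session token, or returns None if it is rejected.
    """
    show(f"Please go to the following url for oAuth {auth_url}")
    while True:
        code = ask("Enter the code: ").strip()
        token = exchange_code(code)
        if token is not None:
            show("Authentication completed")
            return token
        show("That code was not accepted, please try again.")
```

Passing getpass.getpass from the standard library as ask keeps the code from being echoed to the terminal. Injecting ask and show also makes the flow easy to test without a real terminal.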

Storing temporary credentials

When using OAuth you may want to store the session token for use at a later date. The best place to store this session token is in the Pod. That way only the user has access to it and your plugin can use it on its next run. This is how you can do that when creating an Account for the first time.

class Account(Item):
    def __init__(self, handle, service, session_token, id=None, deleted=None):
        super().__init__(id=id, deleted=deleted)
        self.handle = handle
        self.service = service
        self.session_token = session_token
    
    @classmethod
    def from_json(cls, json):
        id = json.get("id", None)
        handle = json.get("handle", None)
        service = json.get("service", None)
        session_token = json.get("session_token", None)
        return cls(id=id,handle=handle,service=service,session_token=session_token)

acc_schema = Account("", "", "")
client.add_to_schema(acc_schema)

session_token = "AA6929CD9DE7C"
acc2 = Account("nickname", "cloud-service-name", session_token)
client.create(acc2)

If the account already exists simply update the session token.

accounts = client.search({"type": "Account"})  # TODO: search for specific account
acc = accounts[0]  # search returns a list of items; pick the one you need
acc.session_token = "12344"
client.update_item(acc)
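The create and update paths can be combined in one helper. This is a sketch that uses only the client calls shown in this tutorial (search, create and update_item) and filters on handle and service in Python, since the search above only filters on type. The helper name and the make_account parameter are my own.

```python
def store_session_token(client, handle, service, session_token, make_account):
    """Update the matching Account if it exists, otherwise create one.

    client is expected to offer search, create and update_item as used in
    this tutorial. make_account is a callable that builds a new Account item.
    Returns the Account item that now holds the token.
    """
    accounts = client.search({"type": "Account"}) or []
    for account in accounts:
        # Filter client-side on the fields that identify the account.
        same_handle = getattr(account, "handle", None) == handle
        same_service = getattr(account, "service", None) == service
        if same_handle and same_service:
            account.session_token = session_token
            client.update_item(account)
            return account

    # No matching account found: create a new one.
    account = make_account(handle, service, session_token)
    client.create(account)
    return account
```

With the Account class defined above you would call store_session_token(client, "nickname", "cloud-service-name", session_token, Account).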

Schema

When you are creating a new plugin start by checking if there are any existing schema definitions that you can reuse. By reusing the existing schema definitions you make sure that you have compatibility with other plugins that import similar data and user interfaces that depend on that schema definition. You can find the existing schema using the schema browser.

If you find that you need to add a new property, item or edge for your plugin you can do that as specified above. If you think a schema addition may be useful beyond your own plugin, please join our Discord to discuss it with the rest of the community and find consensus on the best way to implement a certain data type.

Incremental runs

All importer plugins should support importing data incrementally in order to prevent duplicating data on each run. To implement this the plugin can adopt various techniques to determine what to import.

Continue from the last imported item

The following method returns the last item of a certain type. You can use this to find the last imported item and start importing from there.

client.search_last_added(type="Person")

In the near future you will be able to filter based on the account that imported the data as well. We will update this article to reflect that when that lands.

Check if each item exists in the pod

The following code returns all Person items that you can then search using a for loop.

all_people = client.search({"type": "Person"})

In the future the search API will support searching on fields and edges to speed up this process. We recommend creating a helper function in your plugin that can be replaced later when a more complete search API is available.
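Such a helper could look like the sketch below. It builds an index over the items returned by the search and splits incoming records into new and already stored ones. The function names are my own, and the field you match on (externalId in the usage note) is a hypothetical example, so use whatever uniquely identifies a record in your data.

```python
def index_by(items, field):
    """Map the value of `field` to the item, skipping items without it."""
    index = {}
    for item in items:
        value = getattr(item, field, None)
        if value is not None:
            index[value] = item
    return index


def split_new_and_existing(existing_items, incoming, field):
    """Split incoming records into not-yet-stored and already-stored ones.

    incoming is a list of dicts from the external service; field is the
    name used both as the item attribute and as the dict key.
    """
    known = index_by(existing_items, field)
    new, existing = [], []
    for record in incoming:
        if record.get(field) in known:
            existing.append(record)
        else:
            new.append(record)
    return new, existing
```

For example, new, existing = split_new_and_existing(all_people, fetched_records, "externalId"), after which only the records in new need to be created. Keeping the lookup behind one function means only index_by has to change once the search API can filter on fields.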

Alternative methods

There are several alternative methods to implement incremental runs. You could store additional fields in the Account item (e.g. timestamp, id-field) and use that to continue where you left off.
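As an illustration of the timestamp variant, the sketch below selects only the records that are newer than a stored cursor and returns the cursor to store for the next run. The function name and the "timestamp" key are assumptions. Where you keep the cursor (for instance in an extra field on the Account item) is up to your plugin.

```python
def records_since(records, last_seen, key="timestamp"):
    """Return (new_records, new_cursor) for records newer than last_seen.

    records is a list of dicts; last_seen is the cursor stored after the
    previous run, or None on the first run.
    """
    fresh = [r for r in records if last_seen is None or r[key] > last_seen]
    fresh.sort(key=lambda r: r[key])  # oldest first, so imports happen in order
    new_cursor = fresh[-1][key] if fresh else last_seen
    return fresh, new_cursor
```

After importing the returned records, write new_cursor back to the Pod so the next run continues from there.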

Edges

The Pod API exposes a graph database that allows you to connect items to each other with edges. Each edge has a type, which is a string that describes the relationship between the two items. An edge is directional, which means it goes from one item to the other, but not back again. Use edges to connect items you create in your plugin.

In the following example we create an edge from an Email to a Person via an edge of the type sender. That way we store who the sender is of the email.

# email_item is assumed to be an Email item that was already created on the pod
person_item = Person.from_data(firstName="Alice", lastName="X")
item_success = client.create(person_item)
edge = Edge(email_item, person_item, "sender")
edge_success = client.create_edge(edge)
(!) Be sure to create the items first before creating the edge. Creating an edge will fail if the items it points to do not exist on the pod.
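To make the ordering hard to get wrong, you could wrap it in a small helper. This sketch assumes that client.create and client.create_edge return a truthy value on success (as the variable names in the example above suggest) and that neither item has been created yet. The helper name and the make_edge parameter are my own.

```python
def create_linked(client, source, target, make_edge):
    """Create both items, then the edge between them.

    Assumes client.create and client.create_edge return a truthy value on
    success. make_edge is a callable that builds the edge from the two items.
    Returns True only if every step succeeded.
    """
    for item in (source, target):
        if not client.create(item):
            return False  # do not attempt the edge without both items
    return bool(client.create_edge(make_edge(source, target)))
```

For the email example this would be create_linked(client, email_item, person_item, lambda s, t: Edge(s, t, "sender")).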

Photos and Files

Photos and files can be uploaded to the Pod and attached to other items via edges. The following example creates a random image and then creates a File item to which the image is uploaded. Attach the File item to another item via an edge (for instance to an Email via an attachment edge).

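import numpy as np

# Create a random 640x640 grayscale image
x = np.random.randint(0, 255+1, size=(640, 640), dtype=np.uint8)
photo = IPhoto.from_np(x)
file = photo.file[0]

# Create the File item first, then upload the image data
success1 = client.create(file)
success2 = client.upload_photo(x)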

Join our Discord and share your plugins

When you start building plugins you'll likely have many questions. Please join our Discord and ask those questions to our community and the Memri core engineers. We look forward to helping you out and working together towards giving people back control over their data!