In this post I'll walk you through building a Memri Pod plugin that interacts with the Pod API using Python. We'll install all the prerequisites you need to get started, and I'll walk you through every step. At the end of this tutorial you'll have a small test plugin up and running and you'll know how to implement the most important interactions with the Pod. Furthermore, you'll know where to find the documentation for everything else. Let's get started!

I assume here that you have Python 3 installed and that you are on some type of POSIX system (Linux, macOS). Sorry Windows users, I hope you'll be able to translate.

Installing Memri Pod

The first thing we'll do is install and run a local copy of the Memri Pod. The Pod, or Personal Online Datastore, is a server written in Rust with a SQLite database that exposes a Graph API to store data safely and securely. It encrypts the database with encryption keys that must be sent with every API request. Furthermore, it sports a microservices architecture using Docker that enables plugins to be started dynamically based on triggers: adding certain items to the Pod will trigger a plugin to run. More on this later. For now we'll follow the instructions in the Pod repo here to get started.

1. Install Rust and sqlcipher

  • On macOS: brew install rust sqlcipher
  • On ArchLinux: pacman -S --needed rust sqlcipher base-devel
  • On Ubuntu and Debian:
apt-get install sqlcipher build-essential libsqlcipher-dev python3-dev
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

2. Clone the Pod to your local disk

cd path/to/development/folder
git clone https://gitlab.memri.io/memri/pod.git
cd pod
git checkout dev

3. Run the Pod in development mode

Run the pod with this command: ./examples/run_development.sh

It will first install all the dependencies using cargo (Rust's package manager). When it's done the output should look something like this:

Now that you have the pod running we can move on to create our first plugin!

Creating a Plugin

To create a plugin we will start by creating a new directory (or "folder", whatever the modern nomenclature is) and initializing a git repo. We'll then install the pyMemri dependency and create the equivalent of a Hello World plugin.

1. Install the plugin template

By using the plugin template you get a standard directory structure and satisfy a few of the other acceptance criteria for a Memri plugin. You can read more about the plugin template here: https://gitlab.memri.io/plugins/plugin-template

cd path/to/development/folder
git clone https://gitlab.memri.io/plugins/plugin-template.git
mv plugin-template demo-plugin
cd demo-plugin
rm -Rf .git
git init .

This should set you up with a new git repo in the demo-plugin directory based on the plugin template. Please note that, as a practice, we at Memri no longer use master as the default branch name, as this may be associated with a culture of oppression. We therefore recommend setting the default branch to dev before calling git init . for the first time, using the following command: git config --global init.defaultBranch dev

2. Installing pyMemri and other dependencies

The pyMemri library provides an easy-to-use client interface to the Memri Pod. It is available on PyPI (the Python package repository), and installation is very simple using pip. The plugin template also makes sure you have the other dependencies installed.

pip install .
N.B. Make sure to update pyproject.toml and setup.py with your dependencies and update the author fields!

3. A hello world plugin

When the pod runs a plugin, it runs it in a separate Docker container, isolated from other plugins and from the pod server itself. During development this is not needed, and we will simply run the plugin via Python in our default environment.

In production, the pod will pass the plugin a secret that is encoded in a binary blob and describes what access the plugin has to the Pod API. This is an important security measure following the Principle of Least Privilege: if, in a worst-case scenario, a plugin is malicious or becomes compromised, the amount of damage it can do is extremely limited. For instance, an importer plugin may only need access to the last item it imported and only have the ability to create new items.

All this logic is handled by the PodClient implementation in the pyMemri package. We will simply use this client to access the Pod API. If the plugin does not have the rights to use one of the APIs, the call will return an error. In development all APIs are available to us plugin creators.

In order to include the pyMemri library and expose it to our plugin let's create a new file called plugin.py in your favorite editor and add the following lines.

import pymemri
from pymemri.data.schema import *
from pymemri.pod.client import *
from pymemri.data.photo import *

From here we can create the client. For testing purposes we'll include some variables that tell the client how to connect to the pod. In production, the database_key and owner_key will be passed from the Pod through environment variables. For now we'll use "key" as a temporary value for both.

DEFAULT_POD_ADDRESS = "http://localhost:3030"
POD_VERSION = "v3"

client = PodClient(database_key="key", owner_key="key")
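Since the keys arrive through environment variables in production, a plugin could read them at startup and fall back to the test key during development. The following is a minimal sketch of that idea, not the way pyMemri does it. The variable names POD_DATABASE_KEY and POD_OWNER_KEY are hypothetical placeholders, so check the Pod documentation for the names it actually uses.

```python
import os

# Hypothetical variable names, used for illustration only.
DATABASE_KEY_VAR = "POD_DATABASE_KEY"
OWNER_KEY_VAR = "POD_OWNER_KEY"


def read_pod_keys(env=None, default="key"):
    """Return (database_key, owner_key), falling back to a test key."""
    env = os.environ if env is None else env
    database_key = env.get(DATABASE_KEY_VAR, default)
    owner_key = env.get(OWNER_KEY_VAR, default)
    return database_key, owner_key


if __name__ == "__main__":
    database_key, owner_key = read_pod_keys()
    print("Using the test key:", database_key == "key")
```

The two values can then be passed to PodClient as database_key and owner_key, as in the snippet above.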

Alright, now we're ready to start interacting with the pod. You can use the following code to test if everything is working. This code defines a new type called Dog and adds that to the schema, then creates a 'dog' and lastly fetches that 'dog' from the pod to test if the create call did its job.

class Dog(Item):
    def __init__(self, name, age, id=None, deleted=None):
        super().__init__(id=id, deleted=deleted)
        self.name = name
        self.age = age
    
    @classmethod
    def from_json(cls, json):
        id = json.get("id", None)
        name = json.get("name", None)
        age = json.get("age", None)
        return cls(id=id,name=name,age=age)
        
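# Register the Dog type with the pod schema
dog = Dog("max", 2)
client.add_to_schema(dog)

# Create a dog on the pod
dog2 = Dog("bob", 3)
client.create(dog2)

# Fetch it back to check that the create call did its job
dog_from_db = client.get(dog2.id, expanded=False)

print(dog_from_db)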

You can now run this via your editor or on the command line. The output should look something like this:

% python plugin.py
Dog (#9c97feace45561c1cc9c0eec302bfd07) 

If you are seeing this then you successfully created your first plugin. Yay!

Debugging

If the connection to the pod is not working you can check the following things to make sure your configuration is set up correctly:

  • Make sure the pod is on the dev branch
  • Check the pod output to see if a connection is being made
  • Try curl-ing the pod to see if you can connect to it: curl http://localhost:3030/
  • Is there any other process running on port 3030?
  • Do a git pull to check that you have the latest version of the dev branch
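The port check from the list above can also be done from Python. This is a small sketch using only the standard library. It assumes the pod listens on localhost:3030 as in this tutorial, and it only tells you whether something accepts connections on that port, not whether that something is the pod.

```python
import socket


def is_port_open(host="localhost", port=3030, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on port 3030.")
    else:
        print("Nothing is listening on port 3030; is the pod running?")
```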

More advanced use cases

To make building plugins easy and straightforward the PodClient API supports various use cases. In this section we'll go through some of those use cases. For a complete overview of all the API calls that are supported, please check out the pyMemri documentation. You can also go one level deeper and check out the REST API for the Pod here.

Installation

During installation of your plugin you add items to the pod that are necessary for your plugin to function. Please see the install_plugin function in the template. The bare minimum to install is the Plugin item that contains the information about your plugin. Furthermore you may want to add CVU screens that are needed for authentication flows (more on this in a later blog post).

Plugin Run Metadata

Each time a plugin is run a new item is added to the pod. In fact, the pod uses triggers and the trigger to run a plugin is the creation of a PluginRun item. The ID of this item is passed to the plugin and is used by the Plugin base class PluginFlow to get access to metadata for the plugin, such as authentication credentials.

Authentication

In order to access an external service your plugin usually needs to authenticate. The credentials for authentication can be read from the Pod. These credentials should be stored in an Account item that has an edge to the Plugin item.

PluginRun -> edge(plugin) -> Plugin -> edge(account) -> Account

You can store credentials such as the username and password on the Account item, as well as temporary credentials such as a session token or an OAuth token.
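To make the path from PluginRun to Account concrete, here is a small in-memory model of it. It does not use pyMemri or talk to the pod: Node is a hypothetical stand-in for a pod item, and the identifier and secret properties are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A stand-in for a pod item: a type, properties, and outgoing edges."""
    type: str
    props: dict = field(default_factory=dict)
    edges: dict = field(default_factory=dict)  # edge name -> target Node

    def follow(self, *edge_names):
        """Follow named edges in order; return None if any edge is missing."""
        node = self
        for name in edge_names:
            node = node.edges.get(name)
            if node is None:
                return None
        return node


# Build PluginRun -> edge(plugin) -> Plugin -> edge(account) -> Account
account = Node("Account", {"identifier": "alice", "secret": "example-token"})
plugin = Node("Plugin", {"name": "demo-plugin"}, {"account": account})
plugin_run = Node("PluginRun", edges={"plugin": plugin})

found = plugin_run.follow("plugin", "account")
print(found.props["identifier"])  # alice
```

In a real plugin the PluginFlow base class does this lookup for you, starting from the PluginRun ID it receives.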

The plugin template contains two example flows for authentication. Choose the one that fits your use case and adjust it to your needs. The first example flow asks the user for a username and password and then for the two-factor authentication (2FA) code before completing the authentication flow. The second example completes a simpler OAuth flow.

When a login fails you can store the error information on the errorMessage property of the Account item. The CVU for your plugin can then display this message.

Schema

When you are creating a new plugin start by checking if there are any existing schema definitions that you can reuse. By reusing the existing schema definitions you make sure that you have compatibility with other plugins that import similar data and user interfaces that depend on that schema definition. You can find the existing schema using the schema browser in the schema repository.

If you find that you need to add a new property, item or edge for your plugin, you can do that as specified above. If you think a schema may be useful beyond your own plugin, please join our Discord to discuss it with the rest of the community and find consensus on the best way to implement that data type.

Person items
It's important to note that when creating an importer we distinguish between Account items and Person items. For instance, for the Twitter plugin each follower would be an Account with a follows edge to the Twitter Account of the user of the pod. Each Account will be attached to a Person item. Deduplication of Person items is quite complex and is handled separately by the person deduplication plugin. Each importer plugin should simply create a Person item for each Account item it adds to the pod, with an owner edge in between. The deduplication plugin will take care of the rest!

Incremental runs

All importer plugins should support importing data incrementally in order to prevent duplicating data on each run. To implement this the plugin can adopt various techniques to determine what to import.

Continue from the last imported item

The following method returns the last item of a certain type. You can use this to find the last imported item and start importing from there.

client.search_last_added(type="Person")

In the near future you will be able to filter based on the account that imported the data as well. We will update this article to reflect that when that lands.
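Once you have the last imported item, you still need to decide which remote records to import. The sketch below shows one way to do that without calling the pod. The last_imported object stands in for whatever the search call returns, and the externalId property and the oldest-to-newest ordering of the remote records are assumptions made for this example.

```python
from types import SimpleNamespace


def select_new_records(remote_records, last_imported):
    """Return the remote records that come after the last imported one.

    Assumes remote_records is ordered oldest to newest and that each record
    has a unique "id" matching the hypothetical externalId property of the
    imported item.
    """
    if last_imported is None:
        return list(remote_records)  # first run: import everything
    ids = [record["id"] for record in remote_records]
    if last_imported.externalId not in ids:
        return list(remote_records)  # cursor not found: fall back to a full import
    position = ids.index(last_imported.externalId)
    return list(remote_records[position + 1:])


remote = [{"id": "a1"}, {"id": "a2"}, {"id": "a3"}]
last = SimpleNamespace(externalId="a2")  # stand-in for the last imported item
print(select_new_records(remote, last))  # [{'id': 'a3'}]
```

Note that the fallback to a full import can create duplicates, so you may want to combine it with an existence check before creating items.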

Check if an item already exists in the pod

The following code returns all Person items that you can then search using a for loop.

all_people = client.search({"type": "Person"})

In the future the search API will support searching on fields and edges to speed up this process. We recommend creating a helper function in your plugin that can be replaced later, when a more complete search API becomes available.
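Such a helper could look like the sketch below. It is a plain linear search over items that have already been fetched, and it assumes that item properties are available as attributes. The name find_existing is hypothetical, and the SimpleNamespace objects stand in for the items returned by the search call above.

```python
from types import SimpleNamespace

_MISSING = object()  # sentinel so that a missing attribute never matches


def find_existing(items, **properties):
    """Return the first item whose attributes match all given properties."""
    for item in items:
        if all(
            getattr(item, name, _MISSING) == value
            for name, value in properties.items()
        ):
            return item
    return None


# Stand-ins for the items returned by the search call.
all_people = [
    SimpleNamespace(firstName="Alice", lastName="X"),
    SimpleNamespace(firstName="Bob", lastName="Y"),
]

match = find_existing(all_people, firstName="Bob", lastName="Y")
print(match is not None)  # True
```

Because the rest of the plugin only calls find_existing, its body can later be swapped for a server-side search without touching the callers.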

Alternative methods

There are several alternative methods to implement incremental runs. You could store additional fields in the Account item (e.g. timestamp, id-field) and use that to continue where you left off.
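As an illustration of the timestamp approach, the sketch below keeps a cursor on an account-like object. It does not use the Pod API: lastImportedAt is a hypothetical property name, and fetch_records and save_record are placeholders for your own service call and pod write.

```python
from types import SimpleNamespace


def import_since(account, fetch_records, save_record):
    """Import records newer than the cursor stored on the account.

    lastImportedAt is a hypothetical property holding the timestamp of the
    newest record imported so far (None before the first run).
    """
    cursor = account.lastImportedAt
    imported = 0
    for record in sorted(fetch_records(), key=lambda r: r["timestamp"]):
        if cursor is not None and record["timestamp"] <= cursor:
            continue  # already imported in an earlier run
        save_record(record)
        account.lastImportedAt = record["timestamp"]
        imported += 1
    return imported


account = SimpleNamespace(lastImportedAt=None)
records = [{"timestamp": 10}, {"timestamp": 20}]
saved = []

print(import_since(account, lambda: records, saved.append))  # 2
records.append({"timestamp": 30})
print(import_since(account, lambda: records, saved.append))  # 1
```

In a real plugin the updated Account item would also have to be written back to the pod so that the cursor survives until the next run.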

Edges

The Pod API exposes a graph database that allows you to connect items to each other with edges. Each edge has a type, which is a string that describes the relationship between the two items. An edge is directional, which means it goes from one item to the other, but not back again. Use edges to connect items you create in your plugin.

In the following example we create an edge from an Email to a Person via an edge of the type sender. That way we store who the sender is of the email.

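# email_item is assumed to be an Email item that was already created on the pod
person_item = Person.from_data(firstName="Alice", lastName="X")
item_success = client.create(person_item)
# Create a directional "sender" edge from the email to the person
edge = Edge(email_item, person_item, "sender")
edge_success = client.create_edge(edge)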
(!) Be sure to create the items first before creating the edge. Creating an edge will fail if the items it points to do not exist on the pod.

Uploading files

Photos and files can be uploaded to the Pod and attached to other items via edges.

Photos
The following example creates a random photo and adds it to the pod.

import numpy as np

# Generate a random 640x640 grayscale image
x = np.random.randint(0, 255+1, size=(640, 640), dtype=np.uint8)
photo = IPhoto.from_np(x)
success = client.add_to_schema(IPhoto.from_np(x))

# Store the photo
client.create(photo)

# Retrieve the photo
res = client.get_photo(photo.id, size=640)

Files
The following example reads a file from disk and uploads it to the pod. It then creates a File item and sets the sha256 hash of the file as its identifier. Attach the File item to another item via an edge (for instance to an EmailMessage via an attachment edge).

from hashlib import sha256

# Read the whole file as bytes
with open("video.mp4", "rb") as f:
    file_bytes = f.read()

# Upload to the pod
client.upload_file(file_bytes)

# Create SHA
digest = sha256(file_bytes).hexdigest()

# Create File
file_item = File(sha256=digest)
client.create(file_item)
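The hash in the example above is computed from bytes that are already in memory. When you only need the hash of a large file, for instance to check whether it was uploaded before, you can compute it in chunks instead. The following sketch uses only the standard library; the function name and chunk size are arbitrary choices for this example.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The result is the same hex string that sha256(...).hexdigest() produces for the full contents, so it can be used as the sha256 value of the File item.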

Join our Discord and share your plugins

When you start building plugins you'll likely have many questions. Please join our Discord and ask those questions to our community and the Memri engineers. We look forward to helping you out and working together towards giving people back control over their data!