WARNING: this tutorial is out of date, please check memri.io for the most up-to-date instructions.

With Memri, we want to give you control over all your personal data by storing it in your own private and secure Pod. Today, your data is probably spread across many services: you might have notes in Evernote, photos in iCloud, and mail in Gmail. One reason it is hard to switch away from these services is the hassle of transferring your data from one place to another, which can take hours of manual labor.

That’s why we create Downloaders and Importers. Downloaders fetch your data from one of the services you use; Importers store this data in your Pod in such a way that you can query and use it right away in the iOS app or the browser application. (If you're curious about indexers, they work on existing data in your Pod...)

This tutorial explains how to create a new Downloader and also touches on using Importers to test whether your data shows up in the app. We assume you already have Memri set up on your machine. If that's not the case, take a look at: Tutorial: Setting up Memri.

Downloaders use the following directory structure: downloaders/service/[name_of_service], where ‘name_of_service’ is the service you download from, for instance iCloud. That directory can contain multiple scripts, one per data type, for instance notes: downloaders/service/iCloud/download-notes.js. In this tutorial we’ll create a script named hello_world.py in the directory _tutorials.

The downloader

Downloaders connect to a service and extract your data from it, whether photos, files, or contacts. Larger files, such as photos, will be stored in the file system of your Pod (not implemented yet), while smaller items are stored in the database, allowing you to easily tag, connect, and search them.

The database gives every item a uid in the form of a 64-bit integer. Many services also have their own ids, so the database additionally supports externalIds: custom strings that are prefixed with the service name in the database to guarantee uniqueness. Because you do not know in advance which uids your items will receive, a downloader mainly deals with externalIds.
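To make the role of the service prefix concrete, here is a minimal sketch of how an id could be made unique across services. The function name qualified_external_id and the underscore separator are assumptions for illustration only; the format the database actually uses is not specified in this tutorial.

```python
def qualified_external_id(service: str, external_id: str) -> str:
    """Prefix a service-specific id with the service name (hypothetical format)."""
    if not service or not external_id:
        raise ValueError("service and external_id must be non-empty")
    # Assumption: an underscore separates the service name from the id.
    return f"{service}_{external_id}"


# The same id coming from two different services no longer collides.
print(qualified_external_id("EverNote", "42"))
print(qualified_external_id("iCloud", "42"))
```

In a downloader you would still write the plain externalId; the sketch only illustrates why the prefix makes ids from different services safe to store side by side.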

The hello_world.py script shows where and how files should be written for the importer to recognize them. Currently, the example creates items of types Tag and Note, which are written to ../../../importers/data/_tutorials/tags and ../../../importers/data/_tutorials/notes respectively. Each filename is the item's externalId with the suffix .json. Once the files are in place, calling the correct importer will get them into the database.
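The file layout described above could be produced by a script along the following lines. This is a sketch, not the actual hello_world.py: the helper names write_item and main, the example ids, and the JSON field names (externalId, name, title, content) are assumptions, and the real importer may expect a different schema.

```python
import json
from pathlib import Path

# Output directory named in this tutorial, relative to the script's directory.
DEFAULT_DATA_DIR = Path("../../../importers/data/_tutorials")


def write_item(data_dir: Path, item_type: str, external_id: str, fields: dict) -> Path:
    """Write one item to <data_dir>/<item_type>/<external_id>.json."""
    target_dir = data_dir / item_type
    target_dir.mkdir(parents=True, exist_ok=True)
    path = target_dir / f"{external_id}.json"
    # Assumption: the importer accepts a flat JSON object with these field names.
    payload = {"externalId": external_id, **fields}
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path


def main(data_dir: Path = DEFAULT_DATA_DIR) -> list:
    """Write one example Tag and one example Note; return the written paths."""
    return [
        write_item(data_dir, "tags", "tag-1", {"name": "hello"}),
        write_item(
            data_dir,
            "notes",
            "note-1",
            {"title": "Hello world", "content": "My first note"},
        ),
    ]
```

Calling main() from the script's own directory writes tags/tag-1.json and notes/note-1.json under the importer data directory; passing a different data_dir makes the sketch easy to try out in a scratch folder.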

Command Line Interface

You can use the command line interface to select which download script to run, which is especially important when using Docker (see below). The CLI is written in Python and located at downloaders/service/cli.py. To add a new script, the only thing you need to update is the SCRIPTS constant at the top of the file:

SCRIPTS = {
    'iCloud': {'Notes': 'node download-notes.js'},
    'EverNote': {'Notes': 'python3 download_notes.py'}
}

Each entry is formatted as 'service': {'data_type': 'command'}, where the command consists of the interpreter and the script name. For this tutorial script to be callable from the CLI, we change SCRIPTS to:

SCRIPTS = {
    'iCloud': {'Notes': 'node download-notes.js'},
    'EverNote': {'Notes': 'python3 download_notes.py'},
    '_tutorials': {'Notes': 'python3 hello_world.py'}
}

Now you can run python cli.py to see if the script executes correctly.
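To illustrate how a SCRIPTS mapping like this can be turned into a runnable command, here is a minimal sketch. It is not the actual cli.py: the function name resolve is hypothetical, and running each script from its service directory is an assumption based on the relative script names.

```python
import shlex

SCRIPTS = {
    'iCloud': {'Notes': 'node download-notes.js'},
    'EverNote': {'Notes': 'python3 download_notes.py'},
    '_tutorials': {'Notes': 'python3 hello_world.py'},
}


def resolve(scripts: dict, service: str, data_type: str):
    """Return (working_directory, argv) for the selected download script."""
    try:
        command = scripts[service][data_type]
    except KeyError:
        raise ValueError(f"no script registered for {service}/{data_type}") from None
    # Assumption: each script is run from inside its service directory.
    return service, shlex.split(command)


cwd, argv = resolve(SCRIPTS, '_tutorials', 'Notes')
print(cwd, argv)
# A CLI could then execute it with: subprocess.run(argv, cwd=cwd, check=True)
```

Splitting the command with shlex.split keeps the mapping readable as plain strings while still producing an argument list that can be passed to subprocess without a shell.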

Docker

All the components of Memri can be run as Docker containers. This allows others to simply run a command without having to install Python packages, Node modules, and so on. Furthermore, in the near future we will run the Downloaders from the app, which means that your Pod will spin up the Downloaders container and remove it when finished to free up resources.

For the new script to work, it may need additional resources that are not already installed in the container, such as Python packages or other libraries. The Dockerfile specifies the instructions that Docker uses to build the image. For this tutorial we add the file /downloaders/service/_tutorials/requirements.txt, which lists the packages that need to be installed. We then add the following line to the Dockerfile:

RUN pip3 install -r _tutorials/requirements.txt

This tells Docker to run the pip3 command, which installs the dependencies needed for our hello world downloader. Note that this line must come after the WORKDIR instruction and before the CMD instruction:

[...]
COPY service /usr/src/downloaders/service
WORKDIR /usr/src/downloaders/service

RUN pip3 install -r EverNote/requirements.txt
RUN pip3 install -r _tutorials/requirements.txt
RUN npm --prefix iCloud install

CMD ["python3", "cli.py"]

To test whether the script runs correctly in the container, run ./build-and-run-downloaders.sh in your terminal and use the CLI to run the hello world script.

Import the data

After the data is downloaded to disk, it still has to be inserted into your Pod's database. In the future, the importers will be invoked by the Pod when requested from the front-end. For now we simply run (make sure your Pod is still online):

curl -v -X POST -H "Content-Type: application/json" "localhost:3030/v1/run_service/importers/note"

The terminal output of your Pod shows the import of the data you “downloaded” in the previous step. See the [Pod documentation](https://gitlab.memri.io/memri/pod#development) on how to run it in debug mode to get more info if the import fails. When the import is completed, you should see the test data in the app!
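If you prefer to trigger the importer from Python instead of curl, the same request can be sketched with the standard library. The host, port, and path are taken from the curl command above; the function names build_import_request and run_importer are hypothetical, and the response format of the Pod is not assumed here.

```python
import urllib.request

POD_URL = "http://localhost:3030"  # host and port from the curl command


def build_import_request(importer: str = "note") -> urllib.request.Request:
    """Build the POST request that asks the Pod to run an importer."""
    url = f"{POD_URL}/v1/run_service/importers/{importer}"
    return urllib.request.Request(
        url,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_importer(importer: str = "note") -> int:
    """Send the request and return the HTTP status code (needs a running Pod)."""
    with urllib.request.urlopen(build_import_request(importer), timeout=30) as response:
        return response.status
```

Calling run_importer() while your Pod is online sends the same request as the curl command; if the Pod is not reachable, urlopen raises urllib.error.URLError.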

Check out the Memri docs to see what is currently supported, and don't hesitate to create a merge request here!