kwg-graphdb

KnowWhereGraph's GraphDB deployment configuration

Overview

KnowWhereGraph uses a single-node GraphDB Enterprise instance to store data and process data requests.

There are six docker-compose files here. The two main flavors are

Preloading: These compose files are used for the initial upload of data. There are three (local/stage/prod).
Running: These compose files are used when running GraphDB to serve content. There are three (local/stage/prod).

Data Persistence

Data is persisted on the host machine, not in the container. This is achieved by a volume mount, set in the docker-compose file, between the host and GraphDB's home directory. GraphDB stores its repository, configuration, and logging data under /opt/graphdb/home. Mounting this path to the host file system persists the data. When a new container is launched, it references the persisted data and loads it.
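
To confirm which host directory backs /opt/graphdb/home on a given container, a check along these lines may help. This is a minimal sketch: the function name show_graphdb_mounts is hypothetical, and the container name or ID is whatever docker ps reports for your deployment.

```shell
# Print "host path -> container path" for every mount of a container.
# Assumption: the caller supplies the container name or ID.
show_graphdb_mounts() {
  local container="$1"
  docker inspect \
    --format '{{range .Mounts}}{{println .Source "->" .Destination}}{{end}}' \
    "$container"
}

# Example (the container name is a placeholder):
#   show_graphdb_mounts <container_name> | grep /opt/graphdb/home
```

If no line mentions /opt/graphdb/home, the data is probably living inside the container and will not survive a redeploy.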

Initial Data Load

GraphDB's initial database is constructed using the importrdf tool from Ontotext. This runs with GraphDB offline and offers much faster data loading than other options. In this process, GraphDB creates a new repository and inserts data into it. To account for this, separate docker-compose files are needed to manage the offline instances.

In order to properly load data,

The repository configuration must be supplied in graphdb-data/home/data/repositories/<your-repo-name-here>/config.ttl
The data being imported must be placed in graphdb-data/import-data
If the repository name is anything other than KWG, modify the Makefile to account for this change
make start-<env>-preload should be called from the project root, where <env> is one of local, stage, or prod
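
Because a load can run for days, it may be worth checking the first two requirements before starting. The sketch below does only that; the function name preload_preflight is hypothetical, and it assumes the directory layout described in the list above.

```shell
# Pre-flight check before running the preload target.
# Usage: preload_preflight <repo-name> [project-root]
preload_preflight() {
  local repo="$1"
  local root="${2:-.}"
  local config="$root/graphdb-data/home/data/repositories/$repo/config.ttl"
  local import_dir="$root/graphdb-data/import-data"

  # The repository configuration must be in place.
  if [ ! -f "$config" ]; then
    echo "missing repository config: $config" >&2
    return 1
  fi
  # The import directory must exist and contain at least one entry.
  if [ ! -d "$import_dir" ] || [ -z "$(ls -A "$import_dir")" ]; then
    echo "no import data found in: $import_dir" >&2
    return 1
  fi
  echo "ok: ready to run make start-<env>-preload"
}

# Example, from the project root:
#   preload_preflight KWG && make start-local-preload
```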

Loading KnowWhereGraph's data can take days. Once this is complete,

The docker container will exit
Confirm success by checking the logs in the mounted logs folder (graphdb-data/logs), or by getting the docker logs
The Data Serving deployment can be initiated (see below)
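
As a quick first signal before reading the logs, the exited container's state can be inspected. This is a sketch only: container_exit_state is a hypothetical helper, and treating exit code 0 as a successful load is an assumption, so the logs remain the authoritative check.

```shell
# Report how a finished container ended.
# Prints the exit code followed by Docker's OOMKilled flag.
container_exit_state() {
  local container="$1"
  docker inspect \
    --format '{{.State.ExitCode}} {{.State.OOMKilled}}' \
    "$container"
}

# Example (the container ID is a placeholder):
#   container_exit_state <container_id>
```

A non-zero exit code, or an OOMKilled flag of true, suggests the load did not finish cleanly.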

Debugging

In the case that something goes wrong, the docker container will most likely exit.

Find the ID of the stopped container with docker ps --all
Get its logs with docker logs <container_id>
Also check the mounted logs folder in graphdb-data/logs
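
The steps above can be wrapped in two small helpers. The function names are hypothetical; the docker flags used (--filter, --format, --tail) are standard Docker CLI options.

```shell
# List only containers that have exited, with ID, name and status.
list_exited_containers() {
  docker ps --all --filter status=exited \
    --format '{{.ID}}  {{.Names}}  {{.Status}}'
}

# Show the last N lines (default 100) of a stopped container's logs.
tail_container_logs() {
  local container="$1"
  local lines="${2:-100}"
  docker logs --tail "$lines" "$container"
}

# Example:
#   list_exited_containers
#   tail_container_logs <container_id> 200
```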

Data Serving (normal deployment)

When no data needs to be loaded and GraphDB should run as a normal database service,

Use make start-<env>, where <env> is one of local, stage, or prod, to start the service with the rest of the stack

If the stack is already running, stop it and start it back up with the command above.
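
After starting the stack, it can take a while before GraphDB answers requests. The sketch below polls a URL until it responds. The function name wait_for_graphdb is hypothetical, and the URL in the example is an assumption: 7200 is GraphDB's default port, but this deployment may expose a different address through nginx.

```shell
# Poll a URL until it answers with a successful HTTP status.
# Usage: wait_for_graphdb <url> [attempts] [delay-seconds]
wait_for_graphdb() {
  local url="$1"
  local attempts="${2:-30}"
  local delay="${3:-5}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl return non-zero on HTTP error responses.
    if curl --silent --fail --output /dev/null "$url"; then
      echo "GraphDB answered after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "GraphDB did not answer at $url" >&2
  return 1
}

# Example (URL is an assumption, see above):
#   make start-local && wait_for_graphdb http://localhost:7200/rest/repositories
```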

Logging

GraphDB writes several rolling log files to its home directory, which makes docker logs <container_name> of limited use. Instead, the log directory is mounted to the host through the docker-compose file.
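
Since the log files roll over, the file of interest changes over time. One way to find the current one is to pick the most recently modified entry in the mounted folder. This is a sketch: latest_log_file is a hypothetical helper, the default path assumes the graphdb-data/logs folder relative to the project root, and no particular log file names are assumed.

```shell
# Print the path of the most recently modified entry in a log directory.
# Usage: latest_log_file [log-dir]
latest_log_file() {
  local dir="${1:-graphdb-data/logs}"
  local newest
  # ls -t sorts by modification time, newest first.
  newest="$(ls -t "$dir" 2>/dev/null | head -n 1)"
  if [ -z "$newest" ]; then
    echo "no log files in $dir" >&2
    return 1
  fi
  echo "$dir/$newest"
}

# Example: follow the newest log file
#   tail -f "$(latest_log_file)"
```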

Updating GraphDB Version

Updating GraphDB can be achieved by bumping the version in the docker-compose file. Data should persist through the mounted graphdb-data folder. The service should then be redeployed by bringing the stack down and then back up. Note that this incurs downtime.
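
Since the upgrade relies on the mounted folder surviving intact, taking a copy of it while the stack is down may be a sensible precaution. The following is a sketch, not part of the existing tooling: backup_graphdb_home is a hypothetical helper, and for a repository the size of KnowWhereGraph the archive may be very large and slow to create.

```shell
# Archive the persisted GraphDB home directory before changing versions.
# Usage: backup_graphdb_home [data-root] [backup-dir]
backup_graphdb_home() {
  local root="${1:-graphdb-data}"
  local dest="${2:-.}"
  local archive
  archive="$dest/graphdb-home-$(date +%Y%m%d-%H%M%S).tar.gz"
  # -C keeps paths inside the archive relative to the data root.
  tar -czf "$archive" -C "$root" home && echo "$archive"
}

# Example, with the stack stopped (the backup path is a placeholder):
#   backup_graphdb_home graphdb-data <backup-dir>
```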

Integrating with Elasticsearch

The Knowledge Explorer webapp relies on integrating GraphDB with Elasticsearch. Elasticsearch indexes are created on a per-repository basis: the Manhattan repository has its own Elasticsearch index, the Vienna repository has its own, and so on. Creating these is currently a manual process. The indexes are created through SPARQL, and the SPARQL queries for them are found in the scripts/ folder.

To integrate with Elasticsearch, run the SPARQL queries in the scripts/ folder against the target repository.
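
One way to run such a file from the command line is to POST it to the repository's statements endpoint. This is a sketch under several assumptions: the files in scripts/ are SPARQL updates (the usual form for creating GraphDB connectors), the base URL and repository name in the example are placeholders, no authentication is required, and run_sparql_update is a hypothetical helper.

```shell
# Send one SPARQL update file to a repository's statements endpoint.
# Usage: run_sparql_update <base-url> <repository> <query-file>
run_sparql_update() {
  local base_url="$1"
  local repo="$2"
  local file="$3"
  curl --silent --show-error --fail \
    --request POST \
    --header 'Content-Type: application/sparql-update' \
    --data-binary "@$file" \
    "$base_url/repositories/$repo/statements"
}

# Example (URL, repository and file name are placeholders):
#   run_sparql_update http://localhost:7200 <repo-name> scripts/<query-file>
```

The queries can equally be pasted into the SPARQL editor of the GraphDB Workbench.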

Troubleshooting

GraphDB is Unreachable

If GraphDB is not reachable,

Make sure the container is running
Make sure nginx is running
Check the nginx error logs
Check the graphdb logs
Restart the service

GraphDB is Unresponsive

If GraphDB is running but is unresponsive,

Check if there's a data load process happening (can slow the service down)
Check the GraphDB logs
Restart the service

How do I check the logs if the container was killed?

Run docker ps --all to list all containers, including stopped ones
Get the killed container's ID from the output
Run docker logs <container_id>
