Ben E. C. Boyter's Blog

Exporting Documents from KnowledgeTree

I was recently tasked with exporting a large collection of documents from KnowledgeTree (KT) for a client. The collection was too large for the "download all" functionality and had too many folders to export each one individually.

I had played around with the WebDAV connection that KT provides, but it either didn’t work or was deliberately designed not to allow exporting of the documents.

I looked at where the documents were stored on disk, but KT stores them as numbered files in numbered directories, without extensions or folder information.

Long story short, I spent some time poking through the database to identify the tables containing the metadata needed to rebuild the tree on a proper filesystem. For the record, the tables required are folders, documents and document_content_version.
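To make the relationship between the folders, documents and document_content_version tables concrete, here is a minimal sketch that works on in-memory rows instead of a live database. The column order matches the queries in the script; the sample rows, the storage path '00/10' and the helper names folder_path and export_plan are made up for illustration. It assumes folder id 1 is the root and that every folder chains back to it.

```python
# Hypothetical sample rows, shaped like the results of the script's three queries.
folders = [
    (1, None, 'Root Folder'),   # (id, parent_id, name)
    (2, 1, 'Contracts'),
    (3, 2, '2011'),
]
documents = [(10, 3)]           # (id, folder_id)
content_versions = [(10, 'lease.pdf', '00/10')]  # (document_id, filename, storage_path)


def folder_path(folder_id, folders, root_id=1):
    """Walk parent_id links upward and return the path below the root folder."""
    by_id = dict((f[0], f) for f in folders)
    parts = []
    while folder_id != root_id:
        _, parent_id, name = by_id[folder_id]
        parts.append(name)
        folder_id = parent_id
    return '/'.join(reversed(parts))


def export_plan(folders, documents, content_versions):
    """Return (stored file, rebuilt path) pairs without touching the disk."""
    plan = []
    for doc_id, folder_id in documents:
        for version_doc_id, filename, storage_path in content_versions:
            if version_doc_id == doc_id:
                target = folder_path(folder_id, folders) + '/' + filename
                plan.append((storage_path, target))
                break  # use the first matching row only
    return plan


print(export_plan(folders, documents, content_versions))
```

The numbered file on disk only gets its real name and place in the tree once all three tables are combined, which is why copying the storage directory on its own is not enough.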

That said, here is a short Python script which can be used to rebuild the folders and documents on disk. All that is required is to ensure that Python MySQLdb is installed and to set the database details. Depending on your KT install you may need to change the document location. The script replicates the folder tree in the directory where it is run, preserving the structure, names and extensions of the documents.

Keep in mind this is a fairly ugly script abusing global variables and such. It is also not incredibly efficient, but did manage to extract 20GB of files in my case in a little under 10 minutes.

import MySQLdb
import os
import shutil

# KnowledgeTree default place to store documents
ktdocument = '/var/www/ktdms/Documents/'

# fill in the database details for your install
conn = MySQLdb.connect(user='', passwd='', db='', charset="utf8", use_unicode=True)
cursor = conn.cursor()

# global variables FTW

cursor.execute('''select id, parent_id, name from folders;''')
allfolders = cursor.fetchall()

cursor.execute('''select id, folder_id from documents;''')
alldocuments = cursor.fetchall()

cursor.execute('''select document_id, filename, storage_path from document_content_version;''')
document_locations = cursor.fetchall()

# create folder tree which matches whatever the database suggests exists

def create_folder_tree(parent_id, path):
    directories = [x for x in allfolders if x[1] == parent_id]
    for directory in directories:
        d = '.%s/%s/' % (path, directory[2])
        print(d)
        # create the directory on disk before copying anything into it
        if not os.path.isdir(d):
            os.makedirs(d)

        # get all the files that belong in this directory
        for document in [x for x in alldocuments if x[1] == directory[0]]:
            try:
                location = [x for x in document_locations if document[0] == x[0]][0]
                print('copy %s%s %s%s' % (ktdocument, location[2], d, location[1]))
                shutil.copy2('%s%s' % (ktdocument, location[2]), '%s%s' % (d, location[1]))
            except (IndexError, IOError, OSError):
                print('ERROR exporting - Usually due to a linked document.')

        create_folder_tree(parent_id=directory[0], path='%s/%s' % (path, directory[2]))

create_folder_tree(parent_id=1, path='')
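
On the efficiency point: the script rescans the full folder, document and location lists for every directory it visits. If that ever becomes too slow, one option is to index the rows once with dictionaries and look them up by id. This is only a sketch of that idea, not what I ran; build_indexes is a hypothetical helper, and it keeps the first location row per document to match the [0] in the script.

```python
from collections import defaultdict


def build_indexes(allfolders, alldocuments, document_locations):
    """Group the query results by the ids the script filters on."""
    # parent_id -> list of child folder rows (id, parent_id, name)
    children = defaultdict(list)
    for folder in allfolders:
        children[folder[1]].append(folder)

    # folder_id -> list of document rows (id, folder_id)
    docs_in_folder = defaultdict(list)
    for document in alldocuments:
        docs_in_folder[document[1]].append(document)

    # document_id -> first location row (document_id, filename, storage_path)
    first_location = {}
    for location in document_locations:
        first_location.setdefault(location[0], location)

    return children, docs_in_folder, first_location
```

With these in place, create_folder_tree could loop over children[parent_id] and docs_in_folder[directory[0]] and fetch first_location.get(document[0]) instead of filtering the full lists each time.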