***********************************************************
    Readme file for GT-Predict
***********************************************************

GT-Predict is an activity data and physicochemical property-based classifier for predicting and visualizing the substrate scopes of glycosyltransferase family 1 (GT1) enzymes. The programs were written in MATLAB and compiled for stand-alone use with the MATLAB Compiler Runtime (MCR).


System requirements:
---------------------------------
The current version of the software has been compiled to run on Windows XP and Windows 7. Before using GT-Predict, first run MCRInstaller.exe to install the necessary components. Note: a standalone copy/license of MATLAB is not needed to run GT-Predict, but all users must abide by the license agreements distributed with GT-Predict concerning the MCR/MATLAB components.

MCRInstaller.exe will install the relevant programs and configuration settings within the GTPredict folder and the subfolders for AcceptorGUI, PredictAcceptorInteraction, and PredictEnzymeInteraction.
Note: when opening the programs on some computers, two “Warnings” may appear in the terminal, stating that an MCR component is not found and that there is an inexact match with the local MATLAB components. These warnings can be ignored.

GT-Predict occupies 155 MB on disk.


Components
---------------------------------
1. AcceptorGUI visualizes the interaction patterns supplied in the input data file. The included acceptor_interaction_data contains all observed interaction patterns (positive, negative, or ambiguous) for the Arabidopsis GT1 enzyme library.

2. PredictAcceptorInteraction uses a decision tree-based classifier that was trained on our interaction dataset using combinations of physicochemical properties calculated in Chem3D (CambridgeSoft Inc.). New datasets can also be added, as described in the relevant Readme, to extend its predictive abilities.

3. PredictEnzymeInteraction performs local sequence alignment on any novel GT1 amino acid FASTA sequence against our library of studied Arabidopsis GT1 enzyme sequences, then predicts interaction patterns for the sugar donor and sugar acceptor used for training GT-Predict. The FASTA sequences reported for predicting substrate scopes outside our initial dataset [Medicago truncatula MtUGT71G1, MtUGT78G1, Avena strigosa AsUGT74H5, AsUGT88C4, Streptomyces antibioticus OleD, and Streptomyces lividans MGS] are included for reference. 
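
For readers unfamiliar with local sequence alignment, the comparison step used by PredictEnzymeInteraction can be illustrated with a minimal Smith-Waterman sketch in Python. This is not the GT-Predict implementation (which is compiled MATLAB). The scoring values (match, mismatch, gap), the function names, and the toy sequences are all assumptions chosen for illustration; a real protein alignment would normally use a substitution matrix rather than a simple match/mismatch score.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Scoring values are illustrative assumptions, not GT-Predict settings.
    """
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {name: sequence}."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def closest_library_enzyme(query, library):
    """Return (name, score) for the library sequence scoring highest against query."""
    scored = ((name, smith_waterman_score(query, seq)) for name, seq in library.items())
    return max(scored, key=lambda pair: pair[1])


# Toy sequences, invented for this example only (not real GT1 enzymes).
toy_library = read_fasta(">enzA\nMKTAYIAKQR\n>enzB\nGGSHHWLPFA\n")
print(closest_library_enzyme("TAYIAK", toy_library))
```

In this sketch the query is simply assigned to whichever library sequence gives the highest local score; how GT-Predict turns alignment results into predicted interaction patterns is described in the PredictEnzymeInteraction readme, not here.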

Follow the readme.txt files included in each subfolder to use each program.
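
As a rough illustration of the decision tree idea behind PredictAcceptorInteraction (component 2), the Python sketch below fits a single-split tree (a "stump") by minimizing Gini impurity. It is not the GT-Predict classifier. The two features (a molecular weight and a logP value), the data values, and the function names are hypothetical and were made up for this example; the actual properties and tree structure used by GT-Predict are not specified in this Readme.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 for an empty list)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def fit_stump(rows, labels):
    """Find the single (feature, threshold) split with the lowest weighted Gini impurity.

    Returns (feature_index, threshold, left_label, right_label).
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                # Each side predicts its majority class (ties go to 1).
                left_label = int(2 * sum(left) >= len(left))
                right_label = int(2 * sum(right) >= len(right))
                best = (score, f, t, left_label, right_label)
    if best is None:
        raise ValueError("no split possible: every feature is constant")
    return best[1:]


def predict(stump, row):
    """Classify one row with a fitted stump."""
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


# Hypothetical acceptors: (molecular weight, logP); 1 = positive, 0 = negative.
rows = [(150, 1.2), (180, 0.8), (320, 3.5), (410, 4.1)]
labels = [1, 1, 0, 0]
stump = fit_stump(rows, labels)
print(stump, predict(stump, (200, 2.0)))
```

A full decision tree applies the same split search recursively to each side until the groups are pure or a depth limit is reached.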
