From BioJava to BioSQL

BioJava includes a package for working with BioSQL databases. The BioJava package documentation is reproduced here to show how it works.


Package org.biojava.bio.seq.db.biosql Description
General purpose Sequence storage in a relational database.

Introduction
BioSQL is a general-purpose relational database schema for the storage of biological sequence data and annotation. It evolved from the bioperl-db system.

Using BioSQL
To use BioSQL, you will need:

A DBMS server (currently, PostgreSQL and MySQL are supported)
A JDBC driver for connecting to that database (if in doubt, contact your database vendor)
A BioSQL schema file, suitable for your database. Currently, these can be downloaded here
You will need to create a new database and all the tables specified in the schema file. For example (for PostgreSQL users):
createdb thomasd_biosql
psql thomasd_biosql -f biosqldb-schema-pg.sql
When accessing the database from Java programs, you will need to:
Add the JDBC driver .jar file to your CLASSPATH
Set the jdbc.drivers system property to the class name of the driver (if in doubt, contact your database vendor).
For example:
export CLASSPATH=biojava.jar:xerces.jar:bytecode.jar:pgjdbc2.jar
java -Djdbc.drivers=org.postgresql.Driver demos.MyProgram
You should now be able to connect to the database by simply constructing a new BioSQLSequenceDB object, passing your database connection details to the constructor.
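
The connection details mentioned above usually start with a JDBC URL and a registered driver. The following is a minimal sketch using only the Java standard library; the class BioSqlConnectionSketch and its methods are hypothetical helpers, not part of BioJava, and the host name localhost is an assumption. It builds a PostgreSQL JDBC URL for the thomasd_biosql database created earlier and checks whether the jdbc.drivers property lists the PostgreSQL driver.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of BioJava. */
final class BioSqlConnectionSketch {

    /** Builds a PostgreSQL JDBC URL of the form jdbc:postgresql://host/database. */
    static String postgresUrl(String host, String database) {
        return "jdbc:postgresql://" + host + "/" + database;
    }

    /** Returns true if driverClass appears in the colon-separated jdbc.drivers property. */
    static boolean driverListed(String driverClass) {
        String drivers = System.getProperty("jdbc.drivers", "");
        return Arrays.asList(drivers.split(":")).contains(driverClass);
    }

    public static void main(String[] args) {
        // Assumption: the database server runs on localhost.
        String url = postgresUrl("localhost", "thomasd_biosql");
        System.out.println("JDBC URL: " + url);
        if (!driverListed("org.postgresql.Driver")) {
            System.out.println("jdbc.drivers does not list org.postgresql.Driver;"
                    + " pass -Djdbc.drivers=org.postgresql.Driver on the command line");
        }
        // With BioJava on the classpath, the URL and the remaining connection
        // details would then be passed to the BioSQLSequenceDB constructor.
        // The exact constructor arguments are not given in the text above,
        // so they are left out of this sketch.
    }
}
```

Running the check before opening a connection gives a clearer message than a "no suitable driver" failure later on.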

Each physical BioSQL database may contain multiple namespaces (sometimes called biodatabases). In BioJava, each SequenceDB only reflects a single namespace.

Working with BioSQL sequences
The BioJava-BioSQL objects are transparently persistent. This means that you don't need to do anything special to write data back to the database, and that any changes you make to BioSQL sequences will be immediately reflected in the database. If you don't want this to happen, consider using a ViewSequence.

It is possible to completely remove sequences (and all their annotation) from the database. However, an exception will be thrown if any references still exist to that sequence. The following code will fail:

SequenceDB seqDB = new BioSQLSequenceDB(...);
Sequence seq = seqDB.getSequence("AL121903");
// do things with sequence
// fails with an exception: the variable seq still refers to the sequence
seqDB.removeSequence("AL121903");
If, however, the variable seq is set to null before calling removeSequence, the call will succeed.

Limitations
In general, the behaviour of BioSQL sequences and features is very similar to that of the standard in-memory interfaces. However, the current version has a few limitations:

Only Feature and StrandedFeature are currently supported. Other sub-interfaces of Feature are silently converted to one of these basic types.
Objects and binary data stored in Annotation bundles of sequences and features may be lost -- only Strings and Collections of Strings are safe (this may be fixed in the future).
Currently, only the MySQL and PostgreSQL databases are supported. Porting to other databases should, however, be quite easy.
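
Because only Strings and Collections of Strings are described as safe in Annotation bundles, it can help to check values before storing them. The following is a minimal sketch of such a check; the class AnnotationValueCheck and the method isSafeForBioSql are hypothetical names, not part of BioJava, and the rule encoded is simply the one stated in the limitation above.

```java
import java.util.Collection;

/** Hypothetical helper for illustration only; not part of BioJava. */
final class AnnotationValueCheck {

    /**
     * Returns true if the value is a String, or a Collection whose
     * elements are all Strings. Any other value may be lost when the
     * annotation is persisted to BioSQL.
     */
    static boolean isSafeForBioSql(Object value) {
        if (value instanceof String) {
            return true;
        }
        if (value instanceof Collection) {
            for (Object item : (Collection<?>) value) {
                if (!(item instanceof String)) {
                    return false;
                }
            }
            return true;
        }
        return false;
    }
}
```

A caller could convert unsafe values to Strings, for example with String.valueOf, before adding them to an annotation.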


Maintained by Wu Xin, CBI, Peking University, China, 2003