From BioJava to BioSQL
BioJava includes a package for working with BioSQL databases. The
package documentation is reproduced below to show how it works.
Package org.biojava.bio.seq.db.biosql
Description
General purpose Sequence storage in a relational database.
Introduction
BioSQL is a general-purpose relational database schema for the storage
of biological sequence data and annotation. It evolved from the bioperl-db
system.
Using BioSQL
To use BioSQL, you will need:
A DBMS server (currently, PostgreSQL and MySQL are supported)
A JDBC driver for connecting to that database (if in doubt, contact your database vendor)
A BioSQL schema file, suitable for your database. Currently, these can be downloaded here
You will need to create a new database and all the tables specified
in the schema file. For example (for PostgreSQL users):
createdb thomasd_biosql
psql thomasd_biosql -f biosqldb-schema-pg.sql
When accessing the database from Java programs, you will need to:
Add the JDBC driver .jar file to your CLASSPATH
Set the jdbc.drivers system property to the class name of the driver (if in doubt, contact your database vendor)
For example:
export CLASSPATH=biojava.jar:xerces.jar:bytecode.jar:pgjdbc2.jar
java -Djdbc.drivers=org.postgresql.Driver demos.MyProgram
You should now be able to connect to the database by simply constructing
a new BioSQLSequenceDB object, passing your database connection details
to the constructor.
Each physical BioSQL database may contain
multiple namespaces (sometimes called biodatabases). In BioJava, each
SequenceDB only reflects a single namespace.
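The connection details mentioned above can be gathered in one place before they are handed to the constructor. The following is a minimal sketch, not part of BioJava: the class BioSqlConnectionDetails, the user name, the password and the namespace name "embl_test" are all hypothetical, and the constructor call in the final comment assumes an argument order (URL, user, password, namespace, create flag) that should be checked against the Javadoc of your BioJava version.

```java
/**
 * Hypothetical holder for the details needed to open one BioSQL namespace.
 * Not part of BioJava; all names here are illustrative only.
 */
final class BioSqlConnectionDetails {
    final String jdbcUrl;
    final String user;
    final String password;
    final String namespace; // the single "biodatabase" a SequenceDB reflects

    BioSqlConnectionDetails(String jdbcUrl, String user, String password, String namespace) {
        // Reject anything that is obviously not a JDBC URL.
        if (jdbcUrl == null || !jdbcUrl.startsWith("jdbc:")) {
            throw new IllegalArgumentException("Not a JDBC URL: " + jdbcUrl);
        }
        // Each SequenceDB reflects exactly one namespace, so a name is required.
        if (namespace == null || namespace.isEmpty()) {
            throw new IllegalArgumentException("A namespace name is required");
        }
        this.jdbcUrl = jdbcUrl;
        this.user = user;
        this.password = password;
        this.namespace = namespace;
    }

    /** Builds a PostgreSQL JDBC URL of the form jdbc:postgresql://host:port/database. */
    static String postgresUrl(String host, int port, String database) {
        return "jdbc:postgresql://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        BioSqlConnectionDetails details = new BioSqlConnectionDetails(
                postgresUrl("localhost", 5432, "thomasd_biosql"),
                "thomasd", "secret", "embl_test"); // made-up credentials and namespace
        System.out.println(details.jdbcUrl + " -> namespace " + details.namespace);

        // With biojava.jar and the JDBC driver on the CLASSPATH, the details would
        // then be passed on. The argument order is an assumption; check the Javadoc:
        // SequenceDB seqDB = new BioSQLSequenceDB(details.jdbcUrl, details.user,
        //         details.password, details.namespace, true);
    }
}
```

To work with a second namespace in the same physical database, one would create a second details object with the same URL and a different namespace name, and construct a second SequenceDB from it.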
Working with BioSQL sequences
The BioJava-BioSQL objects are transparently persistent. This means
that you don't need to do anything special to write data back to the
database, and that any changes you make to BioSQL sequences will be
immediately reflected in the database. If you don't want this to happen,
consider using a ViewSequence.
It is possible to completely remove sequences
(and all their annotation) from the database. However, an exception
will be thrown if any references still exist to that sequence. The
following code will fail:
SequenceDB seqDB = new BioSQLSequenceDB(...);
Sequence seq = seqDB.getSequence("AL121903");
// do things with sequence
// seq still refers to the sequence here, so the next call throws an exception
seqDB.removeSequence("AL121903");
If, however, the variable seq is set to null before calling removeSequence,
the call will succeed.
Limitations
In general, the behaviour of BioSQL sequences and features is very
similar to that of the standard in-memory interfaces. However, the
current version has a few limitations:
Only Feature and StrandedFeature are currently supported. Other sub-interfaces of Feature are silently converted to one of these basic types.
Objects and binary data stored in Annotation bundles of sequences and features may be lost; only Strings and Collections of Strings are safe (this may be fixed in the future).
Currently, only the MySQL and PostgreSQL databases are supported. Porting to other databases should, however, be quite easy.
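Because unsafe annotation values may be lost without warning, it can help to check values before attaching them to a BioSQL-backed sequence or feature. The sketch below is one possible guard written in plain Java. The class AnnotationValueCheck and the method isSafeToPersist are hypothetical names, not BioJava API, and the rule they encode is simply the one stated above: Strings and Collections containing only Strings are treated as safe.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Date;

/** Hypothetical helper; not part of BioJava. */
final class AnnotationValueCheck {

    /**
     * Returns true if the value is a String, or a Collection whose elements
     * are all Strings (an empty Collection counts as safe). Anything else,
     * including null, is reported as unsafe.
     */
    static boolean isSafeToPersist(Object value) {
        if (value instanceof String) {
            return true;
        }
        if (value instanceof Collection) {
            for (Object element : (Collection<?>) value) {
                if (!(element instanceof String)) {
                    return false;
                }
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isSafeToPersist("Homo sapiens"));                    // a String
        System.out.println(isSafeToPersist(Arrays.asList("kinase", "human")));  // Collection of Strings
        System.out.println(isSafeToPersist(new Date()));                        // arbitrary object
        System.out.println(isSafeToPersist(
                Arrays.<Object>asList("kinase", Integer.valueOf(1))));          // mixed Collection
    }
}
```

Values that fail the check could be converted to a String representation before being stored, at the cost of having to parse them again when they are read back.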
--BACT