The chemaxon.formats.MdlCompressor class can compress or decompress
MDL Molfiles, SDfiles, RGfiles and Rxnfiles in two ways.

The first way is to call one of the static convert methods:

    public static byte[] convert(byte[] mol, int flags) throws IOException;
    public static String convert(String mol, int flags) throws IOException;

The following flags can be specified:

    COMPRESS     for compression
    DECOMPRESS   for decompression

The second way is to create an MdlCompressor object for an input and an output stream and call its convert() method until it returns false. The following example compresses an SDfile to standard output and counts the successful convert() calls as molecules:

    import java.io.*;
    import chemaxon.formats.*;

    public class Example {
        public static void main(String args[]) {
            int n = 0;
            try {
                FileInputStream in = new FileInputStream("2000.sdf");
                MdlCompressor mc = new MdlCompressor(in, System.out,
                                                     MdlCompressor.COMPRESS);
                while(mc.convert()) ++n;
            } catch(FileNotFoundException ex) {
                System.err.println("File not found");
            } catch(MolFormatException ex) {
                System.err.println("Bad file format");
            } catch(IOException ex) {
                System.err.println("Unexpected end of file");
            }
            System.out.println("Number of molecules: "+n);
        }
    }

MolConverter can be used in the same way by choosing the output format.

Compression:

    MolConverter mc = new MolConverter(in, System.out, "csmol");

Decompression:

    MolConverter mc = new MolConverter(in, System.out, "mol");
First, include the file molcompress.js in the HTML page:

    <script LANGUAGE="JavaScript1.1" SRC="molcompress.js"></script>

Because operating systems use different end-of-line conventions in text files, you might need a function that converts a string to DOS/Windows format:

    <script LANGUAGE="JavaScript1.1">
    <!--
    // molCompress() returns a string with \n newline characters.
    // The <textarea> HTML element needs \r\n end-of-line characters
    // in MS Windows, so we must fix the molCompress() output before
    // setting the value of a <textarea>.
    function eolfix(s) {
        if(navigator.userAgent.lastIndexOf("(Win") >= 0) {
            return s.split("\n").join("\r\n");
        } else {
            return s;
        }
    }
    //-->
    </script>

In this example, an HTML textarea is used to display the input and output of the molfile compression or decompression:

    <form onSubmit="return false;">
    <textarea NAME="mol" ROWS=5 COLS=60>

      MSketch 11289810322D

      1  0  0  0  0  0  0  0  0  0999 V2000
       -2.5313    0.7188    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    M  END
    </textarea>
    <input TYPE="BUTTON" VALUE="Compress"
           onClick="mol.value=eolfix(molCompress(mol.value, true))">
    <input TYPE="BUTTON" VALUE="Inflate"
           onClick="mol.value=eolfix(molCompress(mol.value, false))">
    </form>

The second argument of molCompress() must be true for compression and false for decompression.

You may want to try the compression demo and view its source.
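The eolfix() function in this section puts "\r" before every "\n", so a string that already uses "\r\n" line endings would end up with "\r\r\n". A minimal sketch of a more defensive variant is given here. The name eolfixSafe and its userAgent parameter are hypothetical, introduced only for illustration; the parameter lets the function run outside a browser. Unlike eolfix(), this variant also rewrites line endings to "\n" on non-Windows platforms instead of returning the string untouched.

```javascript
// Hypothetical helper, not part of molcompress.js.
// s:         text returned by molCompress() or typed by the user
// userAgent: browser identification string (assumed parameter; in a page
//            you would pass navigator.userAgent)
function eolfixSafe(s, userAgent) {
    // Collapse "\r\n" and lone "\r" to "\n" first, so that existing
    // DOS/Windows line endings are not doubled in the next step.
    var unix = s.replace(/\r\n|\r/g, "\n");
    if (userAgent.indexOf("(Win") >= 0) {
        // Same platform check and conversion as eolfix().
        return unix.split("\n").join("\r\n");
    }
    return unix;
}
```

In the form above, the button handlers could then call eolfixSafe(molCompress(mol.value, true), navigator.userAgent) in place of eolfix(...).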
This C program converts an MDL Molfile into compressed mol format:

    #include <stdio.h>
    #include <stdlib.h>   /* free() */

    char* molCompress(const char* s, int compress);

    int main(int argc, char* argv[])
    {
        /* A one-atom molfile; the first line (molecule name) is empty. */
        const char* mol =
            "\n"
            "  MSketch 11289810322D\n"
            "\n"
            "  1  0  0  0  0  0  0  0  0  0999 V2000\n"
            "   -2.5313    0.7188    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
            "M  END";
        char* csmol = molCompress(mol, 1);   /* 1 = compress */
        printf("%s", csmol);
        free(csmol);
        return 0;
    }

When you run the program, the result should be:

    
      MSketch 11289810322D
    
      1  0  0  0  0  0  0  0  0  0999 V2000
    VqvVKm1W60
    M  END

If you develop in C++ and compile molCompress() as a C++ function
(usually by simply renaming the "c" extension of molcompress.c to
C, cc, or cxx, etc.),
then you should free the memory allocated for the compressed mol string
by using the C++ delete operator instead of the C function free():

    #include <iostream>

    char* molCompress(const char* s, int compress);

    int main(int argc, char* argv[])
    {
        const char* mol =
            "\n"
            "  MSketch 11289810322D\n"
            "\n"
            "  1  0  0  0  0  0  0  0  0  0999 V2000\n"
            "   -2.5313    0.7188    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0\n"
            "M  END";
        char* csmol = molCompress(mol, 1);
        std::cout << csmol;
        delete csmol;
        return 0;
    }

For decompression, the second parameter of molCompress must be 0:

    char* mol = molCompress(csmol, 0);
Note that the second parameter only affects the return value. The first argument can be a compressed or an uncompressed mol regardless of the type of the result. In other words, all of the following calls are valid:

    char* csmol  = molCompress(mol, 1);
    char* csmol2 = molCompress(csmol, 1);
    char* mol    = molCompress(csmol, 0);
    char* mol2   = molCompress(mol, 0);