Subscribe

RSS Feed (xml)

Powered By

Skin Design:
Free Blogger Skins

Powered by Blogger

Monday, May 19, 2008

Strigi now extracts chemical information from PNG files from chemistry notes

Many people blogged recently about storing molecular connectivity tables in images (Egon summarized it). Strigi-chemical now can extract and index this data.

This is how it works: PngChemicalEndAnalyzer is an endAnalyzer which takes control over the stream. It detects a chemical chunk in PNG (Molfile, CML, InChI, ...) and creates a substream to pass it to indexChild(). Then, again the whole chain of analyzers is executed and chemical data extracted by a respective stream analyzer.

It does not replace a normal PNG endAnalyzer, which is in charge for extracting all image-related information from the stream.

By the way, the InChI analyzer was upgraded and can now detect InChIs in various text sources, it can now fix spaces and in some cases even line breaks.

PNG chemical analyzer has a testcase and let's have a look at file samples and xmlindexer output:

Caffeine with embedded InChI (thanks Jean):






66
1
InChI=1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3
1
1
InChI=1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3



image/png
3323
1
Jean Brefort
171
193
32
RGB/Alpha
Deflate
None
Public domain
0


Rosiglitazone with Molfile (thanks Rich):






2411
name
C18N3O3S1
25
27
comments
0
1


image/png
7984
1
109
327
32
RGB/Alpha
Deflate
None
0

No comments: