Many people blogged recently about storing molecular connectivity tables in images (Egon summarized it). Strigi-chemical can now extract and index this data.
This is how it works: PngChemicalEndAnalyzer is an endAnalyzer that takes control of the stream. It detects a chemical chunk in the PNG (Molfile, CML, InChI, ...) and creates a substream, which it passes to indexChild(). The whole chain of analyzers then runs again on that substream, and the chemical data is extracted by the respective stream analyzer.
It does not replace the normal PNG endAnalyzer, which is in charge of extracting all image-related information from the stream.
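To illustrate the chunk-detection step, here is a minimal Python sketch. It is not the Strigi code (which is C++). It only walks the standard PNG chunk layout and collects text chunks. That the chemical data sits in a `tEXt` chunk, and that its keyword is something like `InChI`, are assumptions made for the example. The names `iter_chunks`, `text_chunks` and `make_chunk` are hypothetical helpers.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data):
    """Yield (type, body) for every chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG stream")
    offset = 8
    while offset + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[offset:offset + 8])
        body = data[offset + 8:offset + 8 + length]
        yield ctype, body
        offset += 12 + length
        if ctype == b"IEND":
            break


def text_chunks(data):
    """Return {keyword: text} for all tEXt chunks (assumed carrier of the data)."""
    found = {}
    for ctype, body in iter_chunks(data):
        if ctype == b"tEXt":
            # tEXt body is: keyword, NUL separator, Latin-1 text.
            keyword, _, text = body.partition(b"\x00")
            found[keyword.decode("latin-1")] = text.decode("latin-1")
    return found


def make_chunk(ctype, body):
    """Build one PNG chunk; used here only to create sample data."""
    crc = zlib.crc32(ctype + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


# Sample stream: signature, one text chunk, end marker (no image data).
sample = (
    PNG_SIGNATURE
    + make_chunk(b"tEXt", b"InChI\x00InChI=1/CH4/h1H4")  # keyword is an assumption
    + make_chunk(b"IEND", b"")
)
embedded = text_chunks(sample)
```

In the real analyzer the extracted text would be wrapped in a substream and handed to indexChild(). Here it is simply returned as a dictionary.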
By the way, the InChI analyzer was upgraded and can now detect InChIs in various text sources; it can also fix spaces and, in some cases, even line breaks.
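As a rough idea of what such detection and repair could look like, here is a small Python sketch. It is a heuristic written for illustration, not the algorithm the InChI analyzer actually uses. It assumes a broken-off fragment belongs to the InChI if it contains only InChI-like characters and at least one digit or slash. The function name `find_inchis` and the character set are assumptions.

```python
import re

# Characters allowed in an InChI body (assumption: a conservative set).
_INCHI_CHARS = r"A-Za-z0-9.,;:()/\-+*?"
_START = re.compile(r"InChI=1S?/[" + _INCHI_CHARS + r"]*")
_FRAGMENT = re.compile(r"\s+([" + _INCHI_CHARS + r"]+)")


def find_inchis(text):
    """Find InChI strings in free text, rejoining pieces split by whitespace."""
    results = []
    pos = 0
    while True:
        start = _START.search(text, pos)
        if start is None:
            break
        inchi = start.group(0)
        pos = start.end()
        while True:
            frag = _FRAGMENT.match(text, pos)
            if frag is None:
                break
            piece = frag.group(1)
            # Heuristic: ordinary words have no digits or slashes, so stop there.
            if not (any(c.isdigit() for c in piece) or "/" in piece):
                break
            inchi += piece
            pos = frag.end()
        # Drop sentence punctuation that trails the identifier.
        results.append(inchi.rstrip(".,;:"))
    return results


broken = (
    "Caffeine: InChI=1/C8H10N4O2/c1-10-4-9-6-5(10) 7(13)12(3)\n"
    "8(14)11(6)2/h4H,1-3H3 is the identifier."
)
repaired = find_inchis(broken)
```

The heuristic will wrongly absorb a bare number that follows an InChI in running text, so a real implementation would need stricter checks.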
The PNG chemical analyzer has a testcase. Let's have a look at the sample files and the xmlindexer output:
Caffeine with embedded InChI (thanks Jean):
66
1
InChI=1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3
1
1
InChI=1/C8H10N4O2/c1-10-4-9-6-5(10)7(13)12(3)8(14)11(6)2/h4H,1-3H3
image/png
3323
1
Jean Brefort
171
193
32
RGB/Alpha
Deflate
None
Public domain
0
Rosiglitazone with Molfile (thanks Rich):

2411 name C18N3O3S1 25 27 comments 0 1 image/png 7984 1 109 327 32 RGB/Alpha Deflate None 0