I am currently using Pipeline Pilot 8.0 to extract embedded chemistry from Word documents, and I am using the Document Reader for Each Data component. I have found several issues:
- If there are two copies of the same document, one in OOXML (.docx) and the other in Word 97-2003 format (.doc), then it fails to extract chemistry from the .docx copy but succeeds with the .doc copy.
- If the Word 97-2003 document has an XML schema applied to it so that it can act as a Smart Document, then extraction fails as well.
Before you ask, I've tried it with the standard Word Document Reader and the same thing happens.
So, what am I doing wrong? Am I doing anything wrong?
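
For anyone trying to reproduce this outside Pipeline Pilot, here is a minimal Python sketch for checking what is actually inside each copy. It relies only on two general facts about the file formats: a .docx is a ZIP package, and a Word 97-2003 .doc is an OLE2 compound file. The function names `detect_container` and `list_embedded_parts` are hypothetical helpers made up for this sketch, and the `word/embeddings/` prefix is where Word usually stores embedded OLE objects, which is an assumption about how the chemistry was embedded. It says nothing about how the Document Reader itself works.

```python
import zipfile

# Signature bytes at the start of each container format.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE2 compound file (.doc)
ZIP_MAGIC = b"PK\x03\x04"                      # ZIP local file header (.docx)


def detect_container(header: bytes) -> str:
    """Classify a file by its first bytes, ignoring the file extension."""
    if header.startswith(OLE_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "ooxml-zip"
    return "unknown"


def list_embedded_parts(docx):
    """Return the package parts under word/embeddings/ in a .docx.

    `docx` may be a file path or a binary file-like object.
    Assumption: embedded objects live under word/embeddings/.
    """
    with zipfile.ZipFile(docx) as package:
        return [name for name in package.namelist()
                if name.startswith("word/embeddings/")]
```

To use it, read the first eight bytes of each copy (`open(path, "rb").read(8)`) and pass them to `detect_container`, then call `list_embedded_parts` on the .docx. If the list comes back empty, the chemistry may have been saved as a picture or in some other part of the package rather than as an embedded object.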