Occasional Notes

We're using Apache POI to manipulate the content of some Word documents. There are other ways to do it, but, on the whole, Apache POI works reasonably well for a nominally free solution. We've hit a use case that can be summarised by a simple question: does this Word document contain a (Word-generated) table of contents (TOC)? You would think that that is a reasonably uncontroversial question, perhaps even one commonly asked. Apparently it is not.

Background

The background here is that I know nothing about TOC generation in Word beyond what I've been able to deduce from examining Word's behaviour and trawling the content of word/document.xml. I gather that Word inserts a field instruction of some kind, but also renders static content into the file: that is, there's a marker saying 'there is a TOC in this document', but the TOC content itself is also rendered. It seems that instead of dynamically generating the TOC content (say, every time the document is changed), Word generates it once and then updates it only on a manual re-generation. So the problem we're facing is:

  • A document has a TOC.
  • We make changes to the body content: say, removing an entire section.
  • The TOC is now stale, and instead of automatically refreshing it, Word inserts error messages at print time.

A basically satisfactory workaround in our case is to call enforceUpdateFields() on the document prior to save, which signals to Word to show a dialog on next load asking whether the fields in the document should be updated.

Again, this isn't ideal, but it is satisfactory.

Solution

Apache POI doesn't expose anything useful in its high-level API for detecting an existing TOC. After an exhaustive Google search, and quite a bit of digging around in the lower-level class hierarchies, it wasn't obvious that we could solve this at any level using Java alone.

Inspecting word/document.xml suggested that a field instruction that looked something like this was present in all documents containing TOCs:

TOC \o "1-3" \h \z \u
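If you want to inspect word/document.xml yourself without unzipping the file by hand, a .docx is an ordinary zip archive, so something along these lines should do. This is a minimal sketch using only the JDK; the class name DocxXml and the method name readDocumentXml are made up for illustration, and it assumes the part is UTF-8 encoded, which is what Word normally writes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper (not part of Apache POI): pulls the main document part
// out of a .docx so its field instructions can be inspected as text.
final class DocxXml {

    static String readDocumentXml(Path docx) throws IOException {
        // A .docx is a zip archive; the body lives in word/document.xml.
        try (ZipFile zip = new ZipFile(docx.toFile())) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                throw new IOException("No word/document.xml in " + docx);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                // Assumption: the part is UTF-8 encoded.
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        }
    }
}
```

Searching the returned string for "instrText" is a quick way to see which field instructions a given document contains.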

How about if we get the XML for the document and search for such an element? If we call getDocument() on the XWPFDocument, we get a CTDocument1 which implements XmlObject and provides a selectPath() method to select nodes via an XPath expression. (If you're curious, it took a couple of hours of trial and error to be able to come up with the facts in the preceding sentence!) Firstly, add XMLBeans and Saxon to your POM:

(Again, that excerpt represents an hour of fun trying to assemble mutually compatible versions of POI, XMLBeans and Saxon, as well as answering the question 'Do we also need xmlbeans-xpath?' Spoiler: we don't.) Then, with an XWPFDocument called document, find any w:instrText elements, where w is a namespace which we'll also define, and see if any of them contain a magic string:
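As a rough illustration of that check, the sketch below runs the same kind of XPath query over the document XML, but with the JDK's built-in DOM and XPath APIs rather than XMLBeans' selectPath() and Saxon, so it runs without any extra dependencies. The class name TocDetector, the method containsToc, and the choice of the leading keyword "TOC" as the magic string are all assumptions, not taken from the original code.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: brute-force TOC detection over the text of word/document.xml.
final class TocDetector {

    // WordprocessingML main namespace, conventionally bound to the prefix "w".
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Assumed magic string: a TOC field instruction starts with the keyword "TOC".
    static final String MAGIC = "TOC";

    static boolean containsToc(String documentXml) throws Exception {
        // Parse namespace-aware so the w: prefix in the XPath can be resolved.
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(documentXml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override public String getNamespaceURI(String prefix) {
                return "w".equals(prefix) ? W_NS : XMLConstants.NULL_NS_URI;
            }
            // Reverse lookups are not needed for evaluating this expression.
            @Override public String getPrefix(String uri) { return null; }
            @Override public Iterator<String> getPrefixes(String uri) { return null; }
        });

        // Select every field-instruction run in the document.
        NodeList nodes = (NodeList) xpath.evaluate(
                "//w:instrText", doc, XPathConstants.NODESET);
        for (int i = 0; i < nodes.getLength(); i++) {
            // Word pads instructions with spaces, e.g. " TOC \o "1-3" \h \z \u ".
            String instr = nodes.item(i).getTextContent().trim();
            if (instr.equals(MAGIC) || instr.startsWith(MAGIC + " ")) {
                return true;
            }
        }
        return false;
    }
}
```

Matching on the leading keyword rather than on any occurrence of "TOC" is a deliberate choice here: the PAGEREF fields that Word writes for individual TOC entries are also w:instrText elements, and they should not count as a TOC on their own.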

So, it's brute force and depends on a magic string, but it seems to work. Better solutions gladly accepted!