Detect and delete (textually) empty pages in Writer with the UNO API
We are processing documents that are generated by a third-party application. Some of these documents contain “false” empty pages which need to be removed programmatically.
My idea is to iterate over the pages of a document and get the textual content of each page. If the content is empty, remove that page from the document and continue with the next one.
Does that sound reasonable?
Could someone provide pointers or code snippets on how that can be accomplished using the UNO Java API?
I tried fiddling around with the several cursors and services provided by the loaded document, but currently have no clue how to accomplish the described task.
Any hints would be great!
Thanks in advance!
Peter
I became aware that it is easy to delete all empty paragraphs with an enumeration. My initial idea of collecting page number information is more complicated, though: once you have the page number of a paragraph and move on to the next one, I don’t know how to go back. You have to go through several paragraphs before you know whether they are all on one page, and by the time you know, you have already gone too far.
Here is a little Basic code that lets you delete all empty paragraphs of a document. For information purposes I left in the additional steps for
1. skipping past tables and
2. moving the view cursor to the paragraph to get the page number, although it isn’t used for anything here.
Note: If there is an image anchored to an empty paragraph it will also be deleted with the paragraph.
Sub deleteparas
    oDocSrc = ThisComponent
    oVCursSrc = oDocSrc.CurrentController.getViewCursor()
    'loop through all of the paragraphs in the source document
    oParEnum = oDocSrc.getText().createEnumeration()
    Do While oParEnum.hasMoreElements()
        oPar = oParEnum.nextElement()
        'fix for using tables:
        Do While oPar.supportsService("com.sun.star.text.TextTable")
            oPar = oParEnum.nextElement()
        Loop
        oVCursSrc.gotoRange(oPar, False)
        oVCursSrc.gotoEndOfLine(False)
        curpagenumber = oVCursSrc.page
        If Len(oPar.getString()) = 0 Then
            oPar.dispose
        End If
    Loop
End Sub
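The "gone too far" problem with page numbers might be avoided by splitting the work into two passes: first record the page number and text of every paragraph, then decide per page afterwards. Below is a minimal plain-Java sketch of the deciding step only. It does not call UNO at all; EmptyPageFinder, Para and findEmptyPages are hypothetical names, and it assumes you have already collected (page, text) pairs, for example with the view cursor as in the macro above. Unlike the macro, it also treats whitespace-only paragraphs as empty, which is an assumption on my side.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the UNO API.
final class EmptyPageFinder {

    // Snapshot of one paragraph: the page it was found on and its text.
    static final class Para {
        final int page;
        final String text;

        Para(int page, String text) {
            this.page = page;
            this.text = text;
        }
    }

    // Pass 1: remember for every page whether any paragraph on it has visible text.
    // Pass 2: return the pages on which no paragraph has, in ascending order.
    static List<Integer> findEmptyPages(List<Para> paras) {
        Map<Integer, Boolean> pageHasText = new TreeMap<>();
        for (Para p : paras) {
            // Assumption: whitespace-only paragraphs count as empty.
            boolean hasText = !p.text.trim().isEmpty();
            pageHasText.merge(p.page, hasText, Boolean::logicalOr);
        }
        List<Integer> emptyPages = new ArrayList<>();
        for (Map.Entry<Integer, Boolean> e : pageHasText.entrySet()) {
            if (!e.getValue()) {
                emptyPages.add(e.getKey());
            }
        }
        return emptyPages;
    }

    public static void main(String[] args) {
        List<Para> paras = new ArrayList<>();
        paras.add(new Para(1, "Some text"));
        paras.add(new Para(1, ""));
        paras.add(new Para(2, ""));
        paras.add(new Para(2, "   "));
        paras.add(new Para(3, "More text"));
        System.out.println(findEmptyPages(paras)); // prints [2]
    }
}
```

Only pages that contain at least one recorded paragraph can show up in the result, so a page holding nothing but a table or an image would not be reported. The paragraphs on the reported pages would then still have to be deleted in the document in a separate step.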
Thanks for the hint. I tried the following (pseudo code, based on http://api.libreoffice.org/examples/java/Text/TextDocumentStructure.java):
textDocument.getTextService().getText().getContentEnumeration(that is UnoRuntime.queryInterface(XEnumerationAccess.class)).createEnumeration()
but this call only returns one paragraph with no text content for the whole document.
BTW: the document consists of 24 pages full of text.
Any ideas what could be wrong?
Thanks!