Detect and delete (textual) empty Pages in Writer with UNO API

peter.hermsdorf
Hi,

we are processing documents which are generated by a third party application. Some of these documents contain “false” empty pages which need to be removed programmatically.

My idea is to iterate over the pages of a document and get the textual content of each page. If the content is empty, remove that page from the document and continue with the next one.

Does that sound reasonable?

Could someone provide pointers or code snippets on how that can be accomplished using the UNO Java API?

I tried fiddling around with the several cursors and services provided by the loaded document, but currently have no clue how to accomplish the described task.

Any hints would be great!

Thanks in advance!

Peter

musikai
You can enumerate all paragraphs and iterate through them. To get the page number of a paragraph you must use the view cursor. So if all paragraphs with the same page number are empty, you could delete them. Can't help you with Java code.
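The grouping step described above can be sketched in plain Java, independent of UNO. This is only a rough illustration under stated assumptions: `ParagraphInfo` is a hypothetical stand-in for whatever you collect while enumerating (the page number reported by the view cursor plus the paragraph's string), and whitespace-only text is treated as empty, which is my assumption and not something from the original posts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class EmptyPageFinder {

    /** Hypothetical stand-in for one enumerated paragraph: its page number and its text. */
    static final class ParagraphInfo {
        final int page;
        final String text;

        ParagraphInfo(int page, String text) {
            this.page = page;
            this.text = text;
        }
    }

    /** Returns the page numbers on which every paragraph is empty, in document order. */
    static List<Integer> findEmptyPages(List<ParagraphInfo> paragraphs) {
        // page number -> true as long as every paragraph seen on that page was empty
        Map<Integer, Boolean> pageIsEmpty = new LinkedHashMap<>();
        for (ParagraphInfo p : paragraphs) {
            // Assumption: whitespace-only paragraphs count as empty.
            boolean empty = p.text.trim().isEmpty();
            pageIsEmpty.merge(p.page, empty, Boolean::logicalAnd);
        }

        List<Integer> emptyPages = new ArrayList<>();
        for (Map.Entry<Integer, Boolean> entry : pageIsEmpty.entrySet()) {
            if (entry.getValue()) {
                emptyPages.add(entry.getKey());
            }
        }
        return emptyPages;
    }

    public static void main(String[] args) {
        List<ParagraphInfo> doc = Arrays.asList(
            new ParagraphInfo(1, "Title"),
            new ParagraphInfo(1, ""),
            new ParagraphInfo(2, ""),
            new ParagraphInfo(2, "   "),
            new ParagraphInfo(3, "Body text"));
        System.out.println(findEmptyPages(doc)); // prints [2]
    }
}
```

In a real document a paragraph can span a page break, so the page number recorded for it depends on where the view cursor was placed; the sketch ignores that case.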
Windows7 Pro, Lubuntu 15.04, Sibelius 7.1.3, LibreOffice 4.4.7, OpenOffice 4.1.2 Free Project: LibreOffice Songbook Architect (LOSA) http://struckkai.blogspot.de/2015/04/libreofficesongbookarchitect.html SVG-Tools: http://struckkai.blogspot.com/2016/0
peter.hermsdorf
Hi,

thanks for the hint. I tried the following (pseudocode, based on http://api.libreoffice.org/examples/java/Text/TextDocumentStructure.java):

textDocument.getTextService().getText().getContentEnumeration(that is UnoRuntime.queryInterface(XEnumerationAccess.class)).createEnumeration()

but this call only returns one paragraph with no text content for the whole document.
BTW: the document consists of 24 pages full of text.

Any ideas what could be wrong?

Thanks!

musikai
Hi, I was about to write an example in Basic (I can't in Java) and had to look up how to delete paragraphs when I stumbled across these threads with Java code. Perhaps they can be of further help: https://forum.openoffice.org/en/forum/viewtopic.php?f=44&t=16285 http://stackoverflow.com/questions/2511216/openoffice-with-net-how-to-iterate-throught-all-paragraphs-and-read-text

I became aware that it is easy to just delete all empty paragraphs with an enumeration, but my initial idea of collecting page number information is a bit complicated: once you have the page number of a paragraph and move on to the next one, I don't know how to come back. You must first go through multiple paragraphs before you know whether they are all on one page, and by the time you know, you have already gone too far.
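One possible way around the "gone too far" problem, sketched in plain Java without UNO: buffer the paragraphs of the page currently being scanned, and decide about them only when the page number changes. This is a hedged illustration, not tested against a real document. `Para` is a hypothetical stand-in for a paragraph reference plus the page number read from the view cursor, and treating whitespace-only text as empty is my assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PageBuffer {

    /** Hypothetical stand-in for one enumerated paragraph. */
    static final class Para {
        final int page;
        final String text;

        Para(int page, String text) {
            this.page = page;
            this.text = text;
        }
    }

    /**
     * Walks the paragraphs once and returns those that sit on pages
     * containing no text at all. Assumes the paragraphs arrive in
     * document order, so all paragraphs of one page are adjacent.
     */
    static List<Para> collectDeletable(List<Para> paragraphs) {
        List<Para> deletable = new ArrayList<>();
        List<Para> currentPage = new ArrayList<>(); // paragraphs buffered for the page being scanned
        boolean currentPageEmpty = true;
        int currentPageNo = -1;

        for (Para p : paragraphs) {
            if (p.page != currentPageNo) {
                // The previous page is complete: mark its paragraphs only if all were empty.
                if (currentPageEmpty) {
                    deletable.addAll(currentPage);
                }
                currentPage.clear();
                currentPageEmpty = true;
                currentPageNo = p.page;
            }
            currentPage.add(p);
            // Assumption: whitespace-only paragraphs count as empty.
            if (!p.text.trim().isEmpty()) {
                currentPageEmpty = false;
            }
        }
        // Flush the last page, which has no following page change to trigger the check.
        if (currentPageEmpty) {
            deletable.addAll(currentPage);
        }
        return deletable;
    }

    public static void main(String[] args) {
        List<Para> doc = Arrays.asList(
            new Para(1, "Intro"), new Para(1, ""),
            new Para(2, ""), new Para(2, ""),
            new Para(3, "End"));
        System.out.println(collectDeletable(doc).size()); // prints 2
    }
}
```

Because the buffer holds references rather than positions, there is no need to move back in the enumeration; the collected paragraphs could be removed after the loop has finished.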
Here is a little Basic code that deletes all empty paragraphs of a document. For information purposes I left in the additional steps for
1. skipping tables and
2. moving the view cursor to the paragraph to get the page number, although it isn't used for anything here.

Note: If there is an image anchored to an empty paragraph it will also be deleted with the paragraph.


Sub deleteparas

    oDocSrc = ThisComponent
    oVCursSrc = oDocSrc.CurrentController.getViewCursor()

    ' Loop through all top-level text contents (paragraphs and tables)
    oParEnum = oDocSrc.getText().createEnumeration()
    Do While oParEnum.hasMoreElements()

        oPar = oParEnum.nextElement()

        ' Fix for tables: they are enumerated too, so skip them
        If Not oPar.supportsService("com.sun.star.text.TextTable") Then

            ' Informational only: page number at the view cursor position
            oVCursSrc.gotoRange(oPar, False)
            oVCursSrc.gotoEndOfLine(False)
            curpagenumber = oVCursSrc.getPage()

            ' Delete the paragraph if it contains no text
            If Len(oPar.getString()) = 0 Then
                oPar.dispose()
            End If

        End If

    Loop
End Sub