org.apache.poi.hslf.extractor
Class PowerPointExtractor

java.lang.Object
  extended by org.apache.poi.POITextExtractor
      extended by org.apache.poi.POIOLE2TextExtractor
          extended by org.apache.poi.hslf.extractor.PowerPointExtractor

public class PowerPointExtractor
extends POIOLE2TextExtractor

Extracts text from a PowerPoint file. It can optionally extract the notes text as well.
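A minimal usage sketch, built only from the constructors and methods listed on this page. It assumes the POI jars that provide org.apache.poi.hslf are on the classpath, and that args[0] is a path to a binary PowerPoint file; the class name ExtractDeckText is hypothetical.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.poi.hslf.extractor.PowerPointExtractor;

public class ExtractDeckText {
    public static void main(String[] args) throws IOException {
        // Assumption: args[0] names an existing binary PowerPoint file.
        InputStream in = new FileInputStream(args[0]);
        try {
            PowerPointExtractor extractor = new PowerPointExtractor(in);
            try {
                // With the documented defaults, getText() gives slide text only.
                String slides = extractor.getText();
                // getNotes() gives notes text only.
                String notes = extractor.getNotes();

                System.out.println("Slides:");
                System.out.println(slides);
                System.out.println("Notes:");
                System.out.println(notes);
            } finally {
                // Shuts down the extractor's underlying streams.
                extractor.close();
            }
        } finally {
            // Closing the stream we opened ourselves is harmless if it is already closed.
            in.close();
        }
    }
}
```

To get slide and notes text in a single string, call getText(true, true) instead of the two separate calls.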

Author:
Nick Burch

Field Summary
 
Fields inherited from class org.apache.poi.POITextExtractor
document
 
Constructor Summary
PowerPointExtractor(HSLFSlideShow ss)
          Creates a PowerPointExtractor, from an HSLFSlideShow
PowerPointExtractor(java.io.InputStream iStream)
          Creates a PowerPointExtractor, from an InputStream
PowerPointExtractor(POIFSFileSystem fs)
          Creates a PowerPointExtractor, from an open POIFSFileSystem
PowerPointExtractor(java.lang.String fileName)
          Creates a PowerPointExtractor, from a file
 
Method Summary
 void close()
          Shuts down the underlying streams
 java.lang.String getNotes()
          Fetches all the notes text from the slideshow, but not the slide text
 java.lang.String getText()
          Fetches all the slide text from the slideshow, but not the notes, unless you've called setSlidesByDefault() or setNotesByDefault() to change this
 java.lang.String getText(boolean getSlideText, boolean getNoteText)
          Fetches text from the slideshow, be it slide text or note text.
static void main(java.lang.String[] args)
          Basic extractor.
 void setNotesByDefault(boolean notesByDefault)
          Should a call to getText() return notes text? Default is no
 void setSlidesByDefault(boolean slidesByDefault)
          Should a call to getText() return slide text? Default is yes
 
Methods inherited from class org.apache.poi.POIOLE2TextExtractor
getDocSummaryInformation, getSummaryInformation
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PowerPointExtractor

public PowerPointExtractor(java.lang.String fileName)
                    throws java.io.IOException
Creates a PowerPointExtractor, from a file

Parameters:
fileName - The name of the file to extract from
Throws:
java.io.IOException

PowerPointExtractor

public PowerPointExtractor(java.io.InputStream iStream)
                    throws java.io.IOException
Creates a PowerPointExtractor, from an InputStream

Parameters:
iStream - The input stream containing the PowerPoint document
Throws:
java.io.IOException

PowerPointExtractor

public PowerPointExtractor(POIFSFileSystem fs)
                    throws java.io.IOException
Creates a PowerPointExtractor, from an open POIFSFileSystem

Parameters:
fs - the POIFSFileSystem containing the PowerPoint document
Throws:
java.io.IOException

PowerPointExtractor

public PowerPointExtractor(HSLFSlideShow ss)
                    throws java.io.IOException
Creates a PowerPointExtractor, from an HSLFSlideShow

Parameters:
ss - the HSLFSlideShow to extract text from
Throws:
java.io.IOException

Method Detail

main

public static void main(java.lang.String[] args)
                 throws java.io.IOException
Basic extractor. Outputs all the text, and optionally all the notes

Throws:
java.io.IOException

close

public void close()
           throws java.io.IOException
Shuts down the underlying streams

Throws:
java.io.IOException

setSlidesByDefault

public void setSlidesByDefault(boolean slidesByDefault)
Should a call to getText() return slide text? Default is yes


setNotesByDefault

public void setNotesByDefault(boolean notesByDefault)
Should a call to getText() return notes text? Default is no


getText

public java.lang.String getText()
Fetches all the slide text from the slideshow, but not the notes, unless you've called setSlidesByDefault() or setNotesByDefault() to change this

Specified by:
getText in class POITextExtractor
Returns:
All the text from the document
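The interaction between the two default flags and the no-argument getText() can be illustrated with a small stand-alone model. This is a sketch, not POI's code: the class DefaultFlagsSketch is hypothetical, it stores plain strings instead of reading a slideshow, and placing slide text before notes text is an assumption.

```java
// Hypothetical stand-in that mirrors the documented defaults only.
final class DefaultFlagsSketch {
    private final String slideText;
    private final String noteText;
    private boolean slidesByDefault = true;  // documented default: yes
    private boolean notesByDefault = false;  // documented default: no

    DefaultFlagsSketch(String slideText, String noteText) {
        this.slideText = slideText;
        this.noteText = noteText;
    }

    void setSlidesByDefault(boolean slidesByDefault) {
        this.slidesByDefault = slidesByDefault;
    }

    void setNotesByDefault(boolean notesByDefault) {
        this.notesByDefault = notesByDefault;
    }

    // The no-argument form uses whatever the two flags currently say.
    String getText() {
        return getText(slidesByDefault, notesByDefault);
    }

    // Assumption: slide text comes first, then notes text.
    String getText(boolean getSlideText, boolean getNoteText) {
        StringBuilder sb = new StringBuilder();
        if (getSlideText) {
            sb.append(slideText);
        }
        if (getNoteText) {
            sb.append(noteText);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        DefaultFlagsSketch s = new DefaultFlagsSketch("Slide text\n", "Notes text\n");
        System.out.print(s.getText());   // slide text only
        s.setNotesByDefault(true);
        System.out.print(s.getText());   // slide text followed by notes text
    }
}
```

In this model, changing either flag alone is enough to change what getText() returns, so each setter can be called independently of the other.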

getNotes

public java.lang.String getNotes()
Fetches all the notes text from the slideshow, but not the slide text


getText

public java.lang.String getText(boolean getSlideText,
                                boolean getNoteText)
Fetches text from the slideshow, be it slide text or note text. Because the final block of text in a TextRun normally has its last \n stripped, we add it back

Parameters:
getSlideText - fetch slide text
getNoteText - fetch note text
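The newline handling described for this method can be sketched in isolation. The helper names below (TrailingNewlineSketch, withTrailingNewline, join) are hypothetical and work on plain strings rather than TextRun objects; the sketch only shows the idea of restoring a stripped final \n, and is not taken from the POI source.

```java
// Hypothetical helpers illustrating the "add the last \n back" step.
final class TrailingNewlineSketch {

    // Returns the text unchanged if it already ends in \n, otherwise appends one.
    static String withTrailingNewline(String text) {
        if (text.endsWith("\n")) {
            return text;
        }
        return text + "\n";
    }

    // Concatenates blocks so that every block ends on its own line.
    static String join(String[] blocks) {
        StringBuilder sb = new StringBuilder();
        for (String block : blocks) {
            sb.append(withTrailingNewline(block));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] blocks = { "Title", "First bullet\nSecond bullet" };
        System.out.print(join(blocks));
    }
}
```

Without this step, the last line of one block would run straight into the first line of the next when the blocks are concatenated.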


Copyright 2008 The Apache Software Foundation or its licensors, as applicable.