| Package | Description |
|---|---|
| org.opencms.search.extractors | Contains a generic, low-level framework for extraction of plain text content out of various popular file formats. |

| Modifier and Type | Class and Description |
|---|---|
| class | CmsExtractorHtml: Extracts the text from an HTML document. |
| class | CmsExtractorMsOfficeOLE2: Extracts text data from a VFS resource that is an OLE 2 MS Office document. |
| class | CmsExtractorMsOfficeOOXML: Extracts text data from a VFS resource that is an OOXML MS Office document. |
| class | CmsExtractorOpenOffice: Extracts the text from OpenOffice documents (.ods, .odf). |
| class | CmsExtractorPdf: Extracts the text from a PDF document. |
| class | CmsExtractorRtf: Extracts the text from an RTF document. |
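
To make the format coverage of the classes listed above easier to see at a glance, the following minimal sketch maps common file extensions to the name of the extractor class that handles that format. The `ExtractorLookup` class, its `extractorFor` method, and the extension table are hypothetical helpers written for illustration only; they are not part of the OpenCms API, and OpenCms itself may select extractors by other means (for example, configured document types). The extensions for the MS Office formats are an assumption based on the usual OLE 2 (`.doc`, `.xls`, `.ppt`) and OOXML (`.docx`, `.xlsx`, `.pptx`) file types.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

public class ExtractorLookup {

    // Hypothetical mapping for illustration only; OpenCms does not ship this table.
    // Values are the simple class names from org.opencms.search.extractors.
    private static final Map<String, String> EXTRACTOR_BY_EXTENSION = Map.ofEntries(
        Map.entry("html", "CmsExtractorHtml"),
        Map.entry("htm", "CmsExtractorHtml"),
        Map.entry("doc", "CmsExtractorMsOfficeOLE2"),
        Map.entry("xls", "CmsExtractorMsOfficeOLE2"),
        Map.entry("ppt", "CmsExtractorMsOfficeOLE2"),
        Map.entry("docx", "CmsExtractorMsOfficeOOXML"),
        Map.entry("xlsx", "CmsExtractorMsOfficeOOXML"),
        Map.entry("pptx", "CmsExtractorMsOfficeOOXML"),
        Map.entry("ods", "CmsExtractorOpenOffice"),
        Map.entry("odf", "CmsExtractorOpenOffice"),
        Map.entry("pdf", "CmsExtractorPdf"),
        Map.entry("rtf", "CmsExtractorRtf"));

    // Returns the extractor class name for a file name, or empty if the
    // file has no extension or the extension is not in the table.
    public static Optional<String> extractorFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return Optional.empty();
        }
        // Compare extensions case-insensitively.
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return Optional.ofNullable(EXTRACTOR_BY_EXTENSION.get(extension));
    }

    public static void main(String[] args) {
        System.out.println(extractorFor("report.PDF").orElse("none")); // prints "CmsExtractorPdf"
        System.out.println(extractorFor("notes.txt").orElse("none"));  // prints "none"
    }
}
```

Returning an `Optional` keeps the "no suitable extractor" case explicit, so a caller can decide whether to skip the file or fall back to some other handling.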