Wednesday, May 11, 2005

PDFTextStream

PDFTextStream is a Java class library that lets Java applications access the text content of PDF documents quickly, easily, and accurately. It is the first Java library to focus on extracting text from PDF files. It supports all versions of the PDF document specification, including v1.6 (used by Acrobat 7), and decrypts 40-bit and 128-bit encrypted documents. It provides access to all document metadata contained in a PDF file. Because it subclasses java.io.Reader, it can be dropped into existing code that consumes a Reader. Easy integration with Jakarta Lucene is included.
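The point about subclassing java.io.Reader can be illustrated with a minimal sketch. The code below is written against the standard java.io.Reader interface only. The class name ReaderText and the method readAll are hypothetical names chosen for this illustration, and a StringReader stands in for the PDF-backed reader, since the listing does not show how a PDFTextStream instance is constructed.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper for illustration; not part of PDFTextStream.
class ReaderText {

    // Drains any java.io.Reader into a String. Code written this way
    // accepts any Reader subclass without modification.
    static String readAll(Reader reader) throws IOException {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[4096];
        int count;
        // read() returns -1 once the end of the stream is reached.
        while ((count = reader.read(buffer)) != -1) {
            text.append(buffer, 0, count);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a StringReader stands in for a PDF-backed Reader here.
        try (Reader reader = new StringReader("text extracted from a PDF")) {
            System.out.println(readAll(reader));
        }
    }
}
```

Existing code shaped like readAll takes a Reader parameter, so in principle a Reader subclass that yields PDF text could be passed in place of the StringReader without changing the method.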

[Environment] Other Environment, Web Environment, Win32 (MS Windows)
[Intended Audience] Developers
[License] Other/Proprietary License with Free Trial
[Operating System] MacOS X, Microsoft :: Windows :: Cygwin, Microsoft :: Windows :: Windows NT/2000/XP, OS Independent, POSIX :: BSD, POSIX :: BSD :: BSD/OS, POSIX :: BSD :: FreeBSD, POSIX :: BSD :: NetBSD, POSIX :: BSD :: OpenBSD, POSIX :: HP-UX, POSIX :: Linux, POSIX :: SunOS/Solaris, Unix
[Programming Language] Java
[Topic] Information Management :: Document Repositories, Internet :: WWW/HTTP :: Indexing/Search, Software Development :: Libraries :: Java Libraries
