pdftotext(1)
!!!pdftotext

NAME
SYNOPSIS
DESCRIPTION
CONFIGURATION FILE
OPTIONS
BUGS
AUTHOR
SEE ALSO

----

!!NAME

pdftotext - Portable Document Format (PDF) to text converter (version 1.00)

!!SYNOPSIS

__pdftotext__ [[options] [[''PDF-file'' [[''text-file'']]

!!DESCRIPTION

__Pdftotext__ converts Portable Document Format (PDF) files to plain text.

Pdftotext reads the PDF file, ''PDF-file'', and writes a text file, ''text-file''. If ''text-file'' is not specified, pdftotext converts ''file.pdf'' to ''file.txt''. If ''text-file'' is '-', the text is sent to stdout.

!!CONFIGURATION FILE

Pdftotext reads a configuration file at startup. It first tries to find the user's private config file, ~/.xpdfrc. If that doesn't exist, it looks for a system-wide config file, typically /usr/local/etc/xpdfrc (but this location can be changed when pdftotext is built). See the xpdfrc(5) man page for details.

!!OPTIONS

Many of the following options can be set with configuration file commands. These are listed in square brackets with the description of the corresponding command line option.

__-f__ ''number''
Specifies the first page to convert.

__-l__ ''number''
Specifies the last page to convert.

__-raw__
Keep the text in content stream order. This is a hack which often

__-htmlmeta__
Generate a simple HTML file, including the meta information. This simply wraps the text in

__-enc__ ''encoding-name''
Sets the encoding to use for text output. The ''encoding-name'' must be defined with the unicodeMap command (see xpdfrc(5)). This defaults to [[config file: __textEncoding__]

__-eol__ ''unix | dos | mac''
Sets the end-of-line convention to use for text output. [[config file: __textEOL__]

__-opw__ ''password''
Specify the owner password for the PDF file. Providing this will bypass all security restrictions.

__-upw__ ''password''
Specify the user password for the PDF file.

__-q__
Don't print any messages or errors. [[config file: __errQuiet__]

__-cfg__ ''config-file''
Read ''config-file'' in place of ~/.xpdfrc or the system-wide config file.

__-v__
Print copyright and version information.

__-h__
Print usage information. (__-help__ and __--help__ are equivalent.)

!!BUGS

Some PDF files contain fonts whose encodings have been mangled beyond recognition. There is no way (short of OCR) to extract text from these files.

!!AUTHOR

The pdftotext software and documentation are copyright 1996-2002 Derek B. Noonburg (derekn@foolabs.com).

!!SEE ALSO

xpdf(1), pdftops(1), pdfinfo(1), pdffonts(1), pdftopbm(1), pdfimages(1), xpdfrc(5)

http://www.foolabs.com/xpdf/

----
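
The invocation described under SYNOPSIS and OPTIONS can be sketched from a script. The following is a minimal, non-authoritative Python sketch: the helper names `build_pdftotext_args` and `extract_text` are hypothetical and not part of xpdf. It only assembles the documented options (__-f__, __-l__, __-eol__, __-q__) and uses '-' as the ''text-file'' so that the text is sent to stdout. It assumes a pdftotext binary is installed and on PATH.

```python
import shutil
import subprocess


def build_pdftotext_args(pdf_file, text_file=None, first=None, last=None,
                         eol=None, quiet=False):
    """Assemble an argv list for pdftotext (hypothetical helper, not part of xpdf)."""
    args = ["pdftotext"]
    if first is not None:
        args += ["-f", str(first)]   # -f number: first page to convert
    if last is not None:
        args += ["-l", str(last)]    # -l number: last page to convert
    if eol is not None:
        if eol not in ("unix", "dos", "mac"):
            raise ValueError("eol must be 'unix', 'dos' or 'mac'")
        args += ["-eol", eol]        # -eol: end-of-line convention
    if quiet:
        args.append("-q")            # -q: don't print messages or errors
    args.append(pdf_file)
    if text_file is not None:
        args.append(text_file)       # '-' sends the text to stdout
    return args


def extract_text(pdf_file, **options):
    """Run pdftotext with '-' as the text file and return its stdout as bytes."""
    if shutil.which("pdftotext") is None:
        raise FileNotFoundError("pdftotext is not on PATH")
    args = build_pdftotext_args(pdf_file, text_file="-", **options)
    result = subprocess.run(args, check=True, capture_output=True)
    # The bytes are left undecoded: the output encoding depends on
    # -enc / textEncoding, which this sketch does not set.
    return result.stdout
```

For example, `build_pdftotext_args("file.pdf", "-", first=2, last=5)` corresponds to running `pdftotext -f 2 -l 5 file.pdf -`. Keeping the argument list separate from the call that runs the program makes it possible to inspect the command line without having pdftotext installed.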