version 2, including all changes.
pdftotext
!!!pdftotext
NAME
SYNOPSIS
DESCRIPTION
CONFIGURATION FILE
OPTIONS
BUGS
AUTHOR
SEE ALSO
----
!!NAME

pdftotext - Portable Document Format (PDF) to text converter (version 1.00)
!!SYNOPSIS

__pdftotext__ [[options] [[''PDF-file'' [[''text-file'']]
!!DESCRIPTION

__Pdftotext__ converts Portable Document Format (PDF) files to plain text.

Pdftotext reads the PDF file, ''PDF-file'', and writes a text file, ''text-file''. If ''text-file'' is not specified, pdftotext converts ''file.pdf'' to ''file.txt''. If ''text-file'' is '-', the text is sent to stdout.
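
The output-naming rules above can be sketched in Python. This is only an illustration, not pdftotext's own code: the helper name resolve_output and the file name report.pdf are hypothetical, and the handling of input names that do not end in .pdf (appending .txt) is an assumption that the text above does not state.

```python
import os


def resolve_output(pdf_file, text_file=None):
    """Return where the text would go: '<stdout>' or a file path.

    Hypothetical helper mirroring the rules described above.
    """
    if text_file == "-":
        # '-' as the text file means: write to standard output.
        return "<stdout>"
    if text_file is not None:
        # An explicit text-file name is used as given.
        return text_file
    # No text-file given: file.pdf becomes file.txt.
    root, ext = os.path.splitext(pdf_file)
    if ext.lower() == ".pdf":
        return root + ".txt"
    # Assumption: any other name simply gets '.txt' appended.
    return pdf_file + ".txt"


print(resolve_output("report.pdf"))             # report.txt
print(resolve_output("report.pdf", "out.txt"))  # out.txt
print(resolve_output("report.pdf", "-"))        # <stdout>
```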
!!CONFIGURATION FILE

Pdftotext reads a configuration file at startup. It first tries to find the user's private config file, ~/.xpdfrc. If that doesn't exist, it looks for a system-wide config file, typically /usr/local/etc/xpdfrc (but this location can be changed when pdftotext is built). See the xpdfrc(5) man page for details.
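
The lookup order described above (private file first, system-wide file as the fallback) might be modelled as follows. This is a sketch under stated assumptions: find_config is a hypothetical helper, and temporary files stand in for the real ~/.xpdfrc and /usr/local/etc/xpdfrc so that the example runs anywhere.

```python
import os
import tempfile


def find_config(user_path, system_path):
    """Return the first config file that exists, or None if neither does."""
    for candidate in (user_path, system_path):
        if os.path.isfile(candidate):
            return candidate
    return None


# Temporary stand-ins for the user's and the system-wide config files.
with tempfile.TemporaryDirectory() as tmp:
    user_rc = os.path.join(tmp, "user.xpdfrc")
    system_rc = os.path.join(tmp, "system.xpdfrc")

    # Only the system-wide file exists: it is the one that gets picked.
    open(system_rc, "w").close()
    only_system = find_config(user_rc, system_rc)

    # Once the private file exists too, it takes precedence.
    open(user_rc, "w").close()
    both_present = find_config(user_rc, system_rc)

print(os.path.basename(only_system))   # system.xpdfrc
print(os.path.basename(both_present))  # user.xpdfrc
```

Against a real installation the call would presumably be find_config(os.path.expanduser("~/.xpdfrc"), "/usr/local/etc/xpdfrc"), bearing in mind that the system-wide location can differ between builds.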
!!OPTIONS

Many of the following options can be set with configuration file commands. These are listed in square brackets with the description of the corresponding command line option.

__-f__ ''number''

Specifies the first page to convert.

__-l__ ''number''

Specifies the last page to convert.

__-raw__

Keep the text in content stream order. This is a hack which often

__-htmlmeta__

Generate a simple HTML file, including the meta information. This simply wraps the text in

__-enc__ ''encoding-name''

Sets the encoding to use for text output. The ''encoding-name'' must be defined with the unicodeMap command (see xpdfrc(5)). This defaults to __textEncoding__]

__-eol__ ''unix | dos | mac''

Sets the end-of-line convention to use for text output. [[config file: __textEOL__]

__-opw__ ''password''

Specify the owner password for the PDF file. Providing this will bypass all security restrictions.

__-upw__ ''password''

Specify the user password for the PDF file.

__-q__

Don't print any messages or errors. [[config file: __errQuiet__]

__-cfg__ ''config-file''

Read ''config-file'' in place of ~/.xpdfrc or the system-wide config file.

__-v__

Print copyright and version information.

__-h__

Print usage information. (__-help__ and __--help__ are equivalent.)
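
As an illustration of how the options above combine on a command line, the following Python sketch assembles an argument list and, optionally, runs the program. The helper names build_argv and extract_text and the file name manual.pdf are hypothetical. Only options documented above are used, and running extract_text assumes that a pdftotext binary is available on PATH.

```python
import subprocess

EOL_CHOICES = ("unix", "dos", "mac")


def build_argv(pdf_file, text_file="-", first=None, last=None,
               eol=None, quiet=False):
    """Build a pdftotext argument list from the options described above."""
    argv = ["pdftotext"]
    if first is not None:
        argv += ["-f", str(first)]   # first page to convert
    if last is not None:
        argv += ["-l", str(last)]    # last page to convert
    if eol is not None:
        if eol not in EOL_CHOICES:
            raise ValueError("eol must be one of: " + ", ".join(EOL_CHOICES))
        argv += ["-eol", eol]        # end-of-line convention
    if quiet:
        argv.append("-q")            # suppress messages and errors
    argv += [pdf_file, text_file]    # '-' sends the text to stdout
    return argv


def extract_text(pdf_file, **options):
    """Run pdftotext and return its standard output as bytes.

    Bytes are returned because the output encoding depends on -enc
    or on the configuration file.
    """
    result = subprocess.run(build_argv(pdf_file, "-", **options),
                            capture_output=True, check=True)
    return result.stdout


print(build_argv("manual.pdf", first=3, last=5, eol="unix", quiet=True))
# ['pdftotext', '-f', '3', '-l', '5', '-eol', 'unix', '-q', 'manual.pdf', '-']
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.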
!!BUGS

Some PDF files contain fonts whose encodings have been mangled beyond recognition. There is no way (short of OCR) to extract text from these files.
!!AUTHOR

The pdftotext software and documentation are copyright 1996-2002 Derek B. Noonburg (derekn@foolabs.com).
!!SEE ALSO

xpdf(1), pdftops(1), pdfinfo(1), pdffonts(1), pdftopbm(1), pdfimages(1), xpdfrc(5)
http://www.foolabs.com/xpdf/ |
---- |