Blame: pdftotext(1)
Annotated edit history of pdftotext(1) version 2, including all changes.
Rev Author # Line
1 perry 1 pdftotext
2 !!!pdftotext
3 NAME
4 SYNOPSIS
5 DESCRIPTION
6 CONFIGURATION FILE
7 OPTIONS
8 BUGS
9 AUTHOR
10 SEE ALSO
11 ----
12 !!NAME
13
14
15 pdftotext - Portable Document Format (PDF) to text converter (version 1.00)
16 !!SYNOPSIS
17
18
19 __pdftotext__ [[options] [[''PDF-file''
20 [[''text-file'']]
21 !!DESCRIPTION
22
23
24 __Pdftotext__ converts Portable Document Format (PDF)
25 files to plain text.
26
27
28 Pdftotext reads the PDF file, ''PDF-file'', and writes a
29 text file, ''text-file''. If ''text-file'' is not
30 specified, pdftotext converts ''file.pdf'' to
31 ''file.txt''. If ''text-file'' is '-', the text is sent
32 to stdout.
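The output-name rule above can be sketched in Python. This is only an illustration, not pdftotext's own code: the helper names `default_text_path` and `resolve_output` are hypothetical, and the handling of input names that do not end in `.pdf` is an assumption, since the page only describes the ''file.pdf'' case.

```python
import os


def default_text_path(pdf_path):
    """Return the text-file name used when none is given (file.pdf -> file.txt)."""
    root, ext = os.path.splitext(pdf_path)
    if ext.lower() == ".pdf":
        return root + ".txt"
    # Assumption: for any other name, append ".txt" to the full name.
    return pdf_path + ".txt"


def resolve_output(pdf_path, text_path=None):
    """Return ("stdout", None) when text_path is '-', else ("file", path)."""
    if text_path == "-":
        return ("stdout", None)
    return ("file", text_path or default_text_path(pdf_path))
```

For example, `resolve_output("report.pdf")` selects `report.txt`, while `resolve_output("report.pdf", "-")` selects stdout.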
33 !!CONFIGURATION FILE
34
35
36 Pdftotext reads a configuration file at startup. It first
37 tries to find the user's private config file, ~/.xpdfrc. If
38 that doesn't exist, it looks for a system-wide config file,
39 typically /usr/local/etc/xpdfrc (but this location can be
40 changed when pdftotext is built). See the xpdfrc(5)
41 man page for details.
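The lookup order described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `find_config` and the injectable `exists` parameter are hypothetical, and `/usr/local/etc/xpdfrc` is only the typical system-wide location, which may differ depending on how pdftotext was built.

```python
import os

# Typical system-wide location; the real path is fixed at build time.
SYSTEM_CONFIG = "/usr/local/etc/xpdfrc"


def find_config(home_dir, system_config=SYSTEM_CONFIG, exists=os.path.isfile):
    """Return the config file that would be chosen, or None if neither exists."""
    user_config = os.path.join(home_dir, ".xpdfrc")
    if exists(user_config):
        # The user's private config file takes priority.
        return user_config
    if exists(system_config):
        return system_config
    return None
```

Passing `exists` as a parameter keeps the sketch testable without touching the real file system.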
42 !!OPTIONS
43
44
45 Many of the following options can be set with configuration
46 file commands. These are listed in square brackets with the
47 description of the corresponding command line
48 option.
49
50
51 __-f__ ''number''
52
53
54 Specifies the first page to convert.
55
56
57 __-l__ ''number''
58
59
60 Specifies the last page to convert.
61
62
63 __-raw__
64
65
66 Keep the text in content stream order. This is a hack which
67 often
68
69
70 __-htmlmeta__
71
72
73 Generate a simple HTML file, including the meta information.
74 This simply wraps the text in
75
76
77 __-enc__ ''encoding-name''
78
79
80 Sets the encoding to use for text output. The
81 ''encoding-name'' must be defined with the unicodeMap
82 command (see xpdfrc(5)). This defaults to
83 [[config file: __textEncoding__]
84
85
86 __-eol__ ''unix | dos | mac''
87
88
89 Sets the end-of-line convention to use for text output.
90 [[config file: __textEOL__]
91
92
93 __-opw__ ''password''
94
95
96 Specify the owner password for the PDF file. Providing this
97 will bypass all security restrictions.
98
99
100 __-upw__ ''password''
101
102
103 Specify the user password for the PDF file.
104
105
106 __-q__
107
108
109 Don't print any messages or errors. [[config file:
110 __errQuiet__]
111
112
113 __-cfg__ ''config-file''
114
115
116 Read ''config-file'' in place of ~/.xpdfrc or the
117 system-wide config file.
118
119
120 __-v__
121
122
123 Print copyright and version information.
124
125
126 __-h__
127
128
129 Print usage information. (__-help__ and __--help__ are
130 equivalent.)
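As a rough illustration of how the options above combine on a command line, the following Python sketch assembles an argument list for pdftotext. The helper name `build_argv` and its parameters are hypothetical; only the flags themselves (__-f__, __-l__, __-enc__, __-eol__, __-q__) come from this page. The sketch builds the list and does not run pdftotext.

```python
def build_argv(pdf_file, text_file=None, first=None, last=None,
               enc=None, eol=None, quiet=False):
    """Build a pdftotext argument list from the options documented above."""
    argv = ["pdftotext"]
    if first is not None:
        argv += ["-f", str(first)]      # first page to convert
    if last is not None:
        argv += ["-l", str(last)]       # last page to convert
    if enc is not None:
        argv += ["-enc", enc]           # text output encoding
    if eol is not None:
        if eol not in ("unix", "dos", "mac"):
            raise ValueError("eol must be 'unix', 'dos' or 'mac'")
        argv += ["-eol", eol]           # end-of-line convention
    if quiet:
        argv.append("-q")               # suppress messages and errors
    argv.append(pdf_file)
    if text_file is not None:
        argv.append(text_file)          # '-' sends the text to stdout
    return argv
```

The resulting list could then be handed to `subprocess.run`, assuming pdftotext is installed and on the search path.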
131 !!BUGS
132
133
134 Some PDF files contain fonts whose encodings have been
135 mangled beyond recognition. There is no way (short of OCR)
136 to extract text from these files.
137 !!AUTHOR
138
139
140 The pdftotext software and documentation are copyright
141 1996-2002 Derek B. Noonburg
142 (derekn@foolabs.com).
143 !!SEE ALSO
144
145
146 xpdf(1), pdftops(1), pdfinfo(1),
147 pdffonts(1), pdftopbm(1), pdfimages(1),
148 xpdfrc(5)
2 DreadKnot 149 http://www.foolabs.com/xpdf/
1 perry 150 ----
This page is a man page (or other imported legacy content). We are unable to automatically determine the license status of this page.