Penguin

If you try to get a word count of a LaTeX source file with the shell command 'wc', the count will include any macros you have used, and any comments.

A better way is to use a tool called detex. It does a fairly good job of stripping out LaTeX macros and comments, although it's probably not usable as a general .tex -> readable ASCII filter (it gets the ordering of included files wrong, for starters). What it does do is automatically follow referenced LaTeX files (at least the ones pulled in with \input), so you don't have to add up the counts for each file by hand.
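As a minimal sketch of calling detex directly from the shell, the function below pipes its output through 'wc -w' so that only the word count is printed. The function name texwords and the DETEX variable are hypothetical; DETEX is only there so the binary can be pointed at a non-standard location, and it assumes detex is otherwise on your PATH.

```shell
# Count the words in a LaTeX file after stripping macros and comments.
# DETEX is a hypothetical override for the path to the detex binary;
# it defaults to whatever "detex" resolves to on the PATH.
texwords() {
    "${DETEX:-detex}" "$1" | wc -w
}
```

With this defined, 'texwords thesis.tex' would print a single number instead of the three columns that plain 'wc' produces.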

I added the following rules to my Makefile:

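.PHONY: wordcount

# Strip LaTeX markup with detex, then count lines, words and characters.
# The recipe line below must start with a tab character, not spaces.
wordcount: thesis.tex
	detex thesis.tex | wc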

Running 'make wordcount' now runs detex over my .tex file and pipes the output through the 'wc' program.

If I do a 'normal' wordcount:

$ cat thesis.tex | wc
   1419    7738   57862

Compare this to our new make target:

$ make wordcount
/home/dlawson/bin/detex thesis.tex | wc
   1009    5528   35563

A noticeable difference!
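To put a number on that difference, here is a small sketch using shell arithmetic. The two values are the word counts (second column of the 'wc' output) from the runs above; the variable names are just for illustration.

```shell
# Word counts taken from the second column of the wc output above.
raw_words=7738      # cat thesis.tex | wc
detex_words=5528    # make wordcount

# Words that were really macros or comments, and their share of the raw
# count (integer division, so the percentage is rounded down).
markup_words=$((raw_words - detex_words))
percent=$((markup_words * 100 / raw_words))

echo "$markup_words words of markup and comments (${percent}% of the raw count)"
```

So a bit more than a quarter of the 'normal' word count was not actual text.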


Alternatively, build a PDF from the .tex file, convert the PDF to ASCII with ps2ascii, and run the result through 'wc -w':

$ pdflatex report.tex
$ ps2ascii report.pdf | wc -w
   2003