Penguin

If you count the words in a LaTeX source file with the shell command 'wc', the total includes every macro and every comment in the file.
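To see the effect on a small scale, here is a minimal shell sketch. The three-line sample file is made up for illustration. It holds only three words of real text (Intro, Hello, world), but 'wc -w' splits on whitespace and knows nothing about LaTeX, so it counts the macros and the whole comment line as well.

```shell
# Write a tiny LaTeX sample (made-up content) to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
\section{Intro}
% a comment that is not part of the text
Hello \emph{world}.
EOF

# wc -w counts whitespace-separated tokens, so "\section{Intro}",
# "%" and every word of the comment are all counted.
# tr strips the padding some wc implementations add.
raw_words=$(wc -w < "$sample" | tr -d ' ')
echo "raw word count: $raw_words"   # prints "raw word count: 13"

rm -f "$sample"
```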

A better way is to use a tool called detex. It does a fairly good job of stripping out LaTeX macros and comments, although it is probably not usable as a general .tex -> readable ASCII filter (it gets the ordering of included files wrong, for starters).

I added the following rules to my Makefile:

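# 'wordcount' is a command to run, not a file to build.
.PHONY: wordcount

# Strip the LaTeX markup with detex, then count lines, words and characters.
# The recipe line below must start with a tab character, not spaces.
wordcount: thesis.tex
	detex thesis.tex | wc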

Running 'make wordcount' now runs detex over my .tex file and pipes the output through the 'wc' program.
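Outside of make, the same pipeline can be wrapped in a small shell function. This is only a sketch: the function name count_words and the DETEX override variable are my own inventions, not part of detex. The override simply makes it possible to point at a detex binary that is not on the PATH.

```shell
# count_words FILE
# Prints the number of words left in FILE after detex has stripped the
# LaTeX markup. Set DETEX to use a specific binary; by default "detex"
# is looked up on the PATH. (count_words and DETEX are hypothetical names.)
count_words() {
    "${DETEX:-detex}" "$1" | wc -w | tr -d ' '
}

# Usage:
#   count_words thesis.tex
#   DETEX="$HOME/bin/detex" count_words thesis.tex
```

Unlike the Makefile rule, this uses 'wc -w', so it prints only the word count rather than lines, words and characters.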

If I do a 'normal' word count on the raw file:

$ cat thesis.tex | wc
   1419    7738   57862

Compare this to our new make target:

$ make wordcount
/home/dlawson/bin/detex thesis.tex | wc
   1009    5528   35563

A noticeable difference! The word count (the middle column of the wc output) drops from 7738 to 5528.
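To put a number on it, the arithmetic can be done in the shell. The two counts are copied from the runs above; the variable names are just for illustration, and the percentage is rounded down because shell arithmetic is integer-only.

```shell
# Word counts copied from the two runs above.
raw=7738        # plain wc on thesis.tex
stripped=5528   # wc after detex

# Words that detex removed (macros, comments and the like).
removed=$((raw - stripped))

# Share of the raw count that was removed, as a whole percentage (rounded down).
percent=$((removed * 100 / raw))

echo "$removed of $raw words ($percent%) were removed by detex"
# prints "2210 of 7738 words (28%) were removed by detex"
```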


Alternatively, typeset the document, convert the resulting PDF to ASCII with ps2ascii, and count the words with 'wc -w':

$ pdflatex report.tex
$ ps2ascii report.pdf | wc -w
   2003