Penguin

This Perl script converts all the worksheets in a MicrosoftExcel SpreadSheet to individual CSV files.

The following modules are required:


#!/usr/bin/perl

=head1 NAME

excel2csv -- dump all worksheets from an Excel file to CSV files

=head1 SYNOPSIS

F<excel2csv>
S<B<-h>>

F<excel2csv>
S<B<--man>>

F<rename>
S<B<[ -s ]>>
S<B<[ -S ]>>
S<B<[ -v ]>>
S<B<file.xls [ F<file2.xls> F<file3.xls> ... ]>>

=head1 DESCRIPTION

C<excel2csv> takes any number of Excel files on the command line, reads them, and dumps each worksheet therein to a separate CSV file, named after the worksheet.

=head1 ARGUMENTS

=over 4

=item B<-h>, B<--help>

See a synopsis.

=item B<--man>

Browse the manpage.

=back

=head1 OPTIONS

=over 4

=item B<-s>, B<--suffix>

The suffix for the CSV files generated. Default is C<.csv>.

=item B<-S>, B<--separator>

The field separator used in the generated CSV files. Default is C<;>.

=item B<-v>, B<--verbose>

Print additional information about the operations (not) executed.

=back

=head1 BUGS

CSV generation is fugded -- no quoting issues are taken into consideration. CSV generation should rely on one of the modules available for the purpose instead.

=head1 AUTHORS

Drew Broadley, with contributions from Aristotle Pagaltzis

=head1 COPYRIGHT

FIXME FIXME FIXME FIXME FIXME FIXME FIXME FIXME

Drew Broadley has not clarified the copyright on this code.

FIXME FIXME FIXME FIXME FIXME FIXME FIXME FIXME

=cut

use strict;
use warnings;

use Spreadsheet::ParseExcel;
use Getopt::Long 2.24, qw(:config bundling no_ignore_case no_auto_abbrev);
use Pod::Usage;

Getopt::Long::GetOptions(
    'h|help'     => sub { pod2usage( -verbose => 1 ) },
    'man'        => sub { pod2usage( -verbose => 2 ) },
    's|suffix=s' => \(my $opt_suffix = '.csv'),
    'S|sep=s'    => \(my $opt_separator = ';'),
    'v|verbose'  => \(my $opt_verbose),
) or pod2usage();

my $excel = Spreadsheet::ParseExcel->new();
for (@ARGV) {
    my $book = $excel->Parse($_);

    my $last_sheet = $book->{SheetCount} - 1;

    for my $worksheet ( @{ $book->{Worksheet} }[ 0 .. $last_sheet ] ) {

        next
            if not defined $worksheet->{MaxRow}
            or not defined $worksheet->{MaxCol};

        print $worksheet->{Name} . "\n" if $opt_verbose;

        my $filename = $worksheet->{Name} . $opt_suffix;
        open my $fh, ">", $filename
            or die "Can open $filename to write: $!\n";

        my @row = $worksheet->{MinRow} .. $worksheet->{MaxRow};
        my @col = $worksheet->{MinCol} .. $worksheet->{MaxCol};

        for my $row ( @{ $worksheet->{Cells} }[ @row ] ) {
            my @cellvalue = map {
                $_ = $_->Value() if ref $_;
                $_ = '' if not defined $_;
                $_;
            } @$row[ @col ];
            print $fh join($opt_separator, @cellvalue), "\n";
        }
    }
}