OxfordPerl

Sep 29, 2014

I was hoping to contribute to some documentation for a project that’s especially near to my heart. One of its biggest issues was inconsistent comma usage, so I wanted to remedy that. To aid me, I dipped my toes into Perl and regular expressions to create OxfordPerl, a simple script that uses a regular expression to find sentences that should have an Oxford (AKA serial) comma but don’t.

The script performs adequately: it rarely misses a spot that should have a serial comma, but it is often fooled by other comma-delimited constructions, such as parentheticals. For the most part, though, it does the job: in 448 KB of text, it reported two genuine issues and five false positives. Not great, but it does what I need.

Check it out on GitHub; the source is also below.

#!/usr/bin/perl
use strict;
use warnings;

# OxfordPerl.pl: Search for lines lacking an Oxford comma

my ($filename) = @ARGV;   # Get filename
die "Usage: $0 <filename>\n" unless defined $filename;
my $linecount = 0;

# Create a filehandle
open my $fh, '<', $filename or die "Could not open file $filename for reading: $!\n";

# Loop through file by line
while (my $line = <$fh>) { 
	$linecount++;

	# Credit to Terry Woods at http://www.experts-exchange.com/Programming/Languages/Regular_Expressions/Q_27601985.html
	if ($line =~ /((?:[\w'-]+,\s+)+(?:[\w'-]+\s){0,2}[\w'-]+)(\s+and\s+[\w'-]+)/) {
		print "$filename, line $linecount: \"$1$2\" -> \"$1,$2\"\n";
	}
}
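
To see both the hits and the false positives in action, here is a minimal sketch that applies the same pattern to three made-up sentences. The helper name `suggest_fix` and the sample sentences are my own illustrations, not part of OxfordPerl itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same pattern as in the script above, precompiled so it can be reused.
my $oxford = qr/((?:[\w'-]+,\s+)+(?:[\w'-]+\s){0,2}[\w'-]+)(\s+and\s+[\w'-]+)/;

# Hypothetical helper (not in the original script): returns the suggested
# rewrite of the matched fragment, or undef if the pattern does not match.
sub suggest_fix {
    my ($line) = @_;
    return $line =~ $oxford ? "$1,$2" : undef;
}

# Made-up sample sentences, one per case.
my @samples = (
    "We bought apples, pears and plums.",                   # genuine miss
    "We bought apples, pears, and plums.",                  # already correct
    "My editor, a patient woman and a friend, disagreed.",  # parenthetical
);

for my $sentence (@samples) {
    my $fix = suggest_fix($sentence);
    print "$sentence\n    => ", (defined $fix ? "\"$fix\"" : "(no match)"), "\n";
}
```

The first sentence is flagged with the suggestion "apples, pears, and plums", and the second is left alone. The third is the kind of false positive mentioned earlier: the pattern treats "editor, a patient woman and a" as a list and suggests "editor, a patient woman, and a". The pattern only looks at words and commas, so it cannot tell a list from a parenthetical.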