Filtering Columns · Dec 12, 10:17 PM by Dylan Doxey

I love grep, sed, and especially awk.

However, I find that I generally only use awk for one thing -- filtering by columns.

Suppose I want to see the login names and their home directories for the users on my system.

cat /etc/passwd | awk -F: '{print $1" "$6}'
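For anyone who doesn't read awk daily, here is a small sketch of what that one-liner does, run against a single made-up line instead of the real file. The user alice and the sample variable are assumptions for illustration only; they are not taken from any actual system.

```shell
# A sample line in /etc/passwd format (hypothetical user, not read from the system)
sample='alice:x:1000:1000:Alice Example:/home/alice:/bin/bash'

# -F: sets the field separator to a colon.
# $1 is the first field (login name), $6 the sixth (home directory).
printf '%s\n' "$sample" | awk -F: '{print $1" "$6}'
# prints: alice /home/alice
```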

Beautiful. Works perfectly. Who could ask for more?

Me.

I really like quotation marks and curly brackets. I really do. They make code look awesome. They're also good exercise for my keyboarding skills. But they're a little awkward to type, over and over again.

Yes, it's a little tiring typing out all that awesome-looking code. Sometimes I wish my command-lining around could go just a little more efficiently.

So, how about we keep the code written in modules and scripts really awesome-looking, with sweet punctuation marks and stuff, and let the code we're slinging at the command prompt be a little more streamlined?

Here's what I have in mind as an alternative to the sweet awk program above.

cat /etc/passwd | cols -F: 1 6

Yep, I guess that completes the design phase of this project.


I think it should go a little something like this.

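#!/usr/bin/perl

use strict;
use warnings;

my $usage = "\nUsage:\n  \$ $0 [--first|--last|n|...] [-Fx]\n\n"
    . "Where n is the 1-indexed column number (negative n counts from\n"
    . "the end of the line) and x is some character (or Perl regex)\n"
    . "to split on.\n"
    . "You may specify multiple columns.\n"
    . "If not specified, n is 1 and x is \\s+.\n\n";

my $split_regex;
my @cols;

# Test with defined() so that a false-looking argument cannot end the loop early.
ARG:
while ( defined( my $arg = shift @ARGV ) ) {

    # Accept both -Fx and -F x
    if ( $arg =~ m/\A -F \s* (.+)? \z/msx ) {
        my $pattern = defined $1 ? $1 : shift @ARGV;
        if ( !defined $pattern ) {
            print $usage;
            exit;
        }
        $split_regex = qr/$pattern/;
        next ARG;
    }

    # --first and --last are aliases for columns 1 and -1
    $arg = 1  if $arg eq '--first';
    $arg = -1 if $arg eq '--last';

    # --help, column 0, and anything that is not an integer print the usage
    if ( $arg eq '--help' || $arg !~ m/\A -? \d+ \z/msx || $arg == 0 ) {
        print $usage;
        exit;
    }

    # Reduce positive columns to a zero index; negative columns already
    # count from the end the way Perl slices expect
    $arg-- if $arg > 0;

    push @cols, $arg;
}

# Default split: runs of whitespace, ignoring leading whitespace (as awk does)
my $default_split = !defined $split_regex;
$split_regex = qr/\s+/
    if $default_split;

# Default index
@cols = (0)
    if !@cols;

# Do the work, one line at a time
while ( my $line = <STDIN> ) {

    chomp $line;
    $line =~ s/\A \s+//msx
        if $default_split;

    # Columns missing from a short line print as empty strings
    my @fields = split $split_regex, $line;
    print join( ' ', map { defined $_ ? $_ : '' } @fields[@cols] ) . "\n";
}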
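For comparison, the same command-line interface can also be sketched as a thin shell wrapper around awk. This is only an illustration of how 1-indexed and negative column numbers map onto awk fields, not the script above: the name cols_awk is made up, it only understands the attached -Fx form, and it assumes a POSIX shell and awk.

```shell
# cols_awk: hypothetical awk-backed sketch of the cols interface.
# Usage: cols_awk [-Fx] [--first|--last|n ...]
# The body runs in a subshell, so sep and list do not leak into the caller.
cols_awk() (
    sep=''
    list=''
    for arg in "$@"; do
        case $arg in
            -F?*)    sep=${arg#-F} ;;        # separator attached to -F
            --first) list="$list 1" ;;
            --last)  list="$list -1" ;;
            *)       list="$list $arg" ;;    # assumed to be an integer
        esac
    done
    [ -n "$list" ] || list=1                 # default to the first column

    awk ${sep:+-F"$sep"} -v list="$list" '
        BEGIN { n = split(list, want, " ") }
        {
            out = ""
            for (i = 1; i <= n; i++) {
                c = want[i] + 0
                if (c < 0) c = NF + 1 + c    # -1 means the last field
                out = out (i > 1 ? " " : "") (c >= 1 ? $c : "")
            }
            print out
        }'
)

printf 'alice:x:1000:1000:Alice Example:/home/alice:/bin/bash\n' | cols_awk -F: 1 6
# prints: alice /home/alice
```

The wrapper leaves the field splitting to awk and only translates the column list, which is why a negative column has to be rewritten in terms of NF.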

Use the force!
