No Head Line in grep search
When I want to find a process, I normally use ps with grep.
sh> ps aux | grep fish
myoungj+ 695 0.0 0.0 88596 7000 tty2 S+ 09:17 0:00 -/usr/bin/fish -c /usr/bin/gnome-session -l
myoungj+ 2490 0.0 0.1 164660 10140 pts/1 Ss 09:21 0:00 -fish
myoungj+ 2665 0.0 0.1 172848 10076 pts/2 Ss+ 09:24 0:00 -fish
myoungj+ 2781 0.0 0.1 172724 9712 pts/0 Ss+ 09:27 0:00 -fish
myoungj+ 3024 0.0 0.1 164528 9552 pts/3 Ss+ 09:32 0:00 -fish
myoungj+ 4709 0.0 0.0 9136 2692 pts/5 S+ 10:00 0:00 grep --color=auto fish
A header line is helpful
However, without the header line I sometimes wonder what each column actually means. I thought it would be nicer if I could see the header line of ps along with the search results:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
myoungj+ 695 0.0 0.0 88596 7000 tty2 S+ 09:17 0:00 -/usr/bin/fish -c /usr/bin/gnome-session -l
myoungj+ 2490 0.0 0.1 164660 10140 pts/1 Ss 09:21 0:00 -fish
.. snip ..
Quick Solution for single use
awk or sed could be useful here if you don’t need any other feature from grep.
sh> ps aux | awk 'NR == 1 || /fish/ { print; }'
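The rule in the awk one-liner (print line 1, or any line that matches) can be sketched in Python as below. This is only an illustration: the function name header_and_matches is made up for this post, and the pattern uses Python's re syntax, which is close to but not the same as awk's.

```python
import re


def header_and_matches(lines, pattern):
    """Keep the first line plus every later line that matches `pattern`.

    Hypothetical helper mirroring awk's `NR == 1 || /pattern/` rule.
    """
    kept = []
    for number, line in enumerate(lines, start=1):
        if number == 1 or re.search(pattern, line):
            kept.append(line)
    return kept


sample = [
    "USER PID COMMAND",
    "me 2490 -fish",
    "me 4711 bash",
]
for line in header_and_matches(sample, "fish"):
    print(line)
```

The header survives even though it does not contain the search string, which is the whole point of the exercise.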
But I think grep is a more powerful tool.
In bash, it looks straightforward to me.
bash> ps aux | { read line; echo "$line"; grep 'fish'; }
or using sub-shell.
bash> ps aux | ( read line; echo "$line"; grep 'fish'; )
or in fish shell (a little longer)
fish> ps aux | begin read -l line; echo "$line"; grep 'fish'; end
Still, I think bash is better than fish for one-liners.
More Serious Approach
But those one-liners are not very friendly. IMHO, every programme should at least offer a simple usage. So I decided to go a little deeper.
Fish shell solution
The latest version is on my GitHub: hgrep.fish
The basic options are below:
- -h|--help : print a help message and exit
- -C|--context : passed through to grep as its context option, which is useful when we need to see the lines around a match.
#!/usr/bin/env fish

# note: no '=' here; `set -l PROG = ...` would store '=' as a list element
set -l PROG 'hgrep.fish'

# ref: https://fishshell.com/docs/current/cmds/argparse.html#cmd-argparse
set -l options 'C/context=' 'h/help'

# -S (--no-scope-shadowing) lets the function see the script-local $PROG
function usage -S -d "basic usage for $PROG"
    echo "Usage: $PROG [-C|--context context] <SEARCH> [<INPUT PATH>]"
end

# parse args here; argparse prints its own message on failure
argparse $options -- $argv
or exit 1

if set -q _flag_help
    usage
    exit 0
end

# note: processed arguments are removed from $argv
set -l argc (count $argv)
if test $argc -ne 1 -a $argc -ne 2
    usage
    exit 1
end

set -l search_string $argv[1] # first argument
set -l input_path /dev/stdin
if test $argc -gt 1
    # <INPUT PATH> is specified
    set input_path $argv[-1]
end

set -l grep_options -i
if set -q _flag_context
    set --append grep_options '-C' $_flag_context
end
set --append grep_options $search_string

begin
    # print head first
    read -l line
    echo "$line"
    # let 'grep' do the rest
    exec grep $grep_options
end < $input_path
I had already used the begin .. end < $input_path pattern when I made fish-pandoc-any-to-markdown, so I found this version a bit easier than the others.
Perl Solution
My perl solution was written a very long time ago, and I’m happy to see that it still works. The basic routine is the same, except that it has one more option, --nohead, which is not necessary. I think I just wanted to check how OptArgs worked at that time.
I realized today that the routine from the fish version applies here as well:
- read one line from input and print to stdout
- exec to grep with option
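The two steps above can be sketched in Python. One detail matters when the process is about to exec grep: the header has to be read without buffering, otherwise the reader may swallow bytes that grep should see. The names read_header and main are assumptions for this sketch, and reading one byte at a time is slow but simple.

```python
import os


def read_header(fd):
    """Read from file descriptor `fd` up to and including the first newline.

    Reads one byte at a time so that nothing after the header is consumed.
    """
    header = bytearray()
    while True:
        byte = os.read(fd, 1)
        if not byte:          # end of input before any newline
            break
        header += byte
        if byte == b"\n":
            break
    return bytes(header)


def main(pattern):
    # Step 1: read one line from stdin (fd 0) and print it to stdout (fd 1).
    os.write(1, read_header(0))
    # Step 2: replace this process with grep, which reads the rest of stdin.
    os.execvp("grep", ["grep", "-i", pattern])
```

Calling main('fish') at the end of a script should then behave roughly like the one-liners when the script sits at the end of a pipe.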
Nevertheless, I believe it was worth learning!
parsing options in perl
Thanks to the OptArgs module, I could handle options conveniently and in a more structured way.
(However, I think it is a little heavier than python’s argparse.)
The latest version is on my GitHub: hgrep.pl
#!/usr/bin/env perl
# -*- Mode: cperl; cperl-indent-level:4; tab-width: 8; indent-tabs-mode: nil -*-
# -*- coding: utf-8 -*-
# vim: set tabstop=8 expandtab:
use strict; use warnings;
use OptArgs; # https://metacpan.org/dist/OptArgs/view/bin/optargs

my @grep_options = qw(-i);
# colourise matches unless the terminal is dumb
push @grep_options, '--color=auto'
    unless ( $ENV{'TERM'} // '' ) =~ /dumb/;
# ref: https://metacpan.org/pod/OptArgs
## option parts ...
opt context =>
    ( isa     => 'Num',
      alias   => 'C',
      default => 3,
      comment => 'print NUM lines of output context' );

opt help =>
    ( isa     => 'Bool',
      alias   => 'h',
      comment => 'print a help message and exit',
      ishelp  => 1 );

# argument parts ...
arg search =>
    ( isa      => 'Str',
      required => 1,
      comment  => 'string to search from file' );

arg file_name =>
    ( isa     => 'Str',
      default => '-', # default input from stdin
      comment => 'the file which we search from' );

# parsing options via optargs function!
my $opts = optargs;
Next, process the parsed arguments and open a file (or stdin):
if ( defined $opts->{'context'} and $opts->{'context'} > 0 ) {
    push @grep_options, '-C', $opts->{'context'};
}

my $fh;
if ( $opts->{'file_name'} ne '-' ) {
    # reuse the outer $fh; a second `my` here would shadow it
    open $fh, '<', $opts->{'file_name'}
        or die "Can't open `$opts->{file_name}': $!";
} else {
    # http://perldoc.perl.org/functions/open.html
    open( $fh, "<&=", *STDIN );
}

if ( not $opts->{nohead} ) {
    my $head = <$fh>;
    # FIXME: colourising ....
    print "$head";
}

my $to_gh;
requirement for system programming
When I tried to go further, I found that I needed a little more system programming underneath, which the shell normally does for me.
To communicate with the grep process, we need to open a pipe via the open function.
my $grep_pid = open( $to_gh, '|-' );
if ( not defined $grep_pid ) {
    die "Can't fork: $!";
}
'|-' means creating a pipe and forking implicitly at the same time, so now we have two processes: when the parent writes into the new handle $to_gh, the child reads it from its stdin.
In shell terms, it looks like this at the moment:
sh> parent_perl <some options ...> | child_perl
i.e. parent_perl and child_perl now communicate through a pipe (|), and the child_perl process will be replaced with the grep process via exec.
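What the '|-' form of open does can be approximated in Python with os.pipe() and os.fork(). The sketch below is an illustration under several assumptions, not a translation of hgrep.pl: the function name is hypothetical, the child only counts the lines it receives instead of exec'ing grep, and it reports the count through its exit status (so it is only meaningful for fewer than 255 lines, on Unix-like systems).

```python
import os


def count_lines_in_child(lines):
    """Send `lines` through a pipe to a forked child and return its count."""
    r, w = os.pipe()
    pid = os.fork()

    if pid == 0:
        # child: fork() returned 0
        count = 255                      # reported if anything goes wrong
        try:
            os.close(w)                  # the child only reads
            with os.fdopen(r) as reader:
                count = sum(1 for _ in reader)
        finally:
            os._exit(count)              # leave without running parent cleanup

    # parent: fork() returned the child's pid
    os.close(r)                          # the parent only writes
    with os.fdopen(w, "w") as writer:
        writer.writelines(lines)
    # the write end is closed now, so the child sees end-of-file
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


print(count_lines_in_child(["one\n", "two\n", "three\n"]))
```

The branch on the return value of fork() plays the same role as checking $grep_pid in the perl script.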
There is a simple way to tell whether we are in the parent_perl process or the child_perl process: check the $grep_pid value.
if ( $grep_pid ) {
    # if grep_pid is not zero, this is parent_perl (parent process),
    # which handles both file handles.
    while ( <$fh> ) { print $to_gh $_; }
    close $_ for $to_gh, $fh;
    # the parent process has to wait until its child process has finished.
    waitpid $grep_pid, 0;
} else {
    # otherwise, this is child_perl (child process)
    close $fh; # not used in child process
    exec 'grep', @grep_options, $opts->{'search'};
}

exit 0;
Finally, exec 'grep' ... replaces the child perl process itself with the grep process.
No process can be created without a parent.
I found it worthwhile to try to understand basic system programming in perl. However, a shell script handles this much more easily.
Python Solution (as a newbie)
How about python? I think the same logic can be applied in python as well. However, I had not had a chance to write a python script before, so I didn’t define any functions and wrote it as simply as possible. BTW, I only have python version 3.
credit:
- os pipe: https://www.tutorialspoint.com/python/os_pipe.htm
- for loop: https://realpython.com/python-for-loop/
- file i/o: https://www.w3schools.com/python/python_file_open.asp
- optparse: https://docs.python.org/3/library/optparse.html
- execvp: https://docs.python.org/3/library/os.html?highlight=popen#os.execvp
- waitpid: https://docs.python.org/3/library/os.html#os.waitpid
I followed a similar pattern to the one I used in perl. You can find the latest version on my GitHub: hgrep.py
#!/usr/bin/env python3
import os, sys
import argparse

# handle options first
parser = argparse.ArgumentParser()#prog="hgrep.py")
parser.add_argument( "-C", "--context",
                     nargs = 1,
                     type = int,
                     dest = "context",
                     required = False,
                     help = "print NUM lines of output context" )
parser.add_argument( "search",
                     # upper case in the help message
                     metavar = "<SEARCH>",
                     help = "string to search from <file_path>" )
parser.add_argument( "file_path",
                     # upper case in the help message
                     metavar = "[<FILE PATH>]",
                     default = '-',
                     help = "<file_path> to search" )

# case insensitive search
grep_options = [ '-i' ]

# highlighting (lower() has to be called, and TERM may be unset)
if os.environ.get( 'TERM', '' ).lower() != 'dumb':
    grep_options.append( "--color=auto" )
I could not find a way to make the argparse module handle an optional positional argument (it does support one through nargs='?').
An optional positional argument is natural in grep, so I wanted to keep that behaviour with a workaround.
# WORKAROUND: make <file_path> optional by filling in the default
argv = sys.argv[1::]
if len(argv) == 0:
    print( "{prog}: No argument given".format( prog = sys.argv[0] ),
           file = sys.stderr )
    parser.print_help()
    exit( 1 )

if len(argv) == 1:
    # user omitted the input file path
    # default : - (stdin)
    argv.append( '-' )

args = parser.parse_args( argv )

# check more grep options
# (nargs = 1 stores a one-element list, and execvp() needs strings)
if args.context is not None and args.context[0] > 0:
    grep_options.extend( [ '-C', str( args.context[0] ) ] )

grep_options.append( args.search )
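For comparison, argparse can express the optional file path directly with nargs="?", so that the default is used when the positional argument is omitted. A minimal sketch follows; the build_parser name and the help texts are assumptions of this example, not what hgrep.py currently does.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="hgrep.py")
    parser.add_argument("-C", "--context", type=int,
                        help="print NUM lines of output context")
    parser.add_argument("search", metavar="<SEARCH>",
                        help="string to search for")
    # nargs="?" makes the positional optional; `default` applies when omitted
    parser.add_argument("file_path", metavar="<FILE PATH>", nargs="?",
                        default="-", help="file to search (default: stdin)")
    return parser


args = build_parser().parse_args(["-C", "2", "fish"])
print(args.context, args.search, args.file_path)   # prints: 2 fish -
```

Unlike the length check on sys.argv, this also copes with the case where -C is given but the file path is not.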
I don’t really know python well, but I guess I picked the very low-level pipe() function in python.
# and let's go for plumbing
# file descriptors r,w for reading and writing
r, w = os.pipe()

if args.file_path == "-":
    # from stdin
    file_to_read = sys.stdin
else:
    # or open file path to read
    if os.path.isfile( args.file_path ):
        file_to_read = open( args.file_path, "r" )
    else:
        print( "A file path:({fp}) is not readable"
               .format( fp = args.file_path ),
               file = sys.stderr )
        exit( 2 )

# read head first and print into stdout directly
# (the line already ends with a newline, so do not add another one)
print( file_to_read.readline(), end = '', file = sys.stdout, flush = True )
# fork() will create a child process
# and we can distinguish which one is parent process by checking
# return value
grep_pid = os.fork()

if grep_pid:
    # parent process
    # to communicate with the child process,
    # the writing file descriptor will be used
    os.close( r )
    os.dup2( w, sys.stdout.fileno() )
    for line in file_to_read:
        # each line already ends with a newline
        print( line, end = '' )
    sys.stdout.flush()
    # It is good practice to close all the open files
    os.close( w )
    # safely waiting for children processes
    os.waitpid( grep_pid,
                # if child process status not available: no wait
                os.WNOHANG )
else:
    # child process
    os.dup2( r, sys.stdin.fileno() )
    # the child process only requires 'r' as stdin (plus stdout),
    # so close both r and w here.
    os.close( r )
    os.close( w )
    # the argument list must start with the program name (argv[0])
    os.execvp( 'grep', [ 'grep' ] + grep_options )

exit( 0 )
Where I found difficulty
os.dup2 is essential for communicating with grep in the child process, because grep only cares about stdin here, and there is no way to tell it that the parent has opened new file descriptors (r, w). So we have to re-bind the new file descriptor to stdin ourselves.
TBH, I spent too much time on this because of my lack of knowledge about system programming.
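The re-binding itself can be tried in isolation. In the sketch below (hypothetical function name, small inputs only so that the write to the pipe does not block) a descriptor opened on /dev/null stands in for stdin, so the real stdin of the running process is left alone; after os.dup2 that descriptor reads from the pipe.

```python
import os


def read_after_rebinding(data):
    """Show that after dup2(r, target), reading `target` reads from the pipe."""
    r, w = os.pipe()
    os.write(w, data)
    os.close(w)

    # stand-in for stdin, so fd 0 of this process is not touched
    target = os.open(os.devnull, os.O_RDONLY)
    os.dup2(r, target)      # `target` now refers to the pipe's read end
    os.close(r)             # the original descriptor is no longer needed
    try:
        return os.read(target, len(data))
    finally:
        os.close(target)


print(read_after_rebinding(b"-fish\n"))
```

In hgrep.py the target is sys.stdin.fileno() in the child, which is why grep can keep reading plain stdin without knowing about the pipe.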
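Also, os.waitpid needed the os.WNOHANG option value. I thought the value would be 0, which it is not, so my programme hung instead of exiting once grep had printed its results. With 0, waitpid blocks until the child exits, and grep cannot exit while the parent still holds a write end of the pipe open (its stdout, after dup2), because grep never sees end-of-file. os.WNOHANG avoids the hang by returning immediately instead of waiting.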
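For comparison, Python's subprocess module wraps the pipe, fork, dup2 and exec steps. A minimal sketch, assuming grep is on PATH; the function name grep_body is hypothetical. communicate() writes the input, closes the child's stdin so that grep sees end-of-file, and only then waits for it, which is the order that avoids the hang.

```python
import subprocess


def grep_body(header, body, pattern):
    """Return `header` followed by the lines of `body` that grep matches."""
    proc = subprocess.Popen(
        ["grep", "-i", "--", pattern],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # write everything, close stdin (end-of-file for grep), then wait
    matched, _ = proc.communicate(body)
    return header + matched


print(grep_body("USER PID COMMAND\n", "me 1 -fish\nme 2 bash\n", "FISH"), end="")
```

Capturing grep's output like this gives up the streaming behaviour of the fork version, so it is a trade-off rather than a drop-in replacement.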
Wrapping Up
pipe and shell’s power
- Even though it was a good chance to learn about basic pipe usage, shell script is already very powerful for basic communication between two processes.
- perl’s old open function’s arguments are a little bit hacky.
parsing option is easier with modules
I also tried adding options and testing them.
- fish’s argparse is relatively new, and it is useful for my cases.
- Perl’s OptArgs has more features and handles optional arguments as well. However, it is a little bit slower than python’s.
- python’s argparse has a good type system for checking data types and is performant. I did not find its support for optional positional arguments (nargs='?'), so I applied a workaround.
Suggestion after post
It would be nicer to have an option for case-sensitive search, because I made the search case-insensitive by default.
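One way such an option could look in the python version is sketched below. The --case-sensitive flag and the grep_options_from function are hypothetical names for this example; none of the three scripts has them yet.

```python
import argparse


def grep_options_from(argv):
    """Build grep's option list; -i is dropped when --case-sensitive is given."""
    parser = argparse.ArgumentParser(prog="hgrep.py")
    parser.add_argument("--case-sensitive", action="store_true",
                        help="match case exactly (hypothetical flag)")
    parser.add_argument("search")
    args = parser.parse_args(argv)

    options = [] if args.case_sensitive else ["-i"]
    options.append(args.search)
    return options


print(grep_options_from(["fish"]))
print(grep_options_from(["--case-sensitive", "Fish"]))
```

Keeping -i as the default means existing invocations behave as before.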
After making fish-pandoc-any-to-markdown and hgrep, I realized that all I need is a programme that pre-processes the input and lets another application handle the rest. So it could become a more general programme like the one below:
sh> ps aux | head-with get-one-line --tail-with grep -i /fish/
# or in fish-pandoc-any-to-markdown
sh> cat some.org | head-with retrieve-metadata --tail-with pandoc -t markdown > some.md
Well… but not for today. Maybe after I get more chances to use similar patterns!
Thank you for reading! and Happy coding!