This is an archived cached-text copy of the developerWorks article. Please consider viewing the original article at: IBM developerWorks

Whistle while you work to run commands on your computer

Use open source software and microphone-enabled laptops to listen for specific tonal sequences and run commands


Level: Intermediate

Nathan Harrington, Programmer, IBM

09 Jan 2007

Use Linux® or Microsoft® Windows®, the open source sndpeek program, and a simple Perl script to read specific sequences of tonal events -- literally whistling, humming, or singing at your computer -- and run commands based on those tones. Give your computer a short low whistle to check your e-mail or unlock your screensaver with the opening bars of Beethoven's Fifth Symphony. Whistle while you work for higher efficiency.

For many years, computer users have been able to run processor-intensive speech recognition applications executing commands based on voice processing and requiring extensive configuration. While recent advances in processing power and algorithms have brought about user-independent voice recognition with reduced error rates, excellent opportunities exist for a simple tone pattern-based recognition system.

Using the recently released sndpeek program to do the sophisticated processing, we will run a simple Perl program to allow for the easy generation of tonal codes. A Perl script will be presented to allow the user to customize the tonal input and detection environment.



You'll need a system with the capability of processing sound input, preferably from an integrated microphone, although any sound source capable of producing discrete tonal events will suffice. For example, whistling at your computer is one of the more effective ways to add a tertiary input channel (besides keyboard and mouse), but playback of an MP3 player's output through your line-in jack could produce the same results if configured properly. The code in this article was developed and tested on an IBM® ThinkPad® T42p with a 900-MHz processor and 1 GB of RAM. Much less-powerful systems should be easily capable of making use of the code presented in this article, as sndpeek is the primary resource consumer and is an efficient program.


You'll need a functional sound-processing software environment to access your microphone hardware. Although sound configuration and troubleshooting are beyond the scope of this article, it may be useful to test this code on a Vector Linux Live CD, which has most of the drivers and components necessary for a functional setup of a diverse range of sound hardware. You need only a partially functional installation of the sndpeek program, because the 3-D display portion of the code is not required.

In Resources, there is a link to the sndpeek Web page. Download the code and in the file src/sndpeek/sndpeek.cpp, find the section that says:

Listing 1. sndpeek modification

    fprintf( stdout, "%.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f ", 
         mfcc(0), mfcc(1), mfcc(2), mfcc(3), mfcc(4), mfcc(5), mfcc(6),
         mfcc(7), mfcc(8), mfcc(9), mfcc(10), mfcc(11), mfcc(12) );
    fprintf( stdout, "\n" );

Ensure that the output of the program is actually written at the end of every print window by flushing the output. Change the above section to read:

Listing 2. Second sndpeek modification

    fprintf( stdout, "%.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f %.2f ", 
         mfcc(0), mfcc(1), mfcc(2), mfcc(3), mfcc(4), mfcc(5), mfcc(6),
         mfcc(7), mfcc(8), mfcc(9), mfcc(10), mfcc(11), mfcc(12) );
    fprintf( stdout, "\n" );
    fflush( stdout );

Now run the typical ./configure; make; make install commands to build and install sndpeek on Linux. For Windows, make sure the program builds correctly. There are many options for building programs on Windows, which are beyond the scope of this article. If you want to see cmdWhistle in action, I recommend using the Vector Linux Live CD.

Next up is the xwit application installation for command-line windowing control. Check Resources for a link to the xwit download, and install with no changes to the source code. You're now ready to create some simple tones.


Example setup and configuration

Creation of a simple tonal sequence

Download the source code and find the cmdWhistle script. This is the main Perl program that allows you to create tonal sequences, as well as listen for specific tonal sequences and run commands. This article will first take you through the user-space usage and configuration of the program, then describe its various functions.

Run the program with the following command: sndpeek --print --nodisplay | perl -c. This will start the sndpeek program listening to your microphone and pipe its output to the Perl program. Once the program is running, generate some simple whistles -- separate solid tones or rising-scale notes with a one-third-second pause between them. Note that you will have to run this program in a relatively noise-free environment, so plug in your headphones and make sure your CD drive is spun down. If your laptop's battery is on fire, try unplugging the smoke detector before running this program.

Experiment with different paces and tones to get a feel for the resolution of events the cmdWhistle program can capture. Experience with the subtleties of the program's tonal detection process is important for creating complex tone sequences that are repeatable. Your first tonal sequence should be simple: two low tones with 0.5 seconds between them for a "beep beep." Run sndpeek --print --nodisplay | perl -c, and when you see "enter a tone sequence," whistle twice with a 0.5-second delay between the tones. An automatic timeout will occur after 4 seconds (configurable), and your tonal sequence will be printed out similar to the following example: 25.00 25.00 _#_ 0 500000 _#_ <command here> _#_ <comment here>.

Setup of command and detection of tonal sequence

Let's dissect that line: tone values, delimiter, time values, delimiter, command area, delimiter, and comment area. Your next step is to copy this line into the default configuration file for the program: $HOME/.toneFile, which is probably /home/<username>/.toneFile. Once you have created the ~/.toneFile with the above tonal sequence line, you can modify the line to run a program. Change the command area text to /bin/echo "two low" and change the comment area to something more descriptive, so that the line reads: 25.00 25.00 _#_ 0 500000 _#_ /bin/echo "two low" _#_ two low tones.
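
The four areas of such a line can be pulled apart with a few lines of Perl. The following is a minimal sketch for experimenting with the file format; parse_tone_line is a hypothetical helper written for this illustration and is not a function from the cmdWhistle script.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of cmdWhistle): split one ~/.toneFile
# line into its four "_#_"-delimited areas and trim the padding spaces.
sub parse_tone_line {
    my ($line) = @_;
    chomp $line;
    my ( $tones, $times, $cmd, $comment ) =
        map { my $f = $_; $f =~ s/^\s+|\s+$//g; $f } split /_#_/, $line, 4;
    return {
        tones   => [ split ' ', $tones ],    # tone values
        times   => [ split ' ', $times ],    # delays in microseconds
        cmd     => $cmd,                     # command area
        comment => $comment,                 # comment area
    };
}

my $entry = parse_tone_line(
    '25.00 25.00 _#_ 0 500000 _#_ /bin/echo "two low" _#_ two low tones' );

print "tones:   @{ $entry->{tones} }\n";
print "times:   @{ $entry->{times} }\n";
print "command: $entry->{cmd}\n";
print "comment: $entry->{comment}\n";
```

Running the sketch prints each area on its own line, which is a quick way to confirm that a hand-edited line still has all three delimiters in place.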

Now that you have modified the configuration file to print out a notification, run the script in daemon mode with the command sndpeek --print --nodisplay | perl. The program will silently listen in the background for any of the events from the ~/.toneFile listing. Try your double low-tone whistle with the same temporal spacing and the same notes, and you will see the text "two low" printed to the screen. If you want to see the functioning of the script in detail, run it in daemon mode with the command sndpeek --print --nodisplay | perl -v. If your system supports the graphics display, remove the --nodisplay option for an excellent 3-D visualization of the sound input.


Example setup for window management using xwit

Creation of raise, lower, and iconify sequences

There are many options for useful tone input that mimic keyboard shortcuts or mouse movements. The example below is focused on providing window management functions to allow the user to keep his hands in the home position, happily hacking away while windows are placed as he prefers. Please see the "Additional examples" section for more detail on the possibilities offered with this expanded input method.

Run the program in "create" mode with the command sndpeek --print --nodisplay | perl -c. You need to create some simple tones you can reproduce with ease for quick window management functions. I recommend a low tone for "lower," a high tone for "raise," and two midtones for "iconify." Make sure you pick something you can consistently perform accurately. Although you'll be able to modify the parameters that control the precision with which you need to enter your tonal sequences (both in pitch and time), it can still be difficult to match the tone in question or the precise timing. Three tonal options widely spaced in pitch is a good mix between command flexibility and simplicity for those of us who need more practice at the karaoke bar. Here is a sample ~/.toneFile with the three commands:

Listing 3. Sample ~/.toneFile with three commands

20.00 _#_ 0 _#_ /usr/bin/xwit -pop -names rxvt _#_ raise rxvt windows
40.00 _#_ 0 _#_ /usr/bin/xwit -pop -names nathan _#_ raise xterms starting with nathan@localhost
25.00 25.00 _#_ 0 500000 _#_ /usr/bin/xwit -iconify -names nathan _#_ iconify nathan@localhost~

Consider replacing the xwit commands with echo alternatives and practicing before sustained usage.

xwit - X functions accessible from a shell script

Although there are many methods of command-line window control for the X Window System, xwit is one of the simpler and more portable methods, and it works with many window managers. Download and install xwit, and execute the command xwit -iconify to ensure that the program is working. (This will minimize your current window.) Although xwit does not have a "lower" function, we can provide a workaround by raising other windows. For this example, we use the single high tone (40.00) to raise the windows with a title beginning with "nathan." Setting the title of your xterms is a simple mode of identification suitable for typical programming-type tasks. Other windows will be raised using a single low tone (20.00) if their titles begin with rxvt. Two medium tones (25.00) with a half-second delay will cause all of the windows that have a title beginning with "nathan" to be iconified. See Resources for examples of what this looks like.

One of the key benefits of this setup is the ability to continue typing in one window while making another window visible. This is useful for building subroutine structures while you bring your reference documentation to the foreground for API information. In effect, you can compute faster than you can type -- you can type and manage your windowing environment at the same time.

Windows shell scripts for raising and lowering windows

For controlling windows on Microsoft operating systems, one simple way to raise windows is to use the WshShell.AppActivate command. If, for example, we want to raise the "gvim" application, we would create a file called "gvimActivate.vbs" with the following code:

Set WshShell = WScript.CreateObject("WScript.Shell")
WshShell.AppActivate "gvim"

With the above file in place, all we have to do is execute it, and the "gvim" window will be given focus and brought to the foreground. If you are running on Windows, change the simple high-tone command in the ~/.toneFile to 40.00 _#_ 0 _#_ gvimActivate.vbs _#_ raise gvim window.


Additional examples

The sndpeek program and the cmdWhistle script provide an additional user-input mechanism that can be used in unique ways. Set up an unlock code for your screensaver and whistle as you approach your desk -- no more bothersome password typing. Check your e-mail every time you whistle, or listen for your cell phone's unique tones and send yourself an e-mail when your phone rings.


The code

History and strategy

Fast Fourier transforms, sliding percentage windows, and some Linux audio programming can give you tonal-recognition capabilities with your language of choice. The sndpeek application, written by Ge Wang, Perry R. Cook, and Ananya Misra, provides a portable, fast, UNIX®-style way of acquiring the information needed to detect tonal events. The simple command sndpeek --print will show real-time text analytics of the current sound source, as well as provide an excellent 3-D visualization. The fourth entry in the text analytics printed out by sndpeek is the output of the Rolloff function of sndpeek's processing, which uses the "Marsyas" component of the sndtools distribution. Here's the description from the Rolloff.cpp source code:

Listing 4. Description from the Rolloff.cpp source code

Compute Rolloff (a percentile point) of the input fvec. 
For example if perc_ is 0.90 then Rolloff would be the 
index where the sum of values before that index are 
equal to 90% of the total energy. 

Using this value as our base "tone," the script will detect various time intervals between tones.
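
To get a feel for why this percentile point tracks pitch, here is a small sketch of the idea in Perl. It is only one reading of the description above, not the Marsyas implementation: rolloff_index is a hypothetical function, and the eight "spectrum" values are invented for the example.

```perl
use strict;
use warnings;

# Illustrative only: return the first index at which the running sum of
# the values reaches $perc of the total. The bin values used below are
# made up and do not come from sndpeek.
sub rolloff_index {
    my ( $perc, @bins ) = @_;
    my $total = 0;
    $total += $_ for @bins;
    my $running = 0;
    for my $i ( 0 .. $#bins ) {
        $running += $bins[$i];
        return $i if $running >= $perc * $total;
    }
    return $#bins;
}

my @low_whistle  = ( 0, 2, 10, 2, 0, 0,  0, 0 );    # energy around bin 2
my @high_whistle = ( 0, 0, 0,  0, 2, 10, 2, 0 );    # energy around bin 5

print "low whistle rolloff:  ", rolloff_index( 0.90, @low_whistle ),  "\n";
print "high whistle rolloff: ", rolloff_index( 0.90, @high_whistle ), "\n";
```

With these sample values the low whistle yields index 3 and the high whistle yields index 6: as the energy moves up the spectrum, the rolloff point moves with it, which is what makes it usable as a rough "tone" value.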

Parameter configuration

Let's start at the top of the script with the timing- and sensor-critical parameters:

Listing 5. Timing- and sensor-critical parameters

$|=1; #for non buffered standard output, useful for other programs to read 
require 'sys/syscall.ph'; # for subsecond timing

my $option = $ARGV[0] || ""; # simple option handling

my $MAX_TIMEOUT_LENGTH = 4; # maximum length in seconds of tone pattern
my $LISTEN_TIMEOUT = 2; # timeout value in seconds between tone
my $MAX_TIME_DEV = 100000; # maximum acceptable deviation (in microseconds)
	# between recorded pattern values and current time values
my $MAX_TONE_DEV = 2; # maximum acceptable deviation between recorded
	# pattern values and current tone values
my $MAX_AVG_TONE = 5; # maximum number of samples to be averaged 

The above variables and their comments are relatively straightforward. More detail will be available about their usage and configuration options later. The following is the remainder of the global variables and their descriptions.

Listing 6. Remainder of global variables and descriptions

my @queTones = ();	# running queue of recent tones detected 
my $prevTone = 0; 	# the last solid tone detected, used for disambiguation
my $prevInterval = 0; 	# previous interval of time
my @baseTones = (); # the currently entered tone sequence
my @baseTimes = (); # the currently entered temporal values
my %toneHash = (); 	# tones, times and commands read from ~/.toneFile
my $toneCount = 0; 	# the current count of tones entered
my $startTime = ""; 	# start of a temporal block
my $currTime = ""; 	# current time in the time out loop
my $toneAge = 0; 		# for tone count synchronization
my $timeOut = 0;		# to reset the timer window


We begin with the getEpochSeconds and getEpochMicroSeconds subroutines used to provide detailed and precise information on the status of temporal tonal patterns.

Listing 7. getEpochSeconds and getEpochMicroSeconds subroutines

sub getEpochMicroSeconds {

 my $TIMEVAL_T = "LL";   # LL for microseconds
 my $timeVal = pack($TIMEVAL_T, ());

 syscall(&SYS_gettimeofday, $timeVal, 0) != -1 or die "micro seconds: $!";
 my @vals = unpack( $TIMEVAL_T, $timeVal );
 # zero-pad the microseconds so every reading has the same number of digits
 $timeVal = $vals[0] . sprintf( "%06d", $vals[1] );
 $timeVal = substr( $timeVal, 6);

 my $padLen = 10 - length($timeVal);
 $timeVal = $timeVal . "0" x $padLen;

 return( $timeVal );

}#getEpochMicroSeconds

sub getEpochSeconds {
 my $TIMEVAL_T = "LL";   # LL for microseconds
 my $start = pack($TIMEVAL_T, ());
 syscall(&SYS_gettimeofday, $start, 0) != -1 or die "seconds: $!";
 return( (unpack($TIMEVAL_T, $start))[0] );
}#getEpochSeconds
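
If sys/syscall.ph is not available on your system, the same kind of subsecond measurement can be sketched with Perl's core Time::HiRes module. This is an alternative approach, not the method the script uses; the quarter-second pause below merely stands in for the gap between two tones.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Alternative sketch, not the article's code: measure the gap between
# two events in microseconds with the core Time::HiRes module.
my $first = [ gettimeofday() ];          # [ seconds, microseconds ]
select( undef, undef, undef, 0.25 );     # stand-in for the pause between tones
my $second = [ gettimeofday() ];

# tv_interval returns the elapsed time as floating-point seconds
my $gapMicro = int( tv_interval( $first, $second ) * 1_000_000 );
print "gap between events: $gapMicro microseconds\n";
```

The printed gap should land near 250000, the same scale as the time values stored in ~/.toneFile.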

Next up is the readTones subroutine, which first acquires the Rolloff data value as output from sndpeek. As you can see from the following comments section, the code first collects five samples of tones to compare with one another to form a basis for computing deviation. If the tone array queue is up to capacity, compute the deviation for each tone. If the comparison to any single tone in the queue exceeds the maximum tonal deviation, specify that the current processing did not produce a discernible tone.

If the deviation of all tones in the queue is less than the acceptable threshold, perform the temporal layer of ambiguity detection. If you whistle in a slowly increasing tone, your individual notes will deviate in small-enough amounts from the acceptable threshold to produce recognized tonal events. This continuous detection can cause a problem, however. The upDev and downDev variables and comparison logic are designed to acquire these contiguous tonal changes if they deviate more than the MAX_TONE_DEV variable. If both the recent tones queue check and the contiguous tone check are passed, record the tone event and its time for later printing or comparison.

The careful modification of these variables will help tune the program to recognize your particular tonal style and temporal deviations. The overall refresh rate of frequency analysis is dictated by the output of the sndpeek program. All of the other parameters can be configured to detect more widely spaced tonal events -- multiple simultaneous tonal events across a time threshold, or different timings between the tonal events.

Listing 8. Detect more widely spaced tonal events

sub readTones
{
 # read the Rolloff output only
 my(undef, undef, undef, $currentTone ) = split " ", $_[0];

 if( @queTones == $MAX_AVG_TONE )
 {
  my $localDiff = 0;
  # check for a solid tone by comparing against the last five tones
  # perform simple time related tonal smoothing, so if the tone
  # wavers just a bit it's still counted as one tone
  for my $chkHistTone ( @queTones )
  {
   if( abs($currentTone - $chkHistTone) > $MAX_TONE_DEV )
   {
    $localDiff =1;
    $prevTone = 0;
   }#if deviation greater than threshold
  }#for each tone

  if( $localDiff == 0 )
  {
   # make sure the current tone is different than the previous one, this is to 
   # ensure that long duration tones are not acquired as multiple tone events
   # this step up or down will allow you to whistle continuously and pick up the 
   # steps as discrete tone events
   my $upDev  = $currentTone + $MAX_TONE_DEV;
   my $downDev = $currentTone - $MAX_TONE_DEV;
   if( $prevTone > $upDev || $prevTone < $downDev )
   {
    my $currVal = getEpochMicroSeconds();
    my $diffInterval = abs($prevInterval - $currVal);
    if( $option ){
     print "Tone: $currentTone ## last: [$prevInterval] curr: [$currVal] ";
     print "difference is: $diffInterval\n";
    }
    if( $toneCount == 0 ){ $diffInterval = 0 }
    push @baseTones, $currentTone;
    push @baseTimes, $diffInterval;
    $toneCount++;
    $prevInterval = $currVal;
   }#if deviation in tone

   # now set the previous tone to the current tone so a continuous tone
   # is not acquired as multiple tone events
   $prevTone = $currentTone;

  }#if a solid tone has been found

  # if enough tones to create an useful queue have been added, pop one off
  shift @queTones;

 }#if enough tones to create a useful queue

 # always push more tones on the avg
 push @queTones, $currentTone;

}#readTones
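
To see the two checks at work without a microphone, the logic can be tried on a fixed list of values. The sketch below is a simplified, self-contained restatement of the idea, assuming the same meanings for MAX_TONE_DEV and MAX_AVG_TONE; detect_events is a hypothetical name and the sample Rolloff values are invented.

```perl
use strict;
use warnings;

# Simplified illustration of the solid-tone check and the step check;
# the sample values fed in below are invented for the example.
my $MAX_TONE_DEV = 2;    # same meaning as in the script
my $MAX_AVG_TONE = 5;

sub detect_events {
    my @samples = @_;
    my ( @queue, @events );
    my $prev = 0;
    for my $tone (@samples) {
        if ( @queue == $MAX_AVG_TONE ) {
            # solid tone: the sample is close to every queued sample
            my $solid = !grep { abs( $tone - $_ ) > $MAX_TONE_DEV } @queue;
            if ($solid) {
                # new event only if it steps away from the last solid tone
                push @events, $tone if abs( $tone - $prev ) > $MAX_TONE_DEV;
                $prev = $tone;
            }
            else {
                $prev = 0;
            }
            shift @queue;
        }
        push @queue, $tone;
    }
    return @events;
}

# a held low tone that wavers slightly, some noise, then a held high tone
my @events = detect_events(
    20, 21, 20, 20, 21, 20, 20, 3, 55, 9, 40, 40, 41, 40, 40, 40, 40 );
print "events: @events\n";
```

The wavering low tone and the held high tone each register once, so the sketch prints "events: 20 40"; the noisy samples in between never produce an event because they never agree with the whole queue.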


When a tonal pattern is created, it is placed in the ~/.toneFile file and read by the following subroutine.

Listing 9. Tonal pattern creation

# readToneFile reads tone sequences and commands from ~/.toneFile
# format is: tones _#_ times _#_ command _#_ comments
sub readToneFile 
{
 #give it a full path to .toneFile if on windows
 open(TONEFILE,"$ENV{HOME}/.toneFile") or die "no tone file: $!";

  while(<TONEFILE>){

   if( !/^#/ ){

    chomp;
    my @arrLine = split "_#_";
    $toneHash{ "$arrLine[0] $arrLine[1]" }{ tones }  = $arrLine[0];
    $toneHash{ "$arrLine[0] $arrLine[1]" }{ times }  = $arrLine[1];
    $toneHash{ "$arrLine[0] $arrLine[1]" }{ cmd }   = $arrLine[2];
    $toneHash{ "$arrLine[0] $arrLine[1]" }{ comment } = $arrLine[3];

   }#if not a comment line

  }#for each line in file

 close(TONEFILE);

}#readToneFile



When a tonal pattern is acquired by readTones, it is compared to the existing tonal patterns loaded from readToneFile. The compareToneSequences subroutine performs a simple difference check between the timings of the tones, as well as the values of the tones. Note that the differences between tone values and timings are not compounded. Missing the timing or tone on many notes by a small amount will not accumulate into a total match failure.

For each tone in the tone file, build the tonal and temporal arrays for matching. The first comparison is between the number of tones, as there is no point comparing a seven-tone sequence to a two-tone sequence. For each of the tones and times, check that the values are within the acceptable deviation parameters. Maximum tone and temporal deviation is critical to allowing the matching of your tonal sequences with accuracy, not precision. You can increase the maximum tone or time deviation to allow you to be more liberal in your rhythmic timings or tonal production. Caution and experimentation are called for, as liberal settings can lead to spurious detection results. For example, try increasing just the tone deviation threshold to 5, keeping the temporal deviation threshold at 100000. This will allow you to enter tones remotely related to the expected patterns, at the correct times, and still match a pattern -- useful if you wish to practice your timings only.

If the full pattern is a match, the command specified in the ~/.toneFile is run and the result printed out if verbose mode is enabled. The next step is to exit the subroutine if no matches are found or reset the current tone and time records if a match is made.
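
The per-position nature of the check is easy to try in isolation. The following is a minimal sketch with a hypothetical helper, within_deviation, that is not part of the script; the threshold values are the defaults shown earlier and the "entered" values are invented.

```perl
use strict;
use warnings;

# Illustrative helper (the name is hypothetical): true when two lists have
# the same length and every pair of values differs by at most $maxDev.
# Each position is judged on its own, so small misses never add up.
sub within_deviation {
    my ( $maxDev, $expected, $entered ) = @_;
    return 0 unless @$expected == @$entered;
    for my $i ( 0 .. $#$expected ) {
        return 0 if abs( $expected->[$i] - $entered->[$i] ) > $maxDev;
    }
    return 1;
}

my @confTones = ( 25.00, 25.00 );
my @confTimes = ( 0, 500000 );

# slightly off pitch and slightly late: still a match
my $closeEnough =
  ( within_deviation( 2, \@confTones, [ 26.00, 24.00 ] )
      && within_deviation( 100000, \@confTimes, [ 0, 560000 ] ) )
  ? "match" : "no match";

# second tone far too high: no match
my $tooHigh =
  within_deviation( 2, \@confTones, [ 25.00, 31.00 ] ) ? "match" : "no match";

print "close attempt: $closeEnough\n";
print "high attempt:  $tooHigh\n";
```

Raising the first argument in either call loosens that dimension only, which is the same trade-off described above for the tone and time deviation thresholds.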

Listing 10. Tonal pattern creation

sub compareToneSequences 
{
 my $baseCount = @baseTones;
 my $countMatch = 0; # record how many tones matched

 for my $toneFromFile ( keys %toneHash )
 {
  my @confTones = split " ", $toneHash{$toneFromFile}{tones};
  my @confTimes = split " ", $toneHash{$toneFromFile}{times};

  my $confCount = @confTones;

  next unless( $confCount == $baseCount );

  # as a learning aid, the matching and non matching portions of the
  # comparison are printed out, so at least you can see what is going 
  # wrong while trying to remember your tone codes
  my $pos =0;
  my $toneMatchFail = 0;
  for( @baseTones )
  {
   my $tonDiff = abs($confTones[$pos] - $baseTones[$pos]);
   my $tonStr = "t $pos b $baseTones[$pos] ".
          "c $confTones[$pos] \n";

   my $timeDiff = abs($confTimes[$pos] - $baseTimes[$pos]);
   my $timStr = "t $pos b $baseTimes[$pos] ".
          "c $confTimes[$pos] d $timeDiff\n";

   if( $tonDiff > $MAX_TONE_DEV )
   {
    $toneMatchFail = 1;
    if( $option ){ print "NOTE DISSONANCE $tonStr" }
   }else
   {
    if( $option ){ print "NOTE MATCH $tonStr" }
   }#if tone detected outside of deviation

   # if the timing is within the deviation, increment the matching counter
   if( $timeDiff < $MAX_TIME_DEV ){
    if( $option ){ print "TIME MATCH $timStr" }
    $countMatch++;
   }else{
    if( $option ){ print "TIME DISSONANCE $timStr" }
   }# deviation check

   $pos++;

  }# for each tone to check 

  if( $countMatch == $confCount && $toneMatchFail == 0 )
  {
   my $cmd = $toneHash{$toneFromFile}{ cmd };
   if( $option ){ print "run: $cmd\n" }
   $cmd =`$cmd`;
   if( $option ){ print "result: $cmd\n" }
   last; # stop after the first matching pattern
  }else
  {
   # otherwise, make the count of matches zero, in order to not reset
   $countMatch = 0;
  }

 }#for each tone in tone file

 # if the match count is zero, exit and don't reset variables so a longer
 # tone sequence can be entered and checked
 if( $countMatch == 0 ){ return() }

 # if a match occurred, reset the variables so it won't match another pattern
 $toneCount = 0;
 @baseTones = ();
 @baseTimes = ();

}#compareToneSequences

Main program logic

With the subroutines in place, the main program logic will allow the user to create a tone sequence, or will run in daemon mode to listen for tones and execute commands. The first section is executed when the user specifies option "-c" for create mode. A simple timeout process is used to end the tone sequence. Increase the maximum timeout length variable to permit pauses of more than 4 seconds between tones. If you leave the maximum timeout length at 4, the program will stop listening once 4 seconds pass without a new tone and print your currently entered tone sequence.

Listing 11. Timeout process

if( $option eq "-c" ){

 print "enter a tone sequence:\n";

 $startTime = getEpochSeconds(); # reset time out start

 while( my $sndPeekOutput = <STDIN> )
 {
  $currTime = getEpochSeconds();

  # check if there has not been a tone in a while 
  if( $currTime - $startTime > $MAX_TIMEOUT_LENGTH ){

   $timeOut = 1; # exit the loop

  }else{

   # if a tone has been entered before timeout, reset timers so
   # more tones can be entered

   if( $toneCount != $toneAge ){
    $startTime = $currTime;  # reset timer for longer delay
    $toneAge = $toneCount; # synchronize tone counts
   }# if a new tone came in

  }# if timer not reached

  readTones( $sndPeekOutput );

  if( $timeOut == 1 ){ last }
 }#while stdin

 if( @baseTones ){
  print "place the following line in $ENV{HOME}/.toneFile\n\n";
  for( @baseTones ){ print "$_ " }
  print "_#_ ";
  for( @baseTimes ){ print "$_ " }
  print "_#_ (command here) _#_ <comments here>\n\n";
 }#if tones entered

Section two of the main logic continuously reads the output from the sndpeek --print command. Tonal groups are automatically reset after the timeout threshold is reached in order to differentiate between separate tonal patterns. Consider modifying the LISTEN_TIMEOUT variable to achieve faster tonal entry times, or lengthen the timeout variable to acquire tone patterns with more widely spaced events.

Listing 12. Modifying LISTEN_TIMEOUT

}else{

 # main code loop to listen for tones and run commands
 readToneFile();
 $startTime = getEpochSeconds();

 while( my $sndPeekOutput = <STDIN> )
 {
  $currTime = getEpochSeconds();

  if( $currTime - $startTime > $LISTEN_TIMEOUT ){

   $toneCount = 0;
   @baseTones = ();
   @baseTimes = ();
   $startTime = $currTime;
   if( $option ){ print "listen timeout - resetting tones \n" }

  }else{

   if( $toneCount != $toneAge ){
    $startTime = $currTime;  # reset timer for longer delay
    $toneAge = $toneCount;  # synchronize tone counts
   }# if a new tone came in

   compareToneSequences();

  }#if not reset timeout

  readTones( $sndPeekOutput );

 }#while stdin

}#if option set


Caveats and security concerns


The cmdWhistle program is well suited for providing an additional channel of user input for your system, while allowing your hands to continue manipulating the mouse and keyboard. Be wary of using cmdWhistle to do anything requiring authentication on your system.

In addition to the obvious issue of listeners recording and mimicking your selected tonal sequence to run commands on your system, there are many other variables associated with tonal authentication that indicate usage in any serious context is premature at best. The tonal sequences are currently stored as two-digit "notes" in the ~/.toneFile, along with four- to nine-digit representations of the delay in microseconds. It is comparatively easy to read this "password" file and simply try to match the tone pattern to gain access to the system. One-way hashes could be used by eliminating some of the precision in the microseconds values, but this is best left to the reader with a desire to evaluate the risks on their own.
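
For readers who want to experiment with that idea, here is one possible shape it could take. Everything in this sketch is an assumption made for illustration: the band sizes, the tone_digest helper, and the choice of SHA-256 from the core Digest::SHA module are not part of cmdWhistle, and the result has not been evaluated as a security mechanism.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Sketch under stated assumptions: the band sizes and the use of SHA-256
# are choices made for this example, not part of cmdWhistle.
my $TONE_BAND = 5;         # tones in the same 5-unit band collapse together
my $TIME_BAND = 250000;    # delays collapse into quarter-second bands

sub tone_digest {
    my ( $tones, $times ) = @_;
    my @t = map { int( $_ / $TONE_BAND ) } @$tones;
    my @d = map { int( $_ / $TIME_BAND ) } @$times;
    return sha256_hex( join( " ", @t ) . "|" . join( " ", @d ) );
}

# only the digest would be stored, not the tones and delays themselves
my $stored  = tone_digest( [ 25.00, 25.00 ], [ 0, 500000 ] );
my $attempt = tone_digest( [ 26.50, 27.00 ], [ 0, 610000 ] );

my $verdict = ( $stored eq $attempt ) ? "unlock" : "stay locked";
print "$verdict\n";
```

Because both sequences fall into the same bands, the digests agree and the sketch prints "unlock". Note the weaknesses: a value near a band edge can land in a neighboring band and fail to match, and the small number of possible band combinations makes guessing easy, so this does nothing against a listener who records your whistle.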



Description    Name              Size    Download method
Source code    os-whistle.zip    3KB     HTTP


Resources

  • Princeton University hosts the sndpeek program.

  • hosts a mirror of the xwit program.

  • See a demonstration video on or check it out on Google Video.

  • Visit IBM developerWorks' PHP project resources to learn more about PHP.

  • Stay current with developerWorks technical events and webcasts.

  • Check out upcoming conferences, trade shows, webcasts, and other Events around the world that are of interest to IBM open source developers.

  • Visit the developerWorks Open source zone for extensive how-to information, tools, and project updates to help you develop with open source technologies and use them with IBM's products.

  • To listen to interesting interviews and discussions for software developers, be sure to check out developerWorks podcasts.

Get products and technologies
  • Innovate your next open source development project with IBM trial software, available for download or on DVD.


About the author

Nathan Harrington is a programmer at IBM currently working with Linux and resource-locating technologies.
