Create a continuous keystroke-dynamics monitor with Perl and xev

Analyze who is using the computer by continuous processing of keystroke attributes at the X Window System level

Level: Intermediate

Nathan Harrington, Programmer, IBM

07 Oct 2008

Learn how to use Perl, xev, and custom algorithms to monitor who is currently at the keyboard based on characteristic typing patterns.

Keystroke dynamics is a relatively new field that enables identification of individuals through statistical analysis of their typing patterns. Previously published articles on developerWorks have shown how to integrate keystroke dynamics into your applications and have presented a real-world example of modifying GNOME Display Manager (GDM) to require a password that is both correct and "correctly typed." This article presents tools and code that let you move beyond a single application of keystroke dynamics and monitor your entire X Window System environment continuously for the characteristic patterns of the typist.

After reading this article, you will be able to create a continuous keystroke-dynamics monitor that can lock your X Window System session when your characteristic typing patterns are no longer detected.

Selected approach considerations

Perhaps the most efficient way to track every key pressed is through the use of a kernel-level key logger such as THC-vlogger or ttyrpld. Unfortunately, these programs are designed for older kernel levels or are currently difficult to use on modern Linux® distributions. Keyboard device-tracking programs, such as uberkey, are an appealing alternative, but their propensity for dropping keystrokes and timing imprecision makes them unsuitable for this application.

Although not applicable to the console or remote sessions, xev provides a robust and lightweight method for detecting keyboard events for any application running in X Window System.

Each event in a xev session is printed out with a high-resolution time value. In this article, we'll use that time value to record the "dwell" time for the R, S, and T keys over a specific time window. The dwell time is the period in which the user's finger holds the key down. This relatively simple measurement will be recorded for every application on the X Window System desktop.

When developing a keystroke "signature," a large amount of data is ideal, and it should reflect the way the computer is most likely to be used. Experiment with the data-tracking options described below to achieve a broad sample of usage data. The developed signature is then transformed into a cryptographic hash and stored on disk to be compared later during the monitor phase.

Hardware and software requirements

Any PC manufactured after 2000 should provide sufficient processing power for the code presented here. You'll need X Window System, as well as the xev program (see Resources). You need the mkpasswd program (included in most Linux distributions) for generating the cryptographic hashes of the keystroke signatures. The Perl modules X11::GUITest, threads, and Thread::Queue are required. Note for UNIX® and Linux users: If you're new to installing Perl modules, Andreas J. Konig's CPAN module will automate the installation of other modules (see Resources).

A simple way to record every keystroke pressed in an X Window System session is to start a xev program attached to every window listed by xwininfo -root -tree. This will work in theory for a small number of windows, but eventually, the maximum number of X clients will be reached, and X Window System will need to be recompiled to increase the number of allowable X clients. A more reasonable solution is to track the current window in focus and attach a single xev program to that window. Each keyboard event is then recorded for the currently in-focus window.

Listing 1 shows the beginning of the program designed to track the current focus and create a keystroke signature.

Listing 1. Variable declaration

#! perl -w 
# - monitor dwell time of r,s,t for all X Window System
use strict;
use X11::GUITest qw( :ALL );
use threads;
use Thread::Queue;
die "specify mode, minimum samples" unless @ARGV == 2 ;

my $sleepTime = 5; # seconds between key event processing runs
my %windows = ();  # hash of window keystrokes
my @samp = ();     # most recent sample averages of keystrokes
my $checkRng = 10; # fuzziness of dwell time matching
my $userMatch = 0; # user or impostor?
my %keys = ();     # average of key dwell times

my $mode = $ARGV[0];       # record baseline or monitor matches
my $minSamples = $ARGV[1]; # required base samples to match with
my ( $salt, $hash ) = ( "", "" );  # read from keystroke.signatures

if( $mode eq "monitor" ){ loadSignatureFile() }

After the module includes and the initial variable declarations, the previously generated signature file is loaded if the program is in monitor mode. The main control loop is then entered. Listing 2 shows the beginning of the main program loop.

Listing 2. Main program loop start

# ctrl-c to exit the program and drop the threads without error
while( 1 ){
  my @activeId = GetInputFocus();
  my $foundPipe = 0;

  # reattach a pipe to a known window that has regained focus
  for my $key ( keys %windows ){
    if( $key eq "@activeId" && $windows{$key}{pipeDef} == 0 ){
        my $res = "xev -id $key |";
        $windows{$key}{ input } = createPipe( $res ) or die "no pipe ";
        $windows{$key}{pipeDef} = 1;
        $foundPipe = 1;
    }#if not a match

  }#for each windows key

  if( $foundPipe == 0 ){
    # if pipe doesn't already exist, add a new one
    my $key = "@activeId";
    if( !exists($windows{$key}) || $windows{$key}{pipeDef} == 0 ){
      my $res = "xev -id $key |";
      $windows{$key}{ input } = createPipe( $res ) or die "no pipe ";
      $windows{$key}{pipeDef} = 1;
    }#if pipe doesn't already exist
  }#foundpipe check

The first for loop looks for an existing entry in the windows hash that matches the window now in focus but has no pipe attached to it. If such an entry is found, a new pipe is created for it. Tracking which window currently has a pipe attached allows the xev output to be collected over a period of time. Seldom-used windows produce xev output too slowly to fill the buffer during a single period of focus, so the windows hash retains each window's accumulated output after the window loses focus, and collection resumes when the window regains it. Listing 3 shows the remainder of the main logic loop.

Listing 3. Main logic loop end

  # read any available data from a pipe
  for my $xevPipe( keys %windows ){
    next unless( $windows{$xevPipe}{pipeDef} == 1 );

    while( $windows{$xevPipe}{input}->pending ){
      my $line = $windows{$xevPipe}{input}->dequeue or next;
      $windows{$xevPipe}{keyString} .= $line;

    }#while data to be added to the buffer

    # wait until this window has accumulated enough xev output
    next unless( exists( $windows{$xevPipe}{keyString} ) );
    next unless( length( $windows{$xevPipe}{keyString} ) > 8192 );

    compareSignature( getKeyAverages( $windows{$xevPipe}{keyString} )  );

    $windows{$xevPipe}{keyString} = "";

  }#for windows keys 

  # kill all xevs except currently monitored
  for my $key ( keys %windows ){
    next unless( $key ne "@activeId" && $windows{$key}{pipeDef} == 1  );

    $windows{$key}{pipeDef} = 0;
    my $cmd = qq{ps -aef | grep $key | grep xev | perl -lane '`kill \$F[1]`'};
    system( $cmd ); # run the pipeline that kills the old xev

  }#for each windows key

  sleep( $sleepTime );

}#while main loop

After the pipe has been created (or if one already exists), any pending output from each pipe is appended to the recorded event buffer for that window. Once a buffer holds more than 8,192 characters, it is passed to the getKeyAverages subroutine, the result is passed to the compareSignature subroutine, and the buffer is cleared. Next, if a change-of-focus event has occurred, the old xev program is terminated.
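The focus bookkeeping described above can be tried without an X server. The following is a minimal sketch under stated assumptions: update_focus is a hypothetical helper that is not part of the article's program, the window IDs are made up, and the places where the real program starts or kills xev are only marked with comments.

```perl
use strict;
use warnings;

my %windows;    # window ID => { pipeDef => 0 or 1 }

# Hypothetical helper: mark $active as monitored and return the IDs
# of the windows whose xev process would have to be stopped.
sub update_focus {
    my ($active) = @_;
    my @stopped;

    # The real program would open "xev -id $active |" here.
    $windows{$active}{pipeDef} = 1;

    for my $id ( sort keys %windows ) {
        next if $id eq $active;
        next unless $windows{$id}{pipeDef};
        $windows{$id}{pipeDef} = 0;    # the real program kills xev here
        push @stopped, $id;
    }
    return @stopped;
}

# Simulate focus moving from window 100 to 200 and back to 100.
for my $id (qw( 100 200 100 )) {
    my @stopped = update_focus($id);
    print "focus $id, stopped: @stopped\n";
}
```

At most one window has pipeDef set to 1 after each call, which mirrors the single-xev design chosen earlier.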

Listing 4 shows the first subroutines: loadSignatureFile and createPipe.

Listing 4. loadSignatureFile and createPipe subroutines

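sub loadSignatureFile{
  open(INFILE,"keystroke.signatures") or die "no signature file";
    my $line =<INFILE>;
    die "empty file " unless defined $line;
    # the line looks like $1$salt$hash, so the first two fields are discarded
    ( undef, undef, $salt, $hash ) = split '\$', $line;
  close(INFILE);

}#loadSignatureFile

sub createPipe{
  my $cmd = shift;
  my $queue = new Thread::Queue;

  # read the command output in a background thread so the main loop never blocks
  my $thr = async{
      my $pid = open my $pipe, $cmd or die $!;
      $queue->enqueue( $_ ) while <$pipe>;
      $queue->enqueue( undef );
  };

  # detach causes the threads to be silently terminated on program exit
  $thr->detach;
  return $queue;

}#createPipe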

The loadSignatureFile subroutine reads the salt and hash produced by the "record" mode of the program. These values are used later for keystroke-signature comparisons. The createPipe subroutine is a simple way to create a nonblocking read from a pipe using threads. Listing 5 shows the next subroutine: getKeyAverages.
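To see what the split in loadSignatureFile yields, here is a small sketch. It assumes the signature line has the $1$salt$hash layout that the comparison in Listing 8 rebuilds. The helper name parse_signature_line and the sample line are made up for illustration and are not a real signature.

```perl
use strict;
use warnings;

# Hypothetical helper: split one "$1$salt$hash" line into its parts.
sub parse_signature_line {
    my ($line) = @_;
    chomp $line;

    # The line starts with "$", so the first field is an empty string
    # and the second is the scheme identifier ("1").
    my ( undef, $scheme, $salt, $hash ) = split /\$/, $line;
    return ( $scheme, $salt, $hash );
}

# Made-up sample line, not produced by mkpasswd.
my $line = '$1$abcdefgh$0123456789ABCDEFGHIJKL' . "\n";
my ( $scheme, $salt, $hash ) = parse_signature_line($line);
print "scheme=$scheme salt=$salt hash=$hash\n";
```

Removing the trailing newline before splitting keeps it out of the hash field, which matters when the value is later compared with command output.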

Listing 5. getKeyAverages subroutine

sub getKeyAverages{
  my %temp = (); # temporary hash to record key press times
  my %avg =  (); # dwell-time total and count per key for the entire buffer

  open(my $fh, '<', \$_[0]) or die "Could not open string for reading";

    while(my $inLine = <$fh> ){
      next unless( $inLine =~ /KeyPress event/ || $inLine =~ /KeyRelease event/ );

      # get type of entry: KeyPress or KeyRelease
      my $state = (split " ", $inLine)[0];

      # get the time entry
      my $currTime = <$fh>;

      # make sure the line exists and has the required data 
      next unless( defined($currTime) );
      next unless( length($currTime) > 43 );
        $currTime = substr( $currTime, index($currTime,"time ")+5);
        $currTime = substr( $currTime, 0, index($currTime,","));

      # get the key name 
      my $currKey = <$fh>;
      next unless( defined($currKey) );
      next unless( length($currKey) > 40 );
        $currKey = substr( $currKey, index($currKey,"keysym ")+7);
        $currKey = substr( $currKey, 0, index($currKey,"),"));
        $currKey = substr( $currKey, index($currKey, ", ")+2);

      next unless( $currKey eq "r" || $currKey eq "s" || $currKey eq "t" );

      # add the key press
      if( $state eq "KeyPress" ){ $temp{$currKey} = $currTime }

      next unless ( $state eq "KeyRelease" );

      if( exists( $temp{ $currKey } ) ){
        $avg{$currKey}{val} += $currTime - $temp{$currKey};
        $avg{$currKey}{count} ++;
      }#if a press has been recorded

      # either the data has been recorded or it was a release on a key never pressed
      # in this window
      delete $temp{ $currKey };

    }#while file handle
  close( $fh );

  # a value of 0 means no complete press and release was seen for that key
  my( $rVal, $sVal, $tVal ); $rVal = $sVal = $tVal = 0;
  if( exists( $avg{"r"} ) ){ $rVal =  ($avg{"r"}{val} / $avg{"r"}{count}) };
  if( exists( $avg{"s"} ) ){ $sVal =  ($avg{"s"}{val} / $avg{"s"}{count}) };
  if( exists( $avg{"t"} ) ){ $tVal =  ($avg{"t"}{val} / $avg{"t"}{count}) };

  return( $rVal, $sVal, $tVal );

}#getKeyAverages


The xev program output lists every X Window System event in the attached window. Listing 6 is an example of what this can look like.

Listing 6. xev example program output

KeyPress event, serial 16, synthetic NO, window 0x2000002,
    root 0x76, subw 0x2000012, time 248543985, (719,86), root:(964,107),
    state 0x0, keycode 27 (keysym 0x72, r), same_screen YES,
    XLookupString gives 1 bytes: (72) "r"
    XmbLookupString gives 1 bytes: (72) "r"
    XFilterEvent returns: False

KeyRelease event, serial 16, synthetic NO, window 0x2000002,
    root 0x76, subw 0x2000012, time 248544153, (719,86), root:(964,107),
    state 0x0, keycode 27 (keysym 0x72, r), same_screen YES,
    XLookupString gives 1 bytes: (72) "r"
    XFilterEvent returns: False

KeyPress event, serial 16, synthetic NO, window 0x2000002,
    root 0x76, subw 0x2000012, time 248544206, (719,86), root:(964,107),
    state 0x0, keycode 39 (keysym 0x73, s), same_screen YES,
    XLookupString gives 1 bytes: (73) "s"
    XmbLookupString gives 1 bytes: (73) "s"
    XFilterEvent returns: False

KeyPress event, serial 16, synthetic NO, window 0x2000002,
    root 0x76, subw 0x2000012, time 248544263, (719,86), root:(964,107),
    state 0x0, keycode 28 (keysym 0x74, t), same_screen YES,
    XLookupString gives 1 bytes: (74) "t"
    XmbLookupString gives 1 bytes: (74) "t"
    XFilterEvent returns: False

KeyRelease event, serial 16, synthetic NO, window 0x2000002,
    root 0x76, subw 0x2000012, time 248544365, (719,86), root:(964,107),
    state 0x0, keycode 39 (keysym 0x73, s), same_screen YES,
    XLookupString gives 1 bytes: (73) "s"
    XFilterEvent returns: False

The key data values here are the key names, event types, and time entry. Note how, during normal typing, press-and-release events for different keys can overlap. The code in the getKeyAverages subroutine processes the input string buffer as a file handle and extracts the relevant time, event type, and key name from the input buffer. The average value for each key dwell time for the entire buffer is computed and returned.
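As an illustration of the extraction step, the sketch below pulls the event type, time, and key name out of xev output with a single regular expression instead of the substr and index calls in Listing 5. This is an alternative technique, not the article's method. The helper name dwell_times is hypothetical, and the sample text is an abbreviated copy of the events in Listing 6.

```perl
use strict;
use warnings;

# Hypothetical helper: return a hash of key name => list of dwell times.
sub dwell_times {
    my ($xev_text) = @_;
    my ( %pressed, %dwell );

    # One match per key event: header line, then "time N,", then the key name.
    while ( $xev_text =~
        /^(KeyPress|KeyRelease) event.*?time (\d+),.*?\(keysym 0x[0-9a-f]+, (\w+)\)/msg )
    {
        my ( $type, $time, $key ) = ( $1, $2, $3 );
        if ( $type eq 'KeyPress' ) {
            $pressed{$key} = $time;
            next;
        }

        # Ignore a release for a key that was never pressed in this buffer.
        next unless exists $pressed{$key};
        my $down = delete $pressed{$key};
        push @{ $dwell{$key} }, $time - $down;
    }
    return %dwell;
}

my $sample = <<'XEV';
KeyPress event, serial 16, synthetic NO, window 0x2000002,
    root 0x76, subw 0x2000012, time 248543985, (719,86), root:(964,107),
    state 0x0, keycode 27 (keysym 0x72, r), same_screen YES,

KeyRelease event, serial 16, synthetic NO, window 0x2000002,
    root 0x76, subw 0x2000012, time 248544153, (719,86), root:(964,107),
    state 0x0, keycode 27 (keysym 0x72, r), same_screen YES,

KeyPress event, serial 16, synthetic NO, window 0x2000002,
    root 0x76, subw 0x2000012, time 248544206, (719,86), root:(964,107),
    state 0x0, keycode 39 (keysym 0x73, s), same_screen YES,

KeyPress event, serial 16, synthetic NO, window 0x2000002,
    root 0x76, subw 0x2000012, time 248544263, (719,86), root:(964,107),
    state 0x0, keycode 28 (keysym 0x74, t), same_screen YES,

KeyRelease event, serial 16, synthetic NO, window 0x2000002,
    root 0x76, subw 0x2000012, time 248544365, (719,86), root:(964,107),
    state 0x0, keycode 39 (keysym 0x73, s), same_screen YES,
XEV

my %dwell = dwell_times($sample);
print "$_: @{ $dwell{$_} }\n" for sort keys %dwell;
```

In this sample, the T key is pressed before the S key is released, and T has no release yet, so only R and S produce a dwell time.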

As shown in the main program loop, the getKeyAverages subroutine output is sent to the compareSignature subroutine shown below.

Listing 7. compareSignature subroutine

sub compareSignature{
  # a value of 0 means no data for that key in this buffer
  if( $_[0] ne "0" ){
    $keys{ "r" }{ val } += $_[0];
    $keys{ "r" }{ count }++;
  }#if r is not 0

  if( $_[1] ne "0" ){
    $keys{ "s" }{ val } += $_[1];
    $keys{ "s" }{ count }++;
  }#if s is not 0

  if( $_[2] ne "0" ){
    $keys{ "t" }{ val } += $_[2];
    $keys{ "t" }{ count }++;
  }#if t is not 0

  return unless ( exists($keys{"r"}) );
  return unless ( exists($keys{"s"}) );
  return unless ( exists($keys{"t"}) );
  if( $keys{ "r" }{ count } >= $minSamples &&
      $keys{ "s" }{ count } >= $minSamples &&
      $keys{ "t" }{ count } >= $minSamples ){

    # round each average dwell time to a whole number
    $samp[0] = sprintf( "%0.0f", $keys{r}{val} / $keys{r}{count} );
    $samp[1] = sprintf( "%0.0f", $keys{s}{val} / $keys{s}{count} );
    $samp[2] = sprintf( "%0.0f", $keys{t}{val} / $keys{t}{count} );

    if( $mode eq "record" ){
      #print "[@samp]\n";  # uncomment to see plain keystroke signature
      print `echo "@samp" | mkpasswd -H md5 --stdin`;
    }else{
      $userMatch = 0;
      checkDynamics( "", 0 );
      if( $userMatch == 0 ){
        print "\nno match\n";
        #system( "xscreensaver-command -lock" );
      }else{
        print "user verified\n";
      }#if the signatures did not match

    }#if in record mode

    %keys = ();

  }#enough samples

}#compareSignature


After recording the values (if they are not zero), the average R, S, and T dwell times are computed once there are enough samples. In "record" mode, these dwell times are joined into a string that is used to generate a cryptographic hash of the signature. In "monitor" mode, the checkDynamics subroutine is called to determine whether the current dwell times match those recorded (within an allowable range) in the keystroke.signatures file. If a match is found, no action is taken. If no match is found, the screen can be locked with xscreensaver-command -lock (commented out in Listing 7 for testing), effectively locking an opportunistic attacker out of the system. Listing 8 details the checkDynamics subroutine.
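The averaging step can be exercised on its own, without xev or mkpasswd. The sketch below is a simplified illustration under assumptions: add_sample is a hypothetical helper, the threshold of 3 stands in for minSamples, and the dwell times fed to it are invented. It returns the plain signature string that record mode would pass to mkpasswd.

```perl
use strict;
use warnings;

my $min_samples = 3;    # assumed threshold for this example
my %keys;               # key => { val => running sum, count => samples }

# Hypothetical helper: add one (r, s, t) triple of buffer averages and
# return the signature string once every key has enough samples.
sub add_sample {
    my %sample;
    @sample{qw( r s t )} = @_;

    for my $k (qw( r s t )) {
        next if $sample{$k} == 0;    # 0 means no data for this key
        $keys{$k}{val} += $sample{$k};
        $keys{$k}{count}++;
    }

    for my $k (qw( r s t )) {
        return unless ( $keys{$k}{count} // 0 ) >= $min_samples;
    }

    my @samp = map { sprintf "%0.0f", $keys{$_}{val} / $keys{$_}{count} } qw( r s t );
    %keys = ();    # start a fresh set of samples
    return "@samp";
}

# Invented buffer averages; a 0 marks a key that was not typed.
my $signature;
for my $triple ( [ 168, 159, 102 ], [ 160, 0, 110 ], [ 170, 150, 100 ], [ 0, 153, 0 ] ) {
    my $result = add_sample(@$triple);
    $signature = $result if defined $result;
}
print "$signature\n";
```

A signature is produced only after the fourth triple, because the S key does not reach three samples until then.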

Listing 8. checkDynamics subroutine

sub checkDynamics{
  my $inString = $_[0];
  my $level = $_[1];

  my $start = $samp[$level] - $checkRng;
  my $stop  = $samp[$level] + $checkRng;
  my $curr  = $start;
  while( $curr <= $stop && $userMatch != 1 ){
    if( $level == 2 ){ # deepest level for only three letters
      my $res = `echo "$inString $curr" | mkpasswd -S $salt -H md5 --stdin`;
      my $want = qq/\$1\$${salt}\$${hash}/;

      # ignore trailing newlines on either side of the comparison
      s/\s+$// for ( $res, $want );
      if( $res eq $want ){ $userMatch = 1 }

    }else{
      # append to the current 'signature', go to next level
      my $tempStr = "";  # temporary signature string

      if( length($inString) != 0 ){ $tempStr = "$inString $curr" }
      else                        { $tempStr = $curr }

      checkDynamics( $tempStr, $level+1 );

    }#if at maximum level

    $curr++; # try the next dwell time in the range

  }#while current less than stop

}#checkDynamics



The checkDynamics subroutine recursively calls itself while building signatures encompassing the full range of possibilities defined by the checkRng parameter. Each string passed to mkpasswd is built level by level from a single key dwell time all the way up to a dwell time for each of the three recorded keys. For example, if the average dwell time is "130 130 130" (for R, S, T, respectively) and checkRng is 5, the checkDynamics subroutine will work through the necessary combinations to check "125 125 125," "135 135 135," and everything in between, which is up to 11 × 11 × 11 = 1,331 calls to mkpasswd. Loose matching (with a high checkRng value) will drastically increase the amount of time required to check all possibilities: the checkRng value of 10 set in Listing 1 allows up to 21 × 21 × 21 = 9,261 calls.
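To get a feel for how the search space grows, the following sketch enumerates the candidate signature strings for a given range. It is an illustration only: candidate_signatures is a hypothetical helper, it builds the whole list iteratively rather than recursing, and it does not call mkpasswd or stop early on a match as checkDynamics does.

```perl
use strict;
use warnings;

# Hypothetical helper: list every signature string within +/- $range
# of each average dwell time.
sub candidate_signatures {
    my ( $range, @avg ) = @_;
    my @out = ('');

    for my $center (@avg) {
        my @next;
        for my $prefix (@out) {
            for my $v ( $center - $range .. $center + $range ) {
                push @next, length($prefix) ? "$prefix $v" : "$v";
            }
        }
        @out = @next;
    }
    return @out;
}

my @candidates = candidate_signatures( 5, 130, 130, 130 );
printf "%d candidates, from '%s' to '%s'\n",
    scalar @candidates, $candidates[0], $candidates[-1];
```

Each candidate corresponds to one mkpasswd invocation in the worst case, so the count is a rough measure of how long a failed match takes.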

Save the above code and run the program in record mode to generate keystroke signatures: perl record 10 2>/dev/null. This command will monitor the keystrokes of the currently in-focus X window and print out a cryptographic hash of the keystroke dwell times after 10 averages have been recorded across all windows. For testing purposes, uncomment the sample printing line in Listing 7 to show the keystroke signature before hashing. Although 10 averages, as used above, is useful for testing, a much larger range of data is desirable for accuracy in creating a signature. Try values that allow you to type thousands of words in normal usage before printing a cryptographic hash. After you are satisfied with your data collection, take the hash that is printed out and place it in the keystroke.signatures file.

To monitor the current user's typing patterns and lock the screen when a deviation in the pattern is detected, run the program with perl monitor 10 2>/dev/null. (The stderr redirect to null is due to a scalar dropping issue.) As described, this program will monitor the current typing patterns and lock the screen when the signature differs from that recorded in the keystroke.signatures file.

Note that you'll need to experiment with the checkRng parameter and the minSamples parameter to find settings that work correctly for your environment and your specific typing patterns.

Conclusion, further examples

The tools and code described in this article allow you to create your own framework for continuous user verification using keystroke dynamics. Although built around dwell times for three keys, the xev program and the code described herein allow you to monitor all aspects of keyboard (and mouse) X Window System interaction. Consider measuring which characters are commonly used before a backspace, or monitor which vi or emacs key combinations are used most frequently. Count the commonly misspelled words and measure other typing patterns, such as application shortcut keys and common keys, for a given application.

About the author

Nathan Harrington

Nathan Harrington is a programmer at IBM currently working with Linux and resource-locating technologies.

