Driver drowsiness represents an important risk on the roads, given that it is one of the main factors leading to accidents or near misses. This has been shown by many studies that have established links between driver drowsiness and road accidents. Aldrich, for example, reported that more than 13% of road accidents are due to driver drowsiness. He also quoted a study by Lafont, who affirms that 34% of accidents are due to drowsiness. A survey conducted in 1973 by Tilley et al. shows that, of fifteen hundred drivers, 10% had experienced drowsiness at the wheel, and 10% had been involved in accidents or near misses due to drowsiness.
The above-mentioned results show the importance of research aimed at reducing the risk of accidents due to drowsiness. So far, several studies have tried to model the behavior of a drowsy driver by establishing links between drowsiness and certain parameters related to the vehicle and to the driver (e.g., steering wheel position, speed of the vehicle, etc.) [2, 5, 10, 11, 14]. These studies serve only as a basis for future developments, as they are limited to determining which parameters are actually related to the drowsiness condition. An important drawback of these studies is that they were performed either on a driving simulator or with non-transparent sensors, which might affect the driving task and, therefore, the behavior of the drivers in a drowsy condition.
The goal of this work is to develop a real-time monitor that will detect driver drowsiness and progressively warn the driver of this condition, so that he or she can either correct the behavior or stop driving. This device must be transparent to the driver, as well as inexpensive. In this research, the first stages of the project were completed: the implementation of the first prototype (hardware and software) and the execution of driving tests to evaluate, in a preliminary way, the performance of the system. Other approaches have been proposed by the automotive industry, but without success so far (e.g., cameras attached to image-processing devices to detect eyelid closure, etc.).
This paper is organized as follows: We first present the setup (hardware and
software) used to achieve the objectives. We include a brief description
of the physical setup of the device, a description of the data acquisition and
processing (which involves digital signal processing techniques and pattern
recognition techniques) and a description of the driving tests performed to
evaluate the performance of the system. We then show and analyze the results
obtained, and finally we present the conclusions and future research.
To achieve the objectives of this research, a vehicle was equipped with a compact on-board computer, as well as a series of sensors that allow the computer to read information about the parameters of interest. Potentiometric sensors were installed in the steering wheel and in the accelerator and brake pedals. A frequency-meter sensor was also installed to allow the on-board computer to obtain the speed of the vehicle. All these sensors, as well as the computer, were completely transparent to the driver, which represents the main constraint in our work.
The choice of these parameters was made according to the results of previous works. Among the parameters shown in the literature to be the most important indications of drowsy driving, we chose those that could be measured with transparent sensors and in real time. Thus, the measurement of the respiratory rhythm was discarded, since it cannot be obtained with transparent sensors. The deviation with respect to the lane position has also been reported as an important parameter [2, 11, 14]. However, this parameter cannot be obtained in real time, since the techniques proposed so far are all based on image processing.
The resolution of these sensors, including the data acquisition parameters, was determined based on the results presented in previous works [2, 3]. According to Boisvert, a sampling rate of 4 Hz (four samples per second) is sufficient for drowsiness detection purposes. This agrees with the results of Brekke and Sherman, who affirm that all the important information concerning the steering wheel is in the frequency range below 1 Hz, which, according to the Sampling Theorem, implies a sampling rate of at least 2 Hz. Even though we only used the spectral information below 1 Hz, we decided to oversample the signal, a commonly used technique whose main advantages are an increase in resolution and a reduction in aliasing distortion. Data were (over)sampled at 5 Hz, except for the speed, which was sampled at 1 Hz. The samples were digitized at a resolution of eight bits (256 levels of quantization).
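The acquisition figures above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the bandwidth, sampling rate and bit depth come from the text, whereas the `quantize` helper and the 0 to 5 sensor range used in the usage note are assumptions made for the example, not a description of the prototype's acquisition hardware.

```python
# Values stated in the text.
SIGNAL_BANDWIDTH_HZ = 1.0   # useful steering information lies below 1 Hz
SAMPLE_RATE_HZ = 5.0        # sampling rate used for the steering wheel and pedals
BITS = 8                    # 8-bit digitization, i.e. 256 levels

# Minimum rate required by the Sampling Theorem and the resulting oversampling.
nyquist_rate_hz = 2.0 * SIGNAL_BANDWIDTH_HZ
oversampling_factor = SAMPLE_RATE_HZ / nyquist_rate_hz


def quantize(value, v_min, v_max, bits=BITS):
    """Map a reading in [v_min, v_max] to an integer code in [0, 2**bits - 1].

    Hypothetical helper: readings outside the range are clamped to it.
    """
    levels = 2 ** bits - 1
    clamped = min(max(value, v_min), v_max)
    return round((clamped - v_min) / (v_max - v_min) * levels)
```

For instance, with an assumed sensor range of 0 to 5, `quantize(5.0, 0.0, 5.0)` yields the top code 255, and sampling at 5 Hz corresponds to an oversampling factor of 2.5 with respect to the 2 Hz minimum.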
Data were sampled and processed in real-time. That is, every 200 milliseconds, samples from the various sensors were read and temporarily stored. Those samples, along with the samples of the previous 20 seconds, were processed and fed to a pattern recognition sub-system, which determines whether the current combination of parameters corresponds to the pattern associated with drowsy driving. Besides this processing, data were stored in non-volatile memory to be processed later, allowing evaluation of the performance of the system. We emphasize that the proposed method actually performs real-time drowsiness detection, and that the extra processing performed later serves only to evaluate the performance of the system.
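The buffering scheme described above can be sketched as a fixed-length sliding window. This is a minimal illustration under stated assumptions: the class name `SlidingWindow` and its methods are hypothetical, and only the 200 ms period and the 20-second history are taken from the text.

```python
from collections import deque

SAMPLE_RATE_HZ = 5      # one reading every 200 ms, as stated in the text
HISTORY_S = 20          # previous 20 seconds kept alongside the current sample
WINDOW_LEN = HISTORY_S * SAMPLE_RATE_HZ + 1   # 100 past samples + the current one


class SlidingWindow:
    """Keeps the most recent WINDOW_LEN samples of one sensor (hypothetical helper)."""

    def __init__(self, maxlen=WINDOW_LEN):
        # A bounded deque silently drops the oldest sample once it is full.
        self._buf = deque(maxlen=maxlen)

    def push(self, sample):
        """Store a new sample and return the current window, oldest first."""
        self._buf.append(sample)
        return list(self._buf)

    def is_full(self):
        return len(self._buf) == self._buf.maxlen
```

In such a scheme, `push` would be called once per sampling period for each sensor, and the returned window handed to the processing stage once `is_full()` is true.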
In the next two sections we describe the two main sub-systems of the signal processing phase; that is, the Spectral Analysis of the Data, based on the use of digital filters, and the pattern recognition sub-system, based on the Classical Decision Theory.
The key point in the data processing is that the appropriate information be extracted from the raw data. This corresponds to the feature extraction phase, which represents the most critical aspect of the design of a pattern recognition system. In our context, this phase consists of finding which parameters actually present a typical pattern when the driver experiences drowsiness. This pattern should be distinct from the patterns that these parameters present during normal driving.
Figure 1. Frequency Response of the Digital Filters
(a) Low-Pass Filter (b) Band-Pass Filter (c) High-Pass Filter
In previous works, researchers mainly established links between driver drowsiness and several parameters related to the vehicle based on statistical correlation studies. These conclusions were used as the basis for our choice of parameters or features to extract from the acquired data. A new approach was implemented, allowing real-time processing, as well as drowsiness detection. This new approach is based on the use of digital filters. The data obtained from the steering wheel are split, in real-time, into three frequency bands by using the Finite-length Impulse Response (FIR) filtering technique (see figure 1). These bands provide information on the type of movements executed by the driver. According to the literature, fast and short movements are related to a normal driving pattern, whereas slow movements are an indication of drowsiness [2, 5, 10, 11]. Gabrielsen and Sherman proposed the spectral analysis of the steering wheel signal as an important indication of driver drowsiness. Their study, though, was limited to a global analysis, performed off-line (after the whole driving period).
The use of digital filters allows the extraction of spectral information of the signal in real-time, as the implementation of a digital filter is a weighted sum of the current and previous samples of the signal . The coefficients determining the weight of each element (sample) determine the characteristics of the filter. The design of these filters (obtaining the coefficients) is a standard, yet complex problem. We omit the design of the filters in this paper. For further details, the reader can consult  or .
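The weighted-sum structure of an FIR filter can be illustrated as follows. This is only a sketch: the class `FIRFilter` is a hypothetical helper, and the coefficients shown in the usage note are placeholders (a plain moving average), not the coefficients designed for the prototype, which are not given in this paper.

```python
from collections import deque


class FIRFilter:
    """Sample-by-sample FIR filter: y[n] = sum over k of h[k] * x[n - k]."""

    def __init__(self, coefficients):
        self.h = list(coefficients)
        # history[0] holds the current sample, history[k] the sample k steps back.
        self.history = deque([0.0] * len(self.h), maxlen=len(self.h))

    def step(self, x):
        """Feed one new sample and return the corresponding output sample."""
        self.history.appendleft(x)   # the oldest sample falls off the right end
        return sum(hk * xk for hk, xk in zip(self.h, self.history))
```

As a usage example, `FIRFilter([0.2] * 5)` averages the last five samples, which acts as a crude low-pass filter; band-pass and high-pass behavior is obtained purely by choosing different coefficients. Because each output needs only the current and a few previous samples, the computation fits easily within one 200 ms sampling period.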
The signal of the steering wheel is thus split into three bands that provide information related to the state of alertness of the driver. The energy present in each band is measured by using a moving average filter (over the last ten seconds). These energy measures give information about the frequency content of the steering wheel signal and, therefore, about the type of movements that the driver is making. Fast and short movements are related to a high value in the high-frequency band (the energy measured for the high-pass filter), whereas the lack of these movements may be related to a state of drowsiness or reduced alertness. The choice of the frequency ranges for the filters (the cutoff frequencies) was based on the conclusions of the works of Boisvert and of Gabrielsen and Sherman. For more details on how the spectral information relates to specific types of movements and how these movements relate to the state of the driver, the reader can consult ,  or .
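The band-energy measurement can be sketched as a moving average of the squared filter output. The following is an illustration under assumptions: taking the energy as the mean of the squared samples is one common definition and may differ from the exact measure used in the prototype, and the class name `MovingEnergy` is hypothetical. The window of 50 samples corresponds to ten seconds at the 5 Hz sampling rate stated earlier.

```python
from collections import deque

ENERGY_WINDOW_LEN = 50   # ten seconds at 5 Hz


class MovingEnergy:
    """Mean of the squared samples over the most recent window (hypothetical helper)."""

    def __init__(self, window_len=ENERGY_WINDOW_LEN):
        self._squares = deque(maxlen=window_len)

    def update(self, y):
        """Add one filter output sample and return the current band energy."""
        self._squares.append(y * y)
        # The window is short, so summing it at every step is inexpensive.
        return sum(self._squares) / len(self._squares)
```

One such object per band (low, middle and high) would yield the three steering wheel features at every sampling instant.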
As many studies have pointed out, the steering wheel provides the most important and reliable information to detect driver drowsiness (except for the physiological parameters). This has been confirmed by our study as well. However, other parameters have been reported to be related to the state of alertness of the driver. These parameters, related to the movements of the accelerator pedal (specifically, a release of the accelerator pedal) and to the speed of the vehicle, are also computed in real-time by our system. This set of parameters or features (also called the feature vector) is the information that we feed into the pattern recognition sub-system. For more information about the computed parameters, the reader can consult .
The next section describes the processing done by the pattern recognition sub-system to decide whether the combination of values corresponds to the pattern present when the driver experiences drowsiness, or whether it corresponds to a normal driving pattern.
The technique used to detect the drowsiness condition from the parameters obtained from the sensors is based on the Classical Decision Theory. This theory, based on statistical principles, consists of minimizing the probability of misclassification of a condition (or object) given a set of parameters (features) measured from the condition or object. Many techniques have been proposed to implement this general theory. Among the most popular are Linear Discriminant Functions and the k-Nearest Neighbors (k-NN) rule. The former is reported to be the most efficient from the point of view of computation time, whereas the latter is reported as optimal from the point of view of classification error. Given the constraint of real-time processing and detection, our system is based on the use of a linear discriminant function, and the distance to the theoretical position of the drowsiness cluster is used to evaluate the performance of the system in a preliminary way. In the next stages of the research, classification techniques with better performance will be evaluated, such as the (Bayes-optimal) k-NN rule.
The linear discriminant functions technique consists of evaluating the Euclidean distance to points that correspond to the typical pattern associated with each possible classification. This distance is computed in an N-dimensional space, where N is the number of features extracted from the object or condition to classify. In our context, we extract six parameters or features (the three bands of the steering wheel signal, two parameters associated with the speed of the vehicle, and one related to the movements of the accelerator pedal), and we compute the Euclidean distance in a six-dimensional space to the theoretical point corresponding to the drowsiness condition.
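The distance computation described above can be sketched as follows. The function name, the ordering of the six features and all numeric values in the usage note are hypothetical and serve only to illustrate the calculation; the actual coordinates of the theoretical drowsiness point are not given in this paper.

```python
import math

N_FEATURES = 6   # three steering bands, two speed parameters, one accelerator parameter


def distance_to_drowsiness(features, drowsy_point):
    """Euclidean distance between a feature vector and the theoretical drowsiness point."""
    if len(features) != len(drowsy_point):
        raise ValueError("feature vector and reference point must have the same length")
    return math.sqrt(sum((f - p) ** 2 for f, p in zip(features, drowsy_point)))
```

For example, with a made-up reference point `[0, 0, 0, 0, 0, 0]` and a made-up feature vector `[3, 4, 0, 0, 0, 0]`, the distance is 5.0. Smaller distances mean that the current driving pattern is closer to the pattern assumed for drowsy driving.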
To find the points corresponding to the typical pattern (or combination of values) of each condition to be detected, either an exact mathematical model of the object or condition to detect, or a database of objects or sample conditions for which the actual classification is known, is required. Either one provides a statistical description of the object or phenomenon to classify, which makes it possible to obtain the points associated with each possible classification. In our case, the condition to be detected is an extremely complex phenomenon, affected by numerous factors (including mainly psychological aspects) that are impossible to model. The use of a database of samples of normal driving and samples of drowsy driving would make it possible to obtain the point corresponding to the pattern associated with the condition of drowsiness. However, to use the database approach, we require samples of real drowsy driving. The drawbacks associated with this are obvious, and indeed, this has been the main difficulty encountered in previous works.
The approach that we adopted was to compute a theoretical point corresponding to drowsy driving. Based on previous studies, which established links between driving parameters and the state of alertness of the driver, we compute the six-dimensional point by setting each of the six coordinates independently. Each coordinate of this feature vector space corresponds to a specific parameter that, according to the results of the previous works, presents certain values (or ranges of values) associated with drowsy driving, and certain values or ranges associated with normal driving. We set the value of each coordinate of the theoretical point of drowsiness to the value reported for that parameter as being related to drowsy driving.
The difficulty in obtaining samples of real drowsy driving affects not only the design of the system, but also the validation and evaluation of its performance, as we discuss in the next section. To make a quantitative evaluation of the performance of the system, we need periods of driving in which the drivers became drowsy, and we need to know the exact instants at which this condition occurred. With this information, we can compute the rate of missed detections (when the driver became drowsy and the system did not detect the condition) and the rate of false alarms (when the system detects the drowsiness condition although the driver was not drowsy). In the next section, we discuss these limitations and present the strategy that we used to conduct validation tests that allowed us to evaluate, in a preliminary, qualitative way, the performance of the system.
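If labeled data were available, the two rates mentioned above could be computed as in the following sketch. The per-instant definition used here (one true or false label per sampling instant) is an assumption made for the example; the paper does not fix a precise definition, and no such labeled data were available for the tests reported here.

```python
def detection_rates(truly_drowsy, detected):
    """Return (missed_detection_rate, false_alarm_rate) from per-instant boolean labels.

    truly_drowsy[i] is True when the driver was actually drowsy at instant i,
    detected[i] is True when the system flagged drowsiness at instant i.
    A rate is None when it is undefined (no drowsy or no alert instants).
    """
    if len(truly_drowsy) != len(detected):
        raise ValueError("both label sequences must have the same length")
    pairs = list(zip(truly_drowsy, detected))
    drowsy_total = sum(1 for t, _ in pairs if t)
    alert_total = len(pairs) - drowsy_total
    missed = sum(1 for t, d in pairs if t and not d)
    false_alarms = sum(1 for t, d in pairs if d and not t)
    missed_rate = missed / drowsy_total if drowsy_total else None
    false_alarm_rate = false_alarms / alert_total if alert_total else None
    return missed_rate, false_alarm_rate
```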
As we discussed in the previous section, the main problem that we had to deal with was the evaluation of the performance of the system. This, as already said, requires samples of real drowsy driving. That is, it requires driving periods in which the driver becomes drowsy, and knowledge of the exact instants at which this condition occurred. The use of sensors that could give absolutely reliable indications of drowsiness was immediately discarded, since the only reliable indications are physiological parameters that can only be measured with sophisticated, non-transparent sensors (which might affect the driving task). On the other hand, driving in a condition of very intense drowsiness (which almost guarantees that the driver would be drowsy during most of the driving period) must also be discarded, given the risks associated with such a scenario.
As a compromise, we performed preliminary tests to evaluate the performance of our system. The data being processed in real-time were also stored for further analysis. A statistical comparison of two periods of highway driving was done. One period was during the morning, and one in the evening (at approximately 9 pm). Five drivers participated. The same route (approximately one hour of driving) was followed by all the drivers during both driving periods.
The parameter used to evaluate the performance is the Euclidean distance to the theoretical point corresponding to drowsiness. We used the stored data to recompute (off-line) the distance of the feature vector to the theoretical point of drowsiness at every instant during the whole driving period. We then compared the histograms of this distance, which showed important differences, as we present and discuss in the next section.
Figure 2 shows the percentiles of the distance to the theoretical point of drowsiness for each driving period. This graph is interpreted as follows: the x-axis shows the distance corresponding to a given percentile. If, for instance, we consider the point of the 70% percentile of the day driving (d ≈ 1.4), we know that during 70% of the driving period, the distance was higher than this value. Given that low distances mean that the feature vector is close to the theoretical point of drowsiness, we can conclude that during 30% of the driving period, the driver experienced a level of drowsiness more intense than that value (assuming that the distance is an indicator of the level of drowsiness).
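The percentile convention used in this graph (the distance that was exceeded during a given percentage of the driving period) can be sketched as follows. The function name and the sample data in the usage note are hypothetical; the sketch only illustrates how such a curve could be obtained from the stored distances, not the exact procedure used to produce Figure 2.

```python
def exceedance_level(distances, percent):
    """Return the distance d such that about `percent` % of the samples are >= d.

    `percent` is an integer between 1 and 100.
    """
    if not distances:
        raise ValueError("at least one distance sample is required")
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ordered = sorted(distances, reverse=True)        # largest distance first
    count = (percent * len(ordered) + 99) // 100     # ceil(percent * n / 100)
    return ordered[count - 1]
```

With the made-up samples `[1, 2, ..., 10]`, `exceedance_level(samples, 70)` returns 4, since seven of the ten samples are greater than or equal to 4. Evaluating the function for every percentage from 1 to 100 gives one curve per driving period.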
When we compare the curves for both periods, we notice that their first halves are very close to each other (up to the 50% percentile). For the lower values of the distance, however, the curves clearly separate, with the smaller distances (i.e., the more intense indications of drowsiness) occurring during the evening period. Even though this global, statistical information does not include details about the exact instants at which the distances decreased, the fact that the curves differ only in the percentiles related to drowsy driving leads us to conclude that the performance of the system seems to be appropriate.
This can be interpreted, from a more intuitive point of view, as follows: Even though the average behavior of the drivers was roughly the same for both periods, the indications of drowsiness were more intense for the evening period, which agrees with the expected results.
We must stress, though, that this is a preliminary evaluation, and it should only be taken as a good, qualitative indication of the performance of our system. More elaborate driving tests must be done, not only to quantitatively evaluate the performance, but also to adjust the system for optimal performance.
The influence of each parameter (each sensor) was studied as well. We plotted the percentiles after re-computing the Euclidean distances with individual parameters or groups of parameters left out. Figure 3 shows the percentiles of the distances for both periods, with the distance computed in a three-dimensional space that excludes the parameters corresponding to the steering wheel. Since the curves for the two periods become almost identical when the influence of the steering wheel parameters is removed, we conclude that the steering wheel is the most important among the parameters used for this first prototype. This does not mean, however, that the other parameters should be discarded. For more details, the reader can consult .
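Leaving parameters out of the distance, as done for Figure 3, amounts to computing the Euclidean distance over a subset of the coordinates. The sketch below assumes, purely for illustration, that the three steering wheel bands occupy indices 0 to 2 of the feature vector; the function name and the index layout are hypothetical.

```python
import math

STEERING_INDICES = {0, 1, 2}   # assumed positions of the three steering wheel bands


def distance_excluding(features, drowsy_point, excluded=frozenset()):
    """Euclidean distance computed only over the coordinates not listed in `excluded`."""
    if len(features) != len(drowsy_point):
        raise ValueError("feature vector and reference point must have the same length")
    return math.sqrt(sum(
        (f - p) ** 2
        for i, (f, p) in enumerate(zip(features, drowsy_point))
        if i not in excluded
    ))
```

Calling `distance_excluding(features, drowsy_point, STEERING_INDICES)` at every stored instant gives the three-dimensional distance from which percentile curves without the steering wheel can be built.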
The first phases of the development of a real-time driver drowsiness detection device were successfully completed. Preliminary tests were done as well. The results of these tests allow us to conclude that the performance of the system appears to be appropriate, even though the limited mechanism of data collection does not allow us to evaluate the actual performance of the system (i.e., percentage of missed detections and percentage of false alarms). Important differences between the two driving periods were observed for all the drivers by comparing statistical characteristics of the Euclidean distance. More elaborate driving tests must be performed in the next stages of the research, which will allow us not only to actually evaluate the performance of the system, but also to adjust the system for optimal performance.
The authors are grateful to the Fonds pour la formation de chercheurs et l'aide à la recherche / Société de l'assurance automobile du Québec / Ministère des transports du Québec (FCAR / SAAQ / MTQ action concertée) and the Université de Montréal for sponsorship of this research.
The authors would like to thank all those who kindly helped in the first stages of this project, mainly Domingo De Negri and Éric Girard, who helped in the physical setup of the vehicle. Thanks are also extended to Luc Lafranchise, Juan Parra and Olivier Bellavigna-Ladoux, for their cooperation during the driving tests phase.