Visual Psychophysics
Relationship of psychophysics to psychology
Psychophysics is perhaps the oldest subfield of psychology. It is an outgrowth of 19th century attempts to measure and quantify sensory experience.
Gustav Fechner (1801-1887), a German physicist and psychologist, coined the term "psychophysics" by which he meant the science of measuring the mind. His hope was to derive the correspondence between quantitative variables and qualitative (subjective) experience, that is, to develop a physics of "mental energy" akin to the successful physics of physical energy which was being developed in the 19th century.
As with all experimental psychology, the tools of the psychophysicist are independent variables (physical stimuli such as light, sound, or mechanical pressure), and dependent variables -- behavioral responses of various kinds, such as vocalizations (e.g., "I saw it") or button presses.
The Concept of the Threshold
Much (if not most) work in psychophysics has centered on determining how sensitive a sensory system is. This is determined by measuring how much of a particular stimulus is required to reliably detect that stimulus. In olden days one spoke of brides being carried across the threshold, or entrance, of a home. A sensory threshold is also a kind of entrance: it represents the entrance of a stimulus into sensory existence. The threshold for a particular light stimulus is that intensity which allows it to be "just seen".
Psychophysical Methods
There are a variety of ways to measure threshold. A straightforward way is called the Method of Limits. Here, a stimulus is either gradually increased (Ascending Series) or decreased (Descending Series) in intensity, and the subject indicates on each trial (on each presentation) whether the stimulus was "seen" or "not seen" (or felt, or heard, or smelled, etc.). One can place the results of such an experiment in a table, like the one shown below.
Hypothetical Responses to Stimuli Presented in Four Trials Using Method of Limits (Ascending Series)
| Stimulus # | Trial #1 | Trial #2 | Trial #3 | Trial #4 | % "Yes" Responses |
| S1 | N | N | N | N | 0/4 (0%) |
| S2 | N | N | N | N | 0/4 (0%) |
| S3 | N | N | N | N | 0/4 (0%) |
| S4 | N | N | N | N | 0/4 (0%) |
| S5 | N | N | N | N | 0/4 (0%) |
| S6 | Y | N | N | N | 1/4 (25%) |
| S7 | Y | N | Y | N | 2/4 (50%) |
| S8 | Y | N | Y | Y | 3/4 (75%) |
| S9 | Y | Y | Y | Y | 4/4 (100%) |
| S10 | Y | Y | Y | Y | 4/4 (100%) |
| S11 | Y | Y | Y | Y | 4/4 (100%) |
| S12 | Y | Y | Y | Y | 4/4 (100%) |
The letters in the columns refer to the subject's responses of whether the stimulus was reported as seen (Y) or not seen (N) on a particular presentation, or trial. I have color-coded the responses for ease of interpretation: red=No; green=Yes. The Yes responses that follow the first Yes in a column (shown in italics) are not typically obtained, since stimuli usually cease to be presented following the first Yes response. They are shown here for purposes of illustration. These data describe results from an ascending series, since the stimulus strength increases with each presentation. Series are often run in alternating ascending and descending directions. Note that in any individual column there is a discrete intensity at which the stimulus appears to cross a threshold from unseen ("No" response) to seen ("Yes" response).
The value of this threshold varies from column to column, however, and this fact argues against the idea that threshold is a fixed, unchanging point along the sensory continuum. The idea of a fixed threshold is therefore replaced with the idea that thresholds are stochastic (that is, probabilistic, or variable), either because threshold actually changes over time (say because the general level of neural excitability changes), and/or because a variable amount of "equivalent stimulus" (called noise) is added to the detecting mechanisms on different series of trials. In any event, the modern concept of the threshold is probabilistic: each stimulus intensity is associated with a probability that a (Y) response will be given, and threshold is the intensity at which that probability reaches some chosen value.
Graphs which plot a behavioral response (% detection) as a function of a physical variable (i.e., stimulus intensity) are referred to as psychometric functions. In tasks requiring a simple Yes or No response such as the one we are exploring, threshold is usually computed as the stimulus intensity yielding a 50% "Yes, I saw it" response rate, although the decision is arbitrary, and other criteria may occasionally be used (e.g., 40% or 65%). Follow the long horizontal dashed line in the figure above which traces the 50% "Yes" response rate across until it intersects the solid curved line. This smooth curve through the data points is the best fit to the response data of the table above (solid symbols) of a theoretical function called a cumulative normal distribution, which is the integral of a normal distribution. A general name for the family of such S-shaped functions is an ogive. From where the 50% Yes response rate intersects the ogive, follow the short dashed vertical line down to the abscissa (x-axis). The point at which the vertical dashed line intersects the abscissa shows that the 50% "Yes" response rate is associated with a stimulus intensity of 5 units. Five units would be defined as the threshold for this stimulus.
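To make the fitting step concrete, here is a minimal sketch in Python that fits a cumulative normal to the percentages in the Method of Limits table and reads off the 50% point. It rests on assumptions of mine: the stimulus number (1 to 12) is used as a stand-in for intensity, which need not match the intensity units on the figure's x-axis, and the fit is a crude least-squares grid search rather than whatever procedure produced the plotted curve. The names `sse` and `fit_ogive` are hypothetical.

```python
from statistics import NormalDist

# Proportion of "Yes" responses for stimuli S1..S12, taken from the table.
# Assumption: stimulus number stands in for intensity (arbitrary units).
intensity = list(range(1, 13))
p_yes = [0, 0, 0, 0, 0, 0.25, 0.50, 0.75, 1, 1, 1, 1]

def sse(mu, sigma):
    """Sum of squared errors between the data and a cumulative normal."""
    ogive = NormalDist(mu, sigma)
    return sum((ogive.cdf(x) - p) ** 2 for x, p in zip(intensity, p_yes))

def fit_ogive():
    """Brute-force least-squares fit over a coarse (mu, sigma) grid."""
    best = None
    for i in range(221):             # mu from 1.00 to 12.00 in steps of 0.05
        mu = 1 + i * 0.05
        for j in range(1, 101):      # sigma from 0.05 to 5.00 in steps of 0.05
            sigma = j * 0.05
            err = sse(mu, sigma)
            if best is None or err < best[0]:
                best = (err, mu, sigma)
    return best[1], best[2]

mu, sigma = fit_ogive()

# The 50% point of a cumulative normal is its mean.
threshold_50 = NormalDist(mu, sigma).inv_cdf(0.5)
print(f"fitted mean = {mu:.2f}, fitted sd = {sigma:.2f}")
print(f"50% threshold = {threshold_50:.2f} (in stimulus-number units)")
```

A different criterion (say 65%) can be read off the same fitted curve by passing 0.65 to `inv_cdf`.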
Note that in the Method of Limits the subject simply responds "Yes" or "No" to stimuli whose strength has been set by the experimenter. Another technique in which the subject takes a more active role is called the Method of Adjustment. Here, it is the subject who controls the intensity of the stimulus. The subject adjusts the intensity until the stimulus is judged to be (in the case of a visual stimulus) "just visible". The Method of Adjustment is a straightforward and convenient technique, particularly for tracking thresholds which are undergoing rapid change over time.
One problem with either the Method of Limits or the Method of Adjustment is that the subject knows which intensity of the stimulus to expect from trial to trial (either a slightly stronger stimulus, in an ascending series, or a slightly weaker one, in a descending series). This expectation can influence the measured threshold. A method designed to overcome this problem is the Method of Constant Stimuli. Here the order of presentation of the stimuli is randomized, so the subject cannot anticipate the intensity of the stimulus on any given trial. All the various stimulus intensities are presented, and a table similar to that for the Method of Limits (above) can be constructed. From such a table, the percent (Y) responses can be plotted as a function of stimulus strength, and a psychometric function can be described.
While the Method of Constant Stimuli eliminates some of the problems of the previous methods, it still possesses a troubling flaw: the point at which "No" responses become "Yes" responses is determined not only by the stimulus threshold, but also by the whim of the subject. That is, there is no control over the response criterion applied to the decision of whether a stimulus was seen or not. This is a problem because some subjects may be cautious types, who refrain from saying "Yes" until the stimulus is clearly visible to them. Other subjects are risk-takers, often saying "Yes" even if they're not entirely sure. Subjects may even change response criteria within the course of an experiment.
A method developed to circumvent the problem of response criteria is the Method of Forced-Choice. Here subjects are presented with two or more alternatives, and must select one on each trial even if the stimulus was not clearly seen. The choice can thus be coded as a criterion-free "correct" or "incorrect". Alternatives can be presented sequentially (temporal forced-choice), or can be presented simultaneously (spatial forced-choice). There must be at least two alternatives, but there can be up to four or five. (More than four or five usually becomes too confusing for subjects.) Below are several examples from a three-alternative spatial forced-choice task (color discrimination). When there are more than 2 choices, forced-choice becomes an "oddity" task, that is, the subject's task is to choose the "odd" (i.e., different looking) window on each trial.
Forced-choice methods usually reveal lower thresholds than other techniques. When subjects are forced to choose, they usually make better than chance guesses even when they FEEL like they're just guessing. The forced choice psychometric function also differs from the Yes/No function in that it does not fall to 0% for weak stimuli. Instead, the worst performance is guessing, which in a 2AFC task is 50%. It is conventional to take the performance level half-way between guessing and perfect as the criterion for threshold in a 2AFC task (this is 75% correct in a 2AFC experiment). Here are some forced-choice psychometric functions for 2-alternative (green), 3-alternative (red) and 5-alternative (blue) tasks. Note that chance performance (the percent correct you obtain from merely guessing) varies with the number of alternatives.
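The arithmetic behind chance performance and the half-way threshold criterion can be sketched in a few lines of Python. The text states the half-way convention only for the 2AFC case; applying the same rule to 3 and 5 alternatives is an assumption made here for illustration, and the function names are hypothetical.

```python
def chance_level(n_alternatives):
    """Proportion correct expected from pure guessing among n alternatives."""
    return 1.0 / n_alternatives

def threshold_criterion(n_alternatives):
    """Performance level half-way between guessing and perfect (1.0)."""
    chance = chance_level(n_alternatives)
    return chance + (1.0 - chance) / 2.0

for n in (2, 3, 5):
    print(f"{n} alternatives: chance = {chance_level(n):.1%}, "
          f"half-way criterion = {threshold_criterion(n):.1%}")
```

For two alternatives this reproduces the 50% guessing rate and the 75% correct criterion mentioned above.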
Forced-choice paradigms can be used to measure the sensory capacities of non-verbal subjects (human or animal), since subjects can be operantly conditioned to select the "odd" stimulus from an array of three or more. The difference between the "odd" stimulus and the others can be adjusted to smaller and smaller amounts while performance is tracked.
Absolute Thresholds and Difference Thresholds
So far we have discussed the Absolute Threshold, i.e., the amount of a stimulus which is required to simply detect it against no background (e.g., detecting light in an absolutely dark room). A more general type of threshold is called the Difference Threshold, which is defined as the size of the difference between two stimuli required in order to just tell them apart. Another term for the Difference Threshold is the Just-Noticeable Difference (JND).
Let's define the intensity of a "standard" stimulus as I_O, and the intensity of a "test" stimulus as I_T. Differential sensitivity refers to the smallest increase in stimulus strength (I_T - I_O) which can be detected on the background of the "standard" stimulus I_O. The difference (I_T - I_O) is known as ΔI (pronounced "delta I"). The threshold value of ΔI is the JND. Differential sensitivity can be measured by any of the psychophysical methods we just talked about.
Weber's Law
A commonly performed experiment involves measuring the threshold ΔI (the JND) while changing the intensity of the background "standard" stimulus (I_O). The background intensity is usually increased from a low (i.e., dark) to a high (i.e., bright) intensity level. We do this to discover how the "standard" background level affects visual sensitivity, that is, how it affects the size of the JND.
Ernst Weber (in the 1830s) performed such experiments and made a discovery concerning the relationship between I_O and ΔI. Namely, he found that over a wide range of stimulus intensities ΔI = k·I_O. This expression can be rewritten as ΔI/I_O = k. What Weber discovered was that the JND is not an absolute amount of stimulus, but a constant proportion of the background "standard" stimulus, I_O. The three figures below illustrate the larger increments required for a test stimulus to be "seen" on "standard" backgrounds of increasing intensity.
Thus, the more intense (or larger) the background stimulus, the larger the increment must be in order to be detected on top of the background. The k in the equation ΔI = k·I_O is called the constant of proportionality -- it determines how fast the threshold increment rises with increasing background level. The smaller the value of k, the smaller the threshold remains (i.e., the more sensitive the system) as background level increases. Another name for k is the Weber Fraction. The value of the Weber fraction varies across sensory systems and across discrimination tasks within given systems, as shown in the table below.
| Discrimination Task | Weber Fraction (k) |
| Brightness | 0.079 |
| Loudness | 0.048 |
| Finger Span | 0.022 |
| Heaviness | 0.020 |
| Line Length | 0.029 |
| Taste (salt) | 0.083 |
| Electric Shock | 0.013 |
| Vibration (fingertip) | 0.036 |
For weight discrimination using one's finger, k = 1/30 (0.0333). This means that a 1 lb weight can be discriminated from another weighing 1 + 1 × (1/30), or 1.0333 lbs. How large an increment would be required to detect a change in a 10 lb weight? 10 + 10 × (1/30) = 10.333 lbs. This is a much larger absolute increase, but the same proportion of the standard.
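The weight example is simple enough to check in code. The following is a small sketch of Weber's Law as stated above (the JND is k times the standard); the function names `jnd` and `just_discriminable` are hypothetical, chosen only for this illustration.

```python
def jnd(standard, k):
    """Weber's Law: the just-noticeable increment is a fixed fraction k
    of the standard intensity."""
    return k * standard

def just_discriminable(standard, k):
    """Smallest test intensity that can be told apart from the standard."""
    return standard + jnd(standard, k)

k_weight = 1 / 30  # Weber fraction from the weight example in the text

for standard_lb in (1, 10):
    print(f"standard = {standard_lb} lb: JND = {jnd(standard_lb, k_weight):.4f} lb, "
          f"test weight = {just_discriminable(standard_lb, k_weight):.4f} lb")
```

The absolute increment grows tenfold between the two standards while the ratio of increment to standard stays fixed at k.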
Sensory Scaling
Stimuli which exceed threshold are referred to as suprathreshold. Measuring and expressing the relationship of suprathreshold stimuli to each other is referred to as sensory scaling.
The first attempt to define the magnitude of suprathreshold sensation was indirect. Fechner used the mathematical relationship between the threshold increment (ΔI) and the "standard" intensity (I_O) -- Weber's Law -- to make an inference about the relationship between stimulus magnitude and suprathreshold sensory experience.
Since Fechner doubted that sensation magnitude could be measured directly, he used the concept of the JND to develop a scale. He ASSUMED that all JND's measured on all different "standard" backgrounds (which were ALL the same in the sense that they represented barely perceptible changes in stimulus intensity) were ALSO equal increments in sensory experience. NOTE: this is an assumption, not a fact. A consequence of this assumption, however, is that the JND can now be used as a metric of sensory magnitude (just like inches, or centimeters, form a metric of distance).
For those of you who are mathematically inclined, Fechner's Law is derived from Weber's Law through integration (each threshold increment ΔI is treated as producing the same fixed increment in Ψ, the sensory magnitude):
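The integration can be sketched as follows, writing Ψ for sensory magnitude. This is the usual textbook route and not necessarily the exact sequence of steps originally shown; the constant c, the integration constant C, and the threshold intensity I_th (the intensity at which Ψ is taken to be zero) are symbols assumed here for illustration.

```latex
% Weber's Law: the just-noticeable increment is a constant fraction of the standard.
\frac{\Delta I}{I} = k

% Fechner's assumption: every JND adds the same small increment of sensation,
% so in the limit of small steps
d\Psi = c \, \frac{dI}{I}

% Integrating both sides:
\Psi = c \ln I + C

% Setting \Psi = 0 at the absolute threshold I_{th} fixes C = -c \ln I_{th}:
\Psi = c \ln\!\left(\frac{I}{I_{th}}\right)
```

With intensity measured in units of the threshold intensity, and the change of logarithm base absorbed into the constant, this is the logarithmic form of Fechner's Law.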
Fechner's Law states that sensory magnitude (Ψ) is proportional (with constant k) to the logarithm (log) of stimulus magnitude (I): Ψ = k log I. What this means is that constant differences in sensation are the result of constant ratios of physical stimuli. That is, if doubling the intensity of a stimulus adds a certain amount to its brightness, then a second doubling adds that same amount again, and so on.
The logarithmic compression implied by Fechner's Law is thought to occur in the photoreceptors themselves, that is, the response of a receptor (R) is generally a logarithmic function of stimulus intensity: R = c log(I).
S.S. Stevens took issue with Fechner's Log law. The log function is only one of many compressive functions, and Stevens argued that the correct mathematical relationship between stimulus intensity and sensory magnitude was described by a Power Law:
Ψ = kI^n
Again, Ψ is sensation magnitude, k is a constant of proportionality, and n is an exponent to which the stimulus intensity is raised. In order for a power function to be compressive, the exponent n must be smaller than one. If n = 1, the relationship is linear, and if n > 1, the function is expansive, or positively accelerating.
Stevens presented empirical evidence that equal ratios of physical intensity produced equal ratios (not differences as Fechner had asserted) of sensation. This is what would occur if the Power Law were correct. Stevens used a technique to measure sensation magnitude which is ridiculously simple: he asked people to assign numerical values to the strength of stimuli, a procedure known as Magnitude Estimation. Magnitude Estimation is an example of Direct Scaling. In the accompanying illustration, Calvin is assigning numerical estimates to the perceived magnitude of his "effort".
Like Weber fractions, the exponents of the power functions derived from magnitude estimation studies across a number of sensory channels (like brightness, contrast, loudness, pain, cold, weight, electric shock, etc.) are quite variable, as illustrated in the table below. Values of n which are less than 1 mean that sensation magnitude rises more slowly than physical intensity; values greater than 1 mean that sensation magnitude rises faster than the increases in physical intensity. For example, the sensation associated with electrical stimulation of teeth rises much faster than the physical intensity of the shock -- a whole lot of bang for the buck!
| Power Law Exponent | Scaling Task |
| 0.33 | Brightness of 5 deg target in dark |
| 0.50 | Brightness of brief flash |
| 0.60 | Smell of heptane |
| 0.60 | Intensity of 250 Hz finger vibration |
| 0.67 | Loudness of 3000 Hz tone |
| 0.70 | Visual area |
| 0.80 | Tactual hardness |
| 0.95 | Intensity of 60 Hz finger vibration |
| 1.00 | Ambient temperature |
| 1.00 | Visual length |
| 1.10 | Duration of white noise |
| 1.10 | Pressure on palm |
| 1.10 | Vocal sound pressure |
| 1.30 | Thickness of blocks (haptic) |
| 1.30 | Taste of sucrose |
| 1.40 | Taste of salt |
| 1.45 | Heaviness of lifted weights |
| 1.50 | Temperature of arm |
| 1.50 | Tactual roughness |
| 1.70 | Handgrip strength |
| 2.00 | Electrical stimulus (auditory) |
| 3.50 | Electric shock to skin |
| 7.00 | Electric shock to teeth (ouch!!) |
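To get a feel for what these exponents imply, the sketch below computes how much sensation magnitude changes when physical intensity is doubled, under the Power Law (sensation = k × intensity^n). The constant k cancels in the ratio, so it does not appear. The function name `sensation_ratio` is hypothetical, and the exponents are a selection copied from the table above.

```python
def sensation_ratio(intensity_ratio, n):
    """Under a power law with exponent n, multiplying intensity by
    intensity_ratio multiplies sensation by intensity_ratio ** n."""
    return intensity_ratio ** n

# A few exponents taken from the table above.
exponents = {
    "Brightness of 5 deg target in dark": 0.33,
    "Loudness of 3000 Hz tone": 0.67,
    "Visual length": 1.00,
    "Heaviness of lifted weights": 1.45,
    "Electric shock to skin": 3.50,
}

for task, n in exponents.items():
    print(f"{task}: doubling intensity multiplies sensation by "
          f"{sensation_ratio(2, n):.2f}")
```

Because the result depends only on the intensity ratio and not on the starting intensity, equal stimulus ratios always yield equal sensation ratios, which is the signature of the Power Law described above.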
Magnitude Production is a variation on the theme of Magnitude Estimation. Here a subject adjusts the intensity of a stimulus to equal a prescribed numerical value.
Cross-Modal Matching is another closely related procedure in which a subject might be asked to adjust the brightness of a visual stimulus until it is judged to be as bright as an auditory stimulus is loud.
Signal Detection Theory
In modern psychophysics the concept of a "hard" threshold has been replaced by the concept of probabilistic detectability. Signal Detection Theory (SDT) was originally developed to analyze the performance of telecommunication systems (which transmit and receive information, as do nervous systems). Like forced-choice procedures, SDT overcomes the limitations mentioned earlier with regard to the yes-no psychophysical methods, in which observers with dissimilar response biases might be judged to possess different sensory thresholds, even though their actual sensory capacities might be identical.
SDT assumes that there is always some amount of "noise" in a sensory system. I find it convenient to think of this noise as a consequence of using (fallible biological) neurons to process information. Neurons sometimes respond even in the absence of a stimulus, sometimes they fail to respond even to a strong stimulus, and their response to the same stimulus is not necessarily consistent over time.
In the absence of a stimulus (i.e., a "signal") the amount of noise (i.e., the level of activity) in the detecting system at any point in time is assumed to be a random variable which is normally distributed. This distribution of potential activity levels in the "undisturbed" system is called the "noise distribution". Put into the language of inferential statistics, noise constitutes "random sampling error". Below is shown such a noise distribution.
The noise distribution has mean X_N (= 10 in this example) and standard deviation σ_N (= 1.3 in this example). This is a picture of the theoretical probability of various levels of sensory system activation IN THE ABSENCE OF A STIMULUS. On trials when a stimulus is actually present, the stimulus contributes some additional level of activation to the sensory system, and the mean of the Signal+Noise distribution is displaced rightward, to a value of X_N+S (= 12.5 in this example), with the same standard deviation.
The observer's task is to indicate on each trial whether the stimulus was seen or not -- that is, to decide, based upon the level of sensory activation, if the trial was more likely a Noise trial (response = N) or a Stimulus + Noise trial (response = Y). If the distance separating the means of these distributions is small in comparison to their standard deviations, then any particular level of activation could equally well have come from either distribution. Note that the distance between the means of the two distributions, expressed in standard deviation units, is called d', and is essentially a t-score: d' = (X_N+S - X_N) / σ_N.
Consider a situation in which d' is fairly large [(12.5 - 10)/1.3 = 1.92 in this example]. A particular level of sensory activation occurs on a trial: how does a subject arrive at a decision about whether it was a S+N or a N trial? Observers will adopt some response criterion (β), that is, they will choose some level of activation which, if exceeded on a particular trial, will result in a "yes" response. If the level of activation on a particular trial falls below the criterion, the subject will respond "no". As the table below shows, there are two possible outcomes for each response: "Yes" responses are either Hits or False Alarms; "No" responses are either Misses or Correct Rejections.
| Stimulus Alternatives | Response: "Yes" | Response: "No" |
| Signal | Hit | Miss |
| No Signal | False Alarm | Correct Rejection |
Given the theoretical Noise and Signal+Noise distributions, and the particular decision criterion chosen, we can compute the probability of each kind of response. The probability in each panel is the hatched area under the curve.
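[Figure: four panels, arranged as Signal / No Signal (rows) by "Yes" / "No" (columns), each showing the hatched area under the corresponding distribution.]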
The number of Hits and False Alarms both depend upon the criterion value adopted by the subject. A low (more liberal) criterion leads to higher rates of both Hits and False Alarms (left panel), whereas a higher (more conservative) criterion yields lower rates of both Hits and False Alarms. Note that in neither case does a change in the decision criterion affect the actual detectability of the stimulus (d').
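The four outcome probabilities can be computed directly as areas under the two normal curves. The sketch below uses the example values from the text (noise mean 10, signal+noise mean 12.5, common standard deviation 1.3). The three criterion values are hypothetical, picked only to stand for a low, a medium, and a high criterion; they are not read from the figures.

```python
from statistics import NormalDist

# Distributions from the worked example in the text.
noise = NormalDist(mu=10.0, sigma=1.3)
signal_plus_noise = NormalDist(mu=12.5, sigma=1.3)

def outcome_probabilities(criterion):
    """Area of each distribution above ("Yes") or below ("No") the criterion."""
    p_hit = 1.0 - signal_plus_noise.cdf(criterion)
    p_false_alarm = 1.0 - noise.cdf(criterion)
    return {
        "hit": p_hit,
        "miss": 1.0 - p_hit,
        "false_alarm": p_false_alarm,
        "correct_rejection": 1.0 - p_false_alarm,
    }

# Hypothetical criterion placements (activation units), for illustration only.
criteria = {"low": 10.5, "medium": 11.25, "high": 12.0}
results = {name: outcome_probabilities(c) for name, c in criteria.items()}

for name, probs in results.items():
    print(f"{name:>6} criterion: P(Hit) = {probs['hit']:.3f}, "
          f"P(False Alarm) = {probs['false_alarm']:.3f}")
```

Moving the criterion changes Hits and False Alarms together, while the two distributions, and therefore d', stay exactly where they were.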
[Figure: three panels showing the Noise and Signal+Noise distributions under a Low Criterion (left), a Medium Criterion (middle), and a High Criterion (right).]
If it is important not to miss trials in which a signal has occurred, then subjects adopt a low criterion (left panel). If it is crucial to avoid false-alarms, subjects adopt a high criterion (rightmost panel). If response biases don't influence the detectability (d') of the stimulus, what do they influence? This can be determined by plotting the probability of "Hits" against that for "False Alarms" for a given d'. These data are plotted from the example above in the figure below, where the leftmost symbol (in red) represents the proportion of "Hits" and "False Alarms" under the High Criterion condition, the middle symbol the Medium Criterion condition, and the rightmost symbol the Low Criterion condition. The solid blue curved line which passes through all three data points is called a Receiver Operating Characteristic, or ROC curve. It describes the locus of all possible proportions of "Hit" and "False Alarm" values when (in our case) d' is 1.92. Knowing any point on this curve allows you to calculate d'. The solid green line illustrates the locus of all possible "Hit" and "False Alarm" values when d' is 0. Different d' values result in a family of ROC curves. The solid orange symbol at coordinate (0,100) in the upper left corner represents a d' value of infinity.
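The claim that any single point on the ROC curve determines d' can be illustrated with a short sketch. It assumes the equal-variance normal model used throughout this section, under which d' is the difference between the z-transformed Hit and False Alarm rates. The function names `d_prime` and `roc_hit_rate` are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def d_prime(p_hit, p_false_alarm):
    """d' = z(Hit rate) - z(False Alarm rate), equal-variance model.
    Both rates must lie strictly between 0 and 1."""
    return std_normal.inv_cdf(p_hit) - std_normal.inv_cdf(p_false_alarm)

def roc_hit_rate(p_false_alarm, dprime):
    """Hit rate on the ROC curve for a given False Alarm rate and d'."""
    return std_normal.cdf(dprime + std_normal.inv_cdf(p_false_alarm))

example_d = (12.5 - 10.0) / 1.3  # d' from the worked example, about 1.92

# Trace a few points on the ROC curve, then recover d' from each one.
for p_fa in (0.05, 0.20, 0.50):
    p_hit = roc_hit_rate(p_fa, example_d)
    print(f"P(FA) = {p_fa:.2f}, P(Hit) = {p_hit:.3f}, "
          f"recovered d' = {d_prime(p_hit, p_fa):.2f}")
```

Setting `dprime` to 0 in `roc_hit_rate` returns the False Alarm rate itself, which is the diagonal line corresponding to no detectability.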


The area under the ROC curve is equal to the % correct score in a 2-alternative forced-choice experiment (right hand panel).
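This equivalence can be checked numerically. The sketch below assumes the equal-variance normal model, for which the predicted proportion correct in a 2-alternative forced-choice task is Φ(d'/√2), where Φ is the cumulative standard normal. It sweeps the criterion across its range, integrates the resulting ROC curve with the trapezoid rule, and compares the two numbers. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

def roc_area(dprime, steps=2000):
    """Trapezoid-rule area under the equal-variance normal ROC curve."""
    lo, hi = -8.0, 8.0 + dprime          # criterion range, in noise SD units
    points = []
    for i in range(steps + 1):
        criterion = lo + (hi - lo) * i / steps
        p_false_alarm = 1.0 - std_normal.cdf(criterion)
        p_hit = 1.0 - std_normal.cdf(criterion - dprime)
        points.append((p_false_alarm, p_hit))
    points.sort()                        # order by increasing False Alarm rate
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def two_afc_proportion_correct(dprime):
    """Predicted 2AFC proportion correct under the same model."""
    return std_normal.cdf(dprime / sqrt(2))

for d in (0.0, 1.0, 1.92):
    print(f"d' = {d:.2f}: ROC area = {roc_area(d):.4f}, "
          f"2AFC proportion correct = {two_afc_proportion_correct(d):.4f}")
```

For d' = 0 both values are 0.5, matching chance performance with two alternatives.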
Copyright © 1997 Mark E. McCourt. All rights reserved.
Revised: September 22, 2001.