This article will outline the development of a very nice
speech-processing system for amateur use. It is based on a combination
of techniques developed over the past few decades. Processing
techniques have changed during this period of time because of
advancements in two areas: technology and psychoacoustics.
Technology advances include the use of semiconductors instead of tubes/valves. Electret microphones have become quite popular in place of dynamic or crystal types. For automatic gain control, semiconductor devices are now used instead of variable-transconductance tubes. Monitoring the activity of such a system was done in the past using analog meters; LED bargraphs are now common.
Studies in that field strangely named "psychoacoustics" have shown that loudness is best controlled by the use of automatic gain control techniques based on average signal levels rather than peak. A system with a very fast reaction time was at one time thought to be best. In fact system designers went to great lengths to make it fast; 100 microseconds was not uncommon. Such a system may look very nice on an oscilloscope but it does not match the ear when loudness is of concern. And recovery (gain-increase) time was thought to be best if it was set to a few seconds. Current thinking is that recovery times of the order of 100 to 200 milliseconds are best. Even shorter recovery times are better from a loudness viewpoint but tend to increase harmonic and especially intermodulation distortions. These distortions can be minimized, as will be seen.
This writeup will focus on a set of circuits intended for communications use, amateur radio in particular, and may not be at all suitable for broadcast or other purposes. We will end up with a signal that is quite potent but is very listenable and quite nicely controlled both in amplitude and spectral distribution. The proposed circuitry can have the "efficiency" (loudness) of a clipper but with seriously reduced distortion.
Readers with library facilities may note, perhaps with a touch of humor and/or nostalgia, the original article on this subject by ye scribe [see footnote]. I had no idea at the time how, in the future, the performance of such a device could be so dramatically improved by making just a few changes in the basic concepts. [And also by using devices that hadn't been invented at the time :-) ]
We'll begin the examination of this unit at the input to the chain. The assumption is made here that an electret microphone element is to be used. This kind of microphone can be thought of as a small depletion-mode FET, with an electrostatically-charged sound-receiving diaphragm next to the surface. When sound moves the diaphragm, the FET changes its conductivity. The FET itself has a high output impedance; it is in effect a current source. To operate it requires both a pullup resistor and a biasing voltage. The resistor should be in the vicinity of 10K ohms, connected to a bias of about 6 volts. Lower values of load resistor will lower the output level from the element while offering an improved degree of immunity to treble rolloff caused by the connecting cable capacitance. The biasing voltage should not exceed about 10 volts or the FET may become noisy by virtue of a zener-like breakdown process.
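The trade-off between output level and treble rolloff can be put into rough numbers. The sketch below is only an illustration: it treats the element as an ideal current source, so that the pullup resistor sets the output level and, together with the cable capacitance, forms a single-pole rolloff. The 1 nF cable capacitance and the alternative resistor values are assumptions chosen for the example, not values taken from the schematic.

```python
import math

R_REFERENCE = 10e3   # the 10K pullup suggested in the text
CABLE_C = 1e-9       # assumed cable capacitance in farads (hypothetical value)

def corner_hz(r_ohms, c_farads=CABLE_C):
    """-3 dB corner of the single pole formed by the pullup and the cable."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)

def level_db(r_ohms, r_ref=R_REFERENCE):
    """Output level relative to the reference pullup, for a current-source element."""
    return 20.0 * math.log10(r_ohms / r_ref)

for r in (10e3, 4.7e3, 2.2e3):   # 4.7K and 2.2K are example alternatives only
    print(f"{r / 1e3:4.1f}K  corner {corner_hz(r) / 1e3:5.1f} kHz  "
          f"level {level_db(r):+5.1f} dB")
```

Under these assumptions, halving the resistor doubles the corner frequency and costs about 6 dB of level, which is the exchange described above.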
Figure 1 shows the microphone preamplifier schematic. If a dynamic microphone element is used, then the input 10K pullup resistor is to be removed. Such a microphone should be of the high impedance type. The output level from this circuit should be in the vicinity of a volt peak with normal speech levels in amateur service.
Both the schematics and the various plots shown in this writeup are the result of using LTspice, from Linear Technology Corporation. I have no connection with that company; I simply use LTspice for circuit analysis because it yields correct results, has a wonderful support group and is priced right (zero).
Figure 1 - Microphone preamplifier circuitry
The frequency response as seen at the output of the first opamp, U1, is as shown in Figure 2.
Figure 2 - The bass frequencies are rolled off by the action of C1 and R2 along with R5 and C5. The treble frequencies are rolled off by the action of R3 and C2 along with R4 and C3. R3 and C2 perform the additional function of discouraging RF from entering the input circuitry.
The output of that first stage is then applied to a highpass filter. The resulting response is shown in Figure 3.
Figure 3 - The input amplifier response is shown here along with the output of the highpass filter. The highpass has a slight peak in its response right at 200 Hz to correct the low-end rolloff of the input amplifier.
Figure 4 - Here we see the output of the first stage, the highpass and finally the lowpass filter. The lowpass has a peak in its response at about 4 kHz.
By using this bandpass filter action, those signal components that are of little benefit, but which may cause "mischief" in later circuits, are rejected.
The AGC block
After the speech signal has been spectrally shaped it is applied to an AGC system. Key to this block of circuitry will be some form of gain-controller. Here we have a wide variety of mechanisms that could be made to work. But first a flashback in time.
That speechamp writeup in 1956 used for the gain-controller a pair of variable-transconductance pentode tubes (valves) - 6BA6s - in a balanced-modulator configuration. After about 30 years those became hard to find. I wanted to try to avoid such a procurement problem this time around. For this project I wanted to use a variable-gain scheme whose components would probably be available for a long time. Hence my long search for a variable-gain element resulted in choosing a system that used silicon diodes operating in the "knee" of their transfer curve, along with opamps. The basic idea is as shown in Figure 5. Please note - and this is very important - that the signal levels are down in the vicinity of millivolts. This system is called a "variolosser." The audio input to the variolosser is in the range of 10 to 300 millivolts peak. The output (at the right-hand side of the series ballast resistor) is in the range of 5 to 10 millivolts peak. The control voltage (the AGC bus) is in the vicinity of 300 to 600 millivolts DC. Its magnitude and polarity are such that the diodes are slightly forward-biased into conduction.
Figure 5 - The variolosser (gain-controller) in elementary form. In practice, resistor R3 is made trimmable over a several percent range to allow the circuit to be balanced; adjusting that resistor will allow nulling out the gain control voltage so it does not produce a "thump" in the output when gain reduction is in effect. This then is a balanced modulator.
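To get a feel for why a diode in its "knee" works as a gain-controller, here is a much-simplified sketch. It is not the circuit of Figure 5: it models a single diode with the textbook Shockley equation, applies the bias directly across it, and treats the diode's small-signal resistance as the shunt leg of a divider fed through the series ballast resistor. The saturation current, ideality factor and the 47K ballast value are all hypothetical numbers chosen for illustration.

```python
import math

def diode_resistance(v_bias, i_sat=2.5e-9, n=1.9, v_t=0.02585):
    """Small-signal resistance of a forward-biased diode (Shockley model).

    i_sat and n are assumed, illustrative diode parameters.
    """
    i_d = i_sat * (math.exp(v_bias / (n * v_t)) - 1.0)
    return n * v_t / (i_d + i_sat)

def loss_db(v_bias, r_ballast=47e3):
    """Loss of the divider: series ballast resistor, diode resistance to ground."""
    r_d = diode_resistance(v_bias)
    return 20.0 * math.log10(r_d / (r_ballast + r_d))

for mv in (300, 400, 500, 600):
    print(f"bias {mv} mV  r_d {diode_resistance(mv / 1000):9.0f} ohms  "
          f"loss {loss_db(mv / 1000):6.1f} dB")
```

The point of the exercise is only the trend: a few hundred millivolts of change in bias swings the diode resistance over several decades, so the loss can be varied over a wide range. This small-signal picture holds only while the audio across the diode stays in the millivolt region, which is why the signal levels noted above matter so much.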
Adding circuitry as shown in Figure 6 results in a workable AGC system.
Figure 6 - A simple AGC system
To the variolosser we have added a gain block (U3, R4 and R5). Following that amplifier is a full-wave rectifier system. Diode D4 feeds R8 directly while D3 is fed from a unity-gain inverter. The signal at the top of R8 then is absolute-value (full-wave rectified) audio. The ripple is largely removed by R9 and C1. The voltage across C1 is then the AGC control voltage.
With the parts values shown, the AGC attack time is about 5 milliseconds and the recovery time is about 200 milliseconds. This is a nice-sounding system. But due to the finite attack time this circuit must be followed by a clipper to catch the resulting transients that escape. (To those with broadcast experience, these numbers and comments are similar to the device circa 1965 called the "Volumax.")
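For readers who like to experiment numerically, the behaviour of the rectifier and smoothing network can be imitated with a few lines of code. This is a sketch under stated assumptions, not a simulation of Figure 6: it is a sampled one-pole follower of the rectified signal, and the 5 millisecond and 200 millisecond figures are used directly as exponential time constants, which may not match the way the attack and recovery times quoted above were measured. The 48 kHz sample rate and the tone burst are arbitrary choices.

```python
import math

def envelope(samples, fs, attack_s=0.005, release_s=0.200):
    """Follow |x| with separate attack and release time constants."""
    a_att = math.exp(-1.0 / (attack_s * fs))
    a_rel = math.exp(-1.0 / (release_s * fs))
    env = 0.0
    out = []
    for x in samples:
        rect = abs(x)                              # full-wave rectification
        coeff = a_att if rect > env else a_rel     # charge fast, discharge slowly
        env = coeff * env + (1.0 - coeff) * rect
        out.append(env)
    return out

fs = 48000                                          # assumed sample rate
# 1000 Hz tone burst: 100 ms of tone followed by 400 ms of silence
burst = [math.sin(2 * math.pi * 1000 * n / fs) if n < fs // 10 else 0.0
         for n in range(fs // 2)]
env = envelope(burst, fs)
end_of_burst = fs // 10 - 1

print(f"control level at end of burst: {env[end_of_burst]:.3f}")
print(f"200 ms later:                  {env[end_of_burst + fs // 5]:.3f}")
```

The control level climbs quickly during the burst and then takes a couple of hundred milliseconds to fall away, which is the general shape of the attack and recovery behaviour described here.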
Another caution has to be made at this point. It might seem that the faster the attack time the better. Indeed, from the invention of automatic volume limiters (probably about 1930) until the early 1960s this was the usual philosophy of design engineers. At that point it became evident that a longer attack time - of perhaps 3 to 6 milliseconds - would produce a louder signal. And a louder signal was actually the goal for those systems that used these volume-limiting amplifiers. Unfortunately such a "long" attack time also resulted in overshoots or bursts of audio which had to be removed by a following clipper circuit. Merely mentioning the word "clipper" usually caused fidelity aficionados to squirm. Fortunately it turned out that such clipping (of transients only) actually didn't sound particularly bad. The distortion might be heard but it was not especially objectionable. The amplitude-modulation receiver's demodulator usually caused distortion far in excess of that caused by a transient clipper.
Bottom line here is that we want an AGC loop which has a finite attack time, probably in the vicinity of 3 to 6 milliseconds, that loop to be followed by a clipper, for a loud signal. The clipper is adjusted to be on the verge of clipping when a 1000 Hz sinusoid is passed through the system. When this is done, audio signals with a high peak to average content (such as speech) will be lightly clipped.
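The reason a clipper set up on a sinusoid ends up lightly clipping speech can be seen from the peak-to-average ratio alone. The sketch below is illustrative only: it compares a sinusoid with a synthetic "peaky" waveform (five equal harmonics in cosine phase), which is merely a stand-in for a signal with a high peak-to-average ratio and is not speech data. It also assumes, for simplicity, that the AGC loop brings both signals to the same RMS level.

```python
import math

N = 1000  # samples in one period

def crest_db(wave):
    """Peak-to-RMS ratio in dB."""
    peak = max(abs(v) for v in wave)
    rms = math.sqrt(sum(v * v for v in wave) / len(wave))
    return 20.0 * math.log10(peak / rms)

sine = [math.sin(2 * math.pi * i / N) for i in range(N)]
# Hypothetical peaky test waveform: harmonics 1 through 5, all in cosine phase
peaky = [sum(math.cos(2 * math.pi * k * i / N) for k in range(1, 6))
         for i in range(N)]

# With equal RMS levels and the clipper threshold at the sinusoid's peak,
# the peaky waveform exceeds the threshold by this many dB:
excess_db = crest_db(peaky) - crest_db(sine)
print(f"sinusoid {crest_db(sine):.2f} dB, peaky {crest_db(peaky):.2f} dB, "
      f"excess {excess_db:.2f} dB")
```

For this particular stand-in waveform the peaks stand about 7 dB above those of a sinusoid of equal RMS level, so a threshold that just misses the sinusoid will catch them.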
And another item of concern at this point in our design process is the gain-recovery time. If the recovery time is long (long here meaning a few seconds), then loudness is definitely penalized. If the recovery time is short (perhaps 50 milliseconds) then loudness is maximized but distortion of the lower audio frequencies is increased. Worse yet, intermodulation distortion increases at an alarming rate. For maximum loudness we would like to see a recovery time in the vicinity of 50 milliseconds. Rest assured that if we simply shortened the recovery time, distortion would be intolerable. It can, however, be minimized by a certain amount of cleverness.
To minimize the generation of harmonic and intermodulation distortion in a fast-recovery (50 milliseconds) AGC loop, we need to look into the mechanism involved in its generation. Looking at the AGC loop control waveform we will normally see ripple on that signal. This ripple is in fact modulating the audio signal. Lengthening the recovery time minimizes the ripple, minimizes the modulation of the signal and so minimizes distortions, both harmonic and intermodulation. That lengthening will also reduce the loudness of the controlled signal. A fast-recovery AGC loop with a recovery delay can sound loud and also have low distortion. This delay can be accomplished by adding hysteresis to the control loop. Such a modified AGC voltage generator is shown in Figure 7. The resulting tone-burst response is shown in Figure 8. Note the slight (several milliseconds) delay prior to the gain-increase.
Figure 7 - Schematic of an AGC system with a slight delay in the recovery
That schematic also shows the simple temperature compensation system used to make the AGC loop rather independent of temperature variations.
Figure 8 - Note the slight delay prior to recovery. The test signal is a 1000 Hz sinusoid, being reduced in level from 30 dB over the threshold of compression down to 20 dB over threshold. The gain remains constant for about 5 milliseconds and then increases over a 15 millisecond time frame.
Figure 9 - As above but now the test signal frequency is 100 Hz. Observe that the gain is increased to normal in the 20 millisecond time frame but that the signal suffers no visible distortion. This system will be about as efficient as a clipper in terms of loudness increase, but with a dramatic reduction in distortion. The clipper can be called into play for additional "punch."
Is this circuit peak-sensitive? No. It is "area under the curve" sensitive. This is the best way to go for a communications system in which loudness is the factor to be maximized. (The first-presented circuit is peak-sensitive.) Does this circuit distort? No. Distortion is very low in spite of the fast recovery, thanks to the delay in the recovery.
Now let us examine how that recovery delay is accomplished; the system is illustrated in Figure 10.
Figure 10 - Essentials of the delayed-recovery scheme
The audio is made into what is called absolute-value (full-wave rectified) by diodes D1 and D2. This signal is divided down by half using resistors R1 and R2. The reduced-value signal is applied to diode D4 to charge capacitor C1. The voltage across C1 is the gain-controlling voltage which is eventually routed to the variolosser. But note that capacitor C1 can only discharge via diode D5. And that diode is back-biased by a voltage greater than the AGC voltage by the charge on capacitor C2. C2 has been charged to the full absolute-value voltage via R4 and D3. Upon removal (or reduction in amplitude) of the audio voltage, the AGC voltage remains constant until C2 discharges (via R4 and R3). If the timing is such that the delay is about 5 milliseconds then the AGC voltage will have no ripple on it even while the system is passing a 100 Hz sinusoid.
The component values shown offer an attack time of about 3 milliseconds, a recovery delay of about 5 milliseconds and a recovery time after that delay of about 10 milliseconds. This results in a very high average value of audio with very low distortion.
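The effect of the recovery delay on ripple can be demonstrated with a small numerical experiment. This is a sketch under stated assumptions and not a model of Figure 10: the diode and capacitor network is replaced by a sampled envelope follower with a simple hold counter, the 3, 5 and 10 millisecond figures are used directly as time constants and hold time, and the 48 kHz sample rate is arbitrary. The same follower is run with and without the hold on a 100 Hz sinusoid.

```python
import math

def held_envelope(samples, fs, attack_s=0.003, hold_s=0.005, release_s=0.010):
    """Envelope follower whose release is delayed by a hold period."""
    a_att = math.exp(-1.0 / (attack_s * fs))
    a_rel = math.exp(-1.0 / (release_s * fs))
    hold_n = round(hold_s * fs)
    env, idle, out = 0.0, 0, []
    for x in samples:
        rect = abs(x)
        if rect > env:
            env = a_att * env + (1.0 - a_att) * rect   # attack
            idle = 0                                   # restart the hold timer
        else:
            idle += 1
            if idle > hold_n:                          # hold expired: recover
                env = a_rel * env + (1.0 - a_rel) * rect
        out.append(env)
    return out

def ripple(env, n_last):
    """Peak-to-peak variation over the last n_last samples."""
    tail = env[-n_last:]
    return max(tail) - min(tail)

fs = 48000                                             # assumed sample rate
tone = [math.sin(2 * math.pi * 100 * n / fs) for n in range(fs // 5)]  # 200 ms
one_cycle = fs // 100

ripple_held = ripple(held_envelope(tone, fs), one_cycle)
ripple_plain = ripple(held_envelope(tone, fs, hold_s=0.0), one_cycle)
print(f"ripple with 5 ms hold: {ripple_held:.4f}")
print(f"ripple without hold:   {ripple_plain:.4f}")
```

Because the peaks of a rectified 100 Hz sinusoid arrive every 5 milliseconds, the hold timer is restarted before it expires and the control voltage never gets the chance to sag between peaks. Without the hold, the same fast recovery produces obvious ripple, and that ripple is what modulates the audio.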
Because of the several millisecond attack time, the above-described AGC block will not control modulation on a peak basis. It will, however, have a strikingly high average output level. It also has a very "smooth" sound. A clipper must be added to catch those transients that escape the AGC loop. First let us look at an elementary clipper. For this design we have chosen a simple shunt-diode arrangement. Each diode is back-biased so that it does not conduct until a certain voltage has been reached. These voltages are derived from a voltage divider feeding an opamp, and from the output of that first opamp applied to a polarity-inverting second opamp.
Figure 11 - Schematic of a simple clipper. The trimmer is used to adjust the clipping threshold so that when the AGC block is delivering its usual output the clipper is on the verge of clipping. That trimmer is not on the front panel of the equipment.
If a sinusoid several dB above the clipping threshold is applied to this circuit the output will appear as in Figure 12.
Figure 12 - Output from the simple clipper
The multiple waveforms shown are a result of changing the temperature over a wide range. The silicon diodes have a distinct change in their forward voltage drop as the temperature varies, resulting in an output level change with temperature. Fortunately this can be corrected very easily as shown in Figure 13.
Figure 13 - Schematic of a temperature-compensated clipper
The output from this temperature-compensated clipper will appear as in Figure 14.
Figure 14 - Output from the temperature-compensated clipper
The output level remains quite stable over a very wide temperature range. By the simple addition of one diode and one resistor to the simple circuit, the change in diode characteristics with temperature vanishes.
This clipper design uses only a volt or two for the diode back-bias voltage. As a result the transfer curve is slightly rounded instead of being "textbook abrupt." This causes a noticeable reduction in the very high order harmonic generation as well as a trivial drop in "efficiency" as compared to the textbook clipper.
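The claim about high-order harmonics can be checked in principle with a quick calculation. The sketch below compares a "textbook abrupt" clipper with a rounded one, both driven by a sinusoid 6 dB above threshold. The rounded curve used here is a hyperbolic tangent, which is only a convenient smooth function and is not a model of the diode clipper in the schematic; the drive level and the choice of harmonics 11 through 31 are likewise arbitrary.

```python
import math

N = 2048       # samples in one period of the test tone
DRIVE = 2.0    # sinusoid 6 dB above the clipping threshold (example value)

def hard_clip(v, limit=1.0):
    """Textbook clipper: flat-topped at +/- limit."""
    return max(-limit, min(limit, v))

def soft_clip(v):
    """Rounded transfer curve (tanh is an assumed stand-in, not the diode law)."""
    return math.tanh(v)

def harmonic_amplitude(wave, k):
    """Amplitude of the k-th harmonic of one period, by direct DFT."""
    n = len(wave)
    re = sum(w * math.cos(2 * math.pi * k * i / n) for i, w in enumerate(wave))
    im = sum(w * math.sin(2 * math.pi * k * i / n) for i, w in enumerate(wave))
    return 2.0 * math.hypot(re, im) / n

def high_order_sum(wave, first=11, last=31):
    """Sum of odd-harmonic amplitudes from 'first' to 'last'."""
    return sum(harmonic_amplitude(wave, k) for k in range(first, last + 1, 2))

tone = [DRIVE * math.sin(2 * math.pi * i / N) for i in range(N)]
hard_high = high_order_sum([hard_clip(v) for v in tone])
soft_high = high_order_sum([soft_clip(v) for v in tone])
print(f"high-order harmonics, abrupt clipper:  {hard_high:.5f}")
print(f"high-order harmonics, rounded clipper: {soft_high:.5f}")
```

The rounded curve should show markedly less energy in the high-order harmonics, which is the general effect described above.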
Between the AGC system and the clipper we will install a gain block. If this block is operating at unity gain, it is essentially transparent and we are operating the clipper at threshold. When the gain block has gain greater than unity we can add some clipping for additional signal loudness. The best way to add gain is to have a shaped response in this clipper-driver block so that when additional clipping is desired, mostly the upper audio frequencies are increased. A circuit which does this is shown in Figure 15.
Figure 15 - The clipper driver
When the resistor R2 in the schematic is set to zero ohms, this circuit is simply a unity gain block. As R2 is increased the upper audio frequencies are boosted. Figure 16 shows the resulting family of curves. When R2 is zero ohms the bottom plot results. As it is increased the response at the upper frequencies increases. This technique minimizes intermodulation distortion (of upper audio frequencies by lower audio frequencies) and allows the weaker upper audio frequencies to be emphasized.
Figure 16 - The clipper driver frequency response
With an AGC loop that might be considered slow-acting or sluggish, the clipper will operate a considerable portion of the time on speech signals. The clipper will operate at its threshold (on the verge of clipping) with a sinusoid applied to the AGC loop input. On speech waveforms it will be seen that a few dB of clipping will occur fairly consistently. This is not very evident to the ear. But the clipped wave has harmonic content, which must be removed. This "post-clipping lowpass" filter needs to remove components caused by clipping, but only those over 3000 Hz or so. The filter should be of a lowpass type, it should have a flat response out to its cutoff frequency, and then it should drop as fast as possible. A filter to accomplish this is shown in Figure 17. (For the technically-inclined, this is a fifth-order Chebyshev with 0.2 dB of passband ripple.)
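The ideal magnitude response of such a filter follows from the standard Chebyshev formula, and it is easy to tabulate. The sketch below assumes a passband edge of exactly 3000 Hz, which is only a guess based on the "3000 Hz or so" figure; the actual component values of Figure 17 are not used, so this shows the textbook response rather than the response of the circuit as built.

```python
import math

def cheb_poly(n, x):
    """Chebyshev polynomial of the first kind, T_n(x), for x >= 0."""
    if x <= 1.0:
        return math.cos(n * math.acos(x))
    return math.cosh(n * math.acosh(x))

def cheb_loss_db(f_hz, f_edge_hz=3000.0, order=5, ripple_db=0.2):
    """Attenuation in dB of an ideal Chebyshev lowpass (f_edge_hz is assumed)."""
    eps_sq = 10.0 ** (ripple_db / 10.0) - 1.0
    t = cheb_poly(order, f_hz / f_edge_hz)
    return 10.0 * math.log10(1.0 + eps_sq * t * t)

for f in (1000, 2000, 3000, 4000, 6000, 9000):
    print(f"{f:5d} Hz  {cheb_loss_db(f):6.2f} dB down")
```

With these assumptions the response stays within 0.2 dB up to the band edge and is roughly 38 dB down one octave above it.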
Figure 17 - The post-clipping lowpass filter schematic
The response of that filter is shown in Figure 18.
Figure 18 - The post-clipping lowpass filter frequency response
This filter will satisfactorily remove the splatter which would otherwise result from clipping. A problem with such a filter is that, when driven by a squared waveform (as would result from clipping), it will have an overshoot at its output. This is a result of the sharp cutoff. Mathematically it is a result of a truncation of the higher terms. Those overshoots can cause overmodulation. This overmodulation can be prevented by lowering the modulation level by perhaps 35% (the amplitude of the overshoots) but that is quite a penalty.
Figure 19 - The post-clipping lowpass filter transient response
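The truncation remark can be illustrated numerically. The sketch below builds a square wave from its Fourier series and simply stops at a highest harmonic, as an ideal brick-wall filter would. The 300 Hz fundamental and 3000 Hz cutoff (hence harmonics 1 through 9) are hypothetical numbers for the example. The phase response of the real filter is not modelled, so the overshoot computed here illustrates the mechanism and is not expected to match the figure quoted above.

```python
import math

def truncated_square(theta, highest_harmonic):
    """Fourier series of a unit square wave, cut off after highest_harmonic."""
    total = 0.0
    for k in range(1, highest_harmonic + 1, 2):   # odd harmonics only
        total += math.sin(k * theta) / k
    return 4.0 / math.pi * total

# Hypothetical case: 300 Hz clipped wave, everything above 3000 Hz removed,
# so harmonics 1, 3, 5, 7 and 9 survive.
n_points = 4000
peak = max(truncated_square(2 * math.pi * i / n_points, 9)
           for i in range(n_points))
overshoot_pct = 100.0 * (peak - 1.0)
print(f"peak {peak:.3f}, overshoot {overshoot_pct:.1f}% above the flat top")
```

Even this idealized truncation pushes the peaks well above the level of the original flat top, and that excess is the overmodulation risk.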
Numerous schemes to counter these overshoots have been developed during the years. The more effective ones are quite complicated. A simple method, quite practical, is included in the lowpass filter of Figure 20.
Figure 20 - Schematic of the post-clipping lowpass filter with overshoot correction
Figure 21 - The post-clipping lowpass filter frequency response with correction
This filter has a "step" in its magnitude response wherein the upper modulating components are attenuated about 2 dB compared to the lower components.
Figure 22 - The post-clipping lowpass filter transient response after correction
The overshoots have been changed to what might be called "undershoots", which are quite harmless.
The gain-reduction (compression) metering scheme is shown in Figure 23, at the lower-right corner. The available signal is about 2 volts DC, which seems to be a value quite useable by a bargraph display (or oscilloscope). Perhaps an agile analog meter movement could be used.
Figure 23 - The schematic of the AGC bias-generation and temperature-compensation circuit, with the metering system shown in the lower-right corner.
The schematic of the complete system is shown in Figure 24.
Figure 24 - The schematic of the AGC loop, metering, clipper-driver, clipper and post-clip lowpass filter is shown here. The microphone preamp and associated filters are not shown.
Footnote: "Compression and Clipping", James L. Tonne, W5SUC, QST, September 1956