Compression and Clipping - Revisited



James L. Tonne WB6BLD


Copyright © 2001-2003 James L. Tonne


This article will outline the development of an uncommonly efficient whilst still good-sounding speech-processing system for amateur use. It is based on a combination of techniques developed over the past few decades. Processing techniques have changed during this period of time because of advancements in two areas: technology and psychoacoustics.

Regarding technology, in 1950 the standard microphone was a crystal or dynamic type, which drove a vacuum tube amplifier. Today these microphones are still quite viable and they are, in fact, in common use. But they are not as cost-effective as the electret, which, although quite inexpensive, has very good quality. Amplification is more commonly done now using IC opamps. In 1950, electronic control of signal levels was most commonly done by means of variable transconductance vacuum tubes. Today, electronic gain control is done more effectively by IC multipliers, which were not available then. If the bandwidth was to be confined, then lumped-element filtering (involving relatively large and expensive inductors) was de rigueur. Today we most likely use IC opamps operating in conjunction with relatively inexpensive passive components. Again, the concept of opamps was still in the future. Monitoring gain levels or clipping was generally done with analog meter movements; LED bargraphs were yet to be developed.

Regarding psychoacoustics, studies have confirmed that AGC loops with fast attack times are actually counterproductive from a loudness viewpoint. They can react to signals which are too short to be meaningful to the human ear. Paraphrased, the AGC system might operate (cause gain reduction) on speech components that the ear considers insignificant. For a while there was in fact a form of "one-upmanship" wherein one engineer would develop an AGC system which would operate even faster than the competition, who would go back to the drawing board and counter the "advance" made by his cohort. Processor designs that made it to the marketplace varied from systems that had zero attack time to systems that had attack times of several milliseconds. Gain-restoration times varied from several seconds down to about 150 milliseconds. Those systems with longer attack times or shorter release times tended to sound better; they matched the ear. In any event, following this cascade of circuits is the modulation process itself; we will discuss that aspect in those areas where the speech processing techniques might have an impact.


A review

The prime purpose of a speech processor is to control the speech signal to prevent overloading following circuitry while at the same time maintaining a consistent level of excitation into that circuitry. If the dynamic range of the speech signal can be controlled so as to be more nearly uniform, then that signal's average level can be raised to approach the overload point of the following circuits. The result will be a loudness improvement, obviously a desirable thing. A commonly-accepted method of doing this is to use an AGC system, or compressor. The effectiveness of this method of level control is dependent on the reaction times to level increases or decreases. Broadcasters tend to use reaction times to level increases ("attack times") of the order of one millisecond or less. They also favor reaction times to level decreases ("recovery times") of the order of one to ten seconds. Such devices are not terribly effective from a loudness-improvement viewpoint but typically have low distortion. Amateurs who build their own equipment frequently use a simple waveform-clipping approach to control levels. Clippers have zero reaction times. As a result, they are quite effective in controlling the magnitude of the speech signal amplitude. Although simple and inexpensive, they sound distorted largely as a result of intermodulation caused by the clipping process. In this design we will approach the efficiency of a clipper but with a dramatic reduction in distortion, particularly intermodulation distortion. This will be due to the use of an AGC design best described as agile, backed up by a clipper.

Another purpose of the processor is to confine the transmitted bandwidth. Regardless of the processing techniques used, transmitted bandwidth restriction is usually done by post-processing lowpass filtering. We will see that additional filtering (in the form of highpass filtering) and response-correction (treble boost) can be helpful.

Regardless of how all this processing is done, there should be some mechanism to monitor the degree of processing. When a slow-acting AGC is the prime operator, then a simple analog metering system can do the monitoring. This article will propose processing by an AGC loop which is faster than syllabic and as a result analog metering is quite inadequate. Even with control of meter damping, added acceleration in the meter-driver and the like, analog meters will be of limited use in the proposed system.

Readers with library facilities may note, perhaps with a touch of humor and/or nostalgia, the original article by ye scribe (1). I had no idea at the time how, in the future, the performance of such a device could be so dramatically improved by making just a few changes in the basic concepts.


  Proposed - the preamp

We'll begin the examination of this unit at the input to the chain. The assumption is made here that an electret microphone element is to be used. These devices can be thought of as a small depletion-mode FET, with an electrostatically-charged sound-receiving diaphragm next to the surface. When sound moves the diaphragm, the FET changes its conductivity. The FET itself has a high output impedance; it is in effect a current source. To operate it requires both a pullup resistor and a biasing voltage. The resistor should be in the vicinity of 10K ohms, connected to a bias of about 6 volts. Lower values of load resistor will lower the output level from the element while offering an improved degree of immunity to treble rolloff caused by the connecting cable capacitance. The biasing voltage should not exceed about 10 volts or the FET may start to become noisy by virtue of a zener-like breakdown process.
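The tradeoff between load resistor value, output level and treble rolloff can be put into rough numbers. The sketch below treats the element as an ideal current source, so the load resistor alone sets both the output voltage and the source resistance seen by the cable. The 300 pF cable capacitance is an assumed figure chosen for illustration; it is not taken from the design.

```python
import math

def treble_corner_hz(load_ohms, cable_farads):
    """-3 dB corner of the lowpass formed by the load resistor and the cable capacitance."""
    return 1.0 / (2.0 * math.pi * load_ohms * cable_farads)

def level_change_db(load_ohms, reference_ohms=10_000):
    """Output change relative to the 10K load, for a current-source element."""
    return 20.0 * math.log10(load_ohms / reference_ohms)

CABLE_C = 300e-12  # assumed cable capacitance (hypothetical value)

for r in (10_000, 4_700, 2_200):
    corner_khz = treble_corner_hz(r, CABLE_C) / 1000.0
    print(f"{r:6d} ohms: {level_change_db(r):6.1f} dB, corner near {corner_khz:.0f} kHz")
```

Under these assumptions the rolloff lies well above the speech band even with the 10K load; a longer or more capacitive cable would pull the corner down in proportion.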

Figure 1 shows the microphone preamplifier schematic. If a dynamic microphone element is used, then the 10K pullup resistor should be removed. Such a microphone should be of the high impedance type. The output level from this circuit should be in the vicinity of a volt peak with normal speech levels in amateur service. The feedback resistor may be adjusted to yield this level.


  The input bandpass

After the speech signal has been increased in amplitude to a usable value by the input preamplifier, it is highly advisable to discard those components that can only cause mischief if processed by the subsequent circuitry. On the low-frequency end of the audio spectrum, it would be advisable to discard frequencies which can be a source of intermodulation distortion but which will not contribute significantly to our loudness effort. If some degree of clipping is used the low frequencies are especially mischievous. A highpass filter cutting off at perhaps 200 Hz for the AMers or 300 Hz for the single-sideband operators is advised. On the high-frequency end, sibilant frequencies (the letter "s") cannot possibly be passed by any communications-grade receiver but can operate the processor's AGC system. Should this happen the average volume of the speech signal will be reduced, so a lowpass filter is advised as well. Neither of these filters need be very complex. The output of this filtering block is a speech signal containing only useful components.
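For readers who want to experiment in software, the effect of such a filter pair can be sketched with two one-pole sections. This is only an illustration of the idea, not the circuit of Figure 2: the 3000 Hz lowpass corner, the 48 kHz sample rate and the test tone frequencies are all assumptions, and a real design would use steeper filters.

```python
import math

def one_pole_lowpass(x, cutoff_hz, fs):
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs)  # per-sample smoothing factor
    y, out = 0.0, []
    for s in x:
        y = (1.0 - a) * s + a * y
        out.append(y)
    return out

def one_pole_highpass(x, cutoff_hz, fs):
    # A highpass is what remains after the lowpassed part is subtracted.
    return [s - low for s, low in zip(x, one_pole_lowpass(x, cutoff_hz, fs))]

def input_bandpass(x, fs, low_cut_hz=300.0, high_cut_hz=3000.0):
    return one_pole_lowpass(one_pole_highpass(x, low_cut_hz, fs), high_cut_hz, fs)

def rms(x):
    return math.sqrt(sum(s * s for s in x) / len(x))

def tone(freq_hz, fs, seconds=0.25):
    return [math.sin(2.0 * math.pi * freq_hz * n / fs) for n in range(int(fs * seconds))]

FS = 48_000  # assumed sample rate
for f in (60, 1000, 8000):  # hum, mid-band speech, sibilant region (assumed values)
    t = tone(f, FS)
    gain_db = 20.0 * math.log10(rms(input_bandpass(t, FS)) / rms(t))
    print(f"{f:5d} Hz: {gain_db:6.1f} dB")
```

The mid-band tone passes nearly unchanged while the hum and the sibilant-region tone are both reduced before they can reach the AGC loop.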

At this point we can add a simple circuit which boosts the treble components to aid the intelligibility of the signal. This should be done with care, i.e., not to excess. If this is done, and doing so is highly recommended, then the sound will gain a degree of brightness. This will increase the intelligibility of the signal under poor path conditions; in AM systems it can be thought of as a correction for the receivers' I.F. selectivity.

Figure 2 shows the schematic of the complete highpass/lowpass filter pair (forming a bandpass) and the treble boost.


  The AGC loop

After the speech signal has been spectrally shaped it is applied to an AGC system. Key to this block of circuitry will be some form of magnitude-controller. Here we have a wide variety of mechanisms that could be made to work. Not available during the original design was the combination of an LED and a photoresistor; the LED had not been invented. An LED/photoconductor pair can be used but is difficult to make operate in an agile fashion. Simple shunt attenuators using transistors have excessive distortion. The same holds true for diodes even when used in their "linear" region. The best circuit to use is a multiplier, sometimes called a balanced modulator or Gilbert cell. Those intended for RF/IF conversion don't perform well in this application. Those intended for instrumentation purposes work very nicely and were chosen for this design. We don't need the bandwidth here but we do need the linearity.

For maximum control range, we will choose a reverse-acting system, wherein the output of the loop is sampled to derive an AGC voltage. The usual method of developing the control signal is to compare the system output against a reference or bias voltage, using a comparator or back-biased diode. When signal peak levels exceed the reference, a capacitor is charged. The capacitor can charge quite fast and it discharges relatively slowly using a recovery-timing resistor. The voltage across the capacitor is the AGC control voltage.
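The conventional peak-sensing loop just described can be sketched in a few lines. This is a simplified model under stated assumptions, not the circuit of this design: the gain law 1/(1 + loop_gain * v) merely stands in for the multiplier, and the attack time, recovery time, loop gain and sample rate are illustrative values.

```python
import math

def feedback_agc(samples, fs, reference=1.0, attack_s=0.001,
                 recovery_s=1.0, loop_gain=10.0):
    """Reverse-acting AGC: the loop OUTPUT is compared against a reference."""
    charge = 1.0 - math.exp(-1.0 / (fs * attack_s))  # fast charging step per sample
    bleed = math.exp(-1.0 / (fs * recovery_s))       # slow discharge factor per sample
    v = 0.0                                          # capacitor (AGC control) voltage
    out = []
    for s in samples:
        y = s / (1.0 + loop_gain * v)  # assumed gain law standing in for the multiplier
        excess = abs(y) - reference
        if excess > 0.0:
            v += charge * excess       # output peak above the reference charges the capacitor
        else:
            v *= bleed                 # otherwise the recovery resistor drains it
        out.append(y)
    return out

FS = 48_000  # assumed sample rate
loud = [4.0 * math.sin(2.0 * math.pi * 1000 * n / FS) for n in range(FS // 2)]
quiet = [s / 8.0 for s in loud]  # peaks at 0.5, below the reference

settled_peak = max(abs(y) for y in feedback_agc(loud, FS)[-2400:])
print(f"settled output peak with 4x overdrive: {settled_peak:.3f}")
```

A signal whose peaks stay below the reference never charges the capacitor and passes through at full gain; an overdriven signal is pulled back until its peaks sit near the reference.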

One method of making the AGC system match the ear, by lengthening its reaction time, is to slow the charging rate of the capacitor. This is indeed the most common method of controlling the attack time in a system as outlined. But we have here a fundamental flaw: we are comparing the instantaneous amplitude of the speech signal, not the average value, against a reference. In this new proposed system we will rectify the audio signal, filter it to obtain a time average and then apply that signal to the comparator. By doing this we have an even better match to what the ear hears; in other words, we will maximize loudness. A byproduct of this is that a post-AGC clipper is now required because the system is no longer peak-sensitive; it is average-sensitive. Waveforms that are peaked or jagged (in any event having a high peak-to-average ratio) do not generate as much AGC voltage. The AGC loop tends to let them pass on to the following clipper, where they are more heavily clipped. The signal as read on an ordinary "vu" meter has a remarkably consistent level.
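The difference between peak sensing and average sensing is easy to see numerically. The sketch below compares a sinusoid with a deliberately spiky waveform of the same peak value; the spiky waveform (a sine raised to the ninth power) is a made-up test signal, not a speech model.

```python
import math

N = 1000  # samples in one cycle
sine = [math.sin(2.0 * math.pi * k / N) for k in range(N)]
peaky = [s ** 9 for s in sine]  # hypothetical spiky waveform, same peak of 1.0

def rectified_average(wave):
    """What an average-responding detector sees: the mean of the rectified signal."""
    return sum(abs(s) for s in wave) / len(wave)

def crest_factor_db(wave):
    """Peak-to-RMS ratio in dB."""
    peak = max(abs(s) for s in wave)
    rms = math.sqrt(sum(s * s for s in wave) / len(wave))
    return 20.0 * math.log10(peak / rms)

for name, wave in (("sine", sine), ("peaky", peaky)):
    print(f"{name:5s}: average {rectified_average(wave):.3f}, "
          f"crest factor {crest_factor_db(wave):.1f} dB")
```

Both waveforms would look identical to a peak detector, but the spiky one produces well under half the average value, so an average-sensing loop applies less gain reduction to it and leaves its peaks for the clipper.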

To maximize loudness we must operate the AGC loop with a recovery (gain-increase) time as fast as possible. If the attack and recovery times are decreased to zero we approach the characteristics of a clipper as a limiting condition. But with very fast recovery times we will find that distortion, both harmonic and intermodulation, will increase. Harmonic distortion, particularly of components in the region from 1500 Hz upward, can be removed by lowpass filtering. But intermodulation distortion must be reduced in another manner. Since it is our object in this redesign to approach the effectiveness of a clipper, an implication is that we must set the recovery time as short as possible. In this design we are trying for a recovery time of about 30 milliseconds.

One way to minimize the generation of intermodulation distortion in an AGC loop is to look into the mechanism involved in its generation. Looking at the AGC loop control waveform we will see ripple on that signal. This ripple is in fact modulating the audio signal. Lengthening the recovery time minimizes the ripple and so minimizes the modulation of the signal. So will adding a recovery delay. This can be done by adding hysteresis to the control loop. Such a modified AGC voltage generator is shown in Figure 3. The resulting tone-burst response is shown in Figure 4.
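The effect of a recovery delay on control-voltage ripple can be demonstrated with a simplified detector model. This is a sketch only and is not the circuit of Figure 3: the attack is taken as instantaneous, the delay is modeled as a plain hold timer rather than as hysteresis, and the 6 millisecond hold and 48 kHz sample rate are assumed values.

```python
import math

def control_voltage(rectified, fs, recovery_s=0.030, hold_s=0.0):
    """Peak detector with optional hold before the recovery (decay) begins."""
    decay = math.exp(-1.0 / (fs * recovery_s))
    hold_samples = int(round(hold_s * fs))
    v, timer, out = 0.0, 0, []
    for s in rectified:
        if s >= v:
            v = s                 # instantaneous attack (a simplification)
            timer = hold_samples  # every new peak re-arms the hold
        elif timer > 0:
            timer -= 1            # recovery is blocked while the hold runs
        else:
            v *= decay            # 30 ms recovery
        out.append(v)
    return out

def ripple(v, tail=2000):
    """Peak-to-peak variation over the last `tail` samples."""
    return max(v[-tail:]) - min(v[-tail:])

FS = 48_000  # assumed sample rate
# One rectified half-cycle of a 100 Hz tone (240 samples), repeated.
half_cycle = [abs(math.sin(math.pi * k / 240)) for k in range(240)]
rectified = half_cycle * 100

print(f"ripple without hold:    {ripple(control_voltage(rectified, FS)):.3f}")
print(f"ripple with 6 ms hold:  {ripple(control_voltage(rectified, FS, hold_s=0.006)):.3f}")
```

Without the hold, the control voltage sags noticeably between the peaks of the 100 Hz tone and that sag modulates the audio. With a hold slightly longer than the spacing between rectified peaks, the control voltage stays flat while the recovery time itself is unchanged.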

We have added a delay, which is just barely adequate to keep AGC control voltage ripple at bay when a test tone of 100 Hz is applied to the system. But this delay is needlessly long when the test tone is 1000 Hz, and especially so when the tone is at 2500 Hz. We need a frequency-sensitive delay. Low frequency (bass) audio signals will need a greater recovery delay. The recovery time itself is not being adjusted; we are delaying only the time between the instantaneous audio waveform peak and the start of the recovery. The method of doing this is shown in Figure 5. This scheme for blocking the recovery has been adjusted so that it is just barely adequate to keep ripple off the AGC control voltage regardless of the test tone frequency. This frequency-dependent recovery delay is an advancement in processor design.
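A rough feel for how much delay each frequency needs can be had from simple arithmetic. The sketch below assumes a full-wave rectified detector, so that peaks recur every half cycle and the delay need only bridge that gap. This is an assumption made for illustration; the actual mechanism is the one shown in Figure 5.

```python
def minimum_recovery_delay_s(freq_hz):
    """Shortest hold that bridges the gap between rectified peaks (half a cycle)."""
    return 0.5 / freq_hz

for f in (100, 1000, 2500):
    print(f"{f:5d} Hz: {minimum_recovery_delay_s(f) * 1000.0:.2f} ms")
```

On this basis a delay sized for 100 Hz is 25 times longer than a 2500 Hz tone requires, which is why a fixed delay wastes recovery speed on the upper speech frequencies.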


 The clipper and its driver

The above-described AGC block will not control modulation on a peak basis. It will, however, have a strikingly high average output level. A clipper must be added to catch those transients that escape the AGC loop. This can be of a fairly simple design, such as shown in Figure 6. Here we see the control to preset the system so that with sinusoids the clipper is on the verge of clipping. Also note the control labeled "Clipping" which adjusts the drive into the diode clipper proper. This control raises the level of the middle and upper audio frequencies only as it is advanced. This minimizes intermodulation distortion when clipping is called into play. The frequency response of this driver is shown in Figure 7. Observe that the diode biasing voltages are quite low. A byproduct of this low bias voltage is a noticeably rounded shape of the output waveform, as opposed to a waveform with a very sharp break caused by an ideal diode clipper. This gentle clipping knee sounds quite good and yields a dramatic reduction in very high frequency components, making it easier to filter. When the output versus input is plotted it will be seen to have a rounded segment at the clipping breakpoint. This does cause a slight loss in efficiency.
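The benefit of a rounded clipping knee can be illustrated by comparing the harmonic content of an ideal clipper with that of a soft one. In the sketch below a tanh curve is used purely as a stand-in for a gentle knee; it is not a model of the diode circuit of Figure 6, and the 6 dB overdrive and the choice of the 9th through 31st harmonics are arbitrary.

```python
import math

def hard_clip(x, limit=1.0):
    return max(-limit, min(limit, x))

def soft_clip(x, limit=1.0):
    # tanh is an assumed stand-in for a rounded knee, not the diode characteristic
    return limit * math.tanh(x / limit)

def harmonic_amplitude(wave, n):
    """Amplitude of the n-th harmonic of one stored cycle (direct DFT)."""
    size = len(wave)
    re = sum(w * math.cos(2.0 * math.pi * n * k / size) for k, w in enumerate(wave))
    im = sum(w * math.sin(2.0 * math.pi * n * k / size) for k, w in enumerate(wave))
    return 2.0 * math.hypot(re, im) / size

def high_harmonic_energy(wave, first=9, last=31):
    return sum(harmonic_amplitude(wave, n) ** 2 for n in range(first, last + 1))

def rms(wave):
    return math.sqrt(sum(w * w for w in wave) / len(wave))

N = 1024
drive = [2.0 * math.sin(2.0 * math.pi * k / N) for k in range(N)]  # 6 dB over the limit
hard = [hard_clip(s) for s in drive]
soft = [soft_clip(s) for s in drive]

ratio_db = 10.0 * math.log10(high_harmonic_energy(soft) / high_harmonic_energy(hard))
print(f"upper-harmonic energy, soft relative to hard: {ratio_db:.1f} dB")
print(f"RMS output, hard {rms(hard):.3f}, soft {rms(soft):.3f}")
```

The soft knee leaves far less energy in the upper harmonics for the lowpass filter to remove, at the cost of a somewhat lower RMS output for the same peak limit.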

A wideband (not frequency-limited) flat-topped wave cannot be transmitted accurately in the SSB mode. This is another reason for more severely discouraging low audio frequency components at the input to this processor.

With an AGC loop that might be considered sluggish, the clipper will operate a considerable portion of the time. With a preset control, the clipper will operate at its threshold (on the verge of clipping) with a sinusoid applied to the AGC loop input. On speech waveforms it will be seen that a few dB of clipping will occur fairly consistently. The ear will be unable to discern this. But the clipped wave has harmonic content, which must be removed. If the output of this system is applied to a filter-type SSB generator, then perhaps no further bandwidth restriction lowpass filtering will be needed; a subsequent bandpass filter will limit the transmitted spectrum. A phasing-type SSB system or an AM system absolutely must have bandwidth-limiting lowpass filtering. This filter only need remove components caused by clipping, and only those over 3000 Hz or so. The filter should be of a lowpass type, it should have a flat response out to its cutoff frequency, and then it should drop as fast as possible. A filter to accomplish this is shown in Figure 8.

A problem with such a filter is that, when driven by a squared waveform, it will have an overshoot at its output. Numerous schemes to counter these overshoots have been developed over the years. The more effective ones are quite complicated. A simple method, quite practical, is included in the lowpass filter of Figure 8. It consists of a step in the magnitude response wherein the upper modulating components are attenuated about 2 dB compared to the lower components. The frequency response of the proposed lowpass filter is shown in Figure 9. The transient response of the filter to a squared waveform without overshoot correction is shown in Figure 10A; with the overshoot correction the response is shown in Figure 10B.
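The overshoot, and the way a modest step in the response reduces it, can be illustrated with an idealized calculation. The sketch below sums the harmonics of a unit square wave that survive a brick-wall lowpass, with and without a 2 dB reduction of the upper components. The brick-wall filter, the 300 Hz test frequency and the 1000 Hz step frequency are all assumptions made for illustration; the real filter response is the one shown in Figure 9.

```python
import math

def filtered_square_peak(fundamental_hz, cutoff_hz, step_hz=None, step_db=0.0):
    """Peak of a unit square wave after an idealized brick-wall lowpass.

    Harmonics at or above step_hz (if given) are attenuated by step_db.
    """
    step_gain = 10.0 ** (-step_db / 20.0)
    points = 2000
    peak = 0.0
    for i in range(points + 1):
        x = math.pi * i / points  # scan one half cycle
        total = 0.0
        n = 1
        while n * fundamental_hz <= cutoff_hz:  # only odd harmonics are present
            weight = 1.0
            if step_hz is not None and n * fundamental_hz >= step_hz:
                weight = step_gain
            total += weight * math.sin(n * x) / n
            n += 2
        peak = max(peak, 4.0 / math.pi * total)
    return peak

plain = filtered_square_peak(300.0, 3000.0)
stepped = filtered_square_peak(300.0, 3000.0, step_hz=1000.0, step_db=2.0)
print(f"peak without step: {plain:.3f}")
print(f"peak with 2 dB step above 1000 Hz: {stepped:.3f}")
```

In this idealized case the band-limited square wave overshoots the clipping level by roughly 18 percent, and the 2 dB step cuts that overshoot approximately in half.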


  The metering

Because the control voltage in the AGC loop can go from zero to maximum in perhaps 10 milliseconds, and back to zero in perhaps 30 milliseconds, there is no way that an ordinary analog meter can indicate the degree of compression. LED bargraph displays are the best way to monitor compression. The AGC loop metering system is shown in Figure 11.

The clipping should be monitored if the operator intends to use clipping on occasion as a loudness-enhancing tool. It was found experimentally that monitoring the instantaneous clipping level was essentially pointless, because the bargraph segments were illuminated in microsecond bursts. They could be seen only in a darkened room. In addition, since the ear cannot hear the clipping of such wavebursts, it was decided to average the clipping over a 10 millisecond time frame. This allowed the bargraph clipping display to match the ear; it will respond to clipping which is substantial enough for the ear to perceive. It also results in a bargraph which indicates in a meaningful way. The clipper metering system is shown in Figure 12.
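The reasoning behind the 10 millisecond averaging can be sketched in software. The function below keeps a running mean of how far the signal exceeds the clipping threshold; the threshold, the sample rate and the two test signals are assumed values, and the mapping from this average to bargraph segments is left out.

```python
from collections import deque

def averaged_clipping(samples, threshold, fs, window_s=0.010):
    """Running mean, over window_s, of how far |sample| exceeds the clip threshold."""
    n = max(1, int(round(window_s * fs)))
    window = deque([0.0] * n, maxlen=n)
    total = 0.0
    out = []
    for s in samples:
        excess = max(0.0, abs(s) - threshold)
        total += excess - window[0]  # add the newest value, drop the oldest
        window.append(excess)        # maxlen discards the oldest entry
        out.append(total / n)
    return out

FS = 48_000  # assumed sample rate
burst = [0.0] * 1000 + [2.0] + [0.0] * 1000            # a single-sample transient
sustained = [0.0] * 1000 + [1.5] * 960 + [0.0] * 1000  # 20 ms of steady clipping

print(f"brief transient: {max(averaged_clipping(burst, 1.0, FS)):.4f}")
print(f"sustained clip:  {max(averaged_clipping(sustained, 1.0, FS)):.4f}")
```

A transient lasting one sample barely moves the averaged reading, while clipping that persists long enough to be audible drives it to the full excess value.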

The schematic of the complete system is shown in Figure 13.


  DSP

The design stated as a set of algorithms is as follows:


  References

(1) James L. Tonne, W5SUC, "Compression and Clipping," QST, September 1956.