
Subject: comp.dsp FAQ [2 of 4]

This article was archived around: NNTP-Posting-Wed, 11 Apr 2007 13:16:58 -0500


Archive-name: dsp-faq/part2 Last-modified: Wed Apr 11 2007 URL: http://www.bdti.com/faq/
Q2: Algorithms and standards

Q2.1: Where can I get public domain algorithms for general-purpose DSP?

Updated 12/31/96

The following archives contain things such as matrix operations, FFT's and generally useful things like that, as opposed to complete applications.

Netlib

Netlib serves some of this software via email. Try mail to netlib@ORNL.GOV with "send help" in the subject field.

To Obtain:

For Europe:
Internet: netlib@nac.no
EARN/BITNET: netlib%nac.no@norunix.bitnet
X.400: s=netlib; o=nac; c=no;
EUNET/uucp: nac!netlib

For more information: See Jack J. Dongarra and Eric Grosse, "Distribution of Mathematical Software Via Electronic Mail," Comm. ACM (1987) 30, 403--407.

A similar collection of statistical software is available from statlib@temper.stat.cmu.edu. The symbolic algebra system REDUCE is supported by reduce-netlib@rand.org.

NSWC Library

The Naval Surface Warfare Center has a library of mathematical Fortran subroutines that may be of use. The NSWC library is a library of general-purpose Fortran subroutines that provide a basic computational capability in a variety of mathematical activities. Emphasis has been placed on the transportability of the codes. Subroutines are available in the following areas: Elementary Operations, Geometry, Special Functions, Polynomials, Vectors, Matrices, Large Dense Systems of Linear Equations, Banded Matrices, Sparse Matrices, Eigenvalues and Eigenvectors, l1 Solution of Linear Equations, Least-Squares Solution of Linear Equations, Optimization, Transforms, Approximation of Functions, Curve Fitting, Surface Fitting, Manifold Fitting, Numerical Integration, Integral Equations, Ordinary Differential Equations, Partial Differential Equations

For more information:

NSWC Library of Mathematical Subroutines
Report No.: NSWC TR 90-21, January 1990
by Alfred H. Morris, Jr.
Naval Surface Warfare Center (E43)
Dahlgren, VA 22448-5000 U.S.A.
[Witold Waldman] IEEE Press book "Programs For Digital Signal Processing" You can get the Fortran source code from the IEEE Press book "Programs For Digital Signal Processing." See question 1.3.6. ---------------------------------------------------------------------- Q2.2: What are CELP and LPC? Where can I get the source for CELP and LPC? Updated 09/10/01 CELP stands for "code excited linear prediction". LPC stands for "linear predictive coding". They are compression algorithms used for low bit rate (2400 and 4800 bps) speech coding. The U.S. DoD's Federal-Standard-1016 based 4800 bps code excited linear prediction voice coder version 3.2 (CELP 3.2) Fortran and C simulation source codes are available for worldwide distribution (on DOS diskettes, but configured to compile on Sun SPARC stations) from NTIS and DTIC. Example input and processed speech files are included. A Technical Information Bulletin (TIB), "Details to Assist in Implementation of Federal Standard 1016 CELP," and the official standard, "Federal Standard 1016, Telecommunications: Analog to Digital Conversion of Radio Voice by 4,800 bit/second Code Excited Linear Prediction (CELP)," are also available. To obtain CELP: Available through the National Technical Information Service: NTIS U.S. Department of Commerce 5285 Port Royal Road Springfield, VA 22161 USA (800) 553-6847 FS-1016 CELP 3.2 may also be obtained from ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/coding/celp_3.2a.tar.Z. LPC-10 (2.4 Kbps) is available from ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/coding/lpc10-1.0.tar.gz. LPC (4.8 Kbps) can be downloaded in SpeakFreely http://www.speakfreely.org/, or in HawkVoice http://www.hawksoft.com/hawkvoice/. HawkVoice includes versions of OpenLPC, LPC-10, LPC, GSM, and Intel/DVI ADPCM. These versions have been rewritten to support multiple encoding and decoding streams, and the interfaces have been standardized. 
[Phil Frisbie, Jr., phil@hawksoft.com] OpenLPC (1.4 and 1.8 Kbps) can be downloaded from ftp://ftp.futuredynamics.com/OpenLPC/. MATLAB software for LPC-10 is available from http://www.eas.asu.edu/~spanias/srtcrs.html. Also, postscript copies of tutorials of speech coding can be found at http://www.eas.asu.edu/~spanias/papers.html. [Andreas Spanias, spanias@asu.edu] For more information: * The following articles describe the Federal-Standard-1016 4.8-kbps CELP coder (it's unnecessary to read more than one): Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, The Federal Standard 1016 4800 bps CELP Voice Coder, Digital Signal Processing, Academic Press, 1991, Vol. 1, No. 3, p. 145-155. Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, The DoD 4.8 kbps Standard (Proposed Federal Standard 1016), in Advances in Speech Coding, ed. Atal, Cuperman and Gersho, Kluwer Academic Publishers, 1991, Chapter 12, p. 121-133. Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, The Proposed Federal Standard 1016 4800 bps Voice Coder: CELP, Speech Technology Magazine, April/May 1990, p. 58-64. Additional information on CELP can also be found in the comp.speech FAQ. * The voicing classifier used in the enhanced LPC-10 (LPC-10e) is described in: Campbell, Joseph P., Jr. and T. E. Tremain, Voiced/Unvoiced Classification of Speech with Applications to the U.S. Government LPC-10E Algorithm, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 1986, p. 473-6. The U. S. Federal Standard 1015 (NATO STANAG 4198) is described in: Thomas E. Tremain, The Government Standard Linear Predictive Coding Algorithm: LPC-10, Speech Technology Magazine, April 1982, pp. 40-49. [Most of the above from Joe Campbell, jpcampb@afterlife.ncsc.mil, with additions from Dan Frankowski, drankow@winternet.com, and Ed Hall, edhall@rand.org] ---------------------------------------------------------------------- Q2.3: What is ADPCM? 
Where can I get source for it? Updated: 04/03/01 ADPCM stands for Adaptive Differential Pulse Code Modulation. It is a family of speech compression and decompression algorithms. A common implementation takes 16-bit linear PCM samples and converts them to 4-bit samples, yielding a compression rate of 4:1. To obtain: There is public domain C code available via anonymous ftp at ftp://ftp.cwi.nl/pub/audio/adpcm.shar written by Jack Jansen (email Jack.Jansen@cwi.nl). It is very programmer-friendly. The ADPCM code used is the Intel/DVI ADPCM code which is being recommended by the IMA Digital Audio Technical Working Group. It allows the following calls: adpcm_coder(short inbuf[], char outbuf[], int nsample, struct adpcm_state *state); adpcm_decoder(char inbuf[], short outbuf[], int nsample, struct adpcm_state *state); Note that this is NOT a G.722 coder. The ADPCM standard is much more complicated, probably resulting in better quality sound but also in much more computational overhead. Platforms: The routines have been tested on numerous platforms, and will easily compress and decompress millions of samples per second on current hardware. For more information: The G.721/722/723 packages are available from ITU at http://www.itu.ch/. [From Dan Frankowski, dfrankow@winternet.com; Jack Jansen, Jack.Jansen@cwi.nl] ---------------------------------------------------------------------- Q2.4: What is GSM? Where can I get source for it? Updated 4/27/00 GSM (Global System for Mobile Communication) is a standard for digital cellular telephony used in Europe. GSM also refers to the speech coder used in GSM telephones, which is what this section of the FAQ is concerned with. The Communications and Operating Systems Research Group (KBS) at the Technische Universitaet Berlin is currently working on a set of UNIX-based tools for computer-mediated telecooperation that will be made freely available. 
As part of this effort they are publishing an implementation of the European GSM 06.10 provisional standard for full-rate speech transcoding, prI-ETS 300 036, which uses RPE/LTP (regular pulse excitation/long term prediction) coding at 13 kbit/s.

GSM 06.10 compresses frames of 160 13-bit samples (8 kHz sampling rate, i.e. a frame rate of 50 Hz) into 260 bits; for compatibility with typical UNIX applications, this implementation turns frames of 160 16-bit linear samples into 33-byte frames (1650 bytes/s). The quality of the algorithm is good enough for reliable speaker recognition; even music often survives transcoding in recognizable form (given the bandwidth limitations of 8 kHz sampling rate).

The interfaces offered are a front end modeled after compress(1), and a library API. Compression and decompression run faster than real time on most SPARCstations. The implementation has been verified against the ETSI standard test patterns.

(Jutta Degener jutta@cs.tu-berlin.de, Carsten Bormann cabo@cs.tu-berlin.de)
Communications and Operating Systems Research Group, TU Berlin
Fax: +49.30.31425156, Phone: +49.30.31424315

To obtain: ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/coding/gsm-1.0.6.tar.gz. An alternative site is ftp://ftp.cwi.nl/pub/audio/gsm-1.0.7.tar.gz. Try also: http://kbs.cs.tu-berlin.de/~jutta/toast.html.

[From Dan Frankowski, dfrankow@winternet.com; Jutta Degener, jutta@cs.tu-berlin.de]

----------------------------------------------------------------------

Q2.5: How does pitch perception work, and how do I implement it on my DSP chip?

Updated 04/02/01

Pitch is officially defined as "That attribute of auditory sensation in terms of which sounds may be ordered on a musical scale." Several good examples illustrating the subtleties of pitch perception are included in the "Auditory Demonstrations CD" which is available from the Acoustical Society of America, Woodbury, NY 10797 for $20.
Books/papers:

A good general reference about the psychology of pitch perception is the book:

B.C.J. Moore, An Introduction to the Psychology of Hearing, Academic Press, London, 1997.

This book is available in paperback and makes a good desk reference.

An algorithm implementation that matches a large body of psycho-acoustical work, but which is computationally very intensive, is presented in the paper:

Malcolm Slaney and Richard Lyon, "A Perceptual Pitch Detector," Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1990, Albuquerque, New Mexico. Available for ftp at ftp://worldserver.com/pub/malcolm/ICASSP90.psc.Z

The definitive papers describing the use of such a perceptual pitch detector as applied to the classical pitch literature are:

Ray Meddis and M. J. Hewitt, "Virtual pitch and phase sensitivity of a computer model of the auditory periphery," Journal of the Acoustical Society of America 89 (6), 1991: 2866-2882 and 2883-2894.

The current work that argues for a pure spectral method starts with the work of Goldstein:

J. Goldstein, "An optimum processor theory for the central formation of the pitch of complex tones," Journal of the Acoustical Society of America 54, 1496-1516, 1973.

Two approaches are worth considering if something approximating pitch is appropriate. The people at IRCAM have proposed a harmonic analysis approach that can be implemented on a DSP:

Boris Doval and Xavier Rodet, "Estimation of Fundamental Frequency of Musical Sound Signals," Proceedings of the 1991 International Conference on Acoustics, Speech, and Signal Processing, Toronto, Volume 5, pp. 3657-3660.

The classic paper for time domain (peak picking) pitch algorithms is:

B. Gold and L. Rabiner, "Parallel processing techniques for estimating pitch periods of speech in the time domain," Journal of the Acoustical Society of America, 46, pp 441-448, 1969.

Finally, a word of caution: Pitch is not single-valued.
We can hear a sound and match it to several different pitches. Imagine the number of instruments in an orchestra, each with its own pitch. Even a single sound can have more than one pitch. See for example Demonstration 27 from the ASA Auditory Demonstrations CD.

[The above from Malcolm Slaney, Interval Research, and John Lazzaro, U.C. Berkeley.]

Information about independently changing the pitch and speed of a digital recording can be found at http://www.dspdimension.com/html/timepitch.html.

[Stephan M. Bernsee, spam@dspdimension.com]

----------------------------------------------------------------------

Q2.6: What standards exist for digital audio? What is AES/EBU? What is S/PDIF?

Updated 1/8/97

Q2.6.1: Where can I get copies of ITU (formerly CCITT) standards?

Try the ITU (International Telecommunication Union) homepage at http://www.itu.ch/.

----------------------------------------------------------------------

Q2.6.2: What standards are there for digital audio?

AES/EBU

The "AES/EBU" (Audio Engineering Society / European Broadcasting Union) digital audio standard is probably the most popular digital audio standard today. Most consumer and professional digital audio devices (CD players, DAT decks, etc.) that feature digital audio I/O support AES/EBU.

AES/EBU is a bit-serial communications protocol for transmitting digital audio data through a single transmission line. It provides two channels of audio data (up to 24 bits per sample), a method for communicating control and status information ("channel status bits"), and some error detection capabilities. Clocking information (i.e., sample rate) is derived from the AES/EBU bit stream, and is thus controlled by the transmitter. The standard mandates use of 32 kHz, 44.1 kHz, or 48 kHz sample rates, but some interfaces can be made to work at other sample rates.

AES/EBU provides both "professional" and "consumer" modes. The big difference is in the format of the channel status bits mentioned above.
The professional mode bits include alphanumeric channel origin and destination data, time of day codes, sample number codes, word length, and other goodies. The consumer mode bits have much less information, but do include information on copy protection (naturally). Additionally, the standard provides for "user data", which is a bit stream containing user-defined (i.e., manufacturer-defined) data. According to Tim Channon, "CD user data is almost raw CD subcode; DAT is StartID and SkipID. In professional mode, there is an SDLC protocol or, if DAT, it may be the same as consumer mode."

Three physical connection media are commonly used with AES/EBU: balanced (differential), using two wires and shield in three-wire microphone cable with XLR connectors; unbalanced (single-ended), using audio coax cable with RCA jacks; and optical (via fiber optics).

S/P-DIF

"S/P-DIF" (Sony/Philips Digital Interface Format) typically refers to AES/EBU operated in consumer mode over unbalanced RCA cable. Note that S/P-DIF and AES/EBU mean different things depending on how much of a purist you are in the digital audio world; see the Finger article below.

References:

Finger, Robert, AES3-199X: The Revised Two Channel Digital Audio Interface (DRAFT), presented at the 91st Convention of the Audio Engineering Society, October 4-8, 1991. Reprints: AES, 60 East 42nd St., New York, NY, 10165.

[The above from Phil Lapsley and Tim Channon, tchannon@black.demon.co.uk]

Painter, E. M., and Spanias, A. S. (1997 and revised 1999). A Review of Algorithms for Perceptual Coding of Digital Audio Signals. (PostScript, 3MB) http://www.eas.asu.edu/~spanias/papers.html

[Andreas Spanias, spanias@asu.edu]

----------------------------------------------------------------------

Q2.7: What is mu-law encoding? Where can I get source for it?

Updated 9/13/99

Mu-law (also "u-law") encoding is a form of logarithmic quantization or companding.
It's based on the observation that many signals are statistically more likely to be near a low signal level than a high signal level. Therefore, it makes more sense to have more quantization points near a low level than a high level. In a typical mu-law system, linear samples of 14 to 16 bits are companded to 8 bits. Most telephone quality codecs (including the Sparcstation's audio codec) use mu-law encoded samples. Desktop Sparc machines come with routines to convert between linear and mu-law samples. On a desktop Sparc, see the man page for audio_ulaw2linear in /usr/demo/SOUND/man. To obtain: Craig Reese posted the source of similar routines to comp.dsp in August '92. These are archived on ftp://mirriwinni.cse.rmit.edu.au/pub/dsp/misc/ulaw_reese. References: ITU-T (formerly CCITT) Recommendation G.711 (very difficult to follow). Michael Villeret, et. al, A New Digital Technique for Implementation of Any Continuous PCM Companding Law, IEEE Int. Conf. on Communications, 1973, vol. 1, pp. 11.12-11.17. MIL-STD-188-113, Interoperability and Performance Standards for Analog-to-Digital Conversion Techniques, 17 February 1987. TI Digital Signal Processing Applications with the TMS320 Family (TI literature number SPRA012A), pp. 169-198. [From Joe Campbell; Craig Reese, cfreese@super.org; Sepehr Mehrabanzad, sepehr@falstaff.dev.cdx.mot.com; Keith Kendall, KLK3%mimi@magic.itg.ti.com] ---------------------------------------------------------------------- Q2.8: How can I do CD <=> DAT sample rate conversion? Updated 9/13/99 CD players use a 44.1 kHz sample rate, whereas DAT uses a 48 kHz sample rate. This means that you must do sample rate conversion before you can get data from a CD player directly into a DAT deck. [From Ed Hall, edhall@rand.org:] For a start, look at Multirate Digital Signal Processing by Crochiere and Rabiner (see FAQ section 1.1). Almost any technique for producing good digital low-pass filters will be adaptable to sample-rate conversion. 
44.1:48 and vice-versa is pretty hairy, though, because the lowest whole-number ratio is 147:160. To do all that in one go would require a FIR with thousands of coefficients, of which only 1/147th or 1/160th are used for each sample--the real problem is memory, not CPU for most DSP chips. You could chain several interpolators and decimators, as suggested by factoring the ratio into 3*7*7:2*2*2*2*2*5. This adds complexity, but reduces the number of coefficients required by a considerable amount. [From Lou Scheffer:] Theory of operation: 44.1 and 48 are in the ratio 147/160. To convert from 44.1 to 48, for example, we (conceptually): 1. interpolate 159 zeros between every input sample. This raises that data rate to 7.056 MHz. Since it is equivalent to reconstructing with delta functions, it also creates images of frequency f at 44.1-f, 44.1+f, 88.2-f, 88.2+f, ... 2. We remove these with an FIR digital filter, leaving a signal containing only 0-20 KHz information, but still sampled at a rate of 7.056 MHz. 3. We discard 146 of every 147 output samples. It does not hurt to do so since we have no content above 24 KHz. In practice, of course, we never compute the values of the samples we will throw out. So we need to design an FIR filter that is flat to 20 KHz, and down at least X db at 24 KHz. How big does X need to be? You might think about 100 db, since the max signal size is roughly +-32767, and the input quantization +- 1/2, so we know the input had a signal to broadband noise ratio of 98 db at most. However, the noise in the stopband (20KHz-3.5MHz) is all folded into the passband by the decimation in step 3, so we need another 22 db (that's 160 in db) to account for the noise folding. Thus 120 db rejection yields a broadband noise equal to the original quantizing noise. If you are a fanatic, you can shoot for 130 db to make the original quantizing errors dominate, and a 22.05 KHz cutoff to eliminate even ultrasonic aliasing. 
You will pay for your fanaticism with a penance of more taps, however. To obtain: There's a free implementation of Julius O. Smith III and someone else's "bandwidth-limited interpolation" rate conversion algorithm. A paper available as ftp://ccrma-ftp.stanford.edu/pub/DSP/Tutorials/BandlimitedInterpolation.eps.Z explains the algorithm. Free source code, as well as an HTML discussion of the algorithm, is available at http://ccrma-www.stanford.edu/~jos/resample/. It all works quite well. [From Kevin Bradley, kb+@andrew.cmu.edu:] There is an implementation of polyphase resampling for various rates as a part of the Sox audio toolkit at http://home.sprynet.com/~cbagwell/sox.html. See file polyphas.c for details. Sox also contains an implementation of bandlimited interpolation and linear interpolation, and serves as a ready vehicle for module experimentation. [From Fritz M. Rothacher, f.rothacher@ieee.org:] You can add my Ph.D. thesis on sample-rate conversion to the FAQ: Fritz M. Rothacher, Sample-Rate Conversion: Algorithms and VLSI Implementation, Ph.D. thesis, Integrated Systems Lab, Swiss Federal Institute of Technology, ETH Zuerich, 1995, ISBN 3-89191-873-9 It can also be downloaded from my homepage at http://www.guest.iis.ee.ethz.ch/~rota. ---------------------------------------------------------------------- Q2.9: Wavelets Updated 6/3/98 Q2.9.1 What are wavelets? Where can I get more information? In short, wavelets are a way to analyze a signal using base functions which are localized both in time (as diracs, but unlike sine waves), and in frequency (as sine waves, but unlike diracs). They can be used for efficient numerical algorithms and many DSP or compression applications. Sources of information on wavelets include: * a newsletter, "Wavelet Digest". Subscriptions for Wavelet Digest: E-mail to wavelet@math.scarolina.edu with "subscribe" as subject. The Wavelet Digest can also be found at http://www.wavelet.org/. 
* http://www.amara.com/current/wavelet.html ---------------------------------------------------------------------- Q2.9.2 What are some good books and papers on wavelets The best introduction to wavelet transforms is in: Wavelets and Signal Processing- Oliver Rioul and Martin Vetterli, IEEE Signal Processing magazine, Oct. 91, pp 14-38 A good introductory book on wavelets: Randy K. Young, Wavelet Theory and Its Applications, Kluwer Academic Publishers, ISBN 0-7923-9271-X, 1993. A more thorough book: Ali N. Akansu and Richard A. Haddad, Multiresolution Signal Decomposition Transforms, Subbands, Wavelets Academic Press, Inc., ISBN 0-12-047140-X A couple more interesting papers: Wavelets and Filter banks: Theory and Design, IEEE Transactions on Signal Processing, Vol. 40, No.9, Sept. 1992, pp 2207-2232 Mac Cody's articles in Dr. Dobb's Journal, April 1992 and April 1993 Paper by Ingrid Daubechies in IEEE Trans. on Info. theory , vol 36. No.5 , Sept 1990 and a book titled " Ten lectures on Wavelets" deal with the mathematical aspects of the WT. ---------------------------------------------------------------------- Q2.9.3: Where can I get some software for wavelets? ftp://pascal.math.yale.edu/pub/wavelets/software/xwpl Binaries are available for the following platforms: Sun Sparcstations running SunOS 4.1 or Solaris 2.3, NeXT machines running NeXTstep 3.0 or higher, with an X server, Silicon Graphics machines (IRIS), DEC Alpha AXP running OSF/1 1.2 or higher, i386/i486 PC compatible with Linux 0.99. There is also a sample data directory containing interesting signals. More information: http://www.math.yale.edu/users/majid/ [From Fazal Majid majid@math.yale.edu]: Rice Wavelet Tools Description: The Rice Wavelet Toolbox (RWT) is a collection of Matlab M-files and C MEX-files for 1D and 2D wavelet and filter bank design, analysis, and processing. 
The toolbox provides tools for denoising and interfaces directly with our Matlab code for wavelet domain hidden Markov models and wavelet regularized deconvolution. Also included is a simple converter to the data format used by the official Matlab wavelet toolbox. The current distribution, Version 2.3 (Dec 1, 2000), has been streamlined and packaged for different systems, including Solaris, Linux, and Microsoft Windows. Functions omitted in Version 2.3 can be found in the Version 2.01 distribution.

To obtain: See http://www-dsp.rice.edu/software/RWT/. Send mail to wlet-tools@rice.edu (or ramesh@dsp.rice.edu)

----------------------------------------------------------------------

Q2.10: How do I calculate the coefficients for a Hilbert transformer?

Updated 6/3/98

For all the gory details, I suggest the paper:

Andrew Reilly and Gordon Frazer and Boualem Boashash: Analytic signal generation---tips and traps, IEEE Transactions on Signal Processing, no. 11, vol. 42, Nov. 1994, pp. 3241-3245.

For comp.dsp, the gist is:

1. Design a half-bandwidth real low-pass FIR filter using whatever optimal method you choose, with the principal design criterion being minimization of the maximum gain (that is, maximization of the minimum attenuation) in the stop-band f_s/4 to f_s/2.

2. Modulate this by exp(j 2 pi (f_s/4) t), so that now your stop-band is the negative frequencies, the pass-band is the positive frequencies, and the roll-off at each end does not extend into the negative frequency band.

3. Either use it as a complex FIR filter, or as a pair of I/Q real filters in whatever FIR implementation you have available.

If your original filter design produced an impulse response with an even number of taps, then the filtering in 3 will introduce a spurious half-sample delay (resampling the real signal component), but that does not matter for many applications, and such filters have other features to recommend them.
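The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it swaps the "optimal method" of step 1 for a plain Hamming-windowed sinc design, and the function names (halfband_lowpass, analytic_filter, fir) are made up for this sketch rather than taken from the paper. The factor of 2 in step 2 is also an added convenience, so that a unit-amplitude real cosine produces a complex output of magnitude close to 1.

```python
import cmath
import math

def halfband_lowpass(num_taps):
    """Step 1 (simplified): Hamming-windowed sinc low-pass, cutoff f_s/4."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        k = n - mid
        # Ideal low-pass with cutoff f_s/4: sin(pi*k/2) / (pi*k), 0.5 at k = 0.
        ideal = 0.5 if k == 0 else math.sin(math.pi * k / 2.0) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def analytic_filter(num_taps):
    """Step 2: modulate by exp(j*2*pi*(f_s/4)*t), with t = (n - mid)/f_s."""
    mid = (num_taps - 1) / 2.0
    # Scale by 2 (an assumption of this sketch) so that a real cosine of
    # amplitude 1 yields an analytic signal of magnitude about 1.
    return [2.0 * h * cmath.exp(1j * (math.pi / 2.0) * (n - mid))
            for n, h in enumerate(halfband_lowpass(num_taps))]

def fir(taps, signal):
    """Step 3: direct-form complex FIR, fully overlapped outputs only."""
    n_taps = len(taps)
    return [sum(taps[k] * signal[n - k] for k in range(n_taps))
            for n in range(n_taps - 1, len(signal))]
```

For example, filtering cos(2 pi 0.1 n) with analytic_filter(63) should give a complex sequence whose magnitude stays close to 1, while a negative-frequency input exp(-j 2 pi 0.1 n) is strongly attenuated. The real and imaginary parts of the taps are the pair of I/Q real filters mentioned in step 3.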
Andrew Reilly [Reilly@zeta.org.au]

----------------------------------------------------------------------

Q2.11: Algorithm implementation: floating-point versus fixed-point

According to the WWWebster Dictionary, an algorithm is "a procedure for solving a mathematical problem (as of finding the greatest common divisor) in a finite number of steps that frequently involves repetition of an operation; broadly: a step-by-step procedure for solving a problem or accomplishing some end especially by a computer." Typical (although by no means the only) operations are those of addition and multiplication. When expressing the algorithm with pencil and paper, these operations are commonly taken to be within an algebraically complete number system such as the integers or the reals. However, when the time comes to implement the algorithm on a computer, these "ideal" number systems must be exchanged for something realizable. The number systems available today on common processors and digital hardware are broadly categorized as floating-point and fixed-point.

In a floating-point representation, the available bits are partitioned into an exponent and a mantissa. Generally speaking, the mantissa stores the "significant digits" of the value while the exponent scales the significant digits to the desired magnitude. The action of the exponent is to move, or "float," the binary point depending on the magnitude being represented; thus the term "floating-point."

Because floating-point representations are typically at least 32 bits long (IEEE-754 is a popular standard for 32-bit and 64-bit floating-point numbers), they offer both high precision and high dynamic range. These traits of floating-point numbers allow most algorithms to be ported directly to floating-point implementations with little or no change, and this is the key reason floating-point representations are highly desirable.
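To make the exponent/mantissa split concrete, here is a small Python sketch that takes apart a 32-bit IEEE-754 value (1 sign bit, 8 exponent bits biased by 127, 23 stored mantissa bits). The function names are hypothetical, and the rebuild step only handles normal numbers, not zero, subnormals, infinities, or NaNs.

```python
import struct

def float32_fields(value):
    """Round value to IEEE-754 single precision and return its bit fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                  # 1 bit
    exponent = (bits >> 23) & 0xFF     # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF         # 23 stored mantissa bits
    return sign, exponent, fraction

def float32_value(sign, exponent, fraction):
    """Rebuild a normal value: (-1)^sign * 1.fraction * 2^(exponent - 127)."""
    significand = 1.0 + fraction / 2.0 ** 23   # implicit leading 1
    return (-1.0) ** sign * significand * 2.0 ** (exponent - 127)
```

For instance, -6.5 is 1.625 times 2^2, so float32_fields(-6.5) gives a sign of 1, a biased exponent of 129, and a fraction field of 0.625 * 2^23 = 5242880.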
The disadvantage of floating-point implementations is that they require a significant amount of extra hardware over fixed-point implementations, which translates to higher parts costs, higher power consumption, slower execution, larger chip area, or a combination of these. As the term "fixed-point" implies, fixed-point representations have the binary point at a fixed location. There are two subsets of fixed-point implementations: fractional and integer. In a fractional fixed-point implementation, such as that provided on the Motorola 56K series of DSPs, the binary point is always assumed to be to the left of the most-significant digit. In an integer fixed-point implementation, such as that provided by the Texas Instruments TMS320C54xx series of DSPs, the binary point is to the right of the least-significant digit. In either case, the arithmetic operations implemented in the hardware are essentially integer, which results in a much simpler arithmetic logic unit in hardware that allows lower cost, lower power consumption, faster execution, smaller chip area, or a combination of these, over that of floating-point implementations. Fixed-Point Arithmetic: The Basics In essence, a fixed-point representation is a simple integer scaled (divided) by a power of two. If we denote an unscaled integer variable by upper case "X" and the scaled, fixed-point variable by lower case "x," then x = X/2^b, where b is the number of digits the binary point is shifted left. For example, if X is a 16-bit, two's complement integer, and b=4, then "X" has values ranging from -2^(15) to +2^(15)-1 and with minimum step size of 1, while the scaled value "x" ranges from -2^(11) to +2^(11) - 1/(2^4) with a minimum step size of 1/(2^4). Note that the value of "b" is not part of the representation. You won't see it in a register or as part of the data anywhere; it is a parameter that the algorithm implementer must determine and maintain. 
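The relation x = X/2^b can be tried out with a short Python sketch. The helper names (to_fixed, to_real, fixed_range) are invented for this example, and saturating on overflow is an assumption of the sketch; real hardware may wrap around instead unless a saturation mode is enabled.

```python
def to_fixed(x, b, bits=16):
    """Quantize real x to a two's complement integer X with x ~= X / 2**b."""
    big_x = int(round(x * (1 << b)))          # scale by 2^b, round to nearest
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, big_x))            # saturate (assumed behavior)

def to_real(big_x, b):
    """Interpret stored integer X as the scaled value x = X / 2**b."""
    return big_x / (1 << b)

def fixed_range(b, bits=16):
    """Return (minimum, maximum, step size) of the scaled value x."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return to_real(lo, b), to_real(hi, b), 1.0 / (1 << b)
```

With 16 bits and b=4, fixed_range(4) returns (-2048.0, 2047.9375, 0.0625), matching the range and step size given above. Note that b appears only as an argument to these helpers; the stored integer itself carries no record of it.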
Fixed-point representations place some very different rules on operations than their floating-point counterparts. For example, two variables must be scaled the same in order to be added (or subtracted). Thus it may be necessary to shift one or the other operand prior to adding. Another example is that when multiplying two N-bit values with scale factors b0 and b1, the result has scale factor b0+b1 and in general requires 2*N bits in order to avoid overflow and maintain precision. There are several other rules and considerations for fixed-point arithmetic that are commonly encountered when implementing algorithms. For more information, see http://www.digitalsignallabs.com/papers.htm.

Randy Yates [yates@ieee.org]

Q3: Programmable DSP chips and their software

Q3.1: What are the available DSP chips and chip architectures?

Updated 05/07/02

The "big four" programmable DSP chip manufacturers are Texas Instruments, with the TMS320C2000, TMS320C5000, and TMS320C6000 series of chips; Freescale, with the DSP56300, DSP56800, and MSC8100 (StarCore) series; Agere Systems (formerly Lucent Technologies), with the DSP16000 series; and Analog Devices, with the ADSP-2100 and ADSP-21000 ("SHARC") series.

A good overview of programmable DSP chips is published periodically in EDN and Computer Design magazines. You may also want to check out Berkeley Design Technology's home page, which has a number of articles on choosing DSP processors, as well as a "Pocket Guide to Processors for DSP" in HTML format. Brief overviews of various DSP processors, cores, and general-purpose processors can be found at http://www.bdti.com/procsum/index.htm.

Here's a less ambitious chip breakdown by manufacturer:

Agere Systems (formerly Lucent Technologies):

DSP16xxx: 100 to 170 MHz 16-bit fixed-point DSP.
The DSP16000 core features two multipliers with SIMD-like capabilities, a 20-bit address bus, a 32-bit data bus, and eight 40-bit accumulators. The chips feature two serial ports and two timers. The first-generation processor, the DSP16210, contains a single DSP16000 core and 120 KB of internal RAM. The second-generation DSP16410 incorporates two DSP16000 cores and 386 KB of internal RAM.

Analog Devices:

ADSP-21xx: 10 to 80 MHz 16-bit fixed point DSPs; 40-bit accumulator; 24-bit instructions. Large number of family members with different configurations of on-chip memory and serial ports, timers, and host ports. ADSP-21mspxx members include an on-chip codec.

ADSP-219x: 160 MHz 16-bit fixed point DSPs; 40-bit accumulator; 24-bit instructions. Based on the ADSP-21xx family, and is mostly, but not completely, assembly source-code upward compatible with the ADSP-21xx. Adds new addressing modes and an instruction cache, expands address space, and lengthens pipeline (six stages compared to three on the ADSP-21xx). Family includes members containing multiple ADSP-219x cores.

ADSP-21xxx ("SHARC"): 33 to 100 MHz floating-point DSP; supports 32-bit fixed-point, IEEE format 32-bit floating-point, and 40-bit floating-point; 40-bit registers plus an 80-bit accumulator that can be divided into two 32-bit registers and a 16-bit register. The first-generation SHARC, the ADSP-2106x, features a single data path, a 32-bit address bus, and 40-bit data bus. Versions are available with up to 512 KB of on-chip memory, up to six communication ports, and up to 10 DMA channels. The second-generation ADSP-2116x has two parallel data paths, a 32-bit address bus, and a 64-bit data bus. Versions are available with up to 512 KB of on-chip memory, up to six communication ports, and up to 14 DMA channels. Analog Devices also sells the AD14000 series, which contain four ADSP-2106x SHARC processors in a single-chip package.
ADSP-2153x: 200 to 300 MHz 16-bit fixed-point DSPs that can execute two MAC instructions per cycle; based on the ADI/Intel MSA core. Uses a mix of 16-, 32-, and 64-bit instructions. Features include the ability to operate over a wide range of frequencies and voltages.

Freescale:
DSP563xx: 66 to 160 MHz 24-bit fixed-point DSP; most family members have 24-bit address and data busses. The DSP563xx also features 56-bit accumulators (2), timers, serial interface, host interface port. The DSP56307 and DSP56311 contain a filter co-processor. Up to 1 MB of internal RAM.
DSP568xx: 40 MHz 16-bit fixed-point DSP; 36-bit accumulators (2), three internal address buses (two 16-bit, one 19-bit) and one 16-bit external address bus; three 16-bit internal data buses, one 16-bit external data bus; serial ports, timers. 4-12 KB of internal RAM. Most family members include an on-chip A/D.
DSP5685x: 160 MHz 16-bit fixed-point DSP based on the DSP568xx. Adds an exponent detector and two accumulators, extends shifter and the logic unit to 32 bits, and widens internal address and data buses. The DSP5685x uses a 1X master clock rate rather than the 2X master clock rate used by the DSP568xx.
MSC81xx: The 300 MHz MSC8101 is the first processor based on the StarCore SC140 core. It contains four parallel ALU units that can execute up to four MAC operations in a single clock cycle. The MSC8101 uses variable-length instructions. Features include: 512 KB on-chip RAM; 16 DMA channels; an on-chip filter co-processor; and interfaces for ATM, Ethernet, E1/T1 and E3/T3, and the PowerPC bus.

Texas Instruments:
TMS320C2xxx: 20-40 MHz 16-bit fixed-point DSPs oriented toward low-cost control applications; 16-bit data, 32-bit registers. The family members have a variety of peripherals, such as A/D converters, 41 I/O pins, and 16 PWM outputs. A variety of RAM and ROM configurations are available. TI also sells the TMS320C2x family, an older version of the chip with fewer features.
TMS320C3x: 33-75 MHz floating-point DSPs; 32-bit floating-point, 24-bit fixed-point data, 40-bit registers; DMA controller; serial ports; some support for multi-processor arrays. Various ROM and RAM configurations.
TMS320C54xx: 40 to 160 MHz 16-bit fixed-point DSPs with a large number of specialized instructions. Many family members; the processors differ in configuration of on-chip ROM/RAM, serial ports, autobuffered serial ports, host ports, and time-division multiplexed ports. On-chip RAM ranges from 10 KB to over 1 MB.
TMS320C55xx: 144 to 200 MHz dual-ALU variant of the TMS320C54xx that can execute two MAC instructions per cycle. Variable instruction word width. Features include up to 320 KB internal RAM; 6 DMA channels; 2 serial ports; and 2 timers.
TMS320C62xx: 150-300 MHz 16-bit fixed-point DSP with VLIW (very long instruction word), load/store architecture; 32 32-bit registers; very deep pipeline; two multipliers, ALUs, and shifters; cache.
TMS320C64xx: 400-600 MHz 16-bit fixed-point DSP based on the TMS320C62xx. Adds SIMD support to most execution units, including extensive 8-bit SIMD support. Also doubles data bandwidth and increases size of on-chip memory.
TMS320C67xx: 100-167 MHz 32-bit and 64-bit IEEE-754 floating-point DSP with VLIW (very long instruction word), load/store architecture; 32 32-bit registers; very deep pipeline; two multipliers, ALUs, and shifters; cache.

----------------------------------------------------------------------

Q3.2: What is the difference between a DSP and a microprocessor?

Updated 04/02/01

The essential difference between a DSP and a microprocessor is that a DSP processor has features designed to support high-performance, repetitive, numerically intensive tasks.
In contrast, general-purpose processors or microcontrollers (GPPs/MCUs for short) are either not specialized for a specific kind of application (in the case of general-purpose processors), or they are designed for control-oriented applications (in the case of microcontrollers). Features that accelerate performance in DSP applications include:

* Single-cycle multiply-accumulate capability; high-performance DSPs often have two multipliers that enable two multiply-accumulate operations per instruction cycle; some DSPs have four or more multipliers
* Specialized addressing modes, for example, pre- and post-modification of address pointers, circular addressing, and bit-reversed addressing
* Most DSPs provide various configurations of on-chip memory and peripherals tailored for DSP applications. DSPs generally feature multiple-access memory architectures that enable DSPs to complete several accesses to memory in a single instruction cycle
* Specialized execution control. Usually, DSP processors provide a loop instruction that allows tight loops to be repeated without spending any instruction cycles for updating and testing the loop counter or for jumping back to the top of the loop
* DSP processors are known for their irregular instruction sets, which generally allow several operations to be encoded in a single instruction. For example, a processor that uses 32-bit instructions may encode two additions, two multiplications, and four 16-bit data moves into a single instruction. In general, DSP processor instruction sets allow a data move to be performed in parallel with an arithmetic operation. GPPs/MCUs, in contrast, usually specify a single operation per instruction

While the above differences traditionally distinguish DSPs from GPPs/MCUs, in practice it is not important what kind of processor you choose.
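Several of the features listed above (multiply-accumulate, circular addressing, tight loops, wide accumulators) serve the same kind of inner loop. As a rough illustration, here is a minimal C sketch of a Q15 fixed-point FIR filter. The names fir_q15_t, fir_q15_init, fir_q15_step, sat16 and FIR_MAX_TAPS are hypothetical and not taken from any vendor library, and the 64-bit accumulator merely stands in for the 40-bit accumulators found on many DSPs. On a general-purpose processor the index wrap, the multiply-accumulate, and the loop bookkeeping each cost explicit instructions; these are the steps a DSP typically folds into hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIR_MAX_TAPS 64   /* arbitrary limit chosen for this sketch */

typedef struct {
    const int16_t *coeffs;        /* Q15 coefficients h[0..ntaps-1] */
    int16_t delay[FIR_MAX_TAPS];  /* circular delay line of past inputs */
    size_t ntaps;                 /* number of taps actually used */
    size_t head;                  /* index of the most recent sample */
} fir_q15_t;

static void fir_q15_init(fir_q15_t *f, const int16_t *coeffs, size_t ntaps)
{
    assert(ntaps > 0 && ntaps <= FIR_MAX_TAPS);
    f->coeffs = coeffs;
    f->ntaps = ntaps;
    f->head = 0;
    for (size_t i = 0; i < FIR_MAX_TAPS; i++)
        f->delay[i] = 0;
}

/* Saturate a wide accumulator value to the 16-bit range. */
static int16_t sat16(int64_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Push one input sample and return one output sample (both Q15). */
static int16_t fir_q15_step(fir_q15_t *f, int16_t x)
{
    /* Circular addressing done by hand: advance and wrap the write index. */
    f->head = (f->head + 1) % f->ntaps;
    f->delay[f->head] = x;

    int64_t acc = 0;              /* wide accumulator, so the sum cannot overflow */
    size_t idx = f->head;
    for (size_t k = 0; k < f->ntaps; k++) {
        /* Multiply-accumulate: Q15 * Q15 gives a Q30 product. */
        acc += (int32_t)f->coeffs[k] * (int32_t)f->delay[idx];
        /* Step backwards through the delay line, wrapping at zero. */
        idx = (idx == 0) ? f->ntaps - 1 : idx - 1;
    }

    /* Round and rescale Q30 back to Q15. This assumes >> on a negative
       value is an arithmetic shift, which is implementation-defined in C
       but holds on mainstream compilers. */
    acc += (int64_t)1 << 14;
    return sat16(acc >> 15);
}
```

With a two-tap filter whose coefficients are both 16384 (0.5 in Q15), feeding in 1000 and then 3000 returns 500 and then 2000, that is, a two-point moving average. The saturation step only matters when the coefficient magnitudes sum to more than 1.0.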
What is really important is to choose the processor that is best suited for your application; if a GPP/MCU is better suited for your DSP application than a DSP processor, the processor of choice is the GPP/MCU. It is also worth noting that the difference between DSPs and GPPs/MCUs is fading: many GPPs/MCUs now include DSP features, and DSPs are increasingly adding microcontroller features.

----------------------------------------------------------------------

Q3.3: Software for Analog Devices DSPs

Updated 12/01/2006

Q3.3.1: Where can I get a C compiler for the ADSP-21xx and ADSP-21xxx?

The G21 package collects the free source code for the Analog Devices GCC-based C compilers for their 21xxx (SHARC) and 21xx series DSPs. These compilers are all based on GCC version 2.3.3. Full source code for the compiler, assembler, linker, etc. is available at http://www.kvaleberg.com/g21.html. The C compilers are available for the 210x series as well as for the SHARC. The assemblers and linkers are only available for the SHARC. The source code is based on what is released under GPL by ADI, but is adapted for use with Linux and other Unix variants.

[Egil Kvaleberg, egil@kvaleberg.no]

----------------------------------------------------------------------

Q3.3.2: Where can I get tools for the ADSP-21xxx?

SHARC development tools are available for Acorn/BSD, Linux, and other platforms. The tools include a frontend/preprocessor, assembler, linker, archiver, a utility to generate ROM images for EPROM burners, and other utilities. The supplied assembler is not part of the GNU archive, but is based on an assembler originally written by P. Lantto. Source code and binaries are available at: http://www.markettos.org.uk/electronics/sharc/.

----------------------------------------------------------------------

Q3.3.3: Where can I get algorithms or libraries for Analog Devices DSPs?

The number for the Analog Devices DSP BBS is (617) 461-4258 (300, 1200, 2400, 9600, 14400 bps), 8N1.
You can also find files on Analog Devices' web site at http://www.analog.com/processors/index.html, or at their FTP site at ftp://ftp.analog.com.

[Analog Devices DSP Applications, dsp_applications@analog.com]

----------------------------------------------------------------------

Q3.4: Software for Agere Systems (Formerly Lucent Technologies) DSPs

Agere Systems provides application libraries for their DSPs at http://www.agere.com/networking/dsps.html.

----------------------------------------------------------------------

Q3.5: Software for Freescale DSPs

Updated 12/01/2006

Freescale provides free software development tools that may be downloaded from the Freescale Web site at http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=MSW3SDK000AA&nodeId=01M983916044937.

Q3.5.1: Where can I get a free assembler for the Freescale DSP56000?

A free assembler for the Freescale DSP56000 exists, thanks to Quinn Jensen, jensenq@zdomain.com. The current version is 1.2. It is also available at http://www.zdomain.com/a56.html.

----------------------------------------------------------------------

Q3.5.2: Where can I get a free C compiler for the Freescale DSP56000?

There are two separate compiler sources for the Freescale DSP56000. One is the port of gcc 1.40 done by Andrew Sterian (asterian@umich.edu) and the other is a port of gcc 1.37.1 done by Freescale and returned to the FSF. Andrew's port has bowed to Freescale's version. Both may be portable to gcc 2.x.x with some effort required. Neither of these comes with an assembler, but you can get a free DSP56000 assembler elsewhere (see question 3.5.1, above). The Freescale gcc source is available for FTP from: ftp://nic.funet.fi/pub/ham/dsp/dsp56k-tools/dsp56k-gcc.tar.Z.

From Andrew Sterian, asterian@umich.edu:

"My DSP56K compiler, while not supported nor as well tested as Freescale's, implements fixed-point arithmetic rather than floating-point arithmetic. This may be suitable for some applications.
The 5616 compiler also implements fixed-point arithmetic. To the best of my knowledge, Freescale does not have a C compiler for the 5616 family, although alternatives may exist. As of this writing (January 1997) I have not worked with Freescale DSPs or compiler software for nearly 5 years so questions regarding my compilers may well be met with "Ummm... I have no idea." Both compilers were posted to alt.sources so any Usenet site that archives this newsgroup will have a copy. I have also found the 5616 compiler at ftp://ftp.funet.fi/pub/ham/dsp/dsp56k-tools/gcc5616.tar.Z."

Pete Gray has announced the availability of a Small C cross-compiler (with source) and assembler for the Freescale DSP56800, available for download from http://petegray.newmicros.com/ . It targets a simple DOS-box host and was developed and tested using djgpp (http://www.delorie.com/djgpp/) and Metrowerks CodeWarrior, in conjunction with NMI's (http://www.newmicros.com) IsoPod(TM), which is based on the DSP56F805. The assembler generates output suitable for Freescale's free JTAG flash loader. A Small C language reference is available online at http://www.ddjembedded.com/languages/smallc/

----------------------------------------------------------------------

Q3.5.3: Where can I get a disassembler for the Freescale DSP56000?

Miloslaw Smyk has released an open source (BSD style) 5600x disassembly library. It is available for download at https://sourceforge.net/projects/lib5600x

[Miloslaw Smyk, thorgal@wmfh.org.pl]

----------------------------------------------------------------------

Q3.5.4: Where can I get algorithms and libraries for Freescale DSPs?

Freescale provides a software archive that is available via World-Wide Web from the software page at http://www.mot.com/SPS/DSP/software/. The archive includes macros for filters (FIR, IIR, adaptive) and floating-point functions.
[Tim Baggett]

----------------------------------------------------------------------

Q3.5.5: Where can I get NeXT-compatible Freescale DSP56001 code?

Try FTP at ccrma-ftp.stanford.edu. The /pub/ directory contains free code for the Freescale DSP56001 and the NeXT platform.

[bil@ccrma.Stanford.EDU]

----------------------------------------------------------------------

Q3.5.6: Where can I get emulators for the 68HC11 (6811) processor?

While the 68HC11 is not a DSP processor, emulators are available for those who might be interested in doing DSP on these processors:

* New Mexico State University (NMSU) simulator engine, ftp://crl.nmsu.edu/pub/non-lexical/6811/ (Unix). Simulator engine with a command-line interface.
* Sim6811, http://www.cs.nmsu.edu/~pfeiffer/classes/273/notes/sim.html (Mac). Screen-oriented user interface based on the NMSU simulator engine (plus bug fixes).
* THRSim11, http://programfiles.com/index.asp?ID=8366 allows you to edit, assemble, simulate and debug programs for the 68HC11 on Windows 95/98. THRSim11 simulates the CPU, ROM, RAM, all memory mapped I/O ports, and the on board peripherals.

----------------------------------------------------------------------

Q3.6: Software for Texas Instruments DSPs

Updated 12/01/2006

Q3.6.1: Where can I get free algorithms or libraries for TI DSPs?

ftp://ftp.funet.fi/pub/ham/dsp/ has some old, apparently public domain, assembler and related tools from TI for the TMS320 family. TI has a number of free algorithms available on their website at http://dspvillage.ti.com/docs/sdstools/sdscommon/showsdsinfo.jhtml?templateId=57&path=templatedata/cm/ccstudio/data/free_tools. TI's world-wide web site is http://www.ti.com. The TI DSP bulletin board is mirrored on ftp.ti.com. The TI site is the official one, but has no user contributed software.

[Brad Hards, bradh@gil.com.au]

{ If anyone knows of any other sources for TI DSP software, please let us know at comp-dsp-faq@bdti.com. Thanks!
}

----------------------------------------------------------------------

Q3.6.2: Where can I get free development tools for TI DSPs?

TI development tools are available for a free 30-day evaluation on the TI website. Go to http://focus.ti.com/dsp/docs/dspsupportaut.tsp?sectionId=3&tabId=416&familyId=44&toolTypeId=30.

----------------------------------------------------------------------

Q3.6.3: Where can I get a free C compiler for the TI TMS320C3x/4x?

The GNU binutils 2.11 and later have been ported to the TI C54xx/IBM C54DSP. Most of the binutils tools are supported, including the assembler, linker and objdump. The assembler is source-compatible with the TI assembler. The GNU binutils are available from http://sources.redhat.com/binutils/. GDB ports for c25/c5x/c54x are also available.

[Timothy Wall]

Dr. Michael P. Hayes has written a GNU C-based compiler for the TMS320C30 and TMS320C40 families, available at http://www.elec.canterbury.ac.nz/c4x. The current version patches against gcc-2.8.1; support is moving to egcs-1.2. The compiler is freely redistributable under the terms of the GNU General Public License. Front-ends are also available for C++, Java, Fortran 77, Pascal, and Ada 95, among others.

[Dr. Michael P. Hayes, m.hayes@elec.canterbury.ac.nz]

----------------------------------------------------------------------

Q3.6.4: Where can I get a free assembler for the TI TMS320C3x/4x?

Ted Rossin has written an assembler and linker for the TMS320C30. In his words, "It is somewhat limited by the fact that it can't handle expressions but it has worked fine for me over the past few years. There is no manual because it is a clone of the TI assembler and linker. However the linker command files use a different (easier to use) syntax. It runs on HP-UX workstations, Macs, IBM clones and believe it or not the Atari-ST (because I developed the code on it)."

[Ted Rossin, rossin@fc.hp.com]

Dr. Michael P.
Hayes has written a GNU-based assembler for the TMS320C30 and TMS320C40 families, available at http://www.elec.canterbury.ac.nz/c4x. The current version patches against binutils-2.7. According to Michael Hayes, the assembler syntax is compatible with the Texas Instruments TMS320C30 assembler, although not all the Texas Instruments directives are supported. The binutils include a linker (ld), archiver (ar), disassembler (objdump), and other miscellaneous utilities. The object format of the assembler is compatible with the COFF format used by the Texas Instruments assembler. The assembler and other binary utilities are freely redistributable under the terms of the GNU General Public License.

[Dr. Michael P. Hayes, m.hayes@elec.canterbury.ac.nz]

----------------------------------------------------------------------

Q3.6.5: Where can I get a free simulator for the TI TMS320C3x/4x?

A freely distributable instruction set architecture simulator is available for the TMS320C30 DSP as part of the Web-Enabled Simulation framework from UT Austin at http://signal.ece.utexas.edu/~arifler/wetics/. We have released all of the source code, as well as prebuilt C30 simulators for Windows 95/NT and Solaris 2.5 architectures. The C30 simulator is bit-, cycle-, and instruction-accurate. The behavior of the C30 simulator has been validated against a C30 DSK board. The C30 simulator correctly reports interlocking and pipeline flushes, so it provides a convenient way to check C30 programs for these hidden delays. The C30 simulator is based on the C30 DSK tools by Keith Larson at Texas Instruments.

[Brian Evans, bevans@ece.utexas.edu]

Herman Ten Brugge (haj.ten.brugge@net.hcc.nl) has also written a GNU debugger (GDB) based simulator for the TMS320C30 and TMS320C40, available at http://www.elec.canterbury.ac.nz/c4x. This is freely redistributable under the terms of the GNU General Public License.
This simulator allows you to debug your programs without having to connect to a real C[34]x target system. It will also profile your code, showing you where the pipeline conflicts are occurring. You can connect I/O ports to files (or TCP/IP sockets), trigger interrupts, examine the cache, etc. It will detect different threads of control running and generate a profile summary for each thread, annotating both the C code and assembler code with the number of executed cycles.

[Dr. Michael P. Hayes, m.hayes@elec.canterbury.ac.nz]

----------------------------------------------------------------------

Q3.6.6: What is Tick? Where can I get it?

Tick is a TMS320C40 parallel network detection and loader utility. It is available from: http://wotug.kent.ac.uk/parallel/vendors/ti/tms320c40/tick/

Supports: Transtech, Hunt, and Traquair boards hosted by DOS, SunOS, Linux