The digital signal processing (DSP) module is the part of the device specialized for fast execution of the basic mathematical operations (addition, subtraction and multiplication) and for automatic accumulation, logical shifting, rounding and saturation. This module makes the dsPIC30F devices very powerful and considerably extends the scope of their applications.
Processing of digital signals is computationally demanding, and one of the biggest problems is the multiplication it requires. The dsPIC30F family has a hardware multiplier which considerably accelerates the processing. Most digital signal processing reduces to calculating sums of products of two arrays, so this module has been designed for fast calculation of the sum of products a[0]*b[0] + a[1]*b[1] + ... + a[N-1]*b[N-1] of two arrays a and b of length N.
The block diagram of the DSP module is shown in Fig. 11-1.
Fig. 11-1 DSP module block diagram
Fig. 11-1 illustrates the realization of the DSP module. In order to calculate the sum of products as fast as possible, the following additions have been made:

The process of calculation of the sum of products comprises several steps:

As can be noticed, the process starts by reading two array elements, the first element of the first array and the first element of the second array. Several details need to be considered. Firstly, the elements being multiplied come from different arrays, which means that their values are not saved in adjacent locations. Secondly, the elements always have identical indices, i.e. the address changes are always identical. E.g. if the arrays consist of 16-bit elements, after the calculation of each partial sum the addresses of the next two elements have to be increased or decreased by 2, depending on whether the array is saved with increasing or decreasing index. The change for such arrays will always be ±2. This is important because it allows the change to be carried out by hardware!
Reading the array elements to be multiplied can be accelerated by accessing both memory locations simultaneously. For this purpose the microcontroller must have two data buses and two address generators, and the structure of the memory must allow simultaneous access to different locations. The devices of the dsPIC30F family meet these requirements. The data buses are called the X and Y buses, as shown in Fig. 11-1. There are two address generators, because both addresses to be read have to be calculated simultaneously, and multiple access to the memory is provided.
The X and Y data buses are considered here. To facilitate the realization, some constraints have to be introduced. The data memory where the elements of the arrays are saved has been split into two sections, the X and Y data spaces, as shown in Fig. 11-2.
Fig. 11-2 Organizations of data memories of dsPIC30F4013 and dsPIC30F6014A devices
Fig. 11-2 shows an example of the data memory organization of a dsPIC30F4013 device. The memory capacity, i.e. the size of the X and Y data spaces, is device specific. The space for the special function registers (SFR) remains the same and so does the starting address of the X space.
Splitting the data memory into the X and Y data spaces introduces the constraint that each data bus can access only one of the spaces. An attempt to access the other space (e.g. an attempt to access the X data space via the Y data bus) will result in a trap or, if the function for processing the trap has not been specified, in a device reset.
The existence of constraints when using the DSP instructions has already been mentioned. The principal constraints when reading data are:
The practice is as follows:
Of course, all this is carried out by the hardware. The code should only provide the address increment (increase/decrease) used when calculating the partial sums, the initial and final addresses of the arrays, and the registers used for loading the addresses and reading the array elements.
The most frequently used DSP instruction is MAC. The following example shows one of the forms of its use.
Example:
MAC W4*W6, A, [W8]+=2, W4, [W10]+=2, W6
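The instruction from this example performs six actions: it multiplies the values in the registers W4 and W6, adds the product to the accumulator A, reads the value from the address in W8 and writes it to W4, increases W8 by 2, reads the value from the address in W10 and writes it to W6, and increases W10 by 2.
The hardware specialized for DSP instructions allows all this to be executed in one instruction cycle! As can be seen, the DSP module makes the devices of the dsPIC30F family very powerful. If the device clock is 80 MHz, the instruction clock is 20 MHz, i.e. in one second 20 million MAC instructions can be executed, each including all six actions listed above!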
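The effect of one such MAC instruction can be sketched in Python. This is only an illustrative model in integer mode, not the hardware implementation: the register file is a dictionary, the data memory is a dictionary keyed by address, the addresses used are hypothetical, and the 40-bit width of the accumulator is not modelled.

```python
def mac_step(regs, acc, mem):
    """Model one `MAC W4*W6, A, [W8]+=2, W4, [W10]+=2, W6` (integer mode).

    regs: dict holding W4, W6, W8, W10; mem: dict address -> 16-bit value.
    Returns the new accumulator value; regs is updated in place.
    """
    acc += regs["W4"] * regs["W6"]   # multiply and add to the accumulator
    regs["W4"] = mem[regs["W8"]]     # prefetch the next operand via W8
    regs["W8"] += 2                  # advance by one 16-bit element
    regs["W6"] = mem[regs["W10"]]    # prefetch the next operand via W10
    regs["W10"] += 2                 # advance by one 16-bit element
    return acc


# Hypothetical addresses, chosen only for the illustration.
memory = {0x0900: 3, 0x1900: 5}
registers = {"W4": 2, "W6": 4, "W8": 0x0900, "W10": 0x1900}
accumulator = mac_step(registers, 10, memory)
```

Note that the multiplication uses the values that were in W4 and W6 before the instruction, while the values fetched by the same instruction are used by the next MAC.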
It is customary that the X space contains stationary arrays, e.g. digital filter coefficients, FFT blocks, etc., whereas the Y space contains dynamic arrays, e.g. samples of the signal being processed. For this reason one section of the program memory can be mapped as X space. This means that the coefficients of a digital filter may be saved in the program memory as constants, that this section of the program memory is declared an extension of the X space, and that it is accessible as if it were in the data memory. Of course, the addresses of these elements will not be the same as in the program memory, but will start from the address 0x8000, as can be seen in Fig. 11-2. In this way additional capacity for saving coefficients is obtained, which is essential particularly for high order filters, FFT algorithms and many other applications. This procedure is known as Program Space Visibility (PSV) Management.
Address generation in PSV management is shown in Fig. 11-3.
Fig. 11-3 Address generation in PSV management
The element of the array to be read is in the program memory, which has 24-bit addresses, whereas DSP instructions can operate only with 16-bit addresses. The remaining 8 bits are obtained from the PSVPAG register. The whole procedure reduces to writing bits 22 to 15 of the 24-bit program memory address to the PSVPAG register and activating PSV management; the array is then accessed as if it were in the data memory. The hardware of the DSP module will add the 8 bits to each address, as shown in Fig. 11-3. In this way a 24-bit address is generated and the array element is read correctly.
Example:
PSV management is enabled. The array in the program memory starts from the address 0x108000
W1=0x8000
PSVPAG=0x21
In the binary system this is
W1 = 1000 0000 0000 0000
PSVPAG = 0010 0001
EA = 0 0010 0001 000 0000 0000 0000 = 0x108000
The leading zero of EA is the most significant bit of the program memory address, which is set to zero by hardware (see Fig. 11-3). EA denotes the Effective Address in the program memory.
As Fig. 11-3 shows, only the 15 least significant bits of the 16-bit address are used, and its highest bit is set to logic one to denote that PSV management is in use. The 8 bits from the PSVPAG register are added to these 15 bits, and the highest bit of the program memory address is logic zero. In this way the addresses of the array elements in the program memory are obtained. All this is done automatically, i.e. no attention has to be paid to it when writing a program. All that has to be done is to set the corresponding value in the PSVPAG register and activate PSV management in the CORCON register. The structure of the CORCON register is given at the end of this chapter.
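The address generation described above can be checked with a few lines of Python. This is a minimal sketch of the bit arithmetic only; the function name is hypothetical and the hardware, of course, does this without any code.

```python
def psv_effective_address(psvpag, data_address):
    """Form the 24-bit program memory address used by PSV management.

    Bit 23 is always 0, bits 22..15 come from PSVPAG and bits 14..0
    come from the 16-bit data address held in the W register.
    """
    if not data_address & 0x8000:
        raise ValueError("bit 15 must be 1 to address the PSV window")
    return ((psvpag & 0xFF) << 15) | (data_address & 0x7FFF)


# Values from the example: PSVPAG = 0x21, W1 = 0x8000.
ea = psv_effective_address(0x21, 0x8000)
```

With the values of the example the function returns 0x108000, the starting address of the array in the program memory.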
One of the essential improvements which made possible the execution of an instruction in one instruction cycle is the hardware multiplier. The input data are 16-bit quantities and the 32-bit output data are extended to 40 bits in order to facilitate adding to the current value in the accumulator. If this multiplier did not exist, multiplying 16-bit quantities would require 16 instruction cycles, which would be unacceptably long for many digital signal processing applications.
The values to be multiplied are fed simultaneously to the inputs of the multiplier via the X and Y data buses. The output of the multiplier is fed to the 40-bit extension block, which retains the sign.
By multiplying two 16-bit values one obtains a 32-bit value. However, the aim is not only to multiply but also to calculate the sum of the partial products for the whole array. Therefore, multiplying is only one part and the result is a partial sum. The number of partial sums corresponds to the length of the array. It follows that 32 bits will not be sufficient for saving the result because the probability of overflow is high. For this reason the accumulated value has to be extended. The adopted extension is 8 bits, resulting in a total of 40 bits.
In the worst case, when all partial sums have the maximum 32-bit value, 256 partial sums can be added before overflow. This means that the maximum length of an array whose products all have the maximum value is 256. For most applications this is sufficient, and since such arrays are very rare, the permissible array lengths are in practice several times longer.
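The figure of 256 can be verified by simple arithmetic. The sketch below assumes signed values, so the largest 32-bit partial sum is 0x7FFFFFFF and the largest 40-bit accumulator value is 0x7FFFFFFFFF; the names are chosen for the illustration only.

```python
ACC_MAX = (1 << 39) - 1      # largest signed 40-bit value, 0x7FFFFFFFFF
PRODUCT_MAX = (1 << 31) - 1  # largest signed 32-bit value, 0x7FFFFFFF


def max_terms_without_overflow(term=PRODUCT_MAX, limit=ACC_MAX):
    """Number of equal positive terms that fit below the accumulator limit."""
    return limit // term


worst_case_length = max_terms_without_overflow()
```

Adding one more maximum-value term after these 256 would exceed the 40-bit range.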
Depending on the values of individual bits in the CORCON register, the multiplication may be carried out with signed or unsigned integers or with signed or unsigned fractionals in the 1.15 format (1 bit for sign and 15 bits for value). The most often used format is fractional (1.15 radix).
The barrel shifter serves for automatically shifting the values from the multiplier or accumulator by an arbitrary number of bits to the right or to the left. This operation is often required when scaling a partial or the total sum. The barrel shifter is added in order to simplify the code. Shifting the values is carried out in parallel with the execution of the instruction.
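The signed 1.15 fractional format can be illustrated by converting between real numbers and 16-bit words. This is a minimal sketch; the helper names are hypothetical, and clamping out-of-range inputs to the nearest representable value is a choice made for the example, not a statement about the hardware.

```python
def to_q15(x):
    """Encode a real number as a signed 1.15 word (clamped to the range)."""
    raw = int(round(x * 32768))
    raw = max(-32768, min(32767, raw))  # representable range is [-1, 1 - 2**-15]
    return raw & 0xFFFF


def from_q15(word):
    """Decode a 16-bit signed 1.15 word to a real number."""
    value = word - 0x10000 if word & 0x8000 else word
    return value / 32768


half = to_q15(0.5)          # 0x4000
smallest = from_q15(0x0001) # one step of the format, 2**-15
```

The sign bit has the weight -1 and the remaining 15 bits have the weights 2^-1 to 2^-15, so 0x4000 stands for 0.5 and 0x8000 for -1.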
The input to the barrel shifter is 40-bit and the output may be 40-bit or 16-bit. If the value from the accumulator is scaled and the result should be fed back to the accumulator, then the output is 40-bit. It is possible to scale the result from the accumulator and save the obtained value in the memory as the final result of the calculation.
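The scaling performed by the barrel shifter can be modelled as an arithmetic shift of a 40-bit value. The sketch below follows the sign convention given for the SFTAC instruction in Table 11-2 (positive count shifts right, negative count shifts left); wrapping the result to 40 bits without saturation is an assumption of the example.

```python
MASK40 = (1 << 40) - 1


def to_signed40(value):
    """Interpret the low 40 bits of value as a two's complement number."""
    value &= MASK40
    return value - (1 << 40) if value & (1 << 39) else value


def shift_acc(acc40, count):
    """Arithmetic shift of a 40-bit accumulator value.

    count > 0 shifts right (sign preserved), count < 0 shifts left.
    """
    value = to_signed40(acc40)
    shifted = value >> count if count >= 0 else value << -count
    return shifted & MASK40


scaled_down = shift_acc(0x0000010000, 4)   # divide by 16
scaled_up = shift_acc(0x0000010000, -4)    # multiply by 16
```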
A part of the DSP module is a 40-bit adder which may save the result in one of the two 40-bit accumulators. The activation of the saturation logic is optional. The adder is required for accumulation of the partial sums. Adding or subtracting of the partial sums is performed automatically as a part of DSP instructions; no additional code is required, which allows extremely short signal processing times.
An example which illustrates the significance of an independent hardware adder is the previous example of the MAC instruction.
Example:
MAC W4*W6, A, [W8]+=2, W4, [W10]+=2, W6
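The instruction of this example multiplies the values in W4 and W6, adds the product to the accumulator A, loads the next two array elements to W4 and W6 from the addresses in W8 and W10, and increases both addresses by 2.
It is important that this DSP instruction is executed in one instruction cycle! This means that the whole algorithm for calculating the sum of products consists of loading the arrays to the memory, adjusting the parameters of the DSP module (format, positions of the arrays, etc.) and then calling the above instruction the corresponding number of times.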
Example:
CLR A
REPEAT #20
MAC W4*W6, A, [W8]+=2, W4, [W10]+=2, W6
The result of the execution of the above program is calling the MAC instruction 21 times (REPEAT #20 means that the next instruction will be executed 20+1 times). If the first array elements have been loaded to the registers W4 and W6 and the initial addresses in the data memory or extended data memory (PSV management) loaded to the registers W8 and W10 before the execution of the program, then, upon completion of the REPEAT loop, the accumulator will contain the sum of products of 21 elements of the two arrays.
This section of the code occupies 3 locations in the program memory and takes 23 instruction cycles (1 for CLR, 1 for REPEAT and 21 for MAC). If the device clock is 80 MHz (20 MHz instruction clock), the program will be executed in 23*50 = 1150 ns!
It should be noted that this would not be possible without an independent adder which can operate simultaneously with the multiplier and other parts of the DSP module. Without it there would be no parallelism in the execution of instructions, and the execution of one DSP instruction would last at least one cycle longer.
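The behaviour of the CLR, REPEAT and MAC sequence can be followed step by step in a small Python model. It is a sketch under stated assumptions: integer mode, one cycle per instruction as described in the text, the first operands already in W4 and W6, and W8 and W10 pointing to the second elements so that no element is used twice. The array indices stand in for the addresses.

```python
def sum_of_products(x, y, repeat_literal):
    """Model `CLR A`, `REPEAT #lit`, `MAC W4*W6, A, [W8]+=2, W4, [W10]+=2, W6`.

    Returns (accumulator, instruction_cycles).
    """
    count = repeat_literal + 1          # REPEAT #n executes the MAC n+1 times
    if len(x) < count or len(y) < count:
        raise ValueError("arrays are shorter than the number of MACs")
    w4, w6 = x[0], y[0]                 # first operands preloaded
    ix = iy = 1                         # pointers to the second elements
    acc = 0                             # CLR A
    for _ in range(count):
        acc += w4 * w6                  # multiply and accumulate
        w4 = x[ix] if ix < len(x) else 0  # last prefetch falls past the data
        w6 = y[iy] if iy < len(y) else 0
        ix += 1
        iy += 1
    cycles = 1 + 1 + count              # CLR + REPEAT + MACs
    return acc, cycles


acc, cycles = sum_of_products(list(range(1, 22)), [2] * 21, 20)
time_ns = cycles * 50                   # 20 MHz instruction clock
```

The last MAC of the loop still performs a prefetch, so on a real device the pointers end up one element past the data that was actually used.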
Two 40-bit accumulators for saving the partial sums are available. These are accumulator A and accumulator B (ACCA and ACCB). The accumulators are mapped in the data memory and occupy 3 memory locations each. The addresses and distribution of the accumulator bits are given at the end of this chapter.
Data accumulation is carried out with 40-bit data. The architecture of the dsPIC30F devices, however, is 16-bit, meaning that a conversion to 16-bit values has to be done. The higher number of bits used for calculating the sums by DSP instructions is intended to increase the accuracy and the operating range of values and so reduce the error owing to the finite word length. The purpose of individual bits within the accumulator is presented in Fig. 11-4.
Fig. 11-4 Purpose of individual bits within accumulator
As can be seen from Fig. 11-4, the upper 8 bits serve for extending the range and the lower 16 bits for increasing the operational accuracy. Increasing the range is sometimes useful as an intermediate step, but the end result should not overrun the basic range. If this occurs, the result will not be correct. The consequences can be mitigated by enabling the saturation logic, but not completely neutralized.
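The division of the 40-bit accumulator into the three 16-bit memory locations can be sketched as follows. The function name is hypothetical; the sign extension of the upper word follows the SE field shown in the register tables at the end of this chapter.

```python
def split_accumulator(acc40):
    """Split a 40-bit accumulator value into (upper, high, low) 16-bit words.

    upper: bits 39..32, sign-extended to 16 bits (range extension)
    high:  bits 31..16 (the basic range, i.e. the 16-bit result)
    low:   bits 15..0  (accuracy increase bits)
    """
    acc40 &= (1 << 40) - 1
    low = acc40 & 0xFFFF
    high = (acc40 >> 16) & 0xFFFF
    upper = (acc40 >> 32) & 0xFF
    sign_extension = 0xFF00 if upper & 0x80 else 0x0000
    return sign_extension | upper, high, low


words = split_accumulator(0x007FFF8000)  # (0x0000, 0x7FFF, 0x8000)
```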
After the calculation is completed, the result has to be saved as a 16-bit quantity. In order to do that, the accuracy of the result has to be reduced. To do this automatically, the DSP module includes a block which rounds off the result during an accumulator write back. From Fig. 11-1 it can be seen that the round logic is placed between the accumulator and the X data bus. By using a separate instruction (SAC.R), the result from the accumulator is rounded off in the mode selected in the CORCON register and saved in the desired location in the data memory.
The round logic can perform a conventional (biased) or convergent (unbiased) round function.
The conventional round function implies the following. If the MS bit of the accuracy increase bits (bit 15) is logic one, the result is incremented by one step. One step is the smallest positive value that can be saved in the selected format: 1 for integers, or 2^-15 = 0.000030517578125 for the 1.15 fractional format. A consequence of this algorithm is that, over a succession of random rounding operations, the value tends to be biased slightly positive.
The convergent round function, assuming that bit 16 is effectively random in nature, removes any rounding bias that may accumulate. If the convergent round function is enabled, the LS bit of the result (bit 16) is incremented by 1 if the value saved in the 16 accuracy increase bits is greater than 0x8000, or if this value is equal to 0x8000 and bit 16 is one. This algorithm can readily be explained by using integers. If the middle 16 bits contain an integer, the rounding algorithm will tend towards even integers. This is demonstrated by the examples in Table 11-1.
| Value to be rounded | The result | Binary form of the value to be rounded | Binary form of the result |
|---|---|---|---|
| 12.75 | 13 | 0000 0000 0000 0000 0000 1100 1100 0000 0000 0000 | 0000 0000 0000 1101 |
| 12.5 | 12 | 0000 0000 0000 0000 0000 1100 1000 0000 0000 0000 | 0000 0000 0000 1100 |
| 13.5 | 14 | 0000 0000 0000 0000 0000 1101 1000 0000 0000 0000 | 0000 0000 0000 1110 |
Table 11-1 Convergent round function.
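Both round functions can be compared with a short Python model that works on the 40-bit accumulator value and returns the rounded middle 16 bits. It is only a sketch of the rules described above; the function name and the boolean parameter are chosen for the illustration.

```python
def round_accumulator(acc40, convergent=True):
    """Round a 40-bit accumulator value to its middle 16 bits (bits 31..16)."""
    acc40 &= (1 << 40) - 1
    upper = acc40 >> 16           # bits 39..16, bit 16 is the LS bit of the result
    fraction = acc40 & 0xFFFF     # the 16 accuracy increase bits
    if fraction > 0x8000:
        upper += 1                # above one half: always round up
    elif fraction == 0x8000:
        # Exactly one half: conventional rounding always rounds up,
        # convergent rounding rounds up only if bit 16 is one (odd value).
        if not convergent or (upper & 1):
            upper += 1
    return upper & 0xFFFF


# The three rows of Table 11-1 (integer part in bits 31..16).
results = [round_accumulator((12 << 16) | 0xC000),   # 12.75 -> 13
           round_accumulator((12 << 16) | 0x8000),   # 12.5  -> 12
           round_accumulator((13 << 16) | 0x8000)]   # 13.5  -> 14
```

With conventional rounding the value 12.5 would be rounded to 13 instead of 12, which is the source of the slight positive bias.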
The convergent round function usually gives better results and its use is recommended.
When calculating a sum of products of two arrays comprising many elements (more than 256), there is a risk of exceeding the range. In this case, the obtained value is not only inaccurate but also of the opposite sign. These sudden changes of values of a signal (known as glitches) are easily recognized because they violate the characteristics of a signal.
The consequences can be mitigated if the saturation logic is enabled. If an overrun occurs while executing the current instruction, the hardware saturation logic will load the maximum positive or maximum negative value to the operating accumulator, depending on the previous value loaded to the accumulator. In this way the consequences of a range overrun are mitigated. Fig. 11-6 shows the case of an output sinusoidal signal when an overrun occurred, without and with enabled saturation logic.
Fig. 11-6 Consequences of range overrun without (left) and with (right) enabled saturation logic
The figure shows that if an overrun occurs and the saturation logic is not enabled, the consequences are far greater than when the saturation logic is enabled. In the first case a glitch appears which considerably violates the characteristics of the signal. In the second case, owing to the enabled saturation logic, the only consequence is the unwanted clipping of the crest of the sinusoidal signal, which is much better compared to the first case. With the saturation logic enabled, a lesser overrun corresponds to a lesser consequence, whereas with the saturation logic disabled this does not apply.
There are three modes of operation of the saturation logic: accumulator 39-bit saturation, accumulator 31-bit saturation and write-back saturation. In the first case, overrun is allowed until the MS bit (corresponding to the sign in signed operations) is overrun. This is the optimistic version, because it assumes that by the end of the calculation the signal will have returned to the permitted range. The reason is that the 8 MS bits are the range extension and are very seldom used, so the middle 16 bits contain the final result. This mode is enabled by writing logic one to the ACCSAT bit (CORCON register, bit 4).
The pessimistic version is to enable the saturation logic for 31 bits, so that the accumulated value must not overrun the range at any time during the calculation of the sum of products. This mode is enabled by writing logic zero to the ACCSAT bit (CORCON register, bit 4). If the saturation logic detects that the current instruction could cause an overrun, the maximum positive value (0x007FFFFFFF) is written to the operating accumulator (A or B) if the accumulator contains a positive value, or the minimum negative value (0xFF80000000) if the accumulator contains a negative value.
If the saturation logic is enabled, each overrun sets the SA bit (register SR, bit 13) for the accumulator A or the SB bit (register SR, bit 12) for the accumulator B. The saturation logic for the accumulator A is enabled by setting the SATA bit (CORCON register, bit 7) to logic one. Similarly, the saturation logic for the accumulator B is enabled by setting the SATB bit (CORCON register, bit 6) to logic one.
The third mode of the saturation logic tests the overrun while writing the result from the operating accumulator to a general purpose register (W0...W15). The advantage of this approach is that during calculations it allows using the full range offered by the accumulator (all 40 bits). This logic is active only when executing the SAC and SAC.R instructions and only if the SATDW bit (register CORCON, bit 5) is set to logic one. For values greater than 0x007FFFFFFF, the value 0x7FFF is written to the memory (general purpose registers are a part of the data memory). Similarly, for values smaller than 0xFF80000000, the value 0x8000 is written to the memory, representing the smallest negative number that can be expressed by 16 bits.
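The two accumulator saturation modes can be compared with a small Python model. The sketch assumes that the function receives the exact mathematical result of the accumulation as an ordinary integer; it returns the 40-bit value that would be left in the accumulator together with a flag corresponding to the SA or SB bit. The names are hypothetical.

```python
MASK40 = (1 << 40) - 1


def saturate_accumulator(value, accsat):
    """Clamp an exact result to the range selected by the ACCSAT bit.

    accsat = 0: 1.31 (31-bit) saturation, accsat = 1: 9.31 (39-bit) saturation.
    Returns (accumulator as a 40-bit pattern, saturation flag).
    """
    bits = 40 if accsat else 32
    highest = (1 << (bits - 1)) - 1
    lowest = -(1 << (bits - 1))
    saturated = value > highest or value < lowest
    clamped = max(lowest, min(highest, value))
    return clamped & MASK40, saturated


normal = saturate_accumulator(0x80000000, accsat=0)  # (0x007FFFFFFF, True)
super_ = saturate_accumulator(0x80000000, accsat=1)  # (0x0080000000, False)
```

The same intermediate value is clipped in the 31-bit mode but kept in the 39-bit mode, which illustrates the difference between the pessimistic and the optimistic version.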
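The write-back saturation can be sketched in the same way. The model below covers only the range test and the extraction of the middle 16 bits; the optional shift of the SAC instruction and the rounding of SAC.R are deliberately left out, and the function name is chosen for the illustration.

```python
def sac_write_back(acc40, satdw=True):
    """Return the 16-bit word written by a SAC-like accumulator store.

    With satdw set, values outside the 1.31 range are replaced by
    0x7FFF or 0x8000; otherwise bits 31..16 are written unchanged.
    """
    acc40 &= (1 << 40) - 1
    value = acc40 - (1 << 40) if acc40 & (1 << 39) else acc40
    if satdw:
        if value > 0x007FFFFFFF:
            return 0x7FFF          # largest positive 16-bit value
        if value < -0x80000000:
            return 0x8000          # smallest negative 16-bit value
    return (value >> 16) & 0xFFFF


clipped = sac_write_back(0x0100000000)               # 0x7FFF
unclipped = sac_write_back(0x0100000000, satdw=False)  # 0x0000, a glitch
```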
To use the DSP module in an optimum way, it is necessary to know all DSP instructions. The list of DSP instructions, including the parameter description and the operation of each instruction, is presented in Table 11-2.
| Instruction | Instruction and parameters | Parameter description | Operation description |
|---|---|---|---|
| MAC | MAC Wm*Wn, Acc | Wm – W4 or W5; Wn – W6 or W7; Acc – A or B accumulator | Values of the Wm and Wn registers are multiplied and added to the current value in the operating accumulator (A or B). |
| MAC | MAC Wm*Wn, Acc, [Wx], Wxd, [Wy], Wyd | Wm – W4 or W5; Wn – W6 or W7; Acc – A or B accumulator; Wx – W8 or W9; Wxd – W4 or W5; Wy – W10 or W11; Wyd – W6 or W7 | Values of the Wm and Wn registers are multiplied and added to the current value in the operating accumulator (A or B), from the address pointed by the Wx register the value is read and written to the Wxd register, from the address pointed by the Wy register the value is read and written to the Wyd register. |
| MAC | MAC Wm*Wn, Acc, [Wx]+=kx, Wxd, [Wy]+=ky, Wyd | Wm – W4 or W5; Wn – W6 or W7; Acc – A or B accumulator; Wx – W8 or W9; Wxd – W4 or W5; kx – (-6, -4, -2, 2, 4, 6); Wy – W10 or W11; Wyd – W6 or W7; ky – (-6, -4, -2, 2, 4, 6) | Values of the Wm and Wn registers are multiplied and added to the current value in the operating accumulator (A or B), from the address pointed by the Wx register the value is read and written to the Wxd register, from the address pointed by the Wy register the value is read and written to the Wyd register, the Wx register value is increased by kx, the Wy register value is increased by ky. |
| MOVSAC | MOVSAC Acc, [Wx], Wxd, [Wy], Wyd, AWB | Acc – A or B accumulator; Wx – W8 or W9; Wxd – W4 or W5; Wy – W10 or W11; Wyd – W6 or W7; AWB – W13 (Acc write-back) | The value from the operating accumulator is saved in the register W13 (AWB - accumulator write back), from the address pointed by the register Wx the value is read and written to the register Wxd, from the address pointed by the register Wy the value is read and written to the register Wyd. |
| MPY | MPY Wm*Wn, Acc | Wm – W4 or W5; Wn – W6 or W7; Acc – A or B accumulator | The values in the Wm and Wn registers are multiplied and written to the operating accumulator. |
| MPY | MPY Wm*Wn, Acc, [Wx], Wxd, [Wy], Wyd | Wm – W4 or W5; Wn – W6 or W7; Acc – A or B accumulator; Wx – W8 or W9; Wxd – W4 or W5; Wy – W10 or W11; Wyd – W6 or W7 | The values of the Wm and Wn registers are multiplied and written to the accumulator (A or B), from the address pointed by the register Wx the value is read and written to the register Wxd, from the address pointed by the register Wy the value is read and written to the register Wyd. |
| MPY | MPY Wm*Wn, Acc, [Wx]+=kx, Wxd, [Wy]+=ky, Wyd | Wm – W4 or W5; Wn – W6 or W7; Acc – A or B accumulator; Wx – W8 or W9; Wxd – W4 or W5; kx – (-6, -4, -2, 2, 4, 6); Wy – W10 or W11; Wyd – W6 or W7; ky – (-6, -4, -2, 2, 4, 6) | The values of the Wm and Wn registers are multiplied and written to the operating accumulator (A or B), from the address pointed by the Wx register the value is read and written to the Wxd register, from the address pointed by the Wy register the value is read and written to the Wyd register, the Wx register value is increased by kx, the Wy register value is increased by ky. |
| MPY | MPY Wm*Wn, Acc, [Wx]-=kx, Wxd, [Wy]-=ky, Wyd | Wm – W4 or W5; Wn – W6 or W7; Acc – A or B accumulator; Wx – W8 or W9; Wxd – W4 or W5; kx – (-6, -4, -2, 2, 4, 6); Wy – W10 or W11; Wyd – W6 or W7; ky – (-6, -4, -2, 2, 4, 6) | The values of the Wm and Wn registers are multiplied and written to the operating accumulator (A or B), from the address pointed by the Wx register the value is read and written to the Wxd register, from the address pointed by the Wy register the value is read and written to the Wyd register, the Wx register value is decreased by kx, the Wy register value is decreased by ky. |
| MSC | MSC Wm*Wn, Acc, [Wx], Wxd, [Wy], Wyd | Wm – W4 or W5; Wn – W6 or W7; Acc – A or B accumulator; Wx – W8 or W9; Wxd – W4 or W5; Wy – W10 or W11; Wyd – W6 or W7 | The values of the Wm and Wn registers are multiplied and subtracted from the current value in the operating accumulator (A or B), from the address pointed by the Wx register the value is read and written to the Wxd register, from the address pointed by the Wy register the value is read and written to the Wyd register. |
| MSC | MSC Wm*Wn, Acc, [Wx]+=kx, Wxd, [Wy]+=ky, Wyd | Wm – W4 or W5; Wn – W6 or W7; Acc – A or B accumulator; Wx – W8 or W9; Wxd – W4 or W5; kx – (-6, -4, -2, 2, 4, 6); Wy – W10 or W11; Wyd – W6 or W7; ky – (-6, -4, -2, 2, 4, 6) | The values of the Wm and Wn registers are multiplied and subtracted from the current value in the operating accumulator (A or B), from the address pointed by the Wx register the value is read and written to the Wxd register, from the address pointed by the Wy register the value is read and written to the Wyd register, the Wx register value is increased by kx, the Wy register value is increased by ky. |
| MSC | MSC Wm*Wn, Acc, [Wx]-=kx, Wxd, [Wy]-=ky, Wyd | Wm – W4 or W5; Wn – W6 or W7; Acc – A or B accumulator; Wx – W8 or W9; Wxd – W4 or W5; kx – (-6, -4, -2, 2, 4, 6); Wy – W10 or W11; Wyd – W6 or W7; ky – (-6, -4, -2, 2, 4, 6) | The values of the Wm and Wn registers are multiplied and subtracted from the current value in the operating accumulator (A or B), from the address pointed by the Wx register the value is read and written to the Wxd register, from the address pointed by the Wy register the value is read and written to the Wyd register, the Wx register value is decreased by kx, the Wy register value is decreased by ky. |
| NEG | NEG Acc | Acc – A or B (operating accumulator) | Acc ← -Acc, the sign of the current value in the accumulator is changed, analogous to multiplying the value in the operating accumulator by –1. |
| REPEAT | REPEAT #lit14 | #lit14 – 14-bit unsigned value (0...16383) | The instruction following REPEAT will be executed #lit14+1 times. Even though this is not a DSP instruction, it is very often used together with DSP instructions. |
| REPEAT | REPEAT Wn | Wn – W0...W15 | The instruction following REPEAT will be executed Wn+1 times. Even though this is not a DSP instruction, it is very often used together with DSP instructions. |
| SAC | SAC Acc, {#Slit4,} Wd | Acc – A or B accumulator; {#Slit4,} – optional 4-bit constant; Wd – W0...W15 | If the optional 4-bit constant is specified, the accumulator value is shifted to the right for a positive value of the constant or to the left if the constant is negative. Then, the obtained value is loaded to Wd. |
| SAC | SAC Acc, {#Slit4,} [Wd] | Acc – A or B accumulator; {#Slit4,} – optional 4-bit constant; Wd – W0...W15 | If the optional 4-bit constant is specified, the accumulator value is shifted to the right for a positive value of the constant or to the left if the constant is negative. Then, the obtained value is loaded to the address in the data memory pointed by the Wd register. |
| SAC | SAC Acc, {#Slit4,} [Wd++] | Acc – A or B accumulator; {#Slit4,} – optional 4-bit constant; Wd – W0...W15 | If the optional 4-bit constant is specified, the accumulator value is shifted to the right for a positive value of the constant or to the left if the constant is negative. Then, the obtained value is loaded to the address in the data memory pointed by the Wd register. After the memory write, the value of the register Wd is incremented by 2. |
| SAC | SAC Acc, {#Slit4,} [Wd--] | Acc – A or B accumulator; {#Slit4,} – optional 4-bit constant; Wd – W0...W15 | If the optional 4-bit constant is specified, the accumulator value is shifted to the right for a positive value of the constant or to the left if the constant is negative. Then, the obtained value is loaded to the address in the data memory pointed by the Wd register. After the memory write, the value of the register Wd is decremented by 2. |
| SAC | SAC Acc, {#Slit4,} [++Wd] | Acc – A or B accumulator; {#Slit4,} – optional 4-bit constant; Wd – W0...W15 | If the optional 4-bit constant is specified, the accumulator value is shifted to the right for a positive value of the constant or to the left if the constant is negative. Then, the value of the register Wd is incremented by 2 and the value obtained by shifting is saved in the address pointed by the Wd register. |
| SAC | SAC Acc, {#Slit4,} [--Wd] | Acc – A or B accumulator; {#Slit4,} – optional 4-bit constant; Wd – W0...W15 | If the optional 4-bit constant is specified, the accumulator value is shifted to the right for a positive value of the constant or to the left if the constant is negative. Then, the value of the register Wd is decremented by 2 and the value obtained by shifting is saved in the address pointed by the Wd register. |
| SAC.R | The same as for SAC | The same as for SAC | The same as for the SAC instruction except that the value from the accumulator is rounded by the conventional or convergent mode. |
| SFTAC | SFTAC Acc, #Slit6 | #Slit6 – 6-bit constant | Shift the value in the accumulator by #Slit6 bits. If the constant is positive, shifting is to the right, otherwise to the left. |
| SFTAC | SFTAC Acc, Wd | Wd – W0...W15 | Shift the value in the accumulator by Wd bits. If the value in the register Wd is positive, shifting is to the right, otherwise to the left. |
| CLR | CLR Acc | Acc – A or B accumulator | The value in the operating accumulator is set to zero. |
| CLR | CLR Acc, [Wx], Wxd, [Wy], Wyd | Acc – A or B accumulator; Wx – W8 or W9; Wxd – W4 or W5; Wy – W10 or W11; Wyd – W6 or W7 | The value in the operating accumulator is set to zero. From the address in the data memory pointed by Wx the value is read and written to the register Wxd. From the address in the data memory pointed by Wy the value is read and written to the register Wyd. |
| CLR | CLR Acc, [Wx]+=kx, Wxd, [Wy]+=ky, Wyd | Acc – A or B accumulator; Wx – W8 or W9; Wxd – W4 or W5; kx – (-6, -4, -2, 2, 4, 6); Wy – W10 or W11; Wyd – W6 or W7; ky – (-6, -4, -2, 2, 4, 6) | The value in the operating accumulator is set to zero. From the address in the data memory pointed by Wx the value is read and written to the register Wxd. From the address in the data memory pointed by Wy the value is read and written to the register Wyd. The Wx register value is increased by kx, the Wy register value is increased by ky. |
Table 11-2 List of DSP instructions with description of operations and parameters.
Table 11-2 shows that some instructions (such as MAC) have more than one form. Not all versions of the instructions have been described; the emphasis was put on the most frequently used versions, in order to illustrate the way of thinking when using DSP instructions.
The structures of individual registers of the DSP module are given in Tables 11-3 to 11-6.
NOTE: Reading of bits which have not been allocated any function gives '0'.
| name | ADR | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | Reset State |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CORCON | 0x0044 | - | - | - | US | EDT | DL<2> | DL<1> | DL<0> | SATA | SATB | SATDW | ACCSAT | IPL3 | PSV | RND | IF | 0x0020 |
Table 11-3 Description of the CORCON register
US – DSP multiply unsigned/signed control bit
(1 – unsigned multiplication, 0 – signed multiplication)
EDT – Early DO loop termination control bit. This bit will always read as '0'.
1 – Terminate executing DO loop at the end of current loop iteration
0 – No effect
DL<2:0> – DO loop nesting level status bits
111 – 7 nested DO loops active
110 – 6 nested DO loops active
...
001 – 1 nested DO loop active
000 – 0 DO loops active
SATA – AccA saturation enable bit
1 – Accumulator A saturation enabled
0 – Accumulator A saturation disabled
SATB – AccB saturation enable bit
1 – Accumulator B saturation enabled
0 – Accumulator B saturation disabled
SATDW – Data space write from DSP engine saturation enable bit
1 – Data space write saturation enabled
0 – Data space write saturation disabled
ACCSAT – Accumulator saturation mode select bit
1 – 9.31 saturation (super saturation)
0 – 1.31 saturation (normal saturation)
IPL3 – CPU interrupt priority level status bit
1 – CPU interrupt priority level is greater than 7
0 – CPU interrupt priority level is 7 or less
PSV – Program space visibility in data space enable bit
(1 – PSV visible in data space, 0 – PSV not visible in data space)
RND – Rounding mode select bit
(1- conventional rounding enabled, 0 – convergent rounding enabled)
IF – Integer or fractional multiplier mode select bit
(1 – integer mode enabled, 0 – fractional mode enabled (1.15 radix))
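The bit positions listed above can be collected in a small Python helper for composing and inspecting CORCON values, e.g. when checking a configuration on a PC before writing it to the device. This is a sketch only; the helper names are hypothetical and the positions are taken from Table 11-3.

```python
# Field name -> (position of the lowest bit, width in bits), from Table 11-3.
CORCON_FIELDS = {
    "US": (12, 1), "EDT": (11, 1), "DL": (8, 3),
    "SATA": (7, 1), "SATB": (6, 1), "SATDW": (5, 1), "ACCSAT": (4, 1),
    "IPL3": (3, 1), "PSV": (2, 1), "RND": (1, 1), "IF": (0, 1),
}


def decode_corcon(value):
    """Return a dict with the value of every CORCON field."""
    return {name: (value >> shift) & ((1 << width) - 1)
            for name, (shift, width) in CORCON_FIELDS.items()}


def encode_corcon(**fields):
    """Build a CORCON value from keyword arguments such as SATA=1, PSV=1."""
    value = 0
    for name, field_value in fields.items():
        shift, width = CORCON_FIELDS[name]
        if field_value < 0 or field_value >> width:
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        value |= field_value << shift
    return value


reset_state = decode_corcon(0x0020)   # only SATDW is set after reset
dsp_setup = encode_corcon(SATA=1, SATB=1, PSV=1)
```

Decoding the reset state 0x0020 shows that only the SATDW bit is set, i.e. after reset the device uses signed fractional multiplication with convergent rounding and write-back saturation.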
| name | ADR | 47 | 46 | 45 | 44 | 43 | 42 | 41 | 40 | 39 | 38 | 37 | 36 | 35 | 34 | 33 | 32 | Reset State |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACCAU | 0x0026 | SE | SE | SE | SE | SE | SE | SE | SE | ACCAU | ACCAU | ACCAU | ACCAU | ACCAU | ACCAU | ACCAU | ACCAU | 0x0000 |
Table 11-4a Description of the ACCA register
| name | ADR | 31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | Reset State |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACCAH | 0x0024 | ACCAH | 0x0000 | |||||||||||||||
Table 11-4b Description of the ACCA register
| name | ADR | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | Reset State |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
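| ACCAL | 0x0022 | ACCAL | ACCAL | ACCAL | ACCAL | ACCAL | ACCAL | ACCAL | ACCAL | ACCAL | ACCAL | ACCAL | ACCAL | ACCAL | ACCAL | ACCAL | ACCAL | 0x0000 |
SE – Sign extension for AccA accumulator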
Table 11-4c Description of the ACCA register
| name | ADR | 47 | 46 | 45 | 44 | 43 | 42 | 41 | 40 | 39 | 38 | 37 | 36 | 35 | 34 | 33 | 32 | Reset State |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACCBU | 0x002C | SE | SE | SE | SE | SE | SE | SE | SE | ACCBU | ACCBU | ACCBU | ACCBU | ACCBU | ACCBU | ACCBU | ACCBU | 0x0000 |
Table 11-5a Description of the ACCB register
| name | ADR | 31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | Reset State |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ACCBH | 0x002A | ACCBH | 0x0000 | |||||||||||||||
Table 11-5b Description of the ACCB register
| name | ADR | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | Reset State |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
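| ACCBL | 0x0028 | ACCBL | ACCBL | ACCBL | ACCBL | ACCBL | ACCBL | ACCBL | ACCBL | ACCBL | ACCBL | ACCBL | ACCBL | ACCBL | ACCBL | ACCBL | ACCBL | 0x0000 |
SE – Sign extension for AccB accumulator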
Table 11-5c Description of the ACCB register
| name | ADR | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 | Reset State |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SR | 0x0042 | OA | OB | SA | SB | OAB | SAB | DA | DC | IPL<2> | IPL<1> | IPL<0> | RA | N | OV | Z | C | 0x0000 |
Table 11-6 Description of the SR register
OA – Accumulator A overflow status bit
(1 – accumulator A overflowed, 0 – accumulator A has not overflowed)
OB - Accumulator B overflow status bit
(1 – accumulator B overflowed, 0 – accumulator B has not overflowed)
SA – Accumulator A saturation ‘sticky’ status bit.
This bit can be cleared or read but not set to ‘1’.
1 – accumulator A is saturated or has been saturated at some time
0 – accumulator A is not saturated
SB – Accumulator B saturation ‘sticky’ status bit.
This bit can be cleared or read but not set to ‘1’.
1 – accumulator B is saturated or has been saturated at some time
0 – accumulator B is not saturated
OAB – OA || OB combined accumulator overflow status bit
1 – accumulator A or B has overflowed
0 – neither accumulator A nor B has overflowed
SAB – SA || SB combined accumulator ‘sticky’ status bit
1 – accumulator A or B is saturated or has been saturated at some time
0 – neither accumulator A nor B is saturated
DA – DO loop active bit
(1 – DO loop in progress, 0 – DO loop not in progress)
DC – MCU ALU half carry/borrow bit
1 – a carry-out from the 4th low-order bit (8-bit operations) or 8th low-order bit (16-bit operations) of the result occurred
0 – no carry-out from the 4th low-order bit (8-bit operations) or 8th low-order bit (16-bit operations) of the result occurred
IPL<2:0> – CPU interrupt priority level status bits.
These bits are concatenated with the IPL<3> bit (CORCON<3>) to form
the CPU interrupt priority level. The value in parentheses applies when IPL<3> = 1.
111 – CPU interrupt priority level is 7 (15). User interrupts disabled.
110 – CPU interrupt priority level is 6 (14).
...
001 – CPU interrupt priority level is 1 (9).
000 – CPU interrupt priority level is 0 (8).
RA – REPEAT loop active bit
(1 – REPEAT loop in progress, 0 – REPEAT loop not in progress)
N – MCU ALU negative bit
(1 – result was negative, 0 – result was non-negative (zero or positive))
OV – MCU ALU overflow bit (1 – overflow occurred for signed arithmetic,
0 – no overflow occurred). This bit is used for signed arithmetic (2’s complement).
It indicates an overflow of the magnitude which causes the sign bit to
change state.
Z – MCU ALU Zero bit (1 – a zero result, 0 – a non-zero result)
C – MCU ALU carry/borrow bit (1 – a carry-out from the MS bit of the result occurred,
0 – no carry-out from the MS bit of the result occurred)
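The way the IPL<2:0> bits of the SR register and the IPL3 bit of the CORCON register form one 4-bit priority level can be illustrated by a short host-side sketch. The function name `cpu_priority_level` is hypothetical; the bit positions (SR<7:5> and CORCON<3>) are taken from Tables 11-6 and 11-3.

```c
#include <stdint.h>

/* CPU interrupt priority level (0..15) built from register values.
   IPL3 is CORCON<3>; IPL<2:0> are SR<7:5>. */
static unsigned cpu_priority_level(uint16_t corcon, uint16_t sr)
{
    unsigned ipl3 = (corcon >> 3) & 1u;   /* most significant bit    */
    unsigned ipl  = (sr >> 5) & 7u;       /* three low-order bits    */
    return (ipl3 << 3) | ipl;
}
```

For example, SR = 0x00E0 gives level 7 when IPL3 is cleared and level 15 when IPL3 is set.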

The example shows how the DSP module can be used for fast calculation of the sums of products of two arrays. The elements of the arrays are kept in the data memory. The X and Y spaces have been detailed in Chapter 8.
/* dsPIC30F6014A */
char i;
int arr1[20] absolute 0x0900; // array of 20 signed-integer elements in X space
int arr2[20] absolute 0x1900; // array of 20 signed-integer elements in Y space
                              // replace 0x1900 with 0x0D00 for dsPIC30F4013 MCU
void main() {
  TRISD = 0;                  // configure PORTD as output
  for (i=0; i<20; i++) {      // init arr1 and arr2
    arr1[i] = i+3;
    arr2[i] = 15-i;
  }
  CORCON = 0x00F1;            // signed computing, saturation for both Acc, integer computing
  asm {
    mov #@_arr1, W8           // W8 = @arr1, points to the first element of the array
    mov #@_arr2, W10          // W10 = @arr2, points to the first element of the array
    mov #0, W4                // clear W4
    mov #0, W6                // clear W6
    clr A                     // clear accumulator A
    repeat #20                // 21 iterations
    mac W4*W6, A, [W8]+=2, W4, [W10]+=2, W6 // AccA = sum(arr1*arr2)
    sftac A, #-16             // shift the result to the high word of AccA
    sac A, #0, W1             // W1 = sum(arr1*arr2)
  }
  LATD = W1;                  // LATD = sum(arr1*arr2)
}
Example 1 presents the program for calculating the sum of products of the array elements. The elements of arr1 are stored in the X space and the elements of arr2 in the Y space of the data memory.
At the start of the program the array elements are initialized.
for (i = 0; i < 20; i++) { // init arr1 and arr2
  arr1[i] = i+3;
  arr2[i] = 15-i;
}
Before the calculation starts, the DSP module has to be set for signed-integer computing. This is done by writing 0x00F1 to the CORCON register (CORCON = 0x00F1;), which sets the SATA, SATB, SATDW, ACCSAT and IF bits and leaves the US bit cleared. The saturation logic is thereby enabled for both accumulators (A and B), even though accumulator B will not be used. Table 11-3 gives the meanings of the individual bits of the CORCON register.
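The effect of the saturation bits can be sketched on a host computer. The model below keeps the accumulator value in a 64-bit integer and clamps it either to the 1.31 range (32-bit signed, ACCSAT = 0) or to the 9.31 range (40-bit signed, ACCSAT = 1), as listed in Table 11-3. The function name `saturate_acc` is hypothetical, and the sketch does not model the SA, SB, OA and OB status bits.

```c
#include <stdint.h>

/* Clamp an accumulator value to the range selected by ACCSAT.
   accsat == 0: 1.31 saturation, limits -2^31 .. 2^31 - 1
   accsat != 0: 9.31 saturation, limits -2^39 .. 2^39 - 1 */
static int64_t saturate_acc(int64_t value, int accsat)
{
    const int64_t max = accsat ? ((INT64_C(1) << 39) - 1)
                               : ((INT64_C(1) << 31) - 1);
    const int64_t min = -max - 1;

    if (value > max) return max;
    if (value < min) return min;
    return value;
}
```

With CORCON = 0x00F1 the ACCSAT bit is set, so in this model a value would be clamped only if it left the 40-bit range.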
The next step is writing the initial addresses (addresses of the first array elements) of the arr1 and arr2 arrays to the W8 and W10 registers, respectively. It has been decided to use the registers W8 and W10 for specifying the addresses of the next array elements and the registers W4 and W6 for the array elements being multiplied in the current iteration (partial sum).
mov #@_arr1, W8
mov #@_arr2, W10
Since the addresses of the first array elements are saved in the W8 and W10 registers, the process of multiplying and accumulating the partial sums can be started. Of course, the initial value of the accumulator A is set to zero by clr A.
The instruction MAC, which calculates a partial sum and adds it to the current accumulator value, has to be executed 21 times. This is why the repeat count is 20: REPEAT #n executes the following instruction n+1 times. The first partial sum will be zero since the values of the registers W4 and W6 are zero. The purpose of the first execution of the MAC instruction is to read the values of the first elements from the data memory and write them to the registers W4 and W6. After that, the instruction MAC is executed 20 times, calculating the partial sums which are accumulated in the accumulator A. The instruction MAC and the corresponding parameters are described in Table 11-2.
repeat #20
mac W4*W6, A, [W8]+=2, W4, [W10]+=2, W6 // AccA = sum(arr1*arr2)
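The interplay of multiplication and prefetch in this loop can be checked with a host-side model written in plain C. It is only a sketch of the data flow, not of the hardware: the names `mac_sum` and `example1_sum` are hypothetical, the saturation logic is not modelled, and the final prefetch (which on the device reads the location following each array and discards it) is skipped.

```c
#include <stdint.h>
#include <stddef.h>

/* Model of: clr A / repeat #n / mac W4*W6, A, [W8]+=2, W4, [W10]+=2, W6 */
static int64_t mac_sum(const int16_t *x, const int16_t *y, size_t n)
{
    int16_t w4 = 0, w6 = 0;        /* operand registers, cleared first */
    int64_t acc = 0;               /* clr A                            */
    size_t k;

    for (k = 0; k <= n; k++) {     /* repeat #n gives n + 1 passes     */
        acc += (int32_t)w4 * w6;   /* product of the previous prefetch */
        if (k < n) {               /* prefetch the next two operands   */
            w4 = x[k];
            w6 = y[k];
        }
    }
    return acc;
}

/* The arrays of Example 1: arr1[i] = i + 3, arr2[i] = 15 - i. */
int64_t example1_sum(void)
{
    int16_t arr1[20], arr2[20];
    int i;

    for (i = 0; i < 20; i++) {
        arr1[i] = (int16_t)(i + 3);
        arr2[i] = (int16_t)(15 - i);
    }
    return mac_sum(arr1, arr2, 20);
}
```

In the first pass both operands are zero, so nothing is accumulated; the remaining 20 passes add one product each. For the data of Example 1 the model returns 710.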
After the instruction MAC has been executed, the result is in the lower 16 bits of the accumulator A. The result could be read directly from AccAL, i.e. from address 0x0022 (see Table 11-4c), but it is regular practice to shift the result to AccAH, i.e. to shift it left by 16 bits, and then read it by using the instruction SAC. In this way the consequences of an overflow, if it occurs, will be mitigated. In this case no overflow will occur; nevertheless the result is read in the regular way.
The shift left by 16 bits is performed by the instruction:
SFTAC A, #-16
The instruction SFTAC with its parameters is described in Table 11-2. After the result has been shifted 16 places to the left, it is read and saved in the W1 register. This is done by the instruction:
SAC A, #0, W1
The instruction SAC with its parameters is described in Table 11-2.
NOTE: The instruction SAC reads the results from AccAH (see Fig. 11-4)
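The two steps, shifting the accumulator left by 16 bits and reading the word AccAH, can be modelled on a host computer as follows. This is a sketch under simplifying assumptions: the 40-bit accumulator is held in a 64-bit unsigned integer, saturation and rounding are not modelled, and the function names are hypothetical.

```c
#include <stdint.h>

/* Model of "sftac A, #-16": shift left by 16 and keep 40 bits. */
static uint64_t sftac_left16(uint64_t acc)
{
    return (acc << 16) & UINT64_C(0xFFFFFFFFFF);
}

/* Model of "sac A, #0, Wd": return AccAH, i.e. bits 31..16. */
static uint16_t sac_high_word(uint64_t acc)
{
    return (uint16_t)((acc >> 16) & 0xFFFFu);
}
```

A result of 710 in the low word reads as 0 through `sac_high_word` before the shift and as 710 after it.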
This example shows the use of modulo addressing and PSV management. The result is the sum of the array elements with alternating signs:
$$S = \sum_{i=0}^{13} (-1)^{i}\,\mathrm{arr}[i]$$
The elements are saved in the data memory and the signs (+1, -1) in the program memory.
/* dsPIC30F6014A */
const int Sgn[2] = {1,-1};    // signs for the sum
int arr[14];                  // array of 14 signed integers in Y space (Y space is default)
int i;
unsigned int adr2;
void main() {
  TRISD = 0;                  // configure PORTD as output
  for (i=0; i<14; i++)
    arr[i] = i+1;             // init arr
  adr2 = &Sgn;                // dummy line, just for linking Sgn before usage inside asm block
  MODCON = 0x8008;            // X modulo addressing, on W8 register
  XMODSRT = adr2;             // XMODSRT points to the start of Sgn array
  XMODEND = adr2+3;           // XMODEND points to the end of Sgn array
  CORCON = 0x00F5;            // signed computing, saturation for both Acc,
                              // integer computing, PSV management
  asm {
    mov #@_sgn, W8            // W8 = @Sgn in X space (mirror),
                              // points to the 1st of 2 elements in Sgn array
    mov #@_arr, W10           // W10 = @arr, points to the first element of arr
    mov #0, W4                // clear W4
    mov #0, W6                // clear W6
    clr A                     // clear accumulator for computing
    repeat #14                // 15 iterations
    mac W4*W6, A, [W8]+=2, W4, [W10]+=2, W6 // AccA = sum(Sgn*arr)
    sftac A, #-16             // shift the result to the high word of AccA
    sac A, #0, W1             // W1 = sum(Sgn*arr)
  }
  LATD = W1;                  // LATD = sum(Sgn*arr)
}
Example 2 shows the method of using modulo addressing, described in Chapter 8, Section 8.3 and PSV management, described in Chapter 11, Section 11.2.
Constants +1 and -1 for multiplying the elements of the array arr are saved in the program memory. The advantage of this approach is a reduction in using the data memory. It is particularly suitable when several arrays having constant elements should be saved. Then, the use of the program memory is recommended. The data memory should be used when the array elements are not constants (unknown at the moment of compiling) or if the program memory is full (a rare event).
Addresses in the program memory are 24-bit, whereas addresses in the data memory are 16-bit. For this reason it is necessary to perform mirroring, by using PSV management, in the upper block of the data memory (addresses above 0x7FFF). The mirroring is performed in two steps:
Obtaining the value to be written to the PSVPAG register is shown in Fig. 11-7.
Fig. 11-7 Obtaining the value of the PSVPAG register
Writing to the PSVPAG register and obtaining the effective address (the address of the array mirrored to the data memory) are carried out by the following set of instructions:
ptr = &Sgn;            // ptr points to Sgn (24-bit address)
adr = Hi(ptr);         // get upper word (16 bits)
adr2 = adr & 0x00FF;   // only lower 8 bits are relevant
PSVPAG = adr2;         // load PSVPAG
adr2 = ptr & 0xFFFF;   // get lower word (16 bits): mirrored address in X space
adr2 = adr2 | 0x8000;  // set bit 15: upper data memory (addresses above 0x7FFF)
The variables ptr and adr are 32 bits long (LongInt). When this part of the code is executed, the PSVPAG register will contain the corresponding value and adr2 will contain the address of the first element of the constant array.
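The address arithmetic of the mirroring can also be written as two small C helpers. The sketch assumes the usual PSV mapping of this device family, in which PSVPAG supplies bits 22..15 of the program memory address and the low 15 bits come from the data memory address; this assumption should be checked against Fig. 11-7. The function names are hypothetical.

```c
#include <stdint.h>

/* Value for PSVPAG: program memory address bits 22..15 (assumed mapping). */
static uint8_t psv_page(uint32_t prog_addr)
{
    return (uint8_t)((prog_addr >> 15) & 0xFFu);
}

/* Mirrored data memory address: bit 15 set, bits 14..0 from the
   program memory address, so the result is always above 0x7FFF. */
static uint16_t psv_data_addr(uint32_t prog_addr)
{
    return (uint16_t)(0x8000u | (prog_addr & 0x7FFFu));
}
```

For a constant stored at the program memory address 0x000120, the helpers give PSVPAG = 0 and the mirrored address 0x8120.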
The array arr comprises 14 elements and array Sgn only 2. In order to calculate the required sum, modulo addressing should be used for the Sgn array. This is enabled by writing 0x8008 in the MODCON register.
MODCON = 0x8008;
This instruction enables modulo addressing in the X space via the W8 register. The structure of the MODCON register is shown in Chapter 8, Table 8-3.
After modulo addressing has been enabled, it is necessary to define the initial and final addresses of this addressing by writing the corresponding values to the XMODSRT and XMODEND registers. The initial address of the constant array contained by adr2 is written to the XMODSRT register. The address of the last byte of the constant array, i.e. adr2+4-1 (+4 because two elements occupy 2 locations (bytes) each and –1 to obtain the address of the last byte) is written to the XMODEND register.
XMODSRT = adr2;
XMODEND = adr2+3;
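The rule for the end address (start address plus the size of the buffer in bytes, minus one) can be stated as a one-line helper. The name `modulo_end` is hypothetical and the sketch only restates the arithmetic described above.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the last byte of a buffer of count elements,
   each elem_bytes long, that starts at the byte address start. */
static uint16_t modulo_end(uint16_t start, uint16_t count, uint16_t elem_bytes)
{
    assert(count > 0 && elem_bytes > 0);
    return (uint16_t)(start + count * elem_bytes - 1u);
}
```

For the two 16-bit elements of Sgn, `modulo_end(adr2, 2, 2)` equals adr2+3, the value written to XMODEND.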
The next step is setting the required bits in the CORCON register. The structure of the CORCON register is shown in Table 11-3. For the signed computing (positive and negative numbers), enabled saturation logic for both accumulators, enabled saturation logic during writing to the data memory, integer computing and enabled PSV management the value 0x00F5 should be written to the CORCON register. Enabling the saturation logic for accumulator B is superfluous, but it is inserted in the example to show that the saturation logic can be enabled for both accumulators simultaneously.
After the corresponding value has been written to the CORCON register, the W4, W6, W8 and W10 registers are initialized. The registers W4 and W6 are set to zero, whereas the addresses of the first elements of the arrays Sgn and arr are written to the registers W8 and W10, respectively.
CORCON = 0x00F5;
asm {
  mov #@_adr2, W8
  mov [W8], W8
  mov #@_arr, W10
  mov #0, W4
  mov #0, W6
The initial value of the accumulator is set to zero by the instruction clr A. After that, the computing may start.
clr A
The repeat loop is executed 15 times. The first execution of the mac instruction only writes the first elements of the arrays to the W4 and W6 registers; the following 14 executions calculate the 14 partial sums. The consequence of enabling modulo addressing is that the two elements of the array Sgn are read alternately: first, second, first, second, ...
repeat #14
mac W4*W6, A, [W8]+=2, W4, [W10]+=2, W6
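The wrap-around of the address in W8 can be illustrated by a host-side model. It is a minimal sketch, assuming an incrementing buffer addressed by byte offsets; the names `modulo_step`, `alternating_sum` and `example2_sum` are hypothetical, and the sign array is held in ordinary memory instead of the program memory.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Post-increment with wrap-around inside [start, end] (byte addresses). */
static uint16_t modulo_step(uint16_t addr, uint16_t step,
                            uint16_t start, uint16_t end)
{
    uint16_t next = (uint16_t)(addr + step);

    assert(start <= end);
    if (next > end)                               /* passed the last byte  */
        next = (uint16_t)(next - (end - start + 1u)); /* wrap to the start */
    return next;
}

/* Sum of arr[i] multiplied alternately by +1 and -1. */
static int32_t alternating_sum(const int16_t *arr, size_t n)
{
    static const int16_t sgn[2] = { 1, -1 };
    uint16_t off = 0;       /* byte offset into sgn, plays the role of W8 */
    int32_t acc = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        acc += (int32_t)sgn[off / 2] * arr[i];
        off = modulo_step(off, 2, 0, 3);   /* start = 0, end = 3 here */
    }
    return acc;
}

/* The array of Example 2: arr[i] = i + 1, 14 elements. */
int32_t example2_sum(void)
{
    int16_t arr[14];
    int i;

    for (i = 0; i < 14; i++)
        arr[i] = (int16_t)(i + 1);
    return alternating_sum(arr, 14);
}
```

The offset takes the values 0, 2, 0, 2, ... so the two signs are used in turn. For the data of Example 2 the model returns 1 - 2 + 3 - ... - 14 = -7.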
After the loop is completed, the result is in the lowest 16 bits of the accumulator A. This result can be read directly from AccAL at the address 0x0022 (see Table 11-4c). Another approach is used in the example in order to illustrate the use of instructions sftac and sac. These instructions are described in Table 11-2. At first, by using instruction sftac, the result is shifted to the middle 16 bits of the accumulator A (AccAH). Then, by instruction sac, the result is written to the W1 register and from there forwarded to the port D.
  sftac A, #-16
  sac A, #0, W1
}
LATD = W1;
NOTE: Instruction SAC reads the result from AccAH (see Fig. 11-4).
In the example it is shown how, by using instruction add, one can select one of the accumulators as a destination and how to use the instruction div for dividing two signed integers.
The instruction divide exists in the compiler and its use is very simple. However, the most efficient use of the DSP module is by using the assembler, so the purpose of this example and of other examples in this chapter is familiarization with the assembler instructions.
In this example a function is used for calculating the mathematical expectation (arithmetic mean) of an array. The expression for calculating mathematical expectation is:
$$R = \frac{1}{N}\sum_{i=0}^{N-1} \mathrm{arr}[i]$$
where N is the number of elements in the array and R is the mathematical expectation.
/* dsPIC30F6014A */
int arr[15];                  // Array of 15 signed-integer elements
unsigned int i, MeanRes;
void MeanVar(unsigned int *ptrArr, unsigned int Len, unsigned int *ptrMean) {
  CORCON = 0x00F1;            // Signed computing, saturation for both Acc, integer computing
  asm {
    mov [W14-8], W10          // W10 = ptrArr
    mov [W14-10], W7          // W7 = Len
    sub W7, #1, W2            // W2 = Len-1
    clr A
    repeat W2
    add [W10++], #0, A        // A = sum(arr)
    add W7, #1, A             // A = A + (Len/2) for div's lack of rounding ...
    sac.r A, #0, W3           // W3 = round(AccA)
    repeat #17                // 18 iterations of signed-divide. Result in W0
    div.s W3, W7              // W0 = sum(arr)/Len
    mov [W14-12], W4          // W4 = ptrMean
    mov W0, [W4]              // Mean = Mean(arr)
  }
}
void main() {
  TRISD = 0;                  // Configure PORTD as output
  MeanRes = 0;
  for (i=0; i<15; i++)
    arr[i] = i;               // Init arr
  MeanVar(&arr, 15, &MeanRes); // call subroutine
  LATD = MeanRes;             // Send result to LATD
}
The main program is very simple. After setting port D as output and initializing the input array, the function for calculating mathematical expectation is called. The result is then sent to port D.
TRISD = 0;                   // Configure PORTD as output
MeanRes = 0;
for (i=0; i<15; i++)
  arr[i] = i;                // Init arr
MeanVar(&arr, 15, &MeanRes); // call subroutine
LATD = MeanRes;              // Send result to LATD
The function for calculating mathematical expectation has only 3 parameters. The first parameter is the address of the first array element (ptrArr = &arr). The second parameter is the number of array elements (Len = 15). The third parameter is the address of the variable where the result, i.e. mathematical expectation, should be written (ptrMean = &MeanRes).
The function consists of three parts:
In order to use the accumulator correctly, the operating conditions of the accumulator should be defined first. This is done by setting the corresponding bits in the CORCON register. The structure of the CORCON register is shown in Table 11-3.
CORCON = 0x00F1;
For the signed computing (positive and negative numbers), enabled saturation logic for both accumulators, enabled saturation logic while writing to the data memory and integer computing the value 0x00F1 should be written to the CORCON register. Enabling the saturation logic for the accumulator B is superfluous, but it is inserted in the example to show that the saturation logic can be enabled for both accumulators simultaneously.
After the CORCON register is set, the address of the first array element is written to the W10 register. The number of array elements is written to the W7 register.
mov [W14-8], W10
mov [W14-10], W7
Instruction add should be executed as many times as there are elements in the array. Since repeat executes the following instruction one more time than the value of its operand, the value in the W7 register has to be decremented by 1. Because the number of elements will be required later for performing division, the decremented value is saved in the W2 register. This is done by the instruction:
sub W7, #1, W2
Instruction add will be executed the required number of times. The result of each call is the partial sum which is added to the content of the accumulator A. After completion of the loop, the sum of all array elements is in the accumulator.
clr A
repeat W2
add [W10++], #0, A
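The relation between the repeat operand and the number of additions can be checked with a short host-side model. It is a sketch only: the names `sum_with_repeat` and `example3_sum` are hypothetical, and the accumulator is counted in units of its high word, where the add instruction places the operand.

```c
#include <assert.h>
#include <stdint.h>

/* Model of: sub W7, #1, W2 / clr A / repeat W2 / add [W10++], #0, A */
static int32_t sum_with_repeat(const int16_t *p, uint16_t len)
{
    uint16_t w2;
    uint32_t pass;
    int32_t acc = 0;                     /* clr A                     */

    assert(len > 0);
    w2 = (uint16_t)(len - 1);            /* sub W7, #1, W2            */
    for (pass = 0; pass <= w2; pass++)   /* repeat W2: W2 + 1 passes  */
        acc += *p++;                     /* add [W10++], #0, A        */
    return acc;
}

/* The array of Example 3: arr[i] = i, 15 elements. */
int32_t example3_sum(void)
{
    int16_t arr[15];
    int i;

    for (i = 0; i < 15; i++)
        arr[i] = (int16_t)i;
    return sum_with_repeat(arr, 15);
}
```

With Len = 15 the loop counter runs from 0 to 14, so all 15 elements are added; for the data of Example 3 the sum is 105.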
To obtain the mathematical expectation, the sum of all array elements should be divided by the number of array elements. The division itself cannot round the result to the nearest integer. For this reason the value Len/2 is first added to the sum of all array elements. This has the same effect as adding 0.5 to the quotient, which cannot be done directly because of integer computing. Adding the value Len/2 is done by the instruction add W7, #1, A. This instruction adds the value of the register W7 shifted one position to the right, which is equivalent to division by two, to the current value in the accumulator A. After that, the value in the accumulator A is read into the W3 register by the instruction sac.r A, #0, W3, so that it can be divided by the number of array elements. Instruction sac.r is described in Table 11-2.
add W7, #1, A
sac.r A, #0, W3
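The rounding trick can be tried out in plain C. The helper below is a minimal sketch with a hypothetical name; it assumes a non-negative sum, since for a negative sum the value Len/2 would have to be subtracted instead of added.

```c
#include <assert.h>
#include <stdint.h>

/* Integer mean rounded to the nearest integer: (sum + len/2) / len.
   Assumes sum >= 0. */
static int16_t rounded_mean(int32_t sum, int16_t len)
{
    assert(len > 0 && sum >= 0);
    return (int16_t)((sum + len / 2) / len);
}
```

For the data of Example 3 the sum is 105 and Len is 15, so the result is (105 + 7) / 15 = 7. A sum of 10 over 4 elements gives 3, whereas plain integer division would give 2.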
In the family of dsPIC30F devices there is no single-instruction hardware division. Division is performed iteratively: the instruction div is executed 18 times in a repeat loop. The quotient will be saved in the W0 register and the remainder in the W1 register. The sum of all array elements is saved in the W3 register and the number of array elements in the W7 register. Therefore, the instruction div.s W3, W7 is called in the loop. After the loop is completed, the result is in the W0 register.
repeat #17
div.s W3, W7
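Why a division needs a fixed number of passes can be seen from a generic shift-and-subtract division, which produces one quotient bit per pass. The sketch below works on unsigned 16-bit operands and is not claimed to be the algorithm used by the div instruction of the device, which handles signed operands and takes 18 passes. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t quotient;    /* corresponds to W0 after the loop */
    uint16_t remainder;   /* corresponds to W1 after the loop */
} div_result;

/* Generic binary long division: 16 passes, one quotient bit each. */
static div_result divide_iterative(uint16_t dividend, uint16_t divisor)
{
    div_result r = { 0, 0 };
    uint32_t rem = 0;
    int bit;

    assert(divisor != 0);
    for (bit = 15; bit >= 0; bit--) {
        rem = (rem << 1) | ((dividend >> bit) & 1u); /* bring down next bit */
        if (rem >= divisor) {                        /* divisor fits        */
            rem -= divisor;
            r.quotient |= (uint16_t)(1u << bit);     /* set quotient bit    */
        }
    }
    r.remainder = (uint16_t)rem;
    return r;
}
```

Dividing 112 (the sum 105 plus Len/2 = 7) by 15 gives the quotient 7 and the remainder 7.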
Since the value of mathematical expectation is in the W0 register, it is necessary to write this value to the destination address (third parameter of the function). In this way the obtained value of mathematical expectation is forwarded to the main program for further processing.
mov [W14-12], W4
mov W0, [W4]