Example 1 – (calculation of the sum of products of two arrays):
The example shows how the DSP module can be used for fast calculation of the sum of products of two arrays. The elements of the arrays are kept in the data memory. The X and Y spaces are described in Chapter 8.

    /* dsPIC30F6014A */
    char i;
    int arr1[20] absolute 0x0900; // array of 20 signed-integer elements in X space
    int arr2[20] absolute 0x1900; // array of 20 signed-integer elements in Y space
                                  // replace 0x1900 with 0x0D00 for dsPIC30F4013 MCU
    void main() {
      TRISD = 0;                  // configure PORTD as output
      for (i=0; i<20; i++) {      // init arr1 and arr2
        arr1[i] = i+3;
        arr2[i] = 15-i;
      }
      CORCON = 0x00F1;            // signed computing, saturation for both Acc, integer computing
      asm {
        mov #_arr1, W8            // W8 = @arr1, point to the first element of the array
        mov #_arr2, W10           // W10 = @arr2, point to the first element of the array
        mov #0, W4                // clear W4
        mov #0, W6                // clear W6
        clr A
        repeat #20                // 21 iterations
        mac W4*W6, A, [W8]+=2, W4, [W10]+=2, W6 // AccA = sum(arr1*arr2)
        sftac A, #-16             // shift the result into the high word of AccA
        sac A, #0, W1             // W1 = sum(arr1*arr2)
      }
      LATD = W1;                  // LATD = sum(arr1*arr2)
    }

Example 1 presents the program for calculating the sum of products of the array elements. The elements of arr1 are stored in the X space and the elements of arr2 in the Y space of the data memory. At the start of the program the array elements are initialized:
    for (i = 0; i < 20; i++) { // init arr1 and arr2
      arr1[i] = i+3;
      arr2[i] = 15-i;
    }

Before the calculation starts, the DSP module must be set for signed-integer computing. This is done by writing 0x00F1 to the CORCON register (CORCON = 0x00F1;). The same write enables the saturation logic for both accumulators (A and B), even though accumulator B is not used. Table 11-3 gives the meanings of the individual bits of the CORCON register. The next step is writing the initial addresses (the addresses of the first array elements) of the arr1 and arr2 arrays to the W8 and W10 registers, respectively. The registers W8 and W10 hold the addresses of the next array elements, and the registers W4 and W6 hold the array elements being multiplied in the current iteration (partial sum).
    mov #_arr1, W8
    mov #_arr2, W10

Once the addresses of the first array elements are in the W8 and W10 registers, multiplying and accumulating the partial sums can start. The initial value of the accumulator A is first set to zero by clr A. The MAC instruction, which calculates a partial sum and adds it to the current accumulator value, must be executed 21 times. The first partial sum is zero because the registers W4 and W6 contain zero. The purpose of the first execution of the MAC instruction is to read the first elements from the data memory and write them to the registers W4 and W6. After that, the MAC instruction is executed 20 more times, calculating the partial sums which are accumulated in the accumulator A. The MAC instruction and its parameters are described in Table 11-2.
    repeat #20
    mac W4*W6, A, [W8]+=2, W4, [W10]+=2, W6 // AccA = sum(arr1*arr2)

After the MAC loop has completed, the result is in the lower 16 bits of the accumulator A. The result could be read directly from AccAL, i.e. from address 0x0022 (see Table 11-4c), but it is regular practice to shift the result to AccAH, i.e. shift it left by 16 bits, and then read it with the SAC instruction. In this way the consequences of an overflow, should one occur, are mitigated. In this case no overflow occurs, but the result is nevertheless read in the regular way. The shift left by 16 bits is performed by the instruction:
    SFTAC A, #-16

The SFTAC instruction and its parameters are described in Table 11-2. After the result has been shifted 16 places to the left, it is read and saved in the W1 register. This is done by the instruction:
    SAC A, #0, W1

The SAC instruction and its parameters are described in Table 11-2.
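The 21-pass structure of the loop can be checked on a PC with a small C model. This is only a sketch, not dsPIC code: the function names mac_sum_of_products and example1_result are hypothetical, a 64-bit integer stands in for the 40-bit accumulator, and the model skips the final prefetch that the hardware performs but never uses.

```c
#include <assert.h>
#include <stdint.h>

/* Portable C model of the repeat/mac loop of Example 1 (illustration only).
   mac_sum_of_products is a hypothetical helper name. */
int16_t mac_sum_of_products(const int16_t *x, const int16_t *y, int len)
{
    int64_t acc = 0;         /* stands in for accumulator A after clr A */
    int16_t w4 = 0, w6 = 0;  /* operand registers, cleared before the loop */
    int next = 0;            /* index of the next element to prefetch */

    assert(len >= 0);

    /* len + 1 passes, like "repeat #len" followed by one mac instruction */
    for (int pass = 0; pass <= len; pass++) {
        acc += (int32_t)w4 * w6;  /* multiply the operands fetched in the previous pass */
        if (next < len) {         /* the model skips the last, unused prefetch */
            w4 = x[next];
            w6 = y[next];
            next++;
        }
    }
    /* The narrowing cast assumes the sum fits in 16 bits, as it does in Example 1 */
    return (int16_t)acc;
}

/* Fills the arrays exactly as Example 1 does and returns the sum of products. */
int16_t example1_result(void)
{
    int16_t arr1[20], arr2[20];
    for (int i = 0; i < 20; i++) {
        arr1[i] = (int16_t)(i + 3);
        arr2[i] = (int16_t)(15 - i);
    }
    return mac_sum_of_products(arr1, arr2, 20);
}
```

With the initial values of Example 1, example1_result() returns 710, which is the value expected in W1 and on port D.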
    /* dsPIC30F6014A */
    const int Sgn[2] = {1,-1}; // Signs for sum
    int arr[14];               // array of 14 signed integers in Y space (Y space is default)
    int i;
    unsigned int adr2;
    void main() {
      TRISD = 0;               // Configure PORTD as output
      for (i=0; i<14; i++) arr[i] = i+1; // init arr
      adr2 = &Sgn;             // dummy line, just for linking Sgn before usage inside asm block
      MODCON = 0x8008;         // X modulo addressing, on W8 register
      XMODSRT = adr2;          // XMODSRT points to the start of Sgn array
      XMODEND = adr2+3;        // XMODEND points to the end of Sgn array
      CORCON = 0x00F5;         // Signed computing, saturation for both Acc,
                               // integer computing, PSV management
      asm {
        mov #@_sgn, W8         // W8 = @Sgn in X space (mirror),
                               // points to the 1st of 2 elements in Sgn array
        mov #_arr, W10         // W10 = @arr, points to the first element of arr
        mov #0, W4             // clear W4
        mov #0, W6             // clear W6
        clr A                  // clear accumulator for computing
        repeat #14             // 15 iterations
        mac W4*W6, A, [W8]+=2, W4, [W10]+=2, W6 // AccA = sum(Sgn*arr)
        sftac A, #-16          // shift the result into the high word of AccA
        sac A, #0, W1          // W1 = sum(Sgn*arr)
      }
      LATD = W1;               // LATD = sum(Sgn*arr)
    }

Example 2 shows the use of modulo addressing, described in Chapter 8, Section 8.3, and of PSV management, described in Chapter 11, Section 11.2. The constants +1 and -1, which multiply the elements of the array arr, are saved in the program memory. The advantage of this approach is that less data memory is used. It is particularly suitable when several arrays of constant elements have to be stored; in that case the use of the program memory is recommended. The data memory should be used when the array elements are not constants (unknown at compile time) or if the program memory is full (a rare event). Addresses in the program memory are 24-bit, whereas addresses in the data memory are 16-bit. For this reason the constants must be mirrored, by using PSV management, into the upper block of the data memory (addresses above 0x7FFF). The mirroring is performed in two steps:
    ptr = &Sgn;           // ptr points to Sgn (24-bit address)
    adr = Hi(ptr);        // Get upper word (16 bits)
    adr2 = adr & 0x00FF;  // Only lower 8 bits are relevant
    PSVPAG = adr2;        // Load PSVPAG
    adr2 = ptr & 0xFFFF;  // Get lower word (16 bits). Mirrored address in X space
    adr2 = adr2 | 0x8000; // Upper data memory (addresses above 0x7FFF)

The variables ptr and adr are 32 bits long (LongInt). When this part of the code has been executed, the PSVPAG register contains the corresponding value and adr2 contains the address of the first element of the constant array. The array arr comprises 14 elements and the array Sgn only 2. In order to calculate the required sum, modulo addressing must be used for the Sgn array. This is enabled by writing 0x8008 to the MODCON register.
    MODCON = 0x8008;

This instruction enables modulo addressing in the X space via the W8 register. The structure of the MODCON register is shown in Chapter 8, Table 8-3. After modulo addressing has been enabled, the initial and final addresses of the modulo buffer must be defined by writing the corresponding values to the XMODSRT and XMODEND registers. The initial address of the constant array, contained in adr2, is written to the XMODSRT register. The address of the last byte of the constant array, i.e. adr2+4-1 (+4 because the two elements occupy 2 locations (bytes) each, and -1 to obtain the address of the last byte), is written to the XMODEND register.
    XMODSRT = adr2;
    XMODEND = adr2+3;

The next step is setting the required bits in the CORCON register. The structure of the CORCON register is shown in Table 11-3. For signed computing (positive and negative numbers), enabled saturation logic for both accumulators, enabled saturation logic during writing to the data memory, integer computing and enabled PSV management, the value 0x00F5 should be written to the CORCON register. Enabling the saturation logic for accumulator B is superfluous, but it is included in the example to show that the saturation logic can be enabled for both accumulators simultaneously. After the corresponding value has been written to the CORCON register, the W4, W6, W8 and W10 registers are initialized. The registers W4 and W6 are set to zero, whereas the addresses of the first elements of the arrays Sgn and arr are written to the registers W8 and W10, respectively.
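The CORCON value can be cross-checked by assembling it from individual bit masks. The following C sketch is illustrative only: the mask names and bit positions are assumptions based on the usual dsPIC30F CORCON layout and should be verified against Table 11-3, and corcon_value is a hypothetical helper.

```c
#include <stdint.h>

/* Assumed CORCON bit positions (verify against Table 11-3 before relying on them). */
enum {
    CORCON_IF     = 1u << 0,  /* integer multiplier mode */
    CORCON_PSV    = 1u << 2,  /* program space visibility in data space */
    CORCON_ACCSAT = 1u << 4,  /* accumulator saturation mode select */
    CORCON_SATDW  = 1u << 5,  /* saturation on writes to the data memory */
    CORCON_SATB   = 1u << 6,  /* saturation for accumulator B */
    CORCON_SATA   = 1u << 7,  /* saturation for accumulator A */
    CORCON_US     = 1u << 12  /* unsigned multiply; left clear for signed computing */
};

/* Hypothetical helper: builds the CORCON value used in the examples.
   use_psv != 0 adds the PSV bit needed when constants live in program memory. */
uint16_t corcon_value(int use_psv)
{
    uint16_t value = CORCON_SATA | CORCON_SATB | CORCON_SATDW |
                     CORCON_ACCSAT | CORCON_IF;   /* CORCON_US stays 0 (signed) */
    if (use_psv) {
        value |= CORCON_PSV;
    }
    return value;
}
```

Under these assumptions corcon_value(1) gives 0x00F5, the value used when PSV management is needed, and corcon_value(0) gives 0x00F1, the value used when all operands are in the data memory.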
    CORCON = 0x00F5;
    asm {
      mov #@_adr2, W8
      mov [W8], W8
      mov #_arr, W10
      mov #0, W4
      mov #0, W6

The initial value of the accumulator is set to zero by the instruction clr A. After that, the computing may start.
    clr A

The repeat loop is executed 15 times. The first execution of the mac instruction loads the first elements of the arrays into the W4 and W6 registers, and the remaining 14 executions calculate the 14 partial sums. As a consequence of modulo addressing, the two elements of the array Sgn are read alternately (element 1, element 2, element 1, element 2, ...).
    repeat #14
    mac W4*W6, A, [W8]+=2, W4, [W10]+=2, W6

After the loop is completed, the result is in the lowest 16 bits of the accumulator A. This result can be read directly from AccAL at the address 0x0022. Another approach is used in the example in order to illustrate the use of the instructions sftac and sac. These instructions are described in Table 11-2. First, the sftac instruction shifts the result to the middle 16 bits of the accumulator A. Then the sac instruction writes the result to the W1 register, from where it is forwarded to port D.
      sftac A, #-16
      sac A, #0, W1
    }
    LATD = W1;
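The effect of modulo addressing on the sum can be reproduced on a PC with a short C model. This is a sketch under stated assumptions, not dsPIC code: modulo_mac and example2_result are hypothetical names, the wrap-around of W8 between XMODSRT and XMODEND is modelled by an index taken modulo the coefficient count, and the final unused prefetch is skipped.

```c
#include <assert.h>
#include <stdint.h>

/* C model of a mac loop whose first operand pointer wraps around a short
   coefficient array, as W8 does under X modulo addressing (illustration only). */
int16_t modulo_mac(const int16_t *coef, int coef_len,
                   const int16_t *data, int data_len)
{
    int64_t acc = 0;         /* stands in for accumulator A after clr A */
    int16_t w4 = 0, w6 = 0;  /* operand registers, cleared before the loop */
    int ci = 0;              /* models W8: index into the coefficient array */
    int di = 0;              /* models W10: index into the data array */

    assert(coef_len > 0 && data_len >= 0);

    /* data_len + 1 passes, like "repeat #data_len" followed by one mac */
    for (int pass = 0; pass <= data_len; pass++) {
        acc += (int32_t)w4 * w6;       /* product of the previously fetched pair */
        if (di < data_len) {           /* the model skips the last, unused prefetch */
            w4 = coef[ci];
            w6 = data[di];
            ci = (ci + 1) % coef_len;  /* wrap back to the start of the buffer */
            di++;
        }
    }
    /* The narrowing cast assumes the sum fits in 16 bits, as it does in Example 2 */
    return (int16_t)acc;
}

/* Uses the same data as Example 2: Sgn = {1, -1}, arr[i] = i + 1. */
int16_t example2_result(void)
{
    const int16_t sgn[2] = {1, -1};
    int16_t arr[14];
    for (int i = 0; i < 14; i++) {
        arr[i] = (int16_t)(i + 1);
    }
    return modulo_mac(sgn, 2, arr, 14);
}
```

For the data of Example 2 the model computes 1 - 2 + 3 - ... - 14, so example2_result() returns -7.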
    /* dsPIC30F6014A */
    int arr[15];               // Array of 15 signed-integer elements
    unsigned int i, MeanRes;
    void MeanVar(unsigned int *ptrArr, unsigned int Len, unsigned int *ptrMean) {
      CORCON = 0x00F1;         // Signed computing, saturation for both Acc, integer computing
      asm {
        mov [W14-8], W10       // W10 = ptrArr
        mov [W14-10], W7       // W7 = Len
        sub W7, #1, W2         // W2 = Len-1
        clr A
        repeat W2
        add [W10++], #0, A     // A = sum(arr)
        add W7, #1, A          // A = A + (Len/2) for div's lack of rounding ...
        sac.r A, #0, W3        // W3 = round(AccA)
        repeat #17             // 18 iterations of signed-divide. Result in W0
        div.s W3, W7           // W0 = sum(arr)/Len
        mov [W14-12], W4       // W4 = ptrMean
        mov W0, [W4]           // Mean = Mean(arr)
      }
    }
    void main() {
      TRISD = 0;               // Configure PORTD as output
      MeanRes = 0;
      for (i=0; i<15; i++) arr[i] = i; // Init arr
      MeanVar(&arr, 15, &MeanRes);     // call subroutine
      LATD = MeanRes;          // Send result to LATD
    }

The main program is very simple. After port D has been set as output and the input array has been initialized, the function for calculating the mathematical expectation is called. The result is then sent to port D.
    TRISD = 0;                       // Configure PORTD as output
    MeanRes = 0;
    for (i=0; i<15; i++) arr[i] = i; // Init arr
    MeanVar(&arr, 15, &MeanRes);     // call subroutine
    LATD = MeanRes;                  // Send result to LATD

The function for calculating the mathematical expectation has only 3 parameters. The first parameter is the address of the first array element (ptrArr = &arr). The second parameter is the number of array elements (Len = 15). The third parameter is the address of the variable where the result, i.e. the mathematical expectation, should be written (ptrMean = &MeanRes). The function consists of three parts:
    CORCON = 0x00F1;

For signed computing (positive and negative numbers), enabled saturation logic for both accumulators, enabled saturation logic while writing to the data memory and integer computing, the value 0x00F1 should be written to the CORCON register. Enabling the saturation logic for the accumulator B is superfluous, but it is included in the example to show that the saturation logic can be enabled for both accumulators simultaneously. After the CORCON register is set, the address of the first array element is written to the W10 register. The number of array elements is written to the W7 register.
    mov [W14-8], W10
    mov [W14-10], W7

The add instruction must be executed as many times as there are elements in the array. Because the repeat instruction executes the following instruction one more time than its operand, the value in the W7 register has to be decremented by 1. Since the number of elements will be required later for performing the division, the decremented value is saved in the W2 register instead of overwriting W7. This is done by the instruction:
    sub W7, #1, W2

The add instruction will be executed the required number of times. Each execution adds one array element to the content of the accumulator A. After completion of the loop, the sum of all array elements is in the accumulator.
    clr A
    repeat W2
    add [W10++], #0, A

To obtain the mathematical expectation, the sum of all array elements must be divided by the number of array elements. The division itself cannot round the result to the nearest integer. For this reason the value Len/2 is first added to the sum of all array elements. This has the same effect as adding 0.5 to the quotient, which cannot be done directly in integer computing. Adding the value Len/2 is done by the instruction add W7, #1, A. This instruction adds the value of the register W7 shifted one position to the right, which is equivalent to dividing by two, to the current value of the accumulator A. After that, the value of the accumulator A is read, with rounding, into the W3 register so that it can be divided by the number of array elements. The read is done by the instruction sac.r A, #0, W3. The instruction sac.r is described in Table 11-2.
    add W7, #1, A
    sac.r A, #0, W3

The dsPIC30F devices have no single-instruction hardware division. A division is performed in 18 iterations, each executing the div instruction in a repeat loop. The quotient is saved in the W0 register and the remainder in the W1 register. The sum of all array elements is in the W3 register and the number of array elements in the W7 register. Therefore, the instruction div.s W3, W7 is executed in the loop. After the loop is completed, the result is in the W0 register.
    repeat #17
    div.s W3, W7

Since the value of the mathematical expectation is in the W0 register, it is necessary to write this value to the destination address (the third parameter of the function). In this way the obtained value of the mathematical expectation is forwarded to the main program for further processing.
    mov [W14-12], W4
    mov W0, [W4]
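The rounding trick used in MeanVar can be illustrated with a plain C model. This is a simplified sketch, not dsPIC code: rounded_mean and example3_result are hypothetical names, the model uses ordinary integer arithmetic and so drops the half bit that the accumulator keeps when Len is odd, and it is meant for non-negative sums only, because C division truncates toward zero.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified C model of the mean calculation: sum, add Len/2, then divide.
   Intended for non-negative sums; rounded_mean is a hypothetical helper name. */
int16_t rounded_mean(const int16_t *arr, uint16_t len)
{
    int32_t sum = 0;

    assert(len > 0);              /* division by zero is not modelled */

    for (uint16_t i = 0; i < len; i++) {
        sum += arr[i];            /* corresponds to the repeat/add loop */
    }
    sum += len >> 1;              /* add Len/2 so that the division rounds */
    return (int16_t)(sum / len);  /* corresponds to the repeat/div.s loop */
}

/* Uses the same data as the main program: arr[i] = i for 15 elements. */
int16_t example3_result(void)
{
    int16_t arr[15];
    for (int i = 0; i < 15; i++) {
        arr[i] = (int16_t)i;
    }
    return rounded_mean(arr, 15);
}
```

For the array 0, 1, ..., 14 the sum is 105, adding Len/2 gives 112, and the integer division by 15 yields 7, so example3_result() returns 7.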