BRIEF

 

 

There are basically two ways to process large volumes of SAR data:

1.                   Post-mission Processing

2.                   Real-time Processing.

The first option was used for many years because of technological limits on data storage and execution speed. In post-mission processing, the entire volume of data collected along the flight path is stored on magnetic tape and processed at the ground base. With current post-processing technology, the final result (image) is of excellent quality. However, there are defense and military applications, as well as crucial civil applications, where SAR remote sensing must support decision-making at the right moment with sufficient accuracy.

Because of such application requirements, and with developments in technology and algorithms for SAR imaging, "Real-time Processing" has become a focus of attention. Efforts in hardware and algorithms aim to get as close as possible to Real-time Processing. In the true sense, Real-time Signal Processing means sample-by-sample processing, with an output produced at the same instant as the input.

Because a large volume of raw data is associated with SAR imaging, Real-time Processing is possible only with some time-scaling factor. Even with today's state-of-the-art technology, only Near-Real-time Processing is achievable. The processing load increases drastically with even a small improvement in Azimuth Resolution. High-resolution imaging requires a large bandwidth and, in turn, a huge mass of data to be processed. Furthermore, multi-look processing for speckle reduction multiplies the processing load by the number of looks. Depending on whether narrow swath or wide swath mode is used for the specified area coverage, the number of range gates per look varies widely.

Two important issues with Real-time processing are

1.                  Memory Requirement

2.                  Execution Time

Redundant processing steps in a high-resolution algorithm increase Execution Time; at the same time, high resolution demands that large amounts of data be stored for block-processing algorithms, which leads to huge Memory Requirements.

Real-time processing of SAR data, whether on board or at the ground base, requires special processors with fast computational capabilities. Such processors are known as DSP processors. Most often, a multi-processor environment is deployed, with complex data routing from master to slave or from processor to processor.

DSP processors are categorized mainly into 1) Generic Signal Processors, 2) Special Purpose Signal Processors and 3) Programmable Signal Processors. Because of their flexibility, easy-to-use development tools, better hardware support and interfaces, and multi-processing ability, Programmable Signal Processors such as the Analog Devices ADSP 21XXX are generally preferred for radar signal processing and SAR imaging.

            This Section is divided into three chapters.

1.                   DSP Microprocessor ADSP21020, which gives an overall architectural overview of the processor, its features and its suitability for the work carried out.

2.                   DSP Card (SP-20), which presents the details of a signal processing solution card with dual ADSP 21020 processors, memory and other hardware interfaces. This card was used to study the Real-time implementation aspects of the FASP (Frequency-Domain Azimuth Signal Processing) algorithm.

3.                   Results, which documents two critical issues, 1) Memory Requirement and 2) Execution Time, for the Real-time implementation of the FASP algorithm.


CHAPTER 1:      DSP MICROPROCESSOR - ADSP 21020

 

 

S3.C1.1          GENERAL DESCRIPTION

The ADSP-21020 is the first member of Analog Devices’ family of single-chip, programmable, IEEE floating-point processors optimized for digital signal processing applications. Its architecture is similar to that of Analog Devices’ ADSP-2100 family of fixed-point DSP processors.

Fabricated in a high-speed, 1.0 micron, low-power CMOS process, the ADSP-21020 has a 50 ns instruction cycle time. A 40 ns, 25 MIPS version is planned for availability in 1992. With a high-performance on-chip instruction cache, the ADSP-21020 can execute every instruction in a single cycle.

 

S3.C1.2          ADSP - 21020 Features

2.1              Independent Parallel Computation Units : The arithmetic/logic unit (ALU), multiplier and shifter perform single-cycle instructions. The units are architecturally arranged in parallel, maximizing computational throughput. A single multifunction instruction executes parallel ALU and multiplier operations. These computation units support IEEE 32-bit single-precision floating-point, and 32-bit fixed-point data formats.

2.2              Data Register File : A general-purpose data register file is used for transferring data between the computation units and the data buses, and for storing intermediate results. This 10-port (16-register) register file, combined with the ADSP-21020’s Harvard architecture, allows unconstrained data flow between computation units and off-chip memory.

2.3              Single-Cycle Fetch of Instruction and Two Operands : The ADSP-21020 uses a modified Harvard architecture in which data memory stores data and program memory stores both instructions and data. Because of its separate program and data memory buses and on-chip instruction cache, the processor can simultaneously fetch an operand from data memory, an operand from program memory, and an instruction from the cache, all in a single cycle.

2.4              Memory Interface : Addressing of external memory devices by the ADSP-21020 is facilitated by on-chip decoding of high-order address lines to generate memory bank select signals. Separate control lines are also generated for simplified addressing of page-mode DRAM. The ADSP-21020 provides programmable memory acknowledge controls that allow interfacing to peripheral devices with variable access times.

2.5              Instruction Cache : The ADSP-21020 includes a high performance instruction cache that enables three-bus operation for fetching an instruction and two data values. The cache is selective: only the instructions whose fetches conflict with program memory data accesses are cached. This allows full-speed execution of core, looped operations such as digital filter multiply-accumulates and FFT butterfly processing.

2.6              Hardware Circular Buffers : The ADSP-21020 provides hardware to implement circular buffers in memory, which are common in digital filters and Fourier transform implementations. It handles address pointer wraparound, reducing overhead, thereby increasing performance and simplifying implementation. Circular buffers can start and end at any location.

2.7              Flexible Instruction Set : The ADSP-21020’s 48-bit instruction word accommodates a variety of parallel operations, for concise programming. For example, the ADSP-21020 can conditionally execute a multiply, an add, a subtract and a branch in a single instruction.

 

S3.C1.3          ARCHITECTURE OVERVIEW

Figure S3.C1.B1 shows a block diagram of the ADSP-21020. The processor features:

·                    Three Computation Units (ALU, Multiplier, and Shifter) with a Shared Data Register File

·                    Two Data Address Generators (DAG 1, DAG 2)

·                    Program Sequencer with Instruction Cache

·                    32-bit Timer

·                    Memory Buses and Interface

·                    JTAG Test Access Port and On-Chip emulation Support

 

S3.C1.B1

 

 

 

3.1              Computation Units : The ADSP-21020 contains three independent computation units: an ALU, a multiplier with fixed-point accumulator, and a shifter. In order to meet a wide variety of processing needs, the computation units process data in three formats: 32-bit fixed-point, 32-bit floating-point and 40-bit floating-point. The floating-point operations are single-precision IEEE-compatible (IEEE Standard 754/854). The 32-bit floating-point format is the standard IEEE format, whereas the 40-bit IEEE single-extended-precision format has eight additional LSBs of mantissa for greater accuracy. The multiplier performs floating-point and fixed-point multiplication as well as fixed-point multiply/add and multiply/subtract operations. Integer products are 64 bits wide, and the accumulator is 80 bits wide. The ALU performs 45 standard arithmetic and logic operations, supporting both fixed-point and floating-point formats. The shifter performs 19 different operations on 32-bit operands. These operations include logical and arithmetic shifts, bit manipulation, field deposit and extract and derive exponent operations.

The computation units perform single-cycle operations; there is no computation pipeline. The three units are connected in parallel rather than serially, via multiple-bus connections with the 10-port data register file. The output of any computation unit may be used as the input of any unit on the next cycle. In a multifunction computation, the ALU and multiplier perform independent, simultaneous operations.

3.2              Data Register File : The ADSP-21020’s general-purpose data register file is used for transferring data between the computation units and the data buses, and for storing intermediate results. The register file has two sets (primary and alternate) of sixteen 40-bit registers each, for fast context switching.

With a large number of buses connecting the registers to the computation units, data flow between computation units and from/to off-chip memory is unconstrained and free from bottlenecks. The 10-port register file and Harvard architecture of the ADSP-21020 allow the following nine data transfers to be performed every cycle:

·                   Off-chip read/write of two operands to or from the register file

·                   Two operands supplied to the ALU

·                   Two operands supplied to the multiplier

·                    Two results received from the ALU and multiplier (three, if the ALU operation is a combined addition / subtraction)

The processor’s 48-bit orthogonal instruction word supports fully parallel data transfer and arithmetic operations in the same instruction.

3.3              Address Generators and Program Sequencer : Two dedicated address generators and a program sequencer supply addresses for memory accesses. Because of this, the computation units need never be used to calculate addresses. Because of its instruction cache, the ADSP-21020 can simultaneously fetch an instruction and data values from both off-chip program memory and off-chip data memory in a single cycle.

The data address generators (DAGs) provide memory addresses when external memory data is transferred over the parallel memory ports to or from internal registers. Dual data address generators enable the processor to output two simultaneous addresses for dual operand reads and writes. DAG 1 supplies 32-bit addresses to data memory. DAG 2 supplies 24-bit addresses to program memory for program memory data accesses.

Each DAG keeps track of up to eight address pointers, eight modifiers, eight buffer length values and eight base values. A pointer used for indirect addressing can be modified by a value in a specified register, either before (pre-modify) or after (post-modify) the access. To implement automatic modulo addressing for circular buffers, the ADSP-21020 provides buffer length registers that can be associated with each pointer. Base values for pointers allow circular buffers to be placed at arbitrary locations. Each DAG register has an alternate register that can be activated for fast context switching.
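The modulo addressing described above can be illustrated with a small Python model. This is only a sketch of the wraparound arithmetic, not of the DAG hardware itself: the function names, the toy memory and the use of a general modulo operation are assumptions made for illustration.

```python
def post_modify(index, modifier, base, length):
    """Return the pointer value after a post-modify update with wraparound.

    index, modifier, base and length loosely mirror the pointer, modifier,
    base and buffer-length values that each DAG tracks (names are illustrative).
    """
    return base + (index - base + modifier) % length


def read_circular(memory, base, length, modifier, count):
    """Read `count` values from a circular buffer using post-modify addressing."""
    index = base
    values = []
    for _ in range(count):
        values.append(memory[index])                         # access first
        index = post_modify(index, modifier, base, length)   # then update
    return values


# Toy "data memory": the buffer occupies addresses 4..8 (base 4, length 5).
memory = list(range(100, 120))
print(read_circular(memory, base=4, length=5, modifier=2, count=7))
# prints [104, 106, 108, 105, 107, 104, 106]
```

Because the base value is a separate parameter, the same routine works wherever the buffer is placed in memory, which is the point made about base registers above.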

The program sequencer supplies instruction addresses to program memory. It controls loop iterations and evaluates conditional instructions. To execute looped code with zero overhead, the ADSP-21020 maintains an internal loop counter and loop stack. No explicit jump or decrement instructions are required to maintain the loop.

The ADSP-21020 derives its high clock rate from pipelined fetch, decode and execute cycles. Approximately 70% of the machine cycle is available for memory accesses; consequently, ADSP-21020 systems can be built using slower and therefore less expensive memory chips.

3.4              Instruction Cache : The program sequencer includes a high performance, selective instruction cache that enables three-bus operation for fetching an instruction and two data values. This two-way, set associative cache holds 32 instructions. The cache is selective: only the instructions whose fetches conflict with program memory data accesses are cached, so the ADSP-21020 can perform a program memory data access and can execute the corresponding instruction in the same cycle. The program sequencer fetches the instruction from the cache instead of from program memory, enabling the ADSP-21020 to simultaneously access data in both program memory and data memory.

3.5              Context Switching : Many of the ADSP-21020’s registers have alternate register sets that can be activated during interrupt servicing to facilitate a fast context switch. The data registers in the register file, DAG registers and the multiplier result register all have alternate sets. Registers active at reset are called primary registers; the others are called alternate registers. Bits in the MODE1 control register determine which registers are active at any particular time.

The primary/alternate select bits for each half of the register file (top eight or bottom eight registers) are independent. Likewise, the top four and bottom four register sets in each DAG have independent primary / alternate select bits. This scheme allows passing of data between contexts.

3.6              Interrupts : The ADSP-21020 has four external hardware interrupts, nine internally generated interrupts, and eight software interrupts. For the external interrupts and the internal timer interrupts, the ADSP-21020 automatically stacks the arithmetic status and mode (MODE1) registers when servicing the interrupts, allowing five nesting levels of fast service for these interrupts.

An interrupt can occur at any time while the ADSP-21020 is executing a program. Internal events that generate interrupts include arithmetic exceptions, which allow for fast trap handling and recovery.

3.7              Timer : The programmable interval timer provides periodic interrupt generation. When enabled, the timer decrements a 32-bit count register every cycle. When this count register reaches zero, the ADSP-21020 generates an interrupt and asserts its TIMEXP output. The count register is automatically reloaded from a 32-bit period register and the count resumes immediately.
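As a rough illustration of the timer arithmetic, the sketch below converts a desired interrupt interval into a cycle count and computes the longest interval a 32-bit counter can cover. It assumes the 50 ns instruction cycle quoted earlier; the function names are hypothetical, and the exact register value to load may differ by one count from the figure computed here, depending on how the hardware counts the reload cycle.

```python
CYCLE_NS = 50      # instruction cycle time quoted for the ADSP-21020
COUNTER_BITS = 32  # width of the timer count and period registers


def cycles_for_interval(interval_s, cycle_ns=CYCLE_NS):
    """Approximate number of processor cycles in `interval_s` seconds."""
    return round(interval_s * 1e9 / cycle_ns)


def max_interval_s(cycle_ns=CYCLE_NS, bits=COUNTER_BITS):
    """Longest interval (in seconds) that a `bits`-wide down-counter can span."""
    return (2 ** bits) * cycle_ns * 1e-9


print(cycles_for_interval(1e-3))   # cycles in a 1 ms tick
print(max_interval_s())            # a little under 215 seconds
```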

 

S3.C1.4          JTAG TEST AND EMULATION SUPPORT

The ADSP-21020 implements the boundary scan testing provisions specified by IEEE Standard 1149.1 of the Joint Testing Action Group (JTAG). The ADSP-21020’s test access port and on-chip JTAG circuitry is fully compliant with the IEEE 1149.1 specification. The test access port enables boundary scan testing of circuitry connected to the ADSP-21020’s I/O pins.

The ADSP-21020 also implements on-chip emulation through the JTAG test access port. The processor’s eight sets of break-point range registers enable program execution at full speed until reaching a desired breakpoint address range. The processor can then halt and allow reading/writing of all the processor’s internal registers and external memories through the JTAG port.

 

S3.C1.5          DEVELOPMENT SYSTEM

The ADSP-21020 is supported with a complete set of software and hardware development tools. The ADSP-21020 Development System includes development software, an evaluation board and an in-circuit emulator.

5.1              Assembler : Creates relocatable, COFF (Common Object File Format) object files from ADSP-21XXX assembly source code. It accepts standard C preprocessor directives for conditional assembly and macro processing. The algebraic syntax of the ADSP-21XXX assembly language facilitates coding and debugging of DSP algorithms.

5.2              Linker/Librarian : The Linker processes separately assembled object files and library files to create a single executable program. It assigns memory locations to code and to data in accordance with a user-defined architecture file that describes the memory and I/O configuration of the target system. The Librarian allows you to group frequently used object files into a single library file that can be linked with your main program.

5.3              Simulator : The simulator performs interactive, instruction-level simulation of ADSP-21XXX code within the hardware configuration described by a system architecture file. It flags illegal operations and supports full symbolic disassembly. It provides an easy-to-use, window oriented, GUI that is identical to the one used by the ADSP-21020 EZ-ICE Emulator. Commands are accessed from pull-down menus with a mouse.

5.4              PROM Splitter : Formats an executable file into files that can be used with an industry-standard PROM programmer.

5.5              C Compiler and Runtime Library : The C Compiler complies with ANSI specifications. It takes advantage of the ADSP-21020’s high-level language architectural features and incorporates optimizing algorithms to speed up the execution of code. It includes an extensive runtime library with over 100 standard and DSP-specific functions.

5.6              C source Level Debugger : A full-featured C source level debugger that works with the EZ-ICE emulator to allow debugging of assembler source, C source, or mixed assembler and C.

5.7              DSP/C Compiler : Supports ANSI Standard Numerical C as defined by the Numeric C Extension Group. The DSP/C Compiler accepts C source input containing Numerical C extensions for array selection, vector math operations, complex data types, circular operations, and variably dimensioned arrays, and outputs ADSP-21020 assembly language source code.

5.8              EZ-ICE Emulator : This in-circuit emulator provides the system designer with a PC-based development environment that allows nonintrusive access to the ADSP-21020’s internal registers through the processor’s 5-pin JTAG Test Access Port. This use of on-chip emulation circuitry enables reliable, full-speed performance in any target. The emulator uses the same GUI as the ADSP-21020 Simulator, allowing an easy transition from software to hardware debug.

 

S3.C1.6          FEATURES

·                    Superscalar IEEE Floating-Point Processor

·                    Off-Chip Harvard Architecture Maximizes Signal Processing Performance

·                    50 ns, 20 MIPS Instruction Rate, Single-Cycle Execution

·                    60 MFLOPS Peak, 40 MFLOPS Sustained Performance

·                    1024-Point Complex FFT Benchmark: 0.96 ms

·                    Divide (y/x): 300 ns

·                    Inverse Square Root (1/√x): 450 ns

·                    32-Bit Single-Precision and 40-Bit Extended-Precision IEEE Floating-Point Data Formats

·                    32-Bit Fixed-Point Formats, Integer and Fractional, with 80-Bit Accumulators

·                    IEEE Exception Handling with Interrupt on Exception

·                    Three Independent Computation Units: Multiplier, ALU, and Barrel Shifter

·                    Dual Data Address Generators with Indirect, Immediate, Modulo, and Bit Reverse Addressing Modes

·                    Two Off-Chip Memory Transfers in Parallel with Instruction Fetch and Single-Cycle Multiply & ALU Operations

·                    Multiply with Add & Subtract for FFT Butterfly Computation

·                    Efficient Program Sequencing with Zero-Overhead Looping: Single-Cycle Loop Setup & Exit

·                    Single-Cycle Register File Context Switch

·                    35 ns External RAM Access Time For Zero-Wait-State, 50 ns Instruction Execution

·                    IEEE JTAG Standard 1149.1 Test Access Port and On-Chip Emulation Circuitry

·                    223-Pin PGA Package (Plastic and Ceramic)
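The headline figures in this list can be cross-checked against the 50 ns cycle time with a few lines of Python. This is only a consistency sketch: the reading of the peak rate as three floating-point operations per cycle (a multiply plus a combined add/subtract) and the sustained rate as two per cycle is an assumption that happens to match the numbers, and the function names are illustrative.

```python
CYCLE_NS = 50  # instruction cycle time from the feature list


def mips(cycle_ns):
    """Instructions per microsecond, i.e. MIPS, for single-cycle execution."""
    return 1000 / cycle_ns


def mflops(cycle_ns, flops_per_cycle):
    """MFLOPS if `flops_per_cycle` floating-point operations complete each cycle."""
    return mips(cycle_ns) * flops_per_cycle


def cycles(duration_ns, cycle_ns):
    """Number of instruction cycles that fit in `duration_ns`."""
    return duration_ns / cycle_ns


print(mips(CYCLE_NS))             # instruction rate in MIPS
print(mflops(CYCLE_NS, 3))        # assumed peak: multiply + add + subtract
print(mflops(CYCLE_NS, 2))        # assumed sustained: multiply + one ALU op
print(cycles(960_000, CYCLE_NS))  # cycles in the 0.96 ms 1024-point FFT
print(cycles(300, CYCLE_NS))      # cycles for the divide benchmark
print(cycles(450, CYCLE_NS))      # cycles for the inverse square root
```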


CHAPTER 2:      DSP CARD - SP 20

 

 

The SP-20, shown in Figure S3.C2.D1, is an extremely powerful circuit board assembly for personal computers utilizing the ISA bus. The SP-20 can perform complex signal processing routines as well as arbitrate activity on Signatec’s Auxiliary Bus (SAB). This power can be used to build high speed waveform capture, signal processing, and data storage systems that run virtually independent of PC operations. Orchestrating the power of the SP-20 can be accomplished with Maestro, an intuitive Windows based graphical programming tool.

 

S3.C2.D1

 

The SP-20 is a high-speed signal processing board that was designed with flexibility and ease of use in mind. The SP-20 incorporates the Analog Devices ADSP-21020 100 MFLOPS Floating Point Digital Signal Processor, which controls all operations on the board using an interrupt driven approach. The architecture supplies the user with three interfaces for transferring data to and from the SP-20. The three interfaces give graduated levels of performance that will meet a wide variety of demands.

The highest performance interface is the SAB, which can transfer data at 200 Mbytes/s. The primary data sources for this bus are high speed waveform capture (data acquisition) boards and large memory boards, while typical data receivers include waveform creation boards (DAC), data storage devices, and graphics boards.


 

The next level of performance is obtained by using the Digital I/O Interface (DIO). This interface can transfer data between the SP-20 and external devices at 25 Mbytes/s. 56 MW/Sa.

The last method is over the ISA bus of the personal computer, which gives a data transfer rate of about 2 Mbytes/s.
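To give a feel for these three performance levels, the following sketch estimates how long one block of data would take to move over each interface at the quoted peak rates. The block size (16K 32-bit words), the reading of 1 Mbyte as 10^6 bytes and the neglect of handshaking and protocol overhead are all assumptions made for illustration.

```python
# Peak rates quoted in the text, in Mbytes/s (1 Mbyte taken as 10**6 bytes).
RATES_MBYTES_PER_S = {"SAB": 200.0, "DIO": 25.0, "ISA": 2.0}


def transfer_time_ms(n_words, rate_mbytes_per_s, bytes_per_word=4):
    """Ideal time in milliseconds to move `n_words` words at the given rate."""
    n_bytes = n_words * bytes_per_word
    return n_bytes / (rate_mbytes_per_s * 1e6) * 1e3


block_words = 16 * 1024  # illustrative block of 16K 32-bit words
for name, rate in RATES_MBYTES_PER_S.items():
    print(f"{name}: {transfer_time_ms(block_words, rate):.3f} ms")
```

With these assumptions the same block takes roughly a third of a millisecond on the SAB, a few milliseconds on the DIO and tens of milliseconds on the ISA bus.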

 

S3.C2.1          HARDWARE DESCRIPTION

1.1             Theory of Operation : The SP-20 uses the ADSP-21020 DSP to control all operations on the board with an interrupt driven approach. The three interfaces shown in Figure S3.C2.D2 can interrupt the DSP with a specific hardware interrupt. After receiving the hardware interrupt the DSP reads an interrupt latch to determine what Interrupt Service Routine (ISR) should be executed for the interface.

 

S3.C2.D2

 

1.2             SAB Interface : The Signatec Auxiliary Bus (SAB) Interface is a high performance interface used to transfer data between boards independent of the PC. Data can be transferred between products such as high-speed waveform creation boards (DAC), and graphics boards. The SAB can be used by multiple SP-20 boards to communicate with each other during parallel processing applications. The SP-20 can use the SAB to control the operation of other boards on the SAB via interrupt and communication lines.

The SAB is a 64 bit bi-directional bus that implements burst transfers and handshaking transfers. The SAB Interface is designed to connect up to five boards via two 100-conductor high-density ribbon cables. The SP-20 can transfer burst data in two widths, 32 and 64 bits. The 64-bit burst mode can transfer the entire contents of memory at 200 Mbytes/s. The SP-20 drives the control lines when transmitting and responds to the control lines when receiving data.

1.3             DIO Interface : The Digital I/O (DIO) Interface is a handshaking interface that enables a device outside the PC to communicate with the SP-20. Information written to the SP-20 can be interpreted as data or a command word. The command word contains an ISR number. The SP-20 DIO can be used to exchange data between the PC and other computer boards that implement the handshaking protocol in dissimilar platforms such as VME or VXI.

The DIO has 32 bi-directional data bits and 3 bi-directional control signals and is designed to connect to a single external device. The SP-20 acts as either a transmitter or a receiver when using this interface. When designated as the Transmitter, the SP-20 drives the Data Transmitted Line. When designated as the Receiver the SP-20 drives the Data Received line. A device attached to the DIO can use the Data/Command line to indicate whether a value is data or a command word.

1.4             ISA Interface : The ISA Interface is used to download code to the SP-20 Program RAM and can be used to control all other interfaces on the SP-20 from the PC. The DSP receives the interrupt and reads the interrupt latch. The ISR# may instruct the DSP to initiate an SAB transfer, DIO transfer, or perform an operation on data stored in RAM. User defined ISR numbers allow the user to create code to mechanize an algorithm that can be called from the ISA Interface.

Data transfer between the SP-20 and the ISA bus consists of the PC reading or writing data to the SP-20 interface latches and the DSP reading or writing data from the latches. Before sending or requesting data the PC must write an ISR number to the base address register on the SP-20. The ISR number instructs the DSP which interrupt service routine to run. Since the ISA bus operation is much slower than the SP-20, the ISA Bus controls the data exchange in both directions.

The interface requires 8 bytes in the PC I/O address space and an on board dipswitch with 7 active switches is used to set the base address on 8 byte boundaries. All data operations over the ISA bus are 16 bit I/O transfers.
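The relation between the seven switches and the base address can be sketched as follows. Seven switches select one of 128 blocks of 8 bytes, which together span a 1024-byte I/O space. The switch polarity (ON taken as 1), the ordering of the switches and their mapping onto address lines are assumptions for illustration, not taken from the SP-20 documentation.

```python
BLOCK_BYTES = 8   # I/O bytes occupied by the interface
N_SWITCHES = 7    # active dipswitch positions


def base_address(switches):
    """Base I/O address for a switch setting (most significant switch first).

    Assumes ON = 1 and that the switches select consecutive address bits
    just above the 8-byte block; both are illustrative assumptions.
    """
    if len(switches) != N_SWITCHES:
        raise ValueError("expected exactly 7 switch positions")
    value = 0
    for bit in switches:
        value = (value << 1) | (1 if bit else 0)
    return value * BLOCK_BYTES


def port_range(base):
    """The 8 consecutive I/O addresses occupied from `base`."""
    return range(base, base + BLOCK_BYTES)


setting = [1, 1, 0, 0, 0, 0, 0]
base = base_address(setting)
print(hex(base), [hex(a) for a in port_range(base)])
```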

 

S3.C2.2          SOFTWARE DESCRIPTION

The SP-20 uses an operating system that consists of Interrupt Service Routines (ISR). Signatec has created a set of ISRs that allow the DSP to perform specific tasks based on a hardware interrupt and Interrupt Service Routine number. Each of the interfaces on the SP-20 will interrupt the DSP on a specific hardware interrupt.

When the PC is used to control the SP-20 an interrupt number is written to the interrupt latch on the SP-20 and the DSP is interrupted to read the latch. The DSP will perform the ISR as requested by the PC. When a user creates SP-20 DSP code the ISRs that are part of the SP-20 operating system can be used like function calls.
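The latch-and-dispatch mechanism described above can be modelled in a few lines of Python. This is a conceptual sketch only: the class, the method names and the ISR numbers are hypothetical and do not correspond to Signatec's actual ISR set or PC function library.

```python
class Sp20Model:
    """Toy model of the PC -> interrupt latch -> DSP dispatch path."""

    def __init__(self):
        self.interrupt_latch = None   # last ISR number written by the PC
        self.isr_table = {}           # ISR number -> handler (hypothetical)

    def register_isr(self, number, handler):
        """Install a handler, as user-defined ISR numbers would allow."""
        self.isr_table[number] = handler

    def pc_write_isr(self, number):
        """PC side: write the ISR number to the latch and interrupt the DSP."""
        self.interrupt_latch = number
        return self._on_hardware_interrupt()

    def _on_hardware_interrupt(self):
        """DSP side: read the latch and run the requested routine."""
        number = self.interrupt_latch
        if number not in self.isr_table:
            raise KeyError(f"no ISR registered for number {number}")
        return self.isr_table[number]()


board = Sp20Model()
board.register_isr(1, lambda: "start SAB transfer")      # illustrative numbers
board.register_isr(2, lambda: "run FFT on data in RAM")
print(board.pc_write_isr(2))
```

Treating the ISR number as a key into a table is what lets the routines be used "like function calls" from either the PC or user DSP code.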

Signatec offers a wide variety of development tools that allow the user to program the DSP on the SP-20. The development tools range from low-end PC functions, midrange Analog Devices ADSP-21020 Software Development Tools, and a high-end software development package called Maestro.

The PC functions allow the user basic control of the SP-20 at an economical price. The Analog Devices ADSP-21020 Software Development Tools allow the user total control of the DSP on the SP-20 but require the user to become familiar with the architecture of the ADSP-21020 DSP. Maestro is an intuitive Windows based programming tool that allows users to create custom application code based on a block diagram approach.

2.1             PC Functions : The SP-20 is shipped with a collection of functions written in C that greatly facilitate the development of application programs. The PC Functions are a collection of statements that are used to set up the operation of the SP-20 via the ISA bus.

When the function is executed an ISR# is written from the PC to an interrupt latch on SP-20. A hardware interrupt will cause the DSP to read the interrupt latch. The ISR# represents a function call for the DSP to execute.

The PC functions are grouped into 5 categories: Data Manipulation, File Transfer, Hardware Settings, Monitor and Signal Processing Routines. To control the operation of all interfaces on the SP-20, the user calls the appropriate PC Function in the application code.

2.2             ADSP-21020 Software Development System : The SP-20 Digital Signal Processor incorporates Analog Devices ADSP-21020 DSP. For algorithm development the ADDS-21020-SW-PC Software Development System is available. This method of code generation gives the user total control in application development but requires them to become familiar with the architecture of the ADSP-21020 DSP.

The Software Development System is a suite of tools that include an Assembler, Linker/Librarian, PROM Splitter, C compiler and Runtime Library, DSP/C Compiler and Simulator. For more information refer to the Signatec data sheet entitled “ADDS-21020-SW-PC Software Development System for the ADSP-21020 Digital Signal Processor”.

 

 

 

 

S3.C2.3          FEATURES

·                     16 bit ISA Bus Circuit Card

·                     200 MBytes/s Auxiliary Bus

·                     Analog Devices ADSP-21020, 32 bit floating point digital signal processor.

·                     32 Bit Digital I/O with handshaking

·                     128K by 48 Program Memory (standard)

·                     Up to 1M by 32 Data Memory

·                     3 Year Warranty

 

S3.C2.4          PERFORMANCE

·                     100 MFLOPS peak, 66 MFLOPS sustained using 33 MHz Clock

·                     1K Real FFT in 340 microseconds

·                     1K Complex FFT in 578 microseconds

·                     Matrix Multiply, (10x10)*(10x10) in 38 microseconds

 

S3.C2.5          APPLICATIONS

·                     Data Sums/Averaging

·                     Spectral Analysis

·                     Digital Video

·                     Parallel Processing

·                     Motion Control


CHAPTER 3:      RESULTS

 

 

Memory Requirement calculation and Execution Time analysis are the most important and crucial issues for the Real-time implementation of any algorithm on the available hardware setup.

The TASP (Time-Domain Azimuth Signal Processing) algorithm is a sample-by-sample processing algorithm and hence is less affected by memory limitations, but it involves a tricky storage and retrieval mechanism on a small set of data, with some compromise in depth of focus for a given PRF interval. The TASP algorithm has been successfully implemented on the SP-20 with a hardware-based simulated fixed data pattern.

The FASP (Frequency-Domain Azimuth Signal Processing) algorithm implemented on the SP-20 is a block-processing algorithm. It requires processing of all 8K PRF samples in the azimuth direction, whether for NSM (Narrow Swath Mode, 512 range gates) or for WSM (Wide Swath Mode, 4096 range gates). For the FASP implementation, memory is the crucial problem, rather than the available Execution Time or the multi-processor hardware structure.

Both Memory Requirements and Execution Time are calculated for a single Range Gate (RG) on the SP-20 DSP card for the FASP algorithm. These calculations help determine:

1.                   Maximum number of Range Gates that can be processed on a single SP-20 card with FASP.

2.                   Configuration of multiple SP-20 cards either for NSM (Narrow Swath Mode) or for WSM (Wide Swath Mode).

3.                   An alternative signal processing solution to the SP-20, with larger memory and more processors on board.

 

The registered outputs, with sharp peaks, for a single Range Gate using 3-look processing at different Azimuth Resolutions such as 6 m, 3 m and 1 m are shown in Figures S3.C3.D1, S3.C3.D2, S3.C3.D3 and S3.C3.D4.

 

 

 

 

 

 

 

S3.C3.D1

 

 

 

 

 

 

 

 

S3.C3.D2

 

 

 

 

 

 

 

 

S3.C3.D3

 

 

 

 

 

 

 

S3.C3.D4

 

 

The received return signals are decomposed and stored as a complex data set using the quadrature sampling technique, which halves the bandwidth and eases mathematical manipulation and processing. Each complex data value is stored as a Real (I) and an Imaginary (Q) value in two separate memory locations.

The SP-20 has two types of on-board memory banks: 1) PM (Program Memory) and 2) DM (Data Memory). The program flow follows the FASP flowchart in Chapter 5 of Section II. However, because of the I and Q data patterns and the complex swapping methods used by the FFT computation routines for optimum implementation, the received data and reference function are stored in PM and DM as shown in Table S3.C3.T1. Both I and Q channels of received data are stored in DM. For reliability and safety, some portion of memory can also be used as toggle memory banks.

 

S3.C3.1          Memory Requirement Calculations

The SP-20 offers 128K of PM and 128K of DM. DM can be extended up to a maximum of 1M. Both PM and DM are 32 bits wide.

 

Memory Requirement for a single Range Gate

Sr. No.   Type of Data                    32-bit Size in DM   32-bit Size in PM
1         redata (received real)          8K                  -
2         Imdata (received imaginary)     8K                  -
3         Refft (FFT real)                8K                  -
4         Imfft (FFT imaginary)           -                   8K
5         ref_real (reference real)       8K                  -
6         ref_img (reference imaginary)   -                   8K
7         look_r (look real)              512                 -
8         look_I (look imaginary)         512                 -
9         Reg (registered looks)          512                 -
          Total                           33.5K               16K

S3.C3.T1
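The totals in Table S3.C3.T1 can be checked with a short sketch such as the one below. The buffer names and sizes are copied from the table; the list layout and the helper `bank_total_k` are illustrative assumptions, and K is taken as 1024 words.

```python
K = 1024  # one "K" of 32-bit words

# (buffer name, size in words, memory bank) as listed in Table S3.C3.T1
BUFFERS = [
    ("redata",   8 * K, "DM"),
    ("Imdata",   8 * K, "DM"),
    ("Refft",    8 * K, "DM"),
    ("Imfft",    8 * K, "PM"),
    ("ref_real", 8 * K, "DM"),
    ("ref_img",  8 * K, "PM"),
    ("look_r",   512,   "DM"),
    ("look_I",   512,   "DM"),
    ("Reg",      512,   "DM"),
]


def bank_total_k(bank):
    """Total words used in one memory bank, expressed in K."""
    return sum(size for _, size, b in BUFFERS if b == bank) / K


dm_total_k = bank_total_k("DM")  # four 8K buffers plus three 512-word buffers
pm_total_k = bank_total_k("PM")  # two 8K buffers
```

The four 8K buffers and three 512-word buffers in DM add up to 33.5K, and the two 8K buffers in PM add up to 16K, matching the last row of the table.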

Allowing some tolerance for the program itself, the stack, the system library, RCMC and other storage, the estimated memory requirement for processing a single Range Gate is 40K in DM and 20K in PM.

·                     Out of the 128K of DM, if we assume 40K for runtime processing and 50K-60K for the stack and library, only 28K-38K is available for received data storage; roughly, 32K. From Table S3.C3.T1, the real and imaginary parts of the received data together occupy 16K in DM for a single Range Gate. Hence only (28K-38K)/16K ≈ 2 RGs can be processed with a single SP-20 if FASP is used.

·                     If DM is extended to its maximum of 1M, then (1024K - 100K)/16K = 924K/16K ≈ 57 RGs can be processed with a single SP-20 card.

·                     Based on these calculations, the memory and SP-20 configurations needed for various numbers of Range Gates are listed below:

             256 RG                      →        256·16K = 4M ≈ 4 SP-20 cards

             512 RG (NSM)          →        512·16K = 8M ≈ 8 SP-20 cards

             4096 RG (WSM)       →        4K·16K = 64M ≈ 64 SP-20 cards

·                     In fact, NSM or WSM using FASP is not possible with the SP-20, because no SAB-type parallel bus structure is available to support configurations of more than two SP-20 cards.
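The memory arithmetic above can be reproduced with a minimal sketch like the following. It assumes K = 1024 words, takes about 96K as the reserved portion of the standard 128K DM (so that roughly 32K remains, as estimated in the text), and counts storage at 16K words per Range Gate. The function names are hypothetical.

```python
K = 1024
WORDS_PER_RG = 16 * K  # I and Q received data for one range gate (Table S3.C3.T1)


def rgs_per_card(dm_words, reserved_words):
    """Whole range gates whose received data fit in the DM left after overhead."""
    return (dm_words - reserved_words) // WORDS_PER_RG


def dm_needed_in_m(range_gates):
    """Received-data storage for a swath, in M words (1M = 1024K)."""
    return range_gates * WORDS_PER_RG / (K * K)


# Standard 128K DM with about 96K reserved leaves roughly 32K for data.
standard = rgs_per_card(128 * K, 96 * K)
# DM extended to 1M with about 100K reserved leaves 924K for data.
extended = rgs_per_card(1024 * K, 100 * K)

for name, gates in [("256 RG", 256), ("NSM", 512), ("WSM", 4096)]:
    print(f"{name}: {dm_needed_in_m(gates):.0f}M of DM")
```

With these assumptions `standard` is 2 and `extended` is 57, and the loop reports 4M, 8M and 64M of DM, which the text equates to 4, 8 and 64 cards at about 1M of DM per card.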

 

S3.C3.2          Execution Time Analysis

At a PRF of 500 Hz (PRI = 2 ms), collecting the 8K (8192) data points of each Range Gate takes about 16 seconds, which is the maximum processing time available for a given number of Range Gates. The Execution Time analysis determines the time taken by the ADSP-21020 DSP processor of the SP-20 for a single Range Gate at the specified Azimuth Resolution, i.e. 6 m, and the maximum number of Range Gates that can be processed within the 8K PRF time. The number of SP-20 cards suggested for NSM or WSM is based on this derived number.
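The 16-second budget follows directly from the PRF and the block length, as the short sketch below shows. The variable names are illustrative only; the values are those quoted in the text.

```python
PRF_HZ = 500          # pulse repetition frequency from the text
BLOCK_PULSES = 8192   # 8K azimuth samples per range gate

pri_s = 1.0 / PRF_HZ                 # pulse repetition interval in seconds
block_time_s = BLOCK_PULSES * pri_s  # time needed to collect one 8K block
```

The exact figure is 16.384 s, which the analysis rounds down to 16 s; rounding down is the conservative choice for a processing deadline.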

Execution Time analysis for a single Range Gate at Azimuth Resolutions of 6 m, 3 m and 1 m is shown in Tables S3.C3.T2, S3.C3.T3 and S3.C3.T4. (Unfortunately these tables are permanently corrupted; only the printed versions in the thesis survive.)

 

Adding all tolerances and keeping a sufficient margin, the overall Execution Time for a single Range Gate on the ADSP-21020 at 33 MHz is taken as 50 ms for the selected values of Azimuth Resolution.

·                     The maximum number of Range Gates processed by a single processor is 16 s/50 ms ≈ 320 RGs.

·                     With a suitable safety factor, this is taken as 256 RGs per processor.

·                     For processing NSM and WSM, keeping the memory constraint of the SP-20 aside:

                        NSM requires a single SP-20 card and

                        WSM requires 4 SP-20 cards.

·                     One or more additional SP-20 cards can be used as needed for gathering and routing the received data to the SP-20 card configuration and for displaying the processed data, for either NSM or WSM.
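The per-processor figures above can be sketched as follows, using integer milliseconds to avoid floating-point rounding. The 16 s budget, the 50 ms per Range Gate and the derated value of 256 RGs are taken from the text; the variable names and the utilisation figure are illustrative additions.

```python
BUDGET_MS = 16_000   # time available per 8K block (16 s, from the text)
TIME_PER_RG_MS = 50  # Execution Time assumed per range gate on one ADSP-21020

max_rgs = BUDGET_MS // TIME_PER_RG_MS  # upper bound on range gates per processor
chosen_rgs = 256                       # derated figure adopted in the text

busy_ms = chosen_rgs * TIME_PER_RG_MS  # processing time actually used
utilisation = busy_ms / BUDGET_MS      # fraction of the time budget consumed
```

Under these assumptions the upper bound is 320 RGs, and processing 256 RGs uses 12.8 s of the 16 s budget, so about 20% of the time is left as margin.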

 

S3.C3.3          CONCLUSION & FUTURE PATH

            It is observed from the results of S3.C3.1 and S3.C3.2 that the optimum utilization of the SP-20 is compromised by the trade-off between Memory Requirement and Execution Time.

·        It can be concluded that it is the Memory Requirement, not the Execution Time, that demands most of the effort in configuration design for the Real-time implementation of the FASP algorithm on the SP-20.

·        The maximum number of RGs that can be processed is limited to 57 by memory, whereas Execution Time would allow 320.

·        The computation power of the dual ADSP-21020 processors on a single SP-20 card could be utilized more effectively by increasing DM beyond its maximum extendable size of 1M.

·        With a DM size of 4M, 256 RGs can be processed by a single SP-20 card; NSM is then possible with 2 SP-20 cards and WSM with 16 SP-20 cards.

·        A large increase in DM size is not practical because of the huge price tag associated with it.

·        Besides the increase in the cost of the DSP card with DM size, the lack of a parallel interface bus structure (like SAB) for multiple SP-20 configurations is also a big limitation for the Real-time implementation of FASP.
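The comparison of the two limits, and the 4M case, can be summarised in a brief sketch. It reuses the figures from S3.C3.1 and S3.C3.2 (16K words of received data per Range Gate, about 100K of DM reserved, 320 RGs allowed by Execution Time) and, like the text, ignores overhead for the 4M case. The function names are hypothetical.

```python
K = 1024
WORDS_PER_RG = 16 * K  # received I and Q data per range gate
TIME_LIMIT_RGS = 320   # bound from the Execution Time analysis


def memory_limit_rgs(dm_k, reserved_k=0):
    """Range gates per card allowed by DM size (both arguments in K words)."""
    return (dm_k - reserved_k) * K // WORDS_PER_RG


def cards_needed(range_gates, rgs_per_card):
    """Cards needed for a swath, rounding up to whole cards."""
    return -(-range_gates // rgs_per_card)


# 1M DM with about 100K reserved: memory, not time, is the binding limit.
limit_1m = min(memory_limit_rgs(1024, 100), TIME_LIMIT_RGS)

# 4M DM, ignoring overhead as the text does.
per_card_4m = memory_limit_rgs(4096)
nsm_cards = cards_needed(512, per_card_4m)
wsm_cards = cards_needed(4096, per_card_4m)
```

With 1M of DM the binding limit is the memory figure of 57 RGs; with 4M the sketch gives 256 RGs per card, 2 cards for NSM and 16 cards for WSM, in line with the figures above.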

 

After many investigations, calculations and trial runs on the SP-20, it is clear that a feasible and practical Real-time implementation of FASP demands a search for more powerful and flexible DSP processors with large on-board memory, such as the ADSP 21022 and ADSP 2106X (SHARC), together with a study of their architecture, exploration of design and configuration options, and calculations with trial runs.

A future path for the Real-time implementation of FASP may be the use of the Morocco board with octal Analog Devices SHARC (ADSP 2106X) DSPs. A review sheet of the Morocco board with its important features is attached as Appendix B for further research.