BRIEF
There are basically two ways to process a large volume of SAR data:
1. Post-mission Processing
2. Real-time Processing
The first option was adopted for several years because of technological limitations in data storage and execution speed. In post-mission processing, the entire volume of data collected along the flight path is stored on magnetic tape and processed at a ground base. With current post-processing technology, the final result (image) is of excellent quality. However, some defense and military applications, as well as crucial civil applications, require decision-making at the right moment and with sufficient accuracy from SAR Remote Sensing.
Because of such application requirements, and with developments in technology and algorithms for SAR imaging, the term "Real-time Processing" has attracted attention. Efforts in hardware and algorithms are being made to come as close as possible to Real-time Processing. In the true sense, Real-time Signal Processing means sample-by-sample processing with an output at the same instant as the input.
Because a large volume of raw data is associated with SAR imaging, Real-time Processing is possible only with some time-factor scaling. Even with today's state-of-the-art technology, only Near-Real-time Processing is achievable. The processing load increases drastically with a small improvement in Azimuth Resolution. High-resolution imaging needs a large bandwidth, and in turn a huge mass of data has to be processed. Furthermore, multi-look processing for speckle reduction multiplies the processing load by the number of looks. The choice of narrow swath or wide swath mode for a specified area coverage also causes a large variation in the number of range gates per look of the image.
Two important issues with Real-time processing are:
1. Memory Requirement
2. Execution Time
Redundant processing steps in a high-resolution algorithm increase Execution Time; at the same time, high resolution demands that a large amount of data be stored for block-processing algorithms, and hence a huge Memory Requirement.
Real-time processing of SAR data, either on board or at a ground base, requires special processors with fast computational capabilities. Such processors are known as DSP processors. Most of the time, a multi-processor environment with complex data routing from master to slave or from processor to processor is deployed.
DSP processors are categorized mainly into 1) Generic Signal Processors, 2) Special Purpose Signal Processors and 3) Programmable Signal Processors. For reasons of flexibility, easy-to-use development tools, better hardware support and interfaces, and multi-processing ability, Programmable Signal Processors such as the Analog Devices ADSP-21XXX are generally preferred for radar signal processing and SAR imaging.
This Section is divided into three chapters.
1. DSP Microprocessor ADSP-21020, which gives an architectural overview of the processor, its features and its suitability for the work carried out.
2. DSP Card (SP-20), which presents the details of a signal processing solution card with dual ADSP-21020 processors, memory and other hardware interfaces. This card was utilized for studying the Real-time implementation aspects of the FASP (Frequency-Domain Azimuth Signal Processing) algorithm.
3. Results, which documents two critical issues, 1) Memory Requirement and 2) Execution Time, for the Real-time implementation of the FASP algorithm.
The ADSP-21020 is the first member of Analog Devices’ family of single-chip, programmable, IEEE floating-point processors optimized for digital signal processing applications. Its architecture is similar to that of Analog Devices’ ADSP-2100 family of fixed-point DSP processors.
Fabricated in a high-speed, 1.0 micron, low-power CMOS process, the ADSP-21020 has a 50 ns instruction cycle time. A 40 ns, 25 MIPS version is planned for availability in 1992. With a high-performance on-chip instruction cache, the ADSP-21020 can execute every instruction in a single cycle.
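As a quick check of the figures above, the instruction rate follows directly from the cycle time when every instruction completes in a single cycle. The sketch below only restates that arithmetic; the helper name is illustrative and does not belong to any Analog Devices tool.

```python
def mips_from_cycle_ns(cycle_ns: float) -> float:
    """Return millions of instructions per second for a core that
    completes one instruction per cycle of `cycle_ns` nanoseconds."""
    # 1 / (cycle_ns * 1e-9 s) instructions per second, expressed in millions.
    return 1000.0 / cycle_ns


if __name__ == "__main__":
    for cycle_ns in (50.0, 40.0):
        print(f"{cycle_ns:.0f} ns cycle -> {mips_from_cycle_ns(cycle_ns):.0f} MIPS")
```

The 50 ns part therefore corresponds to 20 MIPS, and the planned 40 ns part to the quoted 25 MIPS.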
S3.C1.2 ADSP - 21020 Features
2.1
Independent Parallel Computation Units :
The arithmetic/logic unit (ALU), multiplier and shifter perform single-cycle
instructions. The units are architecturally arranged in parallel, maximizing
computational throughput. A single multifunction instruction executes parallel
ALU and multiplier operations. These computation units support IEEE 32-bit
single-precision floating-point, and 32-bit
fixed-point data formats.
2.2
Data Register File : A
general-purpose data register file is used for transferring data between the
computation units and the data buses, and for storing intermediate results.
This 10-port (16-register) register file, combined with the ADSP-21020’s
Harvard architecture, allows unconstrained data flow between computation units
and off-chip memory.
2.3
Single-Cycle Fetch of Instruction and
Two Operands :
The ADSP-21020 uses a modified Harvard architecture in which data memory stores
data and program memory stores both instructions and data. Because of its
separate program and data memory buses and on-chip instruction cache, the
processor can simultaneously fetch an operand from data memory, an operand from
program memory, and an instruction from the cache, all in a single cycle.
2.4
Memory Interface :
Addressing of external memory devices by the ADSP-21020 is facilitated by
on-chip decoding of high-order address lines to generate memory bank select
signals. Separate control lines are also generated for simplified addressing of
page-mode DRAM. The ADSP-21020 provides programmable memory wait states, and external memory acknowledge
controls allow interfacing to peripheral devices with variable access times.
2.5
Instruction Cache :
The ADSP-21020 includes a high performance instruction cache that enables
three-bus operation for fetching an instruction and two data values. The cache
is selective: only the instructions whose fetches conflict with program memory
data accesses are cached. This allows full-speed execution of core, looped
operations such as digital filter multiply-accumulates and FFT butterfly
processing.
2.6
Hardware Circular Buffers :
The ADSP-21020 provides hardware to implement circular buffers in memory, which
are common in digital filters and Fourier transform implementations. It handles
address pointer wraparound, reducing overhead, thereby increasing performance
and simplifying implementation. Circular buffers can start and end at any location.
2.7
Flexible Instruction Set :
The ADSP-21020’s 48-bit instruction word accommodates a variety of parallel
operations, for concise programming. For example, the ADSP-21020 can
conditionally execute a multiply, an add, a subtract
and a branch in a single instruction.
S3.C1.3 ARCHITECTURE OVERVIEW
Figure S3.C1.B1 shows a block diagram of the ADSP-21020. The processor features:
·
Three Computation Units (ALU, Multiplier,
and Shifter) with a Shared Data Register File
·
Two Data Address Generators (DAG 1, DAG
2)
·
Program Sequencer with Instruction
Cache
·
32-bit Timer
·
Memory Buses and Interface
With a large number of buses connecting the registers to the computation units, data
flow between computation units and from/to off-chip memory is unconstrained and
free from bottlenecks. The 10-port register file and Harvard architecture of
the ADSP-21020 allow the following nine data transfers to be performed every
cycle:
·
Off-chip read/write of two operands to
or from the register file
·
Two operands supplied to the ALU
·
Two operands supplied to the multiplier
·
Two results received from the ALU and
multiplier (three, if the ALU operation is a combined addition / subtraction)
The processor’s 48-bit orthogonal instruction word supports fully parallel data transfer and arithmetic operations in the same instruction.
3.3
Address Generators and Program Sequencer :
Two dedicated address generators and a program sequencer supply addresses for
memory accesses. Because of this, the computation units need never be used to
calculate addresses. Because of its instruction cache, the ADSP-21020 can
simultaneously fetch an instruction and data values from both off-chip program
memory and off-chip data memory in a single cycle.
The
data address generators (DAGs) provide memory
addresses when external memory data is transferred over the parallel memory
ports to or from internal registers. Dual data address generators enable the
processor to output two simultaneous addresses for dual operand reads and
writes. DAG 1 supplies 32-bit addresses to data memory. DAG 2 supplies 24-bit
addresses to program memory for program memory data accesses.
Each DAG keeps track of up to eight address pointers, eight modifiers, eight buffer length values and eight base values. A pointer used for indirect addressing can be modified by a value in a specified register, either before (pre-modify) or after (post-modify) the access. To implement automatic modulo addressing for circular buffers, the ADSP-21020 provides buffer length registers that can be associated with each pointer. Base values for pointers allow circular buffers to be placed at arbitrary locations. Each DAG register has an alternate register that can be activated for fast context switching.
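The modulo addressing described above can be illustrated with a small software model. This is only a sketch of the idea (pointer, modifier, length and base values), not the exact hardware update rule of the ADSP-21020; the class and method names are hypothetical.

```python
class CircularPointer:
    """Software model of one DAG pointer set: base, length and index.

    Assumption: the index always stays inside [base, base + length).
    """

    def __init__(self, base: int, length: int) -> None:
        if length <= 0:
            raise ValueError("buffer length must be positive")
        self.base = base
        self.length = length
        self.index = base

    def post_modify(self, modifier: int) -> int:
        """Return the current address, then advance it by `modifier`
        with wraparound inside the circular buffer."""
        address = self.index
        offset = (self.index - self.base + modifier) % self.length
        self.index = self.base + offset
        return address


if __name__ == "__main__":
    pointer = CircularPointer(base=0x100, length=4)
    # Walk past the end of a 4-word buffer; the address wraps to the base.
    print([hex(pointer.post_modify(1)) for _ in range(6)])
```

Because the wraparound is folded into the address update, the loop body needs no explicit compare-and-reset step, which is the overhead the hardware removes.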
The
program sequencer supplies instruction addresses to program memory. It controls
loop iterations and evaluates conditional instructions. To execute looped code
with zero overhead, the ADSP-21020 maintains an internal loop counter and loop
stack. No explicit jump or decrement instructions are required to maintain the
loop.
The
ADSP-21020 derives its high clock rate from pipelined fetch, decode and execute
cycles. Approximately 70% of the machine cycle is available for memory
accesses; consequently, ADSP-21020 systems can be built using slower and
therefore less expensive memory chips.
3.4
Instruction Cache :
The program sequencer includes a high performance, selective instruction cache
that enables three-bus operation for fetching an instruction and two data
values. This two-way, set associative cache holds 32 instructions. The cache is
selective: only the instructions whose fetches conflict with program memory data
accesses are cached, so the ADSP-21020 can perform a program memory data access
and can execute the corresponding instruction in the same cycle. The program
sequencer fetches the instruction from the cache instead of from program
memory, enabling the ADSP-21020 to simultaneously access data in both program
memory and data memory.
3.5
Context Switching :
Many of the ADSP-21020’s registers have alternate register sets that can be
activated during interrupt servicing to facilitate a fast context switch. The
data registers in the register file, DAG registers and the multiplier result
register all have alternate sets. Registers active at reset are called primary
registers; the others are called alternate registers. Bits in the MODE1 control
register determine which registers are active at any particular time.
The
primary/alternate select bits for each half of the register file (top eight or
bottom eight registers) are independent. Likewise, the top four and bottom four
register sets in each DAG have independent primary / alternate select bits.
This scheme allows passing of data between contexts.
3.6
Interrupts :
The ADSP-21020 has four external hardware interrupts, nine internally generated
interrupts, and eight software interrupts. For the external interrupts and the
internal timer interrupts, the ADSP-21020 automatically stacks the arithmetic
status and mode (MODE1) registers when servicing the interrupts, allowing five
nesting levels of fast service for these interrupts.
An
interrupt can occur at any time while the ADSP-21020 is executing a program.
Internal events that generate interrupts include arithmetic exceptions, which
allow for fast trap handling and recovery.
3.7
Timer :
The programmable interval timer provides periodic interrupt generation. When
enabled, the timer decrements a 32-bit count register
every cycle. When this count register reaches zero, the ADSP-21020 generates an
interrupt and asserts its TIMEXP output. The count register is automatically
reloaded from a 32-bit period register and the count resumes immediately.
The
ADSP-21020 implements the boundary scan testing provisions specified by IEEE
Standard 1149.1 of the Joint Testing Action Group (JTAG). The ADSP-21020’s test
access port and on-chip JTAG circuitry is fully compliant with the IEEE 1149.1
specification. The test access port enables boundary scan testing of circuitry
connected to the ADSP-21020’s I/O pins.
The
ADSP-21020 also implements on-chip emulation through the JTAG test access port.
The processor’s eight sets of break-point range registers enable program
execution at full speed until reaching a desired breakpoint address range. The processor can then halt and allow reading/writing of all the
processor’s internal registers and external memories through the JTAG port.
S3.C1.5 DEVELOPMENT SYSTEM
The ADSP-21020 is supported with a complete set of software and hardware development tools. The ADSP-21020 Development System includes development software, an evaluation board and an in-circuit emulator.
5.1
Assembler :
Creates relocatable, COFF (Common Object File Format) object files from
ADSP-21XXX assembly source code. It accepts standard C preprocessor directives
for conditional assembly and macro processing. The algebraic syntax of the
ADSP-21XXX assembly language facilitates coding and debugging of DSP
algorithms.
5.2
Linker/Librarian :
The Linker processes separately assembled object files and library files to
create a single executable program. It assigns memory locations to code and to
data in accordance with a user-defined architecture file that describes the
memory and I/O configuration of the target system. The Librarian allows you to
group frequently used object files into a single library file that can be
linked with your main program.
5.3
Simulator :
The simulator performs interactive, instruction-level simulation of ADSP-21XXX
code within the hardware configuration described by a system architecture file.
It flags illegal operations and supports full symbolic disassembly. It provides
an easy-to-use, window oriented, GUI that is identical to the one used by the
ADSP-21020 EZ-ICE Emulator. Commands are accessed from pull-down menus with a
mouse.
5.4
PROM Splitter :
Formats an executable file into files that can be used with an
industry-standard PROM programmer.
5.5
C Compiler and Runtime Library :
The C Compiler complies with ANSI specifications. It takes advantage of the
ADSP-21020’s high-level language architectural features and incorporates
optimizing algorithms to speed up the execution of code. It includes an
extensive runtime library with over 100 standard and DSP-specific functions.
5.6
C source Level Debugger : A
full-featured C source level debugger that works with the EZ-ICE emulator to
allow debugging of assembler source, C source, or mixed assembler and C.
5.7
DSP/C Compiler :
Supports ANSI Standard Numerical C as defined by the Numeric C Extension Group.
The DSP/C Compiler accepts C source input containing Numerical C extensions for
array selection, vector math operations, complex data types, circular
operations, and variably dimensioned arrays, and outputs ADSP-21020 assembly
language source code.
5.8
EZ-ICE Emulator :
This in-circuit emulator provides the system designer with a PC-based development
environment that allows nonintrusive access to the ADSP-21020’s internal
registers through the processor’s 5-pin JTAG Test Access Port. This use of
on-chip emulation circuitry enables reliable, full-speed performance in any
target. The emulator uses the same GUI as the ADSP-21020 Simulator, allowing an
easy transition from software to hardware debug.
S3.C1.6 FEATURES
·
Superscalar IEEE Floating-Point
Processor
·
Off-Chip Harvard Architecture Maximizes
Signal Processing Performance
·
50 ns, 20 MIPS Instruction Rate,
Single-Cycle Execution
·
60 MFLOPS Peak, 40 MFLOPS Sustained
Performance
·
1024-Point Complex FFT Benchmark: 0.96
ms
·
Divide (y/x): 300 ns
·
Inverse Square Root (1/√x): 450 ns
·
32-Bit Single-Precision and 40-Bit
Extended-Precision IEEE Floating-Point Data Formats
·
32-Bit Fixed-Point Formats, Integer and
Fractional, with 80-Bit Accumulators
·
IEEE Exception Handling with Interrupt
on Exception
·
Three Independent Computation Units:
Multiplier, ALU, and Barrel Shifter
·
Dual Data Address Generators with
Indirect, Immediate, Modulo, and Bit Reverse Addressing Modes
·
Two Off-Chip Memory Transfers in
Parallel with Instruction Fetch and Single-Cycle Multiply & ALU Operations
·
Multiply with Add & Subtract for
FFT Butterfly Computation
·
Efficient Program Sequencing with Zero-Overhead
Looping: Single-Cycle
·
Single-Cycle Register File Context
Switch
·
35 ns External RAM Access Time For
Zero-Wait-State, 50 ns Instruction Execution
·
IEEE JTAG Standard 1149.1
·
223-Pin PGA Package (Plastic and
Ceramic)
The SP-20 is an extremely powerful circuit board assembly, as shown in Figure S3.C2.D1, for personal computers utilizing the ISA bus. The SP-20 can perform complex signal processing routines as well as arbitrate activity on Signatec’s Auxiliary Bus (SAB). This power can be unleashed to mechanize high-speed waveform capture, signal processing, and data storage systems that run virtually independent of PC operations. Orchestrating the power of the SP-20 can be accomplished with Maestro, an intuitive Windows-based graphical programming tool.
S3.C2.D1
The SP-20 is a high-speed signal processing board that was designed with flexibility and ease of use in mind. The SP-20 incorporates the Analog Devices ADSP-21020 100 MFLOP Floating Point Digital Signal Processor, which controls all operations on the board using an interrupt driven approach. The architecture supplies the user with three interfaces for transferring data to and from the SP-20. The three interfaces give graduated levels of performance that will meet a wide variety of demands.
The highest performance interface is the SAB, which can transfer data at 200 Mbytes/s. The primary data sources for this bus are high-speed waveform capture (data acquisition) boards and large memory boards, while typical data receivers include waveform creation boards (DAC), data storage devices, and graphics boards.
The
next level of performance is obtained by using the Digital I/O Interface (DIO).
This interface can transfer data between the SP-20 and external devices at 25
Mbytes/s. 56 MW/Sa.
The
last method is over the ISA bus of the personal computer, which gives a data
transfer rate of about 2 Mbytes/s.
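To put the three transfer rates in perspective, the following sketch estimates how long each interface would need to move the received data of one range gate (16K 32-bit words, per Table S3.C3.T1). It assumes that "Mbytes" means 10^6 bytes and ignores protocol and handshaking overhead, so the numbers are optimistic lower bounds; the names are illustrative.

```python
# Peak rates quoted in the text, in Mbytes/s (assumed to mean 1e6 bytes/s).
RATES_MBYTES_PER_S = {"SAB": 200.0, "DIO": 25.0, "ISA": 2.0}


def transfer_time_ms(num_words: int, rate_mbytes_per_s: float,
                     bytes_per_word: int = 4) -> float:
    """Ideal time in milliseconds to move `num_words` words at the given rate."""
    payload_bytes = num_words * bytes_per_word
    return payload_bytes / (rate_mbytes_per_s * 1e6) * 1e3


if __name__ == "__main__":
    words_per_range_gate = 16 * 1024  # I and Q received data for one range gate
    for name, rate in RATES_MBYTES_PER_S.items():
        print(f"{name}: {transfer_time_ms(words_per_range_gate, rate):.3f} ms")
```

Under these assumptions the ISA bus needs on the order of tens of milliseconds per range gate, while the SAB needs well under one millisecond.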
1.1
Theory of Operation :
The SP-20 uses the ADSP-21020 DSP to control all operations on the board with
an interrupt driven approach. The three interfaces shown in Figure S3.C2.D2 can
interrupt the DSP with a specific hardware interrupt. After receiving the
hardware interrupt the DSP reads an interrupt latch to determine what Interrupt
Service Routine (ISR) should be executed for the interface.
S3.C2.D2
1.2
SAB Interface : The
Signatec Auxiliary Bus (SAB) Interface is a high
performance interface used to transfer data between boards independent of the
PC. Data can be transferred between products such as high-speed waveform
creation boards (DAC), and graphics boards. The SAB can be used by multiple
SP-20 boards to communicate with each other during parallel processing
applications. The SP-20 can use the SAB to control the operation of other
boards on the SAB via interrupt and communication lines.
The SAB is a 64 bit bi-directional bus that implements burst transfers and handshaking transfers. The SAB Interface is designed to connect up to five boards via two 100-conductor high-density ribbon cables. The SP-20 can transfer burst data in two widths, 32 and 64 bits. The 64-bit burst mode can transfer the entire contents of memory at 200 Mbytes/s. The SP-20 drives the control lines when transmitting and responds to the control lines when receiving data.
1.3
DIO Interface :
The Digital I/O (DIO) Interface is a handshaking interface that enables a
device outside the PC to communicate with the SP-20. Information written to the
SP-20 can be interpreted as data or a command word. The command word will
contain an ISR number. The SP-20 DIO can be used to exchange data between the
PC and other computer boards that implement the handshaking protocol in
dissimilar platforms such as VME or VXI.
The DIO has 32 bi-directional data bits and 3 bi-directional control signals and is designed to connect to a single external device. The SP-20 acts as either a transmitter or a receiver when using this interface. When designated as the Transmitter, the SP-20 drives the Data Transmitted Line. When designated as the Receiver the SP-20 drives the Data Received line. A device attached to the DIO can use the Data/Command line to indicate whether a value is data or a command word.
1.4
ISA Interface :
The ISA Interface is used to download code to the SP-20 Program RAM and can be
used to control all other interfaces on the SP-20 from the PC. The DSP receives
the interrupt and reads the interrupt latch. The ISR# may instruct the DSP to
initiate an SAB transfer, DIO transfer, or perform an operation on data stored
in RAM. User defined ISR numbers allow the user to create code to mechanize an
algorithm that can be called from the ISA Interface.
Data
transfer between the SP-20 and the ISA bus consists of the PC reading or
writing data to the SP-20 interface latches and the DSP reading or writing data
from the latches. Before sending or requesting data the PC must write an ISR
number to the base address register on the SP-20. The ISR number instructs the
DSP which interrupt service routine to run. Since the ISA bus operation is much
slower than the SP-20, the ISA Bus controls the data exchange in both
directions.
The interface requires 8 bytes in the PC I/O address space, and an on-board DIP switch with 7 active switches is used to set the base address on 8-byte boundaries. All data operations over the ISA bus are 16-bit I/O transfers.
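The base address selection can be sketched as follows. Seven switches on 8-byte boundaries cover 128 × 8 = 1024 I/O addresses. The mapping used here (switch value taken directly as address bits A3 to A9, with a closed switch read as 1) is an assumption for illustration only; the actual switch polarity and bit order must be taken from the SP-20 manual.

```python
def isa_base_address(switch_value: int) -> int:
    """Map a 7-bit DIP switch setting to an I/O base address on an
    8-byte boundary (assumed mapping: switches drive address bits A3..A9)."""
    if not 0 <= switch_value <= 0x7F:
        raise ValueError("a 7-switch bank encodes values 0..127")
    return switch_value << 3


def register_window(base: int) -> range:
    """The 8 consecutive byte addresses occupied by the interface."""
    return range(base, base + 8)


if __name__ == "__main__":
    base = isa_base_address(0x60)
    window = register_window(base)
    print(hex(base), hex(window[0]), hex(window[-1]))
```

With this assumed mapping, a switch value of 0x60 selects base address 0x300 and the interface occupies 0x300 to 0x307.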
S3.C2.2 SOFTWARE DESCRIPTION
The
SP-20 uses an operating system that consists of Interrupt Service Routines
(ISR). Signatec has created a set of ISRs that allow the DSP to perform specific tasks based on
a hardware interrupt and Interrupt Service Routine number. Each of the
interfaces on the SP-20 will interrupt the DSP on a specific hardware
interrupt.
When
the PC is used to control the SP-20 an interrupt number is written to the
interrupt latch on the SP-20 and the DSP is interrupted to read the latch. The
DSP will perform the ISR as requested by the PC. When a user creates SP-20 DSP
code the ISRs that are part of the SP-20 operating
system can be used like function calls.
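The interrupt-latch mechanism can be pictured with a small host-side model. This is a sketch of the control flow only (write an ISR number to the latch, interrupt the DSP, look up and run the routine); the class, the ISR numbers and the routines are hypothetical and are not Signatec's actual ISR set or API.

```python
from typing import Callable, Dict, Optional


class Sp20Model:
    """Toy model of the SP-20 interrupt latch and ISR dispatch."""

    def __init__(self) -> None:
        self.interrupt_latch: Optional[int] = None
        self.isr_table: Dict[int, Callable[[], str]] = {}

    def register_isr(self, number: int, routine: Callable[[], str]) -> None:
        """Associate an ISR number with a routine (user-defined ISR)."""
        self.isr_table[number] = routine

    def pc_write_isr_number(self, number: int) -> str:
        """PC side: write the ISR number to the latch, then interrupt the DSP."""
        self.interrupt_latch = number
        return self._hardware_interrupt()

    def _hardware_interrupt(self) -> str:
        """DSP side: read the latch and run the requested routine."""
        number = self.interrupt_latch
        if number not in self.isr_table:
            raise KeyError(f"no ISR registered for number {number}")
        return self.isr_table[number]()


if __name__ == "__main__":
    board = Sp20Model()
    board.register_isr(1, lambda: "start SAB transfer")   # hypothetical number
    board.register_isr(2, lambda: "run FFT on data RAM")  # hypothetical number
    print(board.pc_write_isr_number(2))
```

Seen this way, each ISR number behaves like a function identifier, which is why the ISRs can be used like function calls from user code.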
Signatec offers a wide variety of
development tools that allow the user to program the DSP on the SP-20. The
development tools range from low-end PC functions, midrange Analog Devices
ADSP-21020 Software Development Tools, and a high-end software development
package called Maestro.
The PC functions allow the user basic control of the SP-20 at an economical price. The Analog Devices ADSP-21020 Software Development Tools allow the user total control of the DSP on the SP-20 but require the user to become familiar with the architecture of the ADSP-21020 DSP. Maestro is an intuitive Windows-based programming tool that allows users to create custom application code based on a block diagram approach.
2.1
PC Functions :
The SP-20 is shipped with a collection of functions written in C that greatly
facilitate the development of application programs. The PC Functions are a
collection of statements that are used to set up the operation of the SP-20 via
the ISA bus.
When the function is executed an ISR# is written from the PC to an interrupt latch on SP-20. A hardware interrupt will cause the DSP to read the interrupt latch. The ISR# represents a function call for the DSP to execute.
The
PC functions are grouped into 5 categories; Data Manipulation, File Transfer,
Hardware Settings, Monitor and Signal Processing Routines. To control the
operation of all interfaces on the SP-20 the user calls the appropriate PC
Function in their application code.
2.2
ADSP-21020 Software Development System :
The SP-20 Digital Signal Processor incorporates Analog Devices ADSP-21020 DSP.
For algorithm development the ADDS-21020-SW-PC Software Development System is
available. This method of code generation gives the user total control in
application development but requires them to become familiar with the architecture
of the ADSP-21020 DSP.
The Software Development System is a suite of tools that include an Assembler, Linker/Librarian, PROM Splitter, C compiler and Runtime Library, DSP/C Compiler and Simulator. For more information refer to the Signatec data sheet entitled “ADDS-21020-SW-PC Software Development System for the ADSP-21020 Digital Signal Processor”.
S3.C2.3 FEATURES
·
16 bit ISA Bus Circuit Card
·
200 MBytes/s
Auxiliary Bus
·
Analog Devices ADSP-21020, 32 bit
floating point digital signal processor.
·
32 Bit Digital I/O with handshaking
·
128K by 48 Program Memory (standard)
·
Up to 1M by 32 Data Memory
·
3 Year Warranty
S3.C2.4 PERFORMANCE
·
100 MFLOPS peak, 66 MFLOPS sustained
using 33 MHz Clock
·
1K Real FFT in 340 microseconds
·
1K Complex FFT in 578 microseconds
·
Matrix Multiply, (10x10)*(10x10) in 38
microseconds
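As a plausibility check, a benchmark time can be converted to a cycle count by multiplying it by the clock rate. The sketch below compares the 1K complex FFT figure quoted here (578 microseconds at 33 MHz) with the one in the ADSP-21020 feature list (0.96 ms at a 50 ns cycle, i.e. 20 MHz). It assumes the nominal clock values as written; the function name is illustrative.

```python
def cycles(duration_s: float, clock_hz: float) -> float:
    """Number of clock cycles that fit in `duration_s` at `clock_hz`."""
    return duration_s * clock_hz


if __name__ == "__main__":
    chip_cycles = cycles(0.96e-3, 20e6)  # ADSP-21020 feature list figure
    card_cycles = cycles(578e-6, 33e6)   # SP-20 performance figure
    print(f"chip: {chip_cycles:.0f} cycles, card: {card_cycles:.0f} cycles")
```

Both figures come out near 19,000 cycles, which suggests that the two benchmarks describe essentially the same FFT routine running at different clock rates.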
S3.C2.5 APPLICATIONS
·
Data Sums/Averaging
·
Spectral Analysis
·
Digital Video
·
Parallel Processing
·
Motion Control
CHAPTER 3: RESULTS
Memory
Requirement calculation and Execution Time analysis are the most important and
crucial issues for the Real-time implementation of any algorithm on the
available hardware setup.
The TASP (Time-Domain Azimuth Signal Processing) algorithm is a sample-by-sample processing algorithm and hence is less affected by memory problems, but it involves some tricky storage and retrieval mechanisms on a small set of data, with some compromise in depth of focus for a given PRF interval. The TASP algorithm has been successfully implemented on SP-20 with a hardware-based simulated fixed data pattern.
The FASP (Frequency-Domain Azimuth Signal Processing) algorithm implemented on SP-20 is a block-processing algorithm. It requires processing of all 8K PRF data in the azimuth direction, either for NSM (Narrow Swath Mode of 512 range gates) or for WSM (Wide Swath Mode of 4096 range gates). With the implementation of the FASP algorithm, memory is the crucial problem rather than the available Execution Time and the multi-processor hardware structure.
Both Memory Requirements and Execution Time are calculated for a single Range Gate (RG) on the DSP card SP-20 for the FASP algorithm. These calculations help in determining:
1. The maximum number of Range Gates that can be processed on a single SP-20 card with FASP.
2. The configuration of multiple SP-20 cards either for NSM (Narrow Swath Mode) or for WSM (Wide Swath Mode).
3. An alternative signal processing solution to the SP-20, with larger memory and a greater number of processors on board.
Registered outputs with sharp peaks for a single Range Gate, using 3-look processing and different Azimuth Resolutions such as 6 m, 3 m and 1 m, are shown in Figures S3.C3.D1, S3.C3.D2, S3.C3.D3 and S3.C3.D4.
S3.C3.D4
The received return signals are decomposed and stored as a complex data set using the quadrature sampling technique, which reduces the bandwidth by half and eases mathematical manipulation and processing. Each complex data value is stored as a Real (I) and an Imaginary (Q) value in two separate memory locations.
SP-20 has two types of on-board memory banks: 1) PM (Program Memory) and 2) DM (Data Memory). The program flow follows the flowchart of Chapter 5 of Section II for FASP. However, because of the I and Q data patterns and the complex swapping methods used by the FFT computation routines for optimum implementation, the received data and the reference function are stored in PM and DM as shown in Table S3.C3.T1. Both I and Q channels of the received data are stored in DM. For reliability and safety, some portion of memory can also be used as toggle memory banks.
S3.C3.1 Memory Requirement Calculations
SP-20 offers 128K of PM and 128K of DM. DM can be extended up to 1M at maximum. Both PM and DM are 32 bits wide.
Memory Requirement for a single Range Gate
Sr. No. | Type of Data | 32-bit Size in DM | 32-bit Size in PM
1 | redata (received real) | 8K | -
2 | Imdata (received imaginary) | 8K | -
3 | Refft (FFT real) | 8K | -
4 | Imfft (FFT imaginary) | - | 8K
5 | ref_real (reference real) | 8K | -
6 | ref_img (reference imaginary) | - | 8K
7 | look_r (look real) | 512 | -
8 | look_I (look imaginary) | 512 | -
9 | Reg (registered looks) | 512 | -
Total | | 33.5K | 16K
S3.C3.T1
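The totals in Table S3.C3.T1 can be re-derived by summing the buffer sizes per memory bank. The sketch below simply encodes the table rows (sizes in 32-bit words, with K = 1024); the function name is illustrative.

```python
K = 1024

# (buffer name, words in DM, words in PM), copied from Table S3.C3.T1
BUFFERS = [
    ("redata", 8 * K, 0),
    ("Imdata", 8 * K, 0),
    ("Refft", 8 * K, 0),
    ("Imfft", 0, 8 * K),
    ("ref_real", 8 * K, 0),
    ("ref_img", 0, 8 * K),
    ("look_r", 512, 0),
    ("look_I", 512, 0),
    ("Reg", 512, 0),
]


def memory_totals(buffers):
    """Return (DM words, PM words) needed for one range gate."""
    dm_words = sum(dm for _, dm, _ in buffers)
    pm_words = sum(pm for _, _, pm in buffers)
    return dm_words, pm_words


if __name__ == "__main__":
    dm_words, pm_words = memory_totals(BUFFERS)
    print(f"DM: {dm_words / K}K words, PM: {pm_words / K}K words")
```

This reproduces the 33.5K (DM) and 16K (PM) totals of the table.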
Keeping some tolerance for the program itself, the stack and system library, RCMC and other storage, the estimated memory requirement for processing a single Range Gate is 40K in DM and 20K in PM.
·
Out of 128K of DM, if we assume 40K for runtime processing and 50K-60K for stack and library, then only 28K-38K is available for received data storage; roughly, we can estimate it to be 32K. From Table S3.C3.T1, the real and imaginary parts of the received data together occupy 16K in DM for a single Range Gate. Hence only (28K-38K)/16K ≈ 2 RGs can be processed with a single SP-20 if FASP is used.
·
If DM is extended to its maximum of 1M, then (1024K - 100K)/16K = 924K/16K ≈ 57 RGs can be processed with a single SP-20 card.
·
Based on these calculations, the memory and SP-20 configurations for processing a given number of Range Gates are listed below:
256 RG → 256 × 16K = 4M ≈ 4 SP-20 cards
512 RG (NSM) → 512 × 16K = 8M ≈ 8 SP-20 cards
4096 RG (WSM) → 4K × 16K = 64M ≈ 64 SP-20 cards
·
In fact, NSM or WSM using FASP is never possible with SP-20, because for more than two SP-20 cards there is no SAB type of parallel bus structure available to provide multiple SP-20 configurations.
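The memory-bound estimates above can be reproduced with a few integer divisions. This sketch follows the same simplifications as the text: 16K words of received data per range gate, roughly 96K to 100K of DM reserved for processing, stack and library, and a card count that treats each card as offering a full 1M of DM. The function names are illustrative.

```python
K = 1024
WORDS_PER_RANGE_GATE = 16 * K  # real + imaginary received data in DM


def range_gates_per_card(dm_words: int, reserved_words: int) -> int:
    """Range gates whose received data fit in DM after the reserved part."""
    return (dm_words - reserved_words) // WORDS_PER_RANGE_GATE


def cards_needed(num_range_gates: int, dm_words_per_card: int = 1024 * K) -> int:
    """Cards needed when each card contributes `dm_words_per_card` of DM
    (overhead ignored, as in the list above). Uses ceiling division."""
    total_words = num_range_gates * WORDS_PER_RANGE_GATE
    return -(-total_words // dm_words_per_card)


if __name__ == "__main__":
    print(range_gates_per_card(128 * K, 96 * K))     # standard 128K DM
    print(range_gates_per_card(1024 * K, 100 * K))   # DM extended to 1M
    for gates in (256, 512, 4096):
        print(gates, "RG ->", cards_needed(gates), "cards")
```

Note that the per-card figure of 57 RGs includes the 100K overhead, whereas the card counts do not; a stricter estimate would divide the number of range gates by 57 and give somewhat larger card counts.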
S3.C3.2 Execution Time Analysis
The PRF of 500 Hz (PRI = 2 ms) gives 16 seconds as the maximum processing time for a given number of Range Gates, each with 8K (8192) data points. The Execution Time analysis determines the time taken by the ADSP-21020 DSP processor of the SP-20 for a single Range Gate at the specified Azimuth Resolution, i.e. 6 m, and the maximum number of Range Gates that can be processed within the 8K PRF time. Based on this derived number, a configuration of SP-20 cards for NSM or WSM is suggested.
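The 16-second figure is the time needed to collect one azimuth block: 8192 pulses at one pulse every 2 ms. A minimal sketch of that arithmetic, with illustrative names:

```python
def pulse_repetition_interval_s(prf_hz: float) -> float:
    """Pulse repetition interval (PRI) for a given PRF."""
    return 1.0 / prf_hz


def block_acquisition_time_s(prf_hz: float, pulses: int) -> float:
    """Time needed to acquire `pulses` azimuth samples at `prf_hz`."""
    return pulses / prf_hz


if __name__ == "__main__":
    print(pulse_repetition_interval_s(500.0))     # PRI in seconds
    print(block_acquisition_time_s(500.0, 8192))  # block time in seconds
```

The exact value is 16.384 s, which the text rounds down to 16 s; the rounding makes the processing budget slightly conservative.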
Execution Time analysis for a single Range Gate with different Azimuth Resolutions such as 6 m, 3 m and 1 m is shown in Tables S3.C3.T2, S3.C3.T3, and S3.C3.T4. (Sorry, these tables are corrupted forever. I have only printed versions of them with me in my thesis.)
Adding all tolerances and keeping sufficient margin, the overall Execution Time for a single Range Gate with the ADSP-21020 @ 33 MHz is taken as 50 ms for the selected values of Azimuth Resolution.
·
The maximum number of Range Gates processed by a single processor will be 16 s/50 ms = 320 RGs.
·
With suitable safety factors, it is taken as 256 RGs per processor.
·
For processing NSM and WSM, keeping the memory constraint of SP-20 aside, NSM requires a single SP-20 card and WSM requires 4 SP-20 cards.
·
One or more additional SP-20 cards can be utilized as needed for gathering and routing the received data to the SP-20 card configuration and for displaying the processed data, either for NSM or WSM.
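The execution-time budget above reduces to one integer division: the time available for a block divided by the time per range gate. The sketch below uses milliseconds as integers to avoid rounding issues; the function name is illustrative, and the 50 ms per range gate is the margin-inclusive value adopted in the text.

```python
def max_range_gates(window_ms: int, per_gate_ms: int) -> int:
    """Range gates one processor can finish inside the processing window."""
    return window_ms // per_gate_ms


if __name__ == "__main__":
    budget = max_range_gates(16_000, 50)  # rounded 16 s window used in the text
    exact = max_range_gates(16_384, 50)   # 8192 pulses x 2 ms
    margin = 256 / budget                 # fraction of the budget actually used
    print(budget, exact, margin)
```

Choosing 256 RGs per processor therefore uses 80% of the 320 RG budget and leaves the remainder as a safety margin.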
S3.C3.3 CONCLUSION & FUTURE PATH
It is observed from the results of S3.C3.1 and S3.C3.2 that the optimum utilization of SP-20 is compromised by the Memory Requirement and Execution Time trade-off.
·
It can be concluded that the Memory Requirement, not the Execution Time, is the issue that demands the most involvement in configuration design for the Real-time implementation of the FASP algorithm on SP-20.
·
The maximum number of RGs that can be processed is limited to 57 by Memory, whereas the limit set by Execution Time is 320.
·
The computation power of the dual ADSP-21020 on a single SP-20 card could be utilized more effectively by increasing the DM size beyond the present maximum of 1M.
·
With a DM size of 4M, 256 RGs can be processed by a single SP-20 card; NSM is possible with 2 SP-20 cards and WSM with 16 SP-20 cards.
·
A large increase in DM size is not practical because of the huge price tag associated with it.
·
Besides the increase in the cost of the DSP card with DM size, the limited availability of a parallel interface bus structure (like SAB) for multiple SP-20 configurations is also a big limitation for the Real-time implementation of FASP.
After many investigations, calculations and trial runs on SP-20, a feasible and practical Real-time implementation of FASP demands a search for more powerful and flexible DSP processors with large on-board memory, such as the ADSP 21022 and ADSP 2106X (SHARC), along with a study of their architecture, exploration of design and configuration, and calculations with trial runs.
Future
path for the Real-time implementation of FASP may be the use of