Proceedings Volume 5837

VLSI Circuits and Systems II

View the digital version of this volume at SPIE Digital Library.

Volume Details

Date Published: 30 June 2005
Contents: 31 Sessions, 115 Papers, 0 Presentations
Conference: Microtechnologies for the New Millennium 2005
Volume Number: 5837

Table of Contents

  • Keynote Session I
  • Mixed-Signal Design Methods
  • High-Performance Interconnect
  • Sigma-Delta Data Converters
  • High-Performance Circuits and Architectures
  • Digital Design Methodologies and Tools I
  • High-Performance Circuits and Architectures
  • Analog Circuits
  • Multimedia I
  • Analog, Mixed-Signal, and Power Circuit Design Methodologies and Tools
  • Analog and Mixed-Signal Design Methodologies and Tools
  • Analog, Mixed-Signal, and Power Circuit Design Methodologies and Tools
  • Circuit Design for RF Applications
  • Digital Circuits
  • Keynote Session II
  • Technology Reliability
  • Baseband Design for Wireless Transceivers
  • Keynote Session III
  • Wireless Transceivers
  • Digital Design Methodologies and Tools I
  • Analog Test
  • Modeling and Design of Passive RF Components
  • Keynote Session IV
  • Reconfigurable Radio Systems
  • Memory Circuits
  • Keynote Session V
  • Multimedia II
  • Analog, Mixed-Signal, and Power Circuit Design Methodologies and Tools
  • Analog and Mixed-Signal Design Methodologies and Tools
  • Voltage-Controlled Oscillators
  • Digital Test and Verification
  • Multimedia III
  • Data Converters
  • FPGAs
  • Digital Design Methodologies and Tools II
  • Poster Session
  • Circuit Design for RF Applications
Keynote Session I
Substrate noise coupling: a pain for mixed-signal systems
Piet Wambacq, Geert Van der Plas, Stephane Donnay, et al.
Crosstalk from digital to analog in mixed-signal ICs is recognized as one of the major roadblocks for systems-on-chip (SoC) in future CMOS technologies. This crosstalk mainly happens via the semiconducting silicon substrate, which is usually treated as a ground node by analog and RF designers. The substrate noise coupling problem increasingly leads to malfunctioning or extra design iterations. One of the reasons is that the phenomenon of substrate noise coupling is difficult to model and hence difficult to understand. It can be caused by the switching of thousands or millions of gates and depends on layout details. On the generation side (the digital domain), the large number of noise generators can be handled by macromodeling. On the other hand, the impact of substrate noise on the analog circuits requires careful modeling at the level of transistors and the parasitics of layout, power supply, package, and PCB. Comparison with measurements shows that, with macromodeling at the digital side and careful modeling at the analog side, both the generation and the impact of substrate noise can be predicted with an accuracy of a few dB. In addition, this combination of macromodeling at the digital side and careful modeling at the analog side leads to an understanding of the problem, which can be used for digital low-noise design techniques to minimize the generation of noise, and for substrate-noise-immune design of analog/RF circuits.
Mixed-Signal Design Methods
New CAD issues and considerations for the design of mixed-signal SOCs
William Kao, Susan Zhang
By 2006, close to 75% of system on a chip (SOC) designs will be mixed-signal in nature, with digital, analog and possibly RF circuitry all integrated on a single chip. This paper covers recommended design flows and the design methodology challenges facing current mixed-signal SOC designers. It also addresses the CAD tool interoperability issues for these types of mixed-signal designs and identifies the work and solutions needed to enable fast design turnaround.
A reuse-based framework for the design of analog and mixed-signal ICs
Despite the spectacular breakthroughs of the semiconductor industry, the ability to design integrated circuits (ICs) under stringent time-to-market (TTM) requirements is lagging behind integration capacity, which has so far kept pace with the still valid Moore’s Law. The resulting gap threatens to slow down this phenomenal growth. The design community believes that it is only by means of powerful CAD tools and design methodologies - and, possibly, a design paradigm shift - that this design gap can be bridged. In this sense, reuse-based design is seen as a promising solution, and concepts such as IP Block, Virtual Component, and Design Reuse have become commonplace thanks to the significant advances in the digital arena. Unfortunately, the very nature of analog and mixed-signal (AMS) design has hindered a similar level of consensus and development. This paper presents a framework for the reuse-based design of AMS circuits. The framework is founded on three key elements: (1) a CAD-supported hierarchical design flow that facilitates the incorporation of AMS reusable blocks, reduces the overall design time, and expedites the management of increasing AMS design complexity; (2) a complete, clear definition of the AMS reusable block, structured into three separate facets or views: the behavioral, structural, and layout facets, the first two for top-down electrical synthesis and bottom-up verification, the last for bottom-up physical synthesis; (3) the design-for-reusability set of tools, methods, and guidelines that, relying on intensive parameterization as well as on design knowledge capture and encapsulation, allows fully reusable AMS blocks to be produced. A case study and a functional silicon prototype demonstrate the validity of the paper’s proposals.
High-Performance Interconnect
Area-, power-, and pin-efficient bus structures using multi-bit-differential signaling
Donald M. Chiarulli, Jason D. Bakos, Joel R. Martin, et al.
This paper describes a new low-power, area-, and pin-efficient alternative to differential encoding for high-performance chip-to-chip and backplane signaling. The technique, called multi-bit-differential-signaling (MBDS), consists of a new design for the driver and link termination network coupled with a novel coding system based on “N choose M (nCm)” codes. In an nCm-coded MBDS channel, there are n physical interconnections over which all code symbols carry exactly m 1-bits. This property gives MBDS links signal-to-noise and transmission characteristics comparable to pair-wise differential links such as low-voltage differential signaling (LVDS). Moreover, MBDS is compatible with commercial LVDS receivers in point-to-point and multi-point bus topologies. However, because MBDS channels have a higher information density, they use up to 45% less power and up to 45% fewer I/O pads than equivalent differentially encoded buses.
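The constant-weight property of nCm codes described in this abstract can be illustrated with a minimal Python sketch. It only enumerates the codewords and counts how many whole data bits a symbol could carry; the function names are hypothetical, the mapping from data bits to codewords used by the authors is not given in the abstract, and no attempt is made to reproduce the reported power or pad savings.

```python
from itertools import combinations
from math import comb


def ncm_codewords(n, m):
    """Return every n-wire codeword (tuple of 0/1) carrying exactly m 1-bits."""
    words = []
    for ones in combinations(range(n), m):
        word = [0] * n
        for position in ones:
            word[position] = 1
        words.append(tuple(word))
    return words


def bits_per_symbol(n, m):
    """Whole data bits one symbol can carry: floor(log2(C(n, m)))."""
    # bit_length() - 1 is an exact integer floor(log2(x)) for x >= 1.
    return comb(n, m).bit_length() - 1
```

For example, a 4-choose-2 channel has C(4, 2) = 6 codewords, enough for 2 data bits over 4 wires, while a 6-choose-3 channel has 20 codewords, enough for 4 data bits over 6 wires.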
Sigma-Delta Data Converters
A continuous time low-pass sigma delta modulator implemented with transmission lines
L. Hernandez, P. Rombouts, E. Prefasi, et al.
This work presents a prototype low-pass continuous-time sigma-delta modulator that uses transmission lines in its loop filter rather than capacitive integrators. As has been shown in prior theoretical work, such a structure desensitizes the modulator against clock jitter and excess loop delay. The parameters of the analog components of this design are independent of the sampling clock, because the clock frequency only has to match the length of the external transmission lines. The prototype single-bit modulator was designed for an oversampling ratio of 128. When clocked at 53.7 MHz, the modulator achieves a peak SNR of 67 dB. In an experiment with an excessive clock jitter of 1% of the clock period and a test tone of -10 dBfs, the SNDR is degraded by only 5 dB compared to the case without jitter.
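The figures quoted in this abstract can be put in context with a short Python sketch. It assumes the usual low-pass convention OSR = fs / (2 x bandwidth), which the abstract does not state explicitly, and the function names are hypothetical.

```python
def lowpass_signal_bandwidth_hz(clock_hz, osr):
    """Signal bandwidth of a low-pass modulator, assuming OSR = fs / (2 * BW)."""
    return clock_hz / (2 * osr)


def jitter_seconds(clock_hz, fraction_of_period):
    """Absolute clock jitter for a jitter given as a fraction of the clock period."""
    return fraction_of_period / clock_hz
```

Under that assumption, a 53.7 MHz clock with an oversampling ratio of 128 corresponds to a signal band of roughly 210 kHz, and a jitter of 1% of the clock period is roughly 186 ps.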
A dual-mode complex delta-sigma ADC in CMOS for wireless-LAN receivers
J. Arias, P. Kiss, V. Prodanov, et al.
In this work a dual-mode complex multibit continuous-time ΔΣ modulator for a standard 0.25μm CMOS technology is presented. This modulator is intended for the analog-to-digital conversion in multi-mode wireless-LAN receivers (802.11a/b/g), which require wide bandwidth and moderate resolution. Accordingly, a low oversampling ratio of 16 along with a clock frequency of 320 MHz provides a signal bandwidth of 20 MHz for a 9-bit resolution with a second-order modulator. The modulator can be configured for two different modes of operation depending on the type of radio receiver chosen: "zero-IF" (ZIF) and "low-IF" (LIF). The former mode is better suited for 802.11b, while the LIF mode is more suitable for 802.11a/g applications. The loop filter is based on transconductors and MOS capacitors, allowing for low power consumption and small chip area. The modulator also includes two 3-bit quantizers, both with their corresponding DWA scrambler. The supply voltage is 2.5V and the measured power consumption is 32 mW. Experimental results using both sine-wave and OFDM signals are presented. The obtained SNR and SNDR are 55dB and 53.5dB, respectively. A high image rejection of 47dB is achieved owing to proper layout techniques. When using OFDM signals, a minimum error vector magnitude of 1.3% is obtained. Finally, the active chip area is 0.44 mm².
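The relationship between the resolution, SNR, and bandwidth figures in this abstract can be checked with a small Python sketch. It relies on two assumptions that are not stated in the abstract: the textbook relation SNR = 6.02 N + 1.76 dB for an ideal N-bit converter driven by a full-scale sine wave, and the convention that a complex (quadrature) modulator covers a band of fs / OSR. The function names are hypothetical.

```python
def ideal_sine_snr_db(bits):
    """Ideal SNR of an N-bit converter for a full-scale sine input."""
    return 6.02 * bits + 1.76


def effective_bits(sndr_db):
    """Effective number of bits corresponding to a measured SNDR."""
    return (sndr_db - 1.76) / 6.02


def complex_signal_bandwidth_hz(clock_hz, osr):
    """Signal bandwidth of a complex modulator, assuming OSR = fs / BW."""
    return clock_hz / osr
```

With these assumptions, 320 MHz / 16 gives the quoted 20 MHz band, a 9-bit resolution corresponds to about 55.9 dB of ideal SNR, and the measured SNDR of 53.5 dB corresponds to about 8.6 effective bits.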
Continuous-time cascaded sigma-delta modulators for VDSL: a comparative study
This paper describes new cascaded continuous-time ΣΔ modulators intended to cope with very high-rate digital subscriber line specifications, i.e. 12-bit resolution within a 20-MHz signal bandwidth. These modulators have been synthesized using a new methodology that is based on the direct synthesis of the whole cascaded architecture in the continuous-time domain instead of using a discrete-to-continuous time transformation as has been done in previous approaches. This method allows the zeroes/poles of the loop-filter transfer function to be placed in an optimal way and reduces the number of analog components, namely: transconductors and/or amplifiers, resistors, capacitors and digital-to-analog converters. This leads to more efficient topologies in terms of circuit complexity, power consumption and robustness with respect to circuit non-idealities. A comparative study of the synthesized architectures is carried out, considering their sensitivity to the most critical circuit error mechanisms. Time-domain behavioral simulations are shown to validate the presented approach.
A 0.35-µm CMOS 17-bit@40-kS/s cascade 2-1 sigma-delta modulator with programmable gain and programmable chopper stabilization
This paper describes a 0.35μm CMOS chopper-stabilized Switched-Capacitor 2-1 cascade ΣΔ modulator for automotive sensor interfaces. To better fit the characteristics of different sensor outputs, the modulator includes a programmable set of gains (x0.5, x1, x2, and x4) and a programmable set of chopper frequencies (fs/16, fs/8, fs/4 and fs/2). It has also been designed to operate within the restrictive environmental conditions of automotive electronics (-40°C to 175°C). The modulator architecture has been selected after an exhaustive comparison among multiple ΣΔM topologies in terms of resolution, speed and power dissipation. The design of the modulator building blocks is based upon a top-down CAD methodology which combines simulation and statistical optimization at different levels of the modulator hierarchy. The circuit is clocked at 5.12MHz and consumes a total of 14.7mW from a single 3.3-V supply. Experimental measurements show a Dynamic Range (DR) of 99.77dB, which combined with the gain programmability leads to an overall DR of 112dB. This puts the presented design beyond the state of the art according to the existing literature.
Jitter effect comparison on continuous-time sigma-delta modulators with different feedback signal shapes
J. San Pablo, D. Bisbal, L. Quintanilla, et al.
A comparison is presented of three different feedback signal shapes in a current-mode continuous-time second-order sigma-delta modulator, although it can be extended to systems of any order. The three shapes are: rectangular, exponential, and a new mixed waveform whose pulse starts out rectangular and after a fraction of the clock period changes to a decaying ramp. Simulation results at system level, using a software model, are presented. Results show that with early return-to-zero feedback signal shapes (exponential, mixed), the modulator performance degradation due to pulse width variation is reduced with respect to rectangular signal shapes. In addition, the new mixed shape does not present the high signal peak that the exponential shape does. This is important for the integrator input stage because it allows power saving as well as a reduction of critical input noise.
Optimization algorithm for linearity enhancement in the design of continuous-time sigma-delta modulators
S. Paton, L. Hernandez, R. Frutos, et al.
This paper proposes an optimization algorithm to reduce the distortion produced in the loop-filter of Continuous-Time Sigma-Delta Modulators. The aim of the algorithm is to find the loop-filter implementation that minimizes distortion at the output of the modulator, by modifying the output swing of every integrator. The algorithm is implemented in Matlab as an evolutionary search. During each step of the search, the algorithm evaluates the harmonic distortion of a tone applied to the modulator with a certain loop-filter implementation. The output of the algorithm is an optimum linear state-space representation of the loop-filter. This particular state-space representation leads to minimum distortion at the output of the modulator when the loop-filter is implemented with some specific, previously defined circuitry. Since the search is of the evolutionary type, the solution represents a local minimum only. The algorithm computes a random guess as the starting point for the optimization procedure, so that different local minima may be found by running the algorithm several times. The algorithm has been applied to a 4th-order 4-bit Continuous-Time Sigma-Delta Modulator as a simulation example.
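The search strategy outlined in this abstract (an evolutionary search started from random guesses and repeated several times) can be sketched in Python. This is a generic illustration, not the authors' Matlab algorithm: the mutation scheme, the parameter names, and `toy_cost` are assumptions, and `toy_cost` is only a stand-in for the distortion evaluation of a modulator simulation.

```python
import random


def evolve(cost, start, steps=200, sigma=0.1, rng=None):
    """Greedy evolutionary search: keep a mutated candidate only if it is better."""
    rng = rng or random.Random()
    best = list(start)
    best_cost = cost(best)
    for _ in range(steps):
        candidate = [x + rng.gauss(0.0, sigma) for x in best]
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:
            best, best_cost = candidate, candidate_cost
    return best, best_cost


def multi_start(cost, dim, runs=5, seed=0):
    """Repeat the search from random starting points and keep the best local minimum."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        start = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        results.append(evolve(cost, start, rng=rng))
    return min(results, key=lambda result: result[1])


def toy_cost(vector):
    """Hypothetical multimodal cost with minima at +/-0.5 in every coordinate."""
    return sum((x * x - 0.25) ** 2 for x in vector)
```

Because each run only accepts improvements, it settles in whichever local minimum is reachable from its starting point, which is why several runs from different random guesses are compared.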
High-Performance Circuits and Architectures
Bounded budgeted parallel architecture versus control dominated architecture for hazard data-signal processor synthesis
Bertrand Le Gal, Emmanuel Casseau, Eric Martin
Multimedia applications such as video and image processing are often characterized by a large number of data accesses (i.e. RAM accesses). In many digital signal-processing applications, the array access patterns are regular and periodic. In these cases, optimized Pipelined Memory Access Controllers can be generated. This technique is used to improve the pipeline access mode to RAM by creating specialized hardware components for generating addresses and packing and unpacking data items. In this paper we focus on the design, implementation and validation of memory interfacing modules that can be automatically generated from a behavioural synthesis tool and which can efficiently handle predictable address patterns as well as unpredictable ones (dynamic address computations) in a pipeline way. We also analyze the benefits of balancing dynamic address computations from datapath to specialized computation units placed in the memory controller, optimizing bitwise of operators and data locality i.e. reducing the power consumption.
Digital Design Methodologies and Tools I
Data-driven array architectures: a rebirth?
Von Neumann-style architectures have been tremendously successful by taking advantage of Moore’s law. It is now understood that it will be very difficult to meet the supercomputing demands of future computing systems with this style of microprocessor architecture. Most applications nowadays require high performance for processing data streams. Since dataflow computing is a natural paradigm for processing data streams, architectures based on dataflow principles are emerging as a way to meet the supercomputing demands. Data-driven arrays, introduced in the 80’s, are examples of such architectures. They offered a scalable and effective way to directly support the dataflow model of computation and have been revived by a number of reconfigurable architectures (e.g., KressArray, WaveScalar, and XPP). Those coarse-grained reconfigurable architectures with dataflow semantics show interesting achievements with respect to performance and programming methodologies when compared to other computing platforms. This paper presents the most interesting data-driven array architectures. Trends and open issues related to a number of properties at architectural level and to compilation techniques are enumerated and discussed. A number of features are illustrated, especially the support for hardware virtualization, speculative configuration, and software pipelining. Examples using the PACT XPP reconfigurable array are shown. Those examples include the ADPCM decoder, from the MediaBench repository, and LeeDCT, an optimized DCT algorithm.
High-Performance Circuits and Architectures
A novel gigabit multidrop serial link for high-speed digital systems
F. Tobajas, R. Esper-Chain, R. Arteaga, et al.
A multidrop backplane based on point-to-multipoint serial links enables interconnection between line cards without requiring a central switch fabric. However, maximum data rate in today's available multidrop serial links is limited to 400Mbps due to signal integrity concerns. In this paper, a novel gigabit multidrop serial link configuration for high-speed digital systems based on broadband power splitters with matching trace impedance, is proposed. Experimental results obtained from implemented prototypes demonstrate a satisfactory operation of the proposed multidrop serial backplane for a data rate of 3.5Gbps.
Linearisation for analogue optical links using integrated CMOS predistortion circuits
Fu-Chuan Lin, David M. Holburn
The demand for high-speed communications is growing exponentially. Recent trends in the integration of entire systems-on-chip have spurred the development of Fibre-To-The-Home (FTTH) network for high-speed data and video services. This paper presents a predistortion circuit that integrates all of the functions necessary to implement a high linearity distribution point system for broadband optical fibre communications. The single CMOS chip includes a variable gain amplifier and a predistortion circuit. The circuits have been implemented with 0.35μm CMOS technology and simulation shows that the power consumption is 86mW with a 3.3V supply. The systems and circuits are detailed and their application to analogue optical links discussed.
A 40-Gb/s driver circuit using a novel inductive bandwidth extension technique
A positive feedback technique is proposed to augment the bandwidth extension achievable using peaking inductors. The technique is based on inductor sharing between consecutive amplifier stages, and it can be effective when used with smaller inductance values compared to traditional inductive peaking. A 40Gb/s two-stage amplifier comprising a differential pair and emitter followers is presented as a practical design example. An expression for the transfer function of the proposed circuit is derived, and its bandwidth and group delay are compared to equivalent amplifiers with inductive peaking and without bandwidth extension. Circuit sensitivity to the inductance value is also considered. The proposed amplifier was implemented in a SiGe BiCMOS process with ƒτ=120GHz and is used as a predriver for a 50Ω buffer. Combined with the buffer, it provides 10dB of gain and consumes 90mW from a 2.5V power supply and 180mW from a 3.3V power supply. Simulations show about 40% bandwidth improvement compared to traditional inductive peaking. Time domain measurements demonstrate 40Gb/s operation with a maximum differential swing of 1.0V p-p and 20-80% transition times of 7-9ps.
Analog Circuits
A GmC filter design methodology for high-speed continuous-time sigma-delta A/D converters in a deep sub-micron technology
Raf Schoofs, Michiel Steyaert, Willy Sansen
This paper presents a design methodology for a GmC filter in a Continuous-Time (CT) Sigma-Delta A/D converter. It focuses on the challenges the designer faces when a deep sub-micron technology is used. According to the proposed methodology, a 1-bit, 3rd order CT modulator is designed. The modulator achieves an accuracy of 10 bits within a signal band of 8 MHz. The design is made in a 90 nm standard CMOS process. The small transistor dimensions enable a clock rate of 1 GHz. An analytical comparison between RC filters and GmC filters is presented based on their power consumption. It is shown that an RC filter requires an integrator loop gain-bandwidth equal to the sampling rate. This puts a severe limitation on the minimal power consumption for this type of filter. Therefore, a GmC filter implementation is chosen because it consumes the lowest power in order to meet the design specifications. Mathematical expressions for harmonic distortion and thermal noise are derived. They are interpreted in terms of a low power design approach. Since the input signal swing scales down with the supply voltage, harmonic distortion becomes less important in a deep sub-micron technology. Therefore, the thermal noise requirements mainly determine the overall power consumption of the CT modulator. Small transistor lengths enable high sampling rates, but lower the integrator output impedance. This results in a reduced DC gain of the filter. Consequently, the proposed GmC filter architecture is adjusted to provide sufficient suppression of in-band quantization noise leakage. All proposed design choices are verified by numerical simulations.
A 0.18-µm CMOS low-noise highly linear continuous-time seventh-order elliptic low-pass filter
Juan F. Fernandez-Bootello, Manuel Delgado-Restituto, Angel Rodriguez-Vazquez
This paper presents a fast procedure for the system-level evaluation of noise and distortion in continuous-time integrated filters. The presented approach is based on Volterra series theory and matrix algebra manipulation. This procedure has been integrated in a constrained optimization routine to improve the dynamic range of the filter while keeping the area and power consumption at a minimum. The proposed approach is demonstrated with the design, from system- to physical-level, of a seventh-order low-pass continuous-time elliptic filter for a high-performance broadband power-line communication receiver. The filter shows a nominal cut-off frequency of ƒc=34MHz, less than 1dB ripple in the pass-band, and a maximum stop-band rejection of 65dB. Additionally, the filter features a 12dB programmable boost in the pass-band to counteract the attenuation of high-frequency components. Taking into account its wideband transfer characteristic, the filter has been implemented using Gm-C techniques. The basic building block of its structure, the transconductor, uses a source degeneration topology with local feedback to improve linearity and shows a worst-case intermodulation distortion of -70 dB for two tones close to the passband edge, separated by 1MHz, with 70mV of amplitude. The filter combines very low noise (peak root spectral noise density below 56nV/√Hz) and high linearity (more than 64dB of MTPR for a DMT signal of 0.5Vpp amplitude). The filter has been designed in a 0.18μm CMOS technology and is compliant with industrial operating conditions (-40 to 85°C temperature variation and ±5% power supply deviation). The filter occupies 13 mm² and exhibits a typical power consumption of 450 mW from a 1.8V voltage supply.
CMOS current amplifiers exhibiting independent AC and DC current amplification
A CMOS circuit topology is demonstrated for the amplification of high-frequency AC currents without requiring similar DC current amplification. This technique is useful for current-domain amplification and processing of signals when low DC power consumption is necessary. Large amounts of AC gain can be achieved using this technique without requiring equivalent DC current gain, which would increase power consumption. Two amplifiers designed using this concept are discussed, one based on a standard current mirror architecture and the second using a cascode-type configuration. Measured results and analysis show the efficacy of this technique for the amplification of multi-Gb/s current-domain signals when implemented in a 0.12μm CMOS technology. Single-stage AC current gains of 12dB are achieved with unity DC current gain, while operating from supply voltages less than 1.0V. Temperature stable gain is also achieved.
An efficient 2-stage fractional charge pump based on frequency regulation
A. Saiz-Vela, P. Miribel-Catala, M. Puig-Vidal, et al.
An efficient 2-stage charge pump based on two-phase voltage doublers is proposed in this paper. Pulse-skipping frequency regulators have been used to obtain high efficiency over a wide range of loads. Since this charge pump has been designed for battery-powered portable devices, a power-up control system that combines a linear and a switched charging sequence has been included in each stage in order to avoid large current spikes at the beginning of the start-up process that could damage the battery or shorten its life. The result is a power-efficient 2-stage charge pump capable of generating a maximum regulated output voltage of up to 10V from a 2.7V-3.3V battery source and delivering a maximum power of 100mW. If desired, the regulated output voltage can be scaled down to a lower regulated voltage through a simple programming method using external resistors plus internal digital circuitry. This circuit has been designed using a 0.7μm I2T technology from AMI Semiconductor.
Multimedia I
A quarter pixel precision motion estimation architecture for H.264/AVC video coding
S. Lopez, F. Tobajas, A. Villar, et al.
H.264/AVC is the most recent and promising international video coding standard developed by the ITU-T Video Coding Experts Group in conjunction with the ISO/IEC Moving Picture Experts Group. This standard has been designed to provide improved coding efficiency and network adaptation. In this sense, H.264/AVC provides superior features when compared with its ancestors such as MPEG-2, MPEG-4 and H.263, but at the expense of a computational cost that is prohibitive for real-time applications. In particular, motion estimation turns out to be the most intensive task in the whole encoding process, and for this reason efficient architectures, such as the one presented in this paper, are needed to compute the 41 motion vectors per macroblock required by the H.264/AVC video coding standard under real-time conditions. This paper deals with a low-cost VLSI architecture capable of obtaining half and quarter pixel precision motion vectors, applying the corresponding techniques to obtain these motion vectors as demanded by the H.264/AVC standard. Techniques such as the reuse of the results obtained for smaller blocks and the possibility of avoiding the use of certain motion estimation modes have been introduced in order to obtain a flexible low-power hardware solution. The proposed architecture has been synthesized and mapped to a commercial FPGA device, producing a fully functional embedded prototype capable of processing up to QCIF images at 30 fps with low area occupation.
Statistically optimized VLSI architecture for buffer for EBCOT in JPEG2000 encoder
In this paper we present the VLSI architecture for the buffer for tier-I of EBCOT encoder of JPEG2000. The buffer allows the integration of bit plane coder and arithmetic coder module employing concurrent symbol processing technique. The buffer architecture is optimized by exploiting the natural image statistics to optimally choose the buffer length parameter. The overall architecture is implemented using Altera FPGA and experimental results show a savings of 59% in the hardware cost with minimal reduction in the overall throughput.
Hardware implementation of the wavelet transform for JPEG2000
J. Hormigo, J. M. Prades, J. Villalba, et al.
In this paper we propose and examine a VLSI architecture for the integer-to-integer wavelet transform which is used by the JPEG2000 standard for lossless compression. In order to achieve full utilization of hardware resources independently of the bit depth of the input data, on-line arithmetic (digit-serial computation) is proposed to implement this architecture. In addition, a high throughput is achieved thanks to the high degree of parallelism that on-line arithmetic allows. The design has been simulated and implemented on a Xilinx FPGA device, and its main results are provided.
Haar wavelet processor for adaptive on-line image compression
F. Javier Diaz, Angel M. Buron, Jose M. Solana
An image coding scheme based on a variant of the Haar Wavelet Transform that uses only addition and subtraction is presented. After computing the transform, the selection and coding of the coefficients is performed using a methodology optimized to attain the lowest hardware implementation complexity. Coefficients are sorted in groups according to the number of pixels used in their computation. The idea behind it is to use a different threshold for each group of coefficients; these thresholds are obtained recurrently from an initial one. Parameter values used to achieve the desired compression level are established "on-line", adapting their values to each image, which leads to an improvement in the quality obtained for a preset compression level. Despite its adaptive characteristic, the coding scheme presented leads to a hardware implementation of markedly low circuit complexity. The compression reached for images of 512x512 pixels (256 grey levels) is over 22:1 (≈0.4 bits/pixel) with an RMSE of 8-10%. An image processor (excluding memory) prototype designed to compute the proposed transform has been implemented using FPGA chips. The processor for images of 256x256 pixels has been implemented using only one general-purpose low-cost FPGA chip, thus proving the design reliability and its relative simplicity.
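The idea of a Haar-type transform built only from additions and subtractions, with coefficients grouped by how many samples contribute to them, can be sketched in one dimension in Python. This is not the authors' variant: the unnormalized sum/difference form, the function names, and in particular the threshold recurrence (doubling per group) are assumptions made for illustration only.

```python
def haar_add_sub(samples):
    """Unnormalized 1-D Haar decomposition using only + and -.

    Returns (total, groups), where groups[k] holds the differences that each
    depend on 2 ** (k + 1) input samples.
    """
    n = len(samples)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = list(samples)
    groups = []
    while len(approx) > 1:
        sums = [approx[i] + approx[i + 1] for i in range(0, len(approx), 2)]
        diffs = [approx[i] - approx[i + 1] for i in range(0, len(approx), 2)]
        groups.append(diffs)
        approx = sums
    return approx[0], groups


def threshold_groups(groups, initial_threshold):
    """Zero small coefficients, using a different threshold for each group."""
    kept = []
    threshold = initial_threshold
    for diffs in groups:
        kept.append([d if abs(d) >= threshold else 0 for d in diffs])
        threshold = threshold + threshold  # hypothetical recurrence: double per group
    return kept
```

For the input [4, 2, 5, 5], the first group of differences is [2, 0], the second is [-4], and the overall sum is 16; an initial threshold of 3 would zero both groups, while an initial threshold of 2 would keep both nonzero coefficients.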
Analog, Mixed-Signal, and Power Circuit Design Methodologies and Tools
Impact of package parasitics on crosstalk in mixed-signal ICs
This paper presents an approach for the analysis and the experimental evaluation of crosstalk effects due to current pulses drawn from voltage supplies in mixed analog-digital CMOS integrated circuits. A realistic model of bonding and package parasitics has been derived to study digital switching noise injected through bonding interconnections. Simulation results indicate that disturbances due to switching currents in digital blocks propagate through the substrate and affect analog voltages, thus degrading circuit performance. Test structures have been integrated into a test chip mounted with different technologies, in order to compare the measurements on test chips. Measurements confirm simulation results. Chip-on-board mounting technology has better performance with respect to chip-in-package, due to the reduction of parasitic elements.
Net order optimization in analog net bundles
Thomas Jambor, Lars Schreiner, Markus Olbrich, et al.
This paper presents a new approach to optimize net order in analog busses. It is used for the PARasitic SYmmetric router (PARSY), which routes net bundles, e.g. busses or differential pairs, maintaining parasitic symmetry and limiting differential coupling. The router is mainly devoted to analog signal interconnect but can also be used for critical digital busses. Net bundles have a fixed order, because wire crossing is not allowed in net bundle segments to enforce symmetry. Wires inside net bundle segments are generated by module generators. Connecting cell terminals to the first or the last net bundle segment is complex, because the cell terminals can vary in geometry and placement. Therefore, an assignment between nets and wires (net order) in a segment is required. This assignment does not affect the order in which nets or net bundles are routed sequentially. The optimization objective for the connections from net bundle segments to terminals is to minimize the number of crossings and the length difference, while maintaining symmetry if possible. Therefore, a net order has to be calculated, which globally optimizes these criteria for all terminal connections. Different net orders can be computed from the placement of terminals, which have to be connected to a net bundle segment. An additional order is calculated from these net orders, which contains the most characteristic features of all net orders. For all net orders costs are evaluated, and the one with the lowest cost is chosen.
Analog and Mixed-Signal Design Methodologies and Tools
On the suitability and development of layout templates for analog layout reuse and layout-aware synthesis
Accelerating the synthesis of increasingly complex analog integrated circuits is key to bridging the widening gap between what we can integrate and what we can design while meeting ever-tightening time-to-market constraints. It is well known in the semiconductor industry that such a goal can only be attained by means of adequate CAD methodologies, techniques, and accompanying tools. This is particularly important in analog physical synthesis (a.k.a. layout generation), where large sensitivities of the circuit performances to the many subtle details of layout implementation (device matching, loading and coupling effects, reliability, and area features are of utmost importance to analog designers) render complete automation a truly challenging task. To approach the problem, two directions have traditionally been considered, knowledge-based and optimization-based, each with its own pros and cons. Moreover, recently reported solutions that aim to speed up the overall design flow, either through reuse-based practices or by cutting out time-consuming, error-prone spins between electrical and layout synthesis (a technique known as layout-aware synthesis), rely on an outstandingly rapid yet efficient layout generation method. This paper analyses the suitability of procedural layout generation based on templates (a knowledge-based approach) by examining the requirements that both layout reuse and layout-aware solutions impose, and how layout templates meet them. The ability to capture the know-how of experienced layout designers and the turnaround times for layout instancing are considered the main aspects for comparison with other layout generation approaches. A discussion of the benefit-cost trade-off of using layout templates is also included.
In addition to this analysis, the paper delves deeper into systematic techniques to develop fully reusable layout templates for analog circuits, either for a change of the circuit sizing (i.e., layout retargeting) or a change of the fabrication process (i.e., layout migration). Several examples implemented with Cadence's Virtuoso tool suite are provided to demonstrate the paper's contributions.
Analog, Mixed-Signal, and Power Circuit Design Methodologies and Tools
A mismatch characterization and simulation environment for weak-to-strong inversion CMOS transistors
J. Velarde-Ramirez, G. Vicente-Sanchez, T. Serrano-Gotarredona, et al.
Mismatch analysis and simulation are crucial for modern analog design in submicron technologies, where transistors tend to be biased in the weak and moderate inversion regions because of the shrinking power supply voltage. For optimum analog design, where speed, power consumption, area, noise, and accuracy need to be carefully traded off, a precise estimate of transistor mismatch is essential to avoid overdesign and the unnecessary sacrifice of speed, power consumption, and area. In this paper we provide experimental mismatch measurements of different 0.35 μm CMOS technologies. Each technology has been characterized for a large number of transistor sizes (25-30), by sweeping different width and length values. A large number of transistor curves are measured over different possible biasing conditions. A recent mismatch model is used to fit the data and extract electrical parameters. Some of those parameters are used to adjust the measured mismatch. As a result, a set of standard deviations and correlation coefficients is obtained for the statistical characterization of the mismatch-responsible parameters. The resulting electrical parameters and statistical mismatch parameters are then used in the Spectre simulator of the Cadence design environment to implement the mismatch models using the Spectre AHDL behavioral description language. The paper shows good agreement between measured, predicted, and simulated data.
Modeling of power control schemes in induction cooking devices
Alessio Beato, Massimo Conti, Claudio Turchetti, et al.
In recent years, with remarkable advancements in power semiconductor devices and electronic control systems, it has become possible to apply the induction heating technique for domestic use. In order to achieve the supply power required by these devices, high-frequency resonant inverters are used: the force-commutated, half-bridge series resonant converter is well suited for induction cooking since it offers an appropriate balance between complexity and performance. Power control is a key issue in attaining efficient and reliable products. This paper describes and compares four power control schemes applied to the half-bridge series resonant inverter. Pulse frequency modulation is the most common control scheme: according to this strategy, the output power is regulated by varying the switching frequency of the inverter circuit. The other methods considered, originally developed for industrial induction heating applications, are pulse amplitude modulation, asymmetrical duty cycle, and pulse density modulation, which are based respectively on variation of the amplitude of the input supply voltage, of the duty cycle of the switching signals, and of the number of switching pulses. Each description is accompanied by a detailed mathematical analysis; an analytical model, built to simulate the circuit topology, is implemented in the Matlab environment in order to obtain the steady-state values and waveforms of currents and voltages. For the purposes of this study, switches and all reactive components are modelled as ideal, and the "heating-coil/pan" system is represented by an equivalent circuit made up of a series-connected resistance and inductance.
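To make the pulse frequency modulation idea concrete, the following minimal sketch estimates the power delivered to the series R-L load as the switching frequency moves away from resonance. It is not the paper's Matlab model: it assumes ideal switches, a 50% duty cycle, and the first-harmonic approximation (only the fundamental of the half-bridge square wave is considered). The component values are hypothetical.

```python
import math


def resonant_frequency(l, c):
    """Resonant frequency of the series L-C tank, in Hz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l * c))


def output_power(f_sw, v_dc, r, l, c):
    """Average power in the load resistance r at switching frequency f_sw.

    First-harmonic approximation: the half-bridge output is a square wave
    of amplitude v_dc/2, whose fundamental has a peak of 2*v_dc/pi.
    """
    w = 2.0 * math.pi * f_sw
    v1_rms = (2.0 * v_dc / math.pi) / math.sqrt(2.0)
    reactance = w * l - 1.0 / (w * c)
    return v1_rms ** 2 * r / (r ** 2 + reactance ** 2)


if __name__ == "__main__":
    # Hypothetical values, chosen only for illustration.
    v_dc, r, l, c = 300.0, 5.0, 50e-6, 0.5e-6
    f0 = resonant_frequency(l, c)
    for ratio in (1.0, 1.1, 1.2, 1.5):
        p = output_power(ratio * f0, v_dc, r, l, c)
        print(f"f = {ratio:.1f} * f0 -> P = {p:.0f} W")
```

Under these assumptions the power peaks at resonance and falls monotonically as the switching frequency is raised above it, which is the control handle that pulse frequency modulation uses.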
Circuit Design for RF Applications
Behavioral study and design of a digital interpolator filter for wireless reconfigurable transmitters
V. Ferragina, A. Frassone, N. Ghittori, et al.
The behavioral analysis and the design in a 0.13 μm CMOS technology of a digital interpolator filter for wireless applications are presented. The proposed block is designed to be embedded in the baseband part of a reconfigurable transmitter (WLAN 802.11a, UMTS) to raise the sampling frequency between the digital signal processor (DSP) and the digital-to-analog converter (DAC). In recent designs the DAC of such transmitters usually operates at high conversion frequencies (to allow a relaxed implementation of the subsequent analog reconstruction filter), while the DSP output runs at low frequencies (typically the Nyquist rate). Thus a block able to increase the digital data rate, like the one proposed, is needed before the DAC. For example, in the WLAN case, an interpolation factor of 4 has been used, allowing the digital data frequency to rise from 20 MHz to 80 MHz. Using a time-domain model of the TX chain, a behavioral analysis has been performed to determine the impact of the filter performance on the quality of the signal at the antenna. This study has led to the evaluation of the z-domain filter transfer function, together with the specifications for a finite-precision implementation. A VHDL description has allowed automatic synthesis of the circuit in a 0.13 μm CMOS technology (with a supply voltage of 1.2 V). Post-synthesis simulations have confirmed the effectiveness of the proposed study.
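The rate increase by a factor of 4 can be illustrated with a minimal interpolation sketch: zero-stuffing followed by a low-pass FIR filter. The triangular (linear-interpolation) kernel used here is a stand-in chosen for simplicity; the abstract does not give the paper's actual transfer function, and the function names are invented for the example.

```python
def zero_stuff(samples, factor):
    """Insert factor-1 zeros after each input sample."""
    out = []
    for s in samples:
        out.append(float(s))
        out.extend([0.0] * (factor - 1))
    return out


def fir(signal, taps):
    """Full convolution of signal with the FIR taps."""
    y = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for k, h in enumerate(taps):
            y[i + k] += s * h
    return y


def interpolate(samples, factor=4):
    """Raise the sample rate by `factor` using a triangular kernel
    (assumed here for illustration), with the filter delay removed."""
    taps = [1.0 - abs(k) / factor for k in range(-(factor - 1), factor)]
    y = fir(zero_stuff(samples, factor), taps)
    delay = factor - 1
    return y[delay:delay + len(samples) * factor]


if __name__ == "__main__":
    f_in = 20e6                      # DSP output rate from the abstract
    out = interpolate([0, 4, 8], factor=4)
    print(out)                       # samples now spaced at f_in * 4 = 80 MHz
```

A production interpolator would use a sharper low-pass response and fixed-point coefficients, which is where the finite-precision specifications mentioned in the abstract come in.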
Modeling and design of high-order phase locked loops
Brian Daniels, Gerard Baldwin, Ronan Farrell
In this paper a new stable high order Digital Phase Lock Loop (DPLL) design technique is proposed. PLLs of order greater than two display better noise bandwidth, Bl, than classical second order PLLs. However these are not unconditionally stable as in the second order case. This technique uses linear theory to design the DPLL. The stability of the DPLL is guaranteed by placing a restriction on the system gain. This stability boundary is found by transforming the system transfer function to the Z-domain and plotting the root locus of the LPLL for values of gain where all the system poles lie inside the unit circle. The minimum value of gain where all the poles lie inside the unit circle forms the stability boundary. It is shown that the stability boundary of the LPLL is comparable to the stability boundary of the DPLL. Finally where the above filter design system produces slow lock, gear shifting of the DPLL components is considered. This allows the DPLL to start off with a wide loop bandwidth and switch to the narrow bandwidth once the system has locked.
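The gain restriction described above amounts to checking, for each gain value, whether every root of the closed-loop characteristic polynomial lies inside the unit circle. The sketch below does this with the Schur-Cohn (step-down) test instead of plotting a root locus, and then bisects for the gain at which stability is lost. The loop used in the example is hypothetical and deliberately second order so the result can be checked by hand; it is not the loop from the paper.

```python
def poles_inside_unit_circle(coeffs):
    """Schur-Cohn step-down test for a real polynomial
    coeffs[0]*z^n + ... + coeffs[n]: True if all roots have |z| < 1."""
    a = [c / coeffs[0] for c in coeffs]
    while len(a) > 1:
        k = a[-1]                      # reflection coefficient
        if abs(k) >= 1.0:
            return False
        n = len(a) - 1
        a = [(a[i] - k * a[n - i]) / (1.0 - k * k) for i in range(n)]
    return True


def char_poly(gain):
    """Hypothetical closed loop: (z - 1)^2 + gain * (z - 0.5) = 0."""
    return [1.0, gain - 2.0, 1.0 - 0.5 * gain]


def gain_boundary(stable_gain, unstable_gain, tol=1e-9):
    """Bisect between a known-stable and a known-unstable gain."""
    lo, hi = stable_gain, unstable_gain
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poles_inside_unit_circle(char_poly(mid)):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(gain_boundary(1.0, 4.0))     # analytic boundary for this loop: 8/3
```

The same test function accepts polynomials of any order, so a higher-order loop only requires replacing `char_poly`.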
Voltage-buffer-based low-power area-efficient SC FIR filter for wireless communication
Rafal Dlugosz, Ryszard Wojtyna
In this paper, a new approach to realizing finite impulse response (FIR) switched-capacitor (SC) filters suitable for wireless communication is proposed. A circuit-level design for a CMOS 0.35 μm process is presented. The main advantages of the proposed filter are low power consumption and small chip area. In classic approaches to SC FIR filter realization, parameters such as chip area, power consumption, and signal quality conflict with one another. There are various SC FIR architectures. Some are power-efficient but need very complicated circuitry. Others have simple structures but use a large number of high-power active elements such as operational amplifiers. We propose an approach that is a compromise solution. Instead of high-power op amps, specialized low-power simple voltage followers have been introduced to reduce the chip area without increasing the power consumption. The proposed idea is to decrease the number of large capacitors by supplying some large capacitors with a voltage from several small capacitors by means of the specialized voltage followers. Apart from power-efficient operation, our followers are simple, comprising 8 transistors, and operate with high precision (of the order of 10⁻³). The resulting SC FIR filters dissipate relatively low power. Wireless channel filters based on SC FIR techniques are typically of order 30 to 35. The proposed filter is designed for such an order; it will dissipate less than 6 mW from a 3 V supply and occupies a chip area of less than 1.5 mm². The maximum signal frequency is close to 1.25 MHz. For proper operation, the circuit needs only a 2.5 MHz clock generator, which is a low value. The clock generator is realized as an internal block, as in our previous chips implemented in CMOS 0.8 μm and 0.35 μm processes.
Band-pass transimpedance read-out circuit for UHF MEMS resonator applications
Humberto Campanella, Arantxa Uranga, Zachary Davis, et al.
A detailed description of a read-out amplifier for high-frequency MEMS resonators is given. Both read-out requirements and circuit architecture are presented. The architecture of the system is based mainly on three blocks: a transimpedance amplifier, followed by a three-stage voltage-to-voltage amplifier, and finally an output buffer amplifier. Physical design is based on AMS 0.35 μm technology. Simulation and fabrication results are also presented and analyzed. Simulation results show an AC transimpedance gain of 70 dBΩ and a cut-off frequency of 400 MHz, for a band-pass bandwidth over 350 MHz. The fabricated amplifier has an input noise current spectral density of 11 pA/√Hz, a power dissipation of 200 mW, and occupies an active area of 600 μm × 450 μm. Integration of the read-out circuit with a MEMS resonator has been designed and implemented, by properly connecting the MEMS signals to the amplifier, in order to enable characterization of a set of MEMS resonators. Integration analysis will allow future extraction of the electrical parameters of the resonator.
Digital Circuits
An integrated controller for a flexible and wireless atomic force microscopy
Atomic Force Microscopy (AFM) is currently one of the most widespread techniques for biological measurements. Because it offers greater flexibility than conventional equipment, a novel approach in this field is the use of a microrobot equipped with an AFM tool. This paper presents an integrated controller for an AFM tool assembled in a 1 cm³ wireless microrobot. The AFM tool is mounted on the tip of a rotational piezoelectric actuator arm. It consists of an XYZ positioning scanner, based on 4 stacked piezoelectric actuators, and an AFM piezoresistive probe. Two AFM working modes are implemented in the controller, i.e., nanoindentation and AFM scanning. The mismatch of the piezoactuators composing the arm can be corrected. A programmable PID control is included in the controller to provide more flexibility in terms of scanning speed and resolution. An IrDA protocol is used to program the parameters of the AFM tool controller and the positioning of the robot in the working area. The nanoindentation or scanning values can then be read through the IrDA interface without any other external action. Due to the strong power and area restrictions, the controller has been implemented in dedicated logic in a 0.35 μm technology. The design has been done using functional specifications with high-level tools and RTL synthesis. The AFM scanner can be positioned with a resolution of 10 nm and can scan areas up to 1 μm² with an expected vertical resolution of 1 nm.
A CMOS latched driver using bootstrap technique for low-voltage applications
In this paper, we propose a high-performance direct bootstrapped CMOS latched driver circuit (J-driver). It is 28% faster and occupies 58% less active area than a counterpart circuit (L-driver) that uses an indirect bootstrap technique. In addition, the J-driver reduces power consumption by 2% when driving capacitive loads from 1 pF to 6 pF. The challenge in designing this latched driver is to appropriately trade off performance against active area.
Performance analysis of full adders in CMOS technologies
Javier Castro, Pilar Parra, Antonio J. Acosta
Full adders are among the most important building blocks in VLSI digital arithmetic. The area, electrical, timing, power consumption, and noise generation characteristics of this cell depend strongly on the design technique. An exhaustive study taking these parameters into account is carried out, and the resulting analysis should be useful to the community of digital designers. Emphasis is placed on power/noise figures, which are of major concern in current CMOS mixed-signal design. The full adders considered are those using complementary CMOS, pass-transistor logic, double pass-transistor logic, and two versions based on CMOS transmission gates. The main parameters (area, delay, power consumption, and noise generation) have been measured by electrical simulation in a 0.35 μm CMOS technology. The main results obtained are, on one hand, the selection of a logic family for a specific application and, on the other hand, the selection of a specific full adder structure to optimize a chosen parameter: power, noise, or speed.
Temperature effects on circuit synchronism
Sebastian A. Bota, Josep L. Rossello, Marcos Rosales, et al.
The performance increase of VLSI circuits is leading to increased power dissipation and operating temperature; consequently, management of thermally related issues is rapidly becoming one of the most challenging efforts in high-performance IC design. Within-die temperature gradients on silicon can occur due to different activity maps, and in high-performance ICs differences as high as 50 °C can be reached during normal operation. The clock network constitutes one of the most critical elements in synchronous circuits and has a significant impact on speed, area, and power dissipation. Due to the well-known impact of temperature on delay, the effect of non-uniform thermal maps on the clock skew can become significant. In this work we analyze the impact of within-die thermal gradients on the clock skew, considering the temperature dependence of both active devices and interconnects.
Keynote Session II
Review of energy harvesting techniques and applications for microelectronics
Loreto Mateu, Francesc Moll
Trends in technology allow a decrease in both the size and power consumption of complex digital systems. This decrease in size and power gives rise to new paradigms of computing and use of electronics, with many small devices working collaboratively or at least with strong communication capabilities. Examples of these new paradigms are wearable devices and wireless sensor networks. Currently, these devices are powered by batteries. However, batteries present several disadvantages: the need to either replace or recharge them periodically, and their large size and weight compared to high-technology electronics. One possibility to overcome these power limitations is to extract (harvest) energy from the environment to either recharge a battery or directly power the electronic device. This paper presents several methods to design an energy harvesting device depending on the type of energy available.
Technology Reliability
Ring oscillator behavior after oxide breakdown
R. Fernandez, R. Rodriguez, M. Nafria, et al.
The influence of the location of the oxide hard breakdown (HBD) path along the channel in nMOSFETs on the performance and power consumption of a five-stage ring oscillator has been evaluated. A simple MOSFET model that takes the oxide breakdown (BD) into account has been used for the analysis. The results show that in some cases, after oxide BD, the ring oscillator still operates, but the circuit could fail due to higher power consumption.
Transient electro-thermal investigations of interconnect structures exposed to mechanical stress
Stefan Holzer, Christian Hollauer, Hajdin Ceric, et al.
Investigations of state-of-the-art integrated circuit designs clearly show that the temperature in interconnect structures is becoming the dominant and straitening factor for system performance. In this work we combine three-dimensional transient electro-thermal simulations with a finite element formulation of the thermo-mechanical stress problem in order to study the evolution and development of mechanical stress in complex layered interconnect structures at different operating conditions.
Baseband Design for Wireless Transceivers
A 2.5-V 4-mA GSM base-band transmit port with 2.8-mm² area in CMOS 0.18 µm
Emmanuel Marais, Roberto Rivoir
A GSM base-band transmitter with stringent requirements for linearity, matching, and power dissipation has been designed in Atmel CMOS 0.18 μm technology. This paper recalls the digital Gaussian Minimum Shift Keying (GMSK) modulation imposed by the ETSI standard and briefly considers the design and physical implementation of the transmitter itself (IQDAC) in a noisy environment. A test chip, including the entire GSM base-band (IQDAC and IQADC) and a digital GMSK modulator, is presented. Measurement results for the transmit part are summarized at the end.
A programmable baseband chain for a WCDMA/WLAN (802.11b) multi-standard zero-IF receiver
As we move towards convergent 4G wireless, encompassing both 3G cellular (WCDMA) for wide area networks and wireless LAN for "hot-spots", the development of low-power, low-cost multi-band multi-standard wireless chipset solutions is a must. To this end, this paper presents a programmable architecture for an analog baseband chain intended for use in a zero-IF multi-standard WCDMA/WLAN (802.11b) radio receiver. It also addresses DC offset cancellation in the baseband chain. DC offset is one of the major impairments in zero-IF receivers, whose simplicity makes them suitable for single-chip multi-standard designs but whose performance can be reduced if a proper DC offset cancellation scheme is not devised. The system-level design of the baseband chain is given, leading to design specifications for the different blocks in the chain. Extensive simulations carried out in MATLAB/Simulink at the system level and in Cadence design tools at the circuit level show the performance of the system. The circuits will be fabricated in a 0.18 μm CMOS process for a 1.8 V power supply.
Keynote Session III
Low-power short-range transceivers for sensor network applications
Emerging technologies like ZigBee or Ultra Wide Band (UWB) radio, based on the new standards of the IEEE 802.15 family, will, in the near future, compete with and/or complement Bluetooth technology in the development of Wireless Personal Area Networks (WPANs) capable of satisfying the increasing demand for high-bit-rate data transfer links as well as low-power and small-size constraints. Today's coexistence and interconnectivity of Wireless Local Area Networks (WLANs), WPANs, and mobile phones is just the first step towards the implementation of so-called Ambient Intelligence. The main characteristics of this new paradigm are ubiquity, transparency, and intelligence. In this context, sensor networks are the front end of the communication chain. Thus, most wireless data transfers will take place at very short distances, and most of the information flow will be performed at very low rates. To implement the RF transceiver devices constituting sensor networks in an Ambient Intelligence environment, several challenges still need to be solved, among them packaging (SoP vs. SoC approaches), powering (low power, batteryless systems, energy scavenging), and system architecture (new simplified direct conversion approaches). All these matters will be considered in this work.
Wireless Transceivers
CMOS implementation of ultra-wideband systems
Wim Vereecken, Michiel Steyaert
Ultra-Wideband (UWB) systems is the collective term for wireless devices with a large spectral footprint and a low transmission power. The extremely low power spectral density of a UWB system contrasts sharply with classic communication systems, which employ a large power within a small frequency band. Implementation approaches for Ultra-Wideband include classical carrier-based OFDM systems and pulse-based systems, each with its own advantages and disadvantages. Depending on the final application, cost, power, or bandwidth can be the key target. Deep-submicron technologies make it possible to extend the limits of analog building blocks but also introduce new challenges. Furthermore, new problems in analog design deserve attention: the high bandwidth of the signals involved in wideband systems forces a migration to a broadband receiver chain. LNAs (Low Noise Amplifiers), mixers, and ADCs with wideband inputs and outputs have to be designed, while commonly used techniques such as inductive peaking in the power amplifier can no longer be used. The advantages and disadvantages of OFDM and pulse-based transceiver architectures will be compared, together with simulation data, in order to give an overview of important design aspects of an Ultra-Wideband application.
A 2-V 0.35-µm CMOS DECT RF front end with on-chip frequency synthesizer
D. Guermandi, E. Franchi, A. Gnudi, et al.
An integrated CMOS RF front-end receiver complying with the DECT standard is presented. It is implemented in a standard 0.35 μm CMOS technology operating from a 2 V power supply and includes the Low Noise Amplifier (LNA), the quadrature mixer, and the frequency synthesizer. The frequency synthesizer is based on an integer-N Phase Locked Loop (PLL) and uses two coupled Voltage Controlled Oscillators (VCOs) for direct I/Q generation. The packaged circuit exhibits 9.2 dB NF, -19.5 dBm IIP3, and 27.5 dB gain, and consumes 30 mA. This work demonstrates the feasibility of an integrated RF front-end for a wide band standard using a direct conversion architecture that minimizes the number of external components and a cheap standard CMOS technology.
High bit rate BPSK receiver
This work presents a simple differential BPSK receiver front-end using a novel scheme that does not need an explicit carrier recovery system. The main principle of operation is the conversion of the incoming BPSK signal into an ASK signal having the same modulation pattern. Two versions of the system have been designed. One is intended to work in the 433.92 MHz ISM band and the other in the 2 GHz frequency band. Accordingly, two prototypes of the system core, the BPSK-to-ASK converter circuit, have been implemented and tested: first, a hybrid version for low-frequency operation and, second, a multi-chip module (MCM) for the 2 GHz frequency band. The system performance has been evaluated using the Agilent Technologies Advanced Design System (ADS) platform. The ability to jointly perform system, circuit, and EM simulations and co-simulations is the main advantage of this design tool. The results obtained indicate that modulation rates up to 20 Mbit/s for the hybrid version and up to 80 Mbit/s for the MCM version can be reached.
Digital Design Methodologies and Tools I
Power analysis methodology and library in SystemC
L. Pieralisi, M. Caldari, G. B. Vece, et al.
Power dissipation has become one of the main constraints in the design of complex integrated circuits in recent years, due to the steady increase in integration level and operating clock frequency. Power consumption is a major design issue and thus requires effective tools for power estimation and optimization. Moreover, it is known that power analysis and optimization during the early design phases, starting from the system level, can lead to large power savings. In this paper we present Power-Kernel, an efficient object-oriented library for SystemC 2.0, which allows the easy introduction of a power model into the executable specification of a complex design.
Energy estimation and optimization in architectural descriptions of complex embedded systems
Ana Abril, Habib Mehrez, Frederic Petrot, et al.
This paper proposes a method for energy consumption estimation and optimisation in hardware-software embedded systems-on-chip. The aim of our work is to provide a simulation framework enabling power estimation from high-level descriptions (behavioural C models) of systems that include all the hardware components, including new ones. Such analyses are needed to select the best hardware architecture and software organization for a particular application in terms of power consumption and to apply low-power techniques at the system level. The starting point is the architectural description of the system used for simulation. It employs very abstract C-based models of the hardware components. We focus on the cycle-accurate level to improve the estimation accuracy. Behavioural models are extended with energy models that take into account the operations executed per transition in the state machines of the components. The method has been tested on an MPEG4 decoder implementation. The energy estimates were within 6% of physical measurements. Low-power techniques such as an alternative memory hierarchy, clock gating, and voltage/frequency scaling, among others, were applied and analyzed. They reduced the energy consumption of the system by 93%.
Algorithms to get the maximum operation frequency for skew-tolerant clocking schemes
D. Guerrero, M. Bellido, J. Juan, et al.
It is no longer possible to neglect the delay of interconnection lines. Die size is rising very fast, and the delay of the interconnection lines grows quadratically with it. Also, the fact that gate delay keeps getting smaller increases the relative importance of the delay of the interconnection lines. The delay of the clock lines is especially important: if the clock skew is underestimated and the clocking scheme is not properly designed, the system may not work at any clock frequency. In this paper we evaluate the timing performance of three skew-tolerant clocking schemes. These schemes are the well-known Master-Slave clocking scheme (MS) and two schemes developed by the authors: the Parallel Alternating Latches Clocking Scheme (PALACS) and the four-phase Parallel Alternating Latches Clocking Scheme (four-phase PALACS). To carry out this analysis, the authors introduce new algorithms to obtain the clock waveforms required by a synchronous sequential circuit. Separate algorithms were developed for each clocking scheme. The algorithms take a set of timing parameters as input and generate a chronogram of the circuit, trying to minimise the clock period while ensuring that the timing restrictions of the circuit are met for a given clock skew. Using these algorithms it is possible to plot the computation frequency as a function of the clock skew for each clocking scheme. Once the timing parameters and the skew have been estimated, these plots can help to choose the best clocking scheme for a design.
Analog Test
Embedded design-for-testability strategies to test high-resolution SD modulators
Sara Escalera, Alvaro Espin, Oscar Guerra, et al.
This paper describes the design-for-testability strategies integrated in a 0.35 μm CMOS 17-bit@40-kS/s chopper-stabilized switched-capacitor 2-1 cascade ΣΔ modulator for automotive sensor interfaces. After a brief review of the most important effects degrading the circuit performance, a test technique based on dividing the circuit into several blocks that are tested separately is presented. Experimental results show the utility of the implemented test technique in detecting errors in the circuit and characterizing the most important blocks with a minimal area increase for the additional test circuitry.
Digital test of a sigma-delta modulator in a mixed-signal BIST architecture
Luis Rolindez, Salvador Mir, Guillaume Prenat
Oversampling Sigma-Delta modulators are commonly used in the design of high-resolution Analogue-to-Digital Converters (ADCs). The test of these ΣΔ modulators is a difficult and expensive task due to the need for the generation of a high-precision analogue test signal and the necessity of complex digital resources for the test response analysis. These problems can be overcome with the integration of the test in the chip by means of Built-In Self-Test (BIST) approaches. In this paper we present a BIST technique for high-precision ΣΔ modulators, by incorporating on-chip test signal generation and on-chip test response analysis capabilities. The approach, mostly digital, is based on the application of a binary stream as test stimulus and the re-use of the digital decimation filter present in a ΣΔ ADC for the test response analysis. The binary stimulus, which encodes a sinusoidal signal, is chosen to have a very high quality in the bandwidth of interest of the modulator. For the analysis of the test response, a high-precision sinusoidal signal is necessary as reference. This reference signal can be obtained from the same binary stimulus, by passing it directly to the digital decimation filter existing in the converter. Test response and sinusoidal reference signal are both compared by means of a sine-wave curve-fitting algorithm in order to obtain a measure of the SNDR (Signal-to-Noise-plus-Distortion Ratio). Simulation results show that this technique is capable of detecting the SNDR degradations caused by non-idealities in the modulator used in a 16-bit audio ΣΔ ADC architecture.
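The final measurement step, fitting a sine wave to the test response and treating everything left over as noise plus distortion, can be sketched as follows. This is a simplified illustration rather than the paper's BIST algorithm: it assumes the stimulus frequency is known and that the record holds an integer number of cycles (coherent sampling), so the least-squares fit reduces to simple projections. The function name and test signal are invented for the example.

```python
import math


def sine_fit_sndr(samples, cycles):
    """Estimate SNDR in dB from a three-parameter sine fit.

    Assumes `cycles` is an integer with 0 < cycles < len(samples)/2,
    i.e. the record contains a whole number of signal periods.
    """
    n = len(samples)
    w = 2.0 * math.pi * cycles / n
    # Least-squares amplitudes of the cosine, sine, and DC terms.
    a = 2.0 / n * sum(x * math.cos(w * i) for i, x in enumerate(samples))
    b = 2.0 / n * sum(x * math.sin(w * i) for i, x in enumerate(samples))
    c = sum(samples) / n
    signal_power = (a * a + b * b) / 2.0
    # Whatever the fit does not explain counts as noise plus distortion.
    residual_power = sum(
        (x - (a * math.cos(w * i) + b * math.sin(w * i) + c)) ** 2
        for i, x in enumerate(samples)
    ) / n
    return 10.0 * math.log10(signal_power / residual_power)


if __name__ == "__main__":
    n, k = 64, 5
    # Unit sine with a third harmonic 40 dB below the fundamental.
    x = [math.sin(2 * math.pi * k * i / n)
         + 0.01 * math.sin(2 * math.pi * 3 * k * i / n) for i in range(n)]
    print(f"SNDR = {sine_fit_sndr(x, k):.2f} dB")
```

Without coherent sampling, the three coefficients would have to be obtained by solving the full least-squares normal equations instead of by projection.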
Experimental comparison of different oscillation-based test techniques in an analog block
Kay Suenaga, Rodrigo Picos, Sebastia Bota, et al.
This paper experimentally analyses the capabilities of an Oscillation-Based Test technique for diagnosis purposes. To evaluate the feasibility of this test strategy, the technique is applied to an Operational Transconductance Amplifier with fault injection capabilities. The application of this methodology has a low impact on circuit performance. Voltage and current magnitude have been considered as test observables. The effects of catastrophic and parametric defects (bridges, opens and shorts) are analyzed in this work. Results show that, with the right choice of test observable, this technique provides high fault coverage levels even in the presence of process variations.
Voltage to frequency converter for DAC test
John Hogan, Ronan Farrell
In this paper a modified relaxation oscillator is proposed as the core of an analog-to-digital modulator for on-chip signal extraction for test. The architecture uses digital current source generation and digital switching in place of active circuitry. The resulting design allows for high input sensitivity and robustness to component variation while occupying little silicon area. This paper addresses the main challenges in implementing this modulator and describes how it may be integrated with a digital-based tester.
Modeling and Design of Passive RF Components
Design and modeling of an on-silicon spiral inductor library using improved EM simulations
This paper deals with the design and modeling of integrated spiral inductors for RF applications by means of a general-purpose electromagnetic (EM) simulator. These tools allow flexible optimization of the inductor layout structure. The inductor performance can be obtained by using a three-dimensional design tool or a two-dimensional one. Planar 2-D, or so-called 2.5-D, simulators are faster and accept complex coil geometries. We have used one of these simulators, the Advanced Design System planar EM simulator, Momentum, from Agilent. The inductor quality factor (Q) is limited, among other phenomena, by the series resistance of the metal traces and the substrate losses. Therefore, for the simulation results to be reliable, the simulator requires an accurate setup of the process and simulator parameters and a correct algorithm to model metal thickness. In this paper we analyze and compare these different approaches. A high-quality-factor inductor library in a 0.35 μm SiGe technology at 5 GHz is also designed in this work using the proper simulator setup. Nine of the inductors have been fabricated and measured to test the simulator reliability. Measurements taken over a frequency range from 500 MHz to 10 GHz show good agreement with 2.5-D EM simulations.
Modeling of passive components in VLSI technologies
On-chip passives, such as inductors and transformers, are key components in the design of RF building blocks when using VLSI technologies. Most of the design cycle is spent simulating passives in an attempt to obtain the maximum performance. Recently, work on the modelling of passives has been directed at fast computation algorithms, due to the need to handle several passives in a complete RF circuit. However, little attention has been paid to the optimization of passives. The work presented tries to fill this gap. The algorithm is based on three steps: the split of the magnetic and electric modelling problem, based on a PEEC description; a model order reduction, based on plausible arguments; and the use of analytical formulae to keep scalability. Fast computation is achieved thanks to both the separation of the magnetic and electric problems and the model order reduction. By keeping an analytical formulation, the optimization of the layout for minimum losses is driven by a physical algorithm instead of a mathematical one. The tool has been checked against experimental data from inductors fabricated in different VLSI technologies, showing its possibilities in the design of RF building blocks.
Design considerations for high pass frequency passive filters
N. Sainz, I. Cendoya, U. Alvarado, et al.
Two high-pass filters at 2.4 GHz and 5.2 GHz in a 0.8 μm SiGe process are presented. The design process and measurement results are discussed. This paper evaluates electromagnetic coupling and presents a theoretical filter model which takes these effects into account; it also enumerates some design considerations to improve passive component design. A previous model of passive components is illustrated, and its main conclusions are presented to justify the selection of inductors and varactors. Some inductors and varactors were manufactured previously to study how to improve the quality factor and to ensure accurate inductance and capacitance values. Different geometries for these passive components were designed, fabricated and measured; the selection of the best inductor and varactor for filter design is based on these measurement results. After the components are selected, the filter architecture is explained. The choice of the optimal filter configuration is based, among other considerations, on minimizing the number of passive components, especially inductors. Because inductors have the lowest quality factor, filter characteristics improve as the number of inductors is reduced; therefore, the selected filter order is three, and just one inductor is used. Once the filter designs were manufactured and measured, some non-modelled effects were observed and studied from the measurement results. These effects produce pass-band attenuation degradation, cut-off frequency deviation and resonant behavior at frequencies below the cut-off frequency. To determine the cause of these effects, electromagnetic coupling simulations are made using the CADENCE simulator. This electromagnetic analysis helps evaluate the interaction effects between passive components. Electromagnetic simulations agree with the filter degradation measurement results.
Design considerations and tradeoffs for passive RFID tags
Faisal A. Hussien, Didem Z. Turker, Rangakrishnan Srinivasan, et al.
Radio Frequency Identification (RFID) systems are widely used in a variety of tracking, security and tagging applications. Their operation in non-line-of-sight environments makes them superior to similar devices such as barcode and infrared tags. RFID systems span a wide range of applications: medical history storage, dental prosthesis tracking, oil drilling pipe and concrete stress monitoring, tollway services, animal tracking applications, etc. Passive RFID tags generate their power from the incoming signal; therefore, they do not require a power source. Accordingly, minimizing the power consumption and the implementation area are usually the main design considerations. This paper presents a complete analysis of the design of a passive RFID tag. A system design methodology is introduced, including the main issues and tradeoffs between different design parameters. The uplink modulation techniques used (ASK, PSK, FSK, and PWM) are illustrated, showing how to choose the appropriate signaling scheme for a specific data rate, a certain distance of operation and a limited power consumption budget. An antenna system (transmitter and receiver) is proposed that provides the maximum distance of operation with the transmitted power stated by FCC regulations. The backscatter modulation scheme used in the downlink, whether ASK-BM or PSK-BM, is presented and the differences between the two are discussed. The key building blocks, such as the charge pump, voltage reference, and the regulator used to generate the DC supply voltage from the incoming RF signal, are discussed along with their design tradeoffs. A complete architecture for a passive RFID tag is provided as an example to illustrate the proposed RFID tag design methodology.
Keynote Session IV
Integrated wireless systems: The future has arrived
It is believed that we are just at the beginning with wireless, and that a new age is dawning for this breakthrough technology. Thanks to several years of industrial manufacturing in mass-market applications such as cellular phones, wireless technology has now reached a level of maturity that, combined with achievements from other fields, such as information technology, artificial intelligence, pervasive computing, the science of new materials, and micro-electro-mechanical systems (MEMS), will enable a networked stream of real-time information that will accompany us in our daily life in a totally seamless, transparent fashion. As almost any application scenario will require the deployment of complex, miniaturized, almost "invisible" systems operating with different wireless standards, hard technological challenges will have to be faced in designing and fabricating ultra-low-cost, reconfigurable, and multi-mode heterogeneous smart micro-devices. But ongoing, unending progress in wireless technology holds the promise of helping to solve important societal problems in the health-care, safety, security, industry and environment sectors, and in general of opening the possibility of an improved quality of life at work, while travelling, at home, practically "everywhere, anytime".
Reconfigurable Radio Systems
Performance requirements for analog-to-digital converters in wideband reconfigurable radios
David Naughton, Gerard Baldwin, Ronan Farrell
With the current trend towards software defined radio, several candidate architectures for the analog receiver front-end have been presented. A common proposal for software defined reconfigurable radio is to develop a wideband ADC and utilise this for capturing a large segment of the spectrum. This would enable the subsequent signal processing operations of channel selection and data extraction to be carried out by a digital processor. This would allow the radio to be reconfigured by simply changing the software. In analysis of these systems, powerful neighbouring signals, or blockers, are considered but it has been conveniently assumed that suitable dynamic range will be available at the ADC. This is an acceptable assumption in narrowband systems where automatic gain control and analogue channel select filters can be used, but is not appropriate for a wideband system. In this paper we present an analysis based on bit-error-rates (BER) which shows the effect of blockers in a wideband architecture on the performance of the communication link and on the dynamic range requirements of the ADC.
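As a rough illustration of why blockers drive the ADC dynamic range (a simplified sizing sketch, not the BER-based analysis of the paper), the ideal quantization-limited SNR of an N-bit converter for a full-scale sine is about 6.02N + 1.76 dB. If the full scale is set by a blocker that sits some number of dB above the wanted signal, that margin is added directly to the SNR the ADC must deliver. The function names and the example figures are assumptions, and oversampling or processing gain is ignored.

```python
import math

def ideal_adc_snr_db(bits):
    """Ideal quantization-limited SNR of an N-bit ADC for a full-scale sine."""
    return 6.02 * bits + 1.76

def min_adc_bits(blocker_above_signal_db, required_snr_db):
    """Smallest resolution whose ideal SNR covers blocker headroom plus link SNR.

    Assumes the blocker sets the ADC full scale and that no filtering or
    processing gain is available (both are simplifications).
    """
    needed_db = blocker_above_signal_db + required_snr_db
    return max(1, math.ceil((needed_db - 1.76) / 6.02))

# Hypothetical case: blocker 60 dB above the wanted signal, 10 dB SNR needed for demodulation.
print(min_adc_bits(60.0, 10.0))
```

With these example numbers, the blocker alone accounts for 60 of the 70 dB the converter has to provide.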
A modular testbed for hardware reconfigurable radio at the 2.4 GHz ISM band
Gerard Baldwin, Ronan Farrell
A modular testbed for use in developing software defined radio is documented in this paper. The testbed is focused on the 2.4 GHz ISM band but may be used at other frequencies. An RF transceiver with variable transmit/receive frequencies and bandwidths is provided. It provides the capability to support many modulation schemes and standards such as GSM, UMTS, IEEE 802.11b and parts of the IEEE 802.16 standards. It performs the RF functions of the radio, with the other PHY and MAC layer functions, such as equalisation and error coding, being performed by a host computer. It communicates with the host computer system through a USB2 interface, allowing data rates of up to 60 Mbytes per second. An API is used for communication with the host computer system, allowing for modulation/demodulation and coding/decoding in software on the host system and reconfiguration of the radio system.
Memory Circuits
Design and simulation of an embedded DRAM cell made up of MOSFETs having alternative gate dielectrics
N. Konofaos, Th. Voilas, G. Ph. Alexiou
Design and construction of new sub-micron MOSFETs with alternative gate dielectrics has emerged as a new technology for use in high-performance logic or low-power memory circuits. The modelling of the new devices needs to take into account the effects that the new dielectrics have on the MOS device performance. In this paper, we examine such effects in terms of both capacitance and leakage current effects. First, we investigate the role of the parasitic capacitances appearing at the MOS device due to either material-related processes or metallization. These capacitances are modelled accordingly in order to derive the device characteristics. Then, leakage currents are taken into account and the whole device is simulated using a 90 nm technology based on the BSIM4 model equations, suitably modified to account for these effects. The application of such devices to memory circuits is examined in order to take into account device parameters such as the threshold voltage, output currents and timing. As a result, the design of an embedded DRAM based on the MOSFETs with the alternative gate dielectrics is presented and analysed. The single MOSFET behaviour and subsequently the DRAM circuit performance are presented and the relevant characteristics are derived. The simulation revealed low output currents for the MOSFETs and high refresh rates for the DRAM circuits. Deviations from the ideal case are examined and solutions and further work are proposed.
A Dickson charge pump circuit driven by boosted clock for low-voltage flash memories
Hyoung-Joo Kim, Gil-Su Kim, Soo-Won Kim
In this paper, a Dickson charge pump circuit driven by a boosted clock for low-voltage flash memories is proposed. The voltage pumping gain of each stage can be considerably improved by using a high-voltage clock instead of a conventional clock. Since the voltage gain of each stage is considerably increased by this method, the circuit can generate a high output voltage although the body effect still exists. Simulation results show that it can generate enough voltage to apply to the flash memory even at a supply voltage of less than 1.5 V. Because this method can be applied to almost all types of improved Dickson charge pump circuits, it is expected to improve the performance of high-voltage generating circuits for low-voltage flash memories.
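To make the effect of the clock amplitude concrete, the textbook first-order model of a diode-connected Dickson pump can be sketched as below. This is an illustration under stated assumptions, not the circuit or the results of the paper: the function name and all component values are hypothetical, the per-device voltage drop is held constant (so the body effect mentioned in the abstract is not modelled), and the boosted clock is simply represented by a larger clock swing.

```python
def dickson_output_voltage(v_in, v_clk, stages, v_drop,
                           c_pump, c_stray=0.0, i_out=0.0, f_clk=1.0):
    """First-order Dickson charge pump output voltage.

    Vout = Vin - Vd + N * (C / (C + Cs) * Vclk - Vd - Iout / ((C + Cs) * f))

    v_drop is the (constant) diode or threshold drop per device; the body
    effect, which raises this drop in later stages, is ignored here.
    """
    coupling = c_pump / (c_pump + c_stray)
    gain_per_stage = coupling * v_clk - v_drop - i_out / ((c_pump + c_stray) * f_clk)
    return v_in - v_drop + stages * gain_per_stage

# Hypothetical 4-stage pump at a 1.5 V supply, unloaded, no stray capacitance.
conventional = dickson_output_voltage(1.5, 1.5, 4, 0.6, 1e-12)  # clock swing = supply
boosted = dickson_output_voltage(1.5, 3.0, 4, 0.6, 1e-12)       # clock swing doubled
print(conventional, boosted)
```

In this model each stage contributes roughly the clock swing minus one device drop, so raising the clock amplitude increases the gain of every stage at once.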
Keynote Session V
Practical considerations for real-time super-resolution implementation techniques over video coding platforms
Gustavo M. Callico, Sebastian Lopez, Rafael Peset Llopis, et al.
This paper addresses practical considerations for the implementation of algorithms developed to increase the image resolution from a video sequence by using techniques known in the specialized literature as super-resolution (SR). In order to achieve a low-cost implementation, the algorithms have been mapped onto a previously developed video encoder architecture. By re-using this architecture and performing only slight modifications to it, the need for specific, and usually high-cost, SR hardware is avoided. This modified encoder can be used either in native compression mode or in SR mode, where SR can be used to increase the image resolution beyond the sensor limits or as a smart way to perform electronic zoom, avoiding the use of power-hungry mechanical parts. Two SR algorithms are presented and compared in terms of execution time, memory usage and quality. The features of these algorithms are analyzed from a real-time implementation perspective. The first algorithm follows an iterative scheme, while the second one is a modified version in which the iterative behaviour has been broken. The video encoder together with the new SR features constitutes an IP block inside Philips Research, upon which several System on Chip (SoC) platforms are being developed.
Multimedia II
Real-time super-resolution over raw video sequences
This paper addresses the enhancement of the spatial resolution of a video sequence from a low resolution video sequence in real time. The technique used, known as super-resolution reconstruction, exploits the relative motion from frame to frame that produces sub-pixel shifts. The algorithm, based on a previous version mapped onto a video encoder architecture, is oriented towards a hardware implementation and requires resources optimization. In order to achieve a good resolution improvement, the motion estimation algorithm must produce motion vectors as close to the real ones as possible. At the same time, this motion estimation must match real time requirements. Therefore, an exhaustive technique is applied in combination with a simple segmentation of each frame for a motion prediction refinement. Experimental results have been obtained for a set of video sequences subjected to different motion characteristics.
A low-cost bidimensional smart pixel network for video coding operations
Optimum visual and audio quality at high compression ratios, as well as reduced area and power dissipation, are key factors for current and future commercial mobile multimedia devices. In this sense, a real-time Smart Pixels Array designed to efficiently perform key video coding operations is presented in this paper. In particular, the array introduced is capable of performing the Discrete Wavelet Transform (DWT), Zerotree Entropy (ZTE) Coding and Frame Differencing (FD) over SQCIF images (128×96 pixels) by dividing them into wavelet blocks (8×8 pixels). In order to perform these tasks, the array has been designed as a bidimensional network of interconnected smart pixel processors working in a massively parallel fashion, allowing operation at very low clock frequencies and hence low power dissipation. Each of these smart pixels is composed of a photodetector, an analog-to-digital converter that produces a digital representation of the light intensity received by the photodetector, and a Ferroelectric Liquid Crystal placed over the whole surface of the pixel to display the image. Additionally, each pixel has dedicated circuitry which performs all the specific computations related to the three video coding operations previously mentioned, exhibiting a power dissipation of 4.15 μW at 128 kHz and a square area of 110×110 μm² using a 0.25 μm CMOS technology. The array has been integrated into a mobile multimedia device prototype, fully designed at our research centre, capable of sending and receiving compressed audio and video information with a total power consumption of 1.36 W in an area of 351.5 mm².
FPGA implementation of a fuzzy based video de-interlacing algorithm
P. Brox, S. Sanchez-Solano, I. Baturone, et al.
De-interlacing algorithms are used to convert interlaced video into progressive scan format. Among the different techniques reported in the literature, motion-adaptive de-interlacing techniques that combine spatial and temporal interpolation according to the presence of motion achieve good results with a low computational cost. This paper presents the FPGA implementation of a motion-adaptive algorithm which employs fuzzy logic in detecting motion and edges. Motion, which is evaluated at each pixel of the de-interlaced frame, determines the interpolation between an enhanced edge-dependent line average method and field insertion. Extensive simulations with video sequences show the performance advantages of the proposed method over other well-known de-interlacing techniques. The hardware implementation of the algorithm has been carried out on an FPGA, obtaining a low-cost solution for real-time processing.
Analog, Mixed-Signal, and Power Circuit Design Methodologies and Tools
Simulation-based high-level synthesis of Nyquist-rate data converters using MATLAB/SIMULINK
Jesus Ruiz-Amaya, Jose M. de la Rosa, Manuel Delgado-Restituto, et al.
This paper presents a toolbox for the simulation, optimization and high-level synthesis of Nyquist-rate Analog-to-Digital (A/D) and Digital-to-Analog (D/A) Converters in MATLAB. The embedded simulator uses SIMULINK C-coded S-functions to model all required subcircuits including their main error mechanisms. This approach reduces the simulation CPU time by up to 2 orders of magnitude compared with previous approaches based on the use of SIMULINK elementary blocks. Moreover, S-functions are more suitable for implementing a more detailed description of the circuit. For all subcircuits, the accuracy of the behavioral models has been verified by electrical simulation using HSPICE. For synthesis purposes, the simulator is used for performance evaluation and combined with a hybrid optimizer for design parameter selection. The optimizer combines an adaptive statistical optimization algorithm inspired by simulated annealing with a design-oriented formulation of the cost function. It has been integrated in the MATLAB/SIMULINK platform by using the MATLAB engine library, so that the optimization core runs in the background while MATLAB acts as a computation engine. The implementation on the MATLAB platform brings numerous advantages in terms of signal processing, high flexibility for tool expansion and simulation with other electronic subsystems. Additionally, the presented toolbox comprises a friendly graphical user interface that allows the designer to browse through all steps of the simulation, synthesis and post-processing of results. In order to illustrate the capabilities of the toolbox, a 0.13 μm CMOS 12-bit, 80 MS/s analog front-end for broadband power line communications, made up of a pipeline ADC and a current-steering DAC, is synthesized and high-level sized. Different experiments show the effectiveness of the proposed methodology.
Analog and Mixed-Signal Design Methodologies and Tools
Geometrically constrained parasitic-aware synthesis of analog ICs
In order to speed up the design process of analog ICs, iterations between different design stages should be avoided as much as possible. More specifically, spins between electrical and physical synthesis should be reduced, since this is a very time-consuming task: if circuit performance including layout-induced degradations proves unacceptable, a re-design cycle must be entered, and the electrical synthesis, the physical synthesis, or both have to be repeated. It is also worth noting that if geometric optimization (e.g., area minimization) is undertaken after electrical synthesis, it may become another source of unexpected degradation of the circuit performance, due to the impact of the geometric variables (e.g., transistor folds) on the device and routing parasitic values. This awkward scenario is caused by the complete separation of electrical and physical synthesis, a design practice commonly followed so far. Parasitic-aware synthesis, which consists of including parasitic estimates in the circuit netlist directly during electrical synthesis, has been proposed as a solution. While most of the reported contributions either tackle parasitic-aware synthesis without paying special attention to geometric optimization or approach both issues only partially, this paper addresses the problem in a unified way. In what has been called layout-aware electrical synthesis, a simulation-based optimization algorithm explores the design space with geometric variables constrained to meet certain user-defined goals, which provides reliable estimates of layout-induced parasitics at each iteration and, thereby, accurate evaluation of the ultimate circuit performance. This technique, demonstrated here through several design examples, requires knowing layout details beforehand; to facilitate this, procedural layout generation is used as the physical synthesis approach due to its speed and its ability to capture analog layout know-how.
Simulation-based low-level optimization tool for analog integrated circuits
In this paper, a tool based on free software to perform low-level optimization on analog designs is presented. Nowadays, the use of design automation tools for microelectronic circuit design is extending from digital to analog circuits, due in part to the fact that although the analog part of a mixed-signal ASIC takes only 10% of the silicon area, it represents almost 90% of the whole design time. For analog circuits, the design process can be divided into two major tasks: topology selection and device sizing. The tool presented here consists of a simulation-based optimizer which is used to perform automatic low-level analog circuit sizing. The tool is composed of three modules: a layout generator, which includes a parasitic extractor; an analog circuit simulator; and a circuit optimizer. The first two modules are, respectively, Magic and Spice from Berkeley, while the third one, the optimizer, has been developed to evaluate dc, ac, and transient sensitivity simulations performed by Spice and make corrections to the layout sizing. The optimization process starts with a certain topology and standard-sized devices, which is then extracted by Magic and simulated by Spice. Performance is evaluated and a sizing correction is proposed. These simulations and corrections are performed in an iterative loop until circuit performance meets the design parameters. The tool is demonstrated with an example of a simple analog subcircuit optimization, where parameters like silicon area or power dissipation are optimized while the circuit stays within its design parameters.
Voltage-Controlled Oscillators
Emitter degenerated voltage controlled oscillators for millimeter wave operation
Three oscillators are presented for operation in the 23-25 GHz range where consideration is given to noise, frequency, and power effects of different tank sizes and topologies. The circuits in question employ inductor based tanks with negative resistance provided through an emitter degenerated cross-coupled pair, and use emitter degeneration and inductive peaking to provide bandwidth extension within the single stage output buffers. These circuits are implemented in a 0.18 μm SiGe BiCMOS technology with a 54 GHz ft, and use single device stacks to achieve low voltage operation with VDD as low as 900 mV, while consuming as little as 2.25 mW of power.
A 3.2-GHz fully integrated low-phase noise CMOS VCO with self-biasing current source for the IEEE 802.11a/hiperLAN WLAN standard
C. Quemada, I. Adin, G. Bistue, et al.
A 3.3 V, fully integrated 3.2-GHz voltage-controlled oscillator (VCO) is designed in a 0.18 μm CMOS technology for the IEEE 802.11a/HiperLAN WLAN standard for the UNII band from 5.15 to 5.35 GHz. The VCO is tunable between 2.85 GHz and 3.31 GHz. An NMOS architecture with a self-biasing tank current source is chosen. A startup circuit has been employed to avoid zero initial current. Current variation is lower than 1% for voltage supply variations of 10%. The use of a self-biasing current source in the tank provides a greater safety margin in the transconductance value and allows operation at more extreme operating points. The designed VCO displays a phase noise and output power of -98 dBc/Hz (at 100 kHz offset frequency) and 0 dBm, respectively. This phase noise has been obtained with inductors of 2.2 nH and a quality factor of 12 at 3.2 GHz, and P-N junction varactors whose quality factor is estimated to exceed 40 at 3.2 GHz. These passive components have been fabricated, measured and modeled previously. The core of the VCO consumes 33 mW of DC power.
NMOS symmetric load ring VCOs modeling for submicron technologies
Voltage Controlled Oscillators (VCOs) are a key element in PLL design. The simulation of VCOs is a time-consuming process because transient circuit simulations must run long enough for the steady state to be attained. Furthermore, the robustness of the design against operating and technological conditions must also be tested by simulating the circuits at several corners, which makes a design methodology based on iterative simulation rather prohibitive for this class of circuits. The development of efficient and reliable VCO models is therefore a very important task, not only for the automation of circuit design, but for design space exploration as well. Besides accuracy and simplicity, models must easily adapt to rapid technology evolution. In order to ensure such robustness, we must develop models based on transistor-level technological parameters. This paper presents an accurate model for submicron VCOs. The model obtained is based on the Npower MOS model, yielding quite accurate results for submicron technologies. An example considering a 1.2 V TSMC013 VCO is presented, where the accuracy of the results obtained against Hspice simulation is shown. Results obtained in about 2 seconds have 4% average error, compared to simulations taking over 15 minutes.
An 18-GHz integrated double-balanced direct down-conversion mixer and emitter degenerated quadrature VCO in 47-GHz ft SiGe
An integrated 18 GHz double-balanced direct down-conversion mixer and emitter degenerated quadrature VCO are designed and fabricated in an IBM 47 GHz ft SiGe BiCMOS process. A novel headroom optimization scheme is proposed to optimize mixer conversion gain and linearity. The mixer uses an LC tank to reduce the supply voltage. With a 3.3 V supply voltage, the mixer core consumes 16.5 mW and the output buffer matched to 50 Ω consumes 33 mW. Measurements indicate a conversion gain of 4.5 dB, a double-sideband noise figure of 7.1 dB, an IIP3 of -1 dBm, and a 1 dB compression point at -12.2 dBm output power. The mixer has the best figure-of-merit compared to recently published mixers operating at similar frequencies in a Si-based process. The voltage controlled oscillator uses an emitter degenerated LC oscillator core with both SiGe HBT and CMOS buffers to achieve oscillation, providing direct downconversion for the aforementioned mixer. The oscillator has two anti-phase coupled cores to lower the phase noise through frequency locking, with the unused output ports terminated in 50 Ω. The two circuits (several variants of each) are integrated monolithically, with an oscillator breakout with a phase noise performance of -99 dBc/Hz (at 1 MHz separation) with 1 GHz tuning range while pulling 19 mA from a 2.5 V rail. The paper will include all the necessary design equations used to optimize both circuits along with comparisons with other published results.
Digital Test and Verification
A functional validation methodology based on error models for measuring the quality of digital integrated circuits
Celia Lopez-Ongil, Luis Entrena-Arrontes, Teresa Riesgo-Alcaide, et al.
Functional validation plays an important role in the design cycle of digital integrated circuits. The generation of good test benches is required for checking the complete circuit behaviour. Early location of design errors could greatly reduce the development time and cost for these circuits. There are several initiatives for the development of methods that enhance the functional validation of a design. Traditionally, the logic abstraction level has been the most used for this purpose, but recent years have shown a strong trend to treat the problem at higher abstraction levels, where design teams normally work. High abstraction levels and automatic synthesis tools are currently being used in top-down methodology. These aspects make it difficult to find design errors when the circuit is described at lower levels of abstraction. It is crucial to obtain a complete functional validation system applicable in the first design stages, where circuits are currently being designed, and also usable along the whole design process for further test plans. In this paper we propose a complete methodology for performing high-quality functional validation. The proposed method checks the capability of a given test bench to detect design errors in a circuit description. This checking employs functional simulation of the circuit description at RT level together with the application of error models. An automatic and formal protocol has been developed so that design teams can apply it with no extra effort. The method provides a measurement of the quality of functional validation as well as the location of insufficiently validated areas in the circuit. Therefore, the proposed method helps designers in the process of performing the functional validation of their circuits, and it can be applied automatically from RT descriptions down to lower abstraction levels. Finally, experimental results have demonstrated the correctness of the proposed method as well as of the error models applied.
Built-in test engine and fault simulation for memory
P. McEvoy, R. Farrell
In this paper an on-chip method for testing high-performance memory devices is presented. This new technique occupies minimal area and retains the full flexibility of existing methods for the dynamic introduction of new test patterns. This is achieved through microcode test instructions and the associated on-chip state machine. The proposed methodology will enable at-speed testing of memory devices, reducing the overall test time. The relevance of this work is placed in context with an introduction to memory testing and the techniques and algorithms generally used today. Additionally, we examine the use of fault simulation in methodology evaluation for memory test. Finally, we present a prototype design for the implementation of this methodology that incurs minimal test latency and provides a programmable interface to enable varying fault coverage and location patterns.
A one-step algorithm for finding the optimum solution of the state justification problem in RTL designs using MILP
The state justification problem is the decision problem of finding a sequence of states and input values that satisfy an output condition for a given state machine or RTL description. In such problems, there always exist optimal state sequences that require a minimum number of clock cycles to reach the desired state. As Boolean decision problems, state justification problems can be expressed as satisfiability (SAT) problems by using the time-frame expansion algorithm. Boolean SAT and BDD-based techniques are bit-level decision procedures commonly used by industrial hardware verification tools. Unfortunately, these approaches are not efficient enough, because they do not inherit the word-level information from the RTL design. Recent approaches to the SAT problem target RTL designs that contain both word-level arithmetic blocks for data flow and bit-level Boolean logic for control flow. These approaches transform the whole SAT problem for an RTL description into a mixed integer linear program (MILP). This paper presents a new approach that finds, in a single step, the optimum input sequence for a given RTL description to reach a desired state. This is accomplished by applying a novel time-frame expansion method that guarantees an optimal solution and avoids performing time-frame expansions iteratively. Experimental results will demonstrate that the proposed methodology can solve any state justification problem in one step for complex FSMs. The main application of this procedure is test pattern generation, where the main problem is to reduce the length of the test sequences that verify a microcircuit.
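To make the notion of a minimum-length justification sequence concrete, here is a minimal sketch that uses plain breadth-first search over an explicitly enumerated state machine. This is a swapped-in illustrative technique, not the MILP formulation of the paper; the function `shortest_justification` and the toy 3-bit machine are hypothetical.

```python
from collections import deque


def shortest_justification(next_state, inputs, start, is_target):
    """Return a shortest input sequence driving `start` to a target state.

    Plain BFS over explicit states (an illustrative stand-in, not MILP).
    Returns None when no target state is reachable.
    """
    parent = {start: None}          # state -> (previous state, input applied)
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if is_target(state):
            sequence = []
            while parent[state] is not None:
                previous, applied = parent[state]
                sequence.append(applied)
                state = previous
            return sequence[::-1]
        for value in inputs:
            successor = next_state(state, value)
            if successor not in parent:
                parent[successor] = (state, value)
                queue.append(successor)
    return None


# Hypothetical 3-bit machine: input 0 holds, 1 increments, 2 doubles (mod 8).
def toy_next_state(state, value):
    if value == 1:
        return (state + 1) % 8
    if value == 2:
        return (state * 2) % 8
    return state


sequence = shortest_justification(toy_next_state, (0, 1, 2), 0, lambda s: s == 6)
print(sequence)
```

Because BFS explores states in order of distance from the start state, the first target it reaches is reached in the fewest clock cycles. Explicit enumeration does not scale to word-level RTL designs, which is the motivation the abstract gives for a MILP formulation.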
A complete hardening method for the generation of fault tolerant circuits
Fault tolerance has become an important requirement for integrated circuits, not only in safety-critical applications such as aerospace circuits, but also for applications working at the Earth's surface. Since the appearance of nanometer technologies, the sensitivity of integrated circuits to radiation has increased notably, making soft errors much more frequent. Therefore, hardened circuits are currently required in many applications where fault tolerance was not a requirement in the very recent past. In this paper, tools and methods for the whole hardening process of a circuit are presented: tools for the automatic insertion of fault-tolerant structures in a circuit description and methods for evaluating the fault tolerance achieved. These methods allow the evaluation of fault tolerance by means of emulation in platform FPGAs, which offer a much faster way to perform the evaluation than simulation-based techniques. Different circuits are used to test the proposed tool for inserting fault-tolerant structures. Fault tolerance evaluation is performed using the proposed fault emulation methods, before and after applying the hardening process, showing the fault tolerance improvement. The proposed evaluation techniques have been compared, in terms of evaluation time, with previously proposed solutions and with simulation-based solutions, showing improvements of several orders of magnitude.
Multimedia III
Parallel-pipeline 2-D DCT/IDCT processor chip
This paper describes the architecture of an 8x8 2-D DCT/IDCT processor with high throughput and a cost-effective architecture. The 2-D DCT/IDCT is calculated using the separability property, so that its architecture is made up of two 1-D processors and a transpose buffer (TB) as intermediate memory. This transpose buffer has a regular structure based on D-type flip-flops with a double serial input/output data flow that is well suited to pipeline architectures. The processor has been designed with a parallel and pipelined architecture to attain high throughput, reduced hardware and maximum efficiency in all arithmetic elements. This architecture allows the processing elements and arithmetic units to work in parallel at half the frequency of the data input rate, except for the normalization of the transform, which is done in a multiplier operating at the maximum frequency. Moreover, precision analysis has verified that the proposed processor meets the demands of IEEE Std. 1180-1990, used in the video codecs ITU-T H.261 and ITU-T H.263. The processor has been conceived using a standard-cell design methodology and manufactured in a 0.35-μm CMOS CSD 3M/2P 3.3V process. It has an area of 6.25 mm² (the core is 3 mm²) and contains a total of 11.7k gates, of which 5.8k gates are flip-flops. A data input rate of 300 MHz has been established, with a latency of 172 cycles for the 2-D DCT and 178 cycles for the 2-D IDCT. The computing time of a block is close to 580 ns. Its performance in computing speed as well as hardware complexity indicates that the proposed design is suitable for HDTV applications.
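The separability property mentioned in this abstract can be sketched in software as two 1-D passes with a transpose in between. The sketch below is only a floating-point illustration under the assumption of an orthonormal DCT-II; it does not model the chip's fixed-point arithmetic, pipelining, or normalization multiplier, and the function names are hypothetical.

```python
import math

N = 8  # block size used by the 8x8 transform


def dct_1d(samples):
    """Orthonormal 1-D DCT-II of a length-N sequence (floating point)."""
    result = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        total = sum(
            samples[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for n in range(N)
        )
        result.append(scale * total)
    return result


def transpose(block):
    """Software stand-in for the transpose buffer between the two passes."""
    return [list(column) for column in zip(*block)]


def dct_2d(block):
    """2-D DCT computed as row pass, transpose, second row pass, transpose."""
    row_pass = [dct_1d(row) for row in block]
    column_pass = [dct_1d(row) for row in transpose(row_pass)]
    return transpose(column_pass)


flat_block = [[1.0] * N for _ in range(N)]
coefficients = dct_2d(flat_block)
print(round(coefficients[0][0], 6))  # DC term of a constant block
```

The same 1-D routine is reused for both passes, which mirrors why the hardware needs only 1-D processors plus an intermediate memory.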
Evaluation of architectures for an ASP MPEG-4 decoder using a system-level design methodology
Trends in multimedia consumer electronics, digital video and audio, aim to reach users through low-cost mobile devices connected to data broadcasting networks with limited bandwidth. An emergent broadcasting network is the digital audio broadcasting network (DAB) which provides CD quality audio transmission together with robustness and efficiency techniques to allow good quality reception in motion conditions. This paper focuses on the system-level evaluation of different architectural options to allow low bandwidth digital video reception over DAB, based on video compression techniques. Profiling and design space exploration techniques are applied over the ASP MPEG-4 decoder in order to find out the best HW/SW partition given the application and platform constraints. An innovative SystemC-based system-level design tool, called CASSE, is being used for modelling, exploration and evaluation of different ASP MPEG-4 decoder HW/SW partitions. System-level trade offs and quantitative data derived from this analysis are also presented in this work.
System level design and power analysis of architectures for SATD calculus in the H.264/AVC
Conti Massimo, Francesco Coppari, Simone Orcioni, et al.
The new generation of video coding standards (H.264/MPEG Advanced Video Codec) addresses the requirements of a network-friendly and scalable video representation, while increasing the compression efficiency of the current technology by a factor of two. H.264 uses the SATD metric to calculate the prediction error. The SATD procedure may be called about 1 million times during the visualization of a 10-second 352x288 pixel video sequence. Therefore, the accurate design of dedicated hardware for the SATD is relevant to the performance of the complete codec. This paper presents four architectures described in SystemC for the VLSI implementation of the SATD calculation. The performance of the architectures in terms of signal-to-noise ratio and power dissipation has been evaluated using a new SystemC library developed by the authors for estimating power consumption from a SystemC description of the architecture. Comparisons have been performed for different numbers of bits of the internal representation for the four architectures. Four standard video sequences (Akiyo, Stefan, Mobile&calendar, Container) have been used to test the performance of the architectures.
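As a rough software reference for what an SATD unit computes, the sketch below takes the sum of absolute values of the Hadamard-transformed difference between a current block and its prediction. The 4x4 block size, the unnormalized result, and the function names are assumptions made for illustration; none of the four SystemC architectures from the paper is reproduced here.

```python
# 4x4 Hadamard matrix (symmetric, entries +1/-1).
H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def satd_4x4(current, predicted):
    """Sum of absolute transformed differences for one 4x4 block.

    No scaling is applied; some encoders divide the result by 2.
    """
    diff = [
        [current[i][j] - predicted[i][j] for j in range(4)] for i in range(4)
    ]
    transformed = matmul(matmul(H4, diff), H4)  # H4 equals its transpose
    return sum(abs(value) for row in transformed for value in row)


current_block = [[10, 12, 11, 9], [8, 10, 12, 11], [9, 9, 10, 10], [11, 10, 9, 8]]
print(satd_4x4(current_block, current_block))  # identical blocks give 0
```

Since the transform uses only additions and subtractions, the internal word length grows with each butterfly stage, which is why the number of bits of the internal representation is a natural design parameter.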
Data Converters
Design of a 12-bit 80MS/s pipeline analog-to-digital converter for PLC-VDSL applications
Jesus Ruiz-Amaya, Manuel Delgado-Restituto, Juan F. Fernandez-Bootello, et al.
This paper describes the design of a 12-bit 80MS/s pipeline Analog-to-Digital converter implemented in 0.13μm CMOS logic technology. The design has been aided by a toolbox developed for the simulation, synthesis and verification of Nyquist-Rate Analog-to-Digital and Digital-to-Analog Converters in MATLAB. The embedded simulator uses SIMULINK C-coded S-functions to model all required subcircuits including their main error mechanisms. This approach drastically reduces the simulation CPU time and makes the proposed tool an advantageous alternative for fast exploration of requirements and for design validation. The converter is based on a 10-stage pipeline preceded by a sample/hold with a bootstrapping technique. Each stage gives 1.5 effective bits, except for the first one, which provides 2.5 effective bits to improve linearity. The Analog-to-Digital architecture uses redundant bits for digital correction, is planned to be implemented without calibration, and employs a subranging pipeline look-ahead technique to increase speed. Substrate-biased MOSFETs in the depletion region are used as capacitors, linearized by a series compensation. Simulation results show that the Multi-Tone Power Ratio is higher than 56dB for several DMT test signals and the estimated Signal-to-Noise Ratio yield is expected to be better than 62 dB from DC to Nyquist frequency. The converter dissipates less than 150mW from a 3.3V supply and occupies less than 4 mm² die area. The results have been checked with all process corners from -40° to 85° and power supply from 3V to 3.6V.
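The role of redundant bits in digital correction can be illustrated with an idealized behavioural model of a pipeline built only from 1.5-bit stages. This is a simplified sketch under stated assumptions: all stages are identical, the 2.5-bit first stage and the look-ahead technique of the described converter are not modelled, and the names `stage_1p5` and `convert` are hypothetical.

```python
def stage_1p5(vin, vref=1.0, offset=0.0):
    """One ideal 1.5-bit stage: decision d in {-1, 0, 1} and amplified residue.

    `offset` shifts both comparator thresholds to mimic comparator error.
    """
    if vin > vref / 4 + offset:
        decision = 1
    elif vin < -vref / 4 + offset:
        decision = -1
    else:
        decision = 0
    residue = 2 * vin - decision * vref
    return decision, residue


def convert(vin, n_stages=10, vref=1.0, offsets=None):
    """Run the pipeline and combine decisions with overlap-and-add correction.

    Returns a signed integer code; vin is approximately code * vref / 2**n_stages.
    """
    offsets = offsets or [0.0] * n_stages
    code = 0
    for index in range(n_stages):
        decision, vin = stage_1p5(vin, vref, offsets[index])
        code = 2 * code + decision  # each stage overlaps the next by one bit
    return code


ideal = convert(0.3)
skewed = convert(0.3, offsets=[0.2, -0.2] * 5)
print(ideal / 2**10, skewed / 2**10)
```

In this model, comparator offsets smaller than a quarter of the reference keep every residue inside the input range of the next stage, so both runs reconstruct the input to within one step of the final code.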
A high linearity 14-bit pipelined charge summation ADC
Nigel Duignan, Ronan Farrell
Presented in this paper is a low-power, area-efficient pipeline analog-to-digital converter (ADC), utilising a charge summation technique and a switched-capacitor implementation. With the switched-capacitor implementation, the switching capacitors and a fixed reference voltage produce a staircase ramp, as opposed to a linear ramp. The advantage of the charge summation technique is the reduction in power usage: the charging time of the capacitors is small, so the circuit is quiescent for most of the sample period. The paper presents the use of this architecture as a 14-bit pipelined ADC, which can sample data at a rate of 1 MSps. The pipeline architecture itself is novel, as the typical sub-DAC is not required. The signal-to-noise ratio (SNR) of the ADC is improved by using a spatial over-sampling technique, which reduces the effect of thermal noise in the switched-capacitor circuit. The effects of finite op-amp gain and offset on the linearity of the ramp are reduced by employing a finite-gain and offset-compensated integrator architecture and through the use of low-resolution pipeline stages. The proposed architecture is a strong candidate for applications demanding high resolution with low power requirements.
A simple 3.8-mW 300-MHz 4-bit flash analog-to-digital converter
This paper presents a fully differential comparator that can be used in an N-bit flash A/D converter as the quantizer of a continuous-time sigma-delta modulator. The comparator is an extension of the dynamic comparator presented by Lewis and Gray, resulting in a 4-bit A/D converter. Its main advantages are a compact architecture based on MOS transistors only, without any passive components such as resistor ladders or switched capacitors, fully differential input and output voltages, and operation at very low voltage. Using this comparator, a 4-bit flash A/D converter has been designed in a 0.13μm CMOS technology, under a 1.2V supply voltage. It operates at 300Msample/s, suitable for oversampled data converters. Simulation shows a 3.8mW power consumption for the whole ADC.
Design of a 12-bit 80-MS/s CMOS digital-to-analog converter for PLC-VDSL applications
Jesus Ruiz-Amaya, Manuel Delgado-Restituto, J. Francisco Fernandez-Bootello, et al.
This paper describes the design of a 12-bit 80MS/s Digital-to-Analog converter implemented in 0.13μm CMOS logic technology. The design has been aided by a toolbox developed for the simulation and verification of Nyquist-Rate Analog-to-Digital and Digital-to-Analog converters in MATLAB. The embedded simulator uses SIMULINK C-coded S-functions to model all required subcircuits including their main error mechanisms. This approach drastically reduces the simulation CPU time and makes the proposed tool an advantageous alternative for fast exploration of requirements and for design validation. The converter is segmented into a unary current-cell matrix for the 8 MSBs and a binary-weighted array for the 4 LSBs. The current sources of the converter are laid out separately from the current-cell switching matrix core block and distributed in a double-centroid arrangement to reduce random errors and transient noise coupling. The linearity errors caused by remaining gradient errors are reduced by a modified Q2 Random-Walk switching sequence. Simulation results show that the Spurious-Free Dynamic Range is better than 58.5dB up to 80MS/s. The estimated Signal-to-Noise Distortion Ratio yield is 99.7% and it is expected to be better than 58dB from DC to Nyquist frequency. The Multi-Tone Power Ratio is higher than 59dB for several DMT test signals. The converter dissipates less than 129mW from a 3.3V supply and occupies less than 1.7mm² die area. The results have been checked with all process corners from -40° to 85° and power supply from 3V to 3.6V.
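The 8 + 4 segmentation described in this abstract can be sketched as a decoder that turns a 12-bit input code into 255 unary cell enables (each worth 16 LSBs) plus four binary-weighted enables. The sketch is purely functional: the Q2 Random-Walk switching order, mismatch, and layout effects are not modelled, and the function names are hypothetical.

```python
UNARY_CELLS = 255   # 2**8 - 1 equal cells driven by the 8 MSBs
UNARY_WEIGHT = 16   # each unary cell is worth 2**4 LSBs


def segment_code(code):
    """Split a 12-bit code into unary (thermometer) and binary enables."""
    if not 0 <= code < 4096:
        raise ValueError("code must fit in 12 bits")
    msb_part = code >> 4
    lsb_part = code & 0xF
    unary = [1 if index < msb_part else 0 for index in range(UNARY_CELLS)]
    binary = [(lsb_part >> bit) & 1 for bit in range(4)]  # index = bit weight
    return unary, binary


def output_in_lsb(unary, binary):
    """Ideal output, in LSB units, produced by the enabled current cells."""
    binary_sum = sum(enable << bit for bit, enable in enumerate(binary))
    return UNARY_WEIGHT * sum(unary) + binary_sum


unary_enables, binary_enables = segment_code(0x123)
print(sum(unary_enables), binary_enables, output_in_lsb(unary_enables, binary_enables))
```

Incrementing the MSB part only ever switches on one more unary cell, which is the usual reason for spending area on a unary matrix for the upper bits: monotonicity no longer depends on matching between large binary-weighted sources.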
10-bit 100-MS/s sample-and-hold amplifier adopting positive feedback technique
Gil-Su Kim, Jae-Tack Yoo, Hoon-Jae Ki, et al.
Since a switched-capacitor sample-and-hold amplifier (SHA) has a feedback loop outside its op-amp, an op-amp with a positive feedback technique (PFT) can be adopted to enhance the DC gain of the op-amp. This paper proposes a positive feedback amplifier with a dramatically increased DC gain of 127 dB and reports that an SHA adopting the proposed amplifier is stable under feedback operation. Measurement results demonstrated stable operation with a spurious-free dynamic range (SFDR) of 70 dB.
FPGAs
FPGA implementation of sparse matrix algorithm for information retrieval
Slobodan Bojanic, Ruzica Jevtic, Octavio Nieto-Taladriz
Text information retrieval requires a tremendous amount of processing time because of the size of the data and the complexity of information retrieval algorithms. In this paper a solution to this problem is proposed via hardware-supported information retrieval algorithms. Reconfigurable computing can accommodate frequent hardware modifications through its tailorable hardware and exploits parallelism for a given application through reconfigurable and flexible hardware units. The degree of parallelism can be tuned to the data. In this work we implemented a standard BLAS (basic linear algebra subprogram) sparse matrix algorithm named Compressed Sparse Row (CSR), which has been shown to be more efficient in terms of storage space requirement and query-processing time than the other sparse matrix algorithms for information retrieval applications. Although the inverted index algorithm has been treated as the de facto standard for information retrieval for years, an alternative approach that stores the index of a text collection in a sparse matrix structure is gaining more attention. This approach performs query processing using sparse matrix-vector multiplication and, due to parallelization, achieves a substantial efficiency gain over the sequential inverted index. The parallel implementations of the information retrieval kernel presented in this work target the Virtex II Field Programmable Gate Array (FPGA) board from Xilinx. A recent development in scientific applications is the use of FPGAs to achieve high-performance results. Computational results are compared to implementations on other platforms. The design achieves a high level of parallelism for the overall function while retaining highly optimised hardware within the processing unit.
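For readers unfamiliar with the Compressed Sparse Row layout, the following minimal software sketch builds the three CSR arrays and uses them for the sparse matrix-vector product that scores documents against a query. The tiny document-term matrix and the function names are illustrative assumptions; the FPGA datapath itself is not modelled.

```python
def to_csr(dense):
    """Convert a dense row-major matrix into CSR arrays.

    Returns (values, col_idx, row_ptr); row i occupies the slice
    values[row_ptr[i]:row_ptr[i + 1]].
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for column, entry in enumerate(row):
            if entry != 0:
                values.append(entry)
                col_idx.append(column)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, vector):
    """Multiply a CSR matrix by a dense vector."""
    result = []
    for row in range(len(row_ptr) - 1):
        total = 0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            total += values[k] * vector[col_idx[k]]
        result.append(total)
    return result


# Hypothetical index: rows are documents, columns are terms, entries are counts.
doc_term = [
    [2, 0, 1],
    [0, 0, 3],
    [0, 1, 0],
]
query = [1, 0, 1]  # the query contains terms 0 and 2
values, col_idx, row_ptr = to_csr(doc_term)
print(csr_matvec(values, col_idx, row_ptr, query))  # one score per document
```

Each row's dot product is independent of the others, so rows can be distributed across parallel processing units, which is the kind of parallelism a reconfigurable implementation can exploit.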
Integrated circuit debug through FPGA emulation: application to a PIC-18 macrocell
Mario Garcia-Valderas, Eduardo de la Torre-Arnanz, Fernando Casado-Ortiz, et al.
FPGA emulation has become a common way to check whether a digital circuit has been correctly designed. Although in recent years FPGA vendors have developed tools to embed logic analysers along with circuits in FPGAs, like Chipscope ILA from Xilinx, FPGA emulation still lacks more effective and versatile debug methods and tools. To check microprocessor system designs, several approaches have been used, including various combinations of logic simulators, instruction simulators, hardware emulators and in-circuit emulators. Nowadays, System-On-Chip design requires the implementation of microprocessor cores in FPGAs for prototyping. These cores do not usually include built-in debug features. In this paper, methods and tools for the development and operation of FPGA debug features are presented. Debug features are implemented in FPGAs through the insertion of JTAG-accessible debug modules into the target design. The debug modules that have already been designed offer features that range from simple event detection and signal monitoring to the most powerful and resource-consuming ones, like tracing, complex event and sequence detection, and microprocessor in-circuit emulation. The most important properties of the presented debug features are their high configurability, which allows them to be adjusted to the available logic resources, remote control of the debug logic, and expandability by means of user-customized debug blocks. Tools have been developed to automate the required tasks: debug logic selection and configuration, debug logic insertion and debug logic operation. The proposed methods and tools have been applied to a microprocessor system based on a PIC-18 macrocell and implemented in a Xilinx Spartan-3 FPGA.
ACE16k based stand-alone system for real-time pre-processing tasks
Luis Carranza, Francisco Jimenez-Garrido, Gustavo Linan-Cembrano, et al.
This paper describes the design of a programmable stand-alone system for real time vision pre-processing tasks. The system's architecture has been implemented and tested using an ACE16k chip and a Xilinx xc4028xl FPGA. The ACE16k chip consists basically of an array of 128x128 identical mixed-signal processing units, locally interacting, which operate in accordance with single instruction multiple data (SIMD) computing architectures and has been designed for high speed image pre-processing tasks requiring moderate accuracy levels (7 bits). The input images are acquired using the optical input capabilities of the ACE16k chip, and after being processed according to a programmed algorithm, the images are represented at real time on a TFT screen. The system is designed to store and run different algorithms and to allow changes and improvements. Its main board includes a digital core, implemented on a Xilinx 4028 Series FPGA, which comprises a custom programmable Control Unit, a digital monochrome PAL video generator and an image memory selector. Video SRAM chips are included to store and access images processed by the ACE16k. Two daughter boards hold the program SRAM and a video DAC-mixer card is used to generate composite analog video signal.
Self-similar module for FP/LNS arithmetic in high-performance FPGA systems
Lambert Spaanenburg, Stefan Mohl
The scientific community has gratefully embraced floating-point arithmetic to escape the close attention for accuracy and precision required in fixed-point computational styles. Though its deficiencies are well known, the role of the floating-point system as standard has kept other number representation systems from coming into practice. The paper discusses the relation between fixed and floating-point numbers from a pragmatic point of view that allows to mix both systems to optimize FPGA-based hardware accelerators. The method is developed for the Mitrion "processor on demand" technology, where a computationally intensive algorithm is transformed into a dedicated. The large gap in cycle time between fixed and floating-point operations and between direct and reverse operations makes the on-chip control for the fine-grain pipelines of parallel logic very complicated. Having alternative hardware realizations available can alleviate this. The paper uses a conjunctive notation, also known as DIGILOG, to introduce a flexible means in creating configurable arithmetic of arbitrary order using a single module type. This allows the Mitrion hardware compiler to match the hardware closer to the demands of the specific algorithm. Typical applications are in molecular simulation and real-time image analysis.
VLSI implementation of RSA encryption system using ancient Indian Vedic mathematics
Himanshu Thapliyal, M. B. Srinivas
This paper proposes the hardware implementation of RSA encryption/decryption algorithm using the algorithms of Ancient Indian Vedic Mathematics that have been modified to improve performance. The recently proposed hierarchical overlay multiplier architecture is used in the RSA circuitry for multiplication operation. The most significant aspect of the paper is the development of a division architecture based on Straight Division algorithm of Ancient Indian Vedic Mathematics and embedding it in RSA encryption/decryption circuitry for improved efficiency. The coding is done in Verilog HDL and the FPGA synthesis is done using Xilinx Spartan library. The results show that RSA circuitry implemented using Vedic division and multiplication is efficient in terms of area/speed compared to its implementation using conventional multiplication and division architectures.
Digital Design Methodologies and Tools II
CHIADO: compilation of high-level computationally intensive algorithms to dynamically reconfigurable computing systems
Reconfigurable computing has already confirmed a significant potential for accelerating certain computing tasks. However, the most successful applications have relied on user expertise to design a specific architecture implemented by the hardware structures of the reconfigurable computing device. Hence, one of the most challenging issues is to map computations (described in software programming languages) to reconfigurable computing devices efficiently and automatically. This paper presents CHIADO, a research project aiming at a compiler framework that maps software programs efficiently to reconfigurable computing platforms, especially those based on FPGA (Field-Programmable Gate Array) devices. The framework is also intended to support research on new optimization techniques. The project, based on our previous work on compiling Java bytecodes to FPGAs, focuses on high-performance solutions, schemes to estimate the impact of some transformations supported by the compiler (partial/full loop unrolling), and schemes to take advantage of dynamic reconfiguration (e.g., temporal partitioning). This paper gives an overview of the CHIADO project, shows the framework, and enumerates the main project goals.
A methodology for the characterization of arithmetic circuits on CMOS deep submicron technologies
Adrian Estrada, Carlos J. Jimenez, Manuel Valencia
Integration technologies have favored the design and implementation of more complex circuits. Thanks to this increased complexity, these circuits are capable of implementing algorithms which a few years ago were too expensive in both area and computational resources. However, they now offer interesting choices which should be considered. This new generation of integrated circuits nevertheless presents other kinds of restrictions that the designer should bear in mind. Parameters such as frequency of operation or power consumption are new restrictions that the designer has to deal with in order to fulfill the conditions established by the circuit functionality. Finally, the shrinking integration scale of current technologies makes the timing behavior of the design differ from previous technologies. Thus, a review of the timing behavior of the digital circuit should be done. So far, arithmetic circuits have been used as a benchmark for the analysis and design procedures of digital circuits. Therefore, it is our goal now to analyze both conventional and modern arithmetic circuits structures for different deep-submicron technologies. To achieve this goal, a good solution is to characterize a set of algorithmic circuits for several deep submicron processes, so that the designer can select the most suitable one depending upon the intended application and existing restrictions. In this paper, the first steps to attain such selection are presented. In particular, we propose a design and VHDL characterization methodology based on an RTL description of each component, on the utilization of an automated synthesis tool, and on the generation of logic characteristics from the logic level. This methodology is applied to a set of adders structures, the results of which are also presented.
An efficient structural technique for Boolean decomposition
Boolean decomposition techniques offer a powerful alternative to traditional algebraic methods when partitioning a circuit graph in the technology-independent stage of the circuit design flow. These techniques usually require transforming the circuit from a structural representation to a representation based on Binary Decision Diagrams (BDDs). It is well known that BDDs can grow exponentially in some cases, so the power of Boolean decomposition comes at the expense of an exponential increase in the size of the circuit representation. The following stages in the design flow may suffer severely from the space penalty imposed on each partitioned block. To cope with this space explosion, each block of the partitioned circuit has to be re-synthesized before further processing. The extra re-synthesis, on the other hand, may impose a prohibitive time/space penalty on the design flow. This paper proposes an inexpensive technique to avoid re-synthesizing the BDD blocks obtained after Boolean decomposition. This technique works by structurally partitioning the original circuit representation, according to information provided by the partitioned BDD blocks. After all the blocks have been recovered, the BDDs are not needed and can be discarded. The size of the resulting circuit will be proportional to the original circuit representation, and not to the intermediate BDD representation.
Technology mapping in library-free logic synthesis
Jingyue Xue, Dhamin Al-Khalili, Come N. Rozon
Library-free logic synthesis is an innovative approach that provides fully customized design performance while avoiding the huge cost of developing and maintaining extensive cell libraries. Its strength comes from the use of a virtual library based on on-the-fly cell generation. However, the flexibility of the virtual library makes it impossible to exploit the existing methodologies that are based on pre-characterized standard cell libraries. The authors developed a creative approach to map the design into customized CMOS complex gates using the virtual library technique. This is a timing-driven process, which consists of four phases: logic transformation, logic partitioning, gate mapping and transistor re-ordering. The performance of CMOS complex gates and the logic path derived from the extracted transistor topology are used to guide the synthesis process. The proposed mapping algorithm was used in combination with our topology-based performance estimation model to synthesize some of the MCNC91 benchmarks. The results show that our algorithm can achieve a 42% improvement in area and a 43% improvement in power compared to the same designs synthesized by Synopsys' Design Analyzer.
Computing a perfect input assignment for probabilistic verification
Maxim Teslenko, Elena Dubrova, Hannu Tenhunen
Design verification is the task of establishing that a given design meets the intended behavior. The growing complexity of verification instances requires new methods that can provide high-quality verification coverage for large, complex designs. Probabilistic verification complements existing simulation-based and formal verification techniques by providing a distinct trade-off between coverage and capacity. The probabilistic approach maps two Boolean functions onto hash values and compares these values for equivalence. The major drawback of probabilistic verification is the non-zero probability of collision between the hash values of two non-equivalent functions, producing "false positive" verification results. In this paper, we prove the existence of a perfect input assignment which never causes collisions. We show that the equivalence of hash values computed for a perfect input assignment implies the equivalence of functions with 100% probability.
Poster Session
A new geometrical approach to design centering of analog circuits
A new geometrical approach to design centering of analog circuits is presented. It is based on the use of symbolic analysis techniques, which permit to perform only one simulation of the circuit in order to determine a set of parameter values maximizing the manufacturing yield. The new approach can be an advantageous alternative to preexisting techniques of design centering, whose major limiting factor is constituted by the considerable computational effort required for simulating many times the circuit under consideration. A program implementing the new design centering approach has been realized. Its application to circuit examples, already presented by other authors, is reported in order to compare the results obtained through the developed procedure with those of preexisting techniques.
CMOS integrator based lock-in pixel for heterodyne interferometry
This article presents a prototype of a CMOS phase sensor for high-accuracy (1 Angstrom) heterodyne interferometry. A switched-integrator realization of a lock-in pixel for the 4-bucket phase detection algorithm is described and illustrated by experimental results. Factors that limit the accuracy of this implementation and possible ways to improve it are discussed.
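The 4-bucket algorithm named in this abstract recovers the phase of a sinusoidal signal from four samples taken a quarter period apart. The sketch below assumes the sample model I_k = A + B·cos(φ + k·π/2) for k = 0..3; the sign convention, the helper names, and the ideal noise-free samples are assumptions, and the pixel's integrator circuitry is not modelled.

```python
import math


def sample_buckets(phase, offset=1.0, amplitude=0.5):
    """Ideal bucket values I_k = offset + amplitude * cos(phase + k * pi / 2)."""
    return [offset + amplitude * math.cos(phase + k * math.pi / 2) for k in range(4)]


def four_bucket_phase(i0, i1, i2, i3):
    """Recover the phase from four quarter-period samples.

    I3 - I1 equals 2 * amplitude * sin(phase) and I0 - I2 equals
    2 * amplitude * cos(phase), so offset and amplitude cancel in the ratio.
    """
    return math.atan2(i3 - i1, i0 - i2)


buckets = sample_buckets(0.7)
print(four_bucket_phase(*buckets))
```

Because both differences remove the common offset and scale with the same amplitude, the estimate is insensitive to background light level and signal strength in this ideal model; in a real pixel, mismatch between the four integration paths would limit the accuracy.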
Single poly PMOS-based CMOS-compatible low voltage OTP
Paola Vega-Castillo, Wolfgang H. Krautschneider
A PMOS-based non-volatile memory cell fully compatible with standard CMOS fabrication processes is presented. It consists of a PMOS access transistor in series with a PMOS transistor whose gate is left floating. The cell configuration eliminates the requirement of a control gate, and therefore can be fabricated without using double poly gates. The cell saves area compared to other single poly non-volatile memory cells based on CMOS approaches, which require both NMOS and PMOS transistors. It also avoids the risk of latch-up. The cells were fabricated using a 350nm standard CMOS process. The programming mechanism of the cell is hot electron injection. The programming operation can be performed at programming voltages as low as |Vds|=4.5V. The cell can be used as a low voltage OTP and provides a very cheap alternative to integrate OTPs in CMOS ICs without any modification of the fabrication process.
Modeling of frequency agile devices: development of PKI neuromodeling library based on hierarchical network structure
P. Sanchez, J. Hinojosa, R. Ruiz
Recently, neuromodeling methods for microwave devices have been developed. These methods are suitable for generating models of novel devices, and they allow fast and accurate simulations and optimizations. However, developing libraries with these methods is a formidable task, since they require massive input-output data provided by an electromagnetic simulator or by measurements, as well as repeated artificial neural network (ANN) training. This paper presents a strategy that reduces the cost of library development while keeping the advantages of the neuromodeling methods: high accuracy, a large range of geometrical and material parameters, and reduced CPU time. The library models are developed from a set of base prior knowledge input (PKI) models, which take into account the characteristics common to all the models in the library, and high-level ANNs, which give the library model outputs from the base PKI models. This technique is illustrated for a microwave multiconductor tunable phase shifter using anisotropic substrates. Closed-form relationships have been developed and are presented in this paper. The results show good agreement with the expected ones.
A temperature control system for integrated resistive gas sensor arrays
Giuseppe Ferri, Nicola Guerrini, Vincenzo Stornelli
A temperature control system for integrated resistive gas sensor arrays is proposed. The circuit is part of a portable system for ambient gas monitoring formed by a sensor array, the IC front-end, the temperature control system with the heater, and the pattern recognition algorithm that processes the data acquired from the front-end. The sensors are arranged to detect particular gases, among them CO2, CH4, H2 and SO2, through a 4-channel read-out front-end that provides a digital output signal. The temperature control is simplified by the presence of a second resistance, matched with the sensor, that operates as a thermal sensor. In this manner it is possible to control the sensor temperature without interference. The problem of controlling the heater temperature is reduced to the control of a resistance. The current (more generally, the power) delivered to the heater resistance must be such that the temperature remains constant. This task is delegated to the second resistance, placed close to the heater, which remains at the same temperature. Typically, such controls are implemented by topologies that keep current sources or heater currents constant. In this work, the control circuit keeps the power delivered to the heater resistance constant.
CCII-based inductance simulators for mechanical oscillation control
Giuseppe Ferri, Nicola Guerrini
In this work we present active inductance simulators developed for the control of the mechanical oscillation of a metallic beam. It is possible to reduce the amplitude of these oscillations by extracting energy from the beam itself. The conversion from mechanical to electrical energy can be done through a piezo-electric sheet connected to the metallic beam. The equivalent circuit is a classical RLC resonating circuit. The required inductance value depends on the oscillating mode that we want to control and can reach hundreds and, in some cases, thousands of henries (H). A series resistance compensation can help in attenuating the beam vibrations. The solutions proposed in this work allow the implementation of simple circuits, with particular symmetries, which can also be suitable for integrated applications once an integrated CCII is designed. In the literature, circuit implementations realizing equivalent inductances are typically based on amplifiers (for example, Antoniou's circuit). Our solutions are based on second generation current conveyors (CCIIs) and provide both grounded and floating equivalent inductances of about 1000 H, working within a regulated frequency range of 3-4 decades.
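As a rough illustration of why such large inductances arise, the resonance condition of an ideal LC circuit, f = 1/(2π√(LC)), can be solved for L. This is only a sketch: the capacitance and mode frequency below are assumed values chosen for illustration, not figures taken from the paper, and the function name is hypothetical.

```python
import math


def required_inductance(freq_hz: float, capacitance_f: float) -> float:
    """Inductance (H) that resonates with capacitance_f (F) at freq_hz (Hz).

    Uses the ideal LC resonance condition f = 1 / (2*pi*sqrt(L*C)),
    ignoring any series resistance.
    """
    if freq_hz <= 0 or capacitance_f <= 0:
        raise ValueError("frequency and capacitance must be positive")
    return 1.0 / ((2.0 * math.pi * freq_hz) ** 2 * capacitance_f)


# Assumed example values (not from the paper): a 100 nF equivalent
# capacitance and a 10 Hz mechanical mode.
inductance = required_inductance(10.0, 100e-9)
print(f"required inductance: {inductance:.0f} H")
```

With these assumed values the result is on the order of a few thousand henries, which is impractical for a passive coil and is consistent with the motivation for simulating the inductance with active circuits.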
Novel low-voltage fully differential buffer
Giuseppe Ferri, Nicola Guerrini, Manolo Sperini
In this paper we present a new integrated CMOS fully-differential buffer. The proposed solution avoids the need for an external common-mode feedback (CMFB) circuit and shows rail-to-rail characteristics, so that it is particularly useful for low-voltage (±0.75 V) applications. Simulation results confirm the theoretical expectations, showing in particular an excellent internal common mode control.
Application of clock gating techniques at a flip-flop level to switching noise reduction in VLSI circuits
Pilar Parra, Javier Castro, Manuel Valencia, et al.
One of the most important sources of switching noise in large VLSI circuits is the clock-driven circuitry, meaning that memory elements are the main source of noise in digital circuits. This paper addresses the application of clock-gating, a well known low-power technique, to the reduction of switching-noise generation. Sources of switching noise in master-slave flip-flops are analyzed. It is shown how different solutions for the clock-gated logic give very different results regarding switching-noise generation. Illustrative examples characterized through HSPICE simulations, as well as the application of clock-gating to a 16-bit synchronous counter as a demonstrator, provide useful design guidelines for the reduction of switching noise generation.
DC modeling of PN integrated cross varactors
Benito Gonzalez, Jose Antonio Perez, Sunil L. Kemchandani, et al.
In this paper, models for the capacitance of cross integrated varactors based on the PN junction are presented. Three different approximations are assumed in order to reproduce the measured capacitance. The error relative to the measured capacitance is under 10% in all cases.
Sectored receiver model for calculation of the impulse response on IR wireless indoor channels using Monte Carlo based ray-tracing algorithm
The indoor optical channel simulation can significantly benefit the design of high performance infrared (IR) systems, but requires algorithms and models that accurately fit the channel characteristics. One of the limitations of IR links is the intersymbol interference caused by multipath dispersion. For fixed emitter and receiver locations, multipath dispersion is completely characterized by the channel impulse response. Therefore, an algorithm and a propagation model that allow us to determine the impulse response for different IR links are necessary. The use of angle-diversity receivers makes it possible to reduce the impact of ambient light noise, path loss and multipath distortion, in part by exploiting the fact that unwanted signals are often received from different directions than the desired signal. Basically, there are three ways to obtain angle-diversity detection: using conventional, imaging or sectored receivers. In contrast to previous works, we present a model for sectored receivers, that is, a set of photodiodes placed in hemispheric form, upon which a Monte Carlo based ray-tracing algorithm allows us to obtain the impulse response and to study those optical links that are characterized by the use of sectored receivers. Using the obtained results, it is possible to establish those parameters of the sectored receiver structure that offer the best performance with respect to the IR channel features: the path loss and the rms delay spread.
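The two channel features named above can be computed from a sampled impulse response. The sketch below uses one common convention for IR channels, in which the rms delay spread is weighted by h²(t) and the DC gain is the integral of h(t); this convention, the uniform sampling, and the function names are assumptions, not details taken from the paper.

```python
import math
from typing import Sequence


def rms_delay_spread(h: Sequence[float], dt: float) -> float:
    """RMS delay spread (s) of impulse response samples h spaced dt apart.

    Assumed definition: delays are weighted by h(t)**2.
    """
    energy = sum(v * v for v in h)
    if energy == 0:
        raise ValueError("impulse response has no energy")
    mean_delay = sum(i * dt * v * v for i, v in enumerate(h)) / energy
    variance = sum((i * dt - mean_delay) ** 2 * v * v
                   for i, v in enumerate(h)) / energy
    return math.sqrt(variance)


def dc_gain(h: Sequence[float], dt: float) -> float:
    """Channel DC gain H(0), approximated as the sum of h(t) * dt."""
    return sum(h) * dt


def optical_path_loss_db(h: Sequence[float], dt: float) -> float:
    """Optical path loss in dB, taken here as -10*log10(H(0))."""
    return -10.0 * math.log10(dc_gain(h, dt))


# Toy response: two equal paths separated by two sample periods.
toy = [1.0, 0.0, 1.0]
print(rms_delay_spread(toy, dt=1e-9))
```

In a Monte Carlo ray-tracing setting, `h` would be the histogram of received power contributions binned by arrival time for one receiver sector.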
Study of the proximity effect in high-Q inductors for wireless LAN (WLAN)
I. Cendoya, J. Mendizabal, N. Sainz, et al.
The quality factor (Q) measures the ability of a component to preserve the energy received during circuit operation. Q is the most important parameter of an inductor. It is mainly limited by the losses due to the inductor metal resistance, the substrate resistance, and the resistance associated with eddy currents induced below the inductor metal trace. One of the effects that degrade the Q of an inductor is the proximity effect. The proximity effect is caused by the magnetic field generated by the inductor itself, which induces parasitic currents in the metal tracks, increasing the resistance and thus reducing the Q. The objective of this paper is to study this effect and to derive inductor design rules that allow the designer to implement high quality inductors. This paper is focused on balanced inductors for a WLAN application using CMOS 0.18 μm technology.
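The dependence of Q on resistance can be illustrated with the simplest series model of an inductor, Q = ωL/R. This is a sketch only: it ignores substrate losses and self-resonance, and the inductance, frequency and resistance values are assumed for illustration rather than taken from the paper.

```python
import math


def inductor_q(freq_hz: float, inductance_h: float,
               series_resistance_ohm: float) -> float:
    """Quality factor of a series R-L model: Q = 2*pi*f*L / R."""
    if series_resistance_ohm <= 0:
        raise ValueError("series resistance must be positive")
    return 2.0 * math.pi * freq_hz * inductance_h / series_resistance_ohm


# Assumed values: a 2 nH inductor at 2.4 GHz with 3 ohm of metal
# resistance, then the same inductor with 1 ohm of extra resistance
# standing in for the proximity effect.
q_base = inductor_q(2.4e9, 2e-9, 3.0)
q_prox = inductor_q(2.4e9, 2e-9, 3.0 + 1.0)
print(f"Q without extra loss: {q_base:.2f}, with extra loss: {q_prox:.2f}")
```

Because Q is inversely proportional to the series resistance in this model, any layout choice that reduces the induced parasitic currents translates directly into a higher Q.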
Prototype board for the test of self-timed circuits developed in FPGAs
M. Sanchez Raya, R. Jimenez Naharro, J. Castro Ramirez
A large number of commercial FPGA prototype boards exist today. However, these boards are mainly oriented to functional verification of synchronous designs, so it would be useful to include test modules dedicated to the characterization of digital circuits. These circuits need not be limited to synchronous designs; asynchronous circuits are also considered because of their potential advantages. The parameters to be characterized include latency, throughput, power consumption and noise. One aim of this characterization is the comparison between synchronous and asynchronous implementations. On most existing boards, measuring certain figures of merit of a design is hindered by the impossibility of varying the clock frequency, or by the absence of measuring points for power consumption or high-speed signals. In this work, an FPGA-based prototype board is proposed. The main contribution of the board's design is the possibility of extracting dynamic parameters of a design running on the FPGA. Another main novelty is the inclusion of an autonomous test system that permits functional verification and characterization of implemented designs. As an application, a test bench has been developed in order to compare and validate several arithmetic circuits, including synchronous and asynchronous implementations.
An approach to a VHDL-AMS library for RF component models
In this paper we present a contribution towards creating a VHDL-AMS radio-frequency component library. Currently, integrated circuit technology tends to integrate on a single chip not only mixed-signal but also mixed-technology systems, moving towards a more general definition of so-called systems on chip. A library of RF models would be useful to model, in a single framework, circuits and systems of different physical domains, including RF, which will certainly optimise the design process of such systems. Generally, VHDL-AMS, the analogue and mixed-signal extension to IEEE standard VHDL, does not include specific formulations for modeling RF devices or systems, as it does not support distributed parameters for simulation or description purposes. However, RF devices can be modeled by means of more general VHDL-AMS resources, such as statements involving algebraic and trigonometric relations.
Multiphase clock generators with controlled clock impulse width for programmable high order rotator SC FIR filters realized in 0.35 µm CMOS technology
The complexity of the clock generator is one of the most important parameters in the design and optimization of switched-capacitor (SC) finite impulse response (FIR) filters. There are different SC FIR filter architectures. Some of them need a simple clock generator, while others require a rather complicated multiphase clock system. In the latter case an external clock system (i.e., outside the integrated circuit) is unrealistic because of the large number of external pins required. We have implemented various SC FIR filter architectures together with complex internal clock generators in the CMOS 0.8 μm and 0.35 μm technologies. One of the most important problems in the design process was the optimization of the waveforms and widths of the clock impulses. SC FIR filters are very sensitive to the parameters of clock systems. Thus the clock generators must be designed very precisely. We demonstrate results of the design of the 64-phase clock generator for a programmable rotator SC FIR filter. In our approach the width of the clock impulses is controlled by two external signals. This is a very convenient solution, because optimization of the clock impulses, which was difficult in other approaches, is now much easier. The internal clock generator area is ca. 0.15 mm² in the CMOS 0.35 μm technology, i.e., only 7% of the entire SC FIR filter chip area.
A fully integrated low-noise amplifier in SiGe 0.35 µm technology for 802.11a WIFI applications
In recent years, the WIFI market has shown incredible growth, exceeding expectations. This paper presents the design of two fully integrated LNAs using a low cost SiGe 0.35 µm technology for the 5 GHz band, according to the IEEE 802.11a WIFI standard. One LNA has an asymmetric configuration and the other a balanced configuration. A comparison between the two LNAs has been made. All passive devices are on chip, including integrated inductors, which have been designed by electromagnetic simulations. This work demonstrates the feasibility of a low cost silicon technology for the design of 5 GHz band circuits.
Search strategy for relevant parasitic elements and reduction of their influence on the operation of SC FIR filters realized in CMOS technology
Parasitic capacitances pose a serious problem in switched capacitor finite impulse response (SC FIR) filters realized as VLSI systems in CMOS submicron technologies. The influence of these parasitic elements is especially visible in the stopband of the filter frequency response. Designing mixed digital-analog SC FIR filters is a difficult task. Filters of this class have to be designed using a full-custom method. SC FIR filters of high orders N are very complex systems with thousands of transistors and capacitors, which in turn form the basis for many active elements, switches, delay elements, memories and other circuitry. One of the most important stages of the design process is post-layout HSPICE verification. However, the simulation of separate blocks does not give enough knowledge of the operation of the whole system. Optimization requires netlist simulations of the entire system in the presence of typically 5000 to 30000 parasitic capacitances, of which only about a hundred are critical. Direct analysis aimed at finding these elements is not possible in practice because of the complexity of the entire system. The heuristic method of searching for relevant parasitic elements presented in this paper is based on the assumption that all parasitic elements form a set. The main task is to divide this set into subsets. To do this, particular groups of nets in the layout must be labeled using unique names. Then particular groups of parasitic elements are filtered out from the netlist. Each filtering stage generates two netlists with separate groups of parasitic elements. Once the simulation results have been analyzed, a decision is made about subsequent filtering operations. The iterative method is very quick, convenient and efficient, and does not require deep knowledge of the simulated system. Many stages of this method can be easily implemented with CAD tools.
In completed projects, the critical parasitic capacitances were found after no more than 15-60 iterations. In the realization of four chips in CMOS 0.8 µm and 0.35 µm technologies, this method yielded very good results: the attenuation in the stopband, which is a very important parameter, was improved by about 20-25 dB.
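The iterative filtering idea can be sketched as a recursive group search: split the parasitic elements into two groups, evaluate each group, and keep subdividing only the groups that visibly degrade the response. This is a minimal illustration, not the authors' procedure. The function `toy_degradation` is a hypothetical stand-in for a post-layout simulation of a filtered netlist, and the additive loss model and all numeric values are assumptions.

```python
from typing import Callable, List, Sequence


def find_critical(parasitics: Sequence[str],
                  degradation: Callable[[Sequence[str]], float],
                  threshold_db: float) -> List[str]:
    """Return the parasitics that sit in groups degrading the stopband
    by more than threshold_db, narrowing the groups by repeated halving.

    degradation(subset) stands in for simulating a netlist that keeps
    only the parasitic elements listed in subset.
    """
    if not parasitics or degradation(parasitics) <= threshold_db:
        return []  # this whole group is harmless, discard it at once
    if len(parasitics) == 1:
        return list(parasitics)
    mid = len(parasitics) // 2
    return (find_critical(parasitics[:mid], degradation, threshold_db)
            + find_critical(parasitics[mid:], degradation, threshold_db))


# Hypothetical toy model: 200 labeled parasitics, two of them critical.
ALL_PARASITICS = [f"c{i}" for i in range(200)]
CRITICAL = {"c17", "c120"}


def toy_degradation(subset: Sequence[str]) -> float:
    """Assumed additive loss in dB: 5 dB per critical element,
    0.0001 dB for every other element."""
    return sum(5.0 if name in CRITICAL else 0.0001 for name in subset)


found = find_critical(ALL_PARASITICS, toy_degradation, threshold_db=1.0)
print(found)
```

The point of grouping is that a harmless group is eliminated with a single evaluation, so the number of simulations grows roughly with the number of critical elements times the logarithm of the set size rather than with the total number of parasitics. Real parasitics may interact non-additively, in which case the grouping and threshold would need more care.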
A deterministic BIST scheme for test time reduction in VLSI circuits
A Built-In Self-Test scheme for VLSI scan-based digital circuits, capable of considerably reducing the number of test cycles, is presented. The core circuit structure consists of a modification of the original scan-based circuit requiring no extra I/O pin. Only a moderate area increment is used to accommodate the extra test circuitry. The structure does not use scan-out, but scan-in exclusively, which implies that the complete circuit responses are observed through the circuit primary-outputs. Based on this structure, a deterministic ROM-based Built-In Self-Test scheme has been developed. In this scheme, the circuit responses are compressed in a Multiple-Input Signature Register. Deterministic test patterns are stored in two ROMs. The first stores the sub-patterns to be serially loaded into the scan chain, while the second stores the sub-patterns to be applied in parallel to the circuit primary inputs. All the control bits for clocks and for selecting the loading of a new sub-pattern into the scan chain are also included in this last ROM. Thus, the clocks and the select-mode input are the only external inputs to the scheme. The comparison of the proposed scheme with a similar one, based on the classical full single-serial scan-path, for a set of benchmark circuits, shows a 19% reduction in ROM-bits, while a reduction of over 45% in the test time is obtained.
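The response compression step can be illustrated with a small behavioral model of a Multiple-Input Signature Register. The sketch below is an assumption-laden example, not the circuit from the paper: the 4-bit width, the feedback polynomial x^4 + x + 1 and the response words are chosen only for illustration.

```python
from typing import Iterable


def misr_signature(responses: Iterable[int], width: int = 4,
                   poly: int = 0b0011, seed: int = 0) -> int:
    """Compress a sequence of parallel response words into one signature.

    Each clock cycle the register shifts left by one bit, applies the
    feedback taps in poly when the bit shifted out is 1, and XORs in
    the next response word.
    """
    mask = (1 << width) - 1
    state = seed & mask
    for word in responses:
        msb = (state >> (width - 1)) & 1
        state = (state << 1) & mask
        if msb:
            state ^= poly  # feedback for the assumed polynomial
        state ^= word & mask
    return state


# Assumed fault-free responses, and the same sequence with one bit flipped.
good = misr_signature([0b1010, 0b0111, 0b1100])
faulty = misr_signature([0b1010, 0b0110, 0b1100])
print(f"good={good:04b} faulty={faulty:04b} detected={good != faulty}")
```

Only the final signature has to be compared with a stored reference, which is what makes observation through the primary outputs compact. Different response sequences can occasionally map to the same signature (aliasing), and the probability of this falls as the register width grows.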
Low-cost printed antenna design in the band of 2.4 GHz
Pere Marti, Moises Serra, Jordi Carrabina
In this article we present a study and the corresponding implementation of low cost printed antennas in the 2.4 GHz band. These antennas work with low cost transceivers that provide an RF input/output signal in the band of interest. The work is part of a project in the field of sensor networks using technology such as Zigbee or even simpler and cheaper systems. We have focused our attention on parameters such as the antenna impedance, which is very important for achieving maximum power transfer with the transceiver while avoiding impedance-matching circuits. We are also interested in avoiding baluns. We have analyzed these printed antennas in terms of their efficiency and radiation pattern using electromagnetic simulation software. Two structures have been evaluated and compared. The first is a structure derived from a monopole and slot antenna and the second is a printed patch antenna.
Rapid prototyping with the visual data environment of an OFDM WLAN system
Moises Serra, Pere Marti, Jordi Carrabina
In this paper a rapid prototyping design flow is presented and applied to a prototype of the base-band physical layer of a Hiperlan/2 WLAN transceiver. This physical layer is a high performance multi-rate system that contains computationally intensive algorithms. A new method for prototyping the design flow and verifying the process is to use the latest generation of system level design environments (visual data flow environments) for DSPs. The System Generator and Matlab/Simulink tools form a visual data flow environment for FPGAs that allows us to model DSP systems and explore algorithms. This environment also translates designs into hardware implementations that are faithful, synthesizable and efficient, and that can be explored and refined on rapid prototyping platforms.
Evaluation of a MEMS based theft detection circuit for RFID labels
Damith C. Ranasinghe, Peter H. Cole
With the proliferation of RFID technology, anti-theft labels continue to evolve. In the functional hierarchy of RFID labels, the battery-powered labels are a set of higher class labels referred to as active labels. Often these labels are employed for the tagging of expensive goods, with the aim of both tracking the item and preventing its theft. The battery powering such active labels must have very low internal and external current drain in order to prolong the life of the battery while remaining able to signal a theft of the labelled item. However, due to circuit complexity or the desired operating range, the electronics may drain the battery more rapidly than desired and the label may not last the shelf life of the product. The theft detection mechanism presented in this paper conserves power and thus prolongs the battery life of an active anti-theft label. A solution available for the development of such a theft detection circuit uses electroacoustic energy conversion in a MEMS device on a label IC to provide a high sensitivity result. This paper presents the results of an analysis conducted to evaluate the performance and the capabilities of such a theft detection circuit.
Circuit Design for RF Applications
Turn-on circuits based on standard CMOS technology for active RFID labels
David Hall, Damith C. Ranasinghe, Behnam Jamali, et al.
The evolution of RFID systems has led to the development of a class hierarchy in which the battery powered labels are a set of higher class labels referred to as active labels. The battery powering active transponders must last for an acceptable time, so the electronics of the label must have very low current consumption in order to prolong the life of the battery. However, due to circuit complexity or the desired operating range, the electronics may drain the battery more rapidly than desired. The use of a turn-on circuit allows the battery to be connected only when communication is needed, thus lengthening the life of the battery. Two solutions available for the development of a turn-on circuit use resonance in a label rectification circuit to provide a high sensitivity result. This paper presents the results of experiments conducted to evaluate resonance in a label rectification circuit and the designs of fully integrable turn-on circuits. We also present test results showing a successful practical implementation of one of the turn-on circuit designs.