A Fast CRC Implementation on FPGA Using a Pipelined Architecture for the Polynomial Division

Fabrice MONTEIRO, Abbas DANDACHE, Amine M'SIR, Bernard LEPLEY
LICM, University of Metz, SUPELEC, Rue Edouard Belin, 57078 Metz Cedex
phone: +33(0)387547311, fax: +33(0)387547301, email: fabrice. [email protected] org

ABSTRACT
CRC error detection is a very common function in telecommunication applications. The evolution towards increasing data rates requires more and more sophisticated implementations. In this paper, we present a method to implement the CRC function based on a pipeline structure for the polynomial division. It improves the speed performance very effectively, allowing data rates from 1 Gbits/s to 4 Gbits/s on FPGA implementations, according to the parallelisation level (8 to 32 bits).
1 INTRODUCTION
The CRC (Cyclic Redundancy Checking) codes are used in a lot of telecommunication applications. They are used in the internal layers of protocols such as Ethernet, X25, FDDI and ATM (AAL5). However, on modern networks, the demand for increasing data rates (over 1 Gbit/s) is setting the constraints on performance very high. Indeed, the speed increase (higher clock rates) due to the technological evolution is unable to meet the demand. Consequently, new architectures must be devised. Targeting the applications to an FPGA device is an issue for this paper, as it allows low-cost designs. The simple and obvious serial implementation is a pure hardware implementation of the CRC algorithm.
Unfortunately, on an FPGA implementation with a maximal clock frequency of 250 MHz, the maximal data rate is limited to 250 Mbits/s in the best case. Higher data rates can only be obtained through parallelisation. Some parallel architectures have been proposed in the past to address the demand for high data throughput [1][2]. The main problem is usually to limit the fast increasing area overhead while improving the speed performance. In this paper, we present a parallel approach for the polynomial division based on a pipeline structure. The parallelisation can be led to any level and is only limited by the area constraint set on the design. The data throughput is almost directly linked to the parallelisation level, as the maximal clock rate is not very sensitive to it.
2 PRINCIPLE
The polynomial division is the basic operation of the CRC applications. The serial implementation of the division is shown in figure 1 for the case where the polynomial divisor is G(X) = G0 + G1.X + G2.X^2 + G3.X^3 = 1 + X + X^3. As stated previously, the data throughput of this serial implementation is quite low. Very high data rates can only be achieved with high clock frequencies, which in turn can only be obtained using rather expensive technological solutions.
Parallelisation of data processing is the main solution to improve the speed performance of a circuit (or system) if the clock rate must remain low. Pipelining may be used as an efficient parallelisation method when a repetitive process must be applied on large volumes of data. Previous works have addressed the parallelisation problem in highly demanding computational applications, mainly in arithmetic (e.g. [3][4]) and error control coding circuits (e.g. [1][2][6]). In the serial architecture (figure 1), a new data bit is injected on each clock cycle. The previous cumulated remainder is simultaneously multiplied by X and divided by G(x) (where G(x) is the polynomial divisor).
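The serial scheme described above can be sketched in software as follows. This is only an illustrative model, not the authors' hardware description: the function name `serial_remainder`, the coefficient ordering (G0 first) and the assumption that the data stream arrives highest-degree bit first, with each bit added to the X^0 coefficient of the register, are choices made for this sketch.

```python
def serial_remainder(bits, g=(1, 1, 0, 1)):
    """Remainder of a bit stream divided by G(X), one bit per clock cycle.

    bits: data coefficients, highest degree first (assumption of this sketch).
    g:    divisor coefficients G0..Gn, lowest degree first;
          the default is G(X) = 1 + X + X^3.
    Returns [R0, R1, ..., R(n-1)], R0 being the X^0 coefficient.
    """
    n = len(g) - 1              # degree of G(X) = number of flip-flops
    r = [0] * n                 # cumulated remainder, initially zero
    for d in bits:              # each iteration models one clock cycle
        fb = r[n - 1]           # coefficient that would overflow into X^n
        # Multiply by X (shift up), replace X^n by G0 + G1.X + ... (division),
        # and inject the new data bit on the X^0 coefficient.
        r = [(fb & g[0]) ^ d] + [r[j - 1] ^ (fb & g[j]) for j in range(1, n)]
    return r


# X^3 divided by 1 + X + X^3 leaves the remainder 1 + X.
print(serial_remainder([1, 0, 0, 0]))
```

Since one bit is consumed per iteration, the model makes the throughput limit of the serial architecture visible: the data rate can never exceed the clock rate.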
On P successive clock cycles, P bits are injected and P successive multiplications and divisions are done. The following formula (related to the example of figure 1) describes the operation done on one clock cycle.

\[
[R_0, R_1, R_2]_{t+1} = [R_0, R_1, R_2]_t \cdot T + [D, 0, 0]_t,
\qquad
T = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ G_0 & G_1 & G_2 \end{bmatrix}
  = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}
\]

Figure 1: Serial polynomial division for G(X) = 1 + X + X^3

0-7803-7057-0/01/$10.00 ©2001 IEEE

A pipeline structure can be devised by the implementation of P successive multiplications and divisions. However, to keep the clock rate high, the P operations should not be done in a single combinatorial block. Thus, the stages of the P-multiplying/dividing block must be separated by registers. This is the basic idea of the pipeline structure. Each of the P parallel bits of an input must be injected in its respective pipeline stage. Therefore, the bits must be injected on different clock cycles. This may be done if the bits are delayed in a shift-register structure (cf. the shift register path between [din_0, ..., din_{P-1}] and [dout_0, ..., dout_{P-1}] in figure 2, with P = 8 in this example and G(X) = 1 + X + X^3). The operation done when passing from stage k+1 to stage k of the pipeline (k > 0) is illustrated in the following formula, where G(X) = 1 + X + X^3 as in figure 2.

[...] with R_{i,j} = 0 when i + j > P - 1.

The P bits of an input are processed in P clock cycles. At each clock cycle, the result of the processing of P bits is available at the output of the pipeline structure. This result (the remainder of the P bits divided by G(x)) must be cumulated in the [R_{0,0}, R_{0,1}, R_{0,2}] register using a recursive approach, similar to the scheme of the serial architecture of figure 1. The cumulated remainder at time t must be multiplied by X^P and then divided by G(x). Then, the new partial remainder coming out of the pipeline structure can be cumulated. This process is described in the following formula, where the matrix M accounts for the multiplication by X^P followed by the division by G(x).

\[
[R_{0,0}, R_{0,1}, R_{0,2}]_{t+1} = [R_{0,0}, R_{0,1}, R_{0,2}]_t \cdot M + [R_{1,0}, R_{1,1}, R_{1,2}]_t \cdot T + [D_{0,P-1}, 0, 0]_t
\]

3 RESULTS
This architecture has been implemented on FPGA devices of the FLEX10KE ALTERA family. These devices have their maximal clock frequency limited to 250 MHz. The architecture was tested on the generating polynomials of table 1.

The results in table 2 were obtained on FPGA devices of the FLEX10KE ALTERA family. The architecture tested in these examples implements a fully functional CRC checker. The synchronisation signals to write and read data, respectively on input and output, are fully implemented. The architecture was implemented using Synplify 5.3 and MaxPlusII 10.0. It was tested for 3 different levels of parallelism on 6 different standard divisor polynomials.

It can be noticed that G17(x) is used on Ethernet, FDDI and AAL5-ATM, while G14(x) is the standard polynomial for the X25 protocol. The clock rates must be compared to the highest frequency (250 MHz) that can be achieved on FLEX10KE devices. The "lc" indication means "logic cells" and is an indication of the area consumption.

The results must be compared to those obtained in [5]. A data rate of 160 Mbits/s was obtained on an ALTERA FLEX10K device (max. clock rate of 125 MHz), on a 32-bit parallel CRC runtime-configurable implementation of the decoder, based on the use of parallel combinatorial blocks for the polynomial division as presented in [1]. The gain obtained on the 32-bit parallel architecture is between 16 and 30 times, that is, 8 to 15 times using the same technology (cf. table 2).

For any combination of the design parameters, the latency is always equal to P clock cycles, where P denotes the parallelisation level. It can be noticed that, for a given maximal polynomial divisor degree, the area consumption (number of logic cells) is almost proportional to the parallelisation level of the architecture. Furthermore, the results show that a large increase of the parallelisation level can be achieved with a moderate decrease of the maximal clock frequency.
The critical path is due to the M matrix. The complexity of this matrix depends on the chosen polynomial (number and position of the non-zero terms in the polynomial). It also depends on the parallelisation level, but not linearly. Actually, a higher parallelisation level can lead to a less complex matrix.
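The non-linear dependence of M on the parallelisation level can be explored with a small software sketch. It rests on assumptions that are not spelled out in the text: M is taken to be the P-th power over GF(2) of the one-step matrix T of the serial divider (multiplication by X^P followed by division by G(x)), row vectors [R0, R1, ...] are used, and the number of ones in M is used as a rough stand-in for the logic complexity. All function names are hypothetical, and the inner loop of `block_remainder` only stands in for the result delivered by the P pipeline stages; it does not model their timing.

```python
def companion(g):
    """One-step matrix T for row vectors [R0..R(n-1)]; g = (G0, ..., Gn)."""
    n = len(g) - 1
    t = [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n - 1)]
    t.append(list(g[:n]))        # last row: X^n mod G(X) = G0 + G1.X + ...
    return t


def mat_mul(a, b):
    """Matrix product over GF(2)."""
    return [[sum(a[i][k] & b[k][j] for k in range(len(b))) & 1
             for j in range(len(b[0]))] for i in range(len(a))]


def vec_mat(v, m):
    """Row vector times matrix over GF(2)."""
    return [sum(v[i] & m[i][j] for i in range(len(v))) & 1
            for j in range(len(m[0]))]


def power_matrix(g, p):
    """Assumed form of M: T raised to the power p over GF(2)."""
    t = companion(g)
    m = [[int(i == j) for j in range(len(t))] for i in range(len(t))]
    for _ in range(p):
        m = mat_mul(m, t)
    return m


def ones(m):
    """Number of non-zero entries, a crude complexity indicator."""
    return sum(map(sum, m))


def block_remainder(bits, g, p):
    """Remainder computed p bits per step: R <- R.M + (remainder of the block).

    len(bits) is assumed to be a multiple of p.
    """
    t, m = companion(g), power_matrix(g, p)
    r = [0] * (len(g) - 1)
    for k in range(0, len(bits), p):
        part = [0] * len(r)
        for d in bits[k:k + p]:  # stands in for the p pipeline stages
            part = vec_mat(part, t)
            part[0] ^= d
        r = [a ^ b for a, b in zip(vec_mat(r, m), part)]
    return r


g = (1, 1, 0, 1)                 # G(X) = 1 + X + X^3
for p in range(1, 9):
    print(p, ones(power_matrix(g, p)))
```

For this small divisor, the count of ones rises and then falls again as P grows, and M reduces to the identity for P = 7, which is one concrete instance of a higher parallelisation level giving a less complex matrix. The same block-wise accumulation gives the same remainder as the bit-serial scheme, for example `block_remainder(bits, g, 4) == block_remainder(bits, g, 1)` for any bit list whose length is a multiple of 4.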