TY - GEN
T1 - Modified Montgomery modular multiplication using 4:2 Compressor and CSA adder
AU - Thapliyal, Himanshu
AU - Ramasahayam, Anvesh
AU - Kotha, Vivek Reddy
AU - Gottimukkula, Kunul
AU - Srinivas, M. B.
PY - 2006
Y1 - 2006
N2 - The efficiency of public-key encryption systems such as RSA and ECC can be improved by adopting a faster multiplication scheme. In this paper, modified Montgomery multiplication schemes and circuit architectures are presented. The first modified Montgomery multiplier uses 4:2 compressors and carry-save adders (CSA) to perform large-word-length additions. The total delay for a single modular multiplication using the proposed approach is 7 XOR + 1 AND gate delays, compared to 8 XOR + 1 AND gate delays for the recently proposed fastest algorithm. The second modified Montgomery multiplier uses a novel hardware unit that outputs the carry-save representation of four input operands in 3 XOR delays. The total delay for a single modular multiplication using the novel hardware unit is 5 XOR + 1 AND gate delays, compared to 6 XOR + 1 AND gate delays for the recently proposed algorithm. Optimized transistor-level implementations of the proposed approaches are also presented, targeting area, speed, and low power. The proposed Montgomery multiplication circuit is especially beneficial at larger word lengths such as 1024 and 2048 bits, where it reduces the propagation delay by 1024 and 2048 XOR gate delays, respectively, compared to the recently proposed fastest algorithm.
AB - The efficiency of public-key encryption systems such as RSA and ECC can be improved by adopting a faster multiplication scheme. In this paper, modified Montgomery multiplication schemes and circuit architectures are presented. The first modified Montgomery multiplier uses 4:2 compressors and carry-save adders (CSA) to perform large-word-length additions. The total delay for a single modular multiplication using the proposed approach is 7 XOR + 1 AND gate delays, compared to 8 XOR + 1 AND gate delays for the recently proposed fastest algorithm. The second modified Montgomery multiplier uses a novel hardware unit that outputs the carry-save representation of four input operands in 3 XOR delays. The total delay for a single modular multiplication using the novel hardware unit is 5 XOR + 1 AND gate delays, compared to 6 XOR + 1 AND gate delays for the recently proposed algorithm. Optimized transistor-level implementations of the proposed approaches are also presented, targeting area, speed, and low power. The proposed Montgomery multiplication circuit is especially beneficial at larger word lengths such as 1024 and 2048 bits, where it reduces the propagation delay by 1024 and 2048 XOR gate delays, respectively, compared to the recently proposed fastest algorithm.
UR - http://www.scopus.com/inward/record.url?scp=33847104147&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33847104147&partnerID=8YFLogxK
U2 - 10.1109/DELTA.2006.70
DO - 10.1109/DELTA.2006.70
M3 - Conference contribution
AN - SCOPUS:33847104147
SN - 0769525008
SN - 9780769525006
T3 - Proceedings - Third IEEE International Workshop on Electronic Design, Test and Applications, DELTA 2006
SP - 414
EP - 417
BT - Proceedings - Third IEEE International Workshop on Electronic Design, Test and Applications, DELTA 2006
T2 - Third IEEE International Workshop on Electronic Design, Test and Applications, DELTA 2006
Y2 - 17 January 2006 through 19 January 2006
ER -