A System-Level Exploration of Binary Neural Network Accelerators with Monolithic 3D Based Compute-in-Memory SRAM

Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 623
Author(s):  
Jeong Hwan Choi ◽  
Young-Ho Gong ◽  
Sung Woo Chung

Binary neural networks (BNNs) are well suited to energy-constrained embedded systems thanks to their binarized parameters. Several researchers have proposed compute-in-memory (CiM) SRAMs for XNOR-and-accumulation computations (XACs) in BNNs by adding transistors to the conventional 6T SRAM, which reduces the latency and energy of data movement. However, because of the additional transistors, CiM SRAMs suffer from larger area and longer wires than conventional 6T SRAMs. Meanwhile, monolithic 3D (M3D) integration enables fine-grained 3D integration, reducing the 2D wire length in small functional units. In this paper, we propose a BNN accelerator (BNN_Accel), composed of a 9T CiM SRAM (CiM_SRAM), an input buffer, and global periphery logic, to execute the computations in the binarized convolution layers of BNNs. We also propose CiM_SRAM with subarray-level M3D integration (as well as transistor-level M3D integration), which reduces the wire latency and energy compared to the 2D planar CiM_SRAM. Across the binarized convolution layers, our simulation results show that BNN_Accel with the 4-layer CiM_SRAM reduces the average execution time and energy by 39.9% and 23.2%, respectively, compared to BNN_Accel with the 2D planar CiM_SRAM.
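
The XNOR-and-accumulation (XAC) operation mentioned in the abstract can be illustrated in software. The sketch below is not the paper's circuit or simulator; it only shows, under the common assumption that +1 is encoded as bit 1 and -1 as bit 0, why an XNOR followed by a bit count reproduces a dot product of binarized vectors. The function name `xnor_accumulate` is hypothetical.

```python
def xnor_accumulate(weights, activations):
    """Dot product of two {-1, +1} vectors computed with XNOR and a bit count.

    Illustrative only: assumes +1 is encoded as bit 1 and -1 as bit 0.
    """
    if len(weights) != len(activations):
        raise ValueError("vectors must have the same length")
    n = len(weights)

    # Pack each vector into an integer, one bit per element.
    w_bits = sum(1 << i for i, w in enumerate(weights) if w == 1)
    a_bits = sum(1 << i for i, a in enumerate(activations) if a == 1)

    # XNOR is 1 wherever the two bits agree, i.e. where the product is +1.
    mask = (1 << n) - 1
    agree = ~(w_bits ^ a_bits) & mask
    matches = bin(agree).count("1")

    # Each match contributes +1 and each mismatch -1 to the dot product.
    return 2 * matches - n


if __name__ == "__main__":
    w = [1, -1, 1, 1]
    a = [1, 1, -1, 1]
    print(xnor_accumulate(w, a), sum(x * y for x, y in zip(w, a)))
```

The result equals the ordinary sum of element-wise products, which is why the multiply-accumulate of a binarized convolution can be replaced by bitwise logic.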

1997 ◽  
Vol 473 ◽  
Author(s):  
J. A. Davis ◽  
J. D. Meindl

Abstract: Opportunities for Gigascale Integration (GSI) are governed by a hierarchy of physical limits. The levels of this hierarchy have been codified as: 1) fundamental, 2) material, 3) device, 4) circuit and 5) system. Many key limits at all levels of the hierarchy can be displayed in the power, P, versus delay, t_d, plane and the reciprocal length squared, L⁻², versus response time, τ, plane. Power, P, is the average power transfer during a binary switching transition and delay, t_d, is the time required for the transition. Length, L, is the distance traversed by an interconnect that joins two nodes on a chip and response time, τ, characterizes the corresponding interconnect circuit. At the system level of the hierarchy, quantitative definition of both the P versus t_d and the L⁻² versus τ displays requires an estimate of the complete stochastic wiring distribution of a chip. Based on Rent's Rule, a well-known empirical relationship between the number of signal input/output terminals on a block of logic and the number of gate circuits within the block, a rigorous derivation of a new complete stochastic wire length distribution for an on-chip random logic network is described. This distribution is compared to actual data for modern microprocessors and to previously described distributions. A methodology for estimating the complete wire length distribution for future GSI products is proposed. The new distribution is then used to enhance the critical path model that determines the maximum clock frequency of a chip; to derive a preliminary power dissipation model for a random logic network; and to define an optimal architecture of a multilevel interconnect network that minimizes overall chip size. In essence, a new complete stochastic wiring distribution provides a generic basis for maximizing the value obtained from a multilevel interconnect technology.
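
For reference, Rent's Rule is usually written in the following form. The symbols T, N, k and p are the conventional ones and are not taken from the abstract itself, so they should be read as assumptions about notation.

```latex
% Rent's Rule in its conventional form (notation assumed, not from the abstract):
%   T : number of signal input/output terminals on a block of logic
%   N : number of gate circuits within the block
%   k : average number of terminals per gate (Rent coefficient)
%   p : Rent exponent, with 0 <= p <= 1
\[
  T = k \, N^{p}
\]
```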


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Jun Zhao ◽  
Xumei Chen

An intelligent evaluation method is presented to analyze the competitiveness of airlines. From the perspectives of safety, service, and normality, we establish the competitiveness indexes of traffic rights and the standard sample base. The self-organizing mapping (SOM) neural network is used to self-organize and self-learn the samples without supervision or prior knowledge. The number of training steps that gives fast convergence and high clustering accuracy is determined by comparing multiple step settings. Index data from typical airlines are used to verify the effect of the SOM neural network on the airline competitiveness analysis. The simulation results show that the SOM neural network can accurately and effectively classify and evaluate the competitiveness of airlines, and the results have important reference value for the allocation of traffic rights resources.
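
To make the unsupervised learning step concrete, here is a minimal sketch of one update of a one-dimensional self-organizing map. It is not the authors' implementation: the function name `som_step`, the Gaussian neighborhood, and the parameters `lr` and `sigma` are illustrative assumptions, and the three-component sample merely stands in for hypothetical safety, service, and normality index values.

```python
import math


def som_step(weights, sample, lr=0.5, sigma=1.0):
    """Apply one SOM update in place and return the best matching unit index.

    weights: list of weight vectors, one per node on a 1-D map (assumed layout).
    sample:  one input vector of the same dimension.
    """
    # Best matching unit: the node whose weights are closest to the sample.
    dists = [
        sum((w_k - x_k) ** 2 for w_k, x_k in zip(w, sample))
        for w in weights
    ]
    bmu = dists.index(min(dists))

    # Pull every node toward the sample, scaled by a Gaussian of its
    # distance to the best matching unit along the map.
    for j, w in enumerate(weights):
        h = math.exp(-((j - bmu) ** 2) / (2 * sigma ** 2))
        weights[j] = [w_k + lr * h * (x_k - w_k) for w_k, x_k in zip(w, sample)]
    return bmu


if __name__ == "__main__":
    nodes = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
    winner = som_step(nodes, [0.9, 0.9, 0.9])
    print(winner, nodes)
```

Repeating this step over all samples, typically while shrinking `lr` and `sigma`, lets nearby nodes come to represent similar samples, which is what allows the map to group items without labels.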


Micromachines ◽  
2021 ◽  
Vol 12 (6) ◽  
pp. 622
Author(s):  
Dongpeng Zhang ◽  
Anjiang Cai ◽  
Yulong Zhao ◽  
Tengjiang Hu

This paper presents a model of the V-shaped electro-thermal MEMS actuator that takes the human error factor into account by cascading an ANSYS simulation model with a fuzzy mathematics calculation model. The fuzzy mathematics calculation model introduces the human error factor into the MEMS actuator model by using a BP neural network, which reduces the error between ANSYS simulation results and experimental results to less than 1%. Moreover, the model with the human error factor included will become more accurate as the database for the V-shaped electro-thermal actuator model grows.


Author(s):  
Anil S. Baslamisli ◽  
Partha Das ◽  
Hoang-An Le ◽  
Sezer Karaoglu ◽  
Theo Gevers

Abstract: In general, intrinsic image decomposition algorithms interpret shading as one unified component including all photometric effects. As shading transitions are generally smoother than reflectance (albedo) changes, these methods may fail in distinguishing strong photometric effects from reflectance variations. Therefore, in this paper, we propose to decompose the shading component into direct (illumination) and indirect shading (ambient light and shadows) subcomponents. The aim is to distinguish strong photometric effects from reflectance variations. An end-to-end deep convolutional neural network (ShadingNet) is proposed that operates in a fine-to-coarse manner with a specialized fusion and refinement unit exploiting the fine-grained shading model. It is designed to learn specific reflectance cues separated from specific photometric effects to analyze the disentanglement capability. A large-scale dataset of scene-level synthetic images of outdoor natural environments is provided with fine-grained intrinsic image ground-truths. Large-scale experiments show that our approach using fine-grained shading decompositions outperforms state-of-the-art algorithms utilizing unified shading on NED, MPI Sintel, GTA V, IIW, MIT Intrinsic Images, 3DRMS and SRD datasets.


2021 ◽  
Vol 5 (1) ◽  
Author(s):  
Mingxue Ma ◽  
Yao Ni ◽  
Zirong Chi ◽  
Wanqing Meng ◽  
Haiyang Yu ◽  
...  

Abstract: The ability to emulate multiplexed neurochemical transmission is an important step toward mimicking complex brain activities. Glutamate and dopamine are neurotransmitters that regulate thinking and impulse signals independently or synergistically. However, emulation of such simultaneous neurotransmission is still challenging. Here we report the design and fabrication of a synaptic transistor that emulates multiplexed neurochemical transmission of glutamate and dopamine. The device can perform glutamate-induced long-term potentiation, dopamine-induced short-term potentiation, or co-release-induced depression under particular stimulus patterns. More importantly, a balanced ternary system that uses our ambipolar synaptic device backtracks input ‘true’, ‘false’ and ‘unknown’ logic signals; this process is more similar to the information processing in human brains than a traditional binary neural network. This work provides new insight for neuromorphic systems to establish new principles to reproduce the complexity of a mammalian central nervous system from simple basic units.


2021 ◽  
Vol 30 ◽  
pp. 2826-2836 ◽  
Author(s):  
Yifeng Ding ◽  
Zhanyu Ma ◽  
Shaoguo Wen ◽  
Jiyang Xie ◽  
Dongliang Chang ◽  
...  

Author(s):  
Lei Si ◽  
Zhongbin Wang ◽  
Xinhua Liu

To identify the shearer running status accurately and conveniently, a novel approach based on the integration of rough sets (RS) and an improved wavelet neural network (WNN) was proposed. The decision table of RS was discretized through a genetic algorithm, and attribute reduction was realized by the MIBARK algorithm to simplify the samples of the WNN. Furthermore, an improved particle swarm optimization algorithm was proposed to optimize the parameters of the WNN, and the flowchart of the proposed approach was designed. Then, a simulation example was provided and some comparisons with other methods were carried out. The simulation results indicated that the proposed approach was feasible and outperformed the other methods. Finally, an industrial application example of mining automation production was demonstrated to verify the effect of the proposed system.

