# EXPLOITING VOLTAGE DRIVEN SWITCHING OF FERROMAGNETS FOR NOVEL SPIN BASED DEVICES AND CIRCUITS

A Dissertation

Submitted to the Faculty

of

Purdue University

by

Akhilesh Jaiswal

In Partial Fulfillment of the

Requirements for the Degree

of

Doctor of Philosophy

May 2019

Purdue University

West Lafayette, Indiana

# THE PURDUE UNIVERSITY GRADUATE SCHOOL STATEMENT OF DISSERTATION APPROVAL

Dr. Kaushik Roy, Chair

School of Electrical and Computer Engineering

Dr. Anand Raghunathan

School of Electrical and Computer Engineering

Dr. Dinesh Somasekhar

Intel Corporation

Dr. Vijay Raghunathan

School of Electrical and Computer Engineering

## Approved by:

Dr. Pedro Irazoqui

Head of the School Graduate Program

Dedicated to H. H. Sri Sai Narayan Baba and to my Parents

## ACKNOWLEDGMENTS

I would like to express my sincerest gratitude to my advisor Prof. Kaushik Roy for being a 'Mentor', a 'Teacher' and an 'Inspiration'. His uncanny ability to see through the details without losing sight of the big picture has been instrumental in shaping how I approach a given research problem. He has always been keen to listen to new ideas and give pertinent suggestions. His *mentorship through questions* has been very effective in guiding me in the right direction while still allowing me to think independently and figure out the specific details. Needless to say, without him in the background, this research would not have been possible.

I would also like to thank my doctoral dissertation committee, Prof. Anand Raghunathan, Dr. Dinesh Somasekhar and Prof. Vijay Raghunathan, for their valuable feedback, which has helped me improve the quality of my research work. My sincerest thanks to my undergraduate mentors Prof. Ramchandra Manthalkar and Prof. Suhas Gajre for believing in my abilities and for their constant encouragement. It was by chance that I stumbled upon them and ended up having lifelong 'Role Models' to look up to. I would also like to thank Dr. Xuanyao Fong for introducing me to spin device physics and helping me develop the simulation models. I would also like to take this opportunity to thank Dr. Ajey Jacob, my manager and mentor during my internship at Globalfoundries Research Lab. I am grateful for his encouragement to think and come up with new ideas and for his help in developing those ideas into formal research projects.

Finally, a big thanks to my collaborators and fellow lab-mates at the Nano-electronics Research Lab, Purdue, for making my stay at Purdue enjoyable. I would also like to thank my friends Akshay Pohekar, Amol More, Bharat Shinde, Harshad Surana and many others whom I might have missed, for all the smiles and laughter. To my parents and sisters and their families, my sincerest thanks for their love, support and patience.

## TABLE OF CONTENTS

| Section | Title | Page |
|---|---|---|
| | LIST OF TABLES | ix |
| | LIST OF FIGURES | x |
| | ABSTRACT | xviii |
| 1 | Introduction and Motivation | 1 |
| 2 | VCMA Physics and Modeling | 5 |
| 2.1 | Introduction to VCMA mechanism | 5 |
| 2.1.1 | VCMA mechanism: Voltage asymmetry | 5 |
| 2.1.2 | VCMA mechanism: Precessional switching | 7 |
| 2.2 | Device Modeling | 9 |
| 2.2.1 | Magnetization Dynamics based on stochastic LLGS equation including the VCMA effect | 9 |
| 2.2.2 | MTJ Resistance model | 10 |
| 2.2.3 | Self-Consistent SPICE Compatible Magnetization Dynamics and Resistance Model | 11 |
| 3 | *In-situ*, In-memory Stateful Vector Logic Operations based on Voltage Controlled Magnetic Anisotropy | 13 |
| 3.1 | Introduction and Related Work | 13 |
| 3.2 | Proposed *in-situ*, in-memory Stateful Vector Logic Operations | 16 |
| 3.2.1 | Stateful vector IMP gates | 16 |
| 3.2.2 | Stateful parallel NOT gates | 20 |
| 3.2.3 | Other Logic Gates | 23 |
| 3.2.4 | Stateful XOR Gate | 24 |
| 3.3 | Results | 26 |
| 3.4 | Summary | 29 |
| 4 | Magneto-electric Switching Mechanism and Modeling | 30 |
| 4.1 | Introduction to Magneto-electric Switching of Ferro-magnets | 30 |
| 4.2 | Modeling and Simulation | 32 |
| 4.3 | Device Characteristics | 37 |
| 4.3.1 | Scalability | 37 |
| 4.3.2 | Switching Speed | 38 |
| 5 | A Stochastic Leaky-Integrate-Fire Neuron using Magneto-electric Switching | 39 |
| 5.1 | Introduction and Related Work | 39 |
| 5.2 | Proposed Stochastic Leaky-Integrate-Fire Neuron | 41 |
| 5.3 | SNN Topology for pattern recognition | 47 |
| 5.4 | Synaptic Learning Mechanism | 47 |
| 5.5 | Hardware Implementation | 48 |
| 5.6 | Simulation Methodology | 49 |
| 5.7 | Summary | 51 |
| 6 | MESL: Proposal for a Non-volatile Cascadable $\underline{M}$agneto-$\underline{E}$lectric $\underline{S}$pin $\underline{L}$ogic | 52 |
| 6.1 | Introduction and Related Work | 52 |
| 6.2 | ME Logic Family and Cascadability | 55 |
| 6.3 | Results and Discussions | 59 |
| 6.4 | Summary | 60 |
| 7 | Voltage-Driven Domain-Wall Motion based Neuro-Synaptic Devices | 61 |
| 7.1 | Introduction and Related Work | 61 |
| 7.2 | Magneto-Electric DW motion based on Elastic Coupling | 63 |
| 7.3 | Magneto-Electric DW motion based Neuro-Synaptic Devices | 66 |
| 7.3.1 | LIF Neuron | 66 |
| 7.3.2 | Programmable Synapse | 68 |
| 7.4 | Device Modeling and Simulation | 71 |
| 7.5 | Results | 75 |
| 7.5.1 | Neuro-synaptic behavior of the proposed devices | 75 |
| 7.6 | Conclusion | 78 |
| 8 | Energy-Efficient Memories using Magneto-Electric Switching of Ferromagnets | 79 |
| 8.1 | Introduction | 79 |
| 8.2 | ME devices under consideration | 80 |
| 8.3 | Device Characteristics | 83 |
| 8.3.1 | Writability | 83 |
| 8.3.2 | Readability | 84 |
| 8.3.3 | Switching Speed | 85 |
| 8.4 | ME Memory Design | 87 |
| 8.4.1 | ME Dual Port Memory | 87 |
| 8.4.2 | ME CAM | 88 |
| 8.5 | Summary | 90 |
| 9 | Summary and Future Work | 91 |
| A | Appendix | 93 |
| A.1 | Introduction | 93 |
| A.2 | Proposed Spin Dice | 94 |
| A.3 | Results | 96 |
| | REFERENCES | 101 |
| | VITA | 112 |

## LIST OF TABLES

| Table | Title | Page |
|---|---|---|
| 2.1 | MTJ parameters used in the simulation model for analyzing the VCMA effect | |
| 3.1 | Average energy consumption per-bit and latency in the IMP and NOT vector operations | |
| 4.1 | Summary of Parameters used for our simulations for analyzing the ME effect | 35 |
| 5.1 | Summary of parameters used in our simulations for analysis of ME based Neuron | |
| 7.1 | Parameters used for simulations adopted from [110,112] for studying ME-DW Neuro-Synaptic Device | |
| 8.1 | Summary of Parameters used for our simulations | |
| 8.2 | Variation of MTJ Resistance with $t_{MgO}$ | |
| 8.3 | Comparison of proposed ME-XNOR CAM | |

## LIST OF FIGURES

| Figure | Caption | Page |
|---|---|---|
| 1.1 | (a) The depiction of change in resistance in the P and the AP state of the MTJ. (b) The switching of the MTJ stack from P to AP state due to electron flow from the PL to the FL and vice-versa. | |
| 1.2 | Various spin based switching mechanisms and devices (Field driven, Current driven: STT and SHE (Spin Hall Effect), Voltage Driven: VCMA and ME) and comparative switching power consumption for each device [12]. | |
| 2.1 | (a) A VCMA based MTJ. The MTJ consists of a *pinned layer* and a *free layer* separated by a non-magnetic spacer. When a voltage is applied across the MTJ, there is a redistribution of electrons in the d-orbitals, thus making the interface anisotropy sensitive to the applied voltage. (b) Schematic representation of the voltage asymmetry of the VCMA based MTJs. When a positive (negative) voltage is applied across the VCMA MTJ, the energy barrier (EB) decreases (increases) due to the lowering (enhancement) of the interface anisotropy. Thus, the VCMA mechanism makes the MTJ asymmetric with respect to voltage polarity: a positive voltage assists in switching the MTJ, whereas a negative voltage makes it much harder to switch the MTJ. (c) Figure representing the precessional switching mechanism. When a positive voltage is applied across the MTJ such that the interface anisotropy is sufficiently lowered, the magnetization vector becomes free to precess around the *hard axis* due to the effective in-plane field ($H_{in-plane}$). Inset shows the lowering of the interface anisotropy on application of a sufficiently high positive voltage (V>0). While the magnetization vector is precessing around the hard axis, if the voltage pulse is switched OFF when the magnetization is close to point A, it would slowly dampen towards the -z direction, thereby switching the direction of magnetization by $180^{\circ}$. | 6 |
| 2.2 | The NEGF based MTJ-resistance model [40] benchmarked against experimental data from [41]. | 12 |
| 2.3 | A graphical representation of the various components of our self-consistent magnetization dynamics and resistance transport model. | 12 |
| 3.1 | (a) The truth table for two input IMP operation. The columns B and B' are the same except for row 1, highlighted in red. (b) The array configuration showing the voltages at various SLs and WLs and the current flow during the stateful computation of the bit-wise IMP operation. (c) A simplified circuit showing the voltage divider configuration resulting due to the applied voltages at the SLs and WLs. (d) A typical magnetization dynamics during the switching of the MTJ-2 from the P to the AP state, when MTJ-1 is in the P state. Note that this is a typical STT dominated switching; the VCMA effect lowers the EB for MTJ-2, thereby allowing the small current flowing through the MTJ-2 to be able to switch it. | 17 |
| 3.2 | (a) The truth table for NOT operation. (b) The array configuration showing the voltages at various BLs and WLs and the current flow during the stateful computation of the massively parallel NOT operation. (c) A typical magnetization dynamics showing the precessional switching behavior of the VCMA MTJs mimicking the NOT operation. On application of proper voltages, irrespective of the initial state of the magnetization direction (+z or -z), the magnetization vector switches by $180^{\circ}$, thereby implementing the desired stateful NOT operation. | 20 |
| 3.3 | Based on the proposed stateful operations as described in the above subsections, an IMP and NOT operation can be completed in one cycle, whereas a two cycle operation can implement the NAND, NIMP and OR logic. Similarly, a three cycle operation can be used for the AND and NOR logic computation. For multi-cycle logic, the part of logic highlighted in red can be computed in the first cycle, the part in white can be computed in the second cycle, while the part highlighted in blue would be computed in the third cycle. | 23 |
| 3.4 | (a) A truth table for XOR gate. The logic output B' retains its original value when the operand A is 'L', whereas if the operand A is 'H', the new value for B' is the complement of its original value B. (b) Figure shows the array structure used for implementing the XOR operation. The voltages on BLs represent the bits corresponding to the operand A, while the data stored in the MTJs represent the bits corresponding to the operand B. The values in the MTJs are inverted conditionally only if the bits corresponding to the operand A are 'H', *i.e.* only if the respective SLs are pulled high. Note, in the example shown, the bit value for $A_1$ is 'L'; as such, BL-1 is kept low. Therefore, no current flows through the column corresponding to BL-1 and hence the bits corresponding to BL-1 consume no energy. | 24 |

| Figure | Caption | Page |
|---|---|---|
| 3.5 | (a) Probability of B's final state being 'H' (or digital '1') for the four initial cases of A and B (00, 01, 10, 11) in the vector IMP operation, as a function of the voltage pulse. At a pulse width of $\sim 25$ ns, the correct IMP result is obtained. (b) Probability of inverting the state of the VCMA MTJ due to precessional switching as a function of the pulse duration. The switching probability peaks at $\sim 2$ ns due to the half-cycle rotation of the magnetization dynamics. | 27 |
| 4.1 | (a) A graphical representation of a multi-ferroic material. Multi-ferroics are materials that exhibit more than one ferroic order (ferro-electricity, ferro-magnetism and ferro-elasticity). (b-c) A ferro-magnet in physical contact with an ME oxide. When an electric field is applied in the +z direction, the ferro-magnet switches to the +x direction and *vice-versa*. (d) Schematic for an ME-MTJ. By applying an appropriate voltage across the ME oxide, the state of the MTJ can be changed from parallel (P) to anti-parallel (AP). | 30 |
| 4.2 | (a) A typical evolution of magnetization components $m_x$, $m_y$, $m_z$ on application of a voltage pulse. The magnet is being switched from +x direction to -x direction. (b) The parallel and anti-parallel resistance obtained from our NEGF model [40] and benchmarked to experimental data from [69]. The resistance-voltage characteristics of Fig. 4.2(b) were abstracted into a behavioral model for simulation. | 34 |
| 4.3 | (a) A typical trajectory followed by the magnetization vector when switched using the STT mechanism. The STT mechanism initially acts as an anti-damping torque and subsequently as a damping torque, thereby switching the state of the ferro-magnet. (b) A typical trajectory followed by the magnetization vector when switched using the ME mechanism. With application of an external voltage, the magnetization tries to orient itself towards the direction of the ME field and finally dampens, resulting in a $180^{\circ}$ switching of the ferro-magnet. | |
| 5.1 | (a) A biological neuron with interconnecting synapses. (b) A representative model for a biological neural network. $V_i$s are the input spikes generated by pre-neurons. The neuron emits a spike if the membrane potential ($V_{mem}$) crosses a certain threshold ($V_{th}$). The weighted summation is usually carried out by a resistive crossbar array. Our proposed ME device aims to emulate the LIF and thresholding behavior of a biological neuron. | 39 |

| Figure | Caption | Page |
|---|---|---|
| 5.2 | Schematic of the proposed LIF ME neuron. Thick ME oxide (5 nm), sandwiched between the metal contact and the ferro-magnet, acts as a capacitor. Diode connected transistor M1 prevents back flow of charges stored on the ME capacitor, while resistor R1 determines the rising time constant for the capacitor. M2 constitutes the leak path when the voltage on the Leak/Reset terminal is zero. | 42 |
| 5.3 | The stochastic switching behavior of the proposed ME neuron as a function of the voltage across the ME capacitor. The switching probability was obtained for 10,000 runs using the magnetization dynamics model with thermal noise and a pulse duration of 1 ns. | 45 |
| 5.4 | Simulation results for the ME neuron shown in Fig. 5.2. Top panel shows the input spikes fed to the $V_{in}$ terminal of the device. Middle panel shows the voltage across the ME capacitor, exhibiting the typical leaky-integrate dynamics. Bottom panel illustrates the switching of the ferro-magnet from +x to -x direction, generating a spike annotated as *Spike-1*. No more spikes are generated until the device is reset to its initial position by applying a negative voltage. After reset, the device emits a second spike annotated as *Spike-2*. | |
| 5.5 | (a) SNN topology for pattern recognition. The input neurons are fully connected to the excitatory post-neurons, each of which is connected to the corresponding inhibitory neuron in a one-on-one manner. There are lateral inhibitory connections from each inhibitory neuron to all the excitatory post-neurons except the one from which it received a forward connection. (b) STDP learning algorithm, wherein the change in synaptic conductance is exponentially related to the difference in the spike times of the pre- and post-neuronal pair. | 46 |
| 5.6 | A typical crossbar implementation of the SNN topology using the proposed ME neuron. Memristive devices constitute the synapses, while the proposed device mimics the LIF post-neurons. The on-chip learning circuit programs the synaptic conductance based on spike timing. Inputs to the system are spike trains corresponding to the $28 \times 28$ image pixels from the MNIST dataset. | |
| 5.7 | (a) Synaptic weights connecting the $28 \times 28$ input pre-neurons to each of the 200 excitatory post-neurons towards the end of the training phase. (b) Classification accuracy versus the number of excitatory post-neurons. | |

| Figure | Caption | Page |
|---|---|---|
| 6.1 | (a) (Left) Figure illustrating the ME switching of a ferro-magnet with applied electric field. A positive voltage on the upper terminal switches the magnet in positive x direction and *vice-versa*. (Right) An MTJ stack consisting of an MgO sandwiched between two nano-magnets. The resistance of the MTJ is a function of the voltage and the relative orientation of the magnetization directions. (b) The proposed four terminal logic-device. The upper (lower) nano-magnet can be switched by application of a voltage pulse on terminal 1 (2). The resistance of the MTJ stack can be sensed between terminals 3 and 4. The thickness of the ME oxide and the MgO spacer can be tuned independently to improve the write-efficiency and the sensing margin simultaneously. | 54 |
| 6.2 | (a) Proposed ME XNOR gate. Only when both the ferro-magnets point in the same direction, the output of the inverter goes high, thus implementing an XNOR function. Inset shows the truth table for the XNOR function. L represents a digital 0 and H represents a digital 1. (b) Proposed ME NAND/NOR gate. For NAND operation, the inverter is sized such that the output goes low only if both the MTJ stacks are in anti-parallel (high-resistance) state. Whereas, for NOR operation, the sizing of the output inverter is such that it goes high only if both the MTJ stacks are in parallel (low-resistance) state. | 55 |
| 6.3 | (Left) Truth table for an IMP and NIMP logic gate. (Bottom) The set of logic gates forming a complete logic basis along with the IMP/NIMP gate. (Right) The proposed 2 input ME IMP gate. Inset shows the state of the ME-MTJs under various inputs. | 57 |
| 6.4 | Figure illustrating cascading of two ME XNOR gates. Initially, a reset operation is carried out by applying negative voltage pulses on terminals 'A', 'B', 'C', '$G_1$' and '$G_2$'. On the other hand, when data is applied the two stages are activated in a typical domino-style, one after another. A representative timing diagram illustrates the waveforms on various nodes. | 58 |
| 7.1 | (a) The replication of the domain pattern of the FE layer into the FM layer due to local strain coupling. An effective uniaxial anisotropy is induced in the region of the FM above the a-domain, while a cubic anisotropy is induced in the region over the c-domain. (b) Due to high aspect ratio the demagnetization anisotropy of the FM tends to align the magnetization of the FM along the length of the magnet, thereby resulting in almost $180^{\circ}$ angle between the magnetizations in the two regions of the FM. | 64 |


| Figure | Caption | Page |
|------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------|
| 7.2 | The proposed non-volatile LIF neuron based on elastic coupling between the FE-DW and FM-DW. The position of the FM-DW represents the membrane-potential, while the switching activity of the MTJ emulates the firing behavior of the neuron. | 67 |
| 7.3 | Micromagnetic simulation showing the domain wall shape and structure. The zoomed image shows a $90^{\circ}$ domain wall which has been transformed to a $180^{\circ}$ domain wall due to shape anisotropy. | 69 |
| 7.4 | The proposed non-volatile programmable synapse based on elastic coupling between the FE-DW and FM-DW. The position of the FM-DW modulates the conductance between Terminal-1 and 3 of the device. The FM-DW position, and thus the conductance of the synapse, can be modified by applying a +ve or -ve voltage across Terminal-2 and 3. | 70 |
| 7.5 | Depinning velocities of the magnetic domain wall, for positive and negative velocities. The blue plot was obtained by using periodic boundary conditions and parameters from [112]. The red plot was obtained without periodic boundary conditions and scaled dimensions. | 74 |
| 7.6 | Leaky integrate and fire behavior of the proposed neuron in response to input train of spikes. (a) Input voltage spike train received by the neuron. (b) FM-DW position (acts as membrane potential variable). (c) x-component of magnetization under the MTJ stack. Once the MTJ switches, the neuron fires, and the domain wall is reset to its initial position. The inset shows the average magnetization under the MTJ when the domain wall traverses the MTJ. | 76 |
| 7.7 | Plot of MTJ conductance $G_{MTJ}$ of the synaptic device in response to voltage pulses exhibits a controlled behavior of the synaptic weights. This can be used for better learning algorithms like 'ASP' for precise tuning of synaptic weight values. The leaky behavior of the synaptic weights can be implemented using a small -ve voltage across the device. | 77 |
| 8.1 | (a) Schematic of the ME-MTJ and (b) ME-XNOR. The ferromagnets in contact with respective ME oxides can be switched by applying appropriate voltages across the ME oxides. The direction of switching can be reversed by changing the polarity of the applied voltage. Due to shape anisotropy the easy axis of the ferro-magnets lies along the $\pm x$ axis. | 81 |

| Figure | Caption | Page |
|-------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------|
|       | (a) Switching probability versus voltage applied across the ME capacitor. It can be seen that the larger the ME coefficient, the lower the voltage required to switch the direction of magnetization. (b) The failure probability obtained versus voltage. Each point on the graph was obtained by 1,000 simulations of the stochastic LLG equation. The voltage was applied for a duration of 500ps and the state of the magnet was investigated after the application of the voltage pulse to verify if the magnet has switched within the applied pulse duration. |      |
|       | (a) (Left axis) Bit-cell TMR versus MgO thickness obtained from our NEGF based transport model. In each case, a transistor in series is used with Width/Length (W/L) ratio as specified in the figure. (Right axis) The RC time constant as a function of the MgO thickness. (b) A typical 3D switching trajectory of the magnetization under influence of applied voltage.                                                                                                                                                                                |      |
|       | 1-Read / 1-Write dual port memory using decoupled read/write path of ME-MTJs. The top row of ME-MTJs are being written into by activating the RWL, while the bottom row of ME-MTJs can be simultaneously read by activating WWL.                                                                                                                                                                                                                                                                                                                           |      |
|       | Proposed CAM based on ME-XNOR device. The upper and lower ferromagnets comprising the ME-XNOR device can be used to store the input data and the data to be matched, respectively. The $\overline{match}$ signal goes low if and only if all the p-MOSes of a particular row are turned OFF. |      |
|       | Schematic of an STT-MRAM bit cell being utilized as VC-SD. The bit-cell consists of the MTJ in series with an access transistor. The proposed TRNG can be implemented using a standard STT-MRAM array. The operation consists of "Reset", "Relax" and "Read" operations. The corresponding control signals WL, BL and SL have been shown. |      |
|       | Magnetization dynamics of the same VC-SD device for two different simulation runs. | 94 |

| Figure | Caption | Page |
|--------|---------|------|
| A.3 | Our benchmarked results for (a) only VCMA-induced switching, and (b) combined VCMA and STT switching. (c) NEGF results obtained from our transport model. We have matched the parallel and anti-parallel resistance to the reported value of $11K\Omega$ and $25K\Omega$ respectively. All the benchmarking is done with respect to the experiment [38]. The MTJ is of circular cross-sectional area with diameter $40nm$ and FL thickness $0.9nm$. The oxide thickness is $1.3nm$. An external field of magnitude $31mT$ is applied to provide the necessary in-plane magnetic field. It is worth noting here that the external field was only considered during the benchmarking process. For VC-SD operation, no external field was provided by the demagnetization field of the magnet. The MTJ operating voltage is $0.7V$. | 97 |
| A.4 | A sample SD trajectory for the proposed TRNG. The magnetization switches to "hard-axis" and subsequently relaxes to one of the stable magnetization states. |      |
| A.5 | Switching probability (measured over 500 independent stochastic LLG simulations) for varying "reset" pulse width $(1 - 6ns)$. The randomness offset remains limited within reasonable bounds (< 10%) even with (a) variations in cross-sectional area (5%) and thickness (2%), and (b) temperature. |      |


#### ABSTRACT

Akhilesh Jaiswal Ph.D., Purdue University, May 2019. Exploiting Voltage Driven Switching of Ferromagnets for Novel Spin based devices and circuits. Major Professor: Dr. Kaushik Roy.

The *spin* of an electron has long excited researchers, both with respect to its fundamental physics and its technological applications. The traditional field driven switching of ferromagnets gave way to more scalable current driven switching based on the well-known spin transfer torque phenomenon. However, in the quest for better energy-efficiency, the manipulation of electron spin through pure voltage driven or voltage-assisted mechanisms is being intensely explored. In this research, we demonstrate that the very physics and the characteristics of such voltage driven devices enable interesting possibilities with respect to memory, neuromorphic and logic applications. We rely on the recent experimental demonstrations of two novel voltage effects on nano-magnets: the voltage controlled magnetic anisotropy (VCMA) and the pure voltage driven magneto-electric (ME) effect. Specifically, we propose in-situ, in-memory, vector logic operations by exploiting the voltage asymmetry and precessional switching dynamics of the VCMA effect to construct 'stateful' logic gates. Stateful logic gates are those in which the same device acts simultaneously as a storage element and a compute engine. In addition, we show that the pure voltage driven mono-domain switching and domain-wall motion of nano-magnets through the ME effect can be leveraged to construct neuro-mimetic devices exhibiting the leaky-integrate-fire dynamics of biological neurons, as well as non-volatile synaptic elements. Further, we propose a voltage driven logic-device using the ME switching and demonstrate that the proposed logic-device can be used to construct a complete cascadable logic family including XNOR, IMP (implication), NAND and NOR gates. Additionally, we present an energy and area efficient content addressable memory using a logic compatible ME-XNOR device.
The presented research shows that voltage driven switching can augment the very functionality and widen the application scope of spin based devices and circuits.

## 1. INTRODUCTION AND MOTIVATION

Right from the inception of the idea of an electron having an intrinsic 'spin' in the early 1920s [1], electron spin has intrigued physicists and technologists alike. Seminal theoretical and experimental works by the likes of Paul Dirac [2], Stern and Gerlach [3], Albert Fert [4] and others made possible the first field driven magnetic storage devices. However, such field driven devices were not very scalable due to the requirement of an external magnetic field for switching the state of the magnetic memories. It was Slonczewski's theoretical work published in 1996 that predicted current driven switching of ferromagnets by the spin transfer torque (STT) mechanism [5]. The discovery of the STT phenomenon led to an entire paradigm shift in spin based devices and their applications.

Let us have a quick look at the STT mechanism. The basic magnetic device is a magnetic heterostructure consisting of two nano-magnets separated by an insulating oxide, as shown in Fig. 1.1(a), called the magnetic tunnel junction (MTJ). The direction of magnetization in one of the nano-magnets is fixed (the pinned layer), while the direction of magnetization in the other nano-magnet can be changed (the free layer). When the magnetizations of the pinned and the free layer point in the same direction (Fig. 1.1(a)), the MTJ is in the low resistance state (P state or digital '0'), whereas when the directions of magnetization in the two layers are anti-parallel, the MTJ is in the high resistance state (AP state or digital '1'). A read operation is accomplished by applying a small voltage across the MTJ and sensing its resistance. For writing a digital bit into the MTJ, a state transition may be required from the P to AP state or the AP to P state. For an AP to P transition, the electrons flow from the pinned layer (PL) into the free layer (FL), as shown in Fig. 1.1(b). The pinned layer acts as a polarizer and the electrons flowing from the pinned layer to the free layer are polarized in the direction of the pinned layer. This spin-polarized current exerts a torque on the free layer [5] making



Fig. 1.1. (a) The depiction of change in resistance in the P and the AP state of the MTJ. (b) The switching of the MTJ stack from the AP to the P state due to electron flow from the PL to the FL, and vice-versa.

its magnetization parallel to that of the pinned layer. For writing a digital '1' (P to AP), the current direction has to be reversed. In this case, spins pointing in the direction of the pinned layer easily pass through the MTJ, leading to accumulation of opposite spins in the free layer. These accumulated spins exert a torque on the free layer, making it anti-parallel with respect to the pinned layer. Thanks to the rich physics of the STT effect and its associated magnetization dynamics, many non-volatile memories [6], non-volatile logic [7], logic-in-memory [8], non-Boolean computations [9], neuromorphic applications [10], combinatorial optimization [11] *etc.* have become possible.

The STT mechanism, however, suffers from relatively high switching power consumption owing to the current induced nature of the switching mechanism [13]. In order to reduce the switching power, various voltage driven or voltage assisted switching mechanisms are being actively investigated [14, 15]. As shown in Fig. 1.2, voltage driven schemes allow lower power consumption when switching the device from one stable state to the other. We consider two different voltage driven effects: 1) the voltage controlled magnetic anisotropy (VCMA) and 2) the magneto-electric (ME) effect. The VCMA mechanism allows one to modulate the energy barrier of an MTJ by application of an electric field. The lower the energy barrier, the lower the switching current requirement. On the other hand, the ME effect leverages the coupling between different order parameters (ferroelectricity, ferromagnetism and ferroelasticity) to achieve pure voltage driven switching of ferromagnets.

Fig. 1.2. Various spin based switching mechanisms and devices (Field driven, Current driven: STT and SHE (Spin Hall Effect), Voltage Driven: VCMA and ME) and comparative switching power consumption for each device [12].

In this research, we demonstrate that the voltage driven switching of ferromagnets not only helps in achieving energy-efficiency but also opens up new avenues for memory, neuromorphic and logic applications. Specifically, chapter 2 describes the physics and modeling of the VCMA effect [16, 17]. In chapter 3, we propose *in-situ*, in-memory, stateful logic operations by exploiting the very physics of the VCMA mechanism [17]. Chapter 4 discusses the ME effect and its modeling [18]. Chapters 5 and 6 propose novel ME devices as neuromorphic [19] and Boolean logic primitives [18, 20], respectively. Chapter 7 introduces a novel neuro-synaptic device based on pure voltage driven domain-wall motion using elastically coupled ferroelectric and ferromagnetic layers [21]. In chapter 8, we present a 1-Read / 1-Write dual port memory and an energy- and area-efficient content addressable memory using ME devices [22]. Chapter 9 elaborates on the future work and summarizes the research work. In the Appendix, we present an energy-efficient true random number generator using the VCMA effect [23].

## 2. VCMA PHYSICS AND MODELING

### 2.1 Introduction to VCMA mechanism

#### 2.1.1 VCMA mechanism: Voltage asymmetry

The basic device structure under consideration is the two terminal magnetic tunnel junction (MTJ). An MTJ consists of two nano-magnets separated by an insulating oxide, as shown in Fig. 2.1(a). The MTJ is called a perpendicular MTJ if the magnetization directions of the two nano-magnets are perpendicular to the plane of the nano-magnets. The magnetization of one of the nano-magnets, called the *pinned layer* (PL), is fixed, while that of the other nano-magnet, called the *free layer* (FL), can be switched by applying a voltage across the MTJ. The MTJ has two stable states called the parallel (P) state and the anti-parallel (AP) state. When the magnetizations of the two nano-magnets point in the same direction, the MTJ is in the low resistance P state; when they point in opposite directions, it is in the high resistance AP state.

Conventionally, the state of the MTJ has been switched using the current induced spin transfer torque (STT) phenomenon [24]. The basic physics associated with the STT phenomenon relies on the fact that a *spin polarized* current passing through the FL exerts a torque on the FL, thereby flipping the state of the MTJ from the P to the AP state and *vice-versa*. The torque exerted by the STT mechanism has to be sufficient to overcome the energy barrier (EB) associated with the FL. In a perpendicular MTJ, it is the interface anisotropy that creates the required energy barrier between the two stable states of the MTJ. In general, the higher the EB, the higher the current required to switch the MTJ. One of the key challenges associated with the STT phenomenon is the high switching current requirement [25]. In order to reduce the current requirement for switching the nano-magnet, various voltage driven switching phenomena are under intense research investigation [26, 27]. One of the most



Fig. 2.1. (a) A VCMA based MTJ. The MTJ consists of a *pinned layer* and a *free layer* separated by a non-magnetic spacer. When a voltage is applied across the MTJ, there is a redistribution of electrons in the d-orbitals, making the interface anisotropy sensitive to the applied voltage. (b) Schematic representation of the voltage asymmetry of the VCMA based MTJs. When a positive (negative) voltage is applied across the VCMA MTJ, the energy barrier (EB) decreases (increases) due to the lowering (enhancement) of the interface anisotropy. Thus, the VCMA mechanism makes the MTJ asymmetric with respect to voltage polarity: a positive voltage assists in switching the MTJ, whereas a negative voltage makes it much harder to switch the MTJ. (c) Figure representing the precessional switching mechanism. When a positive voltage is applied across the MTJ such that the interface anisotropy is sufficiently lowered, the magnetization vector becomes free to precess around the hard axis due to the effective in-plane field $(H_{in-plane})$. Inset shows the lowering of the interface anisotropy on application of a sufficiently high positive voltage (V>0). While the magnetization vector is precessing around the hard-axis, if the voltage pulse is switched OFF when the magnetization is close to point A, it slowly damps towards the -z direction, thereby switching the direction of magnetization by $180^{\circ}$.

promising techniques, and one that is easy to incorporate in the two terminal MTJ stack, is the voltage controlled magnetic anisotropy (VCMA) effect [26].

The VCMA effect is the modulation of the interface anisotropy of the MTJ stack by a voltage applied across the MTJ [28]. Application of an electric field modulates the relative occupancy of the valence d-orbitals, as shown schematically in Fig. 2.1(a), thereby effectively changing the interface anisotropy [29, 30]. Recall that in perpendicular MTJs it is the interface anisotropy that is primarily responsible for creating the required EB. A large EB is required for maintaining the non-volatility of the MTJ devices. However, a large EB also makes it harder to switch the nano-magnets during the write process. The VCMA effect allows one to temporarily reduce the EB by reducing the interface anisotropy in response to an electric field. The reduced EB makes it easier to switch the nano-magnets, thereby reducing the switching current requirement. On the other hand, if the direction of the electric field is reversed, the EB increases due to the VCMA effect, making it much more difficult to switch the nano-magnet. This increase or decrease in the EB due to application of a voltage across the MTJ is shown schematically in Fig. 2.1(b). The figure shows that the VCMA effect makes the MTJ stack *asymmetric* with respect to the voltage polarity. With favorable voltage polarity (pinned layer at higher potential than the free layer) the MTJ can be easily switched, while if the voltage polarity is reversed the MTJ is difficult to switch. In fact, it has been experimentally shown that when the EB is increased by applying a voltage, the MTJ breaks down at sufficiently high voltages but does not switch [31]. In a later section, we describe how this voltage asymmetry of the VCMA based MTJs is used to construct stateful IMP (implication) logic for vector operations.
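The voltage asymmetry can be made concrete with the linear VCMA model of Eq. 2.4 and Eq. 2.5 and the parameters of Table 2.1 (both introduced in Section 2.2). The following is a minimal sketch, not the simulation framework used in this work. It assumes CGS units with $K_i$ in erg/cm² and $\xi$ in erg/(V·cm), chosen so that Eq. 2.4 is dimensionally consistent; the function names are illustrative only.

```python
# Parameters from Table 2.1 (CGS units). The units of K_I0 and XI are taken
# here as erg/cm^2 and erg/(V*cm) so that Eq. 2.4 is dimensionally
# consistent; that choice is an assumption of this sketch.
K_I0 = 1.1        # interface anisotropy constant at 0 V
XI = 3.72e-8      # VCMA coefficient
T_MGO = 1.3e-7    # MgO spacer thickness in cm (1.3 nm)
T_FL = 0.9e-7     # free-layer thickness in cm (0.9 nm)
M_S = 1257.3      # saturation magnetization in emu/cm^3


def interface_anisotropy(v_mtj):
    """Eq. 2.4: K_i = K_i0 - xi * V_MTJ / t_MgO."""
    return K_I0 - XI * v_mtj / T_MGO


def anisotropy_field(v_mtj, m_z=1.0):
    """z-component of Eq. 2.5: H_ani = 2 * K_i * m_z / (M_s * t_FL), in Oe."""
    return 2.0 * interface_anisotropy(v_mtj) * m_z / (M_S * T_FL)


# A positive voltage lowers K_i (and hence the EB); a negative one raises it.
for v in (-1.0, 0.0, 1.0):
    print(f"V = {v:+.1f} V: K_i = {interface_anisotropy(v):.3f}, "
          f"H_ani = {anisotropy_field(v):.0f} Oe")
```

Under these assumptions a 1 V pulse shifts $K_i$ by about 0.29 erg/cm² in either direction, which is the asymmetry sketched in Fig. 2.1(b): the same voltage magnitude lowers the barrier for one polarity and raises it for the other.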

#### 2.1.2 VCMA mechanism: Precessional switching

The VCMA effect allows for a new switching dynamics, called *precessional switching*, in contrast to the typical STT based switching phenomenon [16]. The precessional switching dynamics can be understood with respect to Fig. 2.1(c). Let us assume the magnetization of the FL is initially pointing in the +z-direction due to the interface anisotropy, which tends to align the magnetization direction perpendicular to the plane of the nano-magnet. As a consequence of the VCMA effect, when a voltage is applied across the MTJ, the interface anisotropy decreases. If the decrease in the interface anisotropy is sufficient, the magnetization is no longer bound by the interface anisotropy and is free to deviate from its initial position (+z in this case). Now, assume there is a small in-plane field in the +x-direction (denoted as $H_{in-plane}$ in Fig. 2.1(c)), either due to the shape anisotropy or engineered in the MTJ stack as experimentally demonstrated in [32]. Since the interface anisotropy has been reduced by voltage application (V>0 in Fig. 2.1(c)) and there is an effective field in the +x direction, the magnetization tends to align itself with the effective field. It does so by precessing and slowly damping towards the +x direction. This behavior is graphically depicted in Fig. 2.1(c), where the magnetization initially starts from position 'I' and then follows the trajectory marked by points A-B-C on application of an electric field across the MTJ (V>0).

If we turn OFF the applied voltage when the magnetization is at point A in Fig. 2.1(c), the magnetization slowly damps and points in the -z direction due to the interface anisotropy. Thus, by timing the voltage pulse such that the magnetization makes a half-cycle around the *hard-axis* (+x in this case), it can be switched by $180^{\circ}$. This switching due to the precession of the magnetization around the hard-axis is called precessional switching. VCMA based precessional switching has several advantages, including low energy requirement and high switching speed [33]. We later describe how this precessional switching of the VCMA MTJs can be used to construct a massively parallel NOT operation.
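The half-cycle timing argument can be illustrated with a deterministic macrospin integration. The sketch below is not the SPICE framework of Section 2.2: it integrates the Landau-Lifshitz form of the damped precession equation (equivalent to Eq. 2.1 without the STT and thermal terms) with a fourth-order Runge-Kutta step. It assumes that the voltage pulse exactly cancels the net perpendicular anisotropy, takes a hypothetical net perpendicular anisotropy field of 3000 Oe at zero voltage, and uses a 100 Oe in-plane field along +x (magnitude borrowed from Table 2.1, direction from Fig. 2.1(c)).

```python
import math

GAMMA = 1.76e7      # gyromagnetic ratio in rad/(s*Oe) (assumed value)
ALPHA = 0.02        # Gilbert damping factor (Table 2.1)
H_INPLANE = 100.0   # in-plane field along +x in Oe (assumed direction)
H_K_EFF = 3000.0    # net perpendicular anisotropy field at V = 0 in Oe (hypothetical)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def llg_rhs(m, pulse_on):
    """dm/dt in Landau-Lifshitz form; the pulse removes the perpendicular field."""
    h_perp = 0.0 if pulse_on else H_K_EFF * m[2]
    h = (H_INPLANE, 0.0, h_perp)
    mxh = cross(m, h)
    mxmxh = cross(m, mxh)
    pref = -GAMMA / (1.0 + ALPHA ** 2)
    return tuple(pref * (mxh[i] + ALPHA * mxmxh[i]) for i in range(3))


def axpy(a, b, s):
    """Return a + s*b for 3-vectors."""
    return tuple(a[i] + s * b[i] for i in range(3))


def rk4_step(m, dt, pulse_on):
    k1 = llg_rhs(m, pulse_on)
    k2 = llg_rhs(axpy(m, k1, dt / 2), pulse_on)
    k3 = llg_rhs(axpy(m, k2, dt / 2), pulse_on)
    k4 = llg_rhs(axpy(m, k3, dt), pulse_on)
    new = tuple(m[i] + dt / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
                for i in range(3))
    norm = math.sqrt(sum(c * c for c in new))
    return tuple(c / norm for c in new)   # keep |m| = 1


def simulate(pulse_width, relax_time=5e-9, dt=1e-12):
    """Start at +z, apply the pulse, then let the magnet relax with V = 0."""
    m = (0.0, 0.0, 1.0)
    n_steps = int(round((pulse_width + relax_time) / dt))
    for n in range(n_steps):
        m = rk4_step(m, dt, pulse_on=(n * dt < pulse_width))
    return m


# Time for half a precession cycle around the in-plane field (about 1.8 ns here).
T_HALF = math.pi * (1.0 + ALPHA ** 2) / (GAMMA * H_INPLANE)

print("half-cycle pulse, final m_z:", round(simulate(T_HALF)[2], 3))
print("full-cycle pulse, final m_z:", round(simulate(2 * T_HALF)[2], 3))
```

In this idealized setting a pulse lasting half a precession period leaves the magnetization in the -z well, while a pulse lasting a full period returns it to +z, which is why the pulse width must be timed. With thermal noise and a realistic voltage-dependent anisotropy the timing window would differ.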

### 2.2 Device Modeling

In this section, we describe the coupled device-circuit simulation model developed for analyzing VCMA based MTJs and associated circuits. The model integrates and self-consistently solves the magnetization dynamics and electron transport models in a SPICE platform, enabling rigorous circuit simulation for evaluating energy and performance metrics.

#### 2.2.1 Magnetization Dynamics based on stochastic LLGS equation including the VCMA effect

The magnetization vector in a mono-domain nanomagnet follows the dynamics governed by the well-known *Landau-Lifshitz-Gilbert-Slonczewski (LLGS)* equation [34,35]. LLGS equation can be written as follows:

$$\frac{\partial \widehat{m}}{\partial t} = -|\gamma|\widehat{m} \times \overrightarrow{H}_{EFF} + \alpha \widehat{m} \times \frac{\partial \widehat{m}}{\partial t} + \overrightarrow{STT} \tag{2.1}$$

$$\overrightarrow{H}_{EFF} = \overrightarrow{H}_{ext} + \overrightarrow{H}_{demag} + \overrightarrow{H}_{ani} + \overrightarrow{H}_{thermal} \tag{2.2}$$

$$\overrightarrow{STT} = |\gamma|\beta(\widehat{m} \times (\epsilon \widehat{m} \times \widehat{P} + \epsilon' \widehat{P})) \tag{2.3}$$

where $\widehat{m}$ is the unit magnetization vector, $\alpha$ is the Gilbert damping constant, $\gamma$ is the gyromagnetic ratio, $H_{EFF}$ is the effective magnetic field experienced by the nanomagnet and $\overrightarrow{STT}$ is the spin transfer torque acting on the nanomagnet. The first term on the right hand side of Eq. 2.1 describes magnetization precession about $H_{EFF}$, while the second and last terms describe the damping torque and STT, respectively. $H_{EFF}$ includes an external field $(H_{ext})$, the demagnetization field due to shape anisotropy [36] $(H_{demag})$, the interface perpendicular anisotropy field [13] $(H_{ani})$ and a stochastic field due to thermal noise $(H_{thermal})$, as described in Eq. 2.2. The STT term is expressed in Eq. 2.3, where $\beta$ is the rate of spin transfer into the MTJ-FL, $\epsilon$ is the spin injection efficiency, $\widehat{P}$ is the polarization of the incoming spin current and $\epsilon'$ describes the STT field-like torque. Further, as detailed in Section 2.1, the VCMA effect modulates the interface anisotropy of the MTJ stack in response to an applied voltage. The VCMA effect is thus modeled using a voltage dependent anisotropy constant $(K_i)$, which is incorporated in the LLGS equation through $H_{ani}$, as follows:

$$K_{i} = K_{i0} - \xi \frac{V_{MTJ}}{t_{MgO}} \tag{2.4}$$

$$H_{ani} = \frac{2K_i m_z}{M_s t_{FL}} \hat{z} \tag{2.5}$$

where  $\xi$  is the VCMA coefficient,  $V_{MTJ}$  is the voltage applied across the MTJ stack,  $t_{MgO}$  is the spacer oxide thickness,  $K_{i0}$  is the nominal value of anisotropy constant at zero voltage (no VCMA),  $M_s$  is saturation magnetization and  $t_{FL}$  is thickness of the FL nanomagnet. The thermal noise was included in the LLGS equation using a thermal field given by Brown's model [37] as:

$$\overrightarrow{H}_{thermal} = \overrightarrow{\zeta} \sqrt{\frac{2\alpha k_B T}{|\gamma| M_S \rho_{mtj} dt}} \tag{2.6}$$

where $\overrightarrow{\zeta}$ is a vector whose components are independent Gaussian random variables with zero mean and unit standard deviation, $\rho_{mtj}$ is the volume of the nanomagnet, T is the ambient temperature, $k_B$ is Boltzmann's constant and dt is the simulation time step. The device dimensions and other parameters used in simulation are tabulated in Table 2.1.
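As a sanity check on Eq. 2.6, the per-component standard deviation of the thermal field can be evaluated for the free-layer dimensions of Table 2.1. The sketch below is illustrative only: it assumes CGS units, a gyromagnetic ratio of $1.76 \times 10^7$ rad/(s·Oe), a cylindrical free layer, and hypothetical function names.

```python
import math
import random

K_B = 1.380649e-16   # Boltzmann constant in erg/K
GAMMA = 1.76e7       # gyromagnetic ratio in rad/(s*Oe) (assumed value)
ALPHA = 0.02         # Gilbert damping factor (Table 2.1)
M_S = 1257.3         # saturation magnetization in emu/cm^3 (Table 2.1)
DIAMETER = 40e-7     # MTJ diameter in cm (40 nm)
T_FL = 0.9e-7        # free-layer thickness in cm (0.9 nm)
VOLUME = math.pi * (DIAMETER / 2) ** 2 * T_FL   # cylindrical free layer, cm^3


def thermal_field_sigma(temperature, dt):
    """Standard deviation (Oe) of each thermal-field component, Eq. 2.6."""
    return math.sqrt(2 * ALPHA * K_B * temperature
                     / (GAMMA * M_S * VOLUME * dt))


def sample_thermal_field(temperature, dt, rng=random):
    """Draw one thermal-field vector: sigma times three unit Gaussians."""
    sigma = thermal_field_sigma(temperature, dt)
    return tuple(sigma * rng.gauss(0.0, 1.0) for _ in range(3))


sigma = thermal_field_sigma(300.0, 1e-12)
print(f"sigma at 300 K, dt = 1 ps: {sigma:.0f} Oe")
print("one sample:", sample_thermal_field(300.0, 1e-12, random.Random(0)))
```

Because the field scales as $1/\sqrt{dt}$, a fresh sample must be drawn at every time step with the same dt that the integrator uses; reusing a sample across steps of a different size would misrepresent the noise power.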

#### 2.2.2 MTJ Resistance model

The resistance of the MTJ was modeled using the non-equilibrium Green's function (NEGF) approach, benchmarked against experimental data from [41], as illustrated in Fig. 2.2. The details of various equations used in our NEGF model can be found in [40]. Our NEGF model is based on a potential profile wherein a non-magnetic barrier separates two nano-magnets. The non-magnetic barrier is characterized by its energy-barrier while the nano-magnets by their *band-splitting* energy. The results obtained by the NEGF calculations were encapsulated in an analytical fitting model

| Parameter                              | Value                                                |
|----------------------------------------|------------------------------------------------------|
| MTJ diameter $(W_{MTJ})$               | 40 nm                                                |
| MTJ-FL thickness $(t_{FL})$            | 0.9 nm                                               |
| MTJ-spacer thickness $(t_{MgO})$       | 1.3 nm                                               |
| MTJ-PL polarization                    | 0.4                                                  |
| Saturation magnetization $(M_S)$       | $1257.3\ \mathrm{emu/cm^3}$ [38]                     |
| Gilbert damping factor $(\alpha)$      | 0.02                                                 |
| Tunneling magnetoresistance (TMR)      | 125%                                                 |
| VCMA coefficient $(\xi)$               | $3.72 \times 10^{-8}\ \mathrm{erg\,V^{-1}/cm^2}$ [39] |
| Interface anisotropy at 0 V $(K_{i0})$ | $1.1\ \mathrm{erg/cm^3}$                             |
| External field $(H_{ext})$             | $100\ \mathrm{Oe}\ \widehat{y}$                      |

Table 2.1. MTJ parameters used in the simulation model for analyzing the VCMA effect.

such that the resulting MTJ resistance was modeled as a SPICE compatible voltage dependent resistance.

### 2.2.3 Self-Consistent SPICE Compatible Magnetization Dynamics and Resistance Model

A SPICE compatible device-circuit model was developed in Verilog-A for the VCMA-MTJ. The Verilog-A model concurrently solves the LLGS equation, the MTJ resistance model and the associated circuit equations. Predictive transistor models [42] were used for the access transistors, thus completing the 1-T 1-VCMA MTJ bit-



Fig. 2.2. The NEGF based MTJ-resistance model [40] benchmarked against experimental data from [41].



Fig. 2.3. A graphical representation of the various components of our self-consistent magnetization dynamics and resistance transport model.

cell model. Fig. 2.3 shows graphically the various building blocks associated with our self-consistent device-circuit simulation framework.

## 3. *IN-SITU*, IN-MEMORY STATEFUL VECTOR LOGIC OPERATIONS BASED ON VOLTAGE CONTROLLED MAGNETIC ANISOTROPY

### 3.1 Introduction and Related Work

Shannon, in his seminal work on Boolean logic, laid the foundation for digital logic design as a part of his master's thesis [43]. The underlying idea is that basic Boolean gates like AND and OR can be easily implemented using electronic switches. With the invention of transistor switches [44], almost a decade after Shannon's work, digital logic quickly gained ground and has become the workhorse of today's information processing [45].

In general, state-of-the-art digital processors rely heavily on Boolean gates constituting the *computational unit*, which is separate from the *storage unit* consisting of numerous memory cells. This decoupled architecture, wherein memory and compute units are physically separated, is named after its inventor as the *von-Neumann architecture* [46]. The von-Neumann architecture forms the backbone of almost all the available commercial processors. Despite the tremendous strides made in computing efficiency powered by von-Neumann machines, they fail to deliver the speed and efficiency demanded by recent developments in big-data, artificial intelligence, Internet-of-things (IoT) *etc.* [47]. The major limitation associated with the von-Neumann architecture is the so-called *von-Neumann bottleneck* [48]. This bottleneck mainly arises from the limited data transfer rate between the physically decoupled compute and memory units. The frequent *to-and-fro* data transfer between the compute and the memory units not only limits the overall throughput but also results in a large energy overhead associated with each data transfer. In order to mitigate the limitations associated with the von-Neumann bottleneck, one promising approach is to enable *in-memory* vector computations [49, 50].

These novel computing paradigms, termed *in-memory* computations, aim to implement some (or all) aspects of Boolean logic computations as close to the memory units as possible, thereby avoiding expensive data transfer between the compute and memory units, resulting in higher throughput and better energy-efficiency. Such in-memory computations using conventional silicon based complementary-metal-oxide-semiconductor (CMOS) technology have been demonstrated in [51]. The basic idea behind the in-memory compute mechanism proposed in [51] is to activate multiple rows of memory-cells and read out a voltage which is proportional to the desired logic computations. However, silicon technology is itself facing tremendous challenges due to aggressive scaling of the CMOS transistors [52-54]. As such, novel memory technologies like spin based magnetic random access memories (MRAMs) [6,41], resistive RAMs [55] and phase change materials based memories [56] are being actively investigated for possible replacement of silicon based technologies. A key benefit of these novel technologies is their *non-volatility*. The non-volatile characteristics of these memory units make them well suited for ultra-low leakage applications, ultimately increasing the energy-efficiency [57].

Exploration of in-memory compute designs using such non-volatile technologies is crucial to meet the energy and throughput requirements demanded by the emerging data intensive applications. Spin-transfer-torque MRAM (STT-MRAM) based in-memory Boolean computations have been proposed in [8, 58]. These in-memory architectures rely on the peripheral read circuits to implement the actual computations. Nevertheless, the peripheral circuits being close to the memory array does provide energy and throughput benefits. The logic computation results are only available when the data is being read from the memory array. This implies that if one were to do multiple logic operations which depend on intermediate results, one would need to do a read operation for every logic computation. Thus each in-memory logic operation is inevitably associated with a memory read operation, even for intermediate results, leading to decreased memory throughput and energy-efficiency.

As opposed to the aforementioned works, which use the memory peripheral circuits to do the actual logic computations, there are other classes of in-memory compute designs that do computations 'in-situ' using 'stateful' memory devices, wherein the same device acts both as a memory element and as a compute unit. The well known memristive implication (IMP) logic demonstrated in [59] is a good example of such stateful computations. However, the limited endurance of memristors in general makes these devices unsuitable for on-chip cache or IoT applications that have extreme longevity requirements. Out of all the non-volatile technologies, spin based devices are the only devices that have high switching speed as well as unlimited endurance. A few works on stateful computations using spin devices can be found in [60], [61]. Specifically, the work presented in [60] uses a three terminal device exploiting the spin Hall effect and the voltage controlled magnetic anisotropy in spin devices to do stateful computations. However, one of the inputs to these devices is an electrical quantity, *i.e.* an input charge current. This in turn implies that if we were to compute, say, the vector AND operation on the logic states stored in two separate memory rows, one of the memory rows would have to be read first and then converted into an electrical signal (a current in this case) before the actual logic computation can be completed. This requirement of 'read before compute' would lead to degraded benefits in throughput and energy.

In this work, we employ the very physics of voltage controlled magnetic anisotropy to construct *in-situ*, in-memory, stateful computations using a two terminal spin device. Specifically, we use the voltage asymmetry of the VCMA effect to construct IMP (implication) logic and the precessional dynamics of the VCMA switching process to propose a *massively parallel* NOT and XOR operation. The key highlights of the work presented in this chapter and its advantages over previous works are as follows:

1. We propose *in-situ*, in-memory stateful IMP vector computations using the voltage asymmetry of the VCMA effect on two terminal magnetic tunnel junctions (MTJs). In addition, we propose a *massively parallel* NOT and XOR operation by exploiting the precessional switching dynamics of VCMA based MTJs.

2. Further, the massively parallel behavior of the proposed NOT gate allows multi-cycle computation of other Boolean functions including AND, OR, NAND, NOR and NIMP (complement of IMP), thereby constructing a rich logic functionality embedded within the memory array in a stateful manner.
3. One of the major advantages of the proposed *in-situ*, in-memory stateful vector computations is the fact that we rely on the well known 1 transistor 1 MTJ bit-cell without making any changes in the magnetic device or the bit-cell circuit. This in turn makes our proposal attractive from a manufacturability point of view. Further, as opposed to [8,58], our logic computations do not rely on complex read operations, given the fact that reading MTJ devices in general is a complex circuit problem. In addition, as opposed to the work in [60], we do not need to represent the logic operands by an electrical input; rather, both the logic operands can be stored in the memory array, leading to higher throughput.
4. We have developed a detailed device-circuit model comprising self-consistent magnetization dynamics and electron transport models integrated seamlessly in a SPICE environment to study the feasibility of the proposed logic computations.

## 3.2 Proposed *in-situ*, in-memory Stateful Vector Logic Operations

### 3.2.1 Stateful vector IMP gates

Let us assume we have two VCMA based MTJs – 'MTJ-1' and 'MTJ-2' storing two input data bits 'Bit-1' and 'Bit-2', respectively. We wish to compute the implication (IMP) of bits 'Bit-1' and 'Bit-2' such that the new value of the MTJ-2 would correspond to the IMP of the original values of bits 'Bit-1' and 'Bit-2'. Further, let us assume that this logic computation has to be done in a 'stateful' manner such that



Fig. 3.1. (a) The truth table for two input IMP operation. The columns B and B' are the same except for row 1, highlighted in red. (b) The array configuration showing the voltages at various SLs and WLs and the current flow during the stateful computation of the bit-wise IMP operation. (c) A simplified circuit showing the voltage divider configuration resulting due to the applied voltages at the SLs and WLs. (d) A typical magnetization dynamics during the switching of the MTJ-2 from the P to the AP state, when MTJ-1 is in the P state. Note, this switching dynamics is a typical STT dominated switching, VCMA effect lowers the EB for MTJ-2, thereby allowing the small current flowing through the MTJ-2 to be able to selectively switch the MTJ-2 as desired.

the same VCMA MTJs (that function as memory elements storing bits 'Bit-1' and 'Bit-2') also act as logic computation units.

In order to understand the proposed stateful computations, let us consider the truth table of a two input IMP gate shown in Fig. 3.1(a). Note, the first column (A) would physically represent the possible states of MTJ-1 and the second column (B) would represent the states of MTJ-2. The third column (B') represents the new state of MTJ-2 after the logic operation has been completed. Interestingly, in Fig. 3.1(a), column B is the same as B' except for row 1 (highlighted in red). Further, we assume the low digital level (L) is mapped to the P state of the MTJ and the high digital level (H) is mapped to the AP state. This implies that, in order to do the stateful computations, when the operand 'A' (MTJ-1) is in the P state and operand 'B' (MTJ-2) is also in the P state, the state of MTJ-2 should change from P to AP, thereby mimicking the logic operation corresponding to row 1 of Fig. 3.1(a). Further, for all other cases, since B = B', the state of MTJ-2 should not change. Thus, if we can retain the state of MTJ-2 for rows 2, 3, 4 and change the state from P to AP for row 1, we would have effectively accomplished the IMP operation.
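The state-update rule described above can be sketched behaviourally as follows. This is only a logic-level illustration (it does not model the device physics); the function name `stateful_imp` and the integer encoding of the P and AP states are assumptions introduced here.

```python
# Logic-level sketch of the stateful IMP rule: MTJ-2 flips from P to AP
# only when both MTJs start in the P state; otherwise it keeps its state.
P, AP = 0, 1  # 'L' is mapped to P and 'H' to AP, as assumed in the text


def stateful_imp(a, b):
    """Return the new state of MTJ-2 given the states of MTJ-1 (a) and MTJ-2 (b)."""
    if a == P and b == P:
        return AP
    return b


for a in (P, AP):
    for b in (P, AP):
        new_b = stateful_imp(a, b)
        # Cross-check against the Boolean definition A IMP B = (NOT A) OR B
        assert new_b == int((not a) or b)
        print(f"A={a} B={b} -> B'={new_b}")
```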

Fig. 3.1(b) and (c) illustrate the device-circuit technique to do the aforementioned IMP computation. Let us assume we have two vector input operands 'A' and 'B'. The bits 'A<sub>0</sub>' to 'A<sub>N</sub>' corresponding to the input 'A' are stored in the upper row of the memory array as shown in Fig. 3.1(b). Similarly, bits 'B<sub>0</sub>' to 'B<sub>N</sub>' corresponding to the input 'B' are stored in the lower row of the memory array. In order to do the bitwise IMP computations for operands 'A' and 'B' we would activate the corresponding word-lines WL-1 and WL-N. Simultaneously, a voltage  $V_{DD}$  would be applied to SL-1, while SL-N would be grounded, resulting in a current flow as marked by the red arrow in Fig. 3.1(b). A simplified version of the resulting circuit configuration, considering one column consisting of one bit from the vector operand 'A' and the corresponding bit from the vector operand 'B', is shown in Fig. 3.1(c).

Fig. 3.1(c) is basically a voltage divider: the voltage at node 'mid' depends on the resistance states of MTJ-1 and MTJ-2. Note, in this circuit configuration the pinned-layer of MTJ-1 has a lower voltage than the free-layer, while for MTJ-2 the pinned-layer is at a higher voltage than the free-layer. This in turn implies, with reference to Fig. 2.1(b), that MTJ-1 has a higher energy barrier (EB) while MTJ-2 has a lowered energy barrier owing to the VCMA effect. As such, it is much easier to switch MTJ-2, while the state of MTJ-1 would remain intact due to the increase in its EB. Further, the voltage at node 'mid' would be higher (lower) when MTJ-1 is in the P (AP) state due to the voltage-divider effect. By appropriate choice of  $V_{DD}$  and the MTJ resistances, the circuit in Fig. 3.1(c) can be designed such that MTJ-2 switches from the P to the AP state only when MTJ-1 is in the P state. A higher voltage at node 'mid' (corresponding to the P state of MTJ-1) would imply enhanced lowering of the EB for MTJ-2, allowing the small current flowing through MTJ-2 to deterministically switch it from the P to the AP state as desired.
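To make the voltage-divider argument concrete, the following sketch computes the fraction of $V_{DD}$ that appears at node 'mid' for the four state combinations. It is a deliberately simplified estimate: the access-transistor resistances and the bias dependence of the MTJ resistance are neglected, the P-state resistance is normalised to 1, and the function name is hypothetical. Only the TMR of 125% is taken from Table 2.1.

```python
# Simplified resistive-divider estimate of V_mid / V_DD (transistors ignored).
TMR = 1.25                   # tunneling magnetoresistance from Table 2.1 (125%)
R_P = 1.0                    # P-state resistance, normalised (assumption)
R_AP = R_P * (1.0 + TMR)     # AP-state resistance from the usual TMR definition


def v_mid_fraction(r_mtj1, r_mtj2):
    """V_mid / V_DD with SL-1 at V_DD, SL-N grounded and MTJ-1 on the V_DD side."""
    return r_mtj2 / (r_mtj1 + r_mtj2)


states = {"P": R_P, "AP": R_AP}
for name1, r1 in states.items():
    for name2, r2 in states.items():
        frac = v_mid_fraction(r1, r2)
        print(f"MTJ-1={name1:2s} MTJ-2={name2:2s} V_mid/V_DD={frac:.3f}")
```

Under these assumptions, with MTJ-2 in the P state the node 'mid' sits at half of $V_{DD}$ when MTJ-1 is P and at a visibly smaller fraction when MTJ-1 is AP, which is the margin the design relies on.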

Note, it is due to the lowered EB of MTJ-2 that the small current flowing through the MTJs can switch MTJ-2, but not MTJ-1 (since the EB for MTJ-1 has increased due to its voltage polarity). The current flowing through MTJ-2 switches its state due to the STT effect, given the fact that the switching current requirement for MTJ-2 has been conditionally (only when MTJ-1 is in the P state) reduced due to the voltage at node 'mid'. This STT like switching behavior, as shown in Fig. 3.1(d), is evident from the magnetization dynamics of MTJ-2, simulated using the model described in the previous section. Note, the P to AP switching of MTJ-2 only when MTJ-1 is in the P state implements both rows 1 and 3 of Fig. 3.1(a). Specifically, when MTJ-1 is in the AP state, the voltage at node 'mid' is not high enough to sufficiently lower the EB of MTJ-2, thereby retaining its original state, corresponding to row 3. However, when MTJ-1 is in the P state, MTJ-2 switches to the AP state, corresponding to row 1.

The state of MTJ-2 (corresponding to the column B' in Fig. 3.1(a)) for the remaining rows 2 and 4 is the same as in column B, namely the AP state. Further, the current flow direction is such that it always tries to switch MTJ-2 to the AP state. Thus, for rows 2 and 4, MTJ-2 is initially in the AP state and the current flowing through MTJ-2 also tries to switch it to the AP state, so the state of MTJ-2 is



Fig. 3.2. (a) The truth table for NOT operation. (b) The array configuration showing the voltages at various BLs and WLs and the current flow during the stateful computation of the massively parallel NOT operation. (c) A typical magnetization dynamics showing the precessional switching behavior of the VCMA MTJs mimicking the NOT operation. On application of proper voltages, irrespective of the initial state of the magnetization direction (+z or -z), the magnetization vector switches by  $180^{\circ}$  thereby implementing the desired stateful NOT operation.

retained for both rows 2 and 4. As such, by merely activating WL-1 and WL-N and applying appropriate voltages on the SLs, *in-situ* stateful vector IMP operation can be achieved, leveraging the fact that the VCMA effect selectively lowers the EB for MTJ-2 based on its asymmetric voltage polarity.

### 3.2.2 Stateful parallel NOT gates

NOT is a one variable operation, therefore, let us consider a single bit-cell consisting of 1 transistor - 1 VCMA MTJ. In order to reverse the current state of the MTJ we can use the precessional switching dynamics of the VCMA effect. As explained earlier, when sufficient voltage is applied across the VCMA MTJ, the interface anisotropy decreases and in presence of an effective in-plane field the magnetization starts precessing around the hard axis as shown in Fig. 2.1(c). If the input voltage


pulse is clocked such that the magnetization has made a half cycle around the hard axis, the direction of magnetization would have been effectively reversed by  $180^{\circ}$.

Interestingly, irrespective of whether the initial state of the magnetization vector was pointing in the +z or the -z direction, when a sufficient positive voltage is applied to lower the interface anisotropy, the magnetization would start precessing around the hard-axis. This implies that when the magnetization vector has completed a half-cycle around the hard-axis, if it initially started from the +z direction (-z direction), it would now be pointing closer to the -z direction (+z direction). If the voltage pulse is turned OFF when the magnetization has made a half-cycle around the hard-axis, it would effectively have switched by 180°. Therefore, irrespective of the initial state of the MTJ, the magnetization direction would always be reversed if the input voltage pulse is clocked such that the magnetization has only completed a half-cycle around the hard axis.
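A back-of-the-envelope estimate of the required pulse width can be sketched as below. The sketch assumes an idealised picture in which the applied voltage fully cancels the perpendicular anisotropy and damping is neglected, so that the magnetization simply precesses about the in-plane field at the Larmor frequency. Only the 100 Oe field is taken from Table 2.1; the function names and the idealisation itself are assumptions, not a substitute for the LLGS simulation.

```python
import math

GAMMA = 1.76e7   # magnitude of the gyromagnetic ratio, rad/(Oe*s), CGS
H_EXT = 100.0    # in-plane external field from Table 2.1, Oe


def half_cycle_time(h_eff_oe):
    """Time (s) for half a Larmor precession about a field of h_eff_oe (Oe)."""
    return math.pi / (GAMMA * h_eff_oe)


def m_z(t, m_z0, h_eff_oe):
    """Idealised, undamped z-component of magnetization while the pulse is ON."""
    return m_z0 * math.cos(GAMMA * h_eff_oe * t)


t_half = half_cycle_time(H_EXT)
print(f"half-cycle pulse width ~ {t_half * 1e9:.2f} ns")

# The same pulse width reverses the magnetization from either initial state.
for start in (+1.0, -1.0):
    print(f"m_z: {start:+.0f} -> {m_z(t_half, start, H_EXT):+.0f}")
```

Under these assumptions the half-cycle time comes out a little under 2 ns, and the sign of $m_z$ is reversed regardless of the starting direction, which is the unipolar behaviour exploited for the NOT operation.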

This unipolar switching characteristic of the VCMA MTJ, wherein the magnetization always switches by 180° on application of an appropriate voltage pulse, can be used to construct a massively parallel vector NOT operation as shown in Fig. 3.2(b) and (c). Let us assume we have to do a NOT operation for all the bits corresponding to rows WL-1 and WL-N. Both WL-1 and WL-N would be pulled high to activate the access transistors, and a proper voltage  $V_{DD}$  needs to be applied to BL-1 through BL-N. This  $V_{DD}$  would be dictated by the VCMA MTJ characteristics such that the magnetization starts precessing around the hard-axis. Usually, the voltage required for VCMA based precessional switching is higher than the voltage requirement for STT-dominated switching [15]. After a predetermined time duration, corresponding to the half cycle precession of the magnetization, the WL and  $V_{DD}$  voltages would be pulled low, thereby reversing the state of all the MTJs connected to both WL-1 and WL-N.

It might be instructive to comment that the switching mechanism during the IMP operation, described in the previous sub-section, was STT dominated; the VCMA effect during the IMP operation merely reduced the EB such that the STT current could switch the device. In contrast, for the NOT operation, the switching dynamics is VCMA dominated, which results in precessional switching of the MTJs. The VCMA dominated switching dynamics is also evident from our simulation result shown in Fig. 3.2(c), which shows a typical magnetization trajectory during the precessional switching based NOT operation. Note, in the upper (lower) part of Fig. 3.2(c), the magnetization vector starts from the +z-axis (-z-axis) and makes approximately a half-cycle around the x-axis before it dampens and consequently settles down in the -z direction (+z direction). Therefore, irrespective of its initial direction, the magnetization vector is always reversed when it completes a half-cycle around the hard axis. The presence of both the STT and VCMA dominated regimes in the same MTJ device has been demonstrated experimentally in many works including [15].

In principle we can activate all the WLs in the memory array simultaneously, such that the entire memory array can be flipped in a massively parallel manner. However, in practice the number of WLs that can be simultaneously activated would be limited by the peripheral circuits and the current drivability of the drivers connected to the BLs and WLs. Nevertheless, multiple rows can be easily flipped in one cycle, resulting in a massively parallel stateful NOT operation. Further, one could argue that precise timing control of the voltage pulses is required for the proper functioning of the NOT operation and that, given circuit level variations, the write-error-rate (WER) for the proposed NOT operation would be exceptionally high. It is worth mentioning that such errors can be mitigated by proper circuit techniques. In fact, the authors in [39] were able to obtain a WER as low as 1e-14 for precessional switching in VCMA MTJs. A detailed description of the peripheral circuits and write-scheme used for mitigating the WER in precessional switching of VCMA MTJs can be found in [39]. Further, the WER for the AP to P precessional switching is slightly different from that for the P to AP precessional switching. The difference arises due to the existence of the small current flow through the VCMA MTJ favoring one particular switching direction as opposed to the other. However the difference is



Fig. 3.3. Based on the proposed stateful operations as described in the above sub-sections, an IMP and NOT operation can be completed in one cycle, whereas a two cycle operation can implement the NAND, NIMP and OR logic. Similarly, a three cycle operation can be used for the AND and NOR logic computation. For multi-cycle logic, the part of logic highlighted in red can be computed in the first cycle, the part in white can be computed in the second cycle, while the part highlighted in blue would be computed in the third cycle.

usually small and has been extensively studied in [33]. In summary, the precessional switching of the VCMA MTJs can be used as a massively parallel NOT operation.

### 3.2.3 Other Logic Gates

It has already been demonstrated that we can accomplish vector IMP and NOT operations in one cycle. In principle, since the IMP operation is a universal gate, the proposed scheme can be used for mapping any arbitrary Boolean computations. However, since NOT is a massively parallel operation, the IMP operation can be combined with the NOT operation to achieve various other basic Boolean gates. For example, as shown in Fig. 3.3, by using two cycles, stateful NAND/OR/NIMP logic operations can be accomplished. Further, if we assume three cycles, stateful AND/NOR operations can be computed using the proposed techniques. Note, as opposed to the stateful IMP logic in memristive crossbars [59], the present proposal has significant advantages due to the fact that the NOT operation can be achieved in a massively parallel manner, and in the usual 1-transistor-1-MTJ bit-cell at that, thereby enabling other stateful logic operations such as NAND/NOR *etc.*
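One possible way of composing the multi-cycle gates from the two single-cycle primitives can be checked exhaustively as sketched below. The particular sequences listed are an assumption chosen to match the cycle counts quoted above (two cycles for NAND/OR/NIMP, three for AND/NOR); they are not claimed to be the exact decomposition of Fig. 3.3, and all names are hypothetical. The result is always left in row B.

```python
from itertools import product


def imp(a, b):
    """One cycle: row B is overwritten with (NOT A) OR B; row A is unchanged."""
    return a, int((not a) or b)


def not_a(a, b):
    """One cycle: precessional NOT applied to row A in place."""
    return 1 - a, b


def not_b(a, b):
    """One cycle: precessional NOT applied to row B in place."""
    return a, 1 - b


# Assumed sequences (one primitive per cycle), result read from row B
SEQUENCES = {
    "NAND": [not_b, imp],
    "OR":   [not_a, imp],
    "NIMP": [imp, not_b],
    "AND":  [not_b, imp, not_b],
    "NOR":  [not_a, imp, not_b],
}

REFERENCE = {
    "NAND": lambda a, b: 1 - (a & b),
    "OR":   lambda a, b: a | b,
    "NIMP": lambda a, b: a & (1 - b),
    "AND":  lambda a, b: a & b,
    "NOR":  lambda a, b: 1 - (a | b),
}


def run(sequence, a, b):
    """Apply the primitives in order and return the final state of row B."""
    for step in sequence:
        a, b = step(a, b)
    return b


for name, sequence in SEQUENCES.items():
    for a, b in product((0, 1), repeat=2):
        assert run(sequence, a, b) == REFERENCE[name](a, b)
    print(f"{name}: verified in {len(sequence)} cycles")
```

Note that in this sketch the sequences that start with a NOT on row A leave that row inverted; restoring it would cost one additional NOT cycle.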

For the sake of completeness, note that the 1-transistor-1-VCMA MTJ array can still be used as a conventional memory block. Interested readers can refer to [39] on how the read and write operations can be accomplished in such a memory array. The present proposal of stateful computations augments the usual memory operations by *in-situ* logic computations thereby allowing one to overcome the von-Neumann bottleneck resulting in higher throughput and energy-efficiency.

### 3.2.4 Stateful XOR Gate



Fig. 3.4. (a) A truth table for XOR gate. The logic output B' retains its original value when the operand A is 'L', whereas if the operand A is 'H', the new value for B' is the complement of its original value B. (b) Figure shows the array structure used for implementing the XOR operation. The voltages on BLs represent the bits corresponding to the operand A, while the data stored in the MTJs represent the bits corresponding to the operand B. The values in the MTJs are inverted conditionally only if the bits corresponding to the operand A are 'H' *i.e.* only if the respective SLs are pulled high. Note, in the example shown, the bit value for A<sub>1</sub> is 'L', as such, BL-1 is kept low. Therefore, no current flows through the column corresponding to BL-1 and hence the bits corresponding to BL-1 consume no energy.

Here we describe how one can implement a stateful XOR operation using the precessional switching dynamics of the VCMA mechanism. Unlike the IMP and NOT operations proposed above, the XOR operation requires representation of one of the operands as an electrical input, *i.e.* one of the operands is represented by the voltage on the bit-line (BL). This implies that if we were to compute the XOR of two vector operands stored in two different rows of the memory array, one of the rows would have to be read first, then converted into an electrical input (a voltage in this case) and applied to the BL before the XOR operation can be completed. This results in a requirement of 'read before compute', as opposed to the IMP and NOT operations. However, an interesting possibility is the fact that the XOR operation exploits the precessional switching dynamics and therefore has the potential of enabling massively parallel XOR operations similar to the NOT operation.

In order to understand the functionality of the stateful XOR operation, let us consider the truth table of the XOR gate as shown in Fig. 3.4(a). The key observation with respect to the truth table is that the operand B retains its original value when the operand A is 'L' (highlighted in blue), whereas when the operand A is 'H' the state of the operand B has to be inverted (highlighted in red). This implies the XOR operation can be seen as conditional NOT operation, wherein the operand B is inverted only when the operand A is 'H'.

We have already seen that the precessional switching dynamics of the VCMA mechanism can be used to perform the NOT operation. Based on such a precessional mechanism, the proposed bit-wise, stateful, parallel XOR operation can be performed as shown in Fig. 3.4(b). The operand A is represented as the voltages on the BLs. For example, if the Nth bit of the vector operand A is $A_N$ = 'H', BL-N would be pulled up to $V_{DD}$, and if $A_1$ = 'L', BL-1 would remain at 0 volts. The row WL-1 that is supposed to store the vector operand B would then be activated by pulling WL-1 to a high voltage. By ensuring that WL-1 is ON only for a time duration such that the pulse width corresponds to the half-cycle of the magnetization vector, the bits of operand B can be conditionally inverted based on whether the corresponding bit of operand A was 'H' or 'L', thereby completing the XOR operation.
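At the logic level, the conditional inversion described above can be sketched as follows. The function name and the example bit patterns are hypothetical; the sketch only checks that "invert B where the bit-line is high" coincides with the bit-wise XOR.

```python
def stateful_xor_row(bitline_a, row_b):
    """Invert each stored bit of row_b whose bit-line (operand A) is driven high."""
    return [1 - b if a else b for a, b in zip(bitline_a, row_b)]


a = [1, 0, 1, 1, 0]   # operand A, applied as BL voltages ('H' = 1)
b = [1, 1, 0, 1, 0]   # operand B, stored in the row selected by WL-1

result = stateful_xor_row(a, b)

# Conditional NOT is equivalent to the bit-wise XOR of the two operands
assert result == [x ^ y for x, y in zip(a, b)]
print(result)
```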

A major benefit of the proposed stateful XOR operation is the fact that we apply a non-zero voltage to the BL only if the corresponding bit of the operand A is 'H'. As such, for those cases where the corresponding bit of the operand A is 'L', the concerned bit-cells consume no energy as both the SL and the BL for those bits are at zero volts. Statistically, assuming 'H' and 'L' bits are equally likely, this would reduce the energy consumption by almost 50%. Given the extensive use of the XOR operation in many compute applications and the fact that implementing XOR using CMOS transistors is expensive in terms of both energy and area, the present proposal potentially paves the way for low energy and low area XOR in-memory computations. Another benefit of the proposed XOR operation is the possibility of doing a massively parallel operation similar to the NOT operation. Suppose the operand A is an encryption key that has to be XORed with all the data stored in multiple rows of the memory array. In principle, all the WLs can be simultaneously activated, such that all the bits in the corresponding rows flip conditionally based on the voltages at the respective SLs, thereby completing the XOR operation for multiple rows in a single cycle. The energy consumption per bit for the proposed XOR operation would be the same as for the NOT operation, except for the fact that in the 50% of cases when the bits of operand A are zeros, no energy would be consumed.
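The multi-row key example and the associated energy argument can be sketched as below. The per-bit energy is the NOT value reported in Table 3.1, used here on the stated premise that an XOR flip costs the same as a NOT; the key, the stored rows and the function names are hypothetical, and the energy of reading operand A and of driving the WLs is not included.

```python
E_NOT_PJ = 0.067  # per-bit NOT energy from Table 3.1, in pJ


def xor_rows(key_bits, rows):
    """XOR every stored row with the key applied on the bit-lines (one cycle)."""
    return [[b ^ k for k, b in zip(key_bits, row)] for row in rows]


def xor_energy_pj(key_bits, n_rows, e_bit_pj=E_NOT_PJ):
    """Energy estimate: only columns whose key bit is 'H' draw any current."""
    return sum(key_bits) * n_rows * e_bit_pj


key = [1, 0, 1, 1, 0, 0, 1, 0]
rows = [
    [0, 1, 1, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 0, 1, 1, 0],
]

encrypted = xor_rows(key, rows)
e_xor = xor_energy_pj(key, len(rows))
e_not = len(key) * len(rows) * E_NOT_PJ   # unconditional NOT of the same bits

print(encrypted)
print(f"XOR energy ~ {e_xor:.3f} pJ versus {e_not:.3f} pJ for a full NOT")
```

For a key with an equal number of 'H' and 'L' bits, as in this example, the estimate is half of the energy of inverting every bit.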

### 3.3 Results

Using the comprehensive simulation model presented in Fig. 2.3, we evaluate the functionality and performance of the proposed in-memory vector computations. Note, during the process of magnetization switching, the resistance of the MTJ keeps changing, which in turn changes the voltage across the MTJ. Therefore, both the STT and the VCMA strengths are functions of the instantaneous direction of magnetization. In order to properly capture these effects, a self-consistent SPICE model like the one described in Fig. 2.3 is required, as opposed to mixed mode models that solve decoupled LLGS and resistance equations separately.



Fig. 3.5. (a) Probability of B's final state being 'H' (or digital '1') for the four initial cases of A and B (00, 01, 10, 11) in the vector IMP operation, as a function of the voltage pulse width. At a pulse width of  $\sim 25$ ns, the correct IMP result is obtained. (b) Probability of inverting the state of the VCMA MTJ due to precessional switching as a function of the pulse duration. The switching probability peaks at  $\sim 2$ ns due to the half-cycle rotation of the magnetization.

The vector IMP operations are performed using the STT-dominated switching of MTJs. In performing an IMP operation on vectors A and B, the current flows from the bit-cells storing bits corresponding to operand A to the bit-cells corresponding to operand B, eventually replacing vector B with the result of the bit-wise IMP operation (refer to Fig. 3.1). Also, the negative-VCMA effect on bit-cells storing operand A prevents them from switching their state. Fig. 3.5(a) shows the probability of B's final state - which represents the result - being '1' (or 'H' or AP) for the four possible A and B inputs '00', '01', '10' and '11', as a function of the applied voltage pulse width. The simulation is done for various runs in presence of stochastic thermal variations. It can be observed that when the initial state of B is 'H' or AP (for inputs '01' and '11'), the final state is also AP, irrespective of A's state. This is because the direction of the current flow restricts B from switching from the AP to the P state. On the other hand, for the input '10', B never switches its state since the current flowing through the bit-cells in this case is designed to be lower than the critical current required for STT switching, given the fact that the voltage across MTJ-2 is not high enough to sufficiently lower its EB. However, for the input '00', B switches with a probability of  $\sim 1$  for a voltage pulse width of  $\sim 25$ ns, thus verifying the functionality and robustness of the bit-wise IMP operation. The average energy consumption per-bit and latency of the IMP operation are tabulated in Table 3.1.

While IMP uses STT-dominated switching, the NOT operation is primarily VCMA-dominated. As described earlier, the magnetization starts precessing around the hard-axis once a sufficient voltage is applied across the MTJ (see Fig. 3.2(c)). Note that the  $V_{DD}$  for the NOT operation is specifically chosen so as to ensure VCMA-dominated precessional dynamics. Fig. 3.5(b) shows the switching probability as a function of voltage pulse width, in presence of thermal variations. The switching probability shows an oscillatory behavior since the final state of the MTJ depends on the magnetization vector direction at the instant when the voltage is turned off. Such oscillating switching probability is typical for precessionally switched magnets. When the magnetization makes a half-cycle of precession ( $\sim 2$  ns) around the hard-axis, a switching probability close to 1 is achieved, thus confirming the expected functionality for the NOT operation. The presented figure is for the P to AP switching; a similar oscillating probability was also obtained for the AP to P switching. Note that the NOT operation is massively parallel. Even multiple vectors can be inverted simultaneously, by activating the corresponding WLs and SLs of the bit-cells. Table 3.1 enumerates the energy consumption per-bit and latency of the NOT operation.

Table 3.1. Average energy consumption per-bit and latency in the IMP and NOT vector operations.

| Vector operation | Average energy per bit | Latency | $V_{DD}$ |
|------------------|------------------------|---------|----------|
| IMP              | 1.22 pJ                | 25 ns   | 1.7 V    |
| NOT              | 0.067 pJ               | 2 ns    | 0.8 V    |

### 3.4 Summary

The conventional von-Neumann computing architecture fails to deliver the energy and throughput efficiency required by emerging data-intensive applications like artificial intelligence, IoT, *etc.* Enabling in-memory computations is being hailed by the research community as a promising technique with the potential to go beyond the von-Neumann computing model. In this chapter, we have proposed *in-situ*, in-memory Boolean stateful computations by leveraging the physics of voltage controlled magnetic anisotropy in MTJs. The voltage asymmetry of VCMA based MTJs has been used to propose a stateful IMP operation, while the precessional switching dynamics have been exploited for constructing massively parallel NOT and XOR operations. Further, various other gates including AND, OR, NAND, NOR and NIMP can be easily computed using multi-cycle operations. Our results have been verified by a detailed self-consistent magnetization dynamics and resistance model. In addition, the present proposal does not require any changes in the basic magnetic device or the bit-cell circuit, thereby making it feasible from a manufacturability point of view.
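As an illustration of how the remaining gates could follow from the native IMP and NOT operations, the sketch below composes them using standard Boolean identities. It is only a functional model: the function names are hypothetical, and the actual multi-cycle sequencing used in the array is not specified here.

```python
# Functional sketch: gates composed from IMP and NOT (bits are 0 or 1).
# The decompositions are textbook Boolean identities, not necessarily the
# exact multi-cycle sequences used in the proposed memory array.

def imp(a: int, b: int) -> int:
    """Material implication: (NOT a) OR b."""
    return (1 - a) | b

def not_(a: int) -> int:
    return 1 - a

def or_(a: int, b: int) -> int:
    return imp(not_(a), b)        # a OR b = (NOT a) IMP b

def nand(a: int, b: int) -> int:
    return imp(a, not_(b))        # a NAND b = a IMP (NOT b)

def and_(a: int, b: int) -> int:
    return not_(nand(a, b))

def nor(a: int, b: int) -> int:
    return not_(or_(a, b))

def nimp(a: int, b: int) -> int:
    return not_(imp(a, b))

def vec_imp(a_vec, b_vec):
    """Bit-wise IMP on two equal-length vectors; models B being overwritten."""
    return [imp(a, b) for a, b in zip(a_vec, b_vec)]

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, imp(a, b), or_(a, b), and_(a, b), nand(a, b), nor(a, b))
```

Each composed gate costs at most one extra NOT cycle on top of a single IMP in this model, which is consistent with the multi-cycle nature of these operations.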

# 4. MAGNETO-ELECTRIC SWITCHING MECHANISM AND MODELING



Fig. 4.1. (a) A graphical representation of a multi-ferroic material. Multi-ferroics are materials that exhibit more than one ferroic order (ferro-electricity, ferro-magnetism and ferro-elasticity). (b-c) A ferro-magnet in physical contact with an ME oxide. When an electric field is applied in the +z direction, the ferro-magnet switches to the +x direction and *vice-versa*. (d) Schematic of an ME-MTJ. By applying an appropriate voltage across the ME oxide, the state of the MTJ can be changed from parallel (P) to anti-parallel (AP).

In this chapter, we introduce the magneto-electric (ME) effect, a physical mechanism that enables pure voltage driven switching of ferro-magnets. In the following chapters, we show how the ME effect can be exploited to construct neuro-mimetic as well as cascadable Boolean logic devices.

### 4.1 Introduction to Magneto-electric Switching of Ferro-magnets

The ME effect is the physics of generating magnetization from an applied electric field [14], [62]. ME devices usually consist of a single phase or composite multi-ferroic material (for example BiFeO<sub>3</sub> [63], BaTiO<sub>3</sub> [64]) in contact with a nano-magnet. Application of an electric field to the multi-ferroic material results in an effective magnetic field experienced by the nano-magnet. If the generated magnetic field is strong enough, the magnetization of the nano-magnet can be reversed. This electric field driven switching of nano-magnets shows better energy efficiency and speed as compared to the classic current induced spin-transfer-torque switching [5].

In the case of a single phase ME oxide (like BiFeO<sub>3</sub>), the switching of the nano-magnet due to an applied electric field can be explained as follows [63]. BiFeO<sub>3</sub> is a multi-ferroic material. A multi-ferroic is a material that exhibits more than one ferroic order (ferro-electricity characterized by electric polarization (P), ferro-magnetism characterized by magnetization (M) and ferro-elasticity characterized by strain $(\epsilon)$). In a multi-ferroic material, more than one ferroic order can be coupled to each other, as shown schematically in Fig. 4.1(a). Ferro-electricity in BiFeO<sub>3</sub> arises due to the shift of the Bi<sup>3+</sup> cations owing to their hybridization with the surrounding oxygen atoms [65]. The electric polarization of BiFeO<sub>3</sub>, which is coupled to the (anti) ferromagnetism of the constituent Fe atoms, can be switched by the application of an electric field. Further, the (anti) ferromagnetism of BiFeO<sub>3</sub> can be coupled to the ferro-magnetism of an underlying nano-magnet. The magnetization of the nano-magnet can therefore be switched in response to the electric field applied across BiFeO<sub>3</sub> [63]. The various coupling mechanisms that lead to electric-field driven reversal of the magnetization direction in the underlying ferro-magnet are currently a topic of intense research [63], [64].

Nevertheless, the efficiency of the ME effect is usually abstracted by the ME coefficient, denoted as $\alpha_{ME}$ [63]. $\alpha_{ME}$ is the ratio of the magnetic field generated per unit applied electric field. Experimentally, an $\alpha_{ME}$ of $1 \times 10^{-7}$ s m<sup>-1</sup> has been reported in the literature [63]. Schematically, an ME switched nano-magnet is shown in Fig. 4.1(b-c). When an electric field is applied in the +z direction, the nano-magnet switches to the +x direction due to the ME effect. If the direction of the applied electric field is reversed, the nano-magnet switches to the -x direction. Thus, pure voltage driven switching of the ferro-magnet can be achieved by using the ME effect, resulting in a highly energy-efficient write mechanism. Alongside the ME switched magnet in Fig. 4.1(b-c), we also show in Fig. 4.1(d) a conventional MTJ stack, consisting of an oxide-spacer separating two nano-magnets, grown on top of an ME oxide. A parallel alignment of the magnetization directions in the two nano-magnets results in a low resistance parallel (P) state, while an anti-parallel (AP) alignment leads to a high resistance state. The difference between the parallel and anti-parallel resistance is usually indicated by a term called the Tunnel Magneto-Resistance (TMR) ratio. In order to maximize the TMR, MgO is usually used as the oxide-spacer in the MTJ stack. The use of MgO as the oxide-spacer can be justified from first principles analysis [66], which indicates that the coupling of the Bloch states between the nano-magnet and MgO has an important role in deciding the overall resistance of the MTJ stack.
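For reference, a commonly used definition of the TMR ratio is given below. It is stated here as an assumption, since the exact normalization adopted in this work is not spelled out in the text.

$$\mathrm{TMR} = \frac{R_{AP} - R_{P}}{R_{P}}$$

Here $R_{AP}$ and $R_{P}$ denote the resistance of the MTJ stack in the anti-parallel and parallel states, respectively.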

Thus, we are dealing with two oxides: the ME oxide for switching the nano-magnet and MgO for reading the state of the MTJ stack. A pure voltage driven MTJ can be envisioned as shown in Fig. 4.1(d), which consists of an MTJ stack whose free-layer is in physical contact with the ME oxide. A positive voltage on the ME oxide results in the parallel (P) state of the MTJ and a negative voltage results in the anti-parallel (AP) state. The resulting device, called the ME-MTJ, exhibits the following desirable characteristics: 1) the ME-MTJ is based on pure voltage driven switching and hence results in highly energy-efficient write operations; 2) the ME-MTJ has de-coupled read/write paths. In the next section, we describe the modeling and key device characteristics of ME-MTJs.

### 4.2 Modeling and Simulation

Next, we describe the modeling and simulation framework that was developed to evaluate the ME switching of ferro-magnets. The device level simulation framework consisted of coupled magnetization dynamics and electron transport models. The required voltage and the time taken for deterministic switching of the nano-magnet due to the ME effect were obtained from stochastic magnetization dynamics equations including thermal noise. On the other hand, the resistance of the MTJ stack in the parallel and anti-parallel states, as a function of the applied voltage, was estimated using the non-equilibrium Green's function (NEGF) formalism [67].

Under the mono-domain approximation, the magnetization dynamics were modeled using the well-known phenomenological equation called the *Landau-Lifshitz-Gilbert* (LLG) equation [34]. The LLG equation can be written as [35]

$$\frac{\partial \widehat{m}}{\partial \tau} = -\widehat{m} \times \vec{H}_{EFF} - \alpha \widehat{m} \times \left(\widehat{m} \times \vec{H}_{EFF}\right)$$
(4.1)

where $\tau$ is $\frac{|\gamma|}{1+\alpha^2}t$. In (4.1), $\alpha$ is the Gilbert damping constant, $\gamma$ is the gyromagnetic ratio, $\widehat{m}$ is the unit vector in the direction of the magnetization, $t$ is time and $\vec{H}_{EFF}$ is the effective magnetic field. $\vec{H}_{EFF}$ can be written as

$$\vec{H}_{EFF} = \vec{H}_{demag} + \vec{H}_{interface} + \vec{H}_{thermal} + \vec{H}_{ME}$$
(4.2)

where $\vec{H}_{demag}$ is the demagnetization field due to shape anisotropy, $\vec{H}_{interface}$ is the field due to interfacial perpendicular anisotropy, $\vec{H}_{thermal}$ is the stochastic field due to thermal noise and $\vec{H}_{ME}$ is the field due to the ME effect.

 $\vec{H}_{demag}$  can be written in SI units as [36]

$$\vec{H}_{demag} = -M_S(N_{xx}m_x\hat{x}, N_{yy}m_y\hat{y}, N_{zz}m_z\hat{z})$$
(4.3)

where $m_x$, $m_y$ and $m_z$ are the components of the magnetization unit vector in the x, y and z directions, respectively. $N_{xx}$, $N_{yy}$ and $N_{zz}$ are the demagnetization factors for a rectangular magnet, estimated from the analytical equations presented in [68]. $M_S$ is the saturation magnetization. The interfacial anisotropy can be represented as [13]

$$\vec{H}_{interface} = (0\hat{x}, 0\hat{y}, \frac{2K_i}{\mu_0 M_S t_{FL}} m_z \hat{z})$$
(4.4)

where $K_i$ is the effective energy density for interface perpendicular anisotropy and $t_{FL}$ is the thickness of the free layer. As mentioned earlier, the ME effect can be abstracted through the parameter $\alpha_{ME}$ [70]

$$\vec{H}_{ME} = \left(\alpha_{ME}\left(\frac{V_{ME}}{t_{ME}}\right)\hat{x}, \ 0\hat{y}, \ 0\hat{z}\right)$$
(4.5)

where $\alpha_{ME}$ is the coefficient for the ME effect, $V_{ME}$ is the voltage applied across the terminals of the ME capacitor and $t_{ME}$ is the thickness of the ME oxide, which is responsible for the induction of a magnetic field in response to an applied electric field. $\vec{H}_{ME}$ was multiplied by a suitable constant for unit conversion.

Fig. 4.2. (a) A typical evolution of the magnetization components $m_x$, $m_y$ and $m_z$ on application of a voltage pulse. The magnet is being switched from the +x direction to the -x direction. (b) The parallel and anti-parallel resistance obtained from our NEGF model [40] and benchmarked to experimental data from [69]. The resistance-voltage characteristics of Fig. 4.2(b) were abstracted into a behavioral model for simulation.

The thermal field was included by the following stochastic equation [37]

$$\vec{H}_{thermal} = \vec{\zeta} \sqrt{\frac{2\alpha k_B T}{|\gamma| M_S Vol \ dt}}$$
(4.6)

where  $\vec{\zeta}$  is a vector with components that are zero mean Gaussian random variables with standard deviation of 1. *Vol* is the volume of the nano-magnet, *T* is ambient temperature, *dt* is simulation time step and  $k_B$  is Boltzmann's constant.

Equations (4.1)-(4.6) constitute a set of stochastic differential equations. This system of equations was solved numerically using Heun's method [71]. The solution of this set of equations enables us to obtain the required voltage as well as the switching time for the nano-magnets used in our simulations. The various device dimensions and material parameters used in our simulations are summarized in Table 4.1. A typical evolution of the magnetization components in response to a voltage pulse is shown in Fig. 4.2(a).

| Parameters                                      | Value                       |
|-------------------------------------------------|-----------------------------|
| Magnet Length $(L_{mag})$                       | 45 nm $\times$ 2.5          |
| Magnet Width $(W_{mag})$                        | 45 nm                       |
| Magnet Thickness $(t_{FL})$                     | 2.5 nm                      |
| ME Oxide Thickness $(t_{ME})$                   | 5 nm                        |
| Saturation Magnetization $(M_S)$                | 1257.3 kA/m [38]            |
| Gilbert Damping Factor $(\alpha)$               | 0.03                        |
| Interface Anisotropy $(K_i)$                    | 1 mJ/m$^2$ [38]             |
| ME Co-efficient $(\alpha_{ME})$                 | $0.15/c^{*}\ ms^{-1}$       |
| Relative Di-electric constant $(\epsilon_{ME})$ | 500 [70]                    |
| Temperature $(T)$                               | 300 K                       |
| CMOS Technology                                 | 45 nm PTM [42]              |

Table 4.1. Summary of parameters used in our simulations for analyzing the ME effect

\*c = Speed of light.
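The numerical procedure can be illustrated with a minimal sketch of a Heun (predictor-corrector) integration of Eqs. (4.1)-(4.6) using the Table 4.1 parameters. Several items are assumptions made only for illustration and are not taken from this work: the demagnetization factors `N` (the text uses the analytical expressions of [68]), the value of `GAMMA`, the applied voltage, the small initial tilt, and the division by $\mu_0$ used to express the ME and thermal fields in A/m.

```python
import numpy as np

MU0 = 4e-7 * np.pi      # vacuum permeability (T m / A)
GAMMA = 1.76e11         # gyromagnetic ratio (rad s^-1 T^-1), assumed value
KB = 1.380649e-23       # Boltzmann constant (J / K)
C = 2.998e8             # speed of light (m / s)

# Table 4.1 values. N holds ASSUMED demagnetization factors (Nxx, Nyy, Nzz).
P = dict(Ms=1257.3e3, alpha=0.03, Ki=1e-3, t_fl=2.5e-9, t_me=5e-9,
         alpha_me=0.15 / C, vol=112.5e-9 * 45e-9 * 2.5e-9,
         N=np.array([0.03, 0.08, 0.89]))

def h_eff(m, v_me, h_th, p=P):
    """Effective field of Eq. (4.2), in A/m."""
    h_demag = -p["Ms"] * p["N"] * m                                  # Eq. (4.3)
    h_int = np.array([0.0, 0.0,
                      2 * p["Ki"] / (MU0 * p["Ms"] * p["t_fl"]) * m[2]])  # Eq. (4.4)
    # Eq. (4.5); dividing by MU0 is the assumed unit conversion (T -> A/m).
    h_me = np.array([p["alpha_me"] * v_me / p["t_me"] / MU0, 0.0, 0.0])
    return h_demag + h_int + h_me + h_th

def dm_dt(m, h, alpha):
    """Right-hand side of Eq. (4.1), rescaled from tau to real time t."""
    pref = GAMMA * MU0 / (1 + alpha ** 2)
    mxh = np.cross(m, h)
    return -pref * (mxh + alpha * np.cross(m, mxh))

def simulate(v_me, t_end=5e-9, dt=1e-12, temp=300.0, seed=0, p=P):
    """Integrate the stochastic LLG equation with Heun's method."""
    rng = np.random.default_rng(seed)
    m = np.array([0.995, 0.1, 0.0])          # near +x with a small assumed tilt
    m /= np.linalg.norm(m)
    # Eq. (4.6) amplitude (tesla), converted to A/m by the assumed 1/MU0 factor.
    sigma = np.sqrt(2 * p["alpha"] * KB * temp
                    / (GAMMA * p["Ms"] * p["vol"] * dt)) / MU0
    traj = [m]
    for _ in range(int(round(t_end / dt))):
        h_th = sigma * rng.standard_normal(3)     # same noise in both stages
        k1 = dm_dt(m, h_eff(m, v_me, h_th, p), p["alpha"])
        m_pred = m + dt * k1                      # predictor (Euler) step
        k2 = dm_dt(m_pred, h_eff(m_pred, v_me, h_th, p), p["alpha"])
        m = m + 0.5 * dt * (k1 + k2)              # corrector (trapezoidal) step
        m /= np.linalg.norm(m)                    # keep |m| = 1
        traj.append(m)
    return np.array(traj)

if __name__ == "__main__":
    traj = simulate(v_me=-1.5)                    # hypothetical voltage pulse
    print("final m:", traj[-1])
```

With these assumed values, a negative voltage produces an ME field along -x that exceeds the in-plane shape anisotropy field, so the magnet relaxes from +x towards -x. The numbers are not intended to reproduce Fig. 4.2(a).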

It is to be noted that multi-ferroics and the ME effect are currently an active research area [14], [62]. Although many theoretical works have proposed models for describing the origin and the behavior of the ME effect, a detailed understanding of its physics is still under intense research investigation. Further, although experimental demonstrations of ME switched ferro-magnets can be found in the literature, as in [63], a global magnetization reversal by the ME effect has remained elusive. Due to the lack of such experimental results, the ME parameters mentioned in Table 4.1 are not tied to any particular material system or experiment. Rather, they are predictive parameters that abstract the details of the ME switching into a simple model, which can be used in conjunction with the LLG equation to predict the switching time and energy. A more rigorous benchmarking of ME parameters for future logic devices can be found elsewhere, as in [72]. Note that the functionality of the devices proposed in the following chapters does not depend on the exact values of the ME parameters used for simulation. Therefore, our simple ME model serves the purpose of checking the feasibility of our present proposals from the device as well as the circuit perspective.

The magnetization dynamics equations were coupled with the resistance of the MTJ stack, which was modeled using the non-equilibrium Green's function (NEGF) formalism. A detailed description of the NEGF model for MTJs can be found in [40]. The parallel and anti-parallel resistances obtained from our experimentally benchmarked NEGF equations were then abstracted into a behavioral model. The results from the NEGF based resistance model are shown in Fig. 4.2(b). The magnetization dynamics equations, along with the resistance of the MTJ stack obtained from the NEGF equations, provided a coupled device model that can be used for characterizing the proposed logic-device.

### 4.3 Device Characteristics

Using our simulation framework presented in the previous section, we highlight some of the key characteristics of the ME based switching of ferro-magnets viz. a) scalability and b) switching speed.



Fig. 4.3. (a) A typical trajectory followed by the magnetization vector when switched using the STT mechanism. The STT mechanism initially acts as an anti-damping torque and subsequently as a damping torque, thereby switching the state of the ferro-magnet. (b) A typical trajectory followed by the magnetization vector when switched using the ME mechanism. With the application of an external voltage, the magnetization tries to orient itself towards the direction of the ME field and finally damps, resulting in a 180° switching of the ferro-magnet.

#### 4.3.1 Scalability

The ME-MTJ in Fig. 4.1(d) shows good scaling in terms of voltage and energy requirements. This desirable scaling trend can be attributed to the decrease in the ME capacitance with the decrease in the ME oxide area. In fact, the switching energy (proportional to $CV^2$, C being the ME oxide capacitance and V the voltage required to switch the nano-magnet) decreases linearly with the scaling of the ME oxide area and has a square law dependence on voltage scaling. Additionally, the device also requires an MTJ stack for the read-out operation. Recently, many experimental works [73], [74] have demonstrated scaling of the MTJ structure to as small as 20 nm in diameter. Thus, in terms of the areal dimensions of the ME oxide as well as the MTJ stack, the proposed device shows a desirable scaling trend.
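The scaling argument can be made concrete with a small parallel-plate estimate. The sketch below assumes that the ME oxide covers the full magnet footprint of Table 4.1 and uses a hypothetical 1 V switching voltage. Neither assumption is taken from the text, and the function names are illustrative.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity (F/m)

def me_capacitance(area_m2, t_me=5e-9, eps_r=500.0):
    """Parallel-plate capacitance of the ME oxide (Table 4.1 defaults)."""
    return EPS0 * eps_r * area_m2 / t_me

def switching_energy(area_m2, v, t_me=5e-9, eps_r=500.0):
    """Energy proportional to C V^2, as used in the scaling discussion."""
    return me_capacitance(area_m2, t_me, eps_r) * v ** 2

if __name__ == "__main__":
    area = 112.5e-9 * 45e-9      # assumed: oxide area equals magnet footprint
    v = 1.0                      # hypothetical switching voltage (V)
    print(f"C = {me_capacitance(area) * 1e15:.2f} fF")
    print(f"E = {switching_energy(area, v) * 1e15:.2f} fJ")
    # Halving the area halves the energy; halving the voltage quarters it.
    print(switching_energy(area / 2, v) / switching_energy(area, v))
    print(switching_energy(area, v / 2) / switching_energy(area, v))
```

Under these assumptions the capacitance is roughly 4.5 fF, so the energy is in the femto-joule range for voltages around 1 V.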

#### 4.3.2 Switching Speed

ME driven magnetization dynamics can lead to sub-1 ns switching speed, as compared to the STT mechanism, which typically requires 5-10 ns of switching time. For STT switching, the STT effect acts as an anti-damping torque initially and as a damping torque subsequently, thereby switching the nano-magnet. A typical STT switching curve is shown in Fig. 4.3(a). On the other hand, ME switching follows much simpler dynamics, similar to the switching process due to an external magnetic field, as shown in Fig. 4.3(b). For the material parameters shown in Table 4.1, the nano-magnets used in our simulations could switch within 500 ps. Thus, as compared to other non-volatile logic devices based on the injection of spin current and the STT switching mechanism [75], the proposed logic-device shows a faster switching speed. In practice, the total switching time would be the sum of the switching time of the ferro-electric polarization in the ME oxide and the ferro-magnetic switching time estimated from our LLG equations. Theoretically, the ferro-electric switching time can be of the order of $\sim$70 ps [76], which is much less than the ferro-magnetic switching time estimated from our simulations. We have, therefore, neglected the ferro-electric switching time in our calculations.

In the next chapter, we describe a neuro-mimetic device that exhibits the stochastic leaky-integrate-fire dynamics of biological neurons using the ME switching mechanism described in this chapter.

# 5. A STOCHASTIC LEAKY-INTEGRATE-FIRE NEURON USING MAGNETO-ELECTRIC SWITCHING

### 5.1 Introduction and Related Work



Fig. 5.1. (a) A biological neuron with interconnecting synapses. (b) A representative model for a biological neural network.  $V_i$ s are the input spikes generated by pre-neurons. The neuron emits a spike, if the membrane potential  $(V_{mem})$  crosses a certain threshold  $(V_{th})$ . The weighted summation is usually carried out by a resistive crossbar array. Our proposed ME device aims to emulate the LIF and thresholding behavior of a biological neuron.

The principles governing the functioning of the human brain have fascinated the research community due to its computational efficiency in solving classification and recognition problems. Numerous efforts are being made towards hardware implementations of devices, circuits and systems that can mimic the computations of the human brain, thereby leading to energy-efficient neuromorphic systems. The primitives of neuromorphic hardware comprise *neurons* and *synapses*, which are inspired by their biological counterparts.

A biological neuron is shown in Fig. 5.1(a). Based on the signals received through the dendrites, the soma of the neuron generates *action potentials* or *spikes*, which are carried by the axon from one neuron to another through connections called synapses. The electrical dynamics of the neuron are controlled by a wide variety of ion channels that allow ions like Na<sup>+</sup>, K<sup>+</sup>, Cl<sup>-</sup>, *etc.* to move in and out of the neuronal cell. The voltage difference between the interior of a neuron and its surroundings is called the *membrane potential*. When a particular ion channel opens on being excited, the membrane potential of the neuron increases due to the inflow of positive ions. Simultaneously, the deviation from the equilibrium potential causes diffusion of ions, resulting in a slow decrease of the membrane potential. If the membrane potential is sufficiently built-up beyond a certain *threshold*, a positive feedback process kicks in and the neuron emits a spike, characterized by the generation of a sharp electrical potential for a small period of time [77].

A typical spiking neural network (SNN) is composed of a set of *pre* (input) and *post* (output) neurons connected through synapses. Upon sufficient excitation, the neurons generate *spikes* and encode information in the timing or frequency of the spikes. Fig. 5.1(b) shows a widely accepted simplified model for biological neural networks. The input pre-neuronal spikes are altered by the associated weights $W_i$ and summed up as shown in the figure. The resulting output alters the membrane potential ($V_{mem}$) of the post-neuron in a typical leaky-integrate fashion [77], as shown in the inset of Fig. 5.1. The post-neuron emits a spike if the membrane potential crosses a certain threshold ($V_{th}$). The membrane potential is subsequently reset, and the post-neuron is prevented from spiking for a certain duration of time known as the refractory period. Further, various neuro-science experiments and computational models [78] have convincingly demonstrated the relevance of the stochastic firing behavior of biological neurons. Therefore, a simplistic yet realistic model that emulates the dynamics of a biological neuron should include stochasticity along with the leaky-integrate-fire dynamics.

Realizations of large-scale SNNs on general-purpose von-Neumann computing machines are power inefficient compared to the human brain. Broadly, hardware implementations of SNNs can be classified into two categories: i) CMOS digital/analog implementations and ii) SNNs using emerging devices. Digital and analog CMOS implementations of spiking neurons can be found in [79] and [80], respectively. A major issue concerning CMOS SNNs is the standby leakage power consumption. Hence, non-volatile devices mimicking neuronal and synaptic dynamics are well suited for sparse systems like SNNs due to their negligible leakage power dissipation. Phase change devices have recently been demonstrated to emulate the integrate-fire dynamics [81], but they lack the leaky behavior. On the other hand, the magnetization dynamics of a nano-magnet were shown to follow the leaky-integrate behavior of biological neurons [82]. However, since the leaky-integrate behavior stems from the physics of the magnetization dynamics dictated by the material properties, it is difficult to control and tune as required.

In this chapter, we propose a novel non-volatile spin based *stochastic-leaky-integrate-fire* neuron using the magneto-electric (ME) switching of ferro-magnets. The voltage across the ME capacitor exhibits the typical leaky-integrate dynamics and switches the ferro-magnet underlayer. In addition, the presence of thermal noise results in stochastic switching dynamics, which can be used to emulate the stochastic firing behavior of biological neurons.

### 5.2 Proposed Stochastic Leaky-Integrate-Fire Neuron

The proposed low-energy neuronal device is shown in Fig. 5.2. It consists of a ferro-magnet under a thick ME oxide. The metal contact to the ME oxide and the underlying ferro-magnet form the two plates of the ME capacitor. Further, we assume that a sufficient positive voltage on the ME capacitor switches the ferro-magnet to the -x direction and *vice-versa*.

Fig. 5.2. Schematic of the proposed LIF ME neuron. The thick ME oxide (5 nm), sandwiched between the metal contact and the ferro-magnet, acts as a capacitor. Diode connected transistor M1 prevents back flow of the charges stored on the ME capacitor, while resistor R1 determines the rising time constant for the capacitor. M2 constitutes the leak path when the voltage on the Leak/Reset terminal is zero.

The ferro-magnet is extended to form the free layer of a magnetic tunnel junction (MTJ). The MTJ stack consists of two ferro-magnets separated by an oxide spacer. Conventionally, when the magnetizations of the two ferro-magnets point in opposite directions, the resistance of the MTJ stack is high (anti-parallel state) as compared to the case when the two ferro-magnets point in the same direction (parallel state). In order to improve the sensing margin between the parallel and anti-parallel resistance of the MTJ stack, MgO is widely used as the oxide spacer. Due to the decoupled read and write paths, the thickness of the ME oxide and that of the MgO spacer can be optimized independently to improve the switching efficiency and the resistance sensing margin, respectively.

The ferro-magnet under the ME oxide is initially reset to +x direction by applying a negative pulse on the *Leak/Reset* terminal shown in Fig. 5.2. After the reset phase, the *Leak/Reset* terminal is set to zero volts. Therefore, transistor M2 acts as a leak path for the ME capacitor. On the other hand, diode connected M1 and R1 constitute the charging path. Thus, the voltage on the ME capacitor follows the leak and integrate dynamics of a biological neuron due to the co-existence of a charging (M1-R1) and a discharging (M2) path. If the ME capacitor is sufficiently charged, such that the generated magnetic field is greater than the anisotropy field of the ferro-magnet, the magnet switches from its initial reset position (+x direction) to -x direction, thus mimicking the thresholding behavior of a biological neuron. As the ferro-magnet switches to -x direction, the MTJ stack that has its pinned layer always pointing in -x direction, transitions from the high resistance (anti-parallel) to the low resistance (parallel) state. The output of the inverter goes from low to high due to the voltage divider effect, thereby generating an output spike. We would like to note that the charging and the discharging time constants of the leak-integrate path for the proposed neuron device can be easily tuned through the resistance R1 and transistor M2, respectively.
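The charge, leak and threshold behavior described above can be sketched with a simple behavioral model. The sketch treats M1 as an ideal diode and M2 as a fixed leak resistance, and every component value in it is hypothetical. It also replaces the magnetization dynamics with a hard voltage threshold, so it captures only the leaky-integrate and latching aspects, not the stochastic switching.

```python
def me_neuron(v_in, dt, r1, r_leak, c_me, v_th):
    """Behavioral sketch of the ME neuron.

    v_in   : sequence of input voltages, one per time step
    r1     : charging resistance (R1); M1 is modeled as an ideal diode
    r_leak : assumed equivalent resistance of the leak transistor M2
    c_me   : ME capacitance; v_th : assumed switching threshold voltage
    Returns the capacitor voltage trace and the latched output (0 or 1).
    """
    v, fired = 0.0, False
    v_trace, out = [], []
    for vin in v_in:
        i_charge = max(vin - v, 0.0) / r1   # diode blocks reverse current
        i_leak = v / r_leak                 # discharge through M2
        v += dt * (i_charge - i_leak) / c_me
        if v >= v_th:
            fired = True                    # magnet switches and stays switched
        v_trace.append(v)
        out.append(int(fired))
    return v_trace, out

if __name__ == "__main__":
    # Ten 1 ns wide, 1 V input spikes, one every 5 ns (hypothetical values).
    spikes = [1.0 if (k % 50) < 10 else 0.0 for k in range(500)]
    v_trace, out = me_neuron(spikes, dt=1e-10, r1=1e6, r_leak=5e6,
                             c_me=4.5e-15, v_th=0.4)
    print("fired at step:", out.index(1) if 1 in out else None)
```

Because the output follows the non-volatile magnet state, it remains high after firing until an explicit reset, which this sketch leaves to the caller.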

As compared to a CMOS-only implementation, the non-volatility of the ferro-magnet would help reduce the leakage power of the neuronal circuit. Moreover, for a CMOS LIF neuron, the output spike has to be latched using additional circuits, either to mimic the refractory period of the neuron or to wait for the peripheral hardware circuit to read the output spike and perform the necessary computations. In the proposed device, the latching operation is inherent in the ferro-magnet due to its non-volatility. Also, as compared to the recent experimental demonstration of an integrate-fire neuron in a phase change device [81], the present proposal can potentially be more energy-efficient due to lower operating voltages, and has the inherent benefit of almost unlimited endurance. In addition, the proposed ME device implements a leaky-integrate-fire neuron, as opposed to the integrate-fire neuron of [81]. Finally, as elaborated later, the proposed neuronal device exhibits probabilistic switching dynamics, which is indicative of the stochasticity exhibited by cortical neurons.

| Parameters                                      | Value                         |
|-------------------------------------------------|-------------------------------|
| Magnet Length $(L_{mag})$                       | 45 nm $\times$ 2.5            |
| Magnet Width $(W_{mag})$                        | 45 nm                         |
| Magnet Thickness $(t_{FL})$                     | 2.5 nm                        |
| ME Oxide Length $(L_{ME})$                      | 60 nm                         |
| ME Oxide Thickness $(t_{ME})$                   | 5 nm                          |
| Saturation Magnetization $(M_S)$                | 1257.3 kA/m [38]              |
| Gilbert Damping Factor $(\alpha)$               | 0.03                          |
| Interface Anisotropy $(K_i)$                    | 1 mJ/m$^2$ [38]               |
| ME Co-efficient $(\alpha_{ME})$                 | $0.5/c^{*}\ ms^{-1}$ [83]     |
| Relative Di-electric constant $(\epsilon_{ME})$ | 500 [70]                      |
| Temperature $(T)$                               | 300 K                         |
| CMOS Technology                                 | 45 nm PTM [42]                |

Table 5.1. Summary of parameters used in our simulations for analysis of the ME based neuron

\*c = Speed of light.

Fig. 5.3 shows the switching probability of the ferro-magnet as a function of the voltage on the ME capacitor, obtained using our mixed mode simulation framework with the parameters shown in Table 5.1. The stochastic behavior of the switching mechanism can be attributed to the fluctuations in the initial position of the magnetization direction due to thermal noise. The noisy characteristic of the proposed ME neuron mimics the stochasticity of biological neurons. In Fig. 5.4, we show the results from a mixed mode SPICE-MATLAB simulation of the device described in Fig. 5.2. As expected, the voltage on the ME capacitor ($V_{mem}$ in Fig. 5.4) shows the typical leaky-integrate behavior. If the accumulated voltage is sufficient, the device switches from the +x to the -x direction and the output of the inverter goes high to produce a spike. The neuron (ferro-magnet) remains non-responsive to further input spikes unless it is reset by applying a negative pulse on the *Leak/Reset* terminal, as shown in Fig. 5.4. Thus, by using the mixed mode simulation framework, we demonstrate the feasibility of the proposed stochastic LIF neuron.

Fig. 5.3. The stochastic switching behavior of the proposed ME neuron as a function of the voltage across the ME capacitor. The switching probability was obtained from 10,000 runs using the magnetization dynamics model with thermal noise and a pulse duration of 1 ns.

Fig. 5.4. Simulation results for the ME neuron shown in Fig. 5.2. The top panel shows the input spikes fed to the $V_{in}$ terminal of the device. The middle panel shows the voltage across the ME capacitor, exhibiting the typical leaky-integrate dynamics. The bottom panel illustrates the switching of the ferro-magnet from the +x to the -x direction, generating a spike annotated as *Spike-1*. No more spikes are generated until the device is reset to its initial position by applying a negative voltage. After reset, the device emits a second spike annotated as *Spike-2*.

Fig. 5.5. (a) SNN topology for pattern recognition. The input neurons are fully connected to the excitatory post-neurons, each of which is connected to the corresponding inhibitory neuron in a one-on-one manner. There are lateral inhibitory connections from each inhibitory neuron to all the excitatory post-neurons except the one from which it received a forward connection. (b) STDP learning algorithm, wherein the change in synaptic conductance is exponentially related to the difference in the spike times of the pre- and post-neuronal pair.

### 5.3 SNN Topology for pattern recognition

We evaluate the applicability of the proposed neuron on a two-layered SNN used for pattern recognition as shown in Fig. 5.5(a). Each pixel in the input image pattern constitutes an input neuron whose spike rate is proportional to the corresponding pixel intensity. The input pre-neurons are fully connected to every ME post-neuron in the excitatory layer. The excitatory post-neurons are further connected to the inhibitory neurons in a one-on-one manner, each of which inhibits all the excitatory neurons except the forward-connected one. The various synaptic connections between the pre-neurons and the post-neurons can be implemented efficiently by memristive crossbar arrays [84]. Lateral inhibition prevents multiple post-neurons from spiking for similar input patterns. The excitatory post-neurons are further divided into various groups, where the neurons belonging to a group are trained to recognize varying representations of a particular class of input patterns fixed *a priori*.
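The rate-based input encoding can be sketched as follows. The text only states that the spike rate is proportional to the pixel intensity. The use of independent Bernoulli draws per time step (a discrete approximation of a Poisson spike train), the 8-bit intensity range and the `max_prob` parameter are assumptions made for illustration.

```python
import numpy as np

def rate_encode(pixels, n_steps, max_prob=0.2, seed=0):
    """Convert pixel intensities (0-255) into boolean spike trains.

    Returns an array of shape (n_steps, n_pixels); entry [t, i] is True
    when input neuron i spikes at time step t. The per-step spike
    probability scales linearly with the pixel intensity.
    """
    rng = np.random.default_rng(seed)
    p = max_prob * np.asarray(pixels, dtype=float) / 255.0
    return rng.random((n_steps, p.size)) < p

if __name__ == "__main__":
    spikes = rate_encode([0, 128, 255], n_steps=5000)
    print("spike counts per input neuron:", spikes.sum(axis=0))
```

A dark pixel therefore never spikes in this sketch, while brighter pixels produce proportionally more spikes over the presentation window.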

### 5.4 Synaptic Learning Mechanism

The synapses connecting the input neurons to the post-neurons (excitatory connections in Fig. 5.5(a)) are subjected to synaptic learning, which causes the connected post-neuron to spike exclusively for a specific class of input patterns. Spike timing dependent plasticity (STDP), wherein the synaptic conductance is updated based on the extent of temporal correlation between the pre- and post-neuronal (post-synaptic) spike trains, is widely used to achieve plasticity in SNNs. The strength of a synapse is increased/potentiated (decreased/depressed) if a pre-spike occurs prior to (later than) the post-spike, as shown in Fig. 5.5(b). The conventional STDP algorithm [85] considers the correlation only between pairs of pre- and post-synaptic spikes, while ignoring the information embedded in the post-neuronal spiking frequency. Hence, we exploit an enhanced STDP algorithm, wherein the STDP-driven synaptic updates are regulated by a low-pass filtered version of the membrane potential [85] that is a proxy for the post-neuronal spiking rate. According to the enhanced STDP algorithm, an STDP-driven synaptic update is carried out only if the filtered membrane potential of the corresponding post-neuron exceeds a pre-specified threshold. This ensures that synaptic learning is performed only on those synapses where the connected post-neuron spikes at a higher rate, indicating a strong correlation with the input pattern.
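A minimal sketch of the gated update rule is given below. The exponential window follows the description of Fig. 5.5(b), while the amplitudes, time constants, weight bounds and function names are hypothetical and chosen only to make the example runnable.

```python
import math

def stdp_dw(t_pre, t_post, a_plus=0.01, a_minus=0.012,
            tau_plus=20.0, tau_minus=20.0):
    """Pairwise exponential STDP; times in ms, constants are assumed."""
    dt = t_post - t_pre
    if dt > 0:                      # pre before post: potentiation
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:                      # pre after post: depression
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0

def gated_update(w, t_pre, t_post, v_filt, v_gate, w_min=0.0, w_max=1.0):
    """Apply the STDP update only if the low-pass filtered membrane
    potential v_filt exceeds the gating threshold v_gate."""
    if v_filt <= v_gate:
        return w                    # post-neuron not active enough: no learning
    return min(w_max, max(w_min, w + stdp_dw(t_pre, t_post)))

if __name__ == "__main__":
    print(gated_update(0.5, t_pre=10.0, t_post=15.0, v_filt=0.8, v_gate=0.5))
    print(gated_update(0.5, t_pre=15.0, t_post=10.0, v_filt=0.8, v_gate=0.5))
    print(gated_update(0.5, t_pre=10.0, t_post=15.0, v_filt=0.2, v_gate=0.5))
```

The gate keeps synapses of weakly responding post-neurons unchanged, which is the selective behavior the enhanced algorithm aims for.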

Additionally, we augmented the enhanced STDP algorithm with a reinforcement mechanism to further improve the efficiency of synaptic learning. Each post-neuron in the excitatory layer is designated *a priori* to learn a specific class of input patterns. During the learning phase, the corresponding synapses are potentiated (depressed) if the post-neuron spikes for an input pattern whose class matches with (differs from) its designated class. The reinforced learning scheme enables the synapses to encode a better representation of the input patterns.

## 5.5 Hardware Implementation

We present a possible crossbar arrangement of the synapses and ME neurons (Fig. 5.6) for an energy-efficient realization of the SNN. Multilevel memristive technologies [79, 86] and spintronic devices [87] have been proposed to efficiently mimic the synaptic dynamics. Each pre-neuronal voltage spike is modulated by the interconnecting synaptic conductance to generate a resultant current into the ME neuron. The neuron integrates the current, leading to an increase in its membrane potential, which leaks until the arrival of subsequent voltage spikes at the input. The ME neuron switches conditionally, based on the membrane potential, to produce an output spike. The on-chip learning circuit samples the post-neuronal spike to program the corresponding synaptic conductances based on spike timing [79]. The energy-efficiency of the crossbar architecture stems from the localized arrangement of the neurons and synapses, in contrast to von Neumann machines with decoupled memory and processing units.
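The way a crossbar turns voltage spikes into neuron input currents can be illustrated with a short sketch. It assumes an idealized array (no wire resistance, each column held at virtual ground), so that the current into post-neuron $j$ is simply $I_j = \sum_i V_i G_{ij}$; the function name and the numerical values are hypothetical.

```python
def crossbar_currents(spike_voltages, conductances):
    """Column currents of an ideal crossbar: I_j = sum_i V_i * G_ij.

    spike_voltages : list of row (pre-neuron) voltages in volts
    conductances   : conductances[i][j] is the synapse between
                     pre-neuron i and post-neuron j, in siemens
    """
    n_rows = len(conductances)
    n_cols = len(conductances[0])
    if len(spike_voltages) != n_rows:
        raise ValueError("one voltage is required per crossbar row")
    return [
        sum(spike_voltages[i] * conductances[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]

# Three pre-neurons, two post-neurons (illustrative values only).
voltages = [1.0, 0.0, 1.0]            # pre-neurons 0 and 2 spike
g = [[1e-6, 2e-6],
     [3e-6, 4e-6],
     [5e-6, 6e-6]]
currents = crossbar_currents(voltages, g)
```

Only the rows that carry a spike contribute to the column currents, which is why the weighted summation comes at no extra cost in such an arrangement.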



Fig. 5.6. A typical crossbar implementation of the SNN topology using the proposed ME neuron. Memristive devices constitute the synapses, while the proposed device mimics the LIF post-neurons. The on-chip learning circuit programs the synaptic conductance based on spike timing. Inputs to the system are spike trains corresponding to the  $28 \times 28$  image pixels from the MNIST dataset.

## 5.6 Simulation Methodology

We developed a comprehensive device-to-system-level simulation methodology to evaluate the efficacy of an SNN composed of the proposed ME neurons for a pattern recognition application. The LIF dynamics of the ME neuron were validated using the mixed-mode simulation framework described earlier. The crossbar architecture of a network of such ME neurons was simulated using an open-source SNN simulator known as BRIAN [88] for recognizing digits from the MNIST dataset [89]. The leaky-integrate characteristics of the ME neurons were modeled using differential equations with suitable time constants, while the switching dynamics were determined from stochastic LLG simulations. The synapses were modeled as behavioral multilevel weights. The enhanced STDP algorithm was implemented by recording the time instants of pre- and post-spikes, and regulating the weight updates with the averaged membrane potential.
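A behavioral model of this kind can be sketched as follows. The sketch is not the BRIAN model used in this work: the first-order leaky-integrate equation $dv/dt = -v/\tau + i/C$, the sigmoidal firing probability (a stand-in for the switching statistics obtained from stochastic LLG simulations), and all parameter values are assumptions made purely for illustration.

```python
import math
import random

def lif_step(v, i_in, dt=1e-10, tau=1e-9, c_mem=1e-15):
    """Forward-Euler step of dv/dt = -v/tau + i_in/c_mem."""
    return v + dt * (-v / tau + i_in / c_mem)

def fire_probability(v, v_half=0.1, slope=0.02):
    """Assumed sigmoidal switching probability versus membrane potential."""
    x = (v - v_half) / slope
    if x > 60.0:       # avoid overflow in exp for extreme arguments
        return 1.0
    if x < -60.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(-x))

def simulate(currents, rng, v_reset=0.0):
    """Return a list of booleans, one per time step, marking the spikes."""
    v, spikes = v_reset, []
    for i_in in currents:
        v = lif_step(v, i_in)
        fired = rng.random() < fire_probability(v)
        spikes.append(fired)
        if fired:
            v = v_reset  # reset the membrane potential after a spike
    return spikes

spikes = simulate([2e-7] * 50, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the stochastic run reproducible, which is convenient when comparing learning runs.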



Fig. 5.7. (a) Synaptic weights connecting the  $28 \times 28$  input pre-neurons to each of the 200 excitatory post-neurons towards the end of the training phase. (b) Classification accuracy versus the number of excitatory post-neurons.

Upon the completion of the training phase, digit recognition is performed by analyzing the spiking activity of different groups of neurons in the SNN, each of which learned to spike for a class of input patterns assigned *a priori*. Each input image is predicted to represent the class (digit) associated with the neuronal group with the highest average spike count over the duration of the simulation. The classification accuracy is then determined from the number of images correctly recognized by the SNN and the total number of input images. The classification performance is reported using ten thousand images from the MNIST testing image set.
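The inference rule described above (pick the class whose neuronal group has the highest average spike count, then compute the fraction of correctly recognized images) can be written compactly. The function names and the toy data below are hypothetical and serve only to illustrate the procedure.

```python
def predict_digit(spike_counts, neuron_labels, n_classes=10):
    """Return the class whose neuron group has the highest average spike count.

    spike_counts  : spike count of each excitatory neuron for one image
    neuron_labels : class assigned a priori to each excitatory neuron
    """
    totals = [0.0] * n_classes
    sizes = [0] * n_classes
    for count, label in zip(spike_counts, neuron_labels):
        totals[label] += count
        sizes[label] += 1
    averages = [t / s if s else 0.0 for t, s in zip(totals, sizes)]
    return max(range(n_classes), key=lambda k: averages[k])

def accuracy(predictions, targets):
    """Fraction of images whose predicted class matches the target class."""
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)

# Toy example: four neurons, two classes.
toy_prediction = predict_digit([3, 5, 1, 0], [0, 0, 1, 1], n_classes=2)
```

Averaging over each group, rather than summing, keeps the comparison fair if the groups do not contain the same number of neurons.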

Fig. 5.7(a) shows the synaptic weights connecting the  $28 \times 28$  input neurons to each of the 200 post-neurons towards the end of the training process. It can be seen that the synapses learned to encode the different input patterns. The LIF dynamics of the proposed ME neuron and the reinforced STDP learning algorithm helped achieve a classification accuracy of 81% for a network of 1600 neurons. It is evident from Fig. 5.7(b) that the classification performance can be improved by increasing the number of excitatory post-neurons.

The proposed ME neuron consumed 17.5 fJ and 1.04 fJ for read and reset operations, respectively. The read energy consists of the short circuit energy dissipated in the voltage divider formed by the reference MTJ and the ME device, along with the energy consumed by the CMOS inverter. On the other hand, the reset energy is much lower, since the reset operation merely involves pulling down the ME capacitor to a negative reset voltage. The energy dissipated in charging the ME capacitor per training iteration was estimated by averaging the charging currents for all the neurons during a particular training iteration. The average ME capacitor charging energy was thus estimated to be 246 fJ per neuron per training iteration, which is energy-efficient compared to CMOS neurons that were reported to consume energy on the order of pJ [80]. A major factor that leads to the energy-efficiency of the present proposal is the intrinsic non-volatility of the ferromagnets. Due to the non-volatility, CMOS latches are not required to store the output spikes. Moreover, the read path and associated CMOS circuit need to be activated only during the read operation, thereby saving standby power dissipation.

## 5.7 Summary

Amid the quest for new device structures to mimic the neuronal dynamics, we have proposed a spin based neuron using the voltage driven magneto-electric switching of ferro-magnets. The proposed ME neuron emulates the stochastic firing behavior of a biological neuron along with the characteristic leaky-integrate-fire (LIF) dynamics. From a device perspective, the use of the ME effect leads to energy-efficient as well as fast switching dynamics of the underlying ferro-magnet. Moreover, the proposed device structure allows independent optimization of the ME oxide and the MgO spacer to improve the switching as well as the sensing efficiency of the ME neuron. By using a device-to-system level simulation framework, we have demonstrated the efficacy of the present proposal for a standard hand-written digit recognition task. We believe that the emulation of the stochastic LIF dynamics of a biological neuron using our proposed ME device would open up new possibilities for efficient hardware implementations for a wider range of computational and recognition tasks.

# 6. MESL: PROPOSAL FOR A NON-VOLATILE CASCADABLE MAGNETO-ELECTRIC SPIN LOGIC

## 6.1 Introduction and Related Work

CMOS technology has been the driving force behind the ever improving computing efficiency for the past few decades [90], [91]. However, as the miniaturization of CMOS devices continues, issues like the leakage power consumption, short channel effects, increased variability *etc.* have necessitated exploration of novel devices [92], [93]. Further, with the current emphasis on smart sensors and Internet of Things (IoT), low leakage non-volatile computing devices have become more attractive than ever before. Such beyond-CMOS logic devices are expected to augment/complement the existing CMOS technology [94].

Spin based logic devices are a promising candidate for beyond-CMOS technologies due to 1) non-volatility (ability to retain data in absence of power supply) and hence low leakage power consumption and 2) area-efficiency. As such, many proposals for logical operations using spin devices can be found in the literature. All spin logic (ASL) [7], [75] is one of the widely studied logic families based on non-local spin currents. The dependence of ASL on non-local spin currents presents a major drawback due to short *spin-flip* lengths in metallic channels. On the other hand, nonvolatile logic based on magnetic tunnel junctions (MTJs) embedded within CMOS logic circuits was explored in [95]. In addition, ME based logic devices have been explored in [96], [97]. The ME logic presented in [96] suffers from the requirement of a complex DC (direct current) bridge including three resistors for cascading. In [97], an XOR device was proposed; however, details of cascading and of the complexity of the required cascading circuits are missing. Recently, a spin logic based on magnetoelectric switching and the Inverse Rashba Edelstein effect was proposed in [70]. In this chapter, we not only demonstrate that our proposed logic-device can function as an XNOR, NAND, IMP (implication) or NOR gate depending on the configuration, but also show that easy cascadability can be achieved by using a minimal number of CMOS devices.

Specifically, we combine two scalable physics, 1) the switching of a ferro-magnet through a multi-ferroic material using the ME effect and 2) the resistance change of an MTJ as a function of the magnetization directions of the constituting ferro-magnets, to propose a non-volatile cascadable Magneto-Electric Spin Logic (MESL). The key highlights of the presented non-volatile logic are as follows:

1. We exploit the inherent coupling of multi-ferroic materials with the underlying magnetization direction of a ferro-magnet to achieve voltage driven low energy switching of nano-magnets. By stacking two such nano-magnets, in contact with respective ME oxides, we form an MTJ stack. We demonstrate that the resulting device can be used as a logic-element to implement complex Boolean functions.
2. Using a coupled magnetization dynamics and electron transport simulation, we show that the proposed logic-device exhibits good scalability, better robustness with respect to the influence of thermal noise and high switching speed as compared to the conventional current driven switching of nano-magnets.
3. Realizations of two-input XNOR, NAND, IMP and NOR gates, forming a complete logic family, have been demonstrated. Further, we show that the proposed MESL gates can be easily cascaded using a *global-reset* operation and *domino-style* clocking.
4. Typically, a CMOS logic family requires area expensive storage elements (for example, a flip-flop circuit) in order to retain the output of the logic gates. Such flip-flop circuits become redundant in the proposed MESL gates due to their inherent non-volatility.



Fig. 6.1. (a) (Left) Figure illustrating the ME switching of a ferromagnet with applied electric field. A positive voltage on the upper terminal switches the magnet in the positive x direction and *vice-versa*. (Right) An MTJ stack consisting of an MgO spacer sandwiched between two nano-magnets. The resistance of the MTJ is a function of the voltage and the relative orientation of the magnetization directions. (b) The proposed four terminal logic-device. The upper (lower) nano-magnet can be switched by application of a voltage pulse on terminal 1 (2). The resistance of the MTJ stack can be sensed between terminals 3 and 4. The thickness of the ME oxide and the MgO spacer can be tuned independently to improve the write-efficiency and the sensing margin simultaneously.

In Fig. 6.1(a-b), we show the proposed device structure. The device in Fig. 6.1(b) consists of two nano-magnets in contact with respective ME oxides. Due to their multi-ferroic nature, each of the ME oxides is coupled to the magnetization direction of the underlying nano-magnet. When a positive voltage is applied on the upper or the lower ME oxide, the corresponding nano-magnet switches to the +x direction, while for a negative voltage the magnet points in the -x direction. The upper and the lower nano-magnets are separated by an MgO spacer to form an MTJ stack, thus constituting the four-terminal device structure.

The four-terminal nature of the proposed device leads to decoupled read and write paths. Terminals 1 and 2 can be used as the write terminals by applying proper voltage levels to switch the underlying nano-magnets. We assume the ME oxides are thick enough such that the tunneling current flowing through the ME oxides is small enough to be neglected. On the other hand, the state of the device can be read by passing a current (or applying a voltage) between terminals 3 and 4. The proposed logic-device thus exhibits 1) low energy consumption due to electric field switching and 2) decoupled read/write paths, such that the respective oxides (ME oxide and MgO) can be optimized separately for read and write operations. In the next section, we describe how the proposed logic-device of Fig. 6.1(b), in conjunction with the ME-MTJ, can be used for constructing non-volatile XNOR, NAND, IMP and NOR gates.

## 6.2 ME Logic Family and Cascadability



Fig. 6.2. (a) Proposed ME XNOR gate. Only when both the ferromagnets point in the same direction, the output of the inverter goes high, thus implementing an XNOR function. Inset shows the truth table for the XNOR function. L represents a digital 0 and H represents a digital 1. (b) Proposed ME NAND/NOR gate. For NAND operation, the inverter is sized such that the output goes low only if both the MTJ stacks are in anti-parallel (high-resistance) state. Whereas, for NOR operation, the sizing of the output inverter is such that it goes high only if both the MTJ stacks are in parallel (low-resistance) state.

The proposed two-input XNOR gate is shown in Fig. 6.2(a). The inputs to the XNOR gate are terminals 'A' and 'B'. A positive voltage represents a digital '1' and a negative voltage represents a digital '0'. If both the inputs are the same, the two nano-magnets will either point in the +x direction or in the -x direction and the MTJ stack would be in the low resistance (parallel) state. The voltage divider consisting of the reference MTJ and the actual MTJ stack will drive the output of the inverter high if and only if the MTJ stack is in the parallel (low resistance) state. Thus, an XNOR function can be implemented using the configuration shown in Fig. 6.2(a). It is to be noted that while the resistance of the MTJ is being sensed, the voltage divider effect results in a non-zero voltage on the upper ferro-magnet, denoted as node 'T' in Fig. 6.2(a). Since the upper ferro-magnet constitutes one of the plates of the upper ME capacitor, the voltage at node 'T' might switch the direction of the upper ferro-magnet. In order to avoid any inadvertent switching of the ferro-magnet, the sensing voltage, the resistance of the reference MTJ and the trip-point of the inverter were selected such that the voltage at node 'T' is less than the minimum voltage required to switch the ferro-magnet.

Next, we propose an ME NAND gate as illustrated in Fig. 6.2(b). The proposed NAND gate is composed of a series connection of two ME logic-devices. Each of the two series connected ME logic-devices consists of an ME oxide in contact with a nano-magnet, which is separated from a fixed magnet by an MgO spacer. Each of the two MTJ stacks shown in Fig. 6.2(b) switches to the high resistance anti-parallel state only if its corresponding input is high. The output circuit forms a voltage divider, as shown on the right hand part of Fig. 6.2(b). The ratio of the widths of the PMOS and NMOS transistors in the output inverter is chosen such that the inverter output goes low if and only if both the MTJ stacks are in the high resistance (anti-parallel) state (or in other words both the inputs 'A' and 'B' are high). Thus, the circuit in Fig. 6.2(b) implements a NAND function by proper selection of the transistor widths. Interestingly, the same circuit shown in Fig. 6.2(b) can also mimic the behavior of a NOR gate. For the NOR gate, the PMOS and NMOS transistors in the inverter are sized such that the output of the inverter goes high if and only if both the MTJ stacks are in the low resistance state (or in other words only if both 'A' and 'B' are low).

Fig. 6.3. (Left) Truth table for an IMP and NIMP logic gate. (Bottom) The set of logic gates forming a complete logic basis along with the IMP/NIMP gate. (Right) The proposed 2 input ME IMP gate. Inset shows the state of the ME-MTJs under various inputs.

Next, we propose an IMP logic gate based on voltage driven magneto-electric (ME) switching of ferromagnets, as shown in Fig. 6.3. The IMP gate consists of a series connection of two ME-MTJs, forming a voltage divider. The node  $V_{mid}$  of the voltage divider is connected to an inverter. When a positive (negative) voltage pulse is applied on the inputs  $V_A$  and  $V_B$ , the respective ME-MTJs switch to the low resistance parallel (P) (high resistance anti-parallel (AP)) state. The resistances of the two ME-MTJs and the *trip-point* of the inverter are chosen such that the inverter output goes low if and only if ME-MTJ1 is in the P state and ME-MTJ2 is in the AP state. Thereby, as illustrated in the table in the inset of Fig. 6.3(c), the proposed circuit mimics an IMP gate. Transistor M1 selectively provides a ground connection when ME-MTJ1 is being written into.
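The logical behavior of the four gates described in this section can be checked with a purely behavioral sketch. It abstracts each gate into two steps: the inputs set the P/AP state of the MTJ stack(s), and the sized inverter maps that state to an output level. The voltage divider and the inverter trip-point are not modeled, and the function names are hypothetical.

```python
from itertools import product

P, AP = "P", "AP"  # parallel (low resistance) / anti-parallel (high resistance)

def xnor_gate(a, b):
    # Both free magnets follow their inputs; the stack is P when they agree.
    state = P if a == b else AP
    return int(state == P)          # inverter output is high only for P

def nand_gate(a, b):
    # Each stack goes AP only if its input is high.
    states = [AP if x else P for x in (a, b)]
    return int(not all(s == AP for s in states))  # low only if both are AP

def nor_gate(a, b):
    # Same circuit, inverter sized to go high only if both stacks are P.
    states = [AP if x else P for x in (a, b)]
    return int(all(s == P for s in states))

def imp_gate(a, b):
    # A positive (negative) input sets the corresponding ME-MTJ to P (AP).
    mtj1 = P if a else AP
    mtj2 = P if b else AP
    return int(not (mtj1 == P and mtj2 == AP))    # low only for P/AP

truth_table = {
    (a, b): (xnor_gate(a, b), nand_gate(a, b), nor_gate(a, b), imp_gate(a, b))
    for a, b in product((0, 1), repeat=2)
}
```

Each entry of `truth_table` lists the XNOR, NAND, NOR and IMP outputs for one input pair, which makes it easy to compare against the truth tables in Fig. 6.2 and Fig. 6.3.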

Fig. 6.4. Figure illustrating cascading of two ME XNOR gates. Initially, a reset operation is carried out by applying negative voltage pulses on terminals 'A', 'B', 'C', 'G<sub>1</sub>' and 'G<sub>2</sub>'. On the other hand, when data is applied the two stages are activated in a typical domino-style, one after another. A representative timing diagram illustrates the waveforms on various nodes.

Now that we have all the basic gates for computations, we present the cascadability of our proposed logic gates. As an example, let us consider cascading two ME-XNOR gates. As shown in Fig. 6.4, the output of the first XNOR gate is connected directly to the input of the next XNOR gate. Initially, we do a reset operation by application of a negative voltage pulse so that all the magnets point in the -x direction. This can be achieved by applying a negative voltage on input terminals 'A', 'B' and 'C', as shown in the timing diagram of Fig. 6.4. Simultaneously, we pull the 'G<sub>1</sub>' and 'G<sub>2</sub>' terminals of the inverter to negative reset voltages, thus driving the output of inverters to negative voltages. As such, a global reset can be achieved by simply applying a negative voltage on all the input and the intermediate terminals.

After the reset phase, terminals ' $G_1$ ' and ' $G_2$ ' are kept at zero volts for normal operation. Data inputs can now be applied on terminals 'A', 'B' and 'C'. Based on the inputs at 'A', 'B' and 'C', the input magnets would flip if required, making the MTJ stack either high or low resistance. We then apply voltage pulses on nodes ' $V_1$ ' and ' $V_2$ ' one after another, in a typical domino style [98]. When a voltage pulse is applied on node ' $V_1$ ', stage 1 (see Fig. 6.4) evaluates and the node ' $Out_1$ ' goes high or to zero volts. If ' $Out_1$ ' is high, the next stage magnet corresponding to terminal ' $Out_1$ ' switches to the +x-direction. If, however, the inputs are such that node ' $Out_1$ ' remains at zero volts, the corresponding next stage magnet does not switch and stays in the desired -x direction. Once the first stage has evaluated, we apply a pulse on terminal ' $V_2$ '; stage 2 then evaluates and produces the desired output on ' $Out_2$ '.

Thus, easy cascadability is achieved by use of the reset scheme and domino style clocking. Though clocking is necessary for functioning of the proposed gates, it has been used in almost all non-volatile spin logic to reduce leakage power consumption [7], [70].
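The reset-then-evaluate sequence for two cascaded XNOR gates can be traced with a small behavioral sketch. Magnet states are abstracted to 0 (-x direction) and 1 (+x direction); analog voltage levels and timing are not modeled, and the function and variable names are hypothetical.

```python
def xnor(a, b):
    """Behavioral XNOR: output is high when both magnets point the same way."""
    return int(a == b)

def cascade_two_xnor(a, b, c):
    """Trace a global reset followed by domino-style evaluation of two stages."""
    # Reset phase: a negative pulse leaves every magnet in the -x direction (0).
    stage1 = {"A": 0, "B": 0}
    stage2 = {"Out1": 0, "C": 0}

    # Data phase: inputs flip the corresponding magnets only if required.
    stage1["A"], stage1["B"], stage2["C"] = a, b, c

    # Pulse on V1: stage 1 evaluates.
    out1 = xnor(stage1["A"], stage1["B"])
    if out1:
        stage2["Out1"] = 1   # a high Out1 switches the next-stage magnet to +x
    # If Out1 stays at zero volts, the magnet keeps its reset state (-x).

    # Pulse on V2: stage 2 evaluates.
    out2 = xnor(stage2["Out1"], stage2["C"])
    return out1, out2
```

The sketch makes the role of the global reset explicit: a low first-stage output never has to write the next-stage magnet, because the reset has already placed it in the state that represents a digital '0'.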

## 6.3 Results and Discussions

The energy associated with a single gate, for example the XNOR gate of Fig. 6.2(a), can be estimated as follows. The total switching energy consists of the energy to reset the two nano-magnets, the energy to switch the nano-magnets depending on the incoming data and the energy to turn ON the transistor M1 shown in Fig. 6.2(a).

$$E_{Swi\ Total} = 2C_{ME}V_{Reset}^2 + 2C_{ME}V_{Data}^2 + C_GV_G^2 \tag{6.1}$$

where  $C_{ME}$  is the capacitance of the ME capacitor,  $C_G$  is the gate capacitance of transistor M1,  $V_{Reset}$  is the reset voltage,  $V_{Data}$  is the voltage on terminals 'A'/'B' and  $V_G$  is the gate voltage for transistor M1. Using our simulation,  $E_{Swi\ Total}$  was estimated to be 5.5 fJ. Similarly, the read-out energy consists of the energy associated with the voltage divider and the inverter. From our simulations, the read-out energy was estimated to be 30 fJ, assuming a time duration of 500 ps. The energies of the other logic gates can be estimated similarly and are in the same range.
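Eq. (6.1) is straightforward to evaluate numerically. In the sketch below, the capacitance and voltage values passed to `switching_energy` are hypothetical placeholders and are not the device parameters used in our simulations; only the 30 fJ and 500 ps figures in the last line are taken from the text, and they imply an average read-out power of 60 µW.

```python
def switching_energy(c_me, v_reset, v_data, c_gate, v_gate):
    """Total switching energy of Eq. (6.1), in joules.

    E = 2*C_ME*V_Reset^2 + 2*C_ME*V_Data^2 + C_G*V_G^2
    """
    return (2.0 * c_me * v_reset ** 2
            + 2.0 * c_me * v_data ** 2
            + c_gate * v_gate ** 2)

def average_power(energy, duration):
    """Average power (W) for a given energy (J) spent over a duration (s)."""
    return energy / duration

# Hypothetical parameters, for illustration only.
e_example = switching_energy(c_me=1e-15, v_reset=-1.0, v_data=1.0,
                             c_gate=0.5e-15, v_gate=1.0)

# Average read-out power implied by 30 fJ over 500 ps.
p_read = average_power(30e-15, 500e-12)
```

Because the voltages enter quadratically, the sign of the reset voltage does not matter for the energy estimate.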

## 6.4 Summary

Non-volatile logic devices are of particular interest given the current emphasis on low power mobile devices, event-driven sensing and Internet of Things. In the present work, we have exploited the ability to switch a ferro-magnet using the ME effect, to propose a non-volatile logic-device. Besides its inherent non-volatility, the present proposal achieves significant benefits in terms of switching energy of the ferro-magnets due to the use of ME effect. Further, the proposed MESL gates can be easily cascaded to implement more complex Boolean functions. From a device perspective, the proposed logic-device shows good scalability, better robustness to thermal fluctuations and high switching speed. We envisage that the proposed MESL gates could be a promising candidate for beyond-CMOS low leakage logic devices.

# 7. VOLTAGE-DRIVEN DOMAIN-WALL MOTION BASED NEURO-SYNAPTIC DEVICES

## 7.1 Introduction and Related Work

As stated in chapter 5, neuromorphic systems aim to mimic the computations of the human brain in order to develop novel energy-efficient computing platforms. However, there exists an inherent mismatch between the computing model for neuromorphic systems and the underlying CMOS transistor, which forms the building block of the present hardware implementations. As such, novel nano-scale devices are required that can efficiently imitate the behavior of the underlying building blocks of a neuromorphic system: the *neurons* and the *synapses*. In chapter 5, we presented a mono-domain magnet switched through the ME effect as a stochastic leaky-integrate-fire neuron primitive. In this chapter, we present a purely voltage driven domain-wall magnet not just as a neural primitive but also as a synaptic device.

Hardware implementations of spiking neurons have conventionally relied on digital [99, 100] as well as analog [101, 102] CMOS circuits. Apart from being area expensive, CMOS spiking neurons suffer from high leakage power dissipation. Such standby power dissipation is a major concern given that spiking neural networks (SNNs) show large scale sparsity. Non-volatile devices that can mimic the neuronal functionality are of particular interest for such sparse systems due to zero standby power dissipation. Many non-volatile devices have been proposed as being able to exhibit the neuronal behavior. For example, phase change devices have been shown to mimic the integrate-fire dynamics of biological neurons [81]. Similarly, domain wall (DW) magnetic devices have also been reported to exhibit the integrate-fire dynamics [103]. However, both the phase change and the DW neuron proposals show integrate-fire dynamics as opposed to the leaky-integrate-fire behavior of biological neurons. On the other hand, an LIF neuron based on the magnetization behavior of a ferro-magnet under an input current governed by the spin transfer torque (STT) mechanism has been presented in [82]. However, the LIF characteristic of the neuron presented in [82] arises due to the physical mechanisms governing the magnetization dynamics and hence is difficult to control. Another proposal for an LIF neuron using a mono-domain ferro-magnet switched by the magneto-electric (ME) effect has been reported in [19] and has been described in detail in chapter 5. Although local magnetization switching through the ME effect has been demonstrated, a global reversal of the magnetization vector has remained elusive [104, 105]. On the other hand, emerging non-volatile memory technologies have been demonstrated for energy-efficient implementations of biological synapses, including phase-change devices [106], memristive devices [107] and spintronic devices [87, 108, 109].

With this background, in this chapter we show that the recent experimental demonstrations [110-112] of ferro-magnetic DW (FM-DW) motion through voltage driven coupling with an underlying ferro-electric DW (FE-DW) can be used to construct a voltage-controlled energy-efficient LIF neuron and a non-volatile programmable synapse. The key highlights of the present chapter are as follows:

1. We propose a neuro-mimetic LIF neuron and synaptic device based on elastic coupling between an FM-DW and an FE-DW. The strong pinning of the FM-DW to the underlying FE-DW allows for pure voltage driven control of the FM-DW. The voltage driven movement of the FM-DW, along with the resistance change of a magnetic tunnel junction (MTJ), allows the device to mimic the behaviors of biological neurons and synapses.
2. Further, only the firing event of the neurons presented in [82] and [19] was non-volatile, whereas the membrane potential in both cases was volatile and would decay to zero in absence of the power supply. For the present proposal, the membrane potential is represented by the position of the FM-DW and hence is non-volatile.
3. Advantageously, the same device with minimal modifications can also be used as a low-energy synaptic device. This is in contrast to traditional domain-wall based synaptic devices that are driven by current and hence are expensive in terms of energy-expenditure.
4. We have developed a device to circuit level simulation framework to analyze the behavior of the proposed neuro-synaptic devices. Our simulation framework comprises micromagnetic simulation of the magnetization dynamics and a non-equilibrium Green's function (NEGF) based resistance model for the MTJ.

## 7.2 Magneto-Electric DW motion based on Elastic Coupling

Ferro-magnetic domain wall (FM-DW) motion has conventionally been driven by a magnetic field or through the more scalable mechanism of current induced spin transfer torque (STT) [113]. However, current based DW motion results in relatively high energy dissipation and, therefore, research into purely voltage driven DW motion has gained ground in recent times [110–112]. One way of achieving such voltage driven FM-DW motion is through elastic coupling in a ferro-electric/ferro-magnetic heterostructure [112]. Our proposal is based on the recent experimental evidence of controlled FM-DW motion under an applied electric field [110].

Fig. 7.1. (a) The replication of the domain pattern of the FE layer into the FM layer due to local strain coupling. An effective uniaxial anisotropy is induced in the region of the FM above the a-domain, while a cubic anisotropy is induced in the region over the c-domain. (b) Due to high aspect ratio the demagnetization anisotropy of the FM tends to align the magnetization of the FM along the length of the magnet, thereby resulting in almost 180° angle between the magnetizations in the two regions of the FM.

The mechanism driving the FM-DW under the influence of an applied electric field can be understood by referring to the heterostructure shown in Fig. 7.1(a). It consists of a multi-ferroic ferro-electric material like  $BaTiO_3$  [110] in physical contact with a ferro-magnet. Materials like  $BaTiO_3$  show spontaneous electric polarization at room temperature. This polarization arises from the small atomic shift of Ti ions with respect to the oxygen octahedron [114]. Such atomic displacement of ions with respect to one another also results in a macroscopic strain. These materials usually show a stripe pattern of domains, within each of which the displacement of the constituting ions is in the same direction, separated by thin DWs. In many cases these domains are separated by an angle of  $90^{\circ}$ . For example, a  $90^{\circ}$  domain wall with domains pointing in-plane (a-domains) and those pointing out-of-plane (c-domains) is shown in Fig. 7.1(a).

When a ferro-magnetic material is grown on top of such a ferro-electric material, it experiences different amounts of local strain based on the underlying domain structure of the FE material. Due to the different kinds of strain experienced by the FM on top of the a-domain versus the c-domain, different magnetic anisotropies exist in the two regions. As shown schematically in Fig. 7.1(a), the part of the FM over the a-domain experiences a uniaxial anisotropy while the part of the FM over the c-domain experiences a cubic anisotropy. Such different anisotropies in the regions over the a- and the c-domains have been experimentally measured [110]. The details of the crystal structure in the a- and the c-domains of the FE layer that lead to the induction of the uniaxial and the cubic anisotropy can be found in [115]. Due to this locally induced strain anisotropy, the FM forms a domain pattern resembling the domain structure of the underlying FE material (see Fig. 7.1(a)). There exists a strong *pinning potential* between the FE-DW and the FM-DW because i) the two regions have different local anisotropies and ii) the anisotropy change is almost abrupt, since the FE-DW has a typical width of a few lattice constants (which is an order of magnitude smaller than a typical DW width of an FM [112]).

When a transverse electric field is applied to the FE material, the domain favoring the electric field expands at the expense of the other domain. This leads to an FE-DW motion in response to the applied electric field. Due to strong pinning, the FM-DW gets dragged with the underlying FE-DW, resulting in voltage controlled FM-DW motion. One of the issues with the FM-DW formed due to elastic coupling to the FE-DW is that the FM-DW shows less than 180° difference in magnetization orientations in the two domains (see Fig. 7.1(a)). This is because the magnetization of the FM over the c-domain inclines itself at an angle of 45° with the FM-DW. The position of an FM-DW can be sensed through a magnetic tunnel junction (MTJ) whose resistance varies as a function of the average magnetization direction, which ideally switches by an angle of 180°. In the present case, due to voltage induced FM-DW motion, the magnetization can switch only by an angle less than 180°, thereby significantly reducing the resistance sensing margin of the MTJ.

Interestingly, the FM-DW can be changed to a  $180^{\circ}$  DW by exploiting the demagnetization anisotropy of a high aspect ratio FM [110]. Let us assume the FM on the top of the FE material is patterned into rectangular magnets such that the domain walls in the FE and FM layers make an angle of  $45^{\circ}$  with the length of the magnet, as shown in Fig. 7.1(a), thereby aligning the magnetization of the FM over the c-domain along the length of the FM. Due to the rectangular shape of the FM, a demagnetization anisotropy would tend to keep the magnetization of the FM aligned with the longer dimension of the FM, *i.e.* towards the (positive or negative) x-axis. Thereby, for the part of the FM on the a-domain, the negative x-axis direction would be favored and the magnetization in that region would point in the negative x-direction. Thus, through proper engineering of the FM shape one can obtain an almost 180° FM-DW although the underlying FE domains exhibit a 90° separation. As we will observe later, this 180° FM-DW will allow better resistance sensing margin for the proposed devices.
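The impact of the switching angle on the sensing margin can be illustrated with the commonly used cosine dependence of the MTJ conductance, $G(\theta) = \tfrac{1}{2}(G_P + G_{AP}) + \tfrac{1}{2}(G_P - G_{AP})\cos\theta$. This simple model is a stand-in for the NEGF based resistance model used in this work, and the resistance values as well as the 135° comparison angle are hypothetical choices made only to show the trend.

```python
import math

def mtj_resistance(theta_deg, r_p=5e3, r_ap=10e3):
    """MTJ resistance (ohm) for a relative magnetization angle in degrees.

    Uses the cosine angular dependence of the conductance; r_p and r_ap
    are assumed parallel and anti-parallel resistances.
    """
    g_p, g_ap = 1.0 / r_p, 1.0 / r_ap
    theta = math.radians(theta_deg)
    g = 0.5 * (g_p + g_ap) + 0.5 * (g_p - g_ap) * math.cos(theta)
    return 1.0 / g

def sensing_margin(theta_deg, **kwargs):
    """Resistance change between the parallel state and a rotation by theta."""
    return mtj_resistance(theta_deg, **kwargs) - mtj_resistance(0.0, **kwargs)

margin_full = sensing_margin(180.0)      # full 180-degree reversal
margin_partial = sensing_margin(135.0)   # illustrative partial rotation
```

Under this model, any rotation short of 180° yields a smaller resistance change than a full reversal, which is the motivation for engineering the FM shape to obtain a 180° FM-DW.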

## 7.3 Magneto-Electric DW motion based Neuro-Synaptic Devices

Exploiting the aforementioned 180° voltage driven FM-DW motion, we can construct neuro-synaptic devices, wherein the membrane-potential of the neuron and the weights of the synapses are represented by the position of the FM-DW.

### 7.3.1 LIF Neuron

The proposed LIF neuron is shown in Fig. 7.2. The device consists of a ferro-magnet/ferro-electric heterostructure in which the two layers are in contact with one another. The metal contact to the ferro-electric layer is explicitly shown in the figure. When a positive voltage is applied on the metal contact, the a-domain expands at the cost of the c-domain, resulting in an FE-DW and FM-DW motion in the positive x-direction. Similarly, on application of a negative voltage, the c-domain expands, causing a DW motion in the negative x-direction. The motion of the FE-DW drags with it the FM-DW in response to the applied voltage. The range of motion of the FE-DW is constrained by the area covered under the metal contact, thus ensuring the FE-DW never disappears in the ferro-electric material.

At the rightmost end, the ferro-magnetic layer forms an MTJ structure with a tunneling oxide (MgO) and a fixed ferromagnetic layer called the *pinned layer* (PL). When the magnetization of the PL and the average magnetization of the region of the ferro-magnet under the MTJ cross-section point in the same (opposite) direction, the MTJ exhibits the low resistance parallel state 'P' (high resistance anti-parallel state 'AP'). A reference MTJ is used to form a voltage divider connected to a CMOS inverter. The resistance of the reference MTJ and the trip point of the inverter are chosen such that the output terminal, denoted as *spike*, goes high if and only if the lower MTJ is in the parallel state.



Fig. 7.2. The proposed non-volatile LIF neuron based on elastic coupling between the FE-DW and FM-DW. The position of the FM-DW represents the membrane-potential, while the switching activity of the MTJ emulates the firing behavior of the neuron.

Note, the fact that the ferro-magnet shows 180° DW allows for maximum difference in the parallel and anti-parallel resistance of the MTJ, thereby increasing the voltage difference at the input of the inverter allowing robust operation.
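The sizing of the reference MTJ and the inverter trip point can be illustrated with a short numerical sketch. The resistance values and read voltage below are hypothetical, and the divider orientation (reference MTJ on top, sensed MTJ to ground) is an assumption made so that a parallel lower MTJ pulls the node low and the inverter output high, as described above. For such a divider, the voltage swing between the P and AP cases is largest when the reference resistance equals the geometric mean of the two MTJ resistances.

```python
import math

def divider_node_voltage(v_read, r_ref, r_mtj):
    """Voltage at the node between the reference MTJ (top) and the sensed MTJ (bottom)."""
    return v_read * r_mtj / (r_ref + r_mtj)

def best_reference(r_p, r_ap):
    """Reference resistance that maximizes the P/AP voltage swing of the divider."""
    return math.sqrt(r_p * r_ap)

# Hypothetical values, chosen only for illustration (not taken from this work).
V_READ = 1.0   # read voltage across the divider (V)
R_P = 10e3     # parallel-state resistance (ohm)
R_AP = 25e3    # anti-parallel-state resistance (ohm)

r_ref = best_reference(R_P, R_AP)
v_p = divider_node_voltage(V_READ, r_ref, R_P)    # node voltage for the P state
v_ap = divider_node_voltage(V_READ, r_ref, R_AP)  # node voltage for the AP state
trip = 0.5 * (v_p + v_ap)                         # inverter trip point placed midway

def spike(r_mtj):
    """Inverter output: high (True) when the divider node is below the trip point."""
    return divider_node_voltage(V_READ, r_ref, r_mtj) < trip
```

A larger ratio between `R_AP` and `R_P` widens the gap between `v_p` and `v_ap`, which is why the 180° FM-DW helps the sensing margin.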

The desirable characteristics of the proposed device can be enumerated as follows: 1) the voltage driven motion of the DW allows low energy consumption; 2) the write path (through the ferro-electric metal contact) and the read path (through the voltage divider) are decoupled, thereby allowing independent optimization of the read and write operations; 3) the voltage divider circuit along with the CMOS inverter allows for low overhead reading of the spiking event.

The LIF characteristic of the proposed device can be understood as follows: 1) A positive voltage on the metal contact results in motion of the FM-DW in the +x-direction. Since the DW position is non-volatile, the forward motion of the FM-DW mimics the *integrate dynamics* of biological neurons. 2) A small negative voltage on the metal contact leads to FM-DW motion in the negative x-direction, thereby imitating the *leaky behavior* of the neuron. Note that the membrane-potential of a neuronal device (represented by the position of the FM-DW) should keep leaking at all times except when it has received sufficient excitation in the form of input spikes. For a typical current driven FM-DW motion, such a leaky characteristic would incur unacceptable energy consumption, because every neuron would continuously require a negative current flow through the device to keep the FM-DW moving slowly in the negative x-direction. In the proposed voltage controlled neuronal device, the leaky behavior merely requires a small negative voltage on the metal contact. Since ferro-electric materials are usually insulators, this negative voltage does not incur any short circuit leakage current, so the energy overhead of mimicking the leaky behavior is negligible. 3) Finally, when the FM-DW travels far enough in the +x-direction, the average magnetization of the FM under the MTJ switches and the output of the inverter goes high, indicating the *firing event* of the proposed device. Thus, the proposed device exhibits non-volatile leaky-integrate-fire dynamics.
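The three steps above can be summarized as a small behavioral model in which the FM-DW position plays the role of the membrane potential. This is only a sketch: the velocities, time step and firing position are hypothetical placeholders, the reset is treated as instantaneous, and the refractory period is not modeled.

```python
def simulate_lif(spikes, dt=1e-9, v_int=200.0, v_leak=20.0, x_fire=0.9e-6):
    """Behavioral LIF model driven by a binary input spike train.

    spikes : iterable of 0/1, one entry per time step (1 = input spike present)
    dt     : time step (s)
    v_int  : DW velocity (m/s) while a spike is applied (hypothetical)
    v_leak : DW velocity magnitude (m/s) under the small negative bias (hypothetical)
    x_fire : DW position (m) at which the region under the MTJ switches (hypothetical)
    """
    x = 0.0
    positions, fired = [], []
    for s in spikes:
        if s:
            x += v_int * dt                 # integrate: DW moves in +x
        else:
            x = max(0.0, x - v_leak * dt)   # leak: DW drifts back, bounded at x = 0
        fire = x >= x_fire                  # MTJ switches, inverter output goes high
        if fire:
            x = 0.0                         # reset to the initial position
        positions.append(x)
        fired.append(fire)
    return positions, fired
```

For example, `simulate_lif([1, 0, 0])` shows the position rising on the spike and then decaying during the two quiet steps.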

# 7.3.2 Programmable Synapse

The proposed synaptic device under investigation is shown in Fig. 7.4. This device is very similar to the neuronal device described above. The device consists of a ferro-magnet / ferro-electric heterostructure elastically coupled together. When a positive voltage is applied on the metal contact (between Terminal-2 and 3), the FE-DW moves in the positive x-direction. The FE-DW drags along with it the FM-DW in the positive x-direction in response to the positive voltage. Similarly, for a



Fig. 7.3. Micromagnetic simulation showing the domain wall shape and structure. The zoomed image shows a  $90^{\circ}$  domain wall which has been transformed to a  $180^{\circ}$  domain wall due to shape anisotropy.

negative voltage, the FE-DW, and thus the FM-DW, moves in the negative x-direction. In addition, the free ferro-magnetic layer forms an MTJ structure with a tunneling oxide (MgO) and a pinned ferromagnetic layer (between Terminal-1 and 3). When the FM-DW is at one end (x = 0), the MTJ is fully in the high resistance AP state. When the FM-DW is at the other end of the device, the MTJ is in the low resistance P state. However, if the FM-DW is somewhere in between, the MTJ behaves as a parallel combination of an AP region and a P region, and its conductance is given by:

$$G_{MTJ} = \frac{x \times G_P + (L - x) \times G_{AP}}{L} \tag{7.1}$$

where  $G_{MTJ}$  is the MTJ conductance,  $G_P$  and  $G_{AP}$  are the parallel and anti-parallel conductances of the MTJ, respectively, x is the FM-DW position and L is the total length of the magnet. This simplified equation holds because the length of the domain wall is small compared to the length of the magnet. The resistance or conductance



Fig. 7.4. The proposed non-volatile programmable synapse based on elastic coupling between the FE-DW and FM-DW. The position of the FM-DW modulates the conductance between Terminal-1 and 3 of the device. The FM-DW position, and thus the conductance of the synapse, can be modified by applying a +ve or -ve voltage across Terminal-2 and 3.

can be sensed between Terminal-1 and 3. Thus, the conductance of this device can be set anywhere between  $G_P$  and  $G_{AP}$ , representing the synaptic conductance in SNNs.
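Eq. 7.1 can be evaluated directly, as in the minimal sketch below. The magnet length is taken from Table 7.1, while the two conductance values are hypothetical and serve only to show the linear interpolation between the AP and P limits.

```python
def mtj_conductance(x, length, g_p, g_ap):
    """Eq. 7.1: MTJ conductance with the FM-DW at position x (0 <= x <= length)."""
    if not 0.0 <= x <= length:
        raise ValueError("DW position must lie within the magnet")
    return (x * g_p + (length - x) * g_ap) / length

L_MAG = 1.5e-6       # magnet length from Table 7.1 (m)
G_P = 1.0 / 10e3     # hypothetical parallel conductance (S)
G_AP = 1.0 / 25e3    # hypothetical anti-parallel conductance (S)

# Conductance at the two ends and at the midpoint of the magnet.
g_start = mtj_conductance(0.0, L_MAG, G_P, G_AP)
g_mid = mtj_conductance(0.5 * L_MAG, L_MAG, G_P, G_AP)
g_end = mtj_conductance(L_MAG, L_MAG, G_P, G_AP)
```

The midpoint value is the arithmetic mean of `G_P` and `G_AP`, reflecting the equal split of the junction area between the two states.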

The synaptic behavior of the device can be understood as follows: 1) A positive voltage on the metal contact will result in FM-DW motion in +ve x-direction, and thereby increase the MTJ conductance. This mimics the *long-term potentiation*, or strengthening of the synaptic weights in SNNs. 2) A negative voltage on the metal contact results in FM-DW motion in -ve x-direction, and decreases the synaptic conductance. This mimics the *long-term depression*, or reduction in synaptic strength in SNNs. 3) Interestingly, the proposed device can exhibit leaky-behavior in addition to the usual non-volatile multi-level memory characteristics that can be used to model *'forgetting'* in synapses through short term memory mechanism. For a typical current driven FM-DW motion, such a leaky characteristic would incur unacceptable energy

consumption. In the envisioned voltage controlled synaptic device, the leaky behavior merely requires a small negative voltage across the FE layer. The small negative voltage on the metal contact recedes the FM-DW position, thereby reducing the synaptic strength continuously over time.

# 7.4 Device Modeling and Simulation

A mixed-mode simulation framework was developed for the analysis of the proposed device structure. The simulation framework consists of three components: a) the exponential dependence of the FE-DW velocity on the applied voltage in accordance with Merz's Law [116], b) the micromagnetic response of the FM-DW due to elastic coupling with the underlying FE-DW motion, and c) the resistance change of the MTJ as a function of device dimensions and magnetization directions, based on the non-equilibrium Green's function (NEGF) formalism [40].

**FE-DW velocity** The field driven dynamics of FE domain walls have been extensively studied in the past [117], [118], and the velocity of the FE-DW has been observed to depend exponentially on the applied voltage. This exponential dependence is given by Merz's Law [116]. Experimental evidence of exponential FE-DW motion in BaTiO<sub>3</sub> has been demonstrated in [119]. Merz's Law can be written as

$$v_{FE} = K_{FE} \times \exp\left(a/E\right) \tag{7.2}$$

where  $v_{FE}$  is the FE-DW velocity and E is the electric field, given by  $\frac{V_{FE}}{t_{FE}}$ , where  $V_{FE}$  is the applied voltage across the FE layer and  $t_{FE}$  is the thickness of the FE layer.  $K_{FE}$  and a are fitting parameters obtained from the experimental data in [119]. As shown later in this chapter, the FM-DW closely follows the motion of the FE-DW [112]. As such, the FM-DW velocity also depends exponentially on the applied voltage.
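A minimal sketch of Eq. 7.2 is given below. The FE thickness is taken from Table 7.1, but the values of `k_fe` and `a` are hypothetical placeholders rather than the fitted values from [119]. Two further assumptions are made: `a` is taken as negative, which matches the usual statement of Merz's law in which the velocity grows with the field, and the field magnitude sets the speed while the voltage polarity sets the direction.

```python
import math

def fe_dw_velocity(v_fe, t_fe=100e-9, k_fe=1.0e3, a=-2.0e7):
    """Signed FE-DW velocity (m/s) following Eq. 7.2.

    v_fe : voltage across the FE layer (V)
    t_fe : FE layer thickness (m), 100 nm as in Table 7.1
    k_fe : velocity prefactor (m/s), hypothetical
    a    : fitting constant (V/m), hypothetical, negative by assumption
    """
    if v_fe == 0:
        return 0.0                        # no field, no wall motion
    e_field = abs(v_fe) / t_fe            # field magnitude E = V / t
    speed = k_fe * math.exp(a / e_field)  # Eq. 7.2
    return math.copysign(speed, v_fe)     # polarity sets the direction of motion

v_at_1v = fe_dw_velocity(1.0)
v_at_2v = fe_dw_velocity(2.0)
```

With these placeholder constants, doubling the voltage from 1 V to 2 V raises the velocity by a factor of about e, which illustrates how strongly the velocity depends on the applied voltage.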

Table 7.1. Parameters used for simulations adopted from [110, 112] for studying ME-DW Neuro-Synaptic Device

| Parameters                          | Value                    |
|-------------------------------------|--------------------------|
| Magnet Length $(L_{mag})$           | 1.5um                    |
| Magnet Width $(W_{mag})$            | 100nm                    |
| Magnet Thickness $(t_{mag})$        | 2.5nm                    |
| FE Oxide Thickness $(t_{FE})$       | 100nm                    |
| Saturation Magnetization $(M_S)$    | $1.7 \times 10^6 A/m$    |
| Gilbert Damping Factor ( $\alpha$ ) | 0.01                     |
| Exchange Stiffness $(K_{ex})$       | $2.1\times 10^{-11} J/m$ |
| Cubic Anisotropy $(K_c)$            | $4 \times 10^4 J/m^3$    |
| Uniaxial Anisotropy $(K_u)$         | $2 \times 10^4 J/m^3$    |
| Simulation cell size $(dx, dy, dz)$ | 2.93, 2.93, 2.5nm        |
| Temperature $(T)$                   | 300K                     |

**Micromagnetic FM-DW dynamics** For simulating the FM-DW dynamics in response to the underlying FE-DW motion, we used a GPU based micromagnetic simulator called MuMax [120]. The a- and c-domains of the FE material result in different amounts of strain due to the multi-ferroic nature of materials like BaTiO<sub>3</sub> [115]. This strain is transferred to the FM layer on top of the FE layer. Thus, the FM layer experiences different amounts of strain depending on whether the FE layer underneath has an a-domain or a c-domain. This results in a local strain anisotropy experienced by the FM layer. When the FE-DW moves (in accordance with

the Merz's Law), the local strain anisotropy experienced by the FM layer moves as well. Thus, in order to mimic the different anisotropies experienced by the FM layer, we divided the simulation region in two, each having a different strain anisotropy due to the underlying a- and c-domains. This simulation methodology is similar to the ones adopted in [110, 112]. A uniaxial anisotropy with a constant  $K_u$  was added for that part of the FM layer that was supposed to be above the a-domain. Similarly, a cubic anisotropy with a constant  $K_c$  was added for the region of the FM layer corresponding to the c-domain. Thus, there exists an anisotropy boundary (AB) in the FM layer wherein the anisotropy changes from uniaxial to cubic. The various simulation parameters used in our framework are summarized in Table 7.1. Recall, because of the demagnetization field one would expect the FM-DW to exhibit  $\sim 180^{\circ}$ DW, even though the underlying FE layer shows a  $90^{\circ}$  DW. This can be confirmed by the micromagnetic simulation results shown in Fig. 7.3. As seen in the figure, the magnetizations near the DW have  $90^{\circ}$  difference in their orientations. But, as one moves away from the DW, the magnetizations (in the 'blue' region) slowly tend to orient themselves to  $180^{\circ}$  due to the effect of the demagnetization field.

In order to mimic the movement of the FE-DW on application of a voltage across the FE layer, we shifted the AB in accordance with Merz's Law. It was found that the FM-DW followed the FE-DW linearly up to a certain velocity called the *depinning velocity*. Beyond the depinning velocity, the FM-DW was not able to keep pace with the fast moving FE-DW. Thus, the maximum achievable velocity is limited by the slower response of the FM-DW to the fast moving FE-DW. A detailed description of the FM-DW dynamics beyond the depinning velocity can be found in [112]. In this work, we only utilize the velocity regime below the depinning velocity, where the FM-DW linearly follows the FE-DW. Fig. 7.5 plots the FM-DW velocity ( $v_{FM}$ ) as a function of the FE-DW velocity ( $v_{FE}$ ). The blue line represents the FM-DW velocity for a periodic boundary condition similar to [112]. On the other hand, the red line corresponds to a more realistic simulation where the magnet dimensions were taken to be  $1.5\,\mu m \times 100\,nm \times 2.5\,nm$ . The simulations show



Fig. 7.5. Depinning velocities of the magnetic domain wall, for positive and negative velocities. The blue plot was obtained by using periodic boundary conditions and parameters from [112]. The red plot was obtained without periodic boundary conditions and with scaled dimensions.

a slight decrease in the depinning velocities ( $\sim 550$  m/s and  $\sim 210$  m/s respectively, for positive and negative direction), as compared to the periodic boundary case. This can be attributed to the increased demagnetization due to shape anisotropy in the smaller magnets. A higher demagnetization field suppresses the control of the strain anisotropy from the underlying FE layer. For the LIF neuron, we shall use the positive velocity for the integrate operation and the negative velocity for imitating the leaky dynamics of the neuron. Similarly, for the synaptic device, we would use the positive velocity for long-term potentiation and the negative velocities for long-term depression and leaky-behavior of synaptic weights.
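The depinning velocities place a lower bound on how quickly the FM-DW can be moved across the device. The short calculation below uses the approximate velocities quoted above and the magnet length from Table 7.1. It assumes, for illustration only, that the wall traverses the full magnet length, whereas in the actual device the range of motion is limited to the region under the metal contact.

```python
L_MAG = 1.5e-6        # magnet length from Table 7.1 (m)
V_DEPIN_POS = 550.0   # approximate depinning velocity in the +x direction (m/s)
V_DEPIN_NEG = 210.0   # approximate depinning velocity magnitude in the -x direction (m/s)

def min_traversal_time(distance, v_max):
    """Shortest time for the DW to cover `distance` without exceeding `v_max`."""
    return distance / v_max

t_integrate = min_traversal_time(L_MAG, V_DEPIN_POS)  # fastest full sweep in +x
t_leak = min_traversal_time(L_MAG, V_DEPIN_NEG)       # fastest full sweep in -x

print(f"fastest +x sweep: {t_integrate * 1e9:.2f} ns")
print(f"fastest -x sweep: {t_leak * 1e9:.2f} ns")
```

Both bounds are on the order of a few nanoseconds, so the asymmetry between the two directions mainly limits how fast a reset or depression step can be.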

**Resistance Model** To model the resistance of the MTJ stack, the non-equilibrium Green's function (NEGF) formalism [40] was used. The details of the NEGF model for estimating the resistance of the MTJ as a function of the applied voltage and the average magnetization direction were adopted from [40]. The NEGF model used for the

analysis of the proposed neuro-synaptic device is the same as the one described in chapter 4. The results from the NEGF equations were abstracted into a Verilog-A model, which was used in SPICE simulations along with predictive technology models (PTM) [42] for CMOS transistors.

### 7.5 Results

#### 7.5.1 Neuro-synaptic behavior of the proposed devices

Fig. 7.6 shows the behavior of the proposed neuron, obtained from the mixed-mode simulation model developed for the device. The FM-DW was initially assumed to be at the x = 0 position (refer Fig. 7.2). Correspondingly, at the extreme right end of the magnet, the MTJ is in the anti-parallel resistance state. A train of voltage spikes is applied to the metal contact, which mimics the pre-synaptic spikes received by the neuron (Fig. 7.6(a)). In a typical neuromorphic system, the input data is encoded in the frequency or the timing of the incoming spikes. Note that the resting level of the input is a negative voltage (-1V), upon which the spikes are superimposed as voltage pulses of amplitude 2V and duration 1ns.

When an incoming spike is applied to the neuron, the FE-DW and the FM-DW move in the positive x-direction thus implementing the integrate behavior. In absence of any incoming spike, the neuron sees a negative voltage on its input terminal and the FE-DW as well as the FM-DW slowly move towards the negative x-direction, thereby mimicking the leaky behavior (Fig. 7.6(b)). When the FM-DW reaches far enough in the positive x-direction, the average x-component of the magnetization  $(m_x)$  beneath the MTJ switches from -1 to +1 (Fig. 7.6(c)). In response, the output of the voltage divider would switch from low to high indicating that the neuron has emitted a spike. This output spike voltage can be used to trigger the reset phase, and a negative voltage can then be applied at the metal contact, thereby moving the domain wall back to the initial position (x = 0). The device remains non-responsive



Fig. 7.6. Leaky integrate and fire behavior of the proposed neuron in response to input train of spikes. (a) Input voltage spike train received by the neuron. (b) FM-DW position (acts as membrane potential variable). (c) x-component of magnetization under the MTJ stack. Once the MTJ switches, the neuron fires, and the domain wall is reset to its initial position. The inset shows the average magnetization under the MTJ when the domain wall traverses the MTJ.



Fig. 7.7. Plot of MTJ conductance  $G_{MTJ}$  of the synaptic device in response to voltage pulses exhibits a controlled behavior of the synaptic weights. This can be used for better learning algorithms like 'ASP' for precise tuning of synaptic weight values. The leaky behavior of the synaptic weights can be implemented using a small -ve voltage across the device.

to any more incoming spikes during this duration, mimicking the *refractory period* of the neuron.

In Fig. 7.7, we show the simulation results for the synaptic device. Again, the FM-DW was initially assumed to be at the x = 0 position (refer Fig. 7.4). The conductance of the synaptic device  $(G_{MTJ})$  is plotted against time in response to positive and negative voltage pulses. While a positive voltage is applied, the FM-DW moves in the +ve x-direction, thereby increasing the conductance, as described by Eqn 7.1. On application of a negative voltage, the FM-DW moves in the -ve x-direction, thereby decreasing the conductance. Thus we obtain a voltage-controlled conductance that can be used to set the synaptic weights as desired. In addition, a small negative voltage allows a slow decay of the synaptic conductance to account for the 'forgetting' mechanism in synapses.

# 7.6 Conclusion

Hardware implementations of neuromorphic systems are of paramount interest due to their efficiency in solving recognition and classification tasks. Towards that end, in this chapter we proposed a non-volatile leaky-integrate-fire neuron and a programmable synapse using voltage driven bi-directional FM-DW motion. The FM-DW motion arises in response to the FE-DW motion of an underlying FE layer due to elastic coupling. The energy efficiency of the present proposal results from its intrinsic non-volatility and the purely voltage driven nature of the FM-DW movement. A mixed-mode simulation framework consisting of micromagnetic simulation for the magnetization dynamics and an NEGF model for the resistance of the MTJ was used to demonstrate the feasibility of the proposed neuro-synaptic device. The use of voltage driven domain-wall motion allows us to easily implement leaky behavior in the neuro-synaptic device without the constant current flow required by conventional current driven domain-wall motion.

# 8. ENERGY-EFFICIENT MEMORIES USING MAGNETO-ELECTRIC SWITCHING OF FERROMAGNETS

# 8.1 Introduction

Magneto-resistive memories based on current driven Spin Transfer Torque (STT) [121] have attracted immense research interest due to their non-volatility, almost unlimited endurance and area-efficiency [122]. However, STT based memories suffer from inherently low switching speed and high write-energy consumption [13]. The high switching energy requirement of STT based memories can be attributed to the current driven switching of the ferromagnets. Although novel physics like the spin Hall effect (SHE) [123] has been used to improve the spin generation efficiency, thereby lowering the required switching energy, the current driven nature of the SHE still results in relatively high energy consumption. As such, voltage driven reversal of the magnetization direction has attracted considerable research interest, in an attempt to lower the write energy of spintronic devices.

As we have noted in previous chapters, the voltage induced Magneto-Electric (ME) effect has shown potential for fast and energy-efficient switching of ferromagnets [14]. The core of the research work on the ME effect has been driven by the so called *multi-ferroic* materials. Multi-ferroics inherently exhibit more than one order parameter (for example, materials possessing ferro-magnetism as well as ferro-electricity and/or ferro-elasticity) [62]. The coupling between such order parameters allows one order parameter to be controlled through another. For example, the ferromagnetism of a material can be controlled through the effect of an applied voltage on its ferro-electricity or ferro-elasticity [65].

Devices based on such multi-ferroic materials are expected not only to have low switching energy consumption but also better switching speed than the conventional current driven spintronic devices. Therefore, many device proposals for memory [124], [125] and logic applications [18,70,126] of the ME effect can be found in the literature. In this chapter, we use the ME-MTJ and ME-XNOR device to propose novel memory configurations using some of the intrinsic characteristics exhibited by these devices. Specifically, the key highlights of the proposed memory bit-cells are as below.

1. We analyze two voltage driven spintronic devices based on the ME effect, viz. the ME-MTJ [126] and the ME-XNOR [18,126] device, with focus on memory applications. Specifically, we analyze these devices with respect to writability, readability and switching speed.
2. We propose two novel non-volatile energy-efficient memories: i) a 1-Read / 1-Write dual port memory that utilizes the decoupled read and write paths of ME-MTJs and ii) a content addressable memory (CAM) based on the compact XNOR operation enabled by the ME-XNOR device.
3. Our results are based on coupled stochastic magnetization dynamics implemented through the well-known *Landau-Lifshitz-Gilbert* equation and a non-equilibrium Green's function (NEGF) electron transport model.

#### 8.2 ME devices under consideration

Out of the various possible ME switching mechanisms, we would focus on the exchange bias coupled devices as was detailed in chapter 4. The basic phenomenon driving the switching process is the fact that the exchange bias field can be reversed from one direction to another by application of an electric field [127]. For the devices shown in Fig. 8.1(a) and (b), let us assume the ferro-magnet has an in-plane easy axis due to the shape anisotropy. When an external voltage is applied, based on the voltage polarity, the generated exchange bias field either points in the +x or the -x direction. If the generated ME field is strong enough to overcome the in-plane



Fig. 8.1. (a) Schematic of the ME-MTJ and (b) ME-XNOR. The ferromagnets in contact with the respective ME oxides can be switched by applying appropriate voltages across the ME oxides. The direction of switching can be reversed by changing the polarity of the applied voltage. Due to shape anisotropy, the easy axis of the ferro-magnets lies along the  $\pm x$  axis.

anisotropy, the magnetization direction switches under the influence of the exchange bias field.

We consider two ME based devices – ME-MTJ [126] and ME-XNOR [18, 126], with focus on memory applications. ME-MTJ consists of an MTJ in contact with an ME oxide underlayer as shown in Fig. 8.1(a). The MTJ itself is composed of a *pinned layer* (PL), a *free layer* (FL) and an oxide spacer (usually MgO [66]). Depending on the orientations of the free and the pinned layer, the ME-MTJ can be in either low resistance parallel (P) state or high resistance anti-parallel (AP) state. The normalized difference in the resistances of the AP and P state is expressed by the tunnel magneto-resistance (TMR) ratio of the MTJ.
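For reference, the TMR ratio mentioned above is commonly written as follows. The normalization by the parallel resistance is the usual convention and is assumed here, since the text does not state it explicitly; $R_P$ and $R_{AP}$ denote the MTJ resistances in the P and AP states.

$$\mathrm{TMR} = \frac{R_{AP} - R_P}{R_P}$$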

In order to switch the ME-MTJ from P (AP) to AP (P) state a positive (negative) voltage exceeding a certain threshold needs to be applied on terminal 1 in Fig. 8.1(a). The metal contact to the ME oxide, the ME oxide itself and the free layer of the MTJ



Fig. 8.2. (a) Switching probability versus voltage applied across the ME capacitor. It can be seen that the larger the ME co-efficient, the lower the voltage required to switch the direction of magnetization. (b) The failure probability versus voltage. Each point on the graph was obtained from 1,000 simulations of the stochastic LLG equation. The voltage was applied for a duration of 500ps, and the state of the magnet was examined after the voltage pulse to verify whether the magnet had switched within the applied pulse duration.

can be considered as a capacitor. On the other hand, the value stored in the ME-MTJ can be read by sensing the resistance between terminals 1 and 2.

In Fig. 8.1(b) we show the ME-XNOR device. The ME-XNOR device consists of two free layers separated by MgO and in contact with their respective ME oxides. If the voltage polarities on terminals 1 and 2 are the same, the MTJ stack would be in the P state (measured between terminals 3 and 4), while different voltage polarities on the two terminals would lead to the AP state. Thus, the device emulates an XNOR functionality. The ME-XNOR device has been used for logic applications [18] and has been described in chapter 6. In this chapter, we show that the ME-XNOR device can be used to construct an energy efficient CAM. The device model used for analyzing both the ME-MTJ and the ME-XNOR device has been presented earlier in chapter 4.
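The XNOR behavior described above reduces to a comparison of the two write polarities, as in the behavioral sketch below. It ignores the switching threshold and the magnetization dynamics, and the function name is purely illustrative.

```python
def me_xnor_state(v1, v2):
    """Return 'P' when both write voltages share a polarity, otherwise 'AP'.

    v1, v2 : voltages applied on terminals 1 and 2 (V); only the signs matter here
    """
    if v1 == 0 or v2 == 0:
        raise ValueError("a non-zero write voltage is required on both terminals")
    return "P" if (v1 > 0) == (v2 > 0) else "AP"

# Truth table: equal polarities give the low resistance P state (logic '1').
truth_table = {
    (v1, v2): me_xnor_state(v1, v2)
    for v1 in (-1.0, 1.0)
    for v2 in (-1.0, 1.0)
}
```

Reading the P state as logic '1' and the AP state as logic '0' gives the XNOR of the two input polarities.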

Table 8.1. Summary of parameters used for our simulations

| Parameters                                      | Value                  |
|-------------------------------------------------|------------------------|
| Magnet Length $(L_{mag})$                       | $45nm \times 2.5$      |
| Magnet Width $(W_{mag})$                        | 45nm                   |
| Magnet Thickness $(t_{FL})$                     | 2.5nm                  |
| ME Oxide Thickness $(t_{ME})$                   | 5nm                    |
| Saturation Magnetization $(M_S)$                | $1257.3 \ KA/m \ [38]$ |
| Gilbert Damping Factor ( $\alpha$ )             | 0.03                   |
| Interface Anisotropy $(K_i)$                    | $1mJ/m^2$ [38]         |
| ME Co-efficient $(\alpha_{ME})$                 | $0.15/c^*ms^{-1}$      |
| Relative Di-electric constant $(\epsilon_{ME})$ | 500 [70]               |
| Temperature $(T)$                               | 300K                   |
| CMOS Technology                                 | 45nm PTM [42]          |

\*c = Speed of light.

# 8.3 Device Characteristics

# 8.3.1 Writability

Writing into ME devices is accomplished by application of appropriate voltages across the ME capacitor. An important parameter that dictates the write voltage and hence the write energy is the magneto-electric co-efficient ( $\alpha_{ME}$ ).  $\alpha_{ME}$  is the ratio of magnetic field generated per unit applied electric field [128]. Experimentally, various ME materials have shown  $\alpha_{ME}$  in the range 0.1/c to 1/c (c is speed of light) [72]. In Fig. 8.2 (a) we plot the switching probability as a function of voltage across the ME capacitor for different values of  $\alpha_{ME}$ . It can be seen, ME materials with high  $\alpha_{ME}$  are desirable for achieving low write energy.
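The role of  $\alpha_{ME}$  can be made concrete with a rough estimate of the effective field produced by a given write voltage. The sketch below assumes the SI convention in which the generated field (in tesla) equals  $\alpha_{ME}$  times the electric field, with  $\alpha_{ME}$  expressed as a multiple of 1/c. The ME oxide thickness is taken from Table 8.1, while the 0.1 V write voltage is a hypothetical example value.

```python
C_LIGHT = 299_792_458.0   # speed of light (m/s)
T_ME = 5e-9               # ME oxide thickness from Table 8.1 (m)

def me_field_tesla(voltage, alpha_over_c, t_me=T_ME):
    """Effective ME field (T) for alpha_ME = alpha_over_c / c, assuming B = alpha_ME * E."""
    e_field = voltage / t_me                  # electric field across the ME oxide (V/m)
    return (alpha_over_c / C_LIGHT) * e_field

V_WRITE = 0.1   # hypothetical write voltage (V)
b_low = me_field_tesla(V_WRITE, 0.15)   # alpha_ME = 0.15/c
b_high = me_field_tesla(V_WRITE, 1.0)   # alpha_ME = 1/c
```

Under these assumptions the field scales linearly with both the voltage and  $\alpha_{ME}$ , so a material with  $\alpha_{ME}$  of 1/c needs roughly 0.15 times the voltage of a 0.15/c material to generate the same field.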

# 8.3.2 Readability

In a memory configuration, a CMOS transistor is used in series with the storage device. Therefore, the bit-cell TMR, *i.e.* the TMR of the device including the series resistance of the CMOS transistor, is a more relevant metric for the sensing margin than the device TMR. In Fig. 8.3(a), we show the bit-cell TMR as a function of MgO thickness, assuming a 45nm PTM [42] transistor in series with varying W/L (width/length) ratios. It can be seen that a higher MgO thickness is required to increase the bit-cell TMR and reduce the parasitic effect of the transistor series resistance [13, 28]. For the ME devices, due to the decoupled read/write paths, the thickness of the MgO oxide can be increased without degrading the write efficiency (which is dictated by the ME oxide). Thus, the decoupled read/write paths of ME devices allow for better sensing due to increased bit-cell TMR. The increase in resistance of the MTJ also helps to reduce read disturb failures, due to the reduced current flowing through the MTJ during the read operation [122].
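The effect of the series transistor on the sensing margin can be sketched with a simple resistive model. The parallel resistances below are the values listed in Table 8.2, but the device TMR of 100% and the 2 kΩ transistor on-resistance are hypothetical, and the voltage dependence captured by the NEGF model is ignored.

```python
def bitcell_tmr(r_p, r_ap, r_transistor):
    """TMR seen at the bit-cell terminals with a series access transistor."""
    return (r_ap - r_p) / (r_p + r_transistor)

T_MGO_NM = [1.0, 1.2, 1.4, 1.6]           # MgO thickness (nm), Table 8.2
R_P_OHM = [469.0, 2090.0, 9310.0, 41400.0]  # parallel resistance (ohm), Table 8.2

DEVICE_TMR = 1.0        # hypothetical device TMR (100%)
R_TRANSISTOR = 2000.0   # hypothetical transistor on-resistance (ohm)

cell_tmr = [
    bitcell_tmr(r_p, r_p * (1.0 + DEVICE_TMR), R_TRANSISTOR)
    for r_p in R_P_OHM
]

for t, tmr in zip(T_MGO_NM, cell_tmr):
    print(f"t_MgO = {t:.1f} nm -> bit-cell TMR = {100.0 * tmr:.0f}%")
```

As the MTJ resistance grows relative to the transistor resistance, the bit-cell TMR approaches the device TMR, which is the trend shown in Fig. 8.3(a).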

The higher MTJ resistance, however, adversely affects the read access speed of the bit-cell. This increase in read delay results from the fact that the RC time constant (where R is the effective bit-cell resistance and C is the bit-line capacitance) increases with the MgO thickness. We estimated the parasitic C for a 128x128 memory sub-array using the tool CACTI [129]. In Fig. 8.3(a), we have plotted the RC time constant for the MTJ in the P and AP states as a function of the oxide thickness. It can be observed that beyond 1.5nm the RC time constant becomes greater than 500ps. Therefore, for fast read operations, the MgO thickness should be kept below 1.5nm. For the results presented later in the chapter, we have kept the MgO thickness



Fig. 8.3. (a) (Left axis) Bit-cell TMR versus MgO thickness obtained from our NEGF based transport model. In each case, a transistor in series is used with Width/Length (W/L) ratio as specified in the figure. (Right axis) The RC time constant as a function of the MgO thickness. (b) A typical 3D switching trajectory of the magnetization under influence of applied voltage.

to 1.4nm. The parallel resistance of the MTJ for some of the typical oxide-thickness values are reported in Table 8.2.

Table 8.2. Variation of MTJ Resistance with  $t_{MgO}$ 

| $t_{MgO}(\text{nm})$    | 1     | 1.2  | 1.4  | 1.6  |
|-------------------------|-------|------|------|------|
| $R_p(\mathbf{k}\Omega)$ | 0.469 | 2.09 | 9.31 | 41.4 |
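The values in Table 8.2 grow roughly exponentially with the MgO thickness, as expected for a tunneling barrier. The following sketch fits a straight line to the logarithm of the tabulated resistances using a plain least-squares formula. It is only a curve fit to the four listed points and is not a substitute for the NEGF transport model.

```python
import math

T_MGO_NM = [1.0, 1.2, 1.4, 1.6]        # MgO thickness (nm), Table 8.2
R_P_KOHM = [0.469, 2.09, 9.31, 41.4]   # parallel resistance (kOhm), Table 8.2

def fit_log_linear(xs, ys):
    """Least-squares fit of ln(y) = intercept + slope * x."""
    n = len(xs)
    ln_y = [math.log(y) for y in ys]
    x_mean = sum(xs) / n
    y_mean = sum(ln_y) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (ly - y_mean) for x, ly in zip(xs, ln_y))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

intercept, slope = fit_log_linear(T_MGO_NM, R_P_KOHM)

# Multiplicative growth of R_P for every additional 0.2 nm of MgO.
factor_per_step = math.exp(0.2 * slope)

def r_p_estimate(t_nm):
    """Interpolated parallel resistance (kOhm) from the fitted exponential."""
    return math.exp(intercept + slope * t_nm)
```

The fit indicates that each additional 0.2 nm of MgO multiplies the parallel resistance by a factor between 4 and 5, which explains why the RC time constant rises so quickly with thickness.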

# 8.3.3 Switching Speed

Although the detailed switching dynamics of ME devices are still under investigation [128], ME switching is expected to be much faster than STT switching [72]. This is because the ME switching dynamics behave as if the magnetization direction is being switched by an external field, which does not



Fig. 8.4. 1-Read / 1-Write dual port memory using the decoupled read/write paths of ME-MTJs. The top row of ME-MTJs is being written into by activating the WWL, while the bottom row of ME-MTJs can be simultaneously read by activating the RWL.

require an incubation delay [130] to initiate the switching process. In Fig. 8.3(b) we have shown a typical 3D trajectory of the ME switching mechanism. It can be seen if the applied electric field is strong enough, the magnetization vector starts switching without any initial incubation delay. In our simulations for an  $\alpha_{ME}$  of 1/c, complete reversal was obtained within 500ps.

#### 8.4 ME Memory Design

#### 8.4.1 ME Dual Port Memory

The proposed 1-Read / 1-Write dual port memory using ME-MTJs is shown in Fig. 8.4. Each bit-cell consists of one ME-MTJ and two transistors. The transistors connected to the WWLs are the write transistors and those connected to the RWLs are the read transistors. Data can be written into the ME-MTJs by activating the write transistors of a particular row and applying appropriate write voltages (positive or negative) on the WBLs. Similarly, for reading out the data, the read transistors of a given row are activated and a read voltage is applied on the RBLs. The current flowing through the bit-cell is then compared with a reference to sense the current state of the ME-MTJ.

A dual port memory is characterized by simultaneous read and write operations, *i.e.* while one row of the memory array is being read, another row of the memory array can be written into at the same time, thereby improving the memory throughput [131]. The dual port nature of the proposed ME-MTJ memory can be explained as follows. Let us assume row-1 in Fig. 8.4 is being written into. The write transistors corresponding to row-1 are activated and, by application of proper voltages on the WBLs, a P or an AP state can be written into the ME-MTJs. Simultaneously, the read transistors corresponding to row-2 are activated and, by sensing the current flowing through the RBLs, the state of the ME-MTJs connected to row-2 can be sensed. Thus, ME-MTJs can be used to construct dual port memories, thereby increasing the memory throughput. Our simulations indicate a write energy consumption per bit of 0.072 fJ for  $\alpha_{ME} = 1/c$  and a read energy consumption of 3.6 fJ for a read voltage of 200 mV and a read time of 1 ns. For the present proposal, ME switching enables two orders of magnitude improvement in write energy and 8x improvement in switching speed as compared to STT based MTJs [132], in addition to improved TMR and throughput.
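As a rough consistency check on the read energy quoted above, the read path can be treated as a single resistor dissipating V²/R for the duration of the read. This is a back-of-the-envelope sketch and not the simulation flow used in this work. It neglects bit-line charging and sensing circuitry, and the interpretation of the implied resistance is an assumption.

```python
def read_energy(v_read, r_cell, t_read):
    """Energy dissipated in a purely resistive read path: E = V^2 * t / R."""
    return v_read ** 2 * t_read / r_cell

def implied_resistance(v_read, energy, t_read):
    """Effective resistance that would dissipate `energy` under the same model."""
    return v_read ** 2 * t_read / energy

V_READ = 0.2       # read voltage from the text (V)
T_READ = 1e-9      # read time from the text (s)
E_READ = 3.6e-15   # reported read energy (J)

r_implied = implied_resistance(V_READ, E_READ, T_READ)
print(f"implied read-path resistance: {r_implied / 1e3:.1f} kOhm")
```

The implied resistance is about 11 kΩ, which is of the same order as the 9.31 kΩ parallel resistance listed in Table 8.2 for 1.4 nm of MgO with some additional series resistance.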

## 8.4.2 ME CAM

The ME-XNOR based CAM cell is shown in Fig. 8.5 (a). The function of M1 is to selectively provide the ME-oxide capacitor with a ground connection when Data Input Line ( $D_{in}$ ) is activated. In the read circuit, a reference MTJ ( $Ref_{MTJ}$ ) forms a voltage divider with the resistance of the MTJ ( $R_{MTJ}$ ). The match signal is obtained at the drain of p-MOS M2 (denoted by node  $\overline{match}$ ), where a low voltage indicates a match is obtained and vice-versa. The node  $\overline{match}$  is pre-charged to  $V_{DD}$ . The strengths of the n-MOS and the p-MOS transistors, connected to the  $\overline{match}$  line, are adjusted such that even one activated p-MOS in a row is enough to maintain the output node in its pre-charged state.

The operation of the circuit can be divided into three modes: i) Write Mode, ii) Data Input Mode and iii) Read Mode. To write data in the lower (upper) ferromagnet, a write pulse corresponding to bit '1' (positive voltage) or '0' (negative voltage) is applied on the BL  $(D_{in})$  with the WL (DWL) activated. If the digital bit written in the lower ferromagnet is the same as the data to be matched (stored in the upper ferromagnet), the MTJ switches to the low resistance state. Finally, in the read mode, a read pulse of 1 V ( $V_{READ}$ ) is applied. The output of the inverter goes 'high' only if the MTJ is in the low resistance state, indicating that the bit written in the top magnet in mode (i) matches the bit stored in the bottom magnet. Matching of all bits in a row turns all the p-MOS transistors OFF and  $\overline{match}$  goes low, indicating that a match is found. Note that the n-MOS and p-MOS widths are chosen such that even with one activated p-MOS, the n-MOS transistor is not able to pull down the match line. This in turn ensures that the match line is discharged only when all the bits in the word match, enabling the CAM operation. The write and read energies per bit were found to be 0.072 fJ and 15 fJ, respectively, indicating two orders of magnitude improvement in write energy and comparable read energy with respect to previous work [133] (Table 8.3).
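The match-line behavior described above can be summarized by a purely logical model. The sketch below is a behavioral abstraction only: it ignores transistor sizing, pre-charge timing and analog levels, and the function names `match_bar` and `search` are hypothetical.

```python
def xnor(a, b):
    """Bit-wise XNOR: 1 when the stored bit equals the search bit."""
    return 1 if a == b else 0


def match_bar(stored_word, search_word):
    """Logic level of the active-low match line for one CAM row (0 means match)."""
    if len(stored_word) != len(search_word):
        raise ValueError("stored and search words must have the same width")
    # A mismatching cell keeps its p-MOS ON, which holds match_bar at its
    # pre-charged (high) level regardless of the other cells in the row.
    pmos_on = [xnor(s, d) == 0 for s, d in zip(stored_word, search_word)]
    return 1 if any(pmos_on) else 0


def search(cam_rows, search_word):
    """Indices of all rows whose match_bar line goes low."""
    return [i for i, row in enumerate(cam_rows) if match_bar(row, search_word) == 0]


cam = [
    [1, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
]
print(search(cam, [1, 0, 1, 1]))  # only row 0 matches in every bit position
```

A single mismatching bit is enough to keep the line high, which is the role of the transistor sizing constraint in the actual circuit.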



Fig. 8.5. Proposed CAM based on ME-XNOR device. The upper and lower ferromagnets comprising the ME-XNOR device can be used to store the input data and the data to be matched, respectively. The  $\overline{match}$  signal goes low if and only if all the p-MOSes of a particular row are turned OFF.

| Memory Type        | STT [134]      | VCMA [133] | ME-XNOR      |
|--------------------|----------------|------------|--------------|
| Structure          | 9T-2MTJ        | 4T-2MTJ    | 5T-1MeXNOR   |
| Read/Write Voltage | 1 V / 1 V      | 1 V / 1 V  | 1 V / 0.2 V  |
| Write Speed        | 2 ns$^+$       | 1 ns       | 0.5 ns       |
| Search Speed       | 0.1 ns         | 0.2 ns     | 0.8 ns       |
| Switching Energy   | 100 fJ/bit$^+$ | 10 fJ/bit  | 0.072 fJ/bit |

Table 8.3. Comparison of the proposed ME-XNOR CAM

### 8.5 Summary

The prospect of achieving voltage driven switching of magnetization has renewed interest in future low-power non-volatile spintronic memories. It is intuitive to expect lower write energy as compared to current based counterparts of spintronic devices. In this chapter we not only quantify the energy and speed benefits obtained by use of ME based devices, but also propose novel memory applications using ME-MTJs and ME-XNOR devices. Specifically, we show that the decoupled read-write ports of ME-MTJs can be used to construct a 1-Read / 1-Write dual port memory, thereby increasing the overall memory throughput. The presented memory array supports simultaneous read and write operations on two different rows of the memory array. Additionally, we have presented a CAM based on the ME-XNOR device. The proposed CAM requires fewer transistors due to the compact XNOR operation enabled by the ME-XNOR device, resulting in an area-efficient as well as energy-efficient CAM.

## 9. SUMMARY AND FUTURE WORK

In the quest for energy-efficient switching of spin devices, intensive research is being pursued on voltage driven and voltage assisted switching mechanisms. In this research, we have focused on two such voltage effects: the voltage controlled magnetic anisotropy (VCMA) effect and the magneto-electric (ME) effect. By exploiting the unique physics of the VCMA mechanism we have presented *in-situ*, in-memory logic computations using 'stateful' devices. Our proposal does not require any modification of either the magnetic device or the bit-cell circuit, making it attractive from a manufacturability point of view. It is worth mentioning that stateful logic operations are a promising technique to overcome the well-known von-Neumann bottleneck.

In addition, the switching of a mono-domain magnet through the ME effect has been used to construct a stochastic-leaky-integrate-fire neuronal device mimicking the dynamics of biological neurons. We believe the emulation of four characteristics of a biological neuron, *viz*. the stochastic, the leaky, the integrate and the fire dynamics, in a single device would pave the way for efficient hardware implementations for a wider class of classification and recognition tasks. Finally, we have also demonstrated that the pure voltage driven nature of the ME effect allows one to create an entire family of logic gates including the XNOR, IMP, NAND, and NOR gates that can be easily cascaded. Such non-volatile logic gates are becoming increasingly important with the current emphasis on intermittently powered systems and IoT applications. We have also leveraged the availability of the ME-XNOR device to construct an area and energy efficient content addressable memory.

Furthermore, we have exploited pure voltage driven magnetic domain wall motion for constructing a neuro-synaptic device. Our proposal is based on recent experimental studies that have convincingly demonstrated that by elastically coupling a ferromagnetic domain wall to an underlying ferro-electric domain wall, the ferro-magnetic domain wall can be moved by application of an electric field. The applied electric field moves the ferro-electric domain wall, which in turn drags the ferro-magnetic domain wall due to elastic coupling, thus allowing ultra-low-energy movement of the ferro-magnetic domain wall. The controlled voltage driven bi-directional motion of the domain wall enables embedding *leaky* behavior in the proposed device without resorting to an energy-expensive static flow of current. Such *leaky* behavior has been shown to be important in mimicking bio-plausible neuronal behavior as well as synaptic models for short-term memory.

Finally, we believe the various proposals presented as a part of this research open up new possibilities for further work. For example, it has been shown mathematically that stochastic-leaky-integrate-fire neurons can be used to mimic computations in accordance with Bayesian inference. ME based neurons mimic all four required behaviors, namely the stochastic, the leaky, the integrate and the fire dynamics, in a single device, allowing one to build low-energy hardware implementations of such inference engines. Further, the proposed CAM cell using ME-XNOR was designed to give a binary *match* or *no-match* output. Interestingly, binary neural networks require binary dot-products, which consist of bit-wise XORs followed by counting the number of 1's. This can be achieved in the proposed CAM cell by sensing an analog voltage at the match-line instead of making a binary decision. The analog voltage proportional to the number of 1's can be converted to a digital value, thereby enabling approximate binary dot products. As such, the ME-XNOR device in the proposed CAM-like configuration, with suitable modifications, can be used to perform in-memory acceleration of binary neural networks.
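The relation between bit-wise XOR counting and a binary dot product can be made concrete with a short sketch. It assumes the common encoding in which bit 1 represents +1 and bit 0 represents -1; this encoding and the function names are illustrative assumptions, and the code models only the exact digital result, not the approximate analog sensing discussed above.

```python
def popcount_xor(a_bits, b_bits):
    """Number of positions where the two bit vectors differ (count of 1's after XOR)."""
    return sum(a ^ b for a, b in zip(a_bits, b_bits))


def binary_dot(a_bits, b_bits):
    """Dot product of two {-1, +1} vectors stored as bits (1 -> +1, 0 -> -1).

    Each matching position contributes +1 and each mismatch contributes -1,
    so dot = (n - mismatches) - mismatches = n - 2 * mismatches.
    """
    n = len(a_bits)
    return n - 2 * popcount_xor(a_bits, b_bits)


def reference_dot(a_bits, b_bits):
    """Direct evaluation in the {-1, +1} domain, used only as a cross-check."""
    return sum((2 * a - 1) * (2 * b - 1) for a, b in zip(a_bits, b_bits))


a = [1, 0, 1, 1, 0]
b = [1, 1, 0, 1, 0]
print(binary_dot(a, b), reference_dot(a, b))
```

Counting XNOR matches instead gives the same result through dot = 2 × matches - n, since matches = n - mismatches.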

# A. APPENDIX

### A.1 Introduction

True Random Number Generators (TRNGs) are becoming increasingly popular in cryptography and other security applications. However, conventional hardware TRNG designs often incur high area and power consumption [135], and hence recent research efforts have been directed to developing compact, low power and high throughput TRNGs based on emerging technologies like the Magnetic Tunnel Junction (MTJ) "spin-dice" [136–138]. The random number generation process usually takes place through the application of two current pulses, namely the "reset" pulse to orient the magnet to a known initial state and subsequently the "roll" pulse to switch the magnet with a probability of 0.5. The stochastic switching nature of the MTJ arises from the inherent thermal noise present in the device. However, the quality of the random numbers generated is not sufficiently high due to variations in the magnitude of the current required to switch the MTJ with 50% probability (arising from PVT variations). Hence expensive post-processing schemes are usually required [136]. In this work, we explore the design of a Voltage Controlled Spin-Dice (VC-SD) using the recently discovered phenomenon of Voltage Controlled Magnetic Anisotropy (VCMA) in an MTJ structure to orient the ferromagnet along a metastable magnetization direction and subsequently utilizing thermal noise to produce random switching of the magnet to either one of the stable magnetization directions. In addition to power and reliability benefits, the proposed TRNG is able to provide better resiliency against PVT variations.



Fig. A.1. Schematic of an STT-MRAM bit cell being utilized as VC-SD. The bit-cell consists of the MTJ in series with an access transistor. The proposed TRNG can be implemented using a standard STT-MRAM array. The operation consists of "Reset", "Relax" and "Read" operations. The corresponding control signals WL, BL and SL have been shown



Fig. A.2. Magnetization dynamics of the same VC-SD device for two different simulation runs

## A.2 Proposed Spin Dice

The basic spin-dice operation proposed in this work is based on the principle of VCMA, where application of a voltage pulse across an MTJ with a "free layer" (FL) possessing interfacial perpendicular magnetic anisotropy results in the reduction of the interfacial anisotropy field. Different mechanisms, like the relative occupancy of d-orbital electrons [30], have been proposed as the physical processes that lead to the VCMA effect. Due to the reduced anisotropy in the perpendicular direction, the FL magnetization tries to orient in the in-plane, i.e. "hard-axis", direction (due to the in-plane component of the demagnetization/external field). For short voltage pulses, the FL magnetization is not oriented sufficiently along the "hard-axis" and hence the switching probability of the MTJ is biased toward one of the stable magnetization directions, depending on whether the FL magnetization had an "up-spin" or "down-spin" component along with the "in-plane" component. However, as the applied pulse duration increases, the major component of the magnetization begins to orient along the "hard-axis", resulting in  $\sim 50\%$  probability of switching to either the Anti-parallel (AP) or Parallel (P) MTJ state. After the "reset" phase (magnetization oriented along the "hard-axis"), the magnetization relaxes to either of the two stable states with a characteristic time constant,  $\tau_D = \frac{1+\alpha^2}{\alpha\gamma H_K}$ , where  $\alpha$  is Gilbert's damping factor,  $\gamma$  is the gyromagnetic ratio of the electron and  $H_K$  is the effective magnetic anisotropy field. The "relax" phase was taken to be  $3\tau_D$  in duration, followed by the "read" phase. The MTJ structure and operation of an array of such VC-SDs are depicted in Fig. A.1. Magnetization dynamics of the same device for two different simulation runs are shown in Fig. A.2. A modified version of the Landau-Lifshitz-Gilbert-Slonczewski (LLGS) [34] equation was utilized in this work for modeling the MTJ dynamics in the presence of the VCMA effect. Further, the LLGS simulation framework was coupled with a Non-Equilibrium Green's Function (NEGF) [40] based transport simulation framework to model electron transport in the MTJ. The simulation framework was calibrated to experimental results reported in Ref. [15] for a CoFeB/MgO/CoFeB MTJ stack (Fig. A.3). The simulation parameters have been outlined in Table A.2.
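The relaxation time constant $\tau_D = \frac{1+\alpha^2}{\alpha\gamma H_K}$ can be evaluated numerically as in the sketch below. The damping factor is taken from the parameter table, but the effective anisotropy field is not stated in this appendix, so the value of 0.15 T used here is purely an assumed placeholder, as is the function name. The field is expressed as $\mu_0 H_K$ in tesla so that $\gamma$ can be used in rad/(s·T).

```python
GAMMA = 1.76e11  # rad/(s*T), electron gyromagnetic ratio (approximate)


def relaxation_time(alpha, mu0_hk_tesla, gamma=GAMMA):
    """tau_D = (1 + alpha^2) / (alpha * gamma * mu0*H_K), with the field in tesla."""
    return (1.0 + alpha ** 2) / (alpha * gamma * mu0_hk_tesla)


alpha = 0.075   # Gilbert damping factor from the parameter table
mu0_hk = 0.15   # T, ASSUMED effective anisotropy field (not given in the text)

tau_d = relaxation_time(alpha, mu0_hk)
relax_phase = 3.0 * tau_d  # the text takes the "relax" phase as 3 * tau_D

print(f"tau_D       = {tau_d * 1e9:.2f} ns")
print(f"relax phase = {relax_phase * 1e9:.2f} ns")
```

With this assumed field, $\tau_D$ comes out near 0.5 ns, which is the order of magnitude implied by the 1.5 ns relax phase used in the results. Since $\tau_D$ scales inversely with both $\alpha$ and $H_K$, a weaker effective anisotropy lengthens the relax phase proportionally.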

| Parameters                       | Value                                               |
|----------------------------------|-----------------------------------------------------|
| Free layer area                  | $\frac{\pi}{4} \times 100 \times 40\ \mathrm{nm}^2$ |
| Free layer thickness             | 0.9 nm                                              |
| Saturation Magnetization, $M_S$  | 1257.3 kA/m [38]                                    |
| Gilbert Damping Factor, $\alpha$ | 0.075                                               |
| VCMA Coefficient, $\xi$          | 350 fJ/(V·m)                                        |
| MgO Thickness, $t$               | 1.4 nm                                              |
| Interface anisotropy, $K_i$      | 0.9267 mJ/m$^2$                                     |
| MTJ "Reset" voltage              | 0.75 V                                              |
| CMOS technology                  | 45 nm SOI CMOS                                      |
| Pulse width, $t_{PW}$            | 1-6 ns                                              |
| Temperature, $T$                 | 300, 400, 500 K                                     |
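To see how the tabulated parameters lead to a "hard-axis" reorientation, the sketch below uses a commonly adopted linear VCMA model, $K_i(V) = K_i - \xi V / t$, together with a thin-film demagnetization term $\frac{1}{2}\mu_0 N_z M_S^2 t_{FL}$. Both the linear model with this sign convention and the choice $N_z = 1$ are simplifying assumptions for illustration; the calibrated simulation framework used in this work may differ.

```python
import math

MU0 = 4e-7 * math.pi  # T*m/A, vacuum permeability


def interface_anisotropy(k_i, xi, voltage, t_ox):
    """Assumed linear VCMA model: K_i(V) = K_i - xi * V / t_ox, in J/m^2."""
    return k_i - xi * voltage / t_ox


def net_perpendicular_anisotropy(k_i_eff, m_s, t_fl, n_z=1.0):
    """Areal anisotropy left after subtracting thin-film demagnetization, in J/m^2.

    Positive: perpendicular easy axis. Negative: the in-plane direction is favored.
    """
    return k_i_eff - 0.5 * MU0 * n_z * m_s ** 2 * t_fl


# Values from the parameter table
K_I = 0.9267e-3   # J/m^2
XI = 350e-15      # J/(V*m)
T_OX = 1.4e-9     # m
M_S = 1257.3e3    # A/m
T_FL = 0.9e-9     # m
V_RESET = 0.75    # V

k_i_reset = interface_anisotropy(K_I, XI, V_RESET, T_OX)
net_zero_bias = net_perpendicular_anisotropy(K_I, M_S, T_FL)
net_reset = net_perpendicular_anisotropy(k_i_reset, M_S, T_FL)

print(f"K_i at reset voltage  : {k_i_reset * 1e3:.4f} mJ/m^2")
print(f"net anisotropy at 0 V : {net_zero_bias * 1e3:+.4f} mJ/m^2")
print(f"net anisotropy, reset : {net_reset * 1e3:+.4f} mJ/m^2")
```

With $N_z = 1$, the net anisotropy is slightly positive at zero bias and becomes negative at the 0.75 V reset voltage, which is the qualitative behavior the "reset" phase relies on. A finite-size free layer has $N_z < 1$, so the actual margins would differ from this estimate.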

### A.3 Results

Fig. A.4 demonstrates the SD trajectory for a particular sample run. In order to evaluate the performance of the proposed VC-SD, we performed 500 stochastic LLG simulations with varying "reset" pulse width. As can be seen from Fig. A.5, the switching probability achieves a value of ~ 0.45 (10% offset from the ideal value of 0.5) for sufficiently large values of the "reset" pulse width,  $t_{PW}$ . The offset is due to the effect of STT induced by current flowing through the MTJ "pinned layer" (PL). However, such an offset can be easily removed by standard post-processing techniques like von-Neumann's algorithm [136]. Additionally, further optimization of material parameters can be performed to reduce the impact of STT during the "hard-axis" orientation of the magnet. In order to assess the impact of PVT variations on the randomness of the proposed VC-SD, we performed the analysis for extreme cases of ±5% variation in the FL area and ±2% variation in the FL thickness. As can be observed from Fig. A.5, the randomness remains almost similar to the nominal value. The same discussion holds true even for variation in the temperature (Fig. A.5). In contrast, the standard STT-MRAM SD exhibits randomness offsets of 54% and 24% (on either side of the optimized randomness value of 0.5) for similar variations in device dimensions and temperature, respectively. Considering the cut-off "reset" cycle duration as 3 ns and the "relax" cycle duration as 1.5 ns ( $3\tau_D$ ), the VC-SD produces one random bit every 4.5 ns. The associated "reset" energy consumption is estimated to be 77 fJ per random bit, which is a 36% improvement in comparison to the standard MTJ-SD [137]. To conclude, the potential benefits offered by the proposed TRNG may be summarized as follows: (a) random number generation in the proposed SD takes place by the application of a single voltage pulse, in contrast to the "reset" and "roll" pulses in the conventional SD, (b) higher quality random numbers can be generated by ensuring sufficient duration of the "reset" voltage pulse even in the presence of PVT variations, in comparison to the standard MTJ-SD, (c) low energy consumption and high throughput can be achieved, (d) the voltage effect allows the usage of a thicker oxide, thereby enhancing not only the MTJ reliability but also the robustness of the read operation (by reducing the read disturb as well as read decision failures), and (e) due to the reduced operational current requirement, a smaller access transistor leads to lower cell-area as compared to the conventional SD.

Fig. A.3. Our benchmarked results for (a) only VCMA-induced switching, and (b) combined VCMA and STT switching. (c) NEGF results obtained from our transport model. We have matched the parallel and anti-parallel resistance to the reported values of  $11\,k\Omega$  and  $25\,k\Omega$ , respectively. All the benchmarking is done with respect to the experiment [38]. The MTJ is of circular cross-sectional area with diameter 40 nm and FL thickness 0.9 nm. The oxide thickness is 1.3 nm. An external field of magnitude 31 mT is applied to provide the necessary in-plane magnetic field. It is worth noting here that the external field was only considered during the benchmarking process. For VC-SD operation, no external field was required since the in-plane magnetic field for "hard-axis" orientation was provided by the demagnetization field of the magnet. The MTJ operating voltage is 0.7 V.
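The von-Neumann post-processing step mentioned in this section can be sketched as follows. The sketch assumes the raw bits are independent with a fixed probability of producing a 1, which is an idealization of the device behavior; the function names and the 0.45 bias used in the demonstration are illustrative.

```python
import random


def von_neumann_extract(bits):
    """Von Neumann corrector: read non-overlapping pairs, emit the first bit of
    each unequal pair (01 -> 0, 10 -> 1) and discard the equal pairs (00, 11)."""
    out = []
    for i in range(0, len(bits) - 1, 2):
        a, b = bits[i], bits[i + 1]
        if a != b:
            out.append(a)
    return out


def expected_yield(p):
    """Expected output bits per raw bit for independent raw bits with P(1) = p."""
    return p * (1.0 - p)


# Demonstration with a simulated biased source (P(1) = 0.45, as in Fig. A.5)
rng = random.Random(0)
raw = [1 if rng.random() < 0.45 else 0 for _ in range(200_000)]
clean = von_neumann_extract(raw)

print(f"raw mean   : {sum(raw) / len(raw):.3f}")
print(f"clean mean : {sum(clean) / len(clean):.3f}")
print(f"yield      : {len(clean) / len(raw):.3f} (expected {expected_yield(0.45):.4f})")
```

The corrector removes the bias at the cost of throughput: for a switching probability of 0.45, only about a quarter of the raw bits survive, so at 4.5 ns per raw bit the effective time per unbiased bit would be roughly 18 ns under these assumptions.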



Fig. A.4. A sample SD trajectory for the proposed TRNG. The magnetization switches to "hard-axis" and subsequently relaxes to one of the stable magnetization states



Fig. A.5. Switching probability (measured over 500 independent stochastic LLG simulations) for varying "reset" pulse width (1-6ns). The randomness offset remains limited within reasonable bounds (< 10%) even with (a) variations in cross-sectional area (5%) and thickness (2%), and (b) temperature
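When reading switching probabilities estimated from 500 runs, it helps to keep the sampling uncertainty in mind. The sketch below uses the normal-approximation (Wald) interval for a binomial proportion; the choice of this interval and the function names are assumptions made for illustration, since no error bars are reported for the simulations.

```python
import math


def standard_error(p_hat, n_runs):
    """Standard error of a proportion estimated from n_runs independent trials."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n_runs)


def normal_interval(p_hat, n_runs, z=1.96):
    """Approximate 95% confidence interval (normal approximation)."""
    half_width = z * standard_error(p_hat, n_runs)
    return p_hat - half_width, p_hat + half_width


p_hat = 0.45   # switching probability read from Fig. A.5
n_runs = 500   # number of stochastic LLG simulations per point

se = standard_error(p_hat, n_runs)
low, high = normal_interval(p_hat, n_runs)
print(f"standard error : {se:.4f}")
print(f"95% interval   : [{low:.3f}, {high:.3f}]")
```

For 500 runs the standard error is about 0.022, so differences of a few percent between curves are comparable to the sampling noise of a single point.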


#### REFERENCES

- [1] A. H. Compton, "The magnetic electron," *Journal of the Franklin Institute*, vol. 192, no. 2, pp. 145–155, 1921.
- [2] P. A. Dirac, "The quantum theory of the electron," in Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences, vol. 117, no. 778. The Royal Society, 1928, pp. 610–624.
- [3] B. Friedrich and D. Herschbach, "Stern and gerlach: How a bad cigar helped reorient atomic physics," *Physics Today*, vol. 56, no. 12, pp. 53–59, 2003.
- [4] M. N. Baibich, J. M. Broto, A. Fert, F. N. Van Dau, F. Petroff, P. Etienne, G. Creuzet, A. Friederich, and J. Chazelas, "Giant magnetoresistance of (001) fe/(001) cr magnetic superlattices," *Physical review letters*, vol. 61, no. 21, p. 2472, 1988.
- [5] J. C. Slonczewski, "Current-driven excitation of magnetic multilayers," *Journal of Magnetism and Magnetic Materials*, vol. 159, no. 1, pp. L1–L7, 1996.
- [6] Y. Huai, "Spin-transfer torque mram (stt-mram): Challenges and prospects," *AAPPS bulletin*, vol. 18, no. 6, pp. 33–40, 2008.
- [7] B. Behin-Aein, D. Datta, S. Salahuddin, and S. Datta, "Proposal for an all-spin logic device with built-in memory," *Nature nanotechnology*, vol. 5, no. 4, pp. 266–270, 2010.
- [8] S. Jain, A. Ranjan, K. Roy, and A. Raghunathan, "Computing in memory with spin-transfer torque magnetic ram," *arXiv preprint arXiv:1703.02118*, 2017.
- [9] D. Fan, S. Maji, K. Yogendra, M. Sharad, and K. Roy, "Injection-locked spin hall-induced coupled-oscillators for energy efficient associative computing," *IEEE Transactions on Nanotechnology*, vol. 14, no. 6, pp. 1083–1093, 2015.
- [10] A. Sengupta, P. Panda, P. Wijesinghe, Y. Kim, and K. Roy, "Magnetic tunnel junction mimics stochastic cortical spiking neurons," arXiv preprint arXiv:1510.00440, 2015.
- [11] Y. Shim, A. Jaiswal, and K. Roy, "Ising computation based combinatorial optimization using spin-hall effect (she) induced stochastic magnetization reversal," *Journal of Applied Physics*, vol. 121, no. 19, p. 193902, 2017.
- [12] R. Andrawis, A. Jaiswal, and K. Roy, "Design and comparative analysis of spintronic memories based on current and voltage driven switching," *IEEE Transactions on Electron Devices*, 2018.

- [13] A. Jaiswal, X. Fong, and K. Roy, "Comprehensive scaling analysis of current induced switching in magnetic memories based on in-plane and perpendicular anisotropies," *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, vol. 6, no. 2, pp. 120–133, June 2016.
- [14] M. Fiebig, "Revival of the magnetoelectric effect," Journal of Physics D: Applied Physics, vol. 38, no. 8, p. R123, 2005.
- [15] S. Kanai, Y. Nakatani, M. Yamanouchi, S. Ikeda, H. Sato, F. Matsukura, and H. Ohno, "Magnetization switching in a cofeb/mgo magnetic tunnel junction by combining spin-transfer torque and electric field-effect," *Applied Physics Letters*, vol. 104, no. 21, p. 212406, 2014.
- [16] S. Sharmin, A. Jaiswal, and K. Roy, "Modeling and design space exploration for bit-cells based on voltage-assisted switching of magnetic tunnel junctions," *IEEE Transactions on Electron Devices*, vol. 63, no. 9, pp. 3493–3500, 2016.
- [17] A. Jaiswal, A. Agrawal, and K. Roy, "In-situ, in-memory stateful vector logic operations based on voltage controlled magnetic anisotropy," *Scientific reports*, vol. 8, no. 1, p. 5738, 2018.
- [18] A. Jaiswal and K. Roy, "Mesl: Proposal for a non-volatile cascadable magnetoelectric spin logic," *Scientific Reports*, vol. 7, 2017.
- [19] A. Jaiswal, S. Roy, G. Srinivasan, and K. Roy, "Proposal for a leaky-integrate-fire spiking neuron based on magnetoelectric switching of ferromagnets," *IEEE Transactions on Electron Devices*, 2017.
- [20] A. Jaiswal, I. Chakraborty, and K. Roy, "A non-volatile cascadable magnetoelectric material implication logic," in 2017 75th Annual Device Research Conference (DRC). IEEE, 2017, pp. 1–2.
- [21] A. Jaiswal, A. Agrawal, P. Panda, and K. Roy, "Voltage-driven domain-wall motion based neuro-synaptic devices for dynamic on-line learning," arXiv preprint arXiv:1705.06942, 2017.
- [22] A. Jaiswal, I. Chakraborty, and K. Roy, "Energy-efficient memory using magneto-electric switching of ferromagnets," *IEEE Magnetics Letters*, vol. 8, pp. 1–5, 2017.
- [23] A. Sengupta, A. Jaiswal, and K. Roy, "True random number generation using voltage controlled spin-dice," in 2016 74th Annual Device Research Conference (DRC). IEEE, 2016, pp. 1–2.
- [24] M. D. Stiles and A. Zangwill, "Anatomy of spin-transfer torque," *Physical Review B*, vol. 66, no. 1, p. 014407, 2002.
- [25] A. Jaiswal, X. Fong, and K. Roy, "Comprehensive scaling analysis of current induced switching in magnetic memories based on in-plane and perpendicular anisotropies," *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, vol. 6, no. 2, pp. 120–133, 2016.
- [26] J. G. Alzate, P. K. Amiri, P. Upadhyaya, S. Cherepov, J. Zhu, M. Lewis, R. Dorrance, J. Katine, J. Langer, K. Galatsis *et al.*, "Voltage-induced switching of nanoscale magnetic tunnel junctions," in *Electron Devices Meeting (IEDM)*, 2012 IEEE International. IEEE, 2012, pp. 29–5.

- [27] P. Borisov, A. Hochstrat, X. Chen, W. Kleemann, and C. Binek, "Magnetoelectric switching of exchange bias," *Physical Review Letters*, vol. 94, no. 11, p. 117203, 2005.
- [28] P. K. Amiri and K. L. Wang, "Voltage-controlled magnetic anisotropy in spintronic devices," in Spin, vol. 2, no. 03. World Scientific, 2012, p. 1240002.
- [29] J. Zhang, P. V. Lukashev, S. S. Jaswal, and E. Y. Tsymbal, "Model of orbital populations for voltage-controlled magnetic anisotropy in transition-metal thin films," *Physical Review B*, vol. 96, no. 1, p. 014435, 2017.
- [30] K. Kyuno, J.-G. Ha, R. Yamamoto, and S. Asano, "First-principles calculation of the magnetic anisotropy energies of ag/fe (001) and au/fe (001) multilayers," *Journal of the Physical Society of Japan*, vol. 65, no. 5, pp. 1334–1339, 1996.
- [31] W.-G. Wang, M. Li, S. Hageman, and C. Chien, "Electric-field-assisted switching in magnetic tunnel junctions." *Nature materials*, vol. 11, no. 1, 2012.
- [32] J.-Y. Chen, D. Zhang, Z. Zhao, M. Li, and J.-P. Wang, "Field-free spin-orbit torque switching of composite perpendicular cofeb/gd/cofeb layers utilized for three-terminal magnetic tunnel junctions," *Applied Physics Letters*, vol. 111, no. 1, p. 012402, 2017.
- [33] S. Wang, H. Lee, F. Ebrahimi, P. K. Amiri, K. L. Wang, and P. Gupta, "Comparative evaluation of spin-transfer-torque and magnetoelectric random access memory," *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, vol. 6, no. 2, pp. 134–145, 2016.
- [34] T. L. Gilbert, "A phenomenological theory of damping in ferromagnetic materials," *IEEE Transactions on Magnetics*, vol. 40, no. 6, pp. 3443–3449, Nov 2004.
- [35] M. d'Aquino, "Nonlinear magnetization dynamics in thin-films and nanoparticles," Ph.D. dissertation, Università degli Studi di Napoli Federico II, 2005.
- [36] Z. Wang, G. Yu, X. Liu, B. Zhang, X. Chen, and W. Lu, "Magnetization characteristic of ferromagnetic thin strip by measuring anisotropic magnetoresistance and ferromagnetic resonance," *Solid State Communications*, vol. 182, pp. 10–13, 2014.
- [37] W. F. Brown Jr, "Thermal fluctuations of a single-domain particle," Journal of Applied Physics, vol. 34, no. 4, pp. 1319–1320, June 1963.
- [38] S. Ikeda, K. Miura, H. Yamamoto, K. Mizunuma, H. Gan, M. Endo, S. Kanai, J. Hayakawa, F. Matsukura, and H. Ohno, "A perpendicular-anisotropy CoFeB-MgO magnetic tunnel junction," *Nature materials*, vol. 9, no. 9, pp. 721–724, July 2010.
- [39] H. Noguchi, K. Ikegami, K. Abe, S. Fujita, Y. Shiota, T. Nozaki, S. Yuasa, and Y. Suzuki, "Novel voltage controlled mram (vcm) with fast read/write circuits for ultra large last level cache," in *Electron Devices Meeting (IEDM)*, 2016 *IEEE International*. IEEE, 2016, pp. 27–5.

- [40] X. Fong, S. K. Gupta, N. N. Mojumder, S. H. Choday, C. Augustine, and K. Roy, "Knack: A hybrid spin-charge mixed-mode simulator for evaluating different genres of spin-transfer torque mram bit-cells," in 2011 International Conference on Simulation of Semiconductor Processes and Devices. IEEE, 2011, pp. 51–54.
- [41] C. Lin, S. Kang, Y. Wang, K. Lee, X. Zhu, W. Chen, X. Li, W. Hsu, Y. Kao, M. Liu *et al.*, "45nm low power cmos logic compatible embedded stt mram utilizing a reverse-connection 1t/1mtj cell," in *Electron Devices Meeting (IEDM)*, 2009 IEEE International. IEEE, 2009, pp. 1–4.
- [42] Predictive Technology Models.[Online] http://ptm.asu.edu/, 2016.
- [43] C. E. Shannon, "A symbolic analysis of relay and switching circuits," *Electrical Engineering*, vol. 57, no. 12, pp. 713–723, 1938.
- [44] J. Bardeen and W. H. Brattain, "The transistor, a semi-conductor triode," *Physical Review*, vol. 74, no. 2, p. 230, 1948.
- [45] O. Lempel, "2nd generation intel® core processor family: Intel® core i7, i5 and i3," in *Hot Chips 23 Symposium (HCS), 2011 IEEE*. IEEE, 2011, pp. 1–48.
- [46] J. Von Neumann, The computer and the brain. Yale University Press, 2012.
- [47] C. P. Chen and C.-Y. Zhang, "Data-intensive applications, challenges, techniques and technologies: A survey on big data," *Information Sciences*, vol. 275, pp. 314–347, 2014.
- [48] P. G. Emma, "Understanding some simple processor-performance limits," IBM journal of Research and Development, vol. 41, no. 3, pp. 215–232, 1997.
- [49] M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma, M. McCauley, M. J. Franklin, S. Shenker, and I. Stoica, "Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing," in *Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation*. USENIX Association, 2012, pp. 2–2.
- [50] E. Linn, R. Rosezin, S. Tappertzhofen, U. Böttger, and R. Waser, "Beyond von neumann - logic operations in passive crossbar arrays alongside memory operations," *Nanotechnology*, vol. 23, no. 30, p. 305205, 2012.
- [51] M. Kang, M.-S. Keel, N. R. Shanbhag, S. Eilert, and K. Curewitz, "An energy-efficient vlsi architecture for pattern recognition via deep embedding of computation in sram," in *Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on*. IEEE, 2014, pp. 8326–8330.
- [52] K. Roy, S. Mukhopadhyay, and H. Mahmoodi-Meimand, "Leakage current mechanisms and leakage reduction techniques in deep-submicrometer cmos circuits," *Proceedings of the IEEE*, vol. 91, no. 2, pp. 305–327, 2003.
- [53] T. Ghani, K. Mistry, P. Packan, S. Thompson, M. Stettler, S. Tyagi, and M. Bohr, "Scaling challenges and device design requirements for high performance sub-50 nm gate length planar cmos transistors," in VLSI Technology, 2000. Digest of Technical Papers. 2000 Symposium on. IEEE, 2000, pp. 174– 175.

- [54] T. Skotnicki, J. A. Hutchby, T.-J. King, H.-S. Wong, and F. Boeuf, "The end of cmos scaling: toward the introduction of new materials and structural changes to improve mosfet performance," *IEEE Circuits and Devices Magazine*, vol. 21, no. 1, pp. 16–26, 2005.
- [55] B. Govoreanu, G. Kar, Y. Chen, V. Paraschiv, S. Kubicek, A. Fantini, I. Radu, L. Goux, S. Clima, R. Degraeve *et al.*, "10×10nm$^2$ Hf/HfO$_x$ crossbar resistive ram with excellent performance, reliability and low-energy operation," in *Electron Devices Meeting (IEDM), 2011 IEEE International.* IEEE, 2011, pp. 31–6.
- [56] H.-S. P. Wong, S. Raoux, S. Kim, J. Liang, J. P. Reifenberg, B. Rajendran, M. Asheghi, and K. E. Goodson, "Phase change memory," *Proceedings of the IEEE*, vol. 98, no. 12, pp. 2201–2227, 2010.
- [57] K. Nomura, K. Abe, H. Yoda, and S. Fujita, "Ultra low power processor using perpendicular-stt-mram/sram based hybrid cache toward next generation normally-off computers," *Journal of Applied Physics*, vol. 111, no. 7, p. 07E330, 2012.
- [58] W. Kang, H. Wang, Z. Wang, Y. Zhang, and W. Zhao, "In-memory processing paradigm for bitwise logic operations in stt-mram," *IEEE Transactions on Magnetics*, 2017.
- [59] J. Borghetti, G. S. Snider, P. J. Kuekes, J. J. Yang, D. R. Stewart, and R. S. Williams, "'Memristive' switches enable 'stateful' logic operations via material implication," *Nature*, vol. 464, no. 7290, pp. 873–876, 2010.
- [60] H. Zhang, W. Kang, L. Wang, K. L. Wang, and W. Zhao, "Stateful reconfigurable logic via a single-voltage-gated spin hall-effect driven magnetic tunnel junction in a spintronic memory," *IEEE Transactions on Electron Devices*, vol. 64, no. 10, pp. 4295–4301, 2017.
- [61] H. Mahmoudi, T. Windbacher, V. Sverdlov, and S. Selberherr, "High performance mram-based stateful logic," in *Ultimate Integration on Silicon (ULIS)*, 2014 15th International Conference on. IEEE, 2014, pp. 117–120.
- [62] N. A. Spaldin and M. Fiebig, "The renaissance of magnetoelectric multiferroics," *Science*, vol. 309, no. 5733, pp. 391–392, 2005.
- [63] J. Heron, J. Bosse, Q. He, Y. Gao, M. Trassin, L. Ye, J. Clarkson, C. Wang, J. Liu, S. Salahuddin *et al.*, "Deterministic switching of ferromagnetism at room temperature using an electric field," *Nature*, vol. 516, no. 7531, pp. 370–373, 2014.
- [64] T. H. Lahtinen, K. J. Franke, and S. van Dijken, "Electric-field control of magnetic domain wall motion and local magnetization reversal," arXiv preprint arXiv:1109.5514, 2011.
- [65] R. Ramesh, "Electric field control of ferromagnetism using multi-ferroics: the bismuth ferrite story," *Philosophical Transactions of the Royal Society of London A: Mathematical, Physical and Engineering Sciences*, vol. 372, no. 2009, p. 20120437, 2014.

- [66] W. Butler, X.-G. Zhang, T. Schulthess, and J. MacLaren, "Spin-dependent tunneling conductance of fe-mgo-fe sandwiches," *Physical Review B*, vol. 63, no. 5, p. 054416, 2001.
- [67] D. Datta, B. Behin-Aein, S. Salahuddin, and S. Datta, "Quantitative model for tmr and spin-transfer torque in mtj devices," in *Electron Devices Meeting* (*IEDM*), 2010 IEEE International. IEEE, 2010, pp. 22–8.
- [68] A. Aharoni, "Demagnetizing factors for rectangular ferromagnetic prisms," Journal of applied physics, vol. 83, no. 6, pp. 3432–3434, 1998.
- [69] C. Lin, S. Kang, Y. Wang, K. Lee, X. Zhu, W. Chen, X. Li, W. Hsu, Y. Kao, M. Liu *et al.*, "45nm low power cmos logic compatible embedded stt mram utilizing a reverse-connection 1t/1mtj cell," in 2009 IEEE International Electron Devices Meeting (IEDM). IEEE, 2009, pp. 1–4.
- [70] S. Manipatruni, D. E. Nikonov, and I. A. Young, "Spin-orbit logic with magnetoelectric nodes: A scalable charge mediated nonvolatile spintronic logic," arXiv preprint arXiv:1512.05428, 2015.
- [71] W. Scholz, T. Schrefl, and J. Fidler, "Micromagnetic simulation of thermally activated switching in fine particles," *Journal of Magnetism and Magnetic Materials*, vol. 233, no. 3, pp. 296–304, 2001.
- [72] D. E. Nikonov and I. A. Young, "Benchmarking spintronic logic devices based on magnetoelectric oxides," *Journal of Materials Research*, vol. 29, no. 18, pp. 2109–2115, 2014.
- [73] W. Kim, J. Jeong, Y. Kim, W. Lim, J. Kim, J. Park, H. Shin, Y. Park, K. Kim, S. Park *et al.*, "Extended scalability of perpendicular stt-mram towards sub-20nm mtj node," in *Electron Devices Meeting (IEDM)*, 2011 IEEE International. IEEE, 2011, pp. 24–1.
- [74] M. Gajek, J. Nowak, J. Sun, P. Trouilloud, E. Osullivan, D. Abraham, M. Gaidis, G. Hu, S. Brown, Y. Zhu *et al.*, "Spin torque switching of 20 nm magnetic tunnel junctions with perpendicular anisotropy," *Applied Physics Letters*, vol. 100, no. 13, p. 132408, 2012.
- [75] C. Augustine, A. Raychowdhury, B. Behin-Aein, S. Srinivasan, J. Tschanz, V. K. De, and K. Roy, "Numerical analysis of domain wall propagation for dense memory arrays," in *Electron Devices Meeting (IEDM)*, 2011 IEEE International. IEEE, 2011, pp. 17–6.
- [76] J. Li, B. Nagaraj, H. Liang, W. Cao, C. H. Lee, and R. Ramesh, "Ultrafast polarization switching in thin-film ferroelectrics," *Applied physics letters*, vol. 84, no. 7, pp. 1174–1176, 2004.
- [77] P. Dayan and L. F. Abbott, *Theoretical neuroscience*. Cambridge, MA: MIT Press, 2001, vol. 806.
- [78] B. B. Averbeck, P. E. Latham, and A. Pouget, "Neural correlations, population coding and computation," *Nature Reviews Neuroscience*, vol. 7, no. 5, pp. 358–366, May 2006.

- [79] B. Rajendran, Y. Liu, J.-s. Seo, K. Gopalakrishnan, L. Chang, D. J. Friedman, and M. B. Ritter, "Specifications of nanoscale devices and circuits for neuromorphic computational systems," *IEEE Transactions on Electron Devices*, vol. 60, no. 1, pp. 246–253, Dec 2013.
- [80] P. Livi and G. Indiveri, "A current-mode conductance-based silicon neuron for address-event neuromorphic systems," in 2009 IEEE international symposium on circuits and systems. IEEE, May 2009, pp. 2898–2901.
- [81] T. Tuma, A. Pantazi, M. Le Gallo, A. Sebastian, and E. Eleftheriou, "Stochastic phase-change neurons," *Nature nanotechnology*, vol. 11, pp. 693–699, May 2016.
- [82] A. Sengupta, P. Panda, P. Wijesinghe, Y. Kim, and K. Roy, "Magnetic tunnel junction mimics stochastic cortical spiking neurons," *Scientific Reports*, vol. 6, July 2016.
- [83] D. E. Nikonov and I. A. Young, "Benchmarking spintronic logic devices based on magnetoelectric oxides," *Journal of Materials Research*, vol. 29, no. 18, pp. 2109–2115, Sept 2014.
- [84] G. Srinivasan, A. Sengupta, and K. Roy, "Magnetic tunnel junction based long-term short-term stochastic synapse for a spiking neural network with on-chip STDP learning," *Scientific Reports*, vol. 6, July 2016.
- [85] C. Clopath and W. Gerstner, "Voltage and spike timing interact in STDP-a unified model," *Spike-timing dependent plasticity*, p. 294, July 2010.
- [86] S. Yu, Y. Wu, R. Jeyasingh, D. Kuzum, and H.-S. P. Wong, "An electronic synapse device based on metal oxide resistive switching memory for neuromorphic computation," *IEEE Transactions on Electron Devices*, vol. 58, no. 8, pp. 2729–2737, June 2011.
- [87] A. Sengupta, Z. Al Azim, X. Fong, and K. Roy, "Spin-orbit torque induced spike-timing dependent plasticity," *Applied Physics Letters*, vol. 106, no. 9, p. 093704, March 2015.
- [88] D. Goodman and R. Brette, "Brian: a simulator for spiking neural networks in python," *Frontiers in Neuroinformatics*, Nov 2008.
- [89] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," *Proceedings of the IEEE*, vol. 86, no. 11, pp. 2278–2324, Aug 1998.
- [90] C. Auth, C. Allen, A. Blattner, D. Bergstrom, M. Brazier, M. Bost, M. Buehler, V. Chikarmane, T. Ghani, T. Glassman *et al.*, "A 22nm high performance and low-power CMOS technology featuring fully-depleted tri-gate transistors, self-aligned contacts and high density MIM capacitors," in *VLSI technology (VLSIT)*, 2012 symposium on. IEEE, 2012, pp. 131–132.
- [91] S. Borkar, "Thousand core chips: a technology perspective," in Proceedings of the 44th annual Design Automation Conference. ACM, 2007, pp. 746–749.
- [92] K. Roy, S. Mukhopadhyay, and H. Mahmoodi-Meimand, "Leakage current mechanisms and leakage reduction techniques in deep-submicrometer cmos circuits," *Proceedings of the IEEE*, vol. 91, no. 2, pp. 305–327, 2003.

- [93] F. Mayer, C. Le Royer, J.-F. Damlencourt, K. Romanjek, F. Andrieu, C. Tabone, B. Previtali, and S. Deleonibus, "Impact of SOI, Si1-xGexOI and GeOI substrates on CMOS compatible tunnel FET performance," in 2008 IEEE International Electron Devices Meeting. IEEE, 2008, pp. 1–5.
- [94] K. Roy, B. Jung, D. Peroulis, and A. Raghunathan, "Integrated systems in the more-than-moore era: designing low-cost energy-efficient systems using heterogeneous components," *IEEE Design & Test*, vol. 33, no. 3, pp. 56–65, 2016.
- [95] S. Matsunaga, J. Hayakawa, S. Ikeda, K. Miura, H. Hasegawa, T. Endoh, H. Ohno, and T. Hanyu, "Fabrication of a nonvolatile full adder based on logic-in-memory architecture using magnetic tunnel junctions," *Applied Physics Express*, vol. 1, no. 9, p. 091301, 2008.
- [96] J.-M. Hu, Z. Li, Y. Lin, and C. Nan, "A magnetoelectric logic gate," physica status solidi (RRL)-Rapid Research Letters, vol. 4, no. 5-6, pp. 106–108, 2010.
- [97] N. Sharma, J. Bird, P. Dowben, and A. Marshall, "Compact-device model development for the energy-delay analysis of magneto-electric magnetic tunnel junction structures," *Semiconductor Science and Technology*, vol. 31, no. 6, p. 065022, 2016.
- [98] J. M. Rabaey, A. P. Chandrakasan, and B. Nikolic, *Digital integrated circuits*. Prentice hall Englewood Cliffs, 2002, vol. 2.
- [99] J.-s. Seo, B. Brezzo, Y. Liu, B. D. Parker, S. K. Esser, R. K. Montoye, B. Rajendran, J. A. Tierno, L. Chang, D. S. Modha *et al.*, "A 45nm cmos neuromorphic chip with a scalable architecture for learning in networks of spiking neurons," in *Custom Integrated Circuits Conference (CICC)*, 2011 IEEE. IEEE, 2011, pp. 1–4.
- [100] P. Merolla, J. Arthur, F. Akopyan, N. Imam, R. Manohar, and D. S. Modha, "A digital neurosynaptic core using embedded crossbar memory with 45pj per spike in 45nm," in *Custom Integrated Circuits Conference (CICC)*, 2011 IEEE. IEEE, 2011, pp. 1–4.
- [101] P. Livi and G. Indiveri, "A current-mode conductance-based silicon neuron for address-event neuromorphic systems," in *Circuits and systems*, 2009. ISCAS 2009. IEEE international symposium on. IEEE, 2009, pp. 2898–2901.
- [102] A. G. Andreou, K. A. Boahen, P. O. Pouliquen, A. Pavasovic, R. E. Jenkins, and K. Strohbehn, "Current-mode subthreshold mos circuits for analog vlsi neural systems," *IEEE Transactions on neural networks*, vol. 2, no. 2, pp. 205–213, 1991.
- [103] A. Sengupta and K. Roy, "A vision for all-spin neural networks: A device to system perspective," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 63, no. 12, pp. 2267–2277, 2016.
- [104] J. Heron, J. Bosse, Q. He, Y. Gao, M. Trassin, L. Ye, J. Clarkson, C. Wang, J. Liu, S. Salahuddin, D. Ralph, D. Schlom, J. Iniguez, B. Huey, and R. Ramesh, "Deterministic switching of ferromagnetism at room temperature using an electric field," *Nature*, vol. 516, no. 7531, pp. 370–373, Dec 2014.

- [105] X. He, Y. Wang, N. Wu, A. N. Caruso, E. Vescovo, K. D. Belashchenko, P. A. Dowben, and C. Binek, "Robust isothermal electric control of exchange bias at room temperature," *Nature materials*, vol. 9, no. 7, pp. 579–585, 2010.
- [106] M. Suri, O. Bichler, D. Querlioz, O. Cueto, L. Perniola, V. Sousa, D. Vuillaume, C. Gamrat, and B. DeSalvo, "Phase change memory as synapse for ultra-dense neuromorphic systems: Application to complex visual pattern extraction," in 2011 International Electron Devices Meeting. IEEE, dec 2011. [Online]. Available: https://doi.org/10.1109%2Fiedm.2011.6131488
- [107] S. H. Jo, T. Chang, I. Ebong, B. B. Bhadviya, P. Mazumder, and W. Lu, "Nanoscale memristor device as synapse in neuromorphic systems," *Nano Letters*, vol. 10, no. 4, pp. 1297–1301, apr 2010. [Online]. Available: https://doi.org/10.1021%2Fnl904092h
- [108] A. Sengupta, A. Banerjee, and K. Roy, "Hybrid spintronic-cmos spiking neural network with on-chip learning: Devices, circuits, and systems," *Phys. Rev. Applied*, vol. 6, p. 064003, Dec 2016. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevApplied.6.064003
- [109] G. Srinivasan, A. Sengupta, and K. Roy, "Magnetic tunnel junction based long-term short-term stochastic synapse for a spiking neural network with on-chip STDP learning," *Scientific Reports*, vol. 6, no. 1, jul 2016. [Online]. Available: https://doi.org/10.1038%2Fsrep29545
- [110] K. J. Franke, B. Van de Wiele, Y. Shirahata, S. J. Hämäläinen, T. Taniyama, and S. van Dijken, "Reversible electric-field-driven magnetic domain-wall motion," *Physical Review X*, vol. 5, no. 1, p. 011010, Feb 2015.
- [111] B. Van de Wiele, L. Laurson, K. J. Franke, and S. Van Dijken, "Electric field driven magnetic domain wall motion in ferromagnetic-ferroelectric heterostructures," *Applied Physics Letters*, vol. 104, no. 1, p. 012401, 2014.
- [112] B. Van de Wiele, J. Leliaert, K. J. Franke, and S. Van Dijken, "Electric-field-driven dynamics of magnetic domain walls in magnetic nanowires patterned on ferroelectric domains," *New Journal of Physics*, vol. 18, no. 3, p. 033027, 2016.
- [113] G. Tatara and H. Kohno, "Theory of current-driven domain wall motion: spin transfer versus momentum transfer," *Physical review letters*, vol. 92, no. 8, p. 086601, 2004.
- [114] A. Von Hippel, "Ferroelectricity, domain structure, and phase transitions of barium titanate," *Reviews of Modern Physics*, vol. 22, no. 3, p. 221, 1950.
- [115] K. Franke *et al.*, "Domain wall coupling in ferromagnetic/ferroelectric heterostructures: Scaling behaviour and electric field driven motion," 2016.
- [116] W. J. Merz, "Domain formation and domain wall motions in ferroelectric BaTiO3 single crystals," *Physical Review*, vol. 95, no. 3, pp. 690–698, 1954.
- [117] H. L. Stadler and P. J. Zachmanidis, "Nucleation and growth of ferroelectric domains in BaTiO3 at fields from 2 to 450 kV/cm," *Journal of Applied Physics*, vol. 34, no. 11, pp. 3255–3260, 1963.

- [118] H. L. Stadler, "Forward velocity of 180° ferroelectric domain walls in BaTiO3," *Journal of Applied Physics*, 1966.
- [119] Y.-H. Shin, I. Grinberg, I.-W. Chen, and A. M. Rappe, "Nucleation and growth mechanism of ferroelectric domain-wall motion."
- [120] A. Vansteenkiste and B. Van de Wiele, "Mumax: a new high-performance micromagnetic simulation tool," *Journal of Magnetism and Magnetic Materials*, vol. 323, no. 21, pp. 2585–2591, 2011.
- [121] H. Noguchi, K. Ikegami, K. Kushida, K. Abe, S. Itai, S. Takaya, N. Shimomura, J. Ito, A. Kawasumi, H. Hara, and S. Fujita, "A 3.3 ns-access-time 71.2μw/mhz 1mb embedded stt-mram using physically eliminated read-disturb scheme and normally-off memory architecture," in 2015 IEEE International Solid-State Circuits Conference-(ISSCC) Digest of Technical Papers. IEEE, 2015, pp. 1–3. [Online]. Available: http://dx.doi.org/10.1109/isscc.2015.7062963
- [122] X. Fong, S. H. Choday, and K. Roy, "Bit-cell level optimization for non-volatile memories using magnetic tunnel junctions and spin-transfer torque switching," *IEEE Transactions on Nanotechnology*, vol. 11, no. 1, pp. 172–181, 2012. [Online]. Available: http://dx.doi.org/10.1109/tnano.2011.2169456
- [123] L. Liu, C.-F. Pai, Y. Li, H. Tseng, D. Ralph, and R. Buhrman, "Spin-torque switching with the giant spin hall effect of tantalum," *Science*, vol. 336, no. 6081, pp. 555–558, 2012.
- [124] N. Sharma, A. Marshall, J. Bird, and P. Dowben, "Magneto-electric magnetic tunnel junction as process adder for non-volatile memory applications," in 2015 IEEE Dallas Circuits and Systems Conference (DCAS), Oct 2015, pp. 1–4. [Online]. Available: http://dx.doi.org/10.1109/DCAS.2015.7356588
- [125] J.-M. Hu, Z. Li, L.-Q. Chen, and C.-W. Nan, "High-density magnetoresistive random access memory operating at ultralow voltage at room temperature," *Nature communications*, vol. 2, p. 553, 2011. [Online]. Available: http://dx.doi.org/10.1038/ncomms1564
- [126] N. Sharma, A. Marshall, J. Bird, and P. Dowben, "Magneto-electric magnetic tunnel junction logic devices," in *Energy Efficient Electronic Systems (E3S)*, 2015 Fourth Berkeley Symposium on. IEEE, 2015, pp. 1–3. [Online]. Available: http://dx.doi.org/10.1109/e3s.2015.7336817
- [127] S. Wu, S. A. Cybart, D. Yi, J. M. Parker, R. Ramesh, and R. Dynes, "Full electric control of exchange bias," *Physical review letters*, vol. 110, no. 6, p. 067202, 2013. [Online]. Available: http://dx.doi.org/10.1103/physrevlett.110.067202
- [128] J. Heron, J. Bosse, Q. He, Y. Gao, M. Trassin, L. Ye, J. Clarkson, C. Wang, J. Liu, S. Salahuddin, D. Ralph, D. G. Schlom, J. Iniguez, B. Huey, and R. Ramesh, "Deterministic switching of ferromagnetism at room temperature using an electric field," *Nature*, vol. 516, no. 7531, pp. 370–373, 2014. [Online]. Available: http://dx.doi.org/10.1038/nature14004
- [129] S. J. Wilton and N. P. Jouppi, "Cacti: An enhanced cache access and cycle time model," *IEEE Journal of Solid-State Circuits*, vol. 31, no. 5, pp. 677–688, 1996.

- [130] N. N. Mojumder and K. Roy, "Proposal for switching current reduction using reference layer with tilted magnetic anisotropy in magnetic tunnel junctions for spin-transfer torque (stt) mram," *IEEE Transactions on Electron Devices*, vol. 59, no. 11, pp. 3054–3060, 2012.
- [131] Y. Seo, K. W. Kwon, X. Fong, and K. Roy, "High performance and energy-efficient on-chip cache using dual port (1r/1w) spin-orbit torque mram," *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, vol. 6, no. 3, pp. 293–304, Sept 2016. [Online]. Available: http://dx.doi.org/10.1109/JETCAS.2016.2547701
- [132] H. Noguchi, K. Ikegami, N. Shimomura, T. Tetsufumi, J. Ito, and S. Fujita, "Highly reliable and low-power nonvolatile cache memory with advanced perpendicular stt-mram for high-performance cpu," in 2014 Symposium on VLSI Circuits Digest of Technical Papers, June 2014, pp. 1–2. [Online]. Available: http://dx.doi.org/10.1109/VLSIC.2014.6858403
- [133] K. L. Wang, H. Lee, and P. K. Amiri, "Magnetoelectric random access memory-based circuit design by using voltage-controlled magnetic anisotropy in magnetic tunnel junctions," *IEEE Transactions on Nanotechnology*, vol. 14, no. 6, pp. 992–997, 2015. [Online]. Available: http://dx.doi.org/10.1109/tnano.2015.2462337
- [134] S. Matsunaga, A. Katsumata, M. Natsui, T. Endoh, H. Ohno, and T. Hanyu, "Design of a nine-transistor/two-magnetic-tunnel-junction-cell-based low-energy nonvolatile ternary content-addressable memory," *Japanese Journal of Applied Physics*, vol. 51, no. 2S, p. 02BM06, 2012. [Online]. Available: http://dx.doi.org/10.1143/jjap.51.02bm06
- [135] K. Yang, D. Blaauw, and D. Sylvester, "An all-digital edge racing true random number generator robust against pvt variations," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 4, pp. 1022–1031, 2016.
- [136] W. H. Choi, Y. Lv, J. Kim, A. Deshpande, G. Kang, J.-P. Wang, and C. H. Kim, "A magnetic tunnel junction based true random number generator with conditional perturb and real-time output probability tracking," in 2014 IEEE International Electron Devices Meeting. IEEE, 2014, pp. 12–5.
- [137] Y. Kim, X. Fong, and K. Roy, "Spin-orbit-torque-based spin-dice: A true random-number generator," *IEEE Magnetics Letters*, vol. 6, pp. 1–4, 2015.
- [138] A. Fukushima, T. Seki, K. Yakushiji, H. Kubota, H. Imamura, S. Yuasa, and K. Ando, "Spin dice: A scalable truly random number generator based on spintronics," *Applied Physics Express*, vol. 7, no. 8, p. 083001, 2014.

### VITA

Akhilesh Jaiswal received the B.Tech degree from Shri Guru Gobind Singhji Institute of Engineering and Technology, Nanded, India, in 2011 and the Master's degree in Electrical Engineering from the University of Minnesota in 2014. He joined the Nano-electronics Research Lab in Fall 2014 and is currently pursuing the Ph.D. degree under the guidance of Prof. Kaushik Roy. His research interests are twofold: 1) modeling and device-circuit co-design for spintronic devices, with a focus on non-volatile storage, in-memory computing, neuromorphic, and logic applications; 2) CMOS-based analog and digital in-memory computing using standard bit-cells. He is also interested in exploring unconventional computing paradigms using emerging non-volatile devices. He was an intern with Globalfoundries Lab, Malta, in the summer of 2017 and with ARM Research Lab, Austin, in the summer of 2018.