Lecture notes of EECS 151 Fall 2022 & Spring 2024 @ UC Berkeley by Prof. Shao & Prof. Wawrzynek
Author: Peiqi (Stefan) Tian
Notice: LLMs are used for generating some LaTeX code from slides and explaining some concepts.

LEC 01 Intro

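In chip design, non-recurring engineering (NRE) costs are one-time engineering expenses. They are incurred during chip design and development and are usually paid once rather than repeatedly. NRE costs cover everything from design through early manufacturing, including:

Design costs: architecture design, RTL coding, verification, etc.
Tool costs: licenses for EDA (electronic design automation) tools.
IP costs: purchasing third-party intellectual property cores (e.g., processor cores, interface IP).
Prototype manufacturing costs: tape-out costs, including mask making and trial production.
Test costs: functional and performance testing of chip prototypes.

NRE costs are usually high, especially at advanced process nodes (e.g., 7 nm, 5 nm). Once they are paid, however, the per-unit cost in subsequent volume production drops significantly, because the fixed cost is spread over every unit sold. NRE is therefore a major up-front investment in chip development.

Moore's Law: the number of transistors on a chip of fixed area keeps growing as transistors shrink; smaller transistors also shorten signal propagation time, which allows a higher clock frequency.

Dennard scaling, proposed in 1974 by Robert H. Dennard and his colleagues, is a theory about shrinking transistor dimensions. It states that as transistors shrink, their power density stays constant: voltage and current scale down in proportion to the dimensions, so the power per unit area does not change. It broke down in the early-to-mid 2000s.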

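Main points

Dimension scaling:
Let the scaling factor be $\kappa$ ($\kappa > 1$). Specifically:
- $\kappa = 1.5$ means that the transistor dimensions (e.g., channel length and width) shrink to $\frac{1}{\kappa} = \frac{1}{1.5} \approx 0.6667$, i.e., $\frac{2}{3}$, of their original size.
Voltage and current reduction:
The voltage $V$ and current $I$ also scale down proportionally, i.e.,
$$V \propto \frac{1}{\kappa}, \quad I \propto \frac{1}{\kappa} \tag{1}$$
Constant power density:

The power of each transistor, $P = V \times I$, falls to $\frac{1}{\kappa^2}$ of its original value, while the number of transistors per unit area grows by a factor of $\kappa^2$, so the overall power density stays constant.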

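Device or Circuit Parameter	Scaling Factor
$t_{ox}, L, W$ (device dimensions: oxide thickness, channel length, channel width)	$\frac{1}{\kappa}$
$N_a$ (doping concentration)	$\kappa$
$V$ (voltage)	$\frac{1}{\kappa}$
$I$ (current)	$\frac{1}{\kappa}$
$\epsilon A/t$ (capacitance)	$\frac{1}{\kappa}$
$VC/I$ (delay time per circuit)	$\frac{1}{\kappa}$
$VI$ (power dissipation per circuit)	$\frac{1}{\kappa^2}$
$VI/A$ (power density)	$1$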

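Derivation

Let the scaling factor be $\kappa$ ($\kappa > 1$). The area $A$ of a transistor shrinks to $\frac{1}{\kappa^2}$ of its original value:

$$A' = \frac{A}{\kappa^{2}} \tag{2}$$

The voltage $V$ and current $I$ also scale down proportionally:

$$V' = \frac{V}{\kappa}, \quad I' = \frac{I}{\kappa} \tag{3}$$

The power $P$ of a transistor is:

$$P = V \times I \tag{4}$$

The power density $D$ is the ratio of total power to area. The original power density is:

$$D = \frac{P}{A} \tag{5}$$

After scaling, the power is $P' = V' \times I' = \frac{P}{\kappa^2}$, so the scaled power density $D'$ is:

$$D' = \frac{P'}{A'} = \frac{P / \kappa^{2}}{A / \kappa^{2}} = D \tag{6}$$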

Digital Design: Given a functional description and performance, cost, & power constraints, create an implementation using a set of primitives.

Digital systems are implemented as an interconnection of combinational logic and state elements.

LEC 02 Design Abstraction

Pre-verified block designs and standard bus interfaces (or adapters) ease integration: they lower NRE costs and shorten TTM (time to market).
Brings together: standard cell blocks, custom analog blocks, processor cores, memory blocks, embedded FPGAs, …
Standardized on-chip buses (or hierarchical interconnect) permit “easy” integration of many blocks.

An SoC (System on Chip) is an integrated circuit that integrates multiple electronic system components onto a single chip. It typically contains processors, memory, input/output interfaces, digital signal processors (DSPs), graphics processing units (GPUs), and other components, forming a complete system.

HW design challenge
- The HW systems we find useful are huge and the functions we implement are complex
- Hardware is inherently parallel (the secret sauce for high performance). Correct design of parallel systems requires management of timing and synchronization.
- The technology we use imposes physical constraints that affect cost, speed, and power.

Design Challenges are met by using layers of abstractions.


Specification (e.g., in plain text, RV spec)
    ↓
Model (e.g., in C/C++, QEMU)
    ↓
* Architecture (e.g., in-order/out-of-order, pipeline/single cycle)
    ↓
* RTL Logic Design (e.g., in Verilog)
    ↓
* Physical design (schematic, layout; ASIC, FPGA)
    ↓
Manufactured part

Validation: Have we built the right thing? (Is the model implementing the specification and meeting the performance?)
Verification: Have we built the thing right? (Is the logic/physical design correct?)

Implementing Digital Systems

Implementation Alternative

Full-custom:
- Common in analog design
- High NRE
ASIC:
- Based around a set of pre-designed (and verified) cells
FPGA
Microprocessor

Combinational Logic

• Output a function only of the current inputs (no history).

• Truth-table representation of function. Output is explicitly specified for each input combination.

• In general, CL blocks have more than one output signal, in which case, the truth-table will have multiple output columns.

Peer Instruction: The total number of possible truth tables with 4 inputs and 1 output is 65,536.

My solution:

Let $S$ be the set of input combinations, so $|S| = 2^4 = 16$. Each truth table corresponds to the subset of $S$ on which the output is 1, so the total number of possible truth tables is

$$|\mathcal{P}(S)| = 2^{16} = 65{,}536 \tag{7}$$

Sequential Logic

• Output is a function of both the current inputs and the state.

• State represents the memory.

• State is a function of previous inputs.

• In synchronous digital systems, state is updated on each clock tick.

Any synchronous digital circuit can be represented with:

• Combinational Logic Blocks (CL), plus

• State Elements (registers or memories)

• Clock orchestrates sequencing of CL operations

LEC 03 Metrics and Verilog I

Digital Abstraction

The mapping of a continuous variable onto a discrete binary variable is done by defining logic levels.

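The lowest voltage in the system is $0\ V$, also called ground or $GND$. The highest voltage comes from the power supply and is called $V_{DD}$.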

inverter


                          ------|>o-------|>o-----
                               Driver   Receiver

The driver produces a LOW (0) output in the range $[0, V_{OL}]$ or a HIGH (1) output in the range $[V_{OH}, V_{DD}]$. If the receiver gets an input in the range $[0, V_{IL}]$, it will consider the input to be LOW. If the receiver gets an input in the range $[V_{IH}, V_{DD}]$, it will consider the input to be HIGH. If the input falls in the range $[V_{IL}, V_{IH}]$, the behavior of the gate is unpredictable.

The noise margin ($NM$) is the amount of noise that could be added to a worst-case output such that the signal can still be interpreted as a valid input.

$$\begin{aligned} NM_{L} &= V_{IL} - V_{OL} \\ NM_{H} &= V_{OH} - V_{IH} \end{aligned} \tag{8}$$
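
As a quick worked example of the noise-margin definitions, assume an inverter with $V_{IL} = 1.35\ V$, $V_{IH} = 3.15\ V$, $V_{OL} = 0.33\ V$, and $V_{OH} = 3.84\ V$ (illustrative values, not taken from the lecture):

$$\begin{aligned} NM_{L} &= V_{IL} - V_{OL} = 1.35 - 0.33 = 1.02\ V \\ NM_{H} &= V_{OH} - V_{IH} = 3.84 - 3.15 = 0.69\ V \end{aligned}$$

With these numbers the circuit would tolerate about 1 V of noise on a LOW signal but only about 0.7 V on a HIGH signal.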

A necessary property of any suitable technology for logic circuits is “restoration” or “regeneration”.

Circuits need:

to ignore noise and other non-idealities at their inputs, and
to generate "cleaned-up" signals at their outputs.

Voltage Transfer Characteristic

• Describes the output voltage as a function of the input voltage.

• The logic levels are chosen at the points where the slope of the curve is -1 (the unity-gain points), which maximizes the noise margins.

Cost

NRE (fixed cost)
Recurring costs (variable cost): cost to manufacture, test, and package a unit, hence proportional to the product volume

Assuming the wafer is circular and the die is square:

$$\text{die yield} = \frac{\text{number of good chips per wafer}}{\text{total number of chips per wafer}} \times 100\% \tag{9}$$

$$\text{die cost} = \frac{\text{wafer cost}}{\text{dies per wafer} \times \text{die yield}} \tag{10}$$

$$\text{dies per wafer} = \frac{\text{wafer area}}{\text{die area}} - \frac{\pi \times \text{wafer diameter}}{\sqrt{2 \times \text{die area}}} \tag{11}$$

$$\text{die yield} = \left(1 + \frac{\text{defects per unit area} \times \text{die area}}{\alpha}\right)^{-\alpha} \quad \text{(empirical formula; } \alpha \text{ is approximately 3)} \tag{12}$$

$$\text{variable cost} = \frac{\text{cost of die} + \text{cost of die test} + \text{cost of packaging}}{\text{final test yield}} \tag{13}$$

$$\text{cost per IC} = \text{variable cost per IC} + \frac{\text{fixed cost}}{\text{volume}} \tag{14}$$

$$\text{cost of die} = f(\text{die area})^{4} \tag{15}$$
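
To see how the die-cost formulas combine, here is a rough worked example. All numbers are assumptions chosen for illustration: a 30 cm wafer costing \$5000, a die area of 1 cm², a defect density of 0.1 defects/cm², and $\alpha = 3$.

$$\begin{aligned}
\text{dies per wafer} &= \frac{\pi \times 15^2}{1} - \frac{\pi \times 30}{\sqrt{2 \times 1}} \approx 706.9 - 66.6 \approx 640 \\
\text{die yield} &= \left(1 + \frac{0.1 \times 1}{3}\right)^{-3} \approx 0.906 \\
\text{die cost} &= \frac{5000}{640 \times 0.906} \approx \$8.62
\end{aligned}$$

The second term of the dies-per-wafer formula accounts for the partial dies lost along the circular edge of the wafer.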

Assume a uniform density of randomly occurring point defects. For a wafer with $N$ chips and $n$ defects, the probability $P(k)$ that a chip contains $k$ defects may be approximated by Poisson's distribution:

$$P(k) = \frac{e^{-m} \cdot m^{k}}{k!} \tag{16}$$

where $m = \frac{n}{N}$. The yield $Y$ is the probability that a chip contains no defects ($k = 0$), so:

$$Y = e^{-m} \tag{17}$$

If $D$ is the chip defect density, then:

$$D = \frac{n}{N A} \tag{18}$$

where $A$ is the area of each chip. Since $m = \frac{n}{N}$, $m$, which is the average number of defects per chip, is:

$$m = A \cdot D \tag{19}$$

The yield $Y$ is given by the Poisson Yield Model:

$$Y = e^{-A D} \tag{20}$$
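
As a small numerical check of the Poisson model, assume a chip area of $A = 1\ \text{cm}^2$ and a defect density of $D = 0.1\ \text{defects/cm}^2$ (illustrative values only):

$$\begin{aligned}
m &= A \cdot D = 0.1 \\
Y &= P(0) = e^{-0.1} \approx 0.905 \\
P(1) &= \frac{e^{-0.1} \cdot 0.1^{1}}{1!} \approx 0.090
\end{aligned}$$

So under these assumptions roughly 90% of the chips are defect-free, and most of the failing chips contain exactly one defect.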

Performance

Throughput, e.g., FLOPS.
- Peak
- Average
Latency
- Average
- Tail, e.g., 99th percentile CPU latency.

Digital Logic Delay

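Propagation delay $t_p$ of a logic gate: how quickly its output responds to a change at its inputs.

It is measured between the 50% transition points of the input and output waveforms.

$t_{pHL}$: propagation delay for a high-to-low output transition
$t_{pLH}$: propagation delay for a low-to-high output transition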

Edge-triggered d-type flip-flop

On the rising edge of the clock, the input d is sampled and transferred to the output q. At all other times, the input d is ignored.

Limitations: flip-flops cannot change their outputs instantaneously. Time is needed to transfer the input to the output internally.

The input $d$ must be stable for a short time before the rising edge (the setup time) and remain stable for a short time after the edge (the hold time). Together these define a window around the clock edge during which the input must not change; otherwise the flip-flop may sample the input while it is in transition. Once the flip-flop captures the new input, it also takes a small amount of time to transfer the new value to the output. This delay is called the clk-to-q delay.

Digital Logic Timing

Find the critical path (graph theory)
Make it shorter
$\frac{1}{f} \ge t_{critical\text{-}path}$, where $f$ is the clock frequency.
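
As a rough worked example, assume the critical path between two flip-flops is made up of the clk-to-q delay, the combinational logic delay, and the setup time of the receiving flip-flop. This decomposition and the numbers below are assumptions for illustration, not values from the lecture:

$$\begin{aligned}
t_{critical\text{-}path} &= t_{clk\text{-}q} + t_{logic} + t_{setup} = 0.1\ \text{ns} + 1.6\ \text{ns} + 0.3\ \text{ns} = 2.0\ \text{ns} \\
f &\le \frac{1}{t_{critical\text{-}path}} = \frac{1}{2.0\ \text{ns}} = 500\ \text{MHz}
\end{aligned}$$

Shortening the logic on this path is then the only way to clock the design faster.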

Power

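Energy (measured in joules, $J$)

Power (energy per unit time, measured in watts, $W$)

Charging a capacitance $C$ to $V_{DD}$ draws an energy of $CV_{DD}^2$ from the supply. Let $f$ be the clock frequency and $\alpha$ the activity factor, i.e., the fraction of cycles in which the node switches.

The dynamic power is

$$P_{dynamic} = \alpha C V_{DD}^{2} f \tag{21}$$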
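
For a feel of the magnitudes, assume $\alpha = 0.1$, a total switched capacitance of $C = 1\ \text{nF}$, $V_{DD} = 1\ V$, and $f = 1\ \text{GHz}$ (illustrative values only):

$$P_{dynamic} = 0.1 \times (1 \times 10^{-9}\ F) \times (1\ V)^2 \times (1 \times 10^{9}\ \text{Hz}) = 0.1\ W$$

Because of the quadratic dependence on $V_{DD}$, halving the supply voltage to $0.5\ V$ at the same frequency would cut this to $0.025\ W$.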


Verilog

See this website.

Simulation is the process of using software to emulate the behavior of a hardware design to verify its correctness.
Synthesis is the process of converting HDL code into a gate-level netlist, which consists of basic components like logic gates and flip-flops.

HDLs were originally invented for simulation.

Structural Verilog
- List of sub-components and how they are connected
Behavioral Verilog
- Describe what a component does, not how it does it
- Result is only as good as the tools
Common approach is to use behavioral descriptions for “leaf cells” and structural to build hierarchy.
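
A minimal sketch of that split, with a behavioral leaf cell and a structural parent. The module names `mux2` and `mux4` are made up for this illustration:

```verilog
// Behavioral leaf cell: says what a 2-to-1 mux does, not how it is built.
module mux2 (
  input  wire a,
  input  wire b,
  input  wire sel,
  output wire y
);
  assign y = sel ? b : a;
endmodule

// Structural hierarchy: a 4-to-1 mux described purely as three
// mux2 instances and the wires that connect them.
module mux4 (
  input  wire [3:0] d,
  input  wire [1:0] sel,
  output wire       y
);
  wire lo, hi;  // outputs of the first-level muxes

  mux2 m_lo  (.a(d[0]), .b(d[1]), .sel(sel[0]), .y(lo));
  mux2 m_hi  (.a(d[2]), .b(d[3]), .sel(sel[0]), .y(hi));
  mux2 m_out (.a(lo),   .b(hi),   .sel(sel[1]), .y(y));
endmodule
```

Here `mux4` contains no logic expressions of its own; how `mux2` is turned into gates is left entirely to the synthesis tool.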

LEC 04 Verilog II

In a Verilog "continuous assignment" (assign lhs = rhs;), the value of the signal on the right side is driven onto the wire on the left side. The assignment is "continuous" because the assignment continues all the time even if the right side's value changes. A continuous assignment is not a one-time event.

Operators

Arithmetic Operators

+ : Add
- : Subtract
* : Multiply
/ : Divide
% : Modulus

Logical Operators

! : Logical negation
&& : Logical AND
|| : Logical OR

Relational Operators

> : Greater than
< : Less than
>= : Greater than or equal
<= : Less than or equal

Equality Operators

== : Equality
!= : Inequality

Bitwise Operators

~ : Bitwise negation
& : Bitwise AND
| : Bitwise OR
^ : Bitwise XOR
- Reduction operators also exist for AND, OR, and XOR that have the same symbol as the bitwise operators.

Shift Operators

<< : Shift left logical
>> : Shift right logical
<<< : Arithmetic left shift
>>> : Arithmetic right shift

Other Operators

{} : Concatenation
{n{}} : Replication, e.g., for sign-extending a small number into a wider one.
[MSB:LSB] : Indexing/Slicing
? : : Conditional, (condition) ? (exp if condition is true) : (exp if condition is false)
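
A few of these operators in use, as a small sketch. The module and signal names are invented for illustration:

```verilog
module operator_demo (
  input  wire [3:0] x,        // interpreted as a signed 4-bit value
  output wire [7:0] sext,     // x sign-extended to 8 bits
  output wire [7:0] swapped,  // the two nibbles of sext exchanged
  output wire       parity,   // reduction XOR over the bits of x
  output wire [7:0] asr,      // sext arithmetically shifted right by 1
  output wire [7:0] abs_val   // absolute value of sext
);
  // Replication + concatenation: copy the sign bit four times.
  assign sext = {{4{x[3]}}, x};

  // Slicing + concatenation.
  assign swapped = {sext[3:0], sext[7:4]};

  // Reduction operator: same symbol as bitwise XOR, but unary.
  assign parity = ^x;

  // >>> only shifts in the sign bit when the left operand is signed.
  assign asr = $signed(sext) >>> 1;

  // Conditional operator: negate (invert and add 1) if negative.
  assign abs_val = sext[7] ? (~sext + 8'd1) : sext;
endmodule
```

For `x = 4'b1010` (which is -6), this gives `sext = 8'b1111_1010`, `swapped = 8'b1010_1111`, `parity = 0`, `asr = 8'b1111_1101` (-3), and `abs_val = 8'b0000_0110` (6).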

A note on wire vs. reg: The left-hand side of an assign statement must be a net type (e.g., wire), while the left-hand side of a procedural assignment (in an always block) must be a variable type (e.g., reg). These types (wire vs. reg) have nothing to do with what hardware is synthesized; they are just syntax left over from Verilog's origins as a simulation language.

WARNING: for and while loops cannot always be mapped to hardware! These statements are valid Verilog (and can be simulated), but they synthesize only when the tool can unroll them, i.e., when the number of iterations is known at compile time.

A nested if structure leads to a “priority logic” structure, with different delays for different inputs. Use case instead.

Implicit nets are often a source of hard-to-detect bugs. In Verilog, net-type signals can be implicitly created by an assign statement or by attaching something undeclared to a module port. Implicit nets are always one-bit wires and cause bugs if you had intended to use a vector. Creation of implicit nets can be disabled with the `default_nettype none directive.


wire [2:0] a, c;    // Two vectors
assign a = 3'b101;  // a = 101
assign b = a;       // b =   1  implicitly-created wire
assign c = b;       // c = 001  <-- bug
my_module i1 (d,e); // d and e are implicitly one-bit wide if not declared.
                    // This could be a bug if the port was intended to be a vector.

Generate loops are used to iteratively instantiate modules. This is useful with parameters or when instantiating large numbers of the same module.

wire [3:0] a, b;
genvar i;
// The first instance is written by hand because its input is a constant.
core first_one (1'b0, a[0], b[0]);
// Programmatically wire the later instances.
generate
  for (i = 1; i < 4; i = i + 1) begin : name_of_this_loop
    core generated_core (a[i], a[i-1], b[i]);
  end
endgenerate

Verilog modules may include parameters in the module definition.

module adder #(parameter width=32)
  (input [width-1:0] a, input [width-1:0] b, output [width:0] s);

  // s is one bit wider than the operands so that it holds the carry out.
  assign s = a + b;

endmodule

module top();
  localparam adder1width = 64;
  localparam adder2width = 32;
  reg [adder1width-1:0] a,b;
  reg [adder2width-1:0] c,d;
  wire [adder1width:0] out1;
  wire [adder2width:0] out2;
  // Each instance overrides the default width of 32.
  adder #(.width(adder1width)) adder64 (.a(a), .b(b), .s(out1));
  adder #(.width(adder2width)) adder32 (.a(c), .b(d), .s(out2));
endmodule

Multidimensional Nets in Verilog


// Creates a net called <netname> and describes it as an array of (N + 1) elements,
// where each element is an (M + 1)-bit number.
reg [M:0] <netname> [N:0];
// A memory structure that has eight 32-bit elements
reg [31:0] fifo_ram [7:0];
fifo_ram[2]      // The full 3rd 32-bit element
fifo_ram[5][7:0] // The lowest byte of the 6th 32-bit element

Sequential

Clocked always blocks, always @(posedge clk), create a blob of combinational logic just like combinational always blocks, but also create a set of flip-flops (or "registers") at the output of the blob of combinational logic. Instead of the outputs of the blob of logic being visible immediately, the outputs are visible only immediately after the next (posedge clk).

Combinational circuits must have a value assigned to all outputs under all conditions. This usually means you always need else clauses or a default value assigned to the outputs.
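
One common way to satisfy this is to assign a default at the top of the block, sketched here for a hypothetical 2-to-4 decoder with enable (the module name and ports are made up for illustration):

```verilog
module decoder2to4 (
  input  wire       en,
  input  wire [1:0] sel,
  output reg  [3:0] y
);
  always @(*) begin
    // Default value: covers en == 0 and any case item left out below,
    // so y is assigned on every path and no latch is inferred.
    y = 4'b0000;
    if (en) begin
      case (sel)
        2'd0: y = 4'b0001;
        2'd1: y = 4'b0010;
        2'd2: y = 4'b0100;
        2'd3: y = 4'b1000;
      endcase
    end
  end
endmodule
```

Without the default assignment, the missing else branch would ask the hardware to remember the old value of `y`, which synthesis tools typically implement as an unintended latch.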

A nested if structure leads to a “priority logic” structure, with different delays for different inputs. The case version treats all inputs the same.


// Non-blocking assignments: a 3-stage shift register
always @(posedge clk) begin
  q1 <= in;
  q2 <= q1; // use old q1
  out <= q2; // use old q2
end

// Blocking assignments: out gets in after a single clock edge
always @(posedge clk) begin
  q1 = in;
  q2 = q1; // use new q1
  out = q2; // use new q2
end

// "old" means the value before the clock edge, "new" means the value after the most recent assignment

Adder

Ripple Carry Adder

The drawback of a ripple-carry adder is that computing the carry-out (from the carry-in, in the worst case) is fairly slow, and the second-stage adder cannot begin computing its carry-out until the first-stage adder has finished. This makes the adder slow.

Carry-select Adder

One improvement is a carry-select adder. The first-stage adder is the same as before, but we duplicate the second-stage adder, one copy assuming carry-in=0 and one assuming carry-in=1, then use a fast 2-to-1 multiplexer to select whichever result turns out to be correct.
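
The idea can be sketched for a 16-bit adder split into two 8-bit stages. The module name and the 8/8 split are assumptions for illustration, and the `+` operator stands in for whatever 8-bit adder (e.g., ripple-carry) is actually used in each stage:

```verilog
module carry_select_adder16 (
  input  wire [15:0] a,
  input  wire [15:0] b,
  input  wire        cin,
  output wire [15:0] sum,
  output wire        cout
);
  wire       c_lo;              // carry out of the first stage
  wire [7:0] sum_hi0, sum_hi1;  // second-stage sums for carry-in 0 and 1
  wire       c_hi0, c_hi1;      // second-stage carry-outs for carry-in 0 and 1

  // First stage: low 8 bits, same as in the ripple-carry design.
  assign {c_lo, sum[7:0]} = a[7:0] + b[7:0] + cin;

  // Second stage, duplicated: both copies start computing immediately,
  // without waiting for c_lo.
  assign {c_hi0, sum_hi0} = a[15:8] + b[15:8];
  assign {c_hi1, sum_hi1} = a[15:8] + b[15:8] + 1'b1;

  // 2-to-1 multiplexers pick the copy that matches the real carry.
  assign sum[15:8] = c_lo ? sum_hi1 : sum_hi0;
  assign cout      = c_lo ? c_hi1   : c_hi0;
endmodule
```

The design trades area (a second copy of the upper adder plus the multiplexers) for delay: the worst case is roughly one 8-bit adder plus one mux instead of two 8-bit adders in series.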

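Subtractor

(Adding $x$ and $\bar{x}$: wherever $x$ has a $0$, $\bar{x}$ has a $1$, and wherever $x$ has a $1$, $\bar{x}$ has a $0$, so every bit of the sum is $1$. The result is $111\ldots 111_2$, which is $-1$ in two's complement. Hence $x + \bar{x} = -1$, so $\bar{x} + 1 = -x$, where $\bar{x}$ is the bitwise inverse of $x$.)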
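
Using $-b = \bar{b} + 1$, a subtractor can be built from an adder by inverting one operand and adding 1. A minimal sketch, with a hypothetical module name and parameter:

```verilog
module subtractor #(parameter WIDTH = 8) (
  input  wire [WIDTH-1:0] a,
  input  wire [WIDTH-1:0] b,
  output wire [WIDTH-1:0] diff
);
  // a - b = a + (~b + 1); in hardware the "+ 1" is usually
  // supplied as the carry-in of the adder.
  assign diff = a + ~b + 1'b1;
endmodule
```

For example, with `WIDTH = 8`, `a = 5`, and `b = 3`, `~b` is `8'b1111_1100` (252), and 5 + 252 + 1 = 258, which wraps to 2 in 8 bits.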

LEC 05 Combinational Logic

EDA Playground

Laws of Boolean Algebra

Identities:
- $X + 0 = X$, $X \cdot 1 = X$
- $X + 1 = 1$, $X \cdot 0 = 0$
Idempotence:
- $X + X = X$, $X \cdot X = X$
Complements:
- $X + \bar{X} = 1$, $X \cdot \bar{X} = 0$
Commutative:
- $X + Y = Y + X$, $X \cdot Y = Y \cdot X$
Associative:
- $(X + Y) + Z = X + (Y + Z) = X + Y + Z$
- $(X \cdot Y) \cdot Z = X \cdot (Y \cdot Z) = X \cdot Y \cdot Z$
Distributive:
- $X \cdot (Y + Z) = (X \cdot Y) + (X \cdot Z)$
- $X + (Y \cdot Z) = (X + Y) \cdot (X + Z)$
Absorptive:
- $X + (X \cdot Y) = X \cdot (1 + Y) = X$
- $X \cdot (X + Y) = (X + 0) \cdot (X + Y) = X + (0 \cdot Y) = X$
Duality:
- AND $\leftrightarrow$ OR
- $0 \leftrightarrow 1$
- Leave literals unchanged
- $\left\{ F(x_1, x_2, \dots, x_n, 0, 1, +, \cdot) \right\}^D = F(x_1, x_2, \dots, x_n, 1, 0, \cdot, +)$
DeMorgan's Law
- $\overline{A + B} = \bar{A} \cdot \bar{B}$
- $\overline{A \cdot B} = \bar{A} + \bar{B}$
- Bubble Pushing
  • Push the bubbles from the inputs through the gate (or from the output back to the inputs).
  • The bubbles come out on the other side of the gate.
  • The gate flips from AND to OR or vice versa.

Canonical Forms

Sum of Products (SOP)
- Disjunctive normal form, minterm expansion.
- Minterm: a product (AND) involving all inputs, which is 1 for exactly one input combination.
Product of Sums (POS)
- Conjunctive normal form, maxterm expansion.
- Maxterm: a sum (OR) involving all inputs, which is 0 for exactly one input combination.
- The POS of $F$ can be obtained by applying DeMorgan’s law to the SOP of $\overline{F}$ (and vice versa).
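
As a small worked example (the XOR function is chosen here for illustration), take $F = A \oplus B$, which is 1 for the inputs 01 and 10 and 0 for the inputs 00 and 11:

$$\begin{aligned}
F &= \overline{A} B + A \overline{B} && \text{(SOP: one minterm per row where } F = 1\text{)} \\
F &= (A + B)(\overline{A} + \overline{B}) && \text{(POS: one maxterm per row where } F = 0\text{)} \\
\overline{F} &= \overline{A}\,\overline{B} + A B \;\Rightarrow\; F = \overline{\overline{A}\,\overline{B} + A B} = (A + B)(\overline{A} + \overline{B}) && \text{(DeMorgan)}
\end{aligned}$$

The last line shows the conversion route: write the SOP of the complement, then complement it with DeMorgan's law to get the POS of the function itself.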

LEC 06 CL and Finite State Machine

Theorem: Any combinational logic function can be implemented as a network of logic gates.

Kmap

The binary-reflected Gray code list for n bits can be generated recursively from the list for n − 1 bits by reflecting the list (i.e. listing the entries in reverse order), prefixing the entries in the original list with a binary 0, prefixing the entries in the reflected list with a binary 1, and then concatenating the original list with the reversed list.
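
In hardware the list is rarely built recursively. A commonly used closed form, `gray = bin ^ (bin >> 1)`, yields the same binary-reflected sequence when `bin` counts up from 0. The sketch below uses that closed form rather than the recursive construction; the module name and parameter are hypothetical:

```verilog
module bin2gray #(parameter N = 4) (
  input  wire [N-1:0] bin,
  output wire [N-1:0] gray
);
  // Each Gray bit is the XOR of a binary bit with its more
  // significant neighbor; the MSB passes through unchanged.
  assign gray = bin ^ (bin >> 1);
endmodule
```

For `N = 2`, the inputs 00, 01, 10, 11 map to 00, 01, 11, 10, which is the row and column ordering used on a K-map.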

Each square differs from an adjacent square by a change in a single variable(Gray code).

$$Y = ABC + AB\overline{C} = AB(\overline{C} + C) = AB \tag{22}$$

Use the fewest circles necessary to cover all the 1’s.
All the squares in each circle must contain 1’s.
Each circle must span a rectangular block that is a power of 2 (i.e., 1, 2, or 4) squares in each direction.
Each circle should be as large as possible.
A circle may wrap around the edges of the K-map.
A 1 in a K-map may be circled multiple times if doing so allows fewer circles to be used.

Squares with a don't-care entry can be treated as either 1 or 0, whichever simplifies the circuit.

Using multiple levels (more than 2) can reduce the cost, and sometimes also the delay. Sometimes there is a tradeoff between cost and delay.

NAND would be used in place of all ANDs and ORs.

No convenient hand methods exist for multi-level logic simplification:

CAD tools use sophisticated algorithms and heuristics.
- These problems tend to be NP-complete
Humans and tools often exploit some special structure (for example, adders).

FSM

Can model behavior of any sequential circuit.

The FSM follows exactly one edge per cycle.

Moore: outputs depend only on the current state. Both edges of the output follow the clock.

Mealy: outputs depend on the current state and the inputs. The output rises with the rising edge of the input, asynchronously with respect to the clock, and falls synchronously with the next clock edge. The output timing behavior of the Moore machine can be achieved in a Mealy machine by “registering” the Mealy output values.

FSM State Transition Diagram:

States: nodes
Outputs: labeled in each node
Inputs: labeled on each arc

Current State	Inputs	Next State
Eat	Feeding	Eat
Eat	Petting	Sleep
Sleep	Feeding	Sleep
Sleep	Petting	Annoyed
Annoyed	Feeding	Eat
Annoyed	Petting	Annoyed

Encode each state

State	Encoding
Eat	00
Sleep	01
Annoyed	10

Input	Encoding
Feeding	0
Petting	1

S1 (Current state[1])	S0 (Current state[0])	X (Input)	S1' (Next state[1])	S0' (Next state[0])
0	0	0	0	0
0	0	1	0	1
0	1	0	0	1
0	1	1	1	0
1	0	0	0	0
1	0	1	1	0

$$S_0' = \overline{S_1}\,\overline{S_0}\,X + \overline{S_1}\,S_0\,\overline{X} = \overline{S_1}(\overline{S_0} X + S_0 \overline{X}) = \overline{S_1}(S_0 \oplus X) \tag{23}$$

$$S_1' = \overline{S_1} S_0 X + S_1 \overline{S_0} X = (S_1 \oplus S_0) X \tag{24}$$

S1	S0	E (Eye open)	M (Mouth open)
0	0	1	1
0	1	0	0
1	0	1	0

$$E = \overline{S_1}\,\overline{S_0} + S_1 \overline{S_0} = \overline{S_0} \tag{25}$$

$$M = \overline{S_1}\,\overline{S_0} \tag{26}$$
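
The transition and output tables above can also be written directly as behavioral Verilog and left to the synthesis tool to minimize. This is a sketch only: the module name, the synchronous reset to Eat, and the handling of the unused encoding 11 are assumptions, not part of the lecture example.

```verilog
module pet_fsm (
  input  wire clk,
  input  wire rst,         // synchronous reset to EAT (assumed)
  input  wire x,           // 0 = Feeding, 1 = Petting
  output wire eye_open,    // E in the output table
  output wire mouth_open   // M in the output table
);
  localparam EAT = 2'b00, SLEEP = 2'b01, ANNOYED = 2'b10;

  reg [1:0] state, next_state;

  // Next-state logic: one case item per row pair of the transition table.
  always @(*) begin
    case (state)
      EAT:     next_state = x ? SLEEP   : EAT;
      SLEEP:   next_state = x ? ANNOYED : SLEEP;
      ANNOYED: next_state = x ? ANNOYED : EAT;
      default: next_state = EAT;  // unused encoding 11 (assumed recovery)
    endcase
  end

  // State register.
  always @(posedge clk) begin
    if (rst) state <= EAT;
    else     state <= next_state;
  end

  // Moore outputs: functions of the current state only.
  assign eye_open   = (state == EAT) || (state == ANNOYED);
  assign mouth_open = (state == EAT);
endmodule
```

Writing the FSM this way keeps the state names visible in the code, so a different state encoding only requires changing the `localparam` line.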

State assignment

States that are close to each other (connected by transitions) should preferably be given similar encodings.

An alternative approach is one-hot encoding, but it can cost more flip-flops.

Verilog Imp


/*State*/
// ZERO, CHANGE and ONE are 2-bit state constants (e.g., localparams) declared elsewhere.
// .clk(clk) is added on the assumption that REGISTER_R has a clock port.
REGISTER_R #(.N(2), .INIT(ZERO)) state(.q(ps), .d(ns), .rst(rst), .clk(clk));
/*Next state & Output*/
always @(*) begin
  case (ps)
    ZERO: begin
      out = 1'b0;
      if (in) ns = CHANGE;
      else ns = ZERO;
    end
    CHANGE: begin
      out = 1'b1;
      if (in) ns = ONE;
      else ns = ZERO;
    end
    ONE: begin
      out = 1'b0;
      if (in) ns = ONE;
      else ns = ZERO;
    end
    default: begin
      out = 1'bx;
      ns = ZERO; // return to a known state; `default` is a keyword, not a value
    end
  endcase
end

LEC 11 FPGA

Basic idea: a two-dimensional array of logic blocks and flip-flops, with a means for the user to configure them.

A latch is a basic memory element used to store one bit of information.

When the control signal is active, the latch is transparent, meaning the output follows the input. When the control signal is inactive, the latch holds the last value.

Most FPGAs have “SRAM-based” (latch) programmability.

CLB

Based on PYNQ-Z1

Basic FPGA functional unit: Implements both combinational and sequential logic, includes,
- LUT: implemented with Latches/SRAM, which uses a MUX to select the output value from stored truth table entries, with input signals acting as the MUX's selection lines.
- FF
- MUX

SLICES

A CLB element contains a pair of slices, and each slice is composed of four 6-input LUTs and eight storage elements. The CLBs are arranged in columns in the 7 series FPGAs.

To implement functions with more inputs than a single LUT has, multiple LUTs can be combined with a MUX, using the most significant input bits as the MUX selection signals.
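
A behavioral sketch of both ideas: a LUT as a stored truth table read through a MUX, and two LUTs combined into a 7-input function. The names `lut6_model`, `func7`, `INIT`, and `TRUTH` are hypothetical and do not correspond to vendor primitives:

```verilog
// Model of one 6-input LUT: INIT holds the 64 truth-table entries.
module lut6_model #(parameter [63:0] INIT = 64'h0) (
  input  wire [5:0] in,
  output wire       out
);
  wire [63:0] table_bits = INIT;

  // The inputs act as the select lines of a 64-to-1 mux
  // over the stored truth-table bits.
  assign out = table_bits[in];
endmodule

// A 7-input function built from two LUTs and a 2-to-1 mux.
module func7 #(parameter [127:0] TRUTH = 128'h0) (
  input  wire [6:0] in,
  output wire       out
);
  wire lo, hi;

  // Each LUT stores one half of the 128-entry truth table.
  lut6_model #(.INIT(TRUTH[63:0]))   lut_lo (.in(in[5:0]), .out(lo));
  lut6_model #(.INIT(TRUTH[127:64])) lut_hi (.in(in[5:0]), .out(hi));

  // The most significant input bit selects between the two halves.
  assign out = in[6] ? hi : lo;
endmodule
```

Changing the function only changes the stored bits, never the wiring, which is what makes the block reprogrammable.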

SLICEL (LOGIC): can only be used for logic.
SLICEM (MEMORY, additional memory elements): can also be used as memory/shift registers.

Configurable Interconnect

Each interconnection point has a transistor switch. Each switch is controlled by a 1-bit configuration register (set by the bitstream).

A netlist is typically a 3-D graph; the tools have to figure out the optimal placement and wiring on the FPGA.

Diverse Resources on FPGA

LOGIC
BRAM: Used for storing large amounts of data
DSPs: Signal processing
CLOCKING
IO
Serial I/O + PCI

LEC 13 CMOS

PN Junction

See this video.

A silicon atom has four electrons in its outermost shell, which can form covalent bonds with four adjacent silicon atoms. Since these electrons are bound by covalent bonds and cannot move freely, pure silicon does not conduct electricity. This pure form of silicon is known as an intrinsic semiconductor.

When phosphorus (P) is doped into silicon, phosphorus has five electrons in its outermost shell. Four of these electrons form covalent bonds with silicon, while the extra electron becomes a free electron. Since the charge carriers are primarily free electrons, this doped semiconductor is known as an N-type semiconductor.

When boron (B) is doped into silicon, boron has three electrons in its outermost shell. When a boron atom replaces a silicon atom in the crystal lattice, it can form only three complete covalent bonds with the four surrounding silicon atoms, leaving the fourth bond one electron short. This vacancy readily attracts electrons from neighboring atoms to fill it and is referred to as a hole. Since the charge carriers are primarily holes, this doped semiconductor is known as a P-type semiconductor.

Origin of the Hole's Positive Charge
- Principle of Electrical Neutrality: Each silicon atom contributes 4 electrons to its bonds, while a boron atom contributes only 3, so the incomplete bond is one electron short of a complete lattice and behaves as a net positive charge (+e) at that location.
- Nature of Holes: A hole is the absence of an electron in a covalent bond. When a neighboring electron moves to fill this vacancy, it is equivalent to the hole moving in the opposite direction. Since electrons carry a negative charge (-e), the absence of an electron is equivalent to a positively charged carrier (+e).

By doping both ends of the same silicon wafer, one end forms an N-type semiconductor and the other end forms a P-type semiconductor. Before contact, both the N-type and the P-type semiconductors are electrically neutral (number of electrons = number of protons). After contact, because of the difference in electron concentration, electrons spontaneously diffuse from the N-type semiconductor to the P-type semiconductor. The N-type side becomes positively charged because it loses electrons, and the P-type side becomes negatively charged because it gains them, resulting in an electric field directed from the N-type to the P-type region near the contact surface.

Under the influence of this electric field, some electrons move back toward the N-type region, a motion known as drift. Eventually the drift and diffusion motions reach a dynamic equilibrium, and this region is called the PN junction.

When an external electric field directed from the P-side to the N-side is applied (forward bias), the part of the junction in the N-region narrows as electrons are replenished, and the part in the P-region narrows as electrons are removed by the field, which increases the number of holes. Eventually the PN junction disappears. Free electrons from the N-region pass through the holes in the P-region and eventually return to the power source. The PN junction conducts only when the external electric field exceeds the built-in electric field of the junction, with a typical threshold voltage of 0.7 V.

When the power supply is reverse-connected, the remaining electrons in the N-region move toward the positive terminal of the power supply, while the holes in the P-region are filled by electrons from the negative terminal. This reduces the number of charge carriers and prevents conduction. As the internal electric field strengthens, a small number of electrons still undergo drift motion. When the external voltage increases further, avalanche breakdown occurs: the drifting electrons collide with covalent bonds, generating new holes and free electrons. Avalanche breakdown itself does not directly damage the semiconductor, but it leads to a sharp increase in current and produces significant heat, which can burn out the semiconductor.

MOSFET

See this video.

As shown in the figure, a single-crystal silicon wafer is doped to form two N-type regions (red) and a P-type region (yellow). At each interface between an N-type and the P-type region, a PN junction is formed, with its built-in electric field directed from the N-region to the P-region.

A capacitor (the gate) is added and connected to the positive and negative terminals of the power supply. The capacitor generates an electric field, under the influence of which a large number of electrons move upward, filling some of the holes. The remaining free electrons cause a portion of the P-type semiconductor to behave as an N-type semiconductor, enabling conduction. The region that transitions from P-type to N-type is called the N-channel.

Without the capacitor, one of the two PN junctions is always reverse-biased (its depletion region widens), so the device cannot conduct.

A threshold voltage exists because the built-in electric field of the PN junction cancels part of the gate voltage.

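If both the pull-up and the pull-down network were ON at the same time, there would be a short circuit between $V_{DD}$ and $GND$. If both were OFF, the output would be connected to neither $V_{DD}$ nor $GND$. We say that the output floats.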

How a cpu is made.

MOS Transistor as a Resistive Switch

$V_{GS}$ is the gate-to-source voltage, $V_{GS} = V_G - V_S$. It is a key parameter that determines whether the MOS transistor is in the ON or OFF state.

If $\lvert V_{GS} \rvert \geq \lvert V_T \rvert$, where $V_T$ is the threshold voltage, the transistor is ON and behaves like a closed switch (with an on-resistance $R_{on}$).
If $\lvert V_{GS} \rvert < \lvert V_T \rvert$, the transistor is OFF and behaves like an open switch. The switch is thus controlled by $V_{GS}$.

Triode mode or linear region
- $V_{GS} > V_{th}$ and $V_{DS} < V_{GS} - V_{th}$:
  The transistor is turned on, and a channel has been created which allows current between the drain and the source. The MOSFET operates like a resistor, controlled by the gate voltage relative to both the source and drain voltages.
Saturation or active mode
- $V_{GS} > V_{th}$ and $V_{DS} \ge V_{GS} - V_{th}$:
  The switch is turned on, and a channel has been created, which allows current between the drain and source. Since the drain voltage is higher than the source voltage, the electrons spread out, and conduction is not through a narrow channel but through a broader, two- or three-dimensional current distribution extending away from the interface and deeper in the substrate. The onset of this region is also known as pinch-off to indicate the lack of channel region near the drain. Although the channel does not extend the full length of the device, the electric field between the drain and the channel is very high, and conduction continues. The drain current is now weakly dependent upon drain voltage and controlled primarily by the gate-source voltage.
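
The switch view of a MOS transistor can be tried out in simulation with Verilog's built-in switch-level primitives. The sketch below models a CMOS inverter this way; the module name is made up, and the primitives are ideal switches, so $R_{on}$ and the triode/saturation behavior are not modeled:

```verilog
module cmos_inverter (
  input  wire in,
  output wire out
);
  supply1 vdd;  // constant logic 1 (V_DD)
  supply0 gnd;  // constant logic 0 (GND)

  // Primitive terminal order: (output, data input, control).
  pmos p1 (out, vdd, in);  // conducts when in = 0, pulling out up to vdd
  nmos n1 (out, gnd, in);  // conducts when in = 1, pulling out down to gnd
endmodule
```

For either input value exactly one of the two switches is ON, so the output is always driven and never floats. Switch-level primitives are intended for simulation and are generally not accepted by synthesis tools.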
