The perceptron is the simplest artificial neural network model and the basic building block of neural networks. Its operation can be likened to a "voting mechanism": each input feature has its own weight (its importance); every input is multiplied by its weight and the results are summed, and the unit outputs "yes" if the sum exceeds a threshold, "no" otherwise.
Input → weighted sum → activation function → output
# Example Python implementation of a perceptron
import numpy as np

class Perceptron:
    def __init__(self):
        self.weights = None
        self.bias = 0

    def predict(self, X):
        linear_output = np.dot(X, self.weights) + self.bias
        return np.where(linear_output > 0, 1, 0)

    def train(self, X, y, epochs=100, lr=0.01):
        n_features = X.shape[1]
        self.weights = np.zeros(n_features)
        self.bias = 0
        for _ in range(epochs):
            for xi, yi in zip(X, y):
                # predict() on a single sample returns a 0-d array, so
                # use .item() rather than [0] to extract the scalar
                update = lr * (yi - self.predict(xi).item())
                self.weights += update * xi
                self.bias += update
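As a quick check of the update rule above, the following standalone sketch applies the same perceptron learning rule to the AND function, which is linearly separable and therefore learnable by a single perceptron (the data and hyperparameters here are illustrative):

```python
import numpy as np

# AND gate: linearly separable, so a single perceptron can learn it
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

weights = np.zeros(2)
bias = 0.0
lr = 0.1

for _ in range(20):                      # a few epochs suffice here
    for xi, yi in zip(X, y):
        pred = 1 if np.dot(xi, weights) + bias > 0 else 0
        update = lr * (yi - pred)        # perceptron learning rule
        weights += update * xi
        bias += update

preds = (X @ weights + bias > 0).astype(int)
print(preds)  # [0 0 0 1]
```

After a handful of epochs the updates stop, and the learned hyperplane separates (1,1) from the other three inputs.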
A multilayer neural network (also called a multilayer perceptron, MLP) adds hidden layers on top of the single-layer perceptron. This lets it learn more complex, non-linear patterns and overcomes the single-layer network's inability to handle the XOR problem.
Each layer of neurons extracts features at a different level of abstraction
# Forward propagation in a multilayer network
class MLP:
    def __init__(self, layer_sizes):
        self.weights = []
        self.biases = []
        for i in range(len(layer_sizes) - 1):
            self.weights.append(np.random.randn(layer_sizes[i], layer_sizes[i + 1]) * 0.1)
            self.biases.append(np.zeros(layer_sizes[i + 1]))

    def forward(self, X):
        self.activations = [X]
        for w, b in zip(self.weights, self.biases):
            z = np.dot(self.activations[-1], w) + b
            a = np.maximum(0, z)  # ReLU activation (applied to every layer here, for simplicity)
            self.activations.append(a)
        return self.activations[-1]
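To make the XOR point concrete, here is a standalone forward pass through a tiny 2-2-1 ReLU network whose weights were chosen by hand (not learned) so that the network computes XOR — something no single-layer perceptron can do:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

# Hand-picked weights for a 2-2-1 ReLU network computing XOR:
#   hidden: h1 = ReLU(x1 + x2),  h2 = ReLU(x1 + x2 - 1)
#   output: y  = h1 - 2 * h2
W1 = np.array([[1.0, 1.0], [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
W2 = np.array([[1.0], [-2.0]])
b2 = np.array([0.0])

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
hidden = relu(X @ W1 + b1)
output = relu(hidden @ W2 + b2)  # outputs are already non-negative here

print(output.ravel())  # [0. 1. 1. 0.] — XOR of the two inputs
```

The hidden layer bends the input space just enough that a single linear output unit can separate the XOR classes.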
The core of neural network training is the backpropagation algorithm, which repeatedly adjusts the parameters so as to minimize the loss function.
Gradients propagate backwards layer by layer, updating the parameters along the way
# Simplified backpropagation (pseudocode)
def backprop(loss, learning_rate=0.01):
    # dLoss/dOutput depends on the specific loss function
    dLoss_dOutput = ...
    # Propagate gradients from the last layer to the first
    for i in reversed(range(len(layers))):
        dOutput_dZ = activation_derivative(layers[i].output)
        dZ_dW = layers[i].input
        gradient = dLoss_dOutput * dOutput_dZ
        # Compute the gradient w.r.t. this layer's input *before* updating
        # its weights, then reuse it as dLoss/dOutput for the layer below
        dLoss_dOutput = layers[i].weights @ gradient
        layers[i].weights -= learning_rate * np.outer(dZ_dW, gradient)
        layers[i].bias -= learning_rate * gradient
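A standard way to sanity-check any backward pass is to compare its analytic gradient against a finite-difference estimate. The standalone sketch below does this for a single linear layer with squared-error loss (the layer and loss are illustrative choices, not the pseudocode above):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)        # one input sample
W = rng.normal(size=(3, 2))   # weights of a single linear layer
t = rng.normal(size=2)        # target

def loss(W):
    y = x @ W                 # forward pass (no activation, for simplicity)
    return 0.5 * np.sum((y - t) ** 2)

# Analytic gradient: dL/dW = outer(x, y - t)
y = x @ W
grad_analytic = np.outer(x, y - t)

# Finite-difference estimate of the same gradient
eps = 1e-6
grad_numeric = np.zeros_like(W)
for i in range(W.shape[0]):
    for j in range(W.shape[1]):
        Wp, Wm = W.copy(), W.copy()
        Wp[i, j] += eps
        Wm[i, j] -= eps
        grad_numeric[i, j] = (loss(Wp) - loss(Wm)) / (2 * eps)

max_diff = np.max(np.abs(grad_analytic - grad_numeric))
print(max_diff)  # tiny: the analytic and numeric gradients agree
```

If the two gradients disagree by more than numerical noise, the backward pass has a bug.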
Beyond the architecture itself, training a good classifier also depends on practical details such as normalizing the inputs, shuffling the training data, tuning the learning rate, and monitoring performance on a held-out validation set.
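One common such practice, holding out a validation set, can be sketched with NumPy as follows (the data here is synthetic and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))    # 100 samples, 4 features (toy data)
y = (X[:, 0] > 0).astype(int)    # toy labels

idx = rng.permutation(len(X))    # shuffle before splitting
split = int(0.8 * len(X))        # hold out 20% for validation
train_idx, val_idx = idx[:split], idx[split:]
X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]

print(X_train.shape, X_val.shape)  # (80, 4) (20, 4)
```

Evaluating on `X_val` after each epoch reveals overfitting: training loss keeps falling while validation loss starts to rise.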
Convolutional neural networks are the core technique for image data: local connectivity and weight sharing drastically reduce the number of parameters.
A convolution kernel slides across the input to extract features
# Example convolutional layers in PyTorch (assumes 3x32x32 inputs, e.g. CIFAR-10)
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.fc = nn.Linear(64 * 8 * 8, num_classes)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))  # 32x32 → 16x16
        x = self.pool(torch.relu(self.conv2(x)))  # 16x16 → 8x8
        x = x.view(-1, 64 * 8 * 8)                # flatten for the linear layer
        return self.fc(x)
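To see how much weight sharing saves, compare the parameter count of a first conv layer like the one above with a fully connected layer mapping the same 3×32×32 input to a 32×32×32 output (the counts are computed by hand here, not queried from PyTorch):

```python
# conv layer: 32 filters, each 3x3 over 3 input channels, plus 32 biases
conv_params = 32 * (3 * 3 * 3) + 32
print(conv_params)  # 896

# dense layer producing the same 32x32x32 output from the 3x32x32 input
dense_params = (3 * 32 * 32) * (32 * 32 * 32) + (32 * 32 * 32)
print(dense_params)  # 100696064 — roughly 112,000x more parameters
```

The gap comes from the two properties named above: each conv output looks at only a 3×3 local patch instead of the whole image, and all spatial positions share the same 27-weight filter.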