Week 1
- Object detection: find the boundary of the object, what the object is, and where it is.
- Semantic Segmentation: labels each pixel, dividing the image into labeled regions.
- Linear Classifier: draws a line (or hyperplane) in feature space to separate different classes of data.
- Overfitting: the model matches the training set too closely and fails to predict correctly on new data.
- Image Classification challenges:
- image resolution
- variability in how objects appear (lighting, viewpoint, scale, occlusion, etc.)
- It is common to have more training data than testing data.
- Class Imbalance: some classes have only a limited amount of data.
- K nearest neighbor classifier: predicts a sample's label from the labels of its closest training examples (see the sketch below).
- It is rarely used in practice because prediction is slow and it overfits easily.
- Hyperparameter: a parameter that is fixed during training rather than learned. k in the K nearest neighbor classifier is a hyperparameter.
- k is usually an odd number to avoid ties when voting.
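A minimal NumPy sketch of a k-nearest-neighbor classifier (Euclidean distance, majority vote); the data and function names here are illustrative, not from the course:

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Predict the label of a single sample x by majority vote of its k nearest neighbors."""
    # Euclidean distance from x to every training sample
    dists = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k closest training samples
    nearest = np.argsort(dists)[:k]
    # Majority vote over their labels (an odd k helps avoid ties)
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Tiny example: two 2-D clusters
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.95, 0.9]), k=3))  # -> 1
```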
- Linear Decision boundary: a straight line, plane, or hyperplane that separates different classes in a feature space.
Week 2
- Linear Regression: fits a line (a linear function of the inputs) to the data to predict a continuous output value.
- Under mild conditions, linear regression has a closed-form optimal solution.
- Mean Squared Error (MSE): Average of the squared differences between observed and predicted values.
- Good for linear regression.
- Supervised Learning: train a model on a labeled training set so that it maps inputs to outputs while minimizing error.
- How to find the minimum with respect to w? Differentiate the MSE with respect to w and set the derivative to 0 (see the derivation below).
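A sketch of that derivation, assuming X is the design matrix (one row per sample) and y the vector of targets; setting the gradient of the MSE to zero gives the normal-equation solution, provided X^T X is invertible (the "mild condition" above):

```latex
\mathcal{L}(w) = \tfrac{1}{n}\lVert Xw - y\rVert^2, \qquad
\nabla_w \mathcal{L} = \tfrac{2}{n} X^\top (Xw - y) = 0
\;\Rightarrow\; X^\top X\, w = X^\top y
\;\Rightarrow\; w^* = (X^\top X)^{-1} X^\top y
```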
- Polynomial Regression: fits a curved (polynomial) function to the data instead of a straight line.
- Machine learning assumption: the training set is drawn from the same probability distribution as the test data.
- Example: train a model on the heights of 6-12 year olds, but test it on the heights of 18-24 year olds; the model will not generalize well.
- Ultimate Goal: make the errors as small as possible.
- Regularization is a technique used in machine learning to prevent overfitting by introducing additional constraints or penalties to the model’s loss function.
- Maximum Likelihood Estimation: find the parameter that maximizes the likelihood of the observed data under a given probabilistic model.
- MLE estimates converge to the true parameter value as the amount of data grows (consistency).
- MLE is found by taking the derivative of the log-likelihood and solving for zero.
- MLE is asymptotically unbiased but may be biased in small samples.
- MLE has the lowest variance possible asymptotically (efficient estimator).
- MLE is equivalent to minimizing the KL divergence between the data distribution and the model distribution.
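A short worked example under a Bernoulli (coin-flip) model with k heads in n flips: take the log-likelihood, differentiate, and set the derivative to zero.

```latex
\log L(\theta) = k\log\theta + (n-k)\log(1-\theta), \qquad
\frac{d}{d\theta}\log L(\theta) = \frac{k}{\theta} - \frac{n-k}{1-\theta} = 0
\;\Rightarrow\; \hat{\theta}_{\mathrm{MLE}} = \frac{k}{n}
```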
- Binary Classification: predicting between two classes.
- Cross-Entropy Loss: Measures the difference between predicted and actual labels. It ensures that high-confidence incorrect predictions get large gradients (forcing corrections).
- Squash Function (Sigmoid): Converts raw scores to probabilities (0 to 1).
- To get probabilities, divide each output by the sum of all outputs. What happens if some outputs are negative? Exponentiate first so every term is positive.
Week 3
- SoftMax: transforms the outputs into probabilities and ensures they sum to 1.
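A minimal NumPy sketch of softmax; subtracting the max before exponentiating is a standard numerical-stability trick, and the exponentiation answers the "negative outputs" question above:

```python
import numpy as np

def softmax(z):
    """Convert raw scores (logits) into probabilities that sum to 1."""
    z = z - np.max(z)   # numerical stability: avoids overflow in exp()
    e = np.exp(z)       # exponentiate so every term is positive
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, -1.0])))  # probabilities summing to 1
```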
- ReLU (Rectified Linear Unit): ReLU(x) = max(x, 0). It zeroes out negative values and keeps positive values unchanged.
- Goal: use a neural network to transform the samples so that they become linearly separable.
- The more hidden layers you have, the larger the set of problems you can approximate.
- Don't put sigmoid functions in the middle of the hidden layers, but sigmoid can be used on the output layer (see the sketch below).
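A minimal PyTorch sketch of this kind of network: ReLU in the hidden layers, sigmoid only at the output (binary classification); the layer sizes are arbitrary.

```python
from torch import nn

# ReLU in the hidden layers; sigmoid only at the output layer.
model = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Linear(32, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
    nn.Sigmoid(),  # squashes the output into (0, 1) for binary classification
)
```

(With nn.BCEWithLogitsLoss the final Sigmoid would be dropped, since that loss applies the sigmoid internally.)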
- Loss functions:
- MSE: regression
- BCE(Binary Cross Entropy): binary classification
- Cross Entropy: multi-class classification
- Several approaches for training neural networks:
- Batch gradient descent: each update uses the entire training set.
- Stochastic gradient descent: each update uses one sample at a time (one iteration per sample); converges faster.
- Mini-batch: each update uses a batch of B samples (see the training-loop sketch below).
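A minimal PyTorch mini-batch training loop on made-up regression data (batch size B = 32 is arbitrary); setting B = 1 recovers SGD and B = dataset size recovers batch descent.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy data, for illustration only
X = torch.randn(512, 10)
y = torch.randn(512, 1)
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)  # B = 32

model = nn.Linear(10, 1)
loss_fn = nn.MSELoss()
opt = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(5):                 # one epoch = one pass over the data
    for xb, yb in loader:              # each iteration uses one mini-batch
        loss = loss_fn(model(xb), yb)  # forward pass + loss
        opt.zero_grad()
        loss.backward()                # backward pass: compute gradients
        opt.step()                     # weight update
```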
- Computational Graph
Week 4
- Forward Pass: input data moves through the layers of a neural network to produce an output.
- Backward Pass (Backpropagation): computing gradients using the chain rule.
- Weight Updates: adjusting weights based on the gradients using optimization techniques like Stochastic Gradient Descent (SGD).
- Activation Functions: Non-linearity in hidden layers (e.g., ReLU, sigmoid).
- Batch Processing: use the mini-batch concept to speed up training (see the sketch below).
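A sketch of one forward and backward pass for a tiny one-hidden-layer network (ReLU hidden activation, MSE loss), written out with the chain rule in NumPy; all sizes and names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))             # mini-batch of 4 samples, 3 features
y = rng.normal(size=(4, 1))             # targets
W1 = rng.normal(size=(3, 5)); b1 = np.zeros(5)
W2 = rng.normal(size=(5, 1)); b2 = np.zeros(1)

# Forward pass
z1 = x @ W1 + b1                        # linear layer 1
h  = np.maximum(z1, 0)                  # ReLU activation
yhat = h @ W2 + b2                      # linear layer 2
loss = np.mean((yhat - y) ** 2)         # MSE loss

# Backward pass (chain rule, layer by layer)
dyhat = 2 * (yhat - y) / y.shape[0]     # dL/dyhat
dW2 = h.T @ dyhat;  db2 = dyhat.sum(axis=0)
dh  = dyhat @ W2.T                      # gradient flowing into the ReLU
dz1 = dh * (z1 > 0)                     # ReLU local derivative: pass gradient where input > 0
dW1 = x.T @ dz1;    db1 = dz1.sum(axis=0)

# Weight update (plain SGD step)
lr = 0.1
W1 -= lr * dW1; b1 -= lr * db1
W2 -= lr * dW2; b2 -= lr * db2
```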
Week 5
- CNN
- Cross-correlation vs. convolution: convolution flips the kernel before sliding it over the input; cross-correlation does not (see the sketch below).
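A small NumPy sketch of the difference: convolution flips the kernel before sliding it, cross-correlation does not (deep-learning "convolution" layers actually compute cross-correlation).

```python
import numpy as np

def cross_correlate_1d(x, k):
    """Slide kernel k over signal x without flipping it."""
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

def convolve_1d(x, k):
    """True convolution = cross-correlation with a flipped kernel."""
    return cross_correlate_1d(x, k[::-1])

x = np.array([1.0, 2.0, 3.0, 4.0])
k = np.array([1.0, 0.0, -1.0])
print(cross_correlate_1d(x, k))          # [-2. -2.]
print(convolve_1d(x, k))                 # [ 2.  2.]
print(np.convolve(x, k, mode='valid'))   # matches convolve_1d
```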
Week 6
- Max pooling
- Stride
- Conv -> ReLU -> Pooling
- Regularization: L2 penalty
- Global average pooling can replace flattening.
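A minimal PyTorch sketch of the Conv -> ReLU -> Pooling pattern, ending with global average pooling in place of flattening the full feature map; the channel counts and the 10-class head are arbitrary.

```python
import torch
from torch import nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # conv
    nn.ReLU(),                                    # ReLU
    nn.MaxPool2d(kernel_size=2, stride=2),        # max pooling, stride 2 halves H and W
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.AdaptiveAvgPool2d(1),                      # global average pooling replaces flattening the feature map
    nn.Flatten(),                                 # (N, 32, 1, 1) -> (N, 32)
    nn.Linear(32, 10),                            # classifier head
)

x = torch.randn(8, 3, 32, 32)   # batch of 8 RGB 32x32 images
print(model(x).shape)           # torch.Size([8, 10])
```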
- Backprop for CNNs
- Local derivative for max pooling.
- Local derivative for a convolution layer: similar to the forward convolution operation; the downstream gradient is computed by applying the filter to the upstream gradient.
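A tiny NumPy sketch of the max-pooling local derivative: the upstream gradient is routed only to the position that held the maximum; every other position receives zero gradient.

```python
import numpy as np

# Forward: 2x2 max pool over one 2x2 window
window = np.array([[1.0, 3.0],
                   [2.0, 0.5]])
out = window.max()                      # forward output: 3.0

# Backward: suppose dL/dout = 5.0 arrives from upstream
upstream = 5.0
mask = (window == window.max())         # True only at the argmax position
d_window = upstream * mask              # gradient routed to the max element only
print(d_window)
# [[0. 5.]
#  [0. 0.]]
```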
Week 7
- PyTorch
- CNN
Week 8
- Data preprocessing
- Batch Normalization is a technique that improves speed (especially when the network is deep) and stability when training neural networks.
- What is Normalization?
- Re-centering and re-scaling the layer's inputs.
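A minimal NumPy sketch of that normalization step (re-center to zero mean and re-scale to unit variance over the batch, then apply a scale gamma and shift beta, which are learnable in a real layer); in PyTorch this is nn.BatchNorm1d / nn.BatchNorm2d.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature over the batch, then re-scale and re-shift."""
    mean = x.mean(axis=0)                    # per-feature mean over the batch
    var = x.var(axis=0)                      # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)  # re-centered and re-scaled input
    return gamma * x_hat + beta              # scale and shift

x = np.random.randn(64, 16) * 3.0 + 7.0      # batch of 64, 16 features, shifted and scaled
out = batch_norm(x)
print(out.mean(axis=0).round(3), out.std(axis=0).round(3))  # ~0 and ~1 per feature
```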
- Parameter initialization
- Regularization
- L2 Regularization(weight decay)
- Dropout
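A minimal PyTorch sketch of both regularizers: the L2 penalty (weight decay) through the optimizer's weight_decay argument, and dropout as a layer that is active in train() mode and disabled in eval() mode; sizes and rates are arbitrary.

```python
import torch
from torch import nn

model = nn.Sequential(
    nn.Linear(100, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),          # randomly zero 50% of activations during training
    nn.Linear(64, 10),
)

# weight_decay adds an L2 penalty on the weights to the update rule
opt = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)

model.train()   # dropout active
model.eval()    # dropout disabled at evaluation time
```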
- Hyperparameter search
- Grid search: try every combination of candidate values, training on the entire dataset for a small number of epochs.
- Random search: sample hyperparameter combinations at random (see the sketch below).
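A minimal sketch contrasting the two searches; train_and_evaluate is a placeholder standing in for training for a few epochs and returning a validation score.

```python
import itertools
import random

def train_and_evaluate(lr, hidden):
    """Placeholder: stands in for training a model for a few epochs
    and returning its validation accuracy."""
    return random.random()

# Grid search: every combination of the candidate values
lrs = [1e-3, 1e-2, 1e-1]
hiddens = [32, 64, 128]
grid_results = {(lr, h): train_and_evaluate(lr, h)
                for lr, h in itertools.product(lrs, hiddens)}

# Random search: sample a fixed budget of random combinations
random_results = {}
for _ in range(9):
    lr = 10 ** random.uniform(-4, -1)     # sample the learning rate on a log scale
    h = random.choice([32, 64, 128])
    random_results[(lr, h)] = train_and_evaluate(lr, h)

best = max(grid_results, key=grid_results.get)
print("best grid config:", best)
```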
Week 9
- Image Analysis
- Object Localization
- Bounding box: described by its position, width, and height.
- Model
- CNN -> flatten -> softmax classification (predicts the class)
- The same flattened features -> fully connected layer with linear activation (predicts the bounding box)
- Total loss: cross-entropy loss for the classification plus lambda * MSE loss for the box (lambda adjusts the balance between the two losses); see the sketch below.
- This is multitask learning.
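A minimal PyTorch sketch of this localization model: a shared CNN backbone feeding a classification head (trained with cross-entropy) and a linear bounding-box head (trained with MSE), with the two losses combined through a weight lambda. The layer sizes, the 4-number box encoding, and lambda = 1.0 are illustrative.

```python
import torch
from torch import nn

class LocalizationNet(nn.Module):
    def __init__(self, num_classes=20):
        super().__init__()
        self.backbone = nn.Sequential(              # shared CNN feature extractor
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.cls_head = nn.Linear(32, num_classes)  # softmax classification (via CE loss)
        self.box_head = nn.Linear(32, 4)            # linear activation: box (x, y, w, h)

    def forward(self, x):
        feats = self.backbone(x)
        return self.cls_head(feats), self.box_head(feats)

model = LocalizationNet()
images = torch.randn(8, 3, 64, 64)
labels = torch.randint(0, 20, (8,))
boxes = torch.rand(8, 4)

logits, pred_boxes = model(images)
lam = 1.0                                           # lambda balances the two losses
loss = nn.CrossEntropyLoss()(logits, labels) + lam * nn.MSELoss()(pred_boxes, boxes)
loss.backward()
```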
- Transfer Learning
- Limitation
- What if there are multiple objects?
- R-CNN: region-based CNN
- Fast R-CNN
- Faster R-CNN
Week 11