02/17/2020 Stanford CS231n note: image classification

Updated: 2024-11-14

pre: read the Python and NumPy tutorial

the problem: semantic gap (the difference between how a computer and a human see an image)

challenges: viewpoint variation (camera moves) / illumination / deformation (poses and positions) / occlusion / background clutter (object looks similar to the background) / intra-class variation

an image classifier:

```python
def classify_image(image):
    # some magic here?
    return class_label
```

data-driven approach
-collect a dataset of images and labels
-use machine learning to train a classifier
-evaluate the classifier on new images

```python
def train(images, labels):
    # machine learning!
    return model

def predict(model, test_images):
    # use model to predict labels
    return test_labels
```

first classifier: nearest neighbor
drawback: quick training time but long prediction time (every test image must be compared against all training images; we want the opposite trade-off in practice)
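A minimal sketch of this first classifier, assuming images are flattened into NumPy row vectors (class and variable names here are hypothetical, in the spirit of the course assignment):

```python
import numpy as np

class NearestNeighbor:
    """Memorize training data; predict by copying the label of the closest example."""

    def train(self, X, y):
        # "training" is instant: just store all the data
        self.Xtr = X
        self.ytr = y

    def predict(self, X):
        # prediction is slow: compare each test image to every training image
        preds = np.zeros(X.shape[0], dtype=self.ytr.dtype)
        for i in range(X.shape[0]):
            # L1 (Manhattan) distance to all training images
            dists = np.sum(np.abs(self.Xtr - X[i]), axis=1)
            preds[i] = self.ytr[np.argmin(dists)]
        return preds
```

This makes the drawback above concrete: `train` is O(1), while `predict` is O(N) per test example.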

distance metric to compare images
-L1 distance (Manhattan distance): depends on the coordinate system you choose / good if the individual entries have important meanings for your task
-L2 distance (Euclidean distance): doesn't depend on the coordinate system you choose (its isocontours are circles) / good if you only have a generic vector and don't know what the different elements mean
-good to try first
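The two metrics side by side on a toy pair of vectors (the pixel values here are made up for illustration):

```python
import numpy as np

a = np.array([56.0, 32.0, 10.0, 18.0])  # hypothetical pixel values
b = np.array([10.0, 25.0, 10.0, 12.0])

l1 = np.sum(np.abs(a - b))           # Manhattan: sum of absolute differences
l2 = np.sqrt(np.sum((a - b) ** 2))   # Euclidean: square root of sum of squared differences
```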

K-Nearest Neighbors
instead of copying the label from the single nearest neighbor, take a majority vote among the K closest points
in practice use k>1, which smooths out decision boundaries
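The majority-vote step can be sketched like this (function name is hypothetical; L1 distance assumed, as above):

```python
import numpy as np
from collections import Counter

def knn_predict(Xtr, ytr, x, k=3):
    # L1 distances from the test point x to every training point
    dists = np.sum(np.abs(Xtr - x), axis=1)
    nearest = np.argsort(dists)[:k]           # indices of the k closest points
    votes = Counter(ytr[nearest].tolist())    # majority vote among their labels
    return votes.most_common(1)[0][0]
```

With k=1 this reduces to the nearest-neighbor classifier; larger k makes a single noisy training label less likely to flip the prediction.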

hyperparameters: choices about the algorithm that we set rather than learn
-what is the best value of k to use?
-what is the best distance metric to use?
-idea 1: choose hyperparameters that work best on the training data ❌ (bad: k=1 always works perfectly on training data)
-idea 2: split your data into train and test; choose hyperparameters that work best on the test data ❌ (bad: no idea how the algorithm will perform on new data; we may just pick hyperparameters that happen to work well on this particular test set)
-idea 3: split data into train, validation and test; choose hyperparameters on val and evaluate on test ✓
-idea 4: cross-validation; split the training data into folds, try each fold as validation, and average the results / useful for small datasets, but rarely used in deep learning (too computationally expensive)
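Cross-validation for choosing k can be sketched as follows (a simplified illustration, not the course's reference code; it reuses the k-NN predictor idea from above):

```python
import numpy as np
from collections import Counter

def knn_predict(Xtr, ytr, x, k):
    # k-nearest-neighbor prediction with L1 distance and majority vote
    dists = np.sum(np.abs(Xtr - x), axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(ytr[nearest].tolist()).most_common(1)[0][0]

def cross_validate(X, y, k_choices, num_folds=5):
    """For each k, use every fold once as validation and average the accuracies."""
    X_folds = np.array_split(X, num_folds)
    y_folds = np.array_split(y, num_folds)
    results = {}
    for k in k_choices:
        accs = []
        for f in range(num_folds):
            X_val, y_val = X_folds[f], y_folds[f]
            # train on all folds except fold f
            X_tr = np.concatenate(X_folds[:f] + X_folds[f + 1:])
            y_tr = np.concatenate(y_folds[:f] + y_folds[f + 1:])
            preds = np.array([knn_predict(X_tr, y_tr, x, k) for x in X_val])
            accs.append(np.mean(preds == y_val))
        results[k] = np.mean(accs)
    return results
```

The k with the highest averaged validation accuracy is then evaluated once on the held-out test set.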

what is the difference between the validation set and the test set?
our algorithm memorizes the entire training set; we then compare each element of the validation set against the training data and use the result to determine the accuracy of the classifier. the test set is only touched once, at the very end, to estimate how the chosen classifier performs on unseen data

representative: the partition into train and test data should be random, so that both splits come from the same distribution

K-nearest neighbor on raw images is never used
-very slow at test time
-distance metrics on raw pixels are not informative (visually different images can have the same distance)
-curse of dimensionality: you never get enough images to densely cover a high-dimensional pixel space
(ex. dimensions=1, points=4; dimensions=2, points=16; dimensions=3, points=64)
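The pattern in the example (4 samples per axis) grows exponentially with dimension; a quick check of the numbers:

```python
# Dense coverage at 4 sample points per axis needs 4**dim points in total.
points_per_axis = 4
needed = [points_per_axis ** d for d in (1, 2, 3)]  # matches the example above
# For a 32x32x3 image (3072 dimensions), 4**3072 points would be needed,
# which is astronomically more images than could ever exist.
```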

Linear Classification

parametric approach: f(x, W) = Wx + b; the training data is summarized into the parameters W, so at test time we only need the parameters, which makes our models more efficient and able to run on small devices
-problem: a linear classifier only learns one template for each class
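A minimal sketch of the parametric score function, with hypothetical CIFAR-10-style shapes (3072-dimensional flattened image, 10 classes) and randomly initialized parameters standing in for learned ones:

```python
import numpy as np

num_classes, dim = 10, 3072
rng = np.random.default_rng(0)

# In a trained model these would be learned; random values here for illustration.
W = rng.normal(scale=0.01, size=(num_classes, dim))  # one row = one class template
b = np.zeros(num_classes)                            # one bias per class

def linear_classify(x, W, b):
    # f(x, W) = Wx + b gives one score per class; predict the highest-scoring class
    scores = W @ x + b
    return int(np.argmax(scores))

x = rng.random(dim)            # a flattened test image
label = linear_classify(x, W, b)
```

Note that each row of W acts as a single template matched against the image, which is exactly the one-template-per-class limitation above.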


Author: SuriSuriStudyQuickly



