Thursday, September 18, 2008

Probabilistic Classification

In this activity, we tried another method of image classification, Linear Discriminant Analysis (LDA). (An in-depth tutorial can be found at http://people.revoledu.com/kardi/tutorial/LDA/LDA.html) The main idea of this method is to find an axis rotation along which the features of a class are grouped together while lying as far away as possible from the features of the other classes, very much like what is done in principal component analysis (PCA). The inputs are object samples (rows in variable x), object features (columns in x), and object classes (variable y). I used 2 classes from the pattern recognition activity, squidballs and piatos (since the misclassifications in that activity happened between these classes). The features are area, height/width, average red (NCC) and average green (NCC). The training set is as follows:
Area    Height/Width    Red (NCC)    Green (NCC)    Class
3730 0.526882 0.475084 0.394192 1
4277 0.630000 0.455594 0.388616 1
4181 0.656250 0.462898 0.386272 1
4262 0.744186 0.479152 0.397529 1
3876 0.868421 0.564352 0.331341 2
3576 0.888889 0.522656 0.339387 2
3754 0.893333 0.525226 0.349733 2
4161 0.802469 0.503910 0.352362 2
Class 1 is piatos and class 2 is squidball. After some calculations, I obtained the pooled covariance matrix, which is equal to:
66054.609375 -11.454640 -5.145901 3.528965
-11.454640 0.016200 0.003636 -0.002744
-5.145901 0.003636 0.001212 -0.000800
3.528965 -0.002744 -0.000800 0.000632
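As a cross-check, the pooled covariance above can be reproduced with a short NumPy sketch (illustrative only, not part of my Scilab code). Following the linked tutorial, each class is mean-corrected with the global mean before pooling:

```python
import numpy as np

# Training features per class: area, height/width, mean red (NCC), mean green (NCC)
x1 = np.array([[3730, 0.526882, 0.475084, 0.394192],   # piatos (class 1)
               [4277, 0.630000, 0.455594, 0.388616],
               [4181, 0.656250, 0.462898, 0.386272],
               [4262, 0.744186, 0.479152, 0.397529]])
x2 = np.array([[3876, 0.868421, 0.564352, 0.331341],   # squidball (class 2)
               [3576, 0.888889, 0.522656, 0.339387],
               [3754, 0.893333, 0.525226, 0.349733],
               [4161, 0.802469, 0.503910, 0.352362]])

m = np.vstack([x1, x2]).mean(axis=0)          # global mean feature vector
c1 = (x1 - m).T @ (x1 - m) / len(x1)          # class-1 covariance (mean-corrected by m)
c2 = (x2 - m).T @ (x2 - m) / len(x2)          # class-2 covariance
C = (len(x1) * c1 + len(x2) * c2) / (len(x1) + len(x2))  # pooled covariance
print(C[0, 0])   # 66054.609375, the top-left entry of the matrix above
```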
The discriminant can be computed using

fi = mui*C^(-1)*xk' - (1/2)*mui*C^(-1)*mui' + ln(pi)

where fi is the linear discriminant of class i, mui is the mean feature vector of class i, C is the pooled covariance matrix, xk is the feature vector of the object to be classified, and pi is the prior probability of class i occurring, which is just the number of objects in class i divided by the total number of objects.
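As an illustrative sketch (Python/NumPy rather than my Scilab code), here is the discriminant evaluated for a single object; the sample feature vector xk below is the first test object from the next table:

```python
import numpy as np

# Training data, same as the table above; columns: area, h/w, red, green
x1 = np.array([[3730, 0.526882, 0.475084, 0.394192],
               [4277, 0.630000, 0.455594, 0.388616],
               [4181, 0.656250, 0.462898, 0.386272],
               [4262, 0.744186, 0.479152, 0.397529]])  # piatos (class 1)
x2 = np.array([[3876, 0.868421, 0.564352, 0.331341],
               [3576, 0.888889, 0.522656, 0.339387],
               [3754, 0.893333, 0.525226, 0.349733],
               [4161, 0.802469, 0.503910, 0.352362]])  # squidball (class 2)

m = np.vstack([x1, x2]).mean(axis=0)                     # global mean
C = ((x1 - m).T @ (x1 - m) + (x2 - m).T @ (x2 - m)) / 8  # pooled covariance
CI = np.linalg.inv(C)
m1, m2 = x1.mean(axis=0), x2.mean(axis=0)                # class mean feature vectors
p1 = p2 = 0.5                                            # equal priors: 4 samples per class

xk = np.array([4717, 0.888889, 0.472711, 0.398525])      # first test object (a piatos chip)
f1 = m1 @ CI @ xk - 0.5 * m1 @ CI @ m1 + np.log(p1)
f2 = m2 @ CI @ xk - 0.5 * m2 @ CI @ m2 + np.log(p2)
print(f1 - f2)   # positive, so the object is assigned to class 1
```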
I used the same test images as in the pattern recognition activity (the first 4 objects are piatos, class 1, and the next 4 are squidballs, class 2). The feature set of the test images is shown below.
Area    Height/Width    Red (NCC)    Green (NCC)    Class
4717 0.888889 0.472711 0.398525 1
4870 0.663265 0.470151 0.375207 1
5709 0.647619 0.462389 0.382276 1
5554 0.666667 0.474319 0.375255 1
4144 0.960526 0.519057 0.361193 2
3503 0.912500 0.461393 0.370681 2
3985 1.000000 0.499096 0.365295 2
4592 1.111111 0.473050 0.352714 2
Using these test image features, I obtained the following discriminant values:
f1         f2         f1 - f2
3092.18    3090.51    1.68
2855.5     2854.55    1.03
2942.18    2940.32    1.87
2939.13    2937.79    1.34
3006.66    3007.66    -1
2724.06    2724.92    -0.85
2936.2     2937.28    -1.08
2799.46    2801.72    -2.26
A positive (f1 - f2) means that an object is classified as class 1, while a negative (f1 - f2) means it is classified as class 2. From the table above, the first 4 objects were classified as belonging to class 1 and the next 4 objects as class 2. The results were 100% correct. ^_^ I also tried shuffling the positions of the test objects, because the result might just have copied the arrangement of classes in variable y, and I still got a 100% correct classification.
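For readers without Scilab, the whole procedure can be sketched end to end in Python/NumPy (an illustrative translation; the Scilab code I actually used follows below):

```python
import numpy as np

# Training data (see the training table above); columns: area, h/w, red, green
x1 = np.array([[3730, 0.526882, 0.475084, 0.394192],
               [4277, 0.630000, 0.455594, 0.388616],
               [4181, 0.656250, 0.462898, 0.386272],
               [4262, 0.744186, 0.479152, 0.397529]])  # piatos (class 1)
x2 = np.array([[3876, 0.868421, 0.564352, 0.331341],
               [3576, 0.888889, 0.522656, 0.339387],
               [3754, 0.893333, 0.525226, 0.349733],
               [4161, 0.802469, 0.503910, 0.352362]])  # squidball (class 2)
# Test data: first 4 rows are piatos, last 4 are squidballs
xt = np.array([[4717, 0.888889, 0.472711, 0.398525],
               [4870, 0.663265, 0.470151, 0.375207],
               [5709, 0.647619, 0.462389, 0.382276],
               [5554, 0.666667, 0.474319, 0.375255],
               [4144, 0.960526, 0.519057, 0.361193],
               [3503, 0.912500, 0.461393, 0.370681],
               [3985, 1.000000, 0.499096, 0.365295],
               [4592, 1.111111, 0.473050, 0.352714]])

m = np.vstack([x1, x2]).mean(axis=0)                     # global mean
C = ((x1 - m).T @ (x1 - m) + (x2 - m).T @ (x2 - m)) / 8  # pooled covariance
CI = np.linalg.inv(C)
m1, m2 = x1.mean(axis=0), x2.mean(axis=0)                # class means
logp = np.log(0.5)                                       # equal priors (4 vs 4 samples)

# Discriminants for every test row at once (CI is symmetric, so xt@CI@mi equals mi@CI@xk' row-wise)
f1 = xt @ CI @ m1 - 0.5 * m1 @ CI @ m1 + logp
f2 = xt @ CI @ m2 - 0.5 * m2 @ CI @ m2 + logp
classes = np.where(f1 - f2 >= 0, 1, 2)
print(classes)   # should reproduce the 100% correct result: [1 1 1 1 2 2 2 2]
```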

//Scilab code
//training set: rows = samples; columns = area, height/width, red (NCC), green (NCC)
x = [3730 0.526882 0.475084 0.394192;
4277 0.630000 0.455594 0.388616;
4181 0.656250 0.462898 0.386272;
4262 0.744186 0.479152 0.397529;
3876 0.868421 0.564352 0.331341;
3576 0.888889 0.522656 0.339387;
3754 0.893333 0.525226 0.349733;
4161 0.802469 0.503910 0.352362];
xt = [4717 0.888889 0.472711 0.398525;
4870 0.663265 0.470151 0.375207;
5709 0.647619 0.462389 0.382276;
5554 0.666667 0.474319 0.375255;
4144 0.960526 0.519057 0.361193;
3503 0.912500 0.461393 0.370681;
3985 1.000000 0.499096 0.365295;
4592 1.111111 0.473050 0.352714];//classes to test, 1-4 class 1, 5-8 class 2
//class labels of the training samples
y = [1;
1;
1;
1;
2;
2;
2;
2];
x1 = x(1:4, :); //class 1 (piatos) training samples
x2 = x(5:8, :); //class 2 (squidball) training samples
n1 = size(x1, 1);
n2 = size(x2, 1);
m1 = mean(x1, 'r'); //class means (row-wise)
m2 = mean(x2, 'r');
m = mean(x, 'r'); //global mean
x1o = x1 - mtlb_repmat(m, [n1, 1]); //mean-corrected data
x2o = x2 - mtlb_repmat(m, [n2, 1]);
c1 = (x1o'*x1o)/n1; //per-class covariance
c2 = (x2o'*x2o)/n2;
C = (n1/(n1+n2))*c1 + (n2/(n1+n2))*c2; //pooled covariance
p = [n1/(n1+n2); n2/(n1+n2)]; //class priors
CI = inv(C);
for i = 1:size(xt, 1)
    xk = xt(i, :);
    //linear discriminants: fi = mui*C^(-1)*xk' - 0.5*mui*C^(-1)*mui' + log(pi)
    f1(i) = m1*CI*xk' - 0.5*m1*CI*m1' + log(p(1));
    f2(i) = m2*CI*xk' - 0.5*m2*CI*m2' + log(p(2));
end
class = f1 - f2; //assign by the sign of f1 - f2
class(class >= 0) = 1;
class(class < 0) = 2;
//code end

I give myself a grade of 10 since I was able to do the activity properly.
Collaborators: None
