It is actually harder than one might imagine for a well-trained network to predict exactly 0.500000000000000000000. It is not impossible, but it might take years before we see such a case.
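To show why, here is a rough sketch (NumPy; the spread of the logits is just an assumption for illustration). A sigmoid output equals exactly 0.5 only when its logit is exactly 0.0 as a float, which essentially never happens for a real-valued logit:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    # Strictly monotonic, so sigmoid(z) == 0.5 only when z == 0.0 exactly.
    return 1.0 / (1.0 + np.exp(-z))

# One million made-up logits, real-valued with some spread around zero,
# standing in for the pre-sigmoid outputs of a trained network.
logits = rng.normal(loc=0.0, scale=2.0, size=1_000_000)
outputs = sigmoid(logits)

print("exactly 0.5:        ", np.sum(outputs == 0.5))                # almost certainly 0
print("within 1e-6 of 0.5: ", np.sum(np.abs(outputs - 0.5) < 1e-6))  # a handful at best
```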
So what you are saying is that this small boundary case has no material impact on the evaluation? The reason I ask is that I am wondering about strong non-linear effects. And thank you for taking the time to review and answer my question.
Just trying to understand you: what would strong non-linear effects impact, and how?
So far, I still think that, yes, it has no material impact on anything, really, because it almost never happens. But I am looking for different viewpoints, so: what would strong non-linear effects impact, and how?
By non-linear effect, maybe you are referring to the effect of the non-linear activation functions in the neural network? If so, how does that relate to the boundary at .5?
Non-linear effect would be any time f(x) → y is non-linear (the worst case is exponential). In physics and weather this is called a runaway process. In this case, f(x) → y is the result of the activation functions. Another way to look at it: recently the coin toss was found not to be .5 (equal probability); there was a slight advantage to the side that is up when the coin is tossed. If some process is slightly biased away from .5, but that effect gets multiplied through subsequent stages, the result may end up greatly biased away from .5. That’s what I mean by non-linear.
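Here is a toy sketch of that multiplication-through-stages idea (Python; the stage function is made up purely for illustration, not a real network layer). Start a hair above .5, as with the coin, and watch the bias compound:

```python
import math

def stage(p, gain=3.0):
    # Hypothetical non-linear stage: fixes 0.5 in place (stage(0.5) == 0.5),
    # but any small deviation from 0.5 grows by roughly `gain` per pass.
    return 1.0 / (1.0 + math.exp(-4.0 * gain * (p - 0.5)))

p = 0.508  # the slight same-side bias reported for coin tosses
for i in range(6):
    print(f"stage {i}: p = {p:.6f}")
    p = stage(p)  # after a few stages, p races toward 1
```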
That being said, not having taken your course, I am really too ignorant to answer your good questions about the non-linear activation functions. I should probably enroll!
It’s fine. I asked about the activation function because I was trying to come up with some possibilities of what you might be talking about, in case it could help move the discussion forward. But that was not necessary, because now you have explained it.
What I see from your explanation is that one process after another biases the final outcome away from .5: it can start at .5, but the outcome is not .5 because of those processes.
In the case of a neural network, I believe it’s a little bit different: it does not start from .5 at the input layer; instead, .5 first appears at the output of the output layer.
I said it does not start from .5 at the input layer because the input layer is just a bunch of arbitrary numbers.
I said .5 first appears at the output of the output layer because we are optimizing that output to be away from 0 when the label is 1, and away from 1 when the label is 0.
The optimization requires only the output of the output layer to carry that property, and that push toward 0 or 1 makes .5 a natural boundary. So .5 first appears at the output of the output layer, and there is no extra process afterwards to bias it.
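To make that concrete, here is a minimal sketch (a toy logistic regression in NumPy; the data and sizes are made up). The cross-entropy loss keeps penalizing outputs near .5, so after training the outputs pile up toward 0 and 1; only the examples closest to the decision boundary stay near .5, and exact equality with .5 essentially never happens:

```python
import numpy as np

rng = np.random.default_rng(42)

# Made-up, roughly separable data: 2 features, binary labels.
X = rng.normal(size=(1000, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b, lr = np.zeros(2), 0.0, 0.5

# Plain gradient descent on binary cross-entropy. The loss keeps
# shrinking as outputs move toward the correct extreme, so the
# optimization pushes them away from the .5 boundary.
for _ in range(500):
    p = sigmoid(X @ w + b)
    grad = p - y                      # dLoss/dlogit for sigmoid + BCE
    w -= lr * (X.T @ grad) / len(y)
    b -= lr * grad.mean()

p = sigmoid(X @ w + b)
print("outputs within 0.05 of .5:", np.sum(np.abs(p - 0.5) < 0.05))
print("outputs exactly .5:       ", np.sum(p == 0.5))
```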
Thank you. That’s a great explanation. If I understand it, the prior stages (I’m not thinking of just a two-stage system) all have their own probabilities, but the final stage makes the final label a binary 0 or 1.
That is not quite how I would describe it myself. However, if you want to suggest it that way, I am all ears to hear how you develop this idea. Maybe we can start with what you mean by “a stage has its probability”?
Let’s say we are talking about a neural network. How would you connect the concept of “a stage has its probability” to a neural network? What would a stage be in the context of a neural network? In what sense would a stage be probabilistic?
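To offer one possibility again, the way I did with the activation functions: the closest built-in thing I can think of to a probabilistic stage is dropout, where a layer really is a random process at training time. A hypothetical sketch (NumPy; the keep rate is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)

def dropout_stage(x, keep=0.8):
    # Each unit is kept with probability `keep` and zeroed otherwise,
    # so the stage's output is literally a random variable.
    mask = rng.random(x.shape) < keep
    return np.where(mask, x / keep, 0.0)  # inverted-dropout scaling

x = np.ones(8)
print(dropout_stage(x))  # a different random output on every call
```

Is that anywhere near what you have in mind?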
I think you have me at a disadvantage. I’m used to probabilities in statistics and would need to take your course to make any informed statements. But I really appreciate your explanations! I look forward to your course.
I am sorry to have made things look that way. I was interested in your idea and was wondering whether maybe you were leading us toward a Bayesian view, or somewhere else.
I had thought about discussing this in another context (like weather forecasting), but frankly I also worried that, even if the discussion turned out wonderfully, it might be a waste of time if we could not apply everything back to neural networks.
Maybe we could have this conversation again when the time comes. Then we may try to discuss whether, or better, go straight to how, we can model neural network layers as probabilistic processes. I look forward to that day.