
Logistic regression as a neural network

Logistic regression is a classification algorithm. Here, we try to predict the probability of each output class, and the class with the highest probability becomes the predicted output. The error between the actual and predicted output is calculated using cross-entropy and minimized through backpropagation. Check the following diagram for binary logistic regression and multi-class logistic regression. The difference depends on the problem statement: if there are two unique output classes, it's called binary classification; if there are more than two, it's called multi-class classification. If there are no hidden layers and we use the sigmoid function for binary classification, we get the architecture for binary logistic regression. Similarly, if there are no hidden layers and we use the softmax function for multi-class classification, we get the architecture for multi-class logistic regression.

Now a question arises: why not use the sigmoid function for multi-class logistic regression?

The answer, which holds for the predicted output layer of any neural network, is that the predicted outputs should follow a probability distribution. In plain terms, say the output has N classes. This results in N probabilities for a single input example of, say, d dimensions. The sum of these N probabilities for that one input must be 1, and each of them must be between 0 and 1, inclusive.

On the one hand, the sum of the sigmoid outputs for N different classes will, in the majority of cases, not be 1. Therefore, in the binary case, the sigmoid function is applied to obtain the probability of one class, that is, p(y = 1|x), and the probability of the other class is p(y = 0|x) = 1 - p(y = 1|x). On the other hand, the output of a softmax function always satisfies the properties of a probability distribution. In the diagram, σ refers to the sigmoid function:
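The contrast above can be checked numerically. This is a minimal NumPy sketch (not from the book): applying a sigmoid independently to each of N class scores generally does not produce values summing to 1, while a softmax over the same scores always does.

```python
import numpy as np

def sigmoid(z):
    """Element-wise sigmoid: squashes each score into (0, 1) independently."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Softmax: maps N scores to N probabilities that sum to 1."""
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])  # raw scores for N = 3 classes

print(sigmoid(logits).sum())  # generally not equal to 1
print(softmax(logits).sum())  # 1, up to floating-point rounding
```

Note that for two classes, softmax over the scores [z, 0] reduces exactly to sigmoid(z), which is why the binary case can use a single sigmoid output.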

A follow-up question might also arise: what if we use softmax in binary logistic regression?

As mentioned previously, as long as your predicted output follows the rules of a probability distribution, everything is fine. Later, we will discuss cross-entropy and the importance of probability distributions as a building block for any machine learning problem, especially those dealing with classification tasks.

A probability distribution is valid if the probabilities of all the values in the distribution are between 0 and 1, inclusive, and the sum of those probabilities must be 1.
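This validity rule can be expressed as a small helper function. The name `is_valid_distribution` and the tolerance are illustrative choices, not from the book:

```python
import numpy as np

def is_valid_distribution(p, tol=1e-9):
    """True if every value lies in [0, 1] and the values sum to 1."""
    p = np.asarray(p, dtype=float)
    in_range = np.all(p >= 0.0) and np.all(p <= 1.0)
    sums_to_one = abs(p.sum() - 1.0) < tol  # allow floating-point slack
    return bool(in_range and sums_to_one)

print(is_valid_distribution([0.2, 0.3, 0.5]))  # valid
print(is_valid_distribution([0.9, 0.9]))       # invalid: sums to 1.8
```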

Logistic regression can be viewed as a very small neural network. Let's go through a step-by-step process to implement binary logistic regression, as shown here:
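As a preview of those steps, here is a minimal NumPy sketch of binary logistic regression trained with batch gradient descent. The toy data, variable names (`X`, `y`, `w`, `b`, `lr`), iteration count, and learning rate are all illustrative assumptions, not the book's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 100 examples with d = 2 features; the label is 1 when the
# features sum to a positive value, so the classes are linearly separable.
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)  # weights, one per feature
b = 0.0          # bias
lr = 0.5         # learning rate

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(200):
    p = sigmoid(X @ w + b)           # forward pass: p(y = 1 | x)
    grad_w = X.T @ (p - y) / len(y)  # gradient of the cross-entropy loss w.r.t. w
    grad_b = np.mean(p - y)          # gradient w.r.t. b
    w -= lr * grad_w                 # gradient descent update
    b -= lr * grad_b

accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == y)
```

The update rule uses the well-known fact that, for a sigmoid output with cross-entropy loss, the gradient of the loss with respect to the pre-activation is simply the prediction error `p - y`.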
