











Binomial regression




From Wikipedia, the free encyclopedia








In statistics, binomial regression is a technique in which the response (often referred to as Y) is the result of a series of Bernoulli trials, that is, a series of trials each with one of two possible disjoint outcomes (traditionally denoted "success" or 1, and "failure" or 0).[1] In binomial regression, the probability of a success is related to explanatory variables; the corresponding concept in ordinary regression is to relate the mean value of the response to explanatory variables.


Binomial regression models are essentially the same as binary choice models, one type of discrete choice model. The primary difference is in the theoretical motivation: discrete choice models are motivated using utility theory so as to handle various types of correlated and uncorrelated choices, while binomial regression models are generally described in terms of the generalized linear model, an attempt to generalize various types of linear regression models.

As a result, discrete choice models are usually described primarily with a latent variable indicating the "utility" of making a choice, and with randomness introduced through an error variable distributed according to a specific probability distribution. Note that the latent variable itself is not observed, only the actual choice, which is assumed to have been made if the net utility was greater than 0. Binomial regression models, however, dispense with both the latent and error variable and assume that the choice itself is a random variable, with a link function that transforms the expected value of the choice variable into a value that is then predicted by the linear predictor.

It can be shown that the two are equivalent, at least in the case of binary choice models: the link function corresponds to the quantile function of the distribution of the error variable, and the inverse link function to the cumulative distribution function (CDF) of the error variable. The latent variable has an equivalent if one imagines generating a uniformly distributed number between 0 and 1, subtracting from it the mean (in the form of the linear predictor transformed by the inverse link function), and inverting the sign. One then has a number whose probability of being greater than 0 is the same as the probability of success in the choice variable, and which can be thought of as a latent variable indicating whether a 0 or 1 was chosen.
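The uniform-number construction described above can be checked numerically. The following is a minimal sketch, not part of the original article: the function names and the value 0.3 used for the mean are illustrative assumptions.

```python
import random


def latent_from_uniform(mu, rng):
    """Draw u ~ Uniform(0, 1), subtract the mean mu, then invert the sign."""
    u = rng.random()
    return -(u - mu)


def success_frequency(mu, n_draws=100_000, seed=0):
    """Fraction of draws whose latent value is greater than 0."""
    rng = random.Random(seed)
    hits = sum(latent_from_uniform(mu, rng) > 0 for _ in range(n_draws))
    return hits / n_draws


# The printed frequency should be close to the chosen mean, 0.3.
print(success_frequency(0.3))
```

The latent value is positive exactly when the uniform draw falls below the mean, which is why the observed frequency approximates the success probability.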


In machine learning, binomial regression is considered a special case of probabilistic classification, and thus a generalization of binary classification.




Contents





  • 1 Example application


  • 2 Specification of model


  • 3 Link functions


  • 4 Comparison between binomial regression and binary choice models


  • 5 Latent variable interpretation / derivation


  • 6 See also


  • 7 Notes


  • 8 References




Example application


In one published example of an application of binomial regression,[2] the details were as follows. The observed outcome variable was whether or not a fault occurred in an industrial process. There were two explanatory variables: the first was a simple two-case factor representing whether or not a modified version of the process was used and the second was an ordinary quantitative variable measuring the purity of the material being supplied for the process.



Specification of model


The results are assumed to be binomially distributed.[1] They are often fitted as a generalised linear model where the predicted values μ are the probabilities that any individual event will result in a success. The likelihood of the predictions is then given by


$$L(\boldsymbol{\mu} \mid Y) = \prod_{i=1}^{n} \left( 1_{y_i=1}(\mu_i) + 1_{y_i=0}(1-\mu_i) \right),$$

where $1_A$ is the indicator function which takes on the value one when the event $A$ occurs, and zero otherwise: in this formulation, for any given observation $y_i$, only one of the two terms inside the product contributes, according to whether $y_i = 0$ or 1. The likelihood function is more fully specified by defining the formal parameters $\mu_i$ as parameterised functions of the explanatory variables: this defines the likelihood in terms of a much reduced number of parameters. Fitting of the model is usually achieved by employing the method of maximum likelihood to determine these parameters. In practice, the use of a formulation as a generalised linear model allows advantage to be taken of certain algorithmic ideas which are applicable across the whole class of more general models but which do not apply to all maximum likelihood problems.
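The likelihood and its maximisation can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure of any particular package: it assumes a logistic form for the probabilities, uses made-up coefficients and synthetic data (one two-level factor and one quantitative covariate), and the helper names `sigmoid`, `log_likelihood`, `solve` and `fit_logistic` are hypothetical.

```python
import math
import random


def sigmoid(eta):
    """Logistic function, written to avoid overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def log_likelihood(beta, X, y):
    """Log of the product likelihood: log(mu_i) if y_i = 1, else log(1 - mu_i)."""
    total = 0.0
    for row, yi in zip(X, y):
        mu = sigmoid(sum(b * x for b, x in zip(beta, row)))
        total += math.log(mu) if yi == 1 else math.log(1.0 - mu)
    return total


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_logistic(X, y, n_iter=25):
    """Maximise the likelihood by Newton-Raphson steps."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        info = [[0.0] * p for _ in range(p)]  # X' W X, the negative Hessian
        for row, yi in zip(X, y):
            mu = sigmoid(sum(b * x for b, x in zip(beta, row)))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * row[j]
                for k in range(p):
                    info[j][k] += w * row[j] * row[k]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Synthetic data only: the coefficients below are invented for illustration.
rng = random.Random(1)
true_beta = [-0.5, 1.0, 0.8]  # intercept, two-level factor, quantitative covariate
X, y = [], []
for _ in range(2000):
    row = [1.0, rng.choice([0.0, 1.0]), rng.gauss(0.0, 1.0)]
    mu = sigmoid(sum(b * x for b, x in zip(true_beta, row)))
    X.append(row)
    y.append(1 if rng.random() < mu else 0)

beta_hat = fit_logistic(X, y)
print(beta_hat, log_likelihood(beta_hat, X, y))
```

Working with the log of the likelihood turns the product into a sum, which is numerically safer and has the same maximiser.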


Models used in binomial regression can often be extended to multinomial data.


There are many methods of generating the values of μ in systematic ways that allow for interpretation of the model; they are discussed below.



Link functions


There is a requirement that the modelling linking the probabilities μ to the explanatory variables should be of a form which only produces values in the range 0 to 1. Many models can be fitted into the form


$$\boldsymbol{\mu} = g(\boldsymbol{\eta}).$$

Here η is an intermediate variable representing a linear combination, containing the regression parameters, of the explanatory variables. The function g is the cumulative distribution function (CDF) of some probability distribution. Usually this probability distribution has a range from minus infinity to plus infinity so that any finite value of η is transformed by the function g to a value inside the range 0 to 1.


In the case of logistic regression, the function g above is the logistic function (the CDF of the standard logistic distribution), and the corresponding link function, the inverse of g, is the logit, i.e. the log of the odds. In the case of probit, g is the CDF of the standard normal distribution and the link is its quantile function. The linear probability model is not a proper binomial regression specification because predictions need not be in the range of zero to one; it is sometimes used for this type of data when the probability space is where interpretation occurs or when the analyst lacks sufficient sophistication to fit or calculate approximate linearizations of probabilities for interpretation.
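As a rough illustration of these choices, the sketch below pairs each CDF g with its inverse (the link) and shows that an identity mapping, as in the linear probability model, can leave the range 0 to 1. The function names are assumptions made for this example; `statistics.NormalDist` is part of the Python standard library.

```python
import math
from statistics import NormalDist

_std_normal = NormalDist()  # mean 0, standard deviation 1


def logistic_cdf(eta):
    """g for logistic regression: maps any real eta into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


def logit(mu):
    """Link for logistic regression: the log of the odds mu / (1 - mu)."""
    return math.log(mu / (1.0 - mu))


def probit_cdf(eta):
    """g for probit regression: the standard normal CDF."""
    return _std_normal.cdf(eta)


def probit(mu):
    """Link for probit regression: the standard normal quantile function."""
    return _std_normal.inv_cdf(mu)


def linear_probability(eta):
    """Identity mapping: nothing keeps the result inside [0, 1]."""
    return eta


for eta in (-3.0, 0.0, 3.0):
    print(eta, logistic_cdf(eta), probit_cdf(eta), linear_probability(eta))
```

Applying each link to the output of its own g recovers η, which is what makes the pair usable in the generalized linear model formulation.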



Comparison between binomial regression and binary choice models


A binary choice model assumes a latent variable $U_n$, the utility (or net benefit) that person $n$ obtains from taking an action (as opposed to not taking the action). The utility the person obtains from taking the action depends on the characteristics of the person, some of which are observed by the researcher and some of which are not:


$$U_n = \boldsymbol\beta \cdot \mathbf{s}_n + \varepsilon_n$$

where $\boldsymbol\beta$ is a set of regression coefficients and $\mathbf{s}_n$ is a set of independent variables (also known as "features") describing person $n$, which may be either discrete "dummy variables" or regular continuous variables. $\varepsilon_n$ is a random variable specifying "noise" or "error" in the prediction, assumed to be distributed according to some distribution. Normally, if there is a mean or variance parameter in the distribution, it cannot be identified, so the parameters are set to convenient values, by convention usually mean 0 and variance 1.


The person takes the action, $y_n = 1$, if $U_n > 0$. The unobserved term, $\varepsilon_n$, is assumed to follow a specified distribution, such as the logistic distribution.


The specification is written succinctly as:


    • $U_n = \boldsymbol\beta \cdot \mathbf{s}_n + \varepsilon_n$

    • $Y_n = \begin{cases} 1, & \text{if } U_n > 0, \\ 0, & \text{if } U_n \leq 0 \end{cases}$

    • $\varepsilon \sim$ logistic, standard normal, etc.

Let us write it slightly differently:


    • $U_n = \boldsymbol\beta \cdot \mathbf{s}_n - e_n$

    • $Y_n = \begin{cases} 1, & \text{if } U_n > 0, \\ 0, & \text{if } U_n \leq 0 \end{cases}$

    • $e \sim$ logistic, standard normal, etc.

Here we have made the substitution $e_n = -\varepsilon_n$. This changes a random variable into a slightly different one, defined over a negated domain. As it happens, the error distributions we usually consider (e.g. logistic distribution, standard normal distribution, standard Student's t-distribution, etc.) are symmetric about 0, and hence the distribution over $e_n$ is identical to the distribution over $\varepsilon_n$.


Denote the cumulative distribution function (CDF) of $e$ as $F_e$, and the quantile function (inverse CDF) of $e$ as $F_e^{-1}$.


Note that


$$\begin{aligned}
\Pr(Y_n = 1) &= \Pr(U_n > 0) \\
&= \Pr(\boldsymbol\beta \cdot \mathbf{s}_n - e_n > 0) \\
&= \Pr(-e_n > -\boldsymbol\beta \cdot \mathbf{s}_n) \\
&= \Pr(e_n \leq \boldsymbol\beta \cdot \mathbf{s}_n) \\
&= F_e(\boldsymbol\beta \cdot \mathbf{s}_n)
\end{aligned}$$

Since $Y_n$ is a Bernoulli trial, where $\mathbb{E}[Y_n] = \Pr(Y_n = 1)$, we have


$$\mathbb{E}[Y_n] = F_e(\boldsymbol\beta \cdot \mathbf{s}_n)$$

or equivalently


$$F_e^{-1}(\mathbb{E}[Y_n]) = \boldsymbol\beta \cdot \mathbf{s}_n.$$

Note that this is exactly equivalent to the binomial regression model expressed in the formalism of the generalized linear model.


If $e_n \sim \mathcal{N}(0,1)$, i.e. distributed as a standard normal distribution, then


$$\Phi^{-1}(\mathbb{E}[Y_n]) = \boldsymbol\beta \cdot \mathbf{s}_n$$

which is exactly a probit model.


If $e_n \sim \operatorname{Logistic}(0,1)$, i.e. distributed as a standard logistic distribution with mean 0 and scale parameter 1, then the corresponding quantile function is the logit function, and


$$\operatorname{logit}(\mathbb{E}[Y_n]) = \boldsymbol\beta \cdot \mathbf{s}_n$$

which is exactly a logit model.
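The equivalence can also be checked by simulation. The sketch below is an illustration only: it draws the error term, forms the utility as the linear predictor minus the error, and compares the share of positive utilities with the CDF of the error evaluated at the linear predictor. The value 0.7 for the linear predictor and all function names are assumptions chosen for the example.

```python
import math
import random
from statistics import NormalDist


def draw_logistic(rng):
    """Standard logistic draw via its quantile function, logit(u)."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() never returns 1.0
        u = rng.random()
    return math.log(u / (1.0 - u))


def draw_normal(rng):
    """Standard normal draw."""
    return rng.gauss(0.0, 1.0)


def simulate_choice_rate(eta, draw_error, n=200_000, seed=0):
    """Share of draws with utility U = eta - e greater than 0."""
    rng = random.Random(seed)
    return sum(eta - draw_error(rng) > 0 for _ in range(n)) / n


eta = 0.7  # hypothetical value of the linear predictor
print("logit :", simulate_choice_rate(eta, draw_logistic),
      1.0 / (1.0 + math.exp(-eta)))
print("probit:", simulate_choice_rate(eta, draw_normal),
      NormalDist().cdf(eta))
```

In each printed pair the simulated share and the CDF value should agree to about two decimal places, differing only by Monte Carlo error.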


Note that the two different formalisms, generalized linear models (GLMs) and discrete choice models, are equivalent in the case of simple binary choice models, but can be extended in differing ways:


  • GLMs can easily handle arbitrarily distributed response variables (dependent variables), not just categorical variables or ordinal variables, which discrete choice models are limited to by their nature. GLMs are also not limited to link functions that are quantile functions of some distribution, unlike the use of an error variable, which must by assumption have a probability distribution.

  • On the other hand, because discrete choice models are described as types of generative models, it is conceptually easier to extend them to complicated situations with multiple, possibly correlated, choices for each person, or other variations.


Latent variable interpretation / derivation


A latent variable model involving a binomial observed variable Y can be constructed such that Y is related to the latent variable Y* via


$$Y = \begin{cases} 0, & \text{if } Y^* > 0 \\ 1, & \text{if } Y^* < 0. \end{cases}$$

The latent variable $Y^*$ is then related to a set of regression variables $X$ by the model


$$Y^* = X\beta + \epsilon.$$

This results in a binomial regression model.
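To see why, the success probability can be worked out directly from the two definitions above. The derivation below is a sketch that assumes ϵ is continuous with CDF $F_\epsilon$, a symbol introduced here for illustration.

```latex
\begin{aligned}
\Pr(Y = 1) &= \Pr(Y^* < 0) \\
           &= \Pr(X\beta + \epsilon < 0) \\
           &= \Pr(\epsilon < -X\beta) \\
           &= F_\epsilon(-X\beta)
\end{aligned}
```

If the distribution of ϵ is also symmetric about 0, this equals $1 - F_\epsilon(X\beta)$. With Y coded as above, the success probability is therefore a CDF evaluated at a linear predictor, the same form as in the binary choice formulation, with the sign of β reversed.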


The variance of ϵ cannot be identified, and when it is not of interest it is often assumed to be equal to one. If ϵ is normally distributed, then a probit is the appropriate model, and if ϵ follows a logistic distribution (which arises as the difference of two independent log-Weibull distributed variables), then a logit is appropriate. If ϵ is uniformly distributed, then a linear probability model is appropriate.



See also


  • Linear probability model

  • Poisson regression

  • Predictive modelling


Notes



  1. ^ a b Sanford Weisberg (2005). "Binomial Regression". Applied Linear Regression. Wiley-IEEE. pp. 253–254. ISBN 0-471-66379-4.


  2. ^ Cox & Snell (1981), Example H, p. 91



References



  • Cox, D. R.; Snell, E. J. (1981). Applied Statistics: Principles and Examples. Chapman and Hall. ISBN 0-412-16570-8.








Retrieved from "https://en.wikipedia.org/w/index.php?title=Binomial_regression&oldid=869219109"









