Logistic regression
Jump to navigation
Jump to search
Logistic regression is a type of curve fitting. It used for discrete outcome variables, e.g. pass or fail.
GNU/Octave example
% ------------------------------------------------------------------------------------------------- % TO RUN THIS FROM WITHIN OCTAVE % run logistic_regression_example.m % % TO RUN FROM THE COMMAND LINE % octave logistic_regression_example.m > tmp.txt % ------------------------------------------------------------------------------------------------- clear all; close all; % Data from example on 'Logistic regression' page of Wikipedia - https://en.wikipedia.org/wiki/Logistic_regression y = [ 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1 ]; x = [ 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 1.75, 2.00, 2.25, 2.50, 2.75, 3.00, 3.25, 3.50, 4.00, 4.25, 4.50, 4.75, 5.00, 5.50 ]; y=transpose(y); x=transpose(x); [theta, beta, dev, dl, d2l, gamma] = logistic_regression(y,x,1) % calculate standard error se = sqrt (diag (inv (-d2l))) % create array % [[ alpha se_alpha zvalue_alpha zvalue_alpha^2 P_Wald_alpha ] % [ beta_1 se_beta_1 zvalue_beta_1 zvalue_beta_1^2 P_Wald_beta_1 ] % [ beta_2 se_beta_2 zvalue_beta_2 zvalue_beta_1^2 P_Wald_beta_1 ] % [ ... ... ... ... ... ] % [ beta_n se_beta_n zvalue_beta_n zvalue_beta_1^2 P_Wald_beta_n ]] logit_arr=zeros(size(beta,1)+1,5); logit_arr(1,1)=theta; logit_arr(2:end,1)=beta; logit_arr(1:end,2)=se; logit_arr(1:end,3)=logit_arr(1:end,1)./logit_arr(1:end,2); % zvalue_i = coefficient_i/se_coefficient_i logit_arr(1:end,4)=logit_arr(1:end,3).*logit_arr(1:end,3); % zvalue_i^2 logit_arr(1:end,5)=1-chi2cdf(logit_arr(1:end,4),1) % Wald statistic is calculated as per formula in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1065119/ % calculate the probability for a range of values studytime_arr= [ 1, 2, 3, 4, 5] len_studytime_arr=size(studytime_arr,2); % As per the manual ( https://www.gnu.org/software/octave/doc/v4.0.3/Correlation-and-Regression-Analysis.html ) GNU/Octave function fits: % logit (gamma_i (x)) = theta_i - beta' * x, i = 1 ... k-1 for ctr=1:len_studytime_arr P_logit(ctr)=1/(1+exp(theta-studytime_arr(ctr)*beta)); end % print to screen output P_logit logit_arr % create plot len_x = size(x,1) for ctr=1:len_x P_fitted(ctr)=1/(1+exp(theta-x(ctr)*beta)); end plot(x,y,'o') hold on; grid on; plot(x,P_fitted,'-') xlabel('Study time (hours)'); ylabel('Probability of passing the exam (-)');